Thu, 24 Apr 2025, 18:39 Andrew Randrianasulu <randrianasulu@gmail.com>:
Note: OpenCL is different from OpenGL; it is mostly about computation, with more emphasis on accuracy.

On an AMD FX-4300 with a 32-bit userspace, though LLVM probably uses AVX?


guest@slax:/dev/shm/mesa/BUILD$ RUSTICL_ENABLE=llvmpipe  clpeak

Platform: rusticl
  Device: llvmpipe (LLVM 20.1.3, 256 bits)
    Driver version  : 25.2.0-devel (git-845611bb43) (Linux x86)
    Compute units   : 8
    Clock frequency : 300 MHz

    Global memory bandwidth (GBPS)
      float   : 3.72
      float2  : 4.08
      float4  : 3.59
      float8  : 2.81
      float16 : 2.09

    Single-precision compute (GFLOPS)
      float   : 14.67
      float2  : 17.86
      float4  : 15.99
      float8  : 14.72
      float16 : 14.63

    No half precision support! Skipped

    No double precision support! Skipped

    Integer compute (GIOPS)
      int   : 13.89
      int2  : 13.25
      int4  : 12.85
      int8  : 13.04
      int16 : 11.51

    Integer compute Fast 24bit (GIOPS)
      int   : 13.65
      int2  : 13.29
      int4  : 13.23
      int8  : 12.90
      int16 : 11.08

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 2.82
      enqueueReadBuffer               : 1.08
      enqueueWriteBuffer non-blocking : 2.89
      enqueueReadBuffer non-blocking  : 1.02
      enqueueMapBuffer(for read)      : 1.15
        memcpy from mapped ptr        : 3.02
      enqueueUnmap(after write)       : 2.22
        memcpy to mapped ptr          : 3.01

    Kernel launch latency : 21.55 us

guest@slax:/dev/shm/mesa/BUILD$
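
To compare runs like the one above (for example llvmpipe against a real GPU), the text output can be turned into numbers. This is only a rough sketch: parse_clpeak is a hypothetical helper of mine, not part of clpeak, and it assumes the layout shown above (a section title ending in a unit in parentheses, followed by "label : value" rows). The sample is an abbreviated copy of the output above.

```python
import re

# Abbreviated copy of the clpeak output above (two sections kept).
SAMPLE = """\
    Global memory bandwidth (GBPS)
      float   : 3.72
      float2  : 4.08
      float4  : 3.59
      float8  : 2.81
      float16 : 2.09

    Single-precision compute (GFLOPS)
      float   : 14.67
      float2  : 17.86
      float4  : 15.99
      float8  : 14.72
      float16 : 14.63

    Kernel launch latency : 21.55 us
"""

# Assumed layout: "<title> (<unit>)" headers, then "<label> : <number>" rows.
SECTION_RE = re.compile(r"^\s*(.+?)\s*\((GBPS|GFLOPS|GIOPS)\)\s*$")
VALUE_RE = re.compile(r"^\s*(.+?)\s*:\s*([0-9]+(?:\.[0-9]+)?)\s*$")


def parse_clpeak(text):
    """Return {section title: {row label: value}} for unit-tagged sections."""
    results = {}
    current = None
    for line in text.splitlines():
        header = SECTION_RE.match(line)
        if header:
            current = results.setdefault(header.group(1), {})
            continue
        if not line.strip():
            current = None  # a blank line ends the current section
            continue
        row = VALUE_RE.match(line)
        if row and current is not None:
            current[row.group(1)] = float(row.group(2))
    return results


parsed = parse_clpeak(SAMPLE)
best = max(parsed["Single-precision compute"].items(), key=lambda kv: kv[1])
print(best)  # ('float2', 17.86)
```

Lines outside a unit-tagged section (device info, kernel launch latency) are ignored on purpose, so the result only contains the benchmark tables.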

Command to build a somewhat minimal Mesa (llvmpipe + AMD):


meson ../ --prefix=/usr/X11R7 --libdir=lib --strip --buildtype debugoptimized -Degl=enabled -Dosmesa=true -Dplatforms=x11 -Dgallium-drivers=r600,radeonsi,llvmpipe -Dvulkan-drivers=amd,swrast  -Dgallium-nine=true -Dgallium-va=enabled  -Dgallium-xa=disabled -Dgallium-rusticl=true -Dllvm=enabled -Drust_std=2021  -Dvideo-codecs="all"

Of course, you can set your own prefix (I have X installed in a non-default location).

The biggest obstacle for me was that Mesa git requires a fairly new LLVM, as well as SPIRV-Tools-2024.4, which was released just two days ago!

And the GitHub "release" is of course broken, in the sense that you need to manually fetch the headers at a specific commit.

Of course a "real GPU" will get something like >200 GFLOPS; even my puny GF710 was that fast, but the possibility of a lock-up makes this option less attractive ;)



But of course a real ffmpeg command fails mysteriously:

RUSTICL_ENABLE=llvmpipe ffmpeg  -init_hw_device opencl=ocl -filter_hw_device ocl  -i ~/K38_sdcard1/Documents/iPhone11_4K-recorder_59.940HDR10.mov -s 512:384 -r 10 -vf "format=p010,hwupload,tonemap_opencl=tonemap=mobius:param=0.01:desat=0:r=tv:p=bt709:t=bt709:m=bt709:format=nv12,hwdownload,format=nv12" -c:a copy -c:s copy -c:v libx264 -f mp4 /dev/null -debug verbose

ffmpeg: ../src/compiler/nir/nir_metadata.c:172: nir_metadata_check_validation_flag: Assertion `!(impl->valid_metadata & nir_metadata_not_properly_reset)' failed.
Aborted
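
For reference, the -vf argument in the ffmpeg command above is a chain of five stages: convert to p010, upload to the OpenCL device, tonemap, download, convert to nv12. A small sketch of how that string is put together; build_filtergraph is a hypothetical helper used only for illustration, and the option values are copied from the command above, not recommendations.

```python
# Options copied from the tonemap_opencl filter in the command above,
# kept in the same order.
TONEMAP_OPTS = [
    ("tonemap", "mobius"),
    ("param", "0.01"),
    ("desat", "0"),
    ("r", "tv"),
    ("p", "bt709"),
    ("t", "bt709"),
    ("m", "bt709"),
    ("format", "nv12"),
]


def build_filtergraph(opts):
    """Join the five filter stages into one -vf argument string."""
    # Options of a single filter are separated by ':'.
    tonemap = "tonemap_opencl=" + ":".join(f"{k}={v}" for k, v in opts)
    # Filters in a chain are separated by ','.
    stages = ["format=p010", "hwupload", tonemap, "hwdownload", "format=nv12"]
    return ",".join(stages)


print(build_filtergraph(TONEMAP_OPTS))
```

Only the tonemap_opencl stage runs on the OpenCL device; everything between hwupload and hwdownload operates on device-side frames.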