Thu, 24 Apr 2025, 18:39 Andrew Randrianasulu <[email protected]>:
Note that OpenCL is different from OpenGL: it is mostly about more accurate computation.
This is on an AMD FX-4300 with a 32-bit userspace, though LLVM probably uses AVX?
guest@slax:/dev/shm/mesa/BUILD$ RUSTICL_ENABLE=llvmpipe clpeak
Platform: rusticl
  Device: llvmpipe (LLVM 20.1.3, 256 bits)
    Driver version  : 25.2.0-devel (git-845611bb43) (Linux x86)
    Compute units   : 8
    Clock frequency : 300 MHz

    Global memory bandwidth (GBPS)
      float   : 3.72
      float2  : 4.08
      float4  : 3.59
      float8  : 2.81
      float16 : 2.09

    Single-precision compute (GFLOPS)
      float   : 14.67
      float2  : 17.86
      float4  : 15.99
      float8  : 14.72
      float16 : 14.63

    No half precision support! Skipped

    No double precision support! Skipped

    Integer compute (GIOPS)
      int   : 13.89
      int2  : 13.25
      int4  : 12.85
      int8  : 13.04
      int16 : 11.51

    Integer compute Fast 24bit (GIOPS)
      int   : 13.65
      int2  : 13.29
      int4  : 13.23
      int8  : 12.90
      int16 : 11.08

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 2.82
      enqueueReadBuffer               : 1.08
      enqueueWriteBuffer non-blocking : 2.89
      enqueueReadBuffer non-blocking  : 1.02
      enqueueMapBuffer(for read)      : 1.15
        memcpy from mapped ptr        : 3.02
      enqueueUnmap(after write)       : 2.22
        memcpy to mapped ptr          : 3.01

    Kernel launch latency : 21.55 us
guest@slax:/dev/shm/mesa/BUILD$
Command to configure a somewhat minimal Mesa build (llvmpipe + AMD):
meson ../ --prefix=/usr/X11R7 --libdir=lib --strip --buildtype debugoptimized -Degl=enabled -Dosmesa=true -Dplatforms=x11 -Dgallium-drivers=r600,radeonsi,llvmpipe -Dvulkan-drivers=amd,swrast -Dgallium-nine=true -Dgallium-va=enabled -Dgallium-xa=disabled -Dgallium-rusticl=true -Dllvm=enabled -Drust_std=2021 -Dvideo-codecs="all"
Of course you can set your own prefix (I have X installed in a non-default location).
The biggest obstacle for me was that Mesa git requires a fairly new LLVM and also SPIRV-Tools-2024.4, which was released just two days ago!
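A quick way to see ahead of time whether an installed tool is new enough is to compare version strings before running meson. This is only a minimal sketch: it assumes a `sort` that supports `-V` (GNU coreutils does), the helper name `version_ge` is made up, and the `18.0.0` in the usage comment is a placeholder, not Mesa's real minimum (that is whatever the current meson.build asks for).

```shell
# version_ge A B: succeed if version string A is >= version string B.
# Hypothetical helper; relies on `sort -V` (version sort).
version_ge() {
    # After a version sort the smaller string comes first, so A >= B
    # exactly when the first line is B.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage (18.0.0 is a placeholder minimum, not taken from Mesa):
#   version_ge "$(llvm-config --version)" 18.0.0 || echo "LLVM too old"
```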
And the GitHub "release" is of course broken, in the sense that you need to manually fetch the headers at a specific commit.
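The manual header fetch can be scripted roughly as follows. This is a sketch under assumptions, not a tested recipe for this exact release: it assumes the unpacked SPIRV-Tools tree contains a DEPS file with a `'spirv_headers_revision': '<sha>'` line and that the build looks for the headers in `external/spirv-headers`. The function names are made up for illustration.

```shell
# pinned_headers_rev DEPS_FILE: print the SPIRV-Headers commit pinned in a
# SPIRV-Tools DEPS file (hypothetical helper name).
# Assumes a line of the form:  'spirv_headers_revision': '<hex sha>',
pinned_headers_rev() {
    sed -n "s/.*'spirv_headers_revision': *'\([0-9a-f]*\)'.*/\1/p" "$1"
}

# fetch_pinned_headers SRC_DIR: clone SPIRV-Headers into
# SRC_DIR/external/spirv-headers and check out the pinned commit.
# Needs git and network access; it is only defined here, not run.
fetch_pinned_headers() {
    rev=$(pinned_headers_rev "$1/DEPS")
    [ -n "$rev" ] || { echo "no pinned revision found in $1/DEPS" >&2; return 1; }
    git clone https://github.com/KhronosGroup/SPIRV-Headers.git \
        "$1/external/spirv-headers" &&
    git -C "$1/external/spirv-headers" checkout "$rev"
}
```

It would be called by hand with the directory of the unpacked sources, e.g. `fetch_pinned_headers /path/to/SPIRV-Tools` (the path is illustrative). In a git checkout of SPIRV-Tools, the bundled `utils/git-sync-deps` script should do the same job for all pinned dependencies.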
Of course a "real GPU" will get something like >200 GFLOPS (even my puny GF710 was that fast), but the possibility of a lockup makes this option less attractive ;)
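To put the two figures side by side: the best llvmpipe single-precision result above is 17.86 GFLOPS (float2), so a 200 GFLOPS GPU would be roughly 11 times faster. A throwaway helper for that division (the name `speedup` is made up):

```shell
# speedup A B: print A / B with one decimal place.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
speedup() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

speedup 200 17.86   # prints 11.2
```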
But of course a real ffmpeg command fails mysteriously:
RUSTICL_ENABLE=llvmpipe ffmpeg -init_hw_device opencl=ocl -filter_hw_device ocl -i ~/K38_sdcard1/Documents/iPhone11_4K-recorder_59.940HDR10.mov -s 512:384 -r 10 -vf "format=p010,hwupload,tonemap_opencl=tonemap=mobius:param=0.01:desat=0:r=tv:p=bt709:t=bt709:m=bt709:format=nv12,hwdownload,format=nv12" -c:a copy -c:s copy -c:v libx264 -f mp4 /dev/null -debug verbose
ffmpeg: ../src/compiler/nir/nir_metadata.c:172: nir_metadata_check_validation_flag: Assertion `!(impl->valid_metadata & nir_metadata_not_properly_reset)' failed.
Aborted