Note: OpenCL is different from OpenGL, being mostly about more accurate computations.

This is on an AMD FX-4300 with a 32-bit userspace, but LLVM probably uses AVX?

guest@slax:/dev/shm/mesa/BUILD$ RUSTICL_ENABLE=llvmpipe clpeak

Platform: rusticl
  Device: llvmpipe (LLVM 20.1.3, 256 bits)
    Driver version  : 25.2.0-devel (git-845611bb43) (Linux x86)
    Compute units   : 8
    Clock frequency : 300 MHz

    Global memory bandwidth (GBPS)
      float   : 3.72
      float2  : 4.08
      float4  : 3.59
      float8  : 2.81
      float16 : 2.09

    Single-precision compute (GFLOPS)
      float   : 14.67
      float2  : 17.86
      float4  : 15.99
      float8  : 14.72
      float16 : 14.63

    No half precision support! Skipped

    No double precision support! Skipped

    Integer compute (GIOPS)
      int   : 13.89
      int2  : 13.25
      int4  : 12.85
      int8  : 13.04
      int16 : 11.51

    Integer compute Fast 24bit (GIOPS)
      int   : 13.65
      int2  : 13.29
      int4  : 13.23
      int8  : 12.90
      int16 : 11.08

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 2.82
      enqueueReadBuffer               : 1.08
      enqueueWriteBuffer non-blocking : 2.89
      enqueueReadBuffer non-blocking  : 1.02
      enqueueMapBuffer(for read)      : 1.15
        memcpy from mapped ptr        : 3.02
      enqueueUnmap(after write)       : 2.22
        memcpy to mapped ptr          : 3.01

    Kernel launch latency : 21.55 us

guest@slax:/dev/shm/mesa/BUILD$

Command to build a somewhat minimal Mesa (llvmpipe + AMD):

meson ../ --prefix=/usr/X11R7 --libdir=lib --strip --buildtype debugoptimized -Degl=enabled -Dosmesa=true -Dplatforms=x11 -Dgallium-drivers=r600,radeonsi,llvmpipe -Dvulkan-drivers=amd,swrast -Dgallium-nine=true -Dgallium-va=enabled -Dgallium-xa=disabled -Dgallium-rusticl=true -Dllvm=enabled -Drust_std=2021 -Dvideo-codecs="all"

Of course you can set your own prefix (I have X installed in a non-default location).

The biggest obstacle for me was that Mesa git requires a fairly new LLVM and SPIRV-Tools-2024.4, which was released just two days ago! And the GitHub "release" is of course broken, in the sense that you need to manually fetch the headers at a specific commit.

Of course a "real GPU" will get something like >200 GFLOPS; even my puny GF710 was that fast, but the possibility of a lockup makes this option less attractive ;)
Thu, 24 Apr 2025, 18:39 Andrew Randrianasulu <[email protected]>:
But of course a real ffmpeg command fails mysteriously:

RUSTICL_ENABLE=llvmpipe ffmpeg -init_hw_device opencl=ocl -filter_hw_device ocl -i ~/K38_sdcard1/Documents/iPhone11_4K-recorder_59.940HDR10.mov -s 512:384 -r 10 -vf "format=p010,hwupload,tonemap_opencl=tonemap=mobius:param=0.01:desat=0:r=tv:p=bt709:t=bt709:m=bt709:format=nv12,hwdownload,format=nv12" -c:a copy -c:s copy -c:v libx264 -f mp4 /dev/null -debug verbose

ffmpeg: ../src/compiler/nir/nir_metadata.c:172: nir_metadata_check_validation_flag: Assertion `!(impl->valid_metadata & nir_metadata_not_properly_reset)' failed.
Aborted
Thu, 24 Apr 2025, 19:54 Andrew Randrianasulu <[email protected]>:
Real hardware OpenCL on the RX550 with ffmpeg 7.1.1 works, just at ~5 fps ;)

./ffmpeg -hwaccel vaapi -init_hw_device opencl=ocl -filter_hw_device ocl -i ~/K38_sdcard1/Documents/iPhone11_4K-recorder_59.940HDR10.mov -vf "format=p010,hwupload,tonemap_opencl=tonemap=mobius:param=0.01:desat=0:r=tv:p=bt709:t=bt709:m=bt709:format=p010,hwdownload,format=p010" -c:a copy -c:s copy -c:v rawvideo -f avi /dev/null -loglevel verbose

[out#0/avi @ 0xbe7b000] Starting thread...
frame= 1 fps=0.7 q=-0.0 size= 10KiB time=00:00:00.01 bitrate=4750.2kbits/s speed=0.0111x
frame= 3 fps=1.5 q=-0.0 size= 32256KiB time=00:00:00.05 bitrate=5279543.5kbits/s speed=0.025x
frame= 5 fps=2.0 q=-0.0 size= 80896KiB time=00:00:00.08 bitrate=7944424.2kbits/s speed=0.0334x
frame= 8 fps=2.7 q=-0.0 size= 113408KiB time=00:00:00.13 bitrate=6960809.3kbits/s speed=0.0445x
frame= 10 fps=2.9 q=-0.0 size= 161792KiB time=00:00:00.18 bitrate=7222219.5kbits/s speed=0.0524x
frame= 13 fps=3.2 q=-0.0 size= 194304KiB time=00:00:00.23 bitrate=6814911.2kbits/s speed=0.0584x
frame= 15 fps=3.3 q=-0.0 size= 242944KiB time=00:00:00.26 bitrate=7455793.2kbits/s speed=0.0593x
frame= 18 fps=3.6 q=-0.0 size= 275456KiB time=00:00:00.31 bitrate=7118790.4kbits/s speed=0.0634x
frame= 20 fps=3.6 q=-0.0 size= 323840KiB time=00:00:00.35 bitrate=7572134.4kbits/s speed=0.0637x
frame= 23 fps=3.8 q=-0.0 size= 372480KiB time=00:00:00.40 bitrate=7620769.6kbits/s speed=0.0667x
frame= 26 fps=4.0 q=-0.0 size= 404992KiB time=00:00:00.45 bitrate=7365289.1kbits/s speed=0.0693x
frame= 28 fps=4.0 q=-0.0 size= 453632KiB time=00:00:00.48 bitrate=7680906.9kbits/s speed=0.0691x
frame= 31 fps=4.1 q=-0.0 size= 485888KiB time=00:00:00.53 bitrate=7455779.2kbits/s speed=0.0711x
frame= 33 fps=4.1 q=-0.0 size= 534528KiB time=00:00:00.56 bitrate=7719673.2kbits/s speed=0.0709x
frame= 36 fps=4.2 q=-0.0 size= 567040KiB time=00:00:00.61 bitrate=7525222.1kbits/s speed=0.0726x
frame= 38 fps=4.2 q=-0.0 size= 615680KiB time=00:00:00.65 bitrate=7751710.7kbits/s speed=0.0723x
frame= 41 fps=4.3 q=-0.0 size= 664320KiB time=00:00:00.70 bitrate=7766675.4kbits/s speed=0.0737x
frame= 43 fps=4.3 q=-0.0 size= 680448KiB time=00:00:00.73 bitrate=7593625.7kbits/s speed=0.0734x
frame= 46 fps=4.4 q=-0.0 size= 745216KiB time=00:00:00.78 bitrate=7785584.9kbits/s speed=0.0746x
frame= 49 fps=4.5 q=-0.0 size= 777728KiB time=00:00:00.83 bitrate=7637736.5kbits/s speed=0.0758x
frame= 51 fps=4.4 q=-0.0 size= 826368KiB time=00:00:00.86 bitrate=7803284.3kbits/s speed=0.0754x
frame= 54 fps=4.5 q=-0.0 size= 858624KiB time=00:00:00.91 bitrate=7665625.7kbits/s speed=0.0764x
frame= 56 fps=4.5 q=-0.0 size= 891136KiB time=00:00:00.95 bitrate=7676729.7kbits/s speed=0.076x
frame= 59 fps=4.5 q=-0.0 size= 955904KiB time=00:00:01.00 bitrate=7822942.6kbits/s speed=0.077x
frame= 61 fps=4.5 q=-0.0 size= 972032KiB time=00:00:01.03 bitrate=7698318.0kbits/s speed=0.0766x
frame= 64 fps=4.6 q=-0.0 size= 1036800KiB time=00:00:01.08 bitrate=7832287.4kbits/s speed=0.0774x
frame= 67 fps=4.6 q=-0.0 size= 1069345KiB time=00:00:01.13 bitrate=7721752.5kbits/s speed=0.0782x
frame= 69 fps=4.6 q=-0.0 size= 1117985KiB time=00:00:01.16 bitrate=7842330.5kbits/s speed=0.0778x
frame= 72 fps=4.6 q=-0.0 size= 1150241KiB time=00:00:01.21 bitrate=7737010.4kbits/s speed=0.0785x
frame= 74 fps=4.6 q=-0.0 size= 1198881KiB time=00:00:01.25 bitrate=7849136.7kbits/s speed=0.0782x
frame= 77 fps=4.7 q=-0.0 size= 1231393KiB time=00:00:01.30 bitrate=7751917.8kbits/s speed=0.0788x
frame= 79 fps=4.6 q=-0.0 size= 1263649KiB time=00:00:01.33 bitrate=7756100.8kbits/s speed=0.0785x
frame= 82 fps=4.7 q=-0.0 size= 1328673KiB time=00:00:01.38 bitrate=7860442.5kbits/s speed=0.0791x
frame= 85 fps=4.7 q=-0.0 size= 1360929KiB time=00:00:01.43 bitrate=7770411.2kbits/s speed=0.0797x
frame= 87 fps=4.7 q=-0.0 size= 1409569KiB time=00:00:01.46 bitrate=7865219.6kbits/s speed=0.0793x
frame= 90 fps=4.7 q=-0.0 size= 1442081KiB time=00:00:01.51 bitrate=7781358.9kbits/s speed=0.0799x
frame= 92 fps=4.7 q=-0.0 size= 1490465KiB time=00:00:01.55 bitrate=7869477.9kbits/s speed=0.0795x
frame= 95 fps=4.7 q=-0.0 size= 1522977KiB time=00:00:01.60 bitrate=7789851.9kbits/s speed=0.08x
frame= 97 fps=4.7 q=-0.0 size= 1555489KiB time=00:00:01.63 bitrate=7793775.1kbits/s speed=0.0797x
frame= 100 fps=4.8 q=-0.0 size= 1620257KiB time=00:00:01.68 bitrate=7877157.6kbits/s speed=0.0802x
frame= 103 fps=4.8 q=-0.0 size= 1652513KiB time=00:00:01.73 bitrate=7802226.5kbits/s speed=0.0807x
frame= 105 fps=4.8 q=-0.0 size= 1701153KiB time=00:00:01.76 bitrate=7880335.1kbits/s speed=0.0803x
frame= 108 fps=4.8 q=-0.0 size= 1733665KiB time=00:00:01.81 bitrate=7809906.9kbits/s speed=0.0808x
frame= 110 fps=4.8 q=-0.0 size= 1782305KiB time=00:00:01.85 bitrate=7884354.4kbits/s speed=0.0805x
frame= 113 fps=4.8 q=-0.0 size= 1830945KiB time=00:00:01.90 bitrate=7886377.1kbits/s speed=0.0809x
frame= 115 fps=4.8 q=-0.0 size= 1863201KiB time=00:00:01.93 bitrate=7886943.6kbits/s speed=0.0806x
frame= 118 fps=4.8 q=-0.0 size= 1911841KiB time=00:00:01.98 bitrate=7888816.1kbits/s speed=0.081x
frame= 120 fps=4.8 q=-0.0 size= 1927969KiB time=00:00:02.01 bitrate=7823873.9kbits/s speed=0.0807x
frame= 123 fps=4.8 q=-0.0 size= 1992993KiB time=00:00:02.06 bitrate=7892075.9kbits/s speed=0.0811x
frame= 126 fps=4.8 q=-0.0 size= 2025249KiB time=00:00:02.11 bitrate=7830362.5kbits/s speed=0.0814x
frame= 128 fps=4.8 q=-0.0 size= 2073889KiB time=00:00:02.15 bitrate=7894104.9kbits/s speed=0.0812x
frame= 131 fps=4.8 q=-0.0 size= 2106401KiB time=00:00:02.20 bitrate=7835635.4kbits/s speed=0.0815x
frame= 133 fps=4.8 q=-0.0 size= 2154809KiB time=00:00:02.23 bitrate=7896071.0kbits/s speed=0.0813x
frame= 136 fps=4.9 q=-0.0 size= 2187321KiB time=00:00:02.28 bitrate=7839692.4kbits/s speed=0.0816x
frame= 138 fps=4.8 q=-0.0 size= 2235961KiB time=00:00:02.31 bitrate=7898718.1kbits/s speed=0.0813x
frame= 141 fps=4.9 q=-0.0 size= 2284601KiB time=00:00:02.36 bitrate=7900038.5kbits/s speed=0.0816x
frame= 144 fps=4.9 q=-0.0 size= 2316857KiB time=00:00:02.41 bitrate=7845821.4kbits/s speed=0.082x
frame= 146 fps=4.9 q=-0.0 size= 2365497KiB time=00:00:02.45 bitrate=7901548.2kbits/s speed=0.0817x
frame= 149 fps=4.9 q=-0.0 size= 2398009KiB time=00:00:02.50 bitrate=7849946.2kbits/s speed=0.082x
frame= 151 fps=4.9 q=-0.0 size= 2446649KiB time=00:00:02.53 bitrate=7903785.6kbits/s speed=0.0818x
frame= 154 fps=4.9 q=-0.0 size= 2478905KiB time=00:00:02.58 bitrate=7852993.8kbits/s speed=0.0821x
frame= 156 fps=4.9 q=-0.0 size= 2527545KiB time=00:00:02.61 bitrate=7905082.9kbits/s speed=0.0818x
frame= 159 fps=4.9 q=-0.0 size= 2576185KiB time=00:00:02.66 bitrate=7906135.4kbits/s speed=0.0821x
frame= 161 fps=4.9 q=-0.0 size= 2608697KiB time=00:00:02.70 bitrate=7907073.1kbits/s speed=0.0819x
frame= 164 fps=4.9 q=-0.0 size= 2657081KiB time=00:00:02.75 bitrate=7907295.6kbits/s speed=0.0821x
frame= 166 fps=4.9 q=-0.0 size= 2673465KiB time=00:00:02.78 bitrate=7860770.3kbits/s speed=0.0819x
frame= 169 fps=4.9 q=-0.0 size= 2738233KiB time=00:00:02.83 bitrate=7909127.1kbits/s speed=0.0822x
frame= 172 fps=4.9 q=-0.0 size= 2770745KiB time=00:00:02.88 bitrate=7864254.0kbits/s speed=0.0824x

rawvideo is used here only to test OpenCL and the decoder alone, without x264 overhead (I can't make OpenCL operate fully on the GPU).

Not exactly stellar results, but maybe "GB/S" in the PCIe bandwidth line reported in dmesg means gigabits, not gigabytes? Then it should give just 4 gigabytes per second: this card can only do x8 and the motherboard is only PCIe 2.0.
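The 4 gigabytes per second guess can be checked with a bit of arithmetic. Below is a rough sketch, not a measurement: it uses the published PCIe 2.0 figures (5 GT/s per lane, 8b/10b encoding) and assumes the "4K" source is 3840x2160, which is only a guess from the file name. The function names are made up for illustration.

```python
# Back-of-the-envelope PCIe 2.0 x8 bandwidth versus P010 frame traffic.
# ASSUMPTION: the "4K" clip is 3840x2160; the real resolution is not stated.

PCIE2_GT_PER_S = 5.0           # PCIe 2.0 raw signalling rate per lane
ENCODING_EFFICIENCY = 8 / 10   # 8b/10b line coding: 8 data bits per 10 sent


def pcie2_bandwidth_gbit(lanes: int) -> float:
    """Theoretical one-direction payload rate in gigabits per second."""
    return PCIE2_GT_PER_S * ENCODING_EFFICIENCY * lanes


def p010_frame_bytes(width: int, height: int) -> int:
    """P010 is 4:2:0 with 16-bit containers: 1.5 samples/pixel, 2 bytes each."""
    return width * height * 3


if __name__ == "__main__":
    gbit = pcie2_bandwidth_gbit(lanes=8)
    print(f"PCIe 2.0 x8: {gbit:.1f} Gbit/s = {gbit / 8:.1f} GB/s")

    frame = p010_frame_bytes(3840, 2160)
    fps = 4.9  # steady-state rate from the ffmpeg progress output
    print(f"one P010 frame: {frame / 1e6:.1f} MB")
    print(f"one-way traffic at {fps} fps: {frame * fps / 1e6:.0f} MB/s")
```

So a PCIe 2.0 x8 link comes out at 32 gigabits, i.e. 4 gigabytes, per second in each direction, which fits the gigabits reading. Under the assumed frame size, each upload or download pass at ~5 fps moves far less than that theoretical rate, though real transfers never reach the theoretical figure.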
Fri, 25 Apr 2025, 22:06 Andrew Randrianasulu <[email protected]>:
And this command decodes on the GPU, pulls the frames to the host, scales them to FHD size, uploads them to the GPU for OpenCL tonemapping, then downloads them back to the host for x264 encoding:

./ffmpeg -hwaccel vaapi -init_hw_device opencl=ocl -filter_hw_device ocl -i ~/K38_sdcard1/Documents/iPhone11_4K-recorder_59.940HDR10.mov -vf "scale=1920:1080,format=p010,hwupload,tonemap_opencl=tonemap=mobius:param=0.01:desat=0:r=tv:p=bt709:t=bt709:m=bt709:format=nv12,hwdownload,format=nv12" -c:a copy -c:s copy -c:v libx264 -f mp4 /dev/shm/fhd-sdr-ocl.mp4

It speeds up to nearly 7.0 fps! Without hardware decoding it goes slower, at maybe 5 fps. But maybe software decoding will be faster on a 64-bit userspace!
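To make the host/GPU round trip easier to see, here is a small sketch that assembles the same -vf chain from a list of stages. The helper names (STAGES, build_filter_chain, build_command) are hypothetical and only for illustration; the filter strings and options are copied from the command above, nothing new is added to them.

```python
import shlex

# Each pipeline stage paired with the filter that does it.
# Filter strings are taken verbatim from the ffmpeg command in this thread.
STAGES = [
    ("scale to FHD on the host", "scale=1920:1080"),
    ("convert to 10-bit P010 before upload", "format=p010"),
    ("upload frames to the OpenCL device", "hwupload"),
    ("tonemap HDR to SDR on the GPU",
     "tonemap_opencl=tonemap=mobius:param=0.01:desat=0"
     ":r=tv:p=bt709:t=bt709:m=bt709:format=nv12"),
    ("download frames back to the host", "hwdownload"),
    ("pick the software pixel format for libx264", "format=nv12"),
]


def build_filter_chain(stages):
    """Join the per-stage filters into one comma-separated -vf argument."""
    return ",".join(filt for _description, filt in stages)


def build_command(input_path, output_path):
    """Return the ffmpeg argument list (hypothetical helper, not an ffmpeg API)."""
    return [
        "./ffmpeg", "-hwaccel", "vaapi",
        "-init_hw_device", "opencl=ocl", "-filter_hw_device", "ocl",
        "-i", input_path,
        "-vf", build_filter_chain(STAGES),
        "-c:a", "copy", "-c:s", "copy", "-c:v", "libx264",
        "-f", "mp4", output_path,
    ]


if __name__ == "__main__":
    for description, filt in STAGES:
        print(f"{description:45s} {filt.split('=')[0]}")
    print(shlex.join(build_command("input.mov", "/dev/shm/fhd-sdr-ocl.mp4")))
```

Listing the stages this way shows that every frame crosses the bus at least twice in the filter chain (hwupload and hwdownload), on top of the download after VAAPI decoding. Scaling before the upload also means the later stages only handle FHD-sized frames.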
participants (1)
- Andrew Randrianasulu