[Cin] libaom encoding speed

Andrew Randrianasulu randrianasulu at gmail.com
Thu Mar 19 14:04:46 CET 2020


It finally finished!

int FFMPEG::init_encoder(const char*):
mismatch audio/video file format: /dev/shm/av1.webm
(this warning was due to a missing av1 audio opts file)

Render::render_single: Session finished.
Render::render_single: Session finished.
** rendered 272 frames in 15931.815 secs, 0.017 fps


And this was with a frame size of just 720x400!

As another data point, just before this experiment I ran an x265 encode over the same region:

x265 [info]: HEVC encoder version 3.2.1+1-b5c86a64bbbe
x265 [info]: build info [Linux][GCC 5.5.0][32 bit][noasm] 10bit
x265 [info]: using cpu capabilities: none!
x265 [info]: Main 4:2:2 10 profile, Level-3 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: Slices                              : 1
x265 [info]: frame threads / pool features       : 2 / wpp(7 rows)
x265 [warning]: Source height < 720p; disabling lookahead-slices
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge         : hex / 57 / 2 / 3
x265 [info]: Keyframe min / max / scenecut / bias: 30 / 30 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt        : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb       : 1 / 1 / 0
x265 [info]: References / ref-limit  cu / depth  : 3 / off / on
x265 [info]: AQ: mode / str / qg-size / cu-tree  : 2 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress            : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 early-skip rskip signhide tmvp b-intra
x265 [info]: tools: strong-intra-smoothing deblock sao
x265 [info]: frame I:     13, Avg QP:24.57  kb/s: 2595.82
x265 [info]: frame P:     66, Avg QP:28.35  kb/s: 167.28
x265 [info]: frame B:    193, Avg QP:35.08  kb/s: 16.89
x265 [info]: Weighted P-Frames: Y:0.0% UV:0.0%
x265 [info]: consecutive B-frames: 8.9% 13.9% 2.5% 73.4% 1.3%

encoded 272 frames in 56.67s (4.80 fps), 176.64 kb/s, Avg QP:32.95
Render::render_single: Session finished.
** rendered 272 frames in 56.747 secs, 4.793 fps
x265 [info]: HEVC encoder version 3.2.1+1-b5c86a64bbbe
x265 [info]: build info [Linux][GCC 5.5.0][32 bit][noasm] 10bit
x265 [info]: using cpu capabilities: none!
x265 [info]: Main 4:2:2 10 profile, Level-3 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: Slices                              : 1
x265 [info]: frame threads / pool features       : 2 / wpp(7 rows)
x265 [warning]: Source height < 720p; disabling lookahead-slices
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge         : hex / 57 / 2 / 3
x265 [info]: Keyframe min / max / scenecut / bias: 30 / 30 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt        : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb       : 1 / 1 / 0
x265 [info]: References / ref-limit  cu / depth  : 3 / off / on
x265 [info]: AQ: mode / str / qg-size / cu-tree  : 2 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress            : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 early-skip rskip signhide tmvp b-intra
x265 [info]: tools: strong-intra-smoothing deblock sao
x265 [info]: frame I:     13, Avg QP:24.58  kb/s: 2591.60
x265 [info]: frame P:     65, Avg QP:28.29  kb/s: 170.42
x265 [info]: frame B:    194, Avg QP:35.10  kb/s: 16.73
x265 [info]: Weighted P-Frames: Y:0.0% UV:0.0%
x265 [info]: consecutive B-frames: 7.7% 14.1% 2.6% 73.1% 2.6%

encoded 272 frames in 59.33s (4.58 fps), 176.53 kb/s, Avg QP:32.97
Render::render_single: Session finished.
** rendered 272 frames in 60.858 secs, 4.469 fps

As a side question, I wonder whether adding the frame size and the project's
colorspace to this output is an idea that could be implemented without much
effort. (You see, x265/x264 print their own stats, but other encoders might
not be so verbose.)

Maybe those 'extended' stats should be hidden behind a configuration parameter
(in case anyone or anything relies on exactly this type of output).


So maybe I should use some of those speed tunables in libaom:
-cpu-used 8
-tile-columns 1
-tile-rows 0

https://www.streamingmedia.com/Articles/ReadArticle.aspx?ArticleID=130284&pageNum=2

Currently testing cpu-used 8

Render::render_single: Session finished.
** rendered 272 frames in 2567.477 secs, 0.106 fps

\o/
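
To put the numbers above side by side, here is a small sketch that recomputes the fps figures from the "rendered 272 frames in ... secs" lines and derives the speed ratios. The run labels are my own names, not anything Cinelerra prints, and the timings are simply copied from the logs in this mail.

```python
# Timings copied from the "** rendered 272 frames in ... secs" lines above.
# The dictionary keys are illustrative labels, not encoder output.
FRAMES = 272

render_seconds = {
    "libaom, first run": 15931.815,
    "libaom, cpu-used 8": 2567.477,
    "x265, run 1": 56.747,
    "x265, run 2": 60.858,
}


def fps(frames, seconds):
    """Frames rendered per second of wall-clock time."""
    return frames / seconds


def speedup(slow_seconds, fast_seconds):
    """How many times faster the second run is than the first."""
    return slow_seconds / fast_seconds


for label, seconds in render_seconds.items():
    print(f"{label:20s} {seconds:10.3f} s  {fps(FRAMES, seconds):.3f} fps")

aom_gain = speedup(render_seconds["libaom, first run"],
                   render_seconds["libaom, cpu-used 8"])
x265_gap = speedup(render_seconds["libaom, cpu-used 8"],
                   render_seconds["x265, run 1"])
print(f"cpu-used 8 vs first libaom run: about {aom_gain:.1f}x faster")
print(f"x265 run 1 vs libaom cpu-used 8: about {x265_gap:.0f}x faster")
```

So cpu-used 8 buys roughly a sixfold speedup here, but x265 (even this noasm build) is still well over an order of magnitude faster on the same region.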

so, current settings (in Cinelerra):

strict=-2
threads=8
cpu-used=8
tile-columns=2
row-mt=true
tile-rows=0

On this machine:
LANG=C lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
Address sizes:       48 bits physical, 48 bits virtual
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  2
Core(s) per socket:  2
Socket(s):           1
Vendor ID:           AuthenticAMD
CPU family:          21
Model:               2
Model name:          AMD FX(tm)-4300 Quad-Core Processor
Stepping:            0
CPU MHz:             2585.307
CPU max MHz:         3800.0000
CPU min MHz:         1400.0000
BogoMIPS:            7600.76
Virtualization:      AMD-V
L1d cache:           16K
L1i cache:           64K
L2 cache:            2048K
L3 cache:            4096K
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate ssbd vmmcall bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

du -h /dev/shm/av1-cpu-used-8.webm
232K    /dev/shm/av1-cpu-used-8.webm

This is with libaom as included in Cinelerra, from Feb 1, 2019 (as far as I can see).
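
For anyone who wants to try the same settings outside Cinelerra, here is a rough sketch that turns the key=value lines listed above into an ffmpeg command line. It assumes those lines map one-to-one onto options of ffmpeg's libaom-av1 encoder, which I have not verified against the Cinelerra source; the input file name is a placeholder, and the script only prints the command, it does not run it.

```python
import shlex

# The settings as listed earlier in this mail (Cinelerra key=value form).
CINELERRA_OPTS = """\
strict=-2
threads=8
cpu-used=8
tile-columns=2
row-mt=true
tile-rows=0
"""


def parse_opts(text):
    """Parse key=value lines into a dict, skipping blanks and # comments."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        opts[key.strip()] = value.strip()
    return opts


def to_ffmpeg_args(opts, src, dst):
    """Build an ffmpeg argument list.

    Assumption: each key is accepted as '-key value' by the libaom-av1
    encoder wrapper in ffmpeg.
    """
    args = ["ffmpeg", "-i", src, "-c:v", "libaom-av1"]
    for key, value in opts.items():
        args += [f"-{key}", value]
    args.append(dst)
    return args


opts = parse_opts(CINELERRA_OPTS)
# "input.mkv" is a hypothetical source file; the output path is the one
# used in this mail.
args = to_ffmpeg_args(opts, "input.mkv", "/dev/shm/av1-cpu-used-8.webm")
print(shlex.join(args))
```

Note that in ffmpeg's libaom wrapper tile-columns and tile-rows are log2 values, so tile-columns=2 asks for 4 tile columns.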

