[Cin] Observations using GPU on DNxHD and MPEG proxy while running CinelerraGG

Pierre autourduglobe p.autourduglobe at gmail.com
Fri May 17 04:15:01 CEST 2019

I wouldn't have believed it.... But you are absolutely right!

Disabling "Sync to VBlank" (an OpenGL option) in NVIDIA X Server 
Settings has solved the problem!

In my tests using 4 mixers, whether the sources are DNxHD, HDV or 
MPEG proxies, all now play at a frame rate close to 29.97 frames/sec 
(corresponding to the shooting rate).
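
One possible reading of this result, offered as an assumption and not as something established in the thread: if the display refreshes at 60 Hz and each of the 5 windows (composer + 4 mixers) swaps its buffer in turn, with every swap waiting for one vertical blank, then no window can exceed 60 / 5 = 12 frames/sec. That would match the X11-OpenGL ceiling reported earlier in the thread. The short Python sketch below only does this arithmetic; the 60 Hz refresh rate, the serialised swaps and the function name are all hypothetical.

```python
from fractions import Fraction

# 29.97 fps is shorthand for the NTSC rate 30000/1001.
NTSC_RATE = Fraction(30000, 1001)


def vsync_limited_fps(refresh_hz, windows):
    """Upper bound on per-window frame rate under a simple model:
    every buffer swap blocks for one refresh interval and the swaps
    for all windows run one after another."""
    if windows < 1:
        raise ValueError("need at least one window")
    return Fraction(refresh_hz) / windows


REFRESH_HZ = 60  # assumption: the monitor refresh rate is not given in the thread
WINDOWS = 5      # composer + 4 mixers, as described in the thread

limit = vsync_limited_fps(REFRESH_HZ, WINDOWS)
print(f"per-window ceiling: {float(limit):.2f} fps")
print("enough for 29.97 fps sources:", limit >= NTSC_RATE)
```

Under this model a single window would still reach the full refresh rate, so the limit would only show up once several OpenGL windows play at the same time.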

Only my sources in AVC H264.mp4 do not reach this rate and are limited 
to about 15 to 22 frames/sec. But the proxies do.

I think you saved me the cost of buying a new video card.

Thank you.


On 19-05-15 18 h 28, Andrew Randrianasulu wrote:
> wild guess:
> Try to enable/disable Vsync in ... driver's control application (I assume you use proprietary drivers with Nvidia GTX-750ti)
> And also same in window manager settings.
> Try to set CPU and GPU to maximum performance (I think I observed some unusually slow playback
> when I tried to play AV1 files with my libdav1d hack at just 1.8 GHz * 4 cores. Setting the CPU to 2.6 GHz fixed this!
> In both cases the CPU was not completely loaded, according to the gkrellm I keep in a corner).
> Try to check how fast your PCI-E link is.
> (lspci -vv as root)
> ---------------
> 01:00.0 VGA compatible controller: NVIDIA Corporation G92 [GeForce 8800 GS] (rev a2) (prog-if 00 [VGA controller])
>          Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>          Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>          Latency: 0, Cache Line Size: 64 bytes
>          Interrupt: pin A routed to IRQ 38
>          Region 0: Memory at fc000000 (32-bit, non-prefetchable) [size=16M]
>          Region 1: Memory at d0000000 (64-bit, prefetchable) [size=256M]
>          Region 3: Memory at fa000000 (64-bit, non-prefetchable) [size=32M]
>          Region 5: I/O ports at e000 [size=128]
>          Expansion ROM at 000c0000 [disabled] [size=128K]
>          Capabilities: [60] Power Management version 3
>                  Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
>                  Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>          Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
>                  Address: 00000000fee00000  Data: 0000
>          Capabilities: [78] Express (v2) Endpoint, MSI 00
>                  DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <4us
>                          ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>                  DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
>                          RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop+
>                          MaxPayload 128 bytes, MaxReadReq 512 bytes
>                  DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
>                  LnkCap: Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Latency L0 <256ns, L1 <1us
>                          ClockPM- Surprise- LLActRep- BwNot-
>                  LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- Retrain- CommClk+
>                          ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                  LnkSta: Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>                  DevCap2: Completion Timeout: Not Supported, TimeoutDis+
>                  DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
>                  LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
>                           Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>                           Compliance De-emphasis: -6dB
>                  LnkSta2: Current De-emphasis Level: -6dB
>          Capabilities: [100 v1] Virtual Channel
>                  Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
>                  Arb:    Fixed- WRR32- WRR64- WRR128-
>                  Ctrl:   ArbSelect=Fixed
>                  Status: InProgress-
>                  VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
>                          Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
>                          Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01
>                          Status: NegoPending- InProgress-
>          Capabilities: [128 v1] Power Budgeting <?>
>          Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
>          Kernel driver in use: nouveau
> ----------------
> LnkSta: Speed 5GT/s, Width x16 - sounds like PCI-E 2.0
> Check if VDPAU works for simple players - mpv, mplayer.
> On Thursday 16 May 2019 00:22:30, Pierre autourduglobe wrote:
>> Yes, I am also inclined to believe that my video card is the culprit...
>> for the lack of frame rate. It would not be able, through OpenGL, to
>> decode simultaneously the 5 streams (composer + 4 mixers).
>> I've never played any games on my computers either... but "gamer" cards
>> are much cheaper than pro cards, while being relatively powerful, and
>> that's why I've always chosen them for video editing.
>> My current video card dates from 2014, it's a Nvidia GTX-750ti:
>> https://www.gigabyte.com/Graphics-Card/GV-N75TOC-2GI#ov
>> It includes 2 GB of GDDR5 memory, 128-bit memory interface and a
>> Bandwidth of 86.4 GB/s
>> If it becomes clear that it is the guilty one... I'm ready to buy
>> another more powerful one.
>> I started looking at what could be bought, which would not be too
>> expensive and would be compatible with my current power supply (which I
>> don't want to change).
>> I also don't know if Nvidia video cards or AMD cards would be the most
>> compatible and optimized for Cinelerra-GG.
>> Here are the models I'm considering right now:
>> - Nvidia GeForce GTX 1070 (8GB, 256-Bit GDDR5, Bandwidth 256 GB/s)
>> - Nvidia GeForce GTX 1660 Ti (6GB, 192-Bit GDDR6, Bandwidth 288 GB/s)
>> - AMD Radeon RX 580 (8GB, 256-Bit GDDR5, Bandwidth 256 GB/s)
>> - AMD Radeon RX 570 (4GB, 256-Bit GDDR5, Bandwidth 224 GB/s)
>> But I'm not ready to buy right now....
>> Pierre
>> On 19-05-15 16 h 21, Phyllis Smith wrote:
>>> Pierre:
>>>   From your last 2 emails and tests as compared to what I see, I am
>>> thinking that the graphics board is the bottleneck.  Doing similar tests
>>> with the Clowns, as compared with your observations below, I am always
>>> getting close to 29.97 fps in either X11 or X11-OpenGL.  The reason I
>>> think it is probably your graphics board is because my laptop is not
>>> really a "work" computer but rather a "gaming" computer (it was an
>>> inexpensive AMD computer that has never, ever played a single game!) so
>>> I would imagine the graphics board is meant to be pretty good.
>>>      The results of these tests of the mpeg proxies tell me that with both
>>>      the X11-OpenGL driver and the X11 driver, using vdpau results in a very
>>>      slight reduction in the use of my CPU, but that this does not improve
>>>      the frame rate possible that these video drivers allow to display...
>>> The above seems to indicate that the graphics board does not improve
>>> anything and you have plenty of CPU anyway, so you might as well use that.
>>>      In all cases, X11 displays at least 29.97 frames/sec for sources
>>>      that were shot at this rate.
>>>      X11-OpenGL is always limited to a maximum of about 12 frames/sec.
>>>      These results are approximately true for all the types of media I
>>>      tested, whether DNxHD.mov, HDV (MPEG-2.m2t), AVC H264.mp4 or even
>>>      proxies in mpeg.mpeg.
>>>      Given these results, I don't really see the advantage of using
>>>      proxies... In any case, the video driver used will determine the
>>>      possible frame rate regardless of the type of media used.
>>>      I'm actually wondering if the constant frame rate limit of 12 frames/sec
>>>      imposed by X11-OpenGL in my tests with 4 mixers, regardless of the
>>>      media type, doesn't actually indicate a bug somewhere or a limit
>>>      inherent in my equipment. But then how do you explain the better
>>>      throughput with X11?
>>> Instead of working with 29.97 fps media, I loaded Big Buck Bunny which
>>> is 60 frames per second.  And there may be something strange going on as
>>> Pierre indicated that I will have to test on a faster computer.  Because
>>> when I played this, like Pierre, it seems to limit it at always 30 fps
>>> whether I use X11 or OpenGL.  Then when I proxy it to 1/2, I thought I
>>> should have improved the frame rate but it too was at only 30 fps.
>>> I will have to do the tests on GG's computer to eliminate the
>>> possibility of a limitation / bug.  Phyllis
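
For the PCI-E check suggested in the quoted reply, the `LnkSta` line of `lspci -vv` can be turned into a generation and an approximate bandwidth figure. The Python sketch below is only an illustration: the function and table names are hypothetical, the table covers only the first three PCI-E generations, and the result is the theoretical per-direction rate after line encoding, not a measured throughput.

```python
import re

# Per-lane raw rate in GT/s -> (PCI-E generation, line-encoding efficiency).
# Gen 1 and 2 use 8b/10b encoding, gen 3 uses 128b/130b.
PCIE_GENERATIONS = {
    2.5: ("1.x", 8 / 10),
    5.0: ("2.0", 8 / 10),
    8.0: ("3.0", 128 / 130),
}

# Matches "LnkSta:" but not "LnkSta2:"; tolerates extra text between fields.
LNKSTA_RE = re.compile(r"LnkSta:\s*Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)")


def parse_lnksta(text):
    """Return (generation, lane count, GB/s per direction) for the
    first LnkSta line found in lspci -vv output."""
    match = LNKSTA_RE.search(text)
    if match is None:
        raise ValueError("no LnkSta line found")
    speed = float(match.group(1))
    width = int(match.group(2))
    if speed not in PCIE_GENERATIONS:
        raise ValueError(f"link speed {speed} GT/s not in table")
    generation, efficiency = PCIE_GENERATIONS[speed]
    gbytes_per_s = speed * efficiency * width / 8  # bits -> bytes
    return generation, width, gbytes_per_s


# Line taken from the lspci output quoted in this thread.
sample = "LnkSta: Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive-"
generation, width, gbytes_per_s = parse_lnksta(sample)
print(f"PCI-E {generation} x{width}, about {gbytes_per_s:.1f} GB/s per direction")
```

For the quoted card this agrees with the "sounds like PCI-E 2.0" guess. A link that reports a lower speed or a narrower width than `LnkCap` advertises would be worth a closer look.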
