[Cin] Stupid question about pixel conversions in CinGG (xfer functions)

Wed Feb 19 20:31:24 CET 2020

В сообщении от Wednesday 19 February 2020 18:54:38 Good Guy написал(а):
> yes this looks like it could be good.  However, I am just one coder.
> This could take quite some time.  It is true that a lot of the calculations
> in the xfer setup are not the most optimal (BC_CModel et al) but these
> are more to ensure consistency, not speed.  The bulk of the time is spent
> shoveling data from here to there.  Surprisingly, one big problem is to
> decide how big the slices should be, relative to the task setup for the
> slicer.  For a small frame, this overhead is a considerable fraction of
> time.
> 
> Pixel format colorspace mapping is done using YUV::yuv.
>  YUV::yuv.rgb_to_yuv_8(r, g, b, y, u, v);
> YUV::yuv.yuv_to_rgb_8(r, g, b, y, u, v);
> which is applied in the transfer functions generated by bccmdl.py
> 
> Insead of using a constant lookup table (lut) that could be changed
> easily to a pointer, as in yuv->rgb_to_yuv_8
> since the lut init already accepts (Kb,Kr) , and can be created and
> cached pretty easily.  The xfer function could use a lut
> that depends on the demand.  On some examination, I found that
> the number of operational parameters needed makes the colorspace
> conversion messy: in_cs, out_cs, in_range, out_range, at least... I
> was up to 6 params when I decided it was too messy to start that day.
> This is manageable, but I am pretty backed up on projects and problems.

Yeah, sorry, I tend to forgot all those colorspace details ..I also forgot about INTERLACED videos ..:} 
For those apparently only line (row) based buffers are easy? 
If we try to process them via ext. library, unaware of such layouts

> 
> The ffmpeg transfer function is great.  It has a lot of parameter
> flexibility
> and frequently will use asm for the most likely transfers.  Even so, It is
> complex, and can be slower than cin.  The colormodel<->AVPixelFormat
> mapping is good, but not great.  When frame data is transmitted between
> the two, direct moves are used if the mapping is good, and the data is
> moved twice when an intermediate compatible format is needed .
> 
> A pretty good idea (which CV/Einar uses, so there is a chunk of code
> that is already moving this way) might be to abandon the BC_CModels
> and use AVPixelFormat.  Then all transfers could be done with
> ffmpeg swscale.  This puts all of the eggs in the ffmpeg basket, for
> better or worse.  The interface would probably speed up, but all of the
> plugins and other stuff would need a lot of work.

Well, I was hoping for quick PoC hack ..
Not sure if I understand Cinelerra's video stages correctly?

Like, frame from stage1 (video_track_1) set its own output as
say yuv420p, 

then Stage2 (say, plugin_1 on top of video_track_1) demand, say RGB

so, transfer function invoked in this case - yuv420p -> rgb32 (say)

But does this transfer function ONLY do colorspace, 
or it also indirectly work as video muxing stage/fader?

Same for scaling .. scaling done outside of those, by separate pass 
(set of functions.. but I don't think in-place conversion possible in this case, 
like, you turn  2 byte per pixel format into 4 byte per pixel format and 
w/o some move of data it will overwrite your pixels! So, src/dst buffer can't be same) ?

> 
> There is a  class called LoadBalence (bad name) that creates load
> client threads that apply a function to a parameter set (usually a slice).
> Row data is usually best when it is memory cache align, and
> sequentially/locally processed in non-overlapping chunks.  It is hard
> to imagine that using a more complex schedule would improve
> throughput.  I have not benchmarked the xfer recently, but that may
> be a good idea to see if thirdparty functions offer help.
> 
> Permuting/Remapping color components  is a really
> weird idea in the first place, since everything seems to agree on
> what RGB is, and nobody seems to agree on what YUV is.
> The attached picture shows an RGB cube (smaller black cube
> inside the others) and the enclosing YUV cubes for BT601,
> BT709, and BT2020.   The black line in the middle is the white line
> which goes from YUV(0.0,0.5,0.5) black to YUV(1.0,0.5,0.5) white.
> As you can see, a bunch of the numerical space is outside of the
> RGB cube.  In fact, since the diagonal of RGB(grey scale) is an
> axis, it makes the YUV cube approx 2.8 times as large, and so
> the coverage is really very poor.  Almost 2/3 of the YUV colors
> are not legal RGB values.
> 

Oh, well ... but Cin doesn't have yuv > 8 bit per component internal colormodels?
So, 10-bit thingies must unpack itself into RGBA-float for having some sense of their high-bit-depthness ....

> To much data...
> morrroww
> 

Sorry for taking your time. I wish you relatively good day, outside of programming :}

> 
> 
> On Tue, Feb 18, 2020 at 11:18 PM Andrew Randrianasulu <
> randrianasulu at gmail.com> wrote:
> 
> > Hi, all!
> >
> > I was looking again at
> >
> > https://git.cinelerra-gg.org/git/?p=goodguy/cinelerra.git;a=blob;f=cinelerra-5.1/guicast/bcxfer.C;h=8eb6533baa68b7cc8b3fe1d763f03d65a3d28b4e;hb=HEAD
> >
> > and I think it works (via generated  functions from guicast/xfer) by
> > doing  pixel-at-a-time, with slices done at different CPUs.
> >
> > Is there possibility to optionally add few stagin buffesr, so something
> > like http://gegl.org/babl/index.html#Usage will have chance to work?
> >
> > "The processing operation that babl performs is copying including
> > conversions if needed
> > between linear buffers containing the same count of pixels, with different
> > pixel formats."
> >
> > So, address calculations usually send down to Cin's own (slow?) functions
> > will be reused to calculate size of buffer
> > (per slice, as done by slicer function), and then babl or ffmpeg or
> > something will transfer pixels to this new buffer, and then ..
> > I dunno, signal 'done' to upper calling function and switch pointers to
> > this new buffer? So, tricking Cin
> > into thinking conversion (per sliced area) was done by its of functions,
> > yet using different set of them
> > (not those row-based macro expanded functions I see in build tree)?  As
> > long as pixels organized the same (row x column),
> > and internally represented by same datatypes (like, first 4 bytes is R in
> > float, second is G, third is B, and last 4 bytes is alpha ...
> > for total of 16 bytes per pixel) this should work? Or I missed something?
> > --
> > Cin mailing list
> > Cin at lists.cinelerra-gg.org
> > https://lists.cinelerra-gg.org/mailman/listinfo/cin
> >
>