Sunday, June 07, 2009

Clip Signed Data To Arbitrary Unsigned Range in SIMD/Assembly

This post is again dedicated to the video domain, though I can certainly say the technique is useful in various other applications too.

Clipping is a very simple algorithm. As its name indicates, we clip our data to a certain range, bounded by a High value and a Low value. If a data value is less than Low, it is set to Low, and if it is greater than High, it is set to High. Something like this:

Clip(Data, Low, High) = if Data is less than Low then Data = Low, else if Data is greater than High then Data = High, else Data = Data ------------(1)

In the video domain, the IDCT operation gives us signed data for each pixel, which should technically be an unsigned data type. So we clip the pixel data to the range of the unsigned type. In the normal situation the pixel bit depth is 8 (i.e. unsigned char), so equ. (1) becomes:

Clip(Data, 0, 255) = if Data is less than 0 then Data = 0, else if Data is greater than 255 then Data = 255, else Data = Data ------------------------(2)

But pixel bit-depth is not limited to 8. As I mentioned in my previous post "H264:How to do conversion from 8 bits to 14 bits bit depth support" under the label "Video", H.264 supports bit depths up to 14; when you don't want to compromise on quality, go for a higher bit-depth. Here the pixel data type will be 'unsigned short', so we have to modify the 'Clip' function. This time the depth is not fixed at 8, or even at 14. It can vary from 8 to 14, depending on the YUV input bit-depth for the encoder, and on the luma or chroma bit-depth information (bit_depth_luma_minus8 and bit_depth_chroma_minus8) in the H.264 coded video input for the decoder. So let's do this clipping in a generic form. Remember that the Low value will always be 0; only the High value changes. So equ. (2) is modified as:

High = (1 << Pixel_Bit_Depth) - 1
Clip(Data, 0, High) = if Data is less than 0 then Data = 0, else if Data is greater than High then Data = High, else Data = Data ------------------------(3)
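As a plain C reference for equ. (3), here is a minimal sketch. It reads High as 2 to the power Pixel_Bit_Depth, minus 1, which in C is (1 << bit_depth) - 1. The names pixel_max and clip_pixel are my own, chosen for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: largest pixel value for a given bit depth,
   i.e. 2^bit_depth - 1 (255 for 8 bits, 16383 for 14 bits). */
static uint16_t pixel_max(unsigned bit_depth)
{
    assert(bit_depth >= 1 && bit_depth <= 15);  /* keeps the result within 0x7FFF */
    return (uint16_t)((1u << bit_depth) - 1u);
}

/* Equ. (3): clamp a signed sample into [0, high]. */
static uint16_t clip_pixel(int data, uint16_t high)
{
    if (data < 0)
        return 0;
    if (data > (int)high)
        return high;
    return (uint16_t)data;
}
```

For example, clip_pixel(2000, pixel_max(10)) gives 1023, while clip_pixel(-5, pixel_max(10)) gives 0.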

There are other optimized ways to implement equ. (3) too, but that is not my concern today, so let's move ahead to SIMD/assembly (MMX/SSE/SSE2). How do we achieve the same operation in assembly? If the bit-depth is 8, a single SIMD instruction is available:

packuswb Rx0, Rx0 ;Data is assumed to be signed 16-bit words in SIMD register Rx0 (mm/xmm); the result is saturated unsigned char pixels

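or a saturating add, which pins the sum at the top of the unsigned char range instead of letting it wrap (with Rx1 set to '0' it leaves data that is already unsigned unchanged):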

paddusb Rx0, Rx1

If you want to saturate for unsigned short, then we have:

paddusw Rx0, Rx1
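For readers who prefer intrinsics to raw assembly, the 8-bit case can be sketched in C as below. _mm_packus_epi16 is the SSE2 intrinsic for packuswb; the function name clip8_to_u8 and the choice to process exactly 8 samples are assumptions of mine for illustration.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: clip 8 signed 16-bit samples to [0, 255]
   and store them as 8 unsigned bytes. */
static void clip8_to_u8(const int16_t *src, uint8_t *dst)
{
    assert(src != 0 && dst != 0);
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    /* packuswb: signed words -> unsigned bytes with saturation */
    __m128i p = _mm_packus_epi16(v, v);
    /* the low 64 bits hold the 8 clipped pixels */
    _mm_storel_epi64((__m128i *)dst, p);
}
```

Negative inputs come out as 0 and anything above 255 comes out as 255, which is exactly equ. (2).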

But that is not our case, so we have to go another way. Our data type is 'unsigned short' and the maximum value is (1 << Pixel_Bit_Depth) - 1, so here we go:

unsigned short High = (1 << Pixel_Bit_Depth) - 1;
unsigned short Range = 0x8000;
unsigned short Low = 0x7FFF - High;
unsigned short MaxHigh = 0xFFFF - High;

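movdqu Rx1, Range ;Each constant is assumed to be replicated across all 16-bit lanes of a 128-bit memory operand
movdqu Rx2, Low
movdqu Rx3, MaxHigh
paddw Rx0, Rx1 ;Bias the signed data into the unsigned range (wrapping add of 0x8000)
paddusw Rx0, Rx2 ;Add with unsigned saturation: data above High saturates to 0xFFFF
psubusw Rx0, Rx3 ;Subtract with unsigned saturation: data below 0 saturates to 0, everything else comes back unchanged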
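To see why these three instructions clip correctly, it may help to model a single 16-bit lane in scalar C. This is only a sketch: add_sat_u16, sub_sat_u16 and clip_model are hypothetical names, standing in for paddusw, psubusw and the whole sequence respectively.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar stand-in for paddusw on one 16-bit lane. */
static uint16_t add_sat_u16(uint16_t a, uint16_t b)
{
    uint32_t s = (uint32_t)a + b;
    return (uint16_t)(s > 0xFFFFu ? 0xFFFFu : s);
}

/* Scalar stand-in for psubusw on one 16-bit lane. */
static uint16_t sub_sat_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a > b ? a - b : 0);
}

/* Hypothetical model of the three-instruction sequence for one sample. */
static uint16_t clip_model(int16_t data, uint16_t high)
{
    assert(high <= 0x7FFF);  /* otherwise 0x7FFF - high would underflow */
    /* paddw: bias by 0x8000, wrapping modulo 2^16 */
    uint16_t x = (uint16_t)((uint16_t)data + 0x8000u);
    /* paddusw: x is now data + 0xFFFF - high, pinned to 0xFFFF when data >= high */
    x = add_sat_u16(x, (uint16_t)(0x7FFFu - high));
    /* psubusw: removes the offset again, pinned to 0 when data < 0 */
    x = sub_sat_u16(x, (uint16_t)(0xFFFFu - high));
    return x;
}
```

With High = 1023 (10-bit), an input of 1 becomes 0x8001, then 0xFC01, then 1 again. An input of 1024 saturates to 0xFFFF in the second step and comes out as 1023, and an input of -1 lands below 0xFC00 after the second step and comes out as 0.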

(Note: the instructions above are for SSE2 but apply to MMX too. Also, I wrote this with a single pixel in mind; to take proper advantage of SIMD, some data shuffling is required. My intention here was to give the idea, not complete code for cut and paste.)
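One possible way to fill in the missing piece is sketched below with SSE2 intrinsics. It broadcasts each constant to all eight lanes with _mm_set1_epi16 and clips 8 pixels at once. The function name clip8_to_depth and its signature are my own assumptions, and a real decoder would loop this over a whole row or block.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: clip 8 signed 16-bit samples to [0, 2^bit_depth - 1]. */
static void clip8_to_depth(const int16_t *src, uint16_t *dst, unsigned bit_depth)
{
    assert(bit_depth >= 1 && bit_depth <= 15);  /* High must fit in 0x7FFF */
    const uint16_t high = (uint16_t)((1u << bit_depth) - 1u);

    /* Broadcast each constant to all eight 16-bit lanes. */
    const __m128i range    = _mm_set1_epi16((short)0x8000);
    const __m128i low      = _mm_set1_epi16((short)(0x7FFF - high));
    const __m128i max_high = _mm_set1_epi16((short)(0xFFFF - high));

    __m128i v = _mm_loadu_si128((const __m128i *)src);
    v = _mm_add_epi16(v, range);      /* paddw:   bias into unsigned range   */
    v = _mm_adds_epu16(v, low);       /* paddusw: saturate the top at 0xFFFF */
    v = _mm_subs_epu16(v, max_high);  /* psubusw: saturate the bottom at 0   */
    _mm_storeu_si128((__m128i *)dst, v);
}
```

Since the constants depend only on the bit depth, in practice they would be computed once per stream rather than on every call.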

