Many professional applications, such as studio and HD production, require bit depths above 8 bits. Of the 11 H.264 profiles, 7 support more than 8 bits per sample, starting with High 10, which supports 10 bits. High 4:4:4 Predictive and some related profiles support up to 14 bits. The conversion procedure is much the same in every case; only the specific values differ.

One more thing to keep in mind: the bit depth may differ between the luma and chroma components (Cb and Cr always share the same bit depth).

Here I describe how to convert an encoder/decoder to support more than 8 bits, using 14 bits as the specific example. For simplicity, I assume the luma and chroma components have the same bit depth, BitDepth = 14, so BitDepthY = BitDepthC = BitDepth.

Note: For standardization reasons, your codec must already support at least the Main profile before you start. I give each equation with its equation number from the standard, ITU-T Rec. H.264 (11/2007).

1) Pixel variables are generally declared as 'char'; the first step is to convert them to 'short'.

2) Change everything that handles pixels/samples to 'short': arrays, pointers, file reads, file writes, memcpy sizes, etc.
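A minimal sketch of steps 1 and 2, assuming a hypothetical 'pixel_t' typedef and a hypothetical helper 'copy_row' (neither name comes from any particular codec). I use an unsigned 16-bit type here; a plain 'short' also holds 14-bit samples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sample type: 16 bits hold any depth from 8 to 14 bits. */
typedef uint16_t pixel_t;

/* Copy one row of samples. The byte count must be scaled by
   sizeof(pixel_t); a size left over from the 'char' days copies
   only half of the row. */
static void copy_row(pixel_t *dst, const pixel_t *src, size_t width)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, width * sizeof(pixel_t));
}
```

Keeping the type behind one typedef means every array, pointer and copy size follows automatically if the type changes again.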

3) Change your 'clip' functions for pixels according to the bit depth of the luma and chroma components.

Clip1Y( x ) = Clip3( 0, ( 1 << BitDepthY ) – 1, x ) (5-3)

Clip1C( x ) = Clip3( 0, ( 1 << BitDepthC ) – 1, x ) (5-4)
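As a rough illustration, equations (5-3) and (5-4) can be written as one C function that takes the bit depth as a parameter. The names 'clip3' and 'clip1' are my own, and the 8 to 14 range check reflects the bit depths discussed in this post.

```c
#include <assert.h>

/* Clip3( lo, hi, x ): limit x to the range lo..hi. */
static int clip3(int lo, int hi, int x)
{
    return x < lo ? lo : (x > hi ? hi : x);
}

/* Clip1Y / Clip1C: pass BitDepthY or BitDepthC as bit_depth. */
static int clip1(int bit_depth, int x)
{
    assert(bit_depth >= 8 && bit_depth <= 14);
    return clip3(0, (1 << bit_depth) - 1, x);
}
```

With bit_depth equal to 8 this is the familiar 0..255 clip, so the same function serves 8-bit streams as well.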

4) The decoder now needs to know the bit depth of the samples, so it must read 'bit_depth_luma_minus8' and 'bit_depth_chroma_minus8' from the SPS. From these, derive 'BitDepthY' and 'QpBdOffsetY', and likewise for the chroma components.

BitDepthY = 8 + bit_depth_luma_minus8 (7-2)

QpBdOffsetY = 6 * bit_depth_luma_minus8 (7-3)

And

BitDepthC = 8 + bit_depth_chroma_minus8 (7-4)

QpBdOffsetC = 6 * ( bit_depth_chroma_minus8 + residual_colour_transform_flag ) (7-5)

On the encoder side, 'bit_depth_luma_minus8' and 'bit_depth_chroma_minus8' must be written to the SPS of the H.264 bitstream.
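The derivation in (7-2) to (7-5) might look like this in C. The struct and function names are hypothetical, the SPS fields are assumed to be parsed already, and the 0..6 range check is my assumption based on the 14-bit maximum mentioned above.

```c
#include <assert.h>

/* Hypothetical container for the SPS fields used here. */
typedef struct {
    int bit_depth_luma_minus8;
    int bit_depth_chroma_minus8;
    int residual_colour_transform_flag;
} SpsBitDepth;

/* Values derived once per sequence. */
typedef struct {
    int bit_depth_y, qp_bd_offset_y;
    int bit_depth_c, qp_bd_offset_c;
} DerivedBitDepth;

static DerivedBitDepth derive_bit_depth(const SpsBitDepth *sps)
{
    DerivedBitDepth d;
    /* Assumed range: 0..6, i.e. 8 to 14 bits. */
    assert(sps->bit_depth_luma_minus8 >= 0 && sps->bit_depth_luma_minus8 <= 6);
    assert(sps->bit_depth_chroma_minus8 >= 0 && sps->bit_depth_chroma_minus8 <= 6);
    d.bit_depth_y    = 8 + sps->bit_depth_luma_minus8;             /* (7-2) */
    d.qp_bd_offset_y = 6 * sps->bit_depth_luma_minus8;             /* (7-3) */
    d.bit_depth_c    = 8 + sps->bit_depth_chroma_minus8;           /* (7-4) */
    d.qp_bd_offset_c = 6 * (sps->bit_depth_chroma_minus8 +
                            sps->residual_colour_transform_flag);  /* (7-5) */
    return d;
}
```

For the 14-bit case used in this post, both offsets come out as 36.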

5) Since each sample now has BitDepthY bits for luma and BitDepthC bits for chroma, the PCM samples of I_PCM macroblocks must be read and written with those widths.

6) For intra prediction, the default DC prediction value (used when no neighbouring samples are available) changes with BitDepth; for 8 bits it is the familiar 128.

pred4x4L[ x, y ] = ( 1 << ( BitDepthY – 1 ) ) (8-52)

pred8x8L[ x, y ] = ( 1 << ( BitDepthY – 1 ) ) (8-96)

predL[ x, y ] = ( 1 << ( BitDepthY – 1 ) ), with x, y = 0..15 (8-123)

The same applies to the chroma components:

predC[ x + xO, y + yO ] = ( 1 << ( BitDepthC – 1 ) ), with x, y = 0..3. (8-139)

predC[ x + xO, y + yO ] = ( 1 << ( BitDepthC – 1 ) ), with x, y = 0..3. (8-142)

predC[ x + xO, y + yO ] = ( 1 << ( BitDepthC – 1 ) ), with x, y = 0..3. (8-145)
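All six equations above use the same mid-grey value, so a single helper covers luma and chroma. This is only a sketch; 'dc_default' and 'fill_dc_4x4' are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Default DC value: half of the sample range, 1 << ( BitDepth - 1 ). */
static int dc_default(int bit_depth)
{
    assert(bit_depth >= 8 && bit_depth <= 14);
    return 1 << (bit_depth - 1);
}

/* Fill a 4x4 prediction block with the default DC value. */
static void fill_dc_4x4(uint16_t pred[4][4], int bit_depth)
{
    uint16_t dc = (uint16_t)dc_default(bit_depth);
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            pred[y][x] = dc;
}
```

For 14 bits the fill value is 8192 instead of 128.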

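7) If you use weighted prediction, the explicit offsets must also be scaled. (In the standard's notation, the subscript C in o0C and o1C is replaced by L for luma and by Cb or Cr for chroma.)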

o0C = luma_offset_l0[ refIdxL0WP ] * ( 1 << ( BitDepthY – 8 ) ) (8-295)

o1C = luma_offset_l1[ refIdxL1WP ] * ( 1 << ( BitDepthY – 8 ) ) (8-296)

And for chroma components

o0C = chroma_offset_l0[ refIdxL0WP ][ iCbCr ] * ( 1 << ( BitDepthC – 8 ) ) (8-300)

o1C = chroma_offset_l1[ refIdxL1WP ][ iCbCr ] * ( 1 << ( BitDepthC – 8 ) ) (8-301)
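A small sketch of the offset scaling in (8-295) to (8-301), with a hypothetical function name. It multiplies rather than left-shifts the offset, because the offsets can be negative and shifting a negative value left is undefined in C.

```c
#include <assert.h>

/* Scale an explicit weighted-prediction offset to the sample bit depth.
   Pass BitDepthY for luma offsets and BitDepthC for chroma offsets. */
static int scale_wp_offset(int offset, int bit_depth)
{
    assert(bit_depth >= 8 && bit_depth <= 14);
    return offset * (1 << (bit_depth - 8));
}
```

At 8 bits the scale factor is 1, so existing 8-bit behaviour is unchanged.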

8) The change in sample bit depth has a large effect on quantization.

1. The range of 'pic_init_qp_minus26' is now -(26 + QpBdOffsetY) to +25, inclusive.

2. SliceQPY is in the range -QpBdOffsetY to +51, inclusive.

SliceQPY = 26 + pic_init_qp_minus26 + slice_qp_delta (7-28)

So with a bit depth of 14, QpBdOffsetY = 6 * 6 = 36 and SliceQPY is in the range -36 to +51.

3. 'mb_qp_delta' is in the range –( 26 + QpBdOffsetY / 2 ) to +( 25 + QpBdOffsetY / 2 ), inclusive.

The value of QPY is derived as

QPY = ( ( QPY,PREV + mb_qp_delta + 52 + 2 * QpBdOffsetY ) % ( 52 + QpBdOffsetY ) ) - QpBdOffsetY (7-35)

The working QP for the luma component is QP'Y, which is derived as

QP'Y = QPY + QpBdOffsetY (7-36)

Remember that the quantisation parameter QPY is always in the range –QpBdOffsetY to 51, inclusive, and QPC is always in the range –QpBdOffsetC to 51, inclusive.
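To see how the wrap-around in (7-35) behaves, here is a sketch with hypothetical function names. It assumes the inputs are already within the ranges given above, so the value passed to '%' is never negative.

```c
#include <assert.h>

/* (7-35): update QPY from the previous QPY and mb_qp_delta. */
static int next_qp_y(int qp_y_prev, int mb_qp_delta, int qp_bd_offset_y)
{
    int qp_y = ((qp_y_prev + mb_qp_delta + 52 + 2 * qp_bd_offset_y)
                % (52 + qp_bd_offset_y)) - qp_bd_offset_y;
    assert(qp_y >= -qp_bd_offset_y && qp_y <= 51);
    return qp_y;
}

/* (7-36): the working luma QP, which is never negative. */
static int qp_prime_y(int qp_y, int qp_bd_offset_y)
{
    return qp_y + qp_bd_offset_y;
}
```

With QpBdOffsetY = 36, a previous QPY of 51 and a delta of +1 wraps around to -36, and QP'Y for that value is 0.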

4. For the chroma quantization parameters, QPC is determined from the current value of QPY (NOT QP'Y) and the value of 'chroma_qp_index_offset' (for Cb) or 'second_chroma_qp_index_offset' (for Cr).

If the chroma component is the Cb component, qPOffset is

qPOffset = chroma_qp_index_offset (8-315)

Otherwise (the chroma component is the Cr component), qPOffset is

qPOffset = second_chroma_qp_index_offset (8-316)

The value of qPI for each chroma component is derived as

qPI = Clip3( –QpBdOffsetC, 51, QPY + qPOffset ) (8-317)

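QPC is then obtained from qPI using the chroma quantization table of the standard (for qPI below 30, QPC is simply equal to qPI):

QPC = Chroma Quantization table[qPI]

Finally, the value of QP'C for the chroma components is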

QP'C = QPC + QpBdOffsetC (8-318)

5. The variable qP used for quantization is QP'Y for luma components and QP'C for chroma components.
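The chroma derivation in (8-315) to (8-318) could be sketched as below. The function and table names are hypothetical, and the table values are the qPI to QPC mapping of the standard as I know it, so please check them against your copy of the specification.

```c
#include <assert.h>

/* QPC for qPI = 30..51; for qPI < 30, QPC is equal to qPI.
   Values to be verified against the standard's table. */
static const int chroma_qp_table[22] = {
    29, 30, 31, 32, 32, 33, 34, 34, 35, 35, 36,
    36, 37, 37, 37, 38, 38, 38, 39, 39, 39, 39
};

/* qp_y is QPY (not QP'Y); qp_offset is chroma_qp_index_offset for Cb
   or second_chroma_qp_index_offset for Cr. Returns QP'C. */
static int chroma_qp_prime(int qp_y, int qp_offset, int qp_bd_offset_c)
{
    int qp_i = qp_y + qp_offset;
    int qp_c;
    if (qp_i < -qp_bd_offset_c) qp_i = -qp_bd_offset_c;   /* (8-317) */
    if (qp_i > 51) qp_i = 51;
    qp_c = qp_i < 30 ? qp_i : chroma_qp_table[qp_i - 30];
    assert(qp_c + qp_bd_offset_c >= 0);
    return qp_c + qp_bd_offset_c;                         /* (8-318) */
}
```

Note that the clip lets qPI go negative, down to –QpBdOffsetC, so the table lookup must not be indexed directly with qPI.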

9) The bit depth also affects the deblocking process.

1. For the average quantization parameter qPav, qPp and qPq correspond to QPY when chromaEdgeFlag is equal to 0 (luma components) and to QPC when chromaEdgeFlag is equal to 1 (chroma components).

2. The threshold variables α and β are scaled as follows.

If chromaEdgeFlag is equal to 0,

α = α' * (1 << ( BitDepthY – 8 ) ) (8-466)

β = β' * (1 << ( BitDepthY – 8 ) ) (8-467)

Otherwise (chromaEdgeFlag is equal to 1),

α = α' * (1 << ( BitDepthC – 8 ) ) (8-468)

β = β' * (1 << ( BitDepthC – 8 ) ) (8-469)

3. The threshold variable tC0 is scaled as follows.

If chromaEdgeFlag is equal to 0,

tC0 = t'C0 * (1 << ( BitDepthY – 8 ) ) (8-476)

Otherwise (chromaEdgeFlag is equal to 1),

tC0 = t'C0 * (1 << ( BitDepthC – 8 ) ) (8-477)
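The deblocking thresholds all use the same scale factor, selected by chromaEdgeFlag. A sketch with hypothetical names, where the primed inputs are the 8-bit table values looked up as before:

```c
#include <assert.h>

typedef struct {
    int alpha;
    int beta;
    int tc0;
} DeblockThresholds;

/* Scale the 8-bit table values (alpha', beta', t'C0) to the bit depth
   of the edge being filtered, as in (8-466) to (8-469) and
   (8-476), (8-477). */
static DeblockThresholds scale_deblock_thresholds(int alpha_prime,
                                                  int beta_prime,
                                                  int tc0_prime,
                                                  int chroma_edge_flag,
                                                  int bit_depth_y,
                                                  int bit_depth_c)
{
    int bit_depth = chroma_edge_flag ? bit_depth_c : bit_depth_y;
    DeblockThresholds t;
    int scale;
    assert(bit_depth >= 8 && bit_depth <= 14);
    scale = 1 << (bit_depth - 8);
    t.alpha = alpha_prime * scale;
    t.beta  = beta_prime * scale;
    t.tc0   = tc0_prime * scale;
    return t;
}
```

For 14-bit luma the factor is 64, so a table value of 4 becomes a threshold of 256.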

We are now ready for professional applications with 14-bit support, offering higher quality together with the compression power of H.264.

Tip for the topic: Now that you have changed all pixel-related data types to 'short' to support more than 8 bits, check an input that has only 8 bits of depth. Does your code still work?

I guess you don't want two different code bases for 8 bits and for more than 8 bits. Think hard and think naturally... you definitely don't need two different code bases... ;)
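One possible way to keep a single code base, sketched under my own assumptions: widen 8-bit input to 16-bit samples at the file boundary, so everything after loading works on the same type. The function name is hypothetical, and the little-endian two-bytes-per-sample layout for depths above 8 is an assumption about the raw input file, not something the standard defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert raw file bytes into 16-bit samples.
   8-bit input: one byte per sample, widened.
   Above 8 bits: two bytes per sample, assumed little-endian. */
static void load_samples(uint16_t *dst, const uint8_t *raw,
                         size_t count, int bit_depth)
{
    assert(bit_depth >= 8 && bit_depth <= 14);
    for (size_t i = 0; i < count; i++) {
        if (bit_depth == 8)
            dst[i] = raw[i];
        else
            dst[i] = (uint16_t)(raw[2 * i] | (raw[2 * i + 1] << 8));
    }
}
```

The write path would do the reverse, narrowing back to one byte per sample when the bit depth is 8.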

really nice work..best of luck

Thank you for giving detailed information on higher bit-depth implementations.

Also, I think the precision required for quantized coefficients is more than 16 bits for 10-bit and 14-bit YUV inputs.

Excellent, and very helpful; this saved me days of thrashing about. Thank you very much.

EP, Massachusetts