Richardson I.E.H.264 and MPEG-4 video compression.2003.pdf

DESIGN AND PERFORMANCE


Figure 7.13 8 × 8 block after FDCT, quant, rescale, IDCT

7.2.4 Wavelet Transform

The DWT was chosen for MPEG-4 still texture coding because it can outperform block-based transforms for still image coding (although the Intra prediction and transform in H.264 perform well for still images). A number of algorithms have been proposed for the efficient coding and decoding of the DWT [23–25]. One issue related to software and hardware implementations of the DWT is that it requires substantially more memory than block transforms, since the transform operates on a complete image or a large section of an image (rather than a relatively small block of samples).

7.2.5 Quantise/Rescale

Scalar quantisation and rescaling (Chapter 3) can be implemented by division and/or multiplication by constant parameters (controlled by a quantisation parameter or quantiser step size). In general, multiplication is an expensive computation and some gains may be achieved by integrating the quantisation and rescaling multiplications with the forward and inverse transforms respectively. In H.264, the specification of the quantiser is combined with that of the transform in order to facilitate this combination (see Chapter 6).
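As a rough illustration of the division and multiplication described above, the following sketch quantises a single coefficient and rescales it again. The function names and the round-to-nearest rule are assumptions made for this example; they are not taken from MPEG-4 Visual or H.264.

```python
def quantise(coeff, qstep):
    """Quantise one coefficient by division by the step size.

    Rounding to the nearest level is an assumption of this sketch.
    """
    sign = -1 if coeff < 0 else 1
    return sign * ((abs(coeff) + qstep // 2) // qstep)


def rescale(level, qstep):
    """Rescale (inverse quantise) by multiplication by the step size."""
    return level * qstep


level = quantise(37, 8)          # 5
reconstructed = rescale(level, 8)  # 40, i.e. a quantisation error of 3
```

The difference between the input (37) and the rescaled value (40) is the quantisation error introduced by this step size.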

7.2.6 Entropy Coding

7.2.6.1 Variable-Length Encoding

In Chapter 3 we introduced the concept of entropy coding using variable-length codes (VLCs). In MPEG-4 Visual and H.264, the VLC required to encode each data symbol is defined by the standard. During encoding each data symbol is replaced by the appropriate VLC, determined by (a) the context (e.g. whether the data symbol is a header value, transform coefficient, motion vector component, etc.) and (b) the value of the data symbol. Chapter 3 presented some examples of pre-defined VLC tables from MPEG-4 Visual.

FUNCTIONAL DESIGN

Table 7.1 Variable-length encoding example

Input VLC              R (before output)     R (after output)
Value, V   Length, L   Value        Size     Value     Size    Output
-          -           -            0        -         0       -
101        3           101          3        101       3       -
11100      5           11100101     8        -         0       11100101
100        3           100          3        100       3       -
101        3           101100       6        101100    6       -
101        3           101101100    9        1         1       01101100
11100      5           111001       6        111001    6       -
1101       4           1101111001   10       11        2       01111001
. . . etc.

[Flowchart: New data symbol -> Select VLC table -> Look up value V and length L -> Pack L bits of V into output register R -> More than S bytes in R? If no: Finished data symbol. If yes: Write S least significant bytes to stream -> Right-shift R by S bytes.]

Figure 7.14 Variable length encoding flowchart

VLCs (by definition) contain variable numbers of bits but in many practical transport situations it is necessary to map a series of VLCs produced by the encoder to a stream of bytes or words. A mechanism for carrying this out is shown in Figure 7.14. An output register, R, collects encoded VLCs until enough data are present to write out one or more bytes to the stream. When a new data symbol is encoded, the value V of the VLC is concatenated with the previous contents of R (with the new VLC occupying the most significant bits). A count of the number of bits held in R is incremented by L (the length of the new VLC in bits). If R contains S or more bytes (where S is the number of bytes to be written to the stream at a time), the S least significant bytes of R are written to the stream and the contents of R are right-shifted by S bytes.

Example

A series of VLCs (from Table 3.12, Chapter 3) is encoded using the above method. S = 1, i.e. 1 byte is written to the stream at a time. Table 7.1 shows the variable-length encoding process at each stage, with each byte written to the stream listed in the Output column.
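The packing procedure can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `pack_vlcs` and the representation of each VLC as a `(value, length)` pair are assumptions of the sketch. It writes a word as soon as R holds at least S bytes, which is the behaviour shown in Table 7.1.

```python
def pack_vlcs(codes, s_bytes=1):
    """Pack (value, length) VLCs into S-byte words.

    Returns the list of output words and the leftover (value, size) of R.
    """
    word_bits = 8 * s_bytes
    reg, size = 0, 0              # output register R and its bit count
    out = []
    for value, length in codes:
        reg |= value << size      # new VLC occupies the most significant bits
        size += length
        while size >= word_bits:
            out.append(reg & ((1 << word_bits) - 1))  # S least significant bytes
            reg >>= word_bits     # right-shift R by S bytes
            size -= word_bits
    return out, (reg, size)


# The input VLCs listed in Table 7.1, as (value V, length L) pairs.
codes = [(0b101, 3), (0b11100, 5), (0b100, 3), (0b101, 3),
         (0b101, 3), (0b11100, 5), (0b1101, 4)]
words, leftover = pack_vlcs(codes)
print([format(w, "08b") for w in words])
# ['11100101', '01101100', '01111001']; R is left holding '11' (2 bits)
```

The three output bytes and the 2 bits left in R match the Output and R (after output) columns of Table 7.1.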

Figure 7.15 shows a basic architecture for carrying out the VLE process. A new data symbol and context indication (table selection) are passed to a look-up unit that returns the value V and length L of the codeword. A packer unit concatenates sequences of VLCs and outputs S bytes at a time (in a similar way to the above example).


[Block diagram: inputs 'data' and 'table select' -> Look-up table -> value V, length L -> Pack -> output: sequence of S-byte words]

Figure 7.15 Variable length encoding architecture

[Flowchart: Start decoding -> Select VLC table -> Read 1 bit -> VLC detected? Incomplete: read another bit. Valid: Return syntax element -> Finished decoding. Invalid: Return error indication.]

Figure 7.16 Flowchart for decoding one VLC

Issues to consider when designing a variable length encoder include computational efficiency and look-up table size. In software, VLE can be processor-intensive because of the large number of bit-level operations required to pack and shift the codes. Look-up table design can be problematic because of the large size and irregular structure of VLC tables. For example, the MPEG-4 Visual TCOEF table (see Chapter 3) is indexed by the three parameters Run (number of preceding zero coefficients), Level (nonzero coefficient level) and Last (final nonzero coefficient in a block). There are only 102 valid VLCs but over 16 000 valid combinations of Run, Level and Last, each corresponding to a VLC of up to 13 bits or a 20-bit ‘Escape’ code, and so this table may require a significant amount of storage. In the H.264 Variable Length Coding scheme, many symbols are represented by ‘universal’ Exp-Golomb codes that can be calculated from the data symbol value (avoiding the need for large VLC look-up tables) (see Chapter 6).
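To illustrate how a codeword can be calculated rather than looked up, the sketch below generates unsigned Exp-Golomb codewords using the usual construction: M zeros, a 1, then the M low-order bits of (code number + 1). The function name `exp_golomb` and the string-of-bits return type are choices made for this example; the mapping of each H.264 syntax element to a code number is not shown.

```python
def exp_golomb(code_num):
    """Return the unsigned Exp-Golomb codeword for code_num as a bit string."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    v = code_num + 1
    length = 2 * v.bit_length() - 1   # M zeros + '1' + M info bits
    return format(v, "0{}b".format(length))


for k in range(5):
    print(k, exp_golomb(k))
# 0 1
# 1 010
# 2 011
# 3 00100
# 4 00101
```

No table is stored: the value and length of each codeword follow directly from the code number.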

7.2.6.2 Variable-Length Decoding

Decoding VLCs involves ‘scanning’ or parsing a received bitstream for valid codewords, extracting these codewords and decoding the appropriate syntax elements. As with the encoding process, it is necessary for the decoder to know the current context in order to select the correct codeword table. Figure 7.16 illustrates a simple method of decoding one VLC. The decoder reads successive bits of the input bitstream until a valid VLC is detected (the usual case) or an invalid VLC is detected (i.e. a code that is not valid within the current context). For example, a code starting with nine or more zeros is not a valid VLC if the decoder is expecting an MPEG-4 Transform Coefficient. The decoder returns the appropriate syntax element if a valid VLC is found, or an error indication if an invalid VLC is detected.
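The bit-by-bit method of Figure 7.16 might be sketched as follows. The table `TOY_TABLE` is a made-up three-entry prefix code used only for illustration (it is not an MPEG-4 Visual or H.264 table), and `decode_one_vlc` is a hypothetical helper name.

```python
def decode_one_vlc(bits, table):
    """Read bits until a valid or invalid VLC is detected.

    bits  : iterator yielding "0" or "1"
    table : dict mapping codeword strings to syntax elements
    """
    code = ""
    for bit in bits:
        code += bit
        if code in table:                  # valid VLC detected
            return table[code]
        if not any(c.startswith(code) for c in table):
            raise ValueError("invalid VLC: " + code)   # error indication
        # otherwise the codeword is incomplete: read another bit
    raise ValueError("bitstream ended inside a codeword")


# Hypothetical prefix code for this example only.
TOY_TABLE = {"1": "A", "01": "B", "001": "C"}

stream = iter("001101")
symbols = [decode_one_vlc(stream, TOY_TABLE) for _ in range(3)]
print(symbols)   # ['C', 'A', 'B']
```

Searching the whole table after every bit is simple but slow, which is one reason practical decoders use other structures.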

VLC decoding can be computationally intensive, memory intensive or both. One method of implementing the decoder is as a Finite State Machine. The decoder starts at an initial state and moves through successive states based on the value of each bit. Eventually, the decoder reaches a state that corresponds to (a) a complete, valid VLC or (b) an invalid VLC. The