Richardson I.E.H.264 and MPEG-4 video compression.2003.pdf


TEXTURE CODING	•
	149

Rectangular Temporal Scalability (Section 5.5.2);

Rectangular Spatial Scalability (Section 5.5.1);

Object-based Spatial Scalability (Section 5.5.1).

5.5.6 The Fine Granular Scalability Profile

The FGS profile includes Simple and Advanced Simple objects plus the FGS object which includes these tools:

B-VOP, Interlace and Alternate Quantiser tools;

FGS Spatial Scalability;

FGS Temporal Scalability.

FGS ‘Spatial Scalability’ uses the encoding and decoding techniques described in Section 5.5.3 to encode each frame as a base layer and an FGS enhancement layer. FGS ‘Temporal Scalability’ combines FGS (Section 5.5.3) with temporal scalability (Section 5.5.2). An enhancement-layer frame is encoded using forward or bidirectional prediction from base layer frame(s) only. The DCT coefficients of the enhancement-layer frame are encoded in bitplanes using the FGS technique.

5.6 TEXTURE CODING

The applications targeted by the developers of MPEG-4 include scenarios where it is necessary to transmit still texture (i.e. still images). Whilst block transforms such as the DCT are widely considered to be the best practical solution for motion-compensated video coding, the Discrete Wavelet Transform (DWT) is particularly effective for coding still images (see Chapter 3) and MPEG-4 Visual uses the DWT as the basis for tools to compress still texture. Applications include the coding of rectangular texture objects (such as complete image frames), coding of arbitrary-shaped texture regions and coding of texture to be mapped onto animated 2D or 3D meshes (see Section 5.8).

The basic structure of a still texture encoder is shown in Figure 5.68. A 2D DWT is applied to the texture object, producing a DC component (low-frequency subband) and a number of AC (high-frequency) subbands (see Chapter 3). The DC subband is quantised, predictively encoded (using a form of DPCM) and entropy encoded using an arithmetic encoder. The AC subbands are quantised and reordered (‘scanned’), zero-tree encoded and entropy encoded.

Discrete Wavelet Transform

The DWT adopted for MPEG-4 Still Texture coding is the Daubechies (9,3)-tap biorthogonal filter [6]. This is essentially a matched pair of filters, one low pass (with nine filter coefficients or ‘taps’) and one high pass (with three filter taps).
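The split into one low-frequency subband and several high-frequency subbands can be illustrated with a one-level 2D transform. The sketch below is only an illustration: it substitutes the much simpler Haar filter pair for the Daubechies (9,3) filters (whose coefficients are not given here), and all function names are hypothetical. In a multi-level transform the LL output would be decomposed again, and the final LL band plays the role of the ‘DC’ subband.

```python
def haar_1d(samples):
    """One-level Haar analysis of an even-length sequence."""
    pairs = [(samples[i], samples[i + 1]) for i in range(0, len(samples), 2)]
    low = [(a + b) / 2 for a, b in pairs]    # low-pass: local average
    high = [(a - b) / 2 for a, b in pairs]   # high-pass: local difference
    return low, high


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def filter_rows(matrix):
    """Apply haar_1d to every row; returns (low halves, high halves)."""
    halves = [haar_1d(row) for row in matrix]
    return [low for low, _ in halves], [high for _, high in halves]


def haar_2d(image):
    """One-level 2D decomposition into LL, HL, LH and HH subbands.

    Assumes a 2D list with even width and height. HL here means
    high-pass along rows, low-pass along columns (a naming assumption).
    """
    row_low, row_high = filter_rows(image)
    # Columns are filtered by transposing, filtering rows, transposing back.
    ll, lh = (transpose(band) for band in filter_rows(transpose(row_low)))
    hl, hh = (transpose(band) for band in filter_rows(transpose(row_high)))
    return ll, hl, lh, hh


image = [[10, 10, 20, 20],
         [10, 10, 20, 20],
         [30, 30, 40, 40],
         [30, 30, 40, 40]]
ll, hl, lh, hh = haar_2d(image)
```

For this piecewise-flat test image all the detail ends up in the LL subband and the three high-frequency subbands are zero, which is the property that makes the subsequent zero-tree coding effective.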

Figure 5.67 Tools and objects for texture coding (tools: scalable still texture, scalable shape coding, texture error resilience, wavelet tiling; objects: Scalable Texture, Advanced Scalable Texture)

Figure 5.68 Wavelet still texture encoder block diagram (still texture, DWT; DC subband: quantisation, predictive coding; AC subbands: quantisation and scanning, zero-tree coding; both paths feed the arithmetic encoder, which outputs the coded bitstream)

Quantisation

The DC subband is quantised using a scalar quantiser (see Chapter 3). The AC subbands may be quantised in one of three ways:

1. Scalar quantisation using a single quantiser (‘mode 1’), prior to reordering and zero-tree encoding.

2. ‘Bilevel’ quantisation (‘mode 3’) after reordering. The reordered coefficients are coded one bitplane at a time (see Section 5.5.3 for a discussion of bitplanes) using zero-tree encoding. The coded bitstream can be truncated at any point to provide highly scalable decoding (in a similar way to FGS, see previous section).

3. ‘Multilevel’ quantisation (‘mode 2’) prior to reordering and zero-tree encoding. A series of quantisers is applied, from coarse to fine, with the output of each quantiser forming a series of layers (a type of scalable coding).
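The effect of truncating a bitplane-coded stream (the ‘bilevel’ mode) can be sketched as follows. This is a minimal illustration under stated assumptions: only coefficient magnitudes are handled (signs and the zero-tree and arithmetic coding stages are left out), and the function names are hypothetical.

```python
def to_bitplanes(magnitudes, num_planes):
    """Split non-negative integers into bitplanes, most significant first."""
    return [[(m >> plane) & 1 for m in magnitudes]
            for plane in range(num_planes - 1, -1, -1)]


def from_bitplanes(planes, num_planes, count):
    """Rebuild magnitudes from the leading bitplanes that were received.

    Bitplanes that were truncated away are treated as zero.
    """
    values = [0] * count
    for index, plane in enumerate(planes):
        shift = num_planes - 1 - index
        values = [v | (bit << shift) for v, bit in zip(values, plane)]
    return values


magnitudes = [13, 6, 1, 0]
planes = to_bitplanes(magnitudes, 4)
full = from_bitplanes(planes, 4, len(magnitudes))        # all four planes
coarse = from_bitplanes(planes[:2], 4, len(magnitudes))  # truncated stream
```

Decoding only the first two planes yields a coarser approximation of every coefficient; each further plane halves the maximum error.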

Reordering

The coefficients of the AC subbands are scanned or reordered in one of two ways:

1. Tree-order. A ‘parent’ coefficient in the lowest subband is coded first, followed by its ‘child’ coefficients in the next higher subband, and so on. This enables the EZW coding (see below) to exploit the correlation between parent and child coefficients. The first three trees to be coded in a set of coefficients are shown in Figure 5.69.

2. Band-by-band order. All the coefficients in the first AC subband are coded, followed by all the coefficients in the next subband, and so on (Figure 5.70). This scanning method tends to reduce coding efficiency but has the advantage that it supports a form of spatial scalability since a decoder can extract a reduced-resolution image by decoding a limited number of subbands.

Figure 5.69 Tree-order scanning (1st, 2nd and 3rd trees)

Figure 5.70 Band-by-band scanning (1st, 2nd and 3rd bands)
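The two scanning orders can be compared with a small sketch. It makes several assumptions that are not taken from the text: only one subband orientation is modelled, the subbands are square 2D lists whose side doubles from one level to the next, each coefficient at (i, j) has four children at (2i, 2j) to (2i+1, 2j+1) in the next finer subband (the usual EZW parent-child relationship), and the coefficients within one tree are visited level by level. The function names are hypothetical.

```python
def tree_order(subbands):
    """Scan one tree at a time, starting from each root in the coarsest subband."""
    order = []
    roots = subbands[0]
    for r in range(len(roots)):
        for c in range(len(roots[0])):
            positions = [(r, c)]
            for band in subbands:
                order.extend(band[i][j] for i, j in positions)
                # Move down one level: four children per coefficient.
                positions = [(2 * i + di, 2 * j + dj)
                             for i, j in positions
                             for di in (0, 1) for dj in (0, 1)]
    return order


def band_order(subbands):
    """Scan every coefficient of one subband before moving to the next."""
    return [value for band in subbands for row in band for value in row]


coarse = [[1, 2],
          [3, 4]]
fine = [[11, 12, 21, 22],
        [13, 14, 23, 24],
        [31, 32, 41, 42],
        [33, 34, 43, 44]]
tree_scan = tree_order([coarse, fine])
band_scan = band_order([coarse, fine])
```

In the tree scan each root is followed immediately by its own children (1, 11, 12, 13, 14, then 2, 21, ...), whereas the band scan emits the whole coarse subband first, so a decoder could stop after it and still reconstruct a reduced-resolution image.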

DC Subband Coding

The coefficients in the DC subband are encoded using DPCM. Each coefficient is spatially predicted from neighbouring, previously-encoded coefficients.
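A minimal sketch of spatial DPCM on a quantised DC subband follows. The predictor used here (left neighbour, otherwise the coefficient above, otherwise zero) is an assumption chosen for simplicity; the text does not specify how MPEG-4 selects its predictor, and the function names are hypothetical.

```python
def predict(band, r, c):
    """Predict a coefficient from already-processed neighbours (assumed rule)."""
    if c > 0:
        return band[r][c - 1]   # left neighbour
    if r > 0:
        return band[r - 1][c]   # neighbour above (first column only)
    return 0                    # very first coefficient


def dpcm_encode(band):
    """Replace each coefficient by its prediction residual."""
    return [[value - predict(band, r, c) for c, value in enumerate(row)]
            for r, row in enumerate(band)]


def dpcm_decode(residuals):
    """Rebuild the band in raster order, reusing the same predictor."""
    band = []
    for r, row in enumerate(residuals):
        band.append([])
        for c, residual in enumerate(row):
            band[r].append(residual + predict(band, r, c))
    return band


dc_band = [[100, 102, 101],
           [98, 99, 103]]
residuals = dpcm_encode(dc_band)
restored = dpcm_decode(residuals)
```

Because neighbouring DC coefficients tend to be similar, most residuals are small numbers, which the entropy coder can represent with fewer bits than the original values.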

Table 5.9 Zero-tree coding symbols

Symbol	Meaning
ZeroTree Root (ZTR)	The current coefficient and all subsequent coefficients in the tree (or band) are zero. No further data is coded for this tree (or band).
Value + ZeroTree Root (VZTR)	The current coefficient is nonzero but all subsequent coefficients are zero. No further data is coded for this tree/band.
Value (VAL)	The current coefficient is nonzero and one or more subsequent coefficients are nonzero. Further data must be coded.
Isolated Zero (IZ)	The current coefficient is zero but one or more subsequent coefficients are nonzero. Further data must be coded.

AC Subband Coding

Coding of coefficients in the AC subbands is based on EZW (Embedded Zerotree Wavelet coding). The coefficients of each tree (or each subband if band-by-band scanning is used) are encoded starting with the first coefficient (the ‘root’ of the tree if tree-order scanning is used) and each coefficient is coded as one of the four symbols listed in Table 5.9.
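The choice between the four symbols of Table 5.9 can be sketched for a single scanned sequence of coefficients. This is a simplification: ‘subsequent coefficients’ is taken to mean the remainder of a flat list (the band-by-band case), whereas in tree-order scanning it would refer to the descendants of the current coefficient. The function name and the tuple output format are hypothetical.

```python
def zerotree_symbols(coefficients):
    """Map a scanned coefficient sequence to (symbol, value) pairs.

    The value is None for the two zero symbols. Coding stops after ZTR or
    VZTR because everything that follows is known to be zero.
    """
    symbols = []
    for index, value in enumerate(coefficients):
        rest_is_zero = all(v == 0 for v in coefficients[index + 1:])
        if value == 0:
            symbols.append(("ZTR" if rest_is_zero else "IZ", None))
        else:
            symbols.append(("VZTR" if rest_is_zero else "VAL", value))
        if rest_is_zero:
            break
    return symbols


mixed = zerotree_symbols([5, 0, -3, 0, 0])
empty = zerotree_symbols([0, 0, 0])
```

A sequence that is entirely zero costs a single ZTR symbol, which is where most of the compression comes from in sparse high-frequency subbands.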

Entropy Coding

The symbols produced by the DC and AC subband encoding processes are entropy coded using a context-based arithmetic encoder. Arithmetic coding is described in Chapter 3 and the principle of context-based arithmetic coding is discussed in Section 5.4.1 and Chapter 6.

5.6.1 The Scalable Texture Profile

The Scalable Texture Profile contains just one object which in turn contains one tool, Scalable Texture. This tool supports the coding process described in the preceding section, for rectangular video objects only. By selecting the scanning mode and quantiser method it is possible to achieve several types of scalable coding.

(a) Single quantiser, tree-ordered scanning: no scalability.

(b) Band-by-band scanning: spatial scalability (by decoding a subset of the bands).

(c) Bilevel quantiser: bitplane-based scalability, similar to FGS.

(d) Multilevel quantiser: ‘quality’ scalability, with one layer per quantiser.
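The ‘quality’ scalability of option (d) can be illustrated with a successive-refinement sketch in which each quantiser codes the residual left by the coarser ones. This is one plausible reading of ‘a series of quantisers from coarse to fine’, not a description of the exact MPEG-4 procedure; the step sizes, the truncating quantiser and the function names are all assumptions.

```python
def multilevel_quantise(value, step_sizes):
    """Return one quantiser index per layer, coarse to fine."""
    indices = []
    residual = value
    for step in step_sizes:
        index = int(residual / step)   # truncate towards zero (assumed rule)
        indices.append(index)
        residual -= index * step       # the next layer refines what is left
    return indices


def reconstruct(indices, step_sizes):
    """Rebuild the value from however many layers were received."""
    return sum(index * step for index, step in zip(indices, step_sizes))


steps = [16, 4, 1]
layers = multilevel_quantise(37, steps)
base_only = reconstruct(layers[:1], steps)
two_layers = reconstruct(layers[:2], steps)
all_layers = reconstruct(layers, steps)
```

Each additional layer that the decoder receives moves the reconstruction closer to the original coefficient (32, then 36, then 37 in this example).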

5.6.2 The Advanced Scalable Texture Profile

The Advanced Scalable Texture profile contains the Advanced Scalable Texture object which adds extra tools to Scalable Texture. Wavelet tiling enables an image to be divided into several nonoverlapping sub-images or ‘tiles’, each coded using the wavelet texture coding process described above. This tool is particularly useful for CODECs with limited memory, since the wavelet transform and other processing steps can be applied to a subset of the image at a time. The shape coding tool adds object-based capabilities to the still texture coding process by adapting the DWT to deal with arbitrary-shaped texture objects. Using the error resilience tool, the coded texture is partitioned into packets (‘texture packets’). The bitstream is processed in Texture Units (TUs), each containing a DC subband, a complete coded tree structure (tree-order scanning) or a complete coded subband (band-by-band scanning). A texture packet contains one or more coded TUs. This packetising approach helps to minimise the effect of a transmission error by localising it to one decoded TU.

Figure 5.71 Tools and objects for studio coding (tools: I-VOP, Studio Slice, Studio DPCM, Studio Binary and Gray Shape, Interlace, P-VOP, Frame/Field, Studio Sprite; objects: Simple Studio, Core Studio)
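The grouping of coded Texture Units into texture packets, as used by the error resilience tool of the Advanced Scalable Texture profile, can be sketched as a simple greedy packer. The target packet size and the greedy rule are assumptions made for illustration; the text only states that a packet holds one or more complete TUs. The function name is hypothetical.

```python
def packetise(texture_units, target_size):
    """Group coded TUs (bytes objects) into packets without splitting a TU.

    A new packet is started when adding the next TU would exceed
    target_size; a single oversized TU still gets a packet of its own.
    """
    packets = []
    current, current_size = [], 0
    for unit in texture_units:
        if current and current_size + len(unit) > target_size:
            packets.append(current)
            current, current_size = [], 0
        current.append(unit)
        current_size += len(unit)
    if current:
        packets.append(current)
    return packets


units = [b"aaaa", b"bb", b"cccccc", b"d"]
packets = packetise(units, target_size=6)
```

Because packet boundaries always coincide with TU boundaries, a decoder that loses or corrupts one packet can resume at the next one without having to resynchronise in the middle of a tree or subband.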

5.7 CODING STUDIO-QUALITY VIDEO

Before broadcasting digital video to the consumer it is necessary to code (or transcode) the material into a compressed format. In order to maximise the quality of the video delivered to the consumer it is important to maintain high quality during capture, editing and distribution between studios. The Simple Studio and Core Studio profiles of MPEG-4 Visual are designed to support coding of video at a very high quality for the studio environment. Important considerations include maintaining high fidelity (with near-lossless or lossless coding), support for 4:4:4 and 4:2:2 colour sampling formats and ease of transcoding (conversion) to/from legacy formats such as MPEG-2.

5.7.1 The Simple Studio Profile

The Simple Studio object is intended for use in the capture, storage and editing of high quality video. It supports only I-VOPs (i.e. no temporal prediction) and the coding process is modified in a number of ways.

Source format: The Simple Studio profile supports coding of video sampled in 4:2:0, 4:2:2 and 4:4:4 YCbCr formats (see Chapter 2 for details of these sampling modes) with progressive or interlaced scanning. The modified macroblock structures for 4:2:2 and 4:4:4 video are shown in Figure 5.72.

Figure 5.72 Modified macroblock structures (4:2:2 and 4:4:4 video). 4:2:2 macroblock structure (8 blocks): Y blocks 0-3, Cb blocks 4 and 6, Cr blocks 5 and 7. 4:4:4 macroblock structure (12 blocks): Y blocks 0-3, Cb blocks 4, 6, 8 and 10, Cr blocks 5, 7, 9 and 11.

Figure 5.73 Example slice structure
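The block counts of these macroblock structures follow directly from the chroma sampling format. The short sketch below assumes the usual 16 × 16 luma macroblock divided into 8 × 8 blocks; the names are hypothetical.

```python
# Number of 8x8 blocks per chroma component (Cb or Cr) in one macroblock.
CHROMA_BLOCKS = {
    "4:2:0": 1,   # chroma subsampled by 2 horizontally and vertically
    "4:2:2": 2,   # chroma subsampled by 2 horizontally only
    "4:4:4": 4,   # chroma at full resolution
}


def blocks_per_macroblock(sampling):
    """Four luma blocks plus the Cb and Cr blocks for the given format."""
    return 4 + 2 * CHROMA_BLOCKS[sampling]


counts = {fmt: blocks_per_macroblock(fmt) for fmt in CHROMA_BLOCKS}
```

This gives 6, 8 and 12 blocks respectively, matching the 8-block and 12-block structures of Figure 5.72.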

Transform and quantisation: The precision of the DCT and IDCT is extended by three fractional bits. Together with modifications to the forward and inverse quantisation processes, this enables fully lossless DCT-based encoding and decoding. In some cases, lossless DCT coding of intra data may result in a coded frame that is larger than the original and for this reason the encoder may optionally use DPCM to code the frame data instead of the DCT (see Chapter 3).

Shape coding: Binary shape information is coded using PCM rather than arithmetic coding (in order to simplify the encoding and decoding process). Alpha (grey) shape may be coded with an extended resolution of up to 12 bits.

Slices: Coded data are arranged in slices in a similar way to MPEG-2 coded video [7]. Each slice includes a start code and a series of coded macroblocks and the slices are arranged in raster order to cover the coded picture (see for example Figure 5.73). This structure is adopted to simplify transcoding to/from an MPEG-2 coded representation.

VOL headers: Additional data fields are added to the VOL header, mimicking those in an MPEG-2 picture header in order to simplify MPEG-2 transcoding.

