H.264 is a standard for video compression. It is also known as MPEG-4 Part 10, or MPEG-4 AVC (for Advanced Video Coding). It is one of the latest block-oriented motion-estimation-based codecs developed by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG) as the product of a partnership effort known as the Joint Video Team (JVT). The ITU-T H.264 standard and the ISO/IEC MPEG-4 Part 10 standard (formally, ISO/IEC 14496-10) are jointly maintained so that they have identical technical content. The final drafting work on the first version of the standard was completed in May 2003.
The intent of the H.264/AVC project was to create a standard capable of providing good video quality at substantially lower bit rates than previous standards (e.g. half or less the bit rate of MPEG-2, H.263, or MPEG-4 Part 2), without increasing the complexity of design so much that it would be impractical or excessively expensive to implement. An additional goal was to provide enough flexibility to allow the standard to be applied to a wide variety of applications on a wide variety of networks and systems, including low and high bit rates, low and high resolution video, broadcast, DVD storage, RTP/IP packet networks, and ITU-T multimedia telephony systems.
The standardization of the first version of H.264/AVC was completed in May 2003. The JVT then developed extensions to the original standard that are known as the Fidelity Range Extensions (FRExt). These extensions enable higher quality video coding by supporting increased sample bit depth precision and higher-resolution color information, including sampling structures known as YUV 4:2:2 and YUV 4:4:4. Several other features are also included in the Fidelity Range Extensions project, such as adaptive switching between 4×4 and 8×8 integer transforms, encoder-specified perceptual-based quantization weighting matrices, efficient inter-picture lossless coding, and support of additional color spaces. The design work on the Fidelity Range Extensions was completed in July 2004, and the drafting work on them was completed in September 2004.
Further recent extensions of the standard have included adding five new profiles intended primarily for professional applications, adding extended-gamut color space support, defining additional aspect ratio indicators, defining two additional types of "supplemental enhancement information" (post-filter hint and tone mapping), and deprecating one of the prior FRExt profiles that industry feedback indicated should have been designed differently.
Scalable Video Coding as specified in Annex G of H.264/AVC allows the construction of bitstreams that contain sub-bitstreams that conform to H.264/AVC. For temporal bitstream scalability, i.e., the presence of a sub-bitstream with a smaller temporal sampling rate than the bitstream, complete access units are removed from the bitstream when deriving the sub-bitstream. In this case, high-level syntax and inter prediction reference pictures in the bitstream are constructed accordingly. For spatial and quality bitstream scalability, i.e. the presence of a sub-bitstream with lower spatial resolution or quality than the bitstream, NAL units are removed from the bitstream when deriving the sub-bitstream. In this case, inter-layer prediction, i.e., the prediction of the higher spatial resolution or quality signal by data of the lower spatial resolution or quality signal, is typically used for efficient coding. The Scalable Video Coding extension was completed in November 2007.
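Temporal scalability as described above can be illustrated with a small model. The sketch below treats a bitstream as a list of access units, each tagged with a temporal layer, and derives a sub-bitstream by dropping whole access units above a chosen layer. The `AccessUnit` class, the helper names, and the dyadic layer assignment are simplified stand-ins chosen for illustration; they are not syntax from Annex G, and a real extractor works on NAL units and their headers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessUnit:
    """Hypothetical stand-in for one coded picture (one access unit)."""
    frame_index: int   # position in output order
    temporal_id: int   # 0 = base temporal layer, higher = enhancement layers


def dyadic_temporal_id(frame_index: int, num_layers: int) -> int:
    """Assign layers in a dyadic hierarchy (an assumption for this sketch).

    Layer 0 holds every 2**(num_layers - 1)-th frame; each further layer
    doubles the frame rate.
    """
    for layer in range(num_layers):
        step = 1 << (num_layers - 1 - layer)
        if frame_index % step == 0:
            return layer
    raise ValueError("num_layers must be at least 1")


def extract_temporal_substream(units, max_temporal_id):
    """Remove complete access units above the requested temporal layer."""
    return [u for u in units if u.temporal_id <= max_temporal_id]


# Eight frames coded with three temporal layers.
stream = [AccessUnit(i, dyadic_temporal_id(i, 3)) for i in range(8)]
half_rate = extract_temporal_substream(stream, 1)     # frames 0, 2, 4, 6
quarter_rate = extract_temporal_substream(stream, 0)  # frames 0, 4
```

The extraction only yields a decodable result if pictures in lower layers never reference pictures in higher layers, which is what the text means by the reference pictures being "constructed accordingly"; this sketch does not model that constraint.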
The H.264 name follows the ITU-T naming convention, where the standard is a member of the H.26x line of VCEG video coding standards; the MPEG-4 AVC name relates to the naming convention in ISO/IEC MPEG, where the standard is part 10 of ISO/IEC 14496, which is the suite of standards known as MPEG-4. The standard was developed jointly in a partnership of VCEG and MPEG, after earlier development work in the ITU-T as a VCEG project called H.26L. It is thus common to refer to the standard with names such as H.264/AVC, AVC/H.264, H.264/MPEG-4 AVC, or MPEG-4/H.264 AVC, to emphasize the common heritage. The name H.26L, referring to its ITU-T history, is less common, but still used. Occasionally, it is also referred to as "the JVT codec", in reference to the Joint Video Team (JVT) organization that developed it. (Such partnership and multiple naming is not uncommon—for example, the video codec standard known as MPEG-2 also arose from the partnership between MPEG and the ITU-T, where MPEG-2 video is known to the ITU-T community as H.262.[1])
H.264/AVC/MPEG-4 Part 10 contains a number of new features that allow it to compress video much more effectively than older standards and to provide more flexibility for application to a wide variety of network environments. In particular, some such key features include:
These techniques, along with several others, help H.264 perform significantly better than any prior standard under a wide variety of circumstances and application environments. H.264 can often perform radically better than MPEG-2 video, typically obtaining the same quality at half the bit rate or less, especially at high bit rates and high resolutions.[2]
Like other ISO/IEC MPEG video standards, H.264/AVC has a reference software implementation that can be freely downloaded.[3] Its main purpose is to give examples of H.264/AVC features, rather than to be a useful application per se. Some reference hardware design work is also under way in the Moving Picture Experts Group. The features mentioned above are the complete feature set of H.264/AVC, covering all of its profiles. A profile for a codec is a set of features of that codec identified to meet a certain set of specifications of intended applications. This means that many of the features listed are not supported in some profiles. The various profiles of H.264/AVC are discussed in the next section.
The standard includes the following seven sets of capabilities, which are referred to as profiles, targeting specific classes of applications:
In addition, the standard contains four additional all-Intra profiles, which are defined as simple subsets of other corresponding profiles. These are mostly for professional (e.g., camera and editing system) applications:
As a result of the Scalable Video Coding extension, the standard contains three additional scalable profiles, which are defined as a combination of the H.264/AVC profile for the base layer (2nd word in scalable profile name) and tools that achieve the scalable extension:
| Feature | Baseline | Extended | Main | High | High 10 | High 4:2:2 | High 4:4:4 Predictive |
|---|---|---|---|---|---|---|---|
| I and P Slices | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| B Slices | No | Yes | Yes | Yes | Yes | Yes | Yes |
| SI and SP Slices | No | Yes | No | No | No | No | No |
| Multiple Reference Frames | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| In-Loop Deblocking Filter | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| CAVLC Entropy Coding | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| CABAC Entropy Coding | No | No | Yes | Yes | Yes | Yes | Yes |
| Flexible Macroblock Ordering (FMO) | Yes | Yes | No | No | No | No | No |
| Arbitrary Slice Ordering (ASO) | Yes | Yes | No | No | No | No | No |
| Redundant Slices (RS) | Yes | Yes | No | No | No | No | No |
| Data Partitioning | No | Yes | No | No | No | No | No |
| Interlaced Coding (PicAFF, MBAFF) | No | Yes | Yes | Yes | Yes | Yes | Yes |
| 4:2:0 Chroma Format | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Monochrome Video Format (4:0:0) | No | No | No | Yes | Yes | Yes | Yes |
| 4:2:2 Chroma Format | No | No | No | No | No | Yes | Yes |
| 4:4:4 Chroma Format | No | No | No | No | No | No | Yes |
| 8 Bit Sample Depth | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 9 and 10 Bit Sample Depth | No | No | No | No | Yes | Yes | Yes |
| 11 to 14 Bit Sample Depth | No | No | No | No | No | No | Yes |
| 8x8 vs. 4x4 Transform Adaptivity | No | No | No | Yes | Yes | Yes | Yes |
| Quantization Scaling Matrices | No | No | No | Yes | Yes | Yes | Yes |
| Separate Cb and Cr QP control | No | No | No | Yes | Yes | Yes | Yes |
| Separate Color Plane Coding | No | No | No | No | No | No | Yes |
| Predictive Lossless Coding | No | No | No | No | No | No | Yes |
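The profile table lends itself to a simple lookup: given the coding tools a stream needs, which profiles allow all of them? The sketch below transcribes a handful of rows from the table into a dictionary. The names `FEATURES`, `row`, and `profiles_supporting` are illustrative helpers, not part of any H.264 library, and only six of the table's rows are included.

```python
PROFILES = [
    "Baseline", "Extended", "Main", "High",
    "High 10", "High 4:2:2", "High 4:4:4 Predictive",
]


def row(cells: str) -> dict:
    """Turn a string such as 'NYYYYYY' into {profile: bool}, in table column order."""
    return dict(zip(PROFILES, (c == "Y" for c in cells)))


# A subset of the rows, copied from the profile table (Y = Yes, N = No).
FEATURES = {
    "B Slices": row("NYYYYYY"),
    "CABAC Entropy Coding": row("NNYYYYY"),
    "Flexible Macroblock Ordering (FMO)": row("YYNNNNN"),
    "Interlaced Coding (PicAFF, MBAFF)": row("NYYYYYY"),
    "8x8 vs. 4x4 Transform Adaptivity": row("NNNYYYY"),
    "4:2:2 Chroma Format": row("NNNNNYY"),
}


def profiles_supporting(required):
    """Return the profiles in which every required feature is marked Yes."""
    return [p for p in PROFILES if all(FEATURES[f][p] for f in required)]


with_b_and_cabac = profiles_supporting(["B Slices", "CABAC Entropy Coding"])
with_fmo_and_b = profiles_supporting(
    ["Flexible Macroblock Ordering (FMO)", "B Slices"]
)
with_fmo_and_cabac = profiles_supporting(
    ["Flexible Macroblock Ordering (FMO)", "CABAC Entropy Coding"]
)
```

With the rows above, B slices plus CABAC leaves Main and the four High-family profiles, FMO plus B slices leaves only Extended, and FMO plus CABAC leaves no profile at all, since no column in the table marks both as supported.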
| Level number | Max macro blocks per second | Max frame size (macro blocks) | Max video bit rate (VCL) for Baseline, Extended and Main Profiles | Max video bit rate (VCL) for High Profile | Max video bit rate (VCL) for High 10 Profile | Max video bit rate (VCL) for High 4:2:2 and High 4:4:4 Predictive Profiles | Examples for high resolution @ frame rate (max stored frames) in Level |
|---|---|---|---|---|---|---|---|
| 1 | 1485 | 99 | 64 kbit/s | 80 kbit/s | 192 kbit/s | 256 kbit/s | 128x96@30.9 (8) 176x144@15.0 (4) |
| 1b | 1485 | 99 | 128 kbit/s | 160 kbit/s | 384 kbit/s | 512 kbit/s | 128x96@30.9 (8) 176x144@15.0 (4) |
| 1.1 | 3000 | 396 | 192 kbit/s | 240 kbit/s | 576 kbit/s | 768 kbit/s | 176x144@30.3 (9) 320x240@10.0 (3) 352x288@7.5 (2) |
| 1.2 | 6000 | 396 | 384 kbit/s | 480 kbit/s | 1152 kbit/s | 1536 kbit/s | 320x240@20.0 (7) 352x288@15.2 (6) |
| 1.3 | 11880 | 396 | 768 kbit/s | 960 kbit/s | 2304 kbit/s | 3072 kbit/s | 320x240@36.0 (7) 352x288@30.0 (6) |
| 2 | 11880 | 396 | 2 Mbit/s | 2.5 Mbit/s | 6 Mbit/s | 8 Mbit/s | 320x240@36.0 (7) 352x288@30.0 (6) |
| 2.1 | 19800 | 792 | 4 Mbit/s | 5 Mbit/s | 12 Mbit/s | 16 Mbit/s | 352x480@30.0 (7) 352x576@25.0 (6) |
| 2.2 | 20250 | 1620 | 4 Mbit/s | 5 Mbit/s | 12 Mbit/s | 16 Mbit/s | 352x480@30.7 (10) 352x576@25.6 (7) 720x480@15.0 (6) 720x576@12.5 (5) |
| 3 | 40500 | 1620 | 10 Mbit/s | 12.5 Mbit/s | 30 Mbit/s | 40 Mbit/s | 352x480@61.4 (12) 352x576@51.1 (10) 720x480@30.0 (6) 720x576@25.0 (5) |
| 3.1 | 108000 | 3600 | 14 Mbit/s | 17.5 Mbit/s | 42 Mbit/s | 56 Mbit/s | 720x480@80.0 (13) 720x576@66.7 (11) 1280x720@30.0 (5) |
| 3.2 | 216000 | 5120 | 20 Mbit/s | 25 Mbit/s | 60 Mbit/s | 80 Mbit/s | 1280x720@60.0 (5) 1280x1024@42.2 (4) |
| 4 | 245760 | 8192 | 20 Mbit/s | 25 Mbit/s | 60 Mbit/s | 80 Mbit/s | 1280x720@68.3 (9) 1920x1080@30.1 (4) 2048x1024@30.0 (4) |
| 4.1 | 245760 | 8192 | 50 Mbit/s | 50 Mbit/s | 150 Mbit/s | 200 Mbit/s | 1280x720@68.3 (9) 1920x1080@30.1 (4) 2048x1024@30.0 (4) |
| 4.2 | 522240 | 8704 | 50 Mbit/s | 50 Mbit/s | 150 Mbit/s | 200 Mbit/s | 1920x1080@64.0 (4) 2048x1080@60.0 (4) |
| 5 | 589824 | 22080 | 135 Mbit/s | 168.75 Mbit/s | 405 Mbit/s | 540 Mbit/s | 1920x1080@72.3 (13) 2048x1024@72.0 (13) 2048x1080@67.8 (12) 2560x1920@30.7 (5) 3680x1536@26.7 (5) |
| 5.1 | 983040 | 36864 | 240 Mbit/s | 300 Mbit/s | 720 Mbit/s | 960 Mbit/s | 1920x1080@120.5 (16) 4096x2048@30.0 (5) 4096x2304@26.7 (5) |
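The example frame rates in the last column of the level table follow from the two macroblock limits. The sketch below reproduces that arithmetic for a few levels, assuming 16x16 macroblocks and rounding picture dimensions up to whole macroblocks. `LEVEL_LIMITS` holds only a subset of the table, the function names are illustrative, and the other level constraints (bit rate, stored frames, and any limits not shown in the table) are deliberately ignored.

```python
MB_SIZE = 16  # a macroblock covers 16x16 luma samples

# level: (max macroblocks per second, max frame size in macroblocks),
# copied from the level table above (subset only).
LEVEL_LIMITS = {
    "1": (1485, 99),
    "3": (40500, 1620),
    "3.1": (108000, 3600),
    "4": (245760, 8192),
    "5.1": (983040, 36864),
}


def macroblocks_per_frame(width: int, height: int) -> int:
    """Number of macroblocks needed to cover the picture, rounding up."""
    mbs_wide = (width + MB_SIZE - 1) // MB_SIZE
    mbs_high = (height + MB_SIZE - 1) // MB_SIZE
    return mbs_wide * mbs_high


def max_frame_rate(level: str, width: int, height: int):
    """Highest frame rate the two macroblock limits allow, or None if
    the picture exceeds the level's maximum frame size."""
    max_mbs_per_second, max_frame_size = LEVEL_LIMITS[level]
    mbs = macroblocks_per_frame(width, height)
    if mbs > max_frame_size:
        return None
    return max_mbs_per_second / mbs


sd_at_level_3 = max_frame_rate("3", 720, 480)      # 1350 macroblocks per frame
hd_at_level_4 = max_frame_rate("4", 1920, 1080)    # 8160 macroblocks per frame
hd_at_level_3 = max_frame_rate("3", 1280, 720)     # 3600 macroblocks: too large
```

For 720x480 at Level 3 this gives 40500 / 1350 = 30.0 frames per second, and for 1920x1080 at Level 4 it gives 245760 / 8160, about 30.1, matching the table entries. 1280x720 needs 3600 macroblocks, which exceeds the Level 3 frame size limit of 1620, so that combination is rejected.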
In early 1998, the Video Coding Experts Group (VCEG – ITU-T SG16 Q.6) issued a call for proposals on a project called H.26L, with the target of doubling the coding efficiency (which means halving the bit rate necessary for a given level of fidelity) in comparison to any other existing video coding standard for a broad variety of applications. VCEG was chaired by Gary Sullivan (Microsoft [formerly PictureTel], USA). The first draft design for the new standard was adopted in August 1999. In 2000, Thomas Wiegand (Heinrich Hertz Institute, Germany) became VCEG co-chair. In December 2001, VCEG and the Moving Picture Experts Group (MPEG – ISO/IEC JTC 1/SC 29/WG 11) formed a Joint Video Team (JVT), with the charter to finalize the video coding standard. Formal approval of the specification came in March 2003. The JVT has been chaired by Gary Sullivan, Thomas Wiegand, and Ajay Luthra (Motorola, USA). In June 2004, the Fidelity Range Extensions (FRExt) project was finalized. From January 2005 to November 2007, the JVT worked on an extension of H.264/AVC towards scalability, an annex called Scalable Video Coding (SVC). The JVT management team was extended by Jens-Reiner Ohm (Aachen University, Germany). Since July 2006, the JVT has been working on an extension of H.264/AVC towards multi-view video coding (MVC).
Versions of the H.264/AVC standard include the following completed revisions, corrigenda, and amendments (dates are final approval dates in ITU-T, while final "International Standard" approval dates in ISO/IEC are somewhat different and slightly later in most cases). Each version represents changes relative to the next lower version that is integrated into the text. Bold faced versions are published (or planned to be published).
Planned additions:
In countries where software patent regulations are upheld, the vendors of products which make use of H.264/AVC are expected to pay patent licensing royalties for the patented technology that their products use. This applies to the Baseline Profile as well [4]. A private organization known as MPEG LA, which is not affiliated in any way with the MPEG standardization organization, administers the licenses for patents applying to this standard, as well as the patent pools for MPEG-2 Part 1 Systems, MPEG-2 Part 2 Video, MPEG-4 Part 2 Video, and other technologies.
In January 2007, a U.S. District Court jury gave an advisory opinion that one patent owned by Qualcomm should be invalidated.[5] Qualcomm had claimed that H.264 incorporated technology covered by the patent, in violation of its patent rights.[6][7] The U.S. District Court judge has yet to rule on the verdict.[8]
Discussions are often held regarding the legality of free software implementations of codecs like H.264, especially concerning the legal use of GNU LGPL and GPL implementations of H.264 and other patented codecs. Consensus in discussions is that the allowable use depends on the laws of local jurisdictions. If operating or shipping a product in a country or group of countries where none of the patents covering H.264 apply, then using, for example, an LGPL implementation of the codec is not a problem: There is no conflict between the software license and the (non-existent) patent license.
Conversely, shipping (not necessarily implementing) a product in the U.S. which includes an LGPL H.264 decoder/encoder would be in violation of the software license of the codec implementation. In simple terms, the LGPL and GPL licenses require that any rights held in conjunction with distributing the code also apply to anyone receiving the code, and no further restrictions are put on distribution or use. If there is a requirement for a patent license to be sought, this is a clear violation of both the GPL and LGPL terms. Thus, the right to distribute patent-encumbered code under those licenses as part of the product is revoked per the terms of the GPL and LGPL. However, if the initial implementor of the code did not hold the appropriate rights to build and distribute the code, the legal situation becomes less clear, and it is likely that all users of the implementation, whether LGPL or not, would be in breach of the relevant patents.
No known court case has tested this legal interpretation. It is, however, the interpretation most consistent with statements on the patent rights issue made by the Free Software Foundation, which would likely be treated as an expert or authoritative source on interpreting the GPL and LGPL in a possible lawsuit.[citation needed]
H.264/AVC experienced widespread adoption within a few years of the completion of the standard. It is employed widely in applications ranging from television broadcast to video for mobile devices. In order to ensure compatibility and problem-free adoption of H.264/AVC, many standards bodies have amended or added to video standards so that users of these standards can employ H.264/AVC.
Both of the major candidate next-generation optical video disc rival formats deployed in 2006 include the H.264/AVC High Profile as a mandatory player feature—specifically:
The Digital Video Broadcast (DVB) standards body in Europe approved the use of H.264/AVC for broadcast television in Europe in late 2004. The Advanced Television Systems Committee (ATSC) standards body in the United States is considering the possibility of specifying one or two advanced video codecs for its optional Enhanced-VSB (E-VSB) transmission mode for use in U.S. broadcast television. It has included H.264/AVC and VC-1 into Candidate Standards as CS/TSG-659r2[9] and CS/TSG-658r1[10] respectively for this purpose. The status of terrestrial broadcast adoption in some specific countries is as follows:
Direct broadcast satellite TV services will use the new standard, including:
H.264 is used in a number of IPTV services, including:
USDTV (now bankrupt) had announced plans to use H.264 for its pay-for-premium ATSC channels, which could only be decrypted by USDTV's set top boxes.
Inuk Networks, the largest IPTV provider in the UK, broadcasts in H.264.
The 3rd Generation Partnership Project (3GPP) has approved the inclusion of H.264/AVC as an optional feature in release 6 of its mobile multimedia telephony services specifications.
The North Atlantic Treaty Organisation (NATO) and the Motion Imagery Standards Board (MISB) of the United States Department of Defense (DoD) have adopted H.264/AVC as their preferred video codec for a broad variety of military applications.
The Internet Engineering Task Force (IETF) has completed a payload packetization format (RFC 3984) for carrying H.264/AVC video using its Real-time Transport Protocol (RTP).
The Internet Streaming Media Alliance (ISMA) has adopted H.264/AVC for its new ISMA 2.0 specifications.
Based on ITU-T H.32x standards, H.264/AVC is widely used for videoconferencing. Essentially all new videoconferencing products now support it.
The International Telecommunications Union-Radiocom. Sector (ITU-R) has adopted H.264/AVC in
In October 2005, Apple Inc. began selling H.264-encoded videos over the Internet through its iTunes Music Store.[13] Initially selling just television series and music videos, it expanded in September 2006 to sell films. On May 30, 2007, Apple announced plans to integrate streaming of YouTube videos into the Apple TV. In a later interview, Apple VP David Moody revealed that all of YouTube's videos are going to be transcoded to H.264 for higher compatibility and quality on the Apple TV. Starting in June 2007, YouTube will automatically encode all new uploads with H.264. Its intention is to have the entire video catalog available in H.264 by autumn 2007. Apple's iPhone and iPod Touch support H.264 Baseline Profile, Levels 2.1 and 3, at resolutions up to 480x320 or 640x480 and bitrates up to 1.5 Mbit/s, and are capable of playing the YouTube video content.[14]
Adobe supports H.264 in its Flash Player [15].
Selected videos on the regular (non-mobile) YouTube website (including suitable quality videos uploaded after June 2007) are available in a selectable "High Quality" version which uses H.264, as well as having a higher bitrate and resolution.
The Australian Broadcasting Corporation offers streaming video online in a service called Iview using H.264 video[citation needed].
AVCHD is a high-definition recording format designed by Sony and Panasonic that uses H.264 (conforming to H.264 while adding additional application-specific features and constraints).
AVC-Intra is an intraframe-only compression format developed by Panasonic.
| Feature | QuickTime | Nero Digital | LEAD | x264 | Mainconcept | Elecard | Telestream | VSofts | ProCoder3 |
|---|---|---|---|---|---|---|---|---|---|
| I and P Slices | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| B Slices | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| SI and SP Slices | No | No | No | No | No | No | No | No | No |
| Multiple Reference Frames | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| In-Loop Deblocking Filter | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| CAVLC Entropy Coding | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| CABAC Entropy Coding | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Flexible Macroblock Ordering (FMO) | No | No | No | No | No | No | No | Yes | No |
| Arbitrary Slice Ordering (ASO) | No | No | No | No | No | No | No | No | No |
| Redundant Slices (RS) | No | No | No | No | No | No | No | No | No |
| Data Partitioning | No | No | No | No | No | No | No | No | No |
| Interlaced Coding (PicAFF, MBAFF) | No | No | Yes (MBAFF) | Yes (MBAFF) | Yes | Yes | No | Yes (MBAFF) | Yes |
| 4:2:0 Chroma Format | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Monochrome Video Format (4:0:0) | No | No | No | No | No | Yes | No | No | No |
| 4:2:2 Chroma Format | No | No | No | No | No | No | Yes | Yes | No |
| 4:4:4 Chroma Format | No | No | No | No | No | No | No | Yes | No |
| 8 Bit Sample Depth | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 9 and 10 Bit Sample Depth | No | No | No | No | No | No | No | Yes | No |
| 11 to 14 Bit Sample Depth | No | No | No | No | No | No | No | No | No |
| 8x8 vs. 4x4 Transform Adaptivity | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Quantization Scaling Matrices | No | No | No | Yes | No | No | No | Yes | No |
| Separate Cb and Cr QP control | No | No | No | Yes | Yes | Yes | No | No | No |
| Separate Color Plane Coding | No | No | No | No | No | No | No | No | No |
| Predictive Lossless Coding | No | No | No | Yes | No | Yes | No | No | No |
| Film Grain Modeling | No | No | No | No | No | No | No | No | No |