DivX Help Guide

DivX Version 5.0.4
Document revision 3 (2003-04-17)
Written by: The Members of DARC (DivX Advanced Research Centre)

About this Document

This page gives a detailed overview of the various features of the DivX® 5.0.4 Codec, the DivX Pro™ 5.0.4 Software, and the Electrokompressiongraph™ (EKG) video encoding tool.

Introduction

This document provides a detailed overview of the various features of DivX, DivX Pro, and the EKG. For more information, review the other how-to guides in the DivX Support section.

Feature Comparison

Version DivX DivX Pro
Latest version 5.0.4 5.0.4
Encoding DivX DivX Pro
I-VOP, P-VOP
B-VOP  
Quality-based encoding
One-pass VBR
Multi-pass encoding
User-specified keyframe insertion
Automated keyframe insertion
Automated scene detection
Patent-pending rate control algorithm
5 levels of pre-defined encoding quality
Quarter pel  
Global motion compensation  
Psychovisual modeling
Video Buffer Verifier Rate Control
(1-Pass, 2-Pass, Nth Pass™)
DivX Certified Profile Verification
Integrated patent-pending pre-processing  
Integrated inverse telecine  
Integrated de-interlacing  
Integrated resizing  
Integrated cropping  
Interlaced video support  
Resolution any integer with a multiple of 4 up to 1920x1088 any integer with a multiple of 4 up to 1920x1088
Bitrate > 20 Kbps > 20 Kbps
Decoding DivX DivX Pro
I-VOP, P-VOP
B-VOP
Quarter pel
Global motion compensation
Psychovisual modeling
Post-processing
MPEG-4 short header (H.263 stream)
Video packet re-synch markers
Data partitioning
Overlapped block motion compensation
Reversible VLD
Interlaced video support
DivX Decoder Logo
Playback of all DivX content
Resolution Any integer with a multiple of 4 up to 1920x1088 Any integer with a multiple of 4 up to 1920x1088
Miscellaneous DivX DivX Pro
Quantization/Quantization type MPEG-2/H.263
Adjustable keyframe, quantization levels, rate control, and algorithm levels
Performance DivX DivX Pro
Decoding 360+ fps 360+ fps
Encoding 70+ fps 70+ fps
Input DivX DivX Pro
YUV 4:2:0 (Y-V-U or Y-U-V)
24 and 32 bit RGB
YUY2
UYVY
YVYU
Output DivX DivX Pro
16-bit (555 and 565)
24 and 32-bit
YUY2
UYVY
Planar 4:2:0 (Y-V-U and Y-U-V)
File formats DivX DivX Pro
AVI file format support
Standards DivX DivX Pro
MPEG-4 simple profile capable
MPEG-4 advanced simple profile capable
Platforms DivX DivX Pro
Windows (98/ME/NT/2000/XP)
Linux Coming soon
Mac OS Coming soon
PocketPC Coming soon  

Performance measured with full screen resolution, software-based performance on top-of-the-line hardware. Note: The DivX codec and its features are independent of applications and can be used by any application that supports a common video interface such as the VFW interface.

DivX Decoder

Ensuring Decoder Performance

There are several things you can do to ensure your system is prepared to play DivX content with optimal performance.

  • Use the latest DivX Player application included in the DivX Codec Bundle. The DivX Player is the official player application of DivX video and is optimized to handle all the latest features.
  • Use the DivX® decoder filter with other player applications. Many other Windows players use one of two methods to decode DivX: a) the "Video for Windows codec," or b) the "DivX decoder filter." For speeds up to two times faster, make sure Windows-based media players use the DivX decoder filter for playback. The DivX decoder filter is installed with the codec. To check, you can access the filter from the Start menu folder where you installed the DivX codec. If it is not there, you can do the following:
    • Set your graphics acceleration to maximum in the display properties
    • Ensure that you are not running a video capture or TV tuner application.
    • Try switching the display depth. The filter usually does not work in 8-bit color modes.
    • Make sure you have the most recent drivers for your video card

If nothing helps, it is recommended that you switch to 16-bit display depth, because it is usually fastest when using the VfW codec.

Post-Processing

Post Processing Dialog

DivX post-processing comprises the horizontal and vertical deblocking filters and the deringing filter. Post-processing is a CPU-intensive process, often consuming more CPU time than decoding itself. You may not want to post-process, especially if you prefer the unprocessed image, or if your PC is not powerful enough. To cater to users' post-processing preferences, 6 different levels of post-processing have been defined. At the minimum (level 0), no post-processing algorithm is used; at the maximum (level 6), every available algorithm is used to enhance the appearance of your video.

Since the human eye is less sensitive to the chrominance components of a video signal and more sensitive to the luminance components, only the luminance plane is processed at lower post-processing levels, while higher levels activate the same algorithms on the chrominance planes.

The post-processing algorithms are activated in the following order: deblocking on luminance planes, deblocking on chrominance planes, deringing. You can adjust the post-processing level with the slider in the configuration dialog of the DivX DirectShow filter. To open the dialog, run the decoder configuration application from the Start menu (Start -> Programs -> DivX -> Decoder Config).

De-blocking

The filter operates along the 8x8 block edges, on both the luminance and chrominance color planes. It helps to reduce the blocking artifacts caused by the DCT spatial compression algorithm used by the codec.

Blocking is the most noticeable artifact, so this is the first algorithm to be applied.

De-ringing

The de-ringing filter is used to eliminate noise near sharp edges caused by the quantization process (the so-called Gibbs effect). The noise is more noticeable in animations, where there are higher frequency coefficients. The deringing on the luminance plane is activated at post-processing level 6. Due to high CPU requirements and relatively low impact of the filter, it is only turned on for Pentium-III or newer processors.

FilmFX

The FilmFX post-processing algorithm adds warmth to video for those who prefer the warmth of film to the crispness of digital video. The FilmFX filter not only adds warmth but also reduces perceived "blocking" in digital video, with very little CPU overhead during decoding.

Chart of post-processing levels

Post processing level Horizontal deblocking luminance Vertical deblocking luminance Horizontal deblocking chrominance Vertical deblocking chrominance Horizontal deringing Vertical deringing
1          
2        
3      
4    
5
6

Quality Settings

Post Processing Dialog

Smooth Playback

Turning this off will allow B-frame encoded content to play back with lower CPU usage. However, enabling this option will introduce a 1-frame delay in the decoder (because of buffering), which may cause the last frame of the video not to be displayed. Enabling this setting is recommended.

YUV Extended Mode

When selected, the codec will attempt to use YV12 mode to decode the video. This is the fastest way to decode DivX content, but the drawback is that brightness/contrast/saturation controls cannot be used in this mode and are disabled. Enabling this setting is recommended.

Overlay Extended Mode

Selecting this will cause the filter to try to display video using the hardware overlay instead of the software overlay. The hardware overlay is much faster, but may not be supported on some video cards. When this mode is enabled, DirectShow-based player applications will be unable to open more than one window at a time.

Double Buffering

Enabling this will force the video card to allocate a second buffer for the video playback. This will increase the smoothness of the video playback, but may not be supported on low-end video cards with less than 8 MB of RAM.

Film Effect

This is a warming filter that when enabled will add film noise to the decoded picture. This may increase the perceived visual quality of the picture, especially if you are used to watching film. It's a personal preference, however, so use it if you want.

Force Color Mode (hidden)

You can manually set the registry key "HKEY_CURRENT_USER\Software\DivXNetworks\Force Color Mode" in the Windows Registry and assign it a number from 1 to 7. This will force the video card to use a particular color mode, depending upon the value. This is only necessary in rare circumstances to solve video card problems. Supported color modes:

  • 1: YV12
  • 2: YUY2
  • 3: YUYV
  • 4: RGB32
  • 5: RGB24
  • 6: RGB555
  • 7: RGB565
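As an illustration only, the setting could be prepared as a .reg file and imported with the Registry Editor. The sketch below assumes that "Force Color Mode" is a DWORD value stored under the key HKEY_CURRENT_USER\Software\DivXNetworks; the text above does not state the value type, and the function name is hypothetical.

```python
# Assumption: "Force Color Mode" is a DWORD value under this key.
REGISTRY_KEY = r"HKEY_CURRENT_USER\Software\DivXNetworks"
VALUE_NAME = "Force Color Mode"

# Supported color modes, as listed in this guide.
COLOR_MODES = {
    1: "YV12", 2: "YUY2", 3: "YUYV", 4: "RGB32",
    5: "RGB24", 6: "RGB555", 7: "RGB565",
}


def force_color_mode_reg(mode):
    """Return the text of a .reg file that sets the forced color mode.

    Hypothetical helper for illustration; it only builds text and does
    not touch the registry itself.
    """
    if mode not in COLOR_MODES:
        raise ValueError(f"mode must be 1-7, got {mode!r}")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{REGISTRY_KEY}]",
        f'"{VALUE_NAME}"=dword:{mode:08x}',
        "",
    ]
    return "\r\n".join(lines)
```

For example, force_color_mode_reg(4) produces a file that would select RGB32 under the stated assumptions. Back up the registry before importing any change.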

DivX Encoder

DivX Codec Main Menu

Codec Main Menu

New Bitrate Modes (DivX 5.0.3) Using VBV (Video Buffer Verifier)

Introduction

Video is, by its very nature, a very dynamic beast whose characteristics are continuously changing. The amount of motion and texture, the number and placement of scene changes, fades and other effects all conspire to increase the dynamic range of the video stream's entropy.

This entropy variability manifests itself as a video bitrate that can vary dramatically over time. Because communications systems, including the Internet, operate at near constant bitrates, we have to design a means of coupling the variable bitrate requirements of the video stream to the constant bitrate limits of the channel.

By convention we consider what happens at the decoder. A video buffer is placed between the channel (constant or capped bitrate) and the decoder input (variable bitrate according to decoded content). To avoid duplication, readers unfamiliar with video buffers should refer to ISO/IEC 14496-2:2001(E) Appendix D "video buffer verifier" for a tutorial in the operation of VBV models.

If DivX video is to be successfully delivered over a restricted channel in real time to a decoder, then the encoder's rate control must ensure that the decoder's buffer is not violated. If this is done properly then overflow and underflow will never occur and the encoder is said to be "VBV compliant". It makes no difference whether the video is encoded in one pass or many, in real time or offline; it is the encoder's rate control that must ensure compliance.
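The buffer behaviour described above can be sketched with a very small simulation. This is a simplified constant-rate model, not the normative VBV procedure from ISO/IEC 14496-2: it assumes the channel adds a fixed number of bits per frame interval and the decoder removes each frame instantly. The function name and parameters are hypothetical.

```python
def check_vbv(frame_bits, bitrate, fps, buffer_size, initial_occupancy):
    """Check a list of coded frame sizes (in bits) against a decoder buffer.

    Simplified model: each frame is removed from the buffer in one step,
    then the channel refills the buffer at `bitrate` bits per second
    until the next frame time. Returns (ok, message).
    """
    occupancy = float(initial_occupancy)
    fill_per_frame = bitrate / fps
    for index, bits in enumerate(frame_bits):
        # Underflow: the frame has not fully arrived when it must be decoded.
        if bits > occupancy:
            return False, f"underflow at frame {index}"
        occupancy -= bits
        occupancy += fill_per_frame
        # Overflow: the channel delivers more than the buffer can hold.
        if occupancy > buffer_size:
            return False, f"overflow at frame {index}"
    return True, "compliant"
```

With a 300,000 bit/s channel at 30 fps, frames of exactly 10,000 bits keep the occupancy steady; one oversized frame underflows the buffer, and a run of empty frames eventually overflows it. A rate control algorithm has to steer frame sizes between those two failure cases.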

Requirements

There are three main requirements of a VBV-compliant rate control algorithm, in order of importance:

  1. VBV compliance: When VBV parameters are specified (size, initial occupancy, channel characteristics) the rate control should avoid producing video that is in danger of violating the decoder's buffer.
  2. Target Bitrate: We aim to meet the user's requirements as accurately as possible here. This is directly related to filesize.
  3. Good-looking video: Rate control should aim to produce the subjectively best quality video given other constraints. Consistency is important here as changes in quality are very noticeable and video is often judged by the worst-quality segment in the whole sequence.

The RC algorithm must achieve the above requirements by adjusting the frame quantiser (or other parameters affecting compressed frame size) on a frame or block basis.

Version 5.0.3 VBV Rate Control Modes

There are two flavors of VBV RC. Both must comply with the three requirements above. The flavors differ in the amount of information available when attempting to satisfy the requirements:

  1. Causal RC (1-Pass): A causal system is one whose outputs depend only on current and previous input values. It cannot see into the future. Causal RC is used in 1-pass RC, or RC used for minimum-latency real-time encoding. If a DivX Profile is selected, the maximum bitrate will also be selected to help ensure reliable playback on DivX Certified Devices.
  2. Nth pass RC: An Nth pass RC algorithm differs in that it has information available from previous analysis (or analyses) of the video sequence. That information can be used to make better decisions in order to achieve the three requirements above. The requirements on Nth pass RC are identical to those on causal RC, but it should be possible to satisfy them better by using information garnered from previous passes through the video sequence. Nth pass RC has replaced the old 2-pass RC and can be used in exactly the same manner. To run two passes, run "multipass first pass" and then "multipass Nth - Pass". This provides more accuracy than "1-Pass" encoding. In this mode, the encoder will try to make the subjective quality of the stream constant, and simultaneously ensure that the complete stream size is close to the number specified. Operation in this mode requires the video to be processed at least twice. N-pass encoding is independent of any application you may use for encoding and can be used with any program. If you want to encode a 2-hour show and be sure it fits on a 650 MB CD, then this is the choice for you. The new RC also allows you to fine-tune the rate control depending on the complexity (ranging from low to high motion) of the video you are encoding. Adjusting the "complexity modulation" parameter to suit the video's complexity will help to improve video quality.

DivX Profiles

Accessing DivX Original Rate Control, Qpel and GMC

Qpel, GMC and B-frames have been moved to a new GUI called "Profiles". Detailed information on the "DivX Profile" feature is given in the "DivX Profile" section. If a DivX Profile is selected, the DivX codec will automatically ensure that you select the optimal bitrate and settings for compatibility. This includes automatic selection of max bitrates and advanced features, and feedback for choosing appropriate resolutions. However, you can still access Qpel, GMC and the Original Bitrate Modes by un-selecting "Choose Profile" in the "Profile" menu.

The DivX 5.x encoder has three possible modes.

DivX No Profiles

By unselecting "Choose Your Profile" all bitrate modes can also be accessed:

Main No Profiles

Original (pre-5.0.3) Bitrate Modes

  1. 1-Pass Variable Bitrate Mode. The encoder will aim at making the average bitrate of a movie close to the specified value, allocating less data for low-motion scenes and more data in fast-motion scenes. You can either enter in the desired bitrate manually or use the slider.
  2. 1-Pass Quality Based Mode. The encoder will code everything with the same absolute quality, without regard to the amount of motion. When using this mode all frames receive the same amount of compression, regardless of their complexity. While it is not the best choice for making archives, it is a good idea to use this mode when preparing content for future editing. It guarantees the preservation of quality in all frames. When selecting the "Quality Based" mode the slider will adjust the Quantizer on a percentage basis while showing the exact number that will be used to compress each frame.
  3. 2-Pass (Variable Bitrate Mode). Prior to encoding, the video is analyzed to understand its complexity. We can then allocate more data to scenes that need it and less data to scenes that do not. This provides more accuracy than "1-Pass" encoding. In this mode, the encoder will try to make the subjective quality of the stream constant, and simultaneously ensure that the complete stream size is close to the number specified. Operation in this mode requires the video to be processed twice. 2-pass encoding is independent of any application you may use for encoding and can be used with any program. If you want to encode a 2-hour show and be sure it fits on a 650 MB CD, then this is the choice for you.
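To make the CD example concrete, the arithmetic behind choosing a target bitrate can be sketched as follows. The assumptions are ours, not the codec's: 1 MB is taken as 1,048,576 bytes, 1 kbps as 1,000 bits per second, container overhead is ignored, and the audio bitrate is a hypothetical figure you supply.

```python
def target_video_kbps(size_mb, duration_s, audio_kbps=0.0):
    """Average video bitrate (kbps) that fills `size_mb` in `duration_s`.

    Assumptions: 1 MB = 1024 * 1024 bytes, 1 kbps = 1000 bits/s,
    and no container overhead.
    """
    total_kbits = size_mb * 1024 * 1024 * 8 / 1000
    return total_kbits / duration_s - audio_kbps


# A 2-hour show (7200 seconds) on a 650 MB CD.
video_only = target_video_kbps(650, 7200)
with_audio = target_video_kbps(650, 7200, audio_kbps=128)
print(f"{video_only:.1f} kbps total, {with_audio:.1f} kbps left for video")
```

Under these assumptions roughly 757 kbps is available in total, so with a 128 kbps audio track the video bitrate to enter would be about 629 kbps. In practice it is sensible to leave a small margin for container overhead.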

Original (pre-5.0.3) Two-Pass Encoding Log Files

During the first pass of "2-pass" encoding, data is gathered that will be reused in the second pass to increase accuracy and quality. To reduce encoding time, this data can be reused whenever the same content needs to be encoded again in "2-pass" mode with the same settings, since only the second pass then has to be performed. External applications could also be developed to manipulate these logs for even more accurate adjustments in data allocation.

The "Protect Log/My File" setting will prevent the possibility of accidentally overwriting any existing log files.

MPEG-4 Tools

B-frames/Bi-directional encoding

There are three types of frames that are possible within a DivX video stream: "I-frames" (Intra), "P-frames" (Predicted) and "B-frames" (Bi-directional). Prior to DivX 5.0 the only frame types were I and P.

An I-frame is encoded using only information from within its own frame. It does not use any information from other frames (temporal compression). An I-frame is similar in concept to a single frame encoded with JPEG.

P-frames (Predicted) are forward predicted and may refer to either an I-frame or a P-frame. Each is encoded from the frame that precedes it. In any video sequence, a group of frames will share much of the same image content. For example, if you watch a news anchorperson you'll notice that they barely move and that the background stays almost identical from frame to frame. (Remember that there are commonly up to 30 frames in a single second.) So instead of encoding each of those 30 frames independently, as you would an image file such as a JPEG, you can exploit the redundancy between frames by using P-frames. Essentially, a P-frame records where each block of the previous frame has moved to in the current frame. So instead of spatially encoding the frame (as JPEG does), the P-frame just says "Hey, the block in the previous frame has moved to location (X,Y)", which requires much less data than encoding each frame spatially. In effect we transmit the difference between frames, which is more efficient than transmitting a complete I-frame each time.

DivX Pro 5.0 introduces the ability to also use "B-frames". B-frames allow the DivX codec to predict frames from the future, choosing the best prediction match among 2 prediction frames instead of only one. B-frames are coded not only from forward-predicted frames but also from backward-predicted frames, which can be I- or P-frames. Using B-frames reduces the amount of data needed to code a frame and improves quality, particularly in areas where moving objects reveal hidden areas.
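The idea of finding where a block "came from" in a reference frame can be illustrated with a full-search block matcher. This is a generic textbook technique using the sum of absolute differences (SAD), shown here as a sketch; it is not a description of the DivX encoder's actual motion search, and all names are hypothetical. Frames are plain lists of rows of luminance values.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in `cur` at (bx, by)
    and the block in `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def best_motion_vector(cur, ref, bx, by, size=4, search=2):
    """Try every integer displacement within +/- `search` pixels and
    return ((dx, dy), cost) for the best-matching reference block."""
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + size <= width):
                continue
            if not (0 <= by + dy and by + dy + size <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost
```

If the current frame is the reference shifted one pixel to the right, the search returns the vector (-1, 0) with a cost of zero: only the vector needs to be transmitted, not the block's pixels. A B-frame encoder would run a similar search against both a past and a future reference and keep the better match.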

Global Motion Compensation

Global Motion Compensation (GMC) helps to improve complex scenes where zooming and panning are present. In such scenes the data required from one frame to the next can be reduced, because panning and zooming share a common motion that can be compensated for once across the picture instead of separately for each group of blocks.

Quarter Pel

As explained in the "B-frames" summary, data is reduced when the difference between two frames (the prediction error) is transmitted instead of the entire image. The difference between successive frames is generally computed on a macroblock-by-macroblock basis (16x16 pels) or on a block-by-block basis (8x8 pels). For example, a part of an image located in a block at grid location (1,1) may move to grid location (1,2) in the next frame. In practice, image content rarely moves by a whole number of pixels, so describing motion only in integer pixel units such as (1,1) is too coarse. DivX has increased the previous half-pel accuracy (1.5, 1.5) to "Quarter Pel" accuracy (1.25, 1.75) with this Codec release. Quarter Pel performs a specific filtering on each block to produce a virtual block that represents how the original block should appear if it is moved by 1/4 of a pixel unit.
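The notion of a "virtual" sample at a fractional position can be sketched with bilinear interpolation. Note the substitution: MPEG-4 defines its own, longer interpolation filter for quarter-pel positions, and plain bilinear weighting is used here only because it is short and easy to follow. The function name is hypothetical.

```python
import math


def sample_subpel(frame, x, y):
    """Estimate the pixel value at a fractional position (x, y), for
    example (1.25, 1.75), by bilinear interpolation between the four
    surrounding integer-position pixels. Simplified stand-in for the
    codec's real quarter-pel filter."""
    height, width = len(frame), len(frame[0])
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    top = frame[y0][x0] * (1 - fx) + frame[y0][x1] * fx
    bottom = frame[y1][x0] * (1 - fx) + frame[y1][x1] * fx
    return top * (1 - fy) + bottom * fy
```

On the 2x2 frame [[0, 100], [200, 300]], the position (0.25, 0) yields 25.0, a quarter of the way from the first pixel to its right-hand neighbour. A motion-compensated prediction at quarter-pel accuracy is built from such interpolated samples.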

"New" Quarter Pel

In version 5.0.3, Quarter Pixel Motion Estimation has been updated to the new revision of the standard 14496-2 DCOR1. (The older pre-5.0.3 implementation of Qpel is still decodable).

Quick Config CLI

See the Quick Config CLI Appendix for more information on this feature.

General Parameters Menu

Enable Crop

Cropping is commonly used to remove unneeded borders, which would otherwise consume data. Such borders are common in "Widescreen Format" or "Letterboxed" movies. It is also common to see borders within content that are completely unnecessary or are present due to poor encoding of the source material. Cropping will remove or "crop" any borders not desired. Just specify the number of pixels to crop from each border.
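As a minimal sketch of what the four crop values mean, the following hypothetical function removes the given number of pixels from each border of a frame stored as a list of rows. It illustrates the operation only and is not the codec's implementation.

```python
def crop(frame, left, top, right, bottom):
    """Return a copy of `frame` with the given number of pixels removed
    from each border. `frame` is a list of rows of pixel values."""
    height, width = len(frame), len(frame[0])
    if left + right >= width or top + bottom >= height:
        raise ValueError("crop values remove the entire picture")
    return [row[left:width - right] for row in frame[top:height - bottom]]


# A letterboxed picture: black (0) bars above and below the image (9).
letterboxed = [
    [0, 0, 0, 0],
    [9, 9, 9, 9],
    [9, 9, 9, 9],
    [0, 0, 0, 0],
]
print(crop(letterboxed, left=0, top=1, right=0, bottom=1))
```

Here the two black rows are dropped and only the picture area remains, so no bits are spent on the bars.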

Enable Resize

Resize allows content to be encoded at a specified resolution. There are many target applications and environments suitable for using the DivX codec and the content should be appropriately encoded for these environments. Generally the smaller the resolution the lower the file size. The output resolution can be adjusted so that it preserves the original resolution, or so that it changes the resolution for applications that might require much smaller resolutions and bitrates, such as video conferencing or browser based videos. There are 4 options for "Resizing":

  1. Bilinear (Very Soft)
  2. Bicubic (Soft)
  3. Bicubic (Normal)
  4. Bicubic (Sharp)

From a mathematical viewpoint, it could be argued that the bicubic resize algorithm is best for enlarging the image while bilinear is more suited to size reduction. From our own experience, the opposite appears to be true - bicubic gives the best visual quality for reducing the video resolution. The bilinear algorithm is slightly less CPU intensive and will allow the codec to run faster. The choice of algorithm is very much up to personal preference.

Psychovisual Enhancements

By exploiting what we know about the Human Visual System (HVS) we have increased the efficiency of allocating video data, helping to increase the perception of quality in video. For example, if the human visual system has very low sensitivity to a specific type of characteristic in an image, we may decrease the amount of data spent at that location and re-allocate it to a location within the image where the human visual system is much more sensitive. The Psychovisual enhancements are applied on both a frame and a macroblock basis. One of the important factors in evaluating Psychovisual Modeling is to NOT just compare a single frame but to compare a full sequence. An image may look worse or better when a single frame is examined, but the key is to reduce data in a way that the human visual system does not notice over a video sequence running at a full frame rate (e.g. 30 frames per second). Psychovisual modeling is a fairly new field when applied to real videos or movies. This area is full of possibilities that we have only just started on and will continue to explore.

Pre-Processing

Video noise is often referred to as "specks", "snow", or "hair" within a video (e.g. the "snow" that is visible when watching TV over an antenna). Any number of the processes of video production and distribution can add noise to the video. Some of the worst video noise can be seen in old or poorly recorded movies. Noise can be a big problem when it comes to compressing the video, as the noise consumes a large proportion of the bits available for the wanted video.

The preprocessing filter uses digital signal processing techniques to remove the noise from the source material prior to encoding. Broadly, there are two classes of filter that can reduce noise: temporal and spatial. To explain how they work, let's consider a single pixel somewhere in the image. Spatial filtering looks at the neighboring pixels within the pixel's own frame and applies a smoothing, or low-pass, function. A temporal filter smooths pixels at the same position over a few consecutive frames to reduce the effect of noise. By using these techniques to reduce noise prior to video encoding we can, for certain content, increase our compression ratio and improve quality.
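The two filter classes can be sketched for a single pixel as follows. Both functions use a plain average, which is the simplest possible low-pass function; the actual pre-processing filter is not documented here and is presumably more sophisticated. The names are hypothetical.

```python
def spatial_smooth(frame, x, y):
    """Spatial filter: average the pixel at (x, y) with its neighbours
    in the same frame (a 3x3 window, clipped at the picture edges)."""
    height, width = len(frame), len(frame[0])
    values = [
        frame[j][i]
        for j in range(max(0, y - 1), min(height, y + 2))
        for i in range(max(0, x - 1), min(width, x + 2))
    ]
    return sum(values) / len(values)


def temporal_smooth(frames, x, y):
    """Temporal filter: average the pixel at the same position (x, y)
    over a few consecutive frames."""
    return sum(frame[y][x] for frame in frames) / len(frames)
```

A single bright speck of value 100 in a flat area of value 10 is pulled down to 20.0 by the 3x3 spatial average. The temporal version does the same along the time axis, which works well on static areas but can smear moving objects unless motion is taken into account.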

There are 4 settings for preprocessing:

  1. Light
  2. Normal
  3. Strong
  4. Extreme

As with all features, some content is affected more than other content. Generally, old noisy content can see dramatic improvements in file size and quality. Normal pre-processing should not introduce any visual degradation of the source file; however, we have provided a "Light" setting for very tricky sources. The "Strong" and "Extreme" settings will wash out the source a little, but they remove the most data and should be used when file size is more important than quality.

Keyframe

The DivX encoder will automatically insert a keyframe every time it detects a scene change. However, long intervals between scene changes are possible, and when they occur, the encoder automatically inserts keyframes at a user-specified frequency. Keyframes are the largest of all frames, so the frequency of their placement can have a drastic effect on the encoded file. We have found 300 frames to be the maximum interval the encoder should go without inserting a keyframe. This corresponds to at least one keyframe every 10 seconds in a 30 fps stream. Also, depending on the player used, the maximum keyframe interval may determine the maximum interval for seeking. This occurs when players are designed to seek to "I" or keyframes. Reducing the keyframe interval can also reduce delays and improve the quality of streaming content.
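The interaction between scene-change keyframes and the maximum interval can be sketched like this. The function is hypothetical and assumes the scene changes are already known; a real encoder detects them as it goes.

```python
def keyframe_positions(scene_changes, num_frames, max_interval=300):
    """Return the frame numbers that become keyframes.

    A keyframe is placed on the first frame, on every detected scene
    change, and whenever `max_interval` frames have passed since the
    previous keyframe.
    """
    changes = set(scene_changes)
    keyframes = []
    last = None
    for n in range(num_frames):
        if last is None or n in changes or n - last >= max_interval:
            keyframes.append(n)
            last = n
    return keyframes


# 1000 frames with scene changes detected at frames 100 and 450.
print(keyframe_positions([100, 450], 1000))
```

With the default interval of 300 frames, forced keyframes appear at frames 400 and 750 in addition to the scene changes. At 30 fps, 300 frames correspond to 300 / 30 = 10 seconds, which is also the longest a player that seeks only to keyframes would have to skip.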

Deinterlace

Interlacing, invented in the 1940's, is probably the earliest form of video compression. Instead of transmitting a complete video frame 60 times every second, engineers discovered that they could halve the bandwidth needed by the TV signal if they sent alternately odd and even "fields", each field comprising just the odd or even picture lines. Interlacing is most commonly found on material intended for TV broadcast, or material created by consumer camcorders.

Interlacing is not a problem if it is correctly displayed on an interlaced display device, i.e. a television. An interlaced video camera running at 30 fps captures the odd-numbered lines of a frame in 1/60th of a second, and the even-numbered lines in the next 1/60th of a second. When viewed on a progressive display device, such as a PC, two fields are interlaced to create one frame. Because half the frame's lines are captured a fraction of a second later than the other half, fast-moving objects may appear jagged, the result of the object advancing slightly within 1/60th of a second. The "progressive" format is preferred for PC playback since each frame is captured in its entirety at one instant and no de-interlacing is required.

It is possible to remove the jagged-edge interlace artifacts by applying a process known as "de-interlacing" to the video. The DivX codec is able to de-interlace the source video prior to encoding. For this to work correctly, it is important that the video has not been resized vertically external to the codec. Resizing within the codec does not affect the operation of the codec's de-interlacing.

The DivX® codec has two main options for de-interlacing:

  1. "All frames are progressive" - This is the default setting where de-interlacing and IVTC are never used. It is suitable for material that is already in a progressive format.
  2. "All frames are interlaced" - The codec will use an adaptive algorithm to deinterlace every frame prior to resizing and encoding. The video should not be cropped or resized prior to encoding. Resizing within the codec will cause no problems.
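To show what fields are and what de-interlacing has to do, here is a minimal sketch. It keeps one field and rebuilds the other field's lines by averaging the lines above and below. This simple line-averaging method is swapped in for clarity; the codec itself uses an adaptive algorithm whose details are not given here. Names are hypothetical, and frames are lists of rows.

```python
def split_fields(frame):
    """Separate an interlaced frame into its two fields: the lines at
    even indices and the lines at odd indices."""
    return frame[0::2], frame[1::2]


def deinterlace_keep_first_field(frame):
    """Keep the even-index lines and replace every odd-index line with
    the average of its neighbours above and below, so the whole output
    frame comes from a single instant in time."""
    height = len(frame)
    out = [list(row) for row in frame]
    for y in range(1, height, 2):
        above = frame[y - 1]
        # The last line may have no line below it; reuse the line above.
        below = frame[y + 1] if y + 1 < height else frame[y - 1]
        out[y] = [(a + b) / 2 for a, b in zip(above, below)]
    return out
```

Because this method depends on which lines belong to which field, it also shows why the source must not be resized vertically before de-interlacing: scaling mixes lines from the two fields together.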

Interlaced Video Support

Encoding and decoding of interlaced content is now supported. If the content you're encoding is interlaced, you can either de-interlace the content so that it is progressive or preserve the interlaced fields. Preserving the interlaced fields may sometimes result in better video quality during playback, but the cost is a bigger file size. Interlace support is compliant with the MPEG-4 standard and uses block-level decisions to choose between progressive and interlaced coding. This means that interlaced coding is used when interlace artifacts are detected and progressive coding is used when motion is very low.

Due to the very nature of interlaced video the minimum number of lines for NTSC is 480 and 576 for PAL (actual number can be lower but result is not guaranteed). Interlaced content is usually found on TV or captured with a video camera (DV sources are usually interlaced).

Manage Settings Menu

The settings manager allows commonly used settings to be saved so that they can easily be reused at a later time. This is useful for many reasons, such as making sure the settings are optimal for certain types of movies, decoding devices or environments that only support certain MP4 features. It is also extremely useful because it allows you to send anyone else the exact settings you are using. To add your current settings, just click on "Add codec settings" and you will be able to name the setting and write a description for it. If you would like to save the settings, just select "Save settings to file", and if you would like to load a new setting, just select "Load settings from a file".

Profiles Menu

DivX Profiles

With the new DivX Certification program, DivXNetworks is enabling third parties to create "DivX Certified" products that are rigorously tested and fully compatible with the entire suite of DivX™ video technologies. There are four levels of official DivX Certified video products: Handheld Video Devices, Portable Video Devices, Home Theater Devices and High Definition Video Devices. These levels quickly and clearly communicate what type and size of DivX video the DivX Certified device supports, to ensure optimal playback.

With DivX Certification, a new concept has been introduced: the "DivX Profile". Depending on your target device, the DivX codec helps to ensure your encoded clip will fall within certain certification parameters for consumer electronic devices, which ensures optimal playback on a variety of devices. To keep it simple, the number of profiles is small and the separation between them is based on two simple considerations: resolution (more exactly, the number of 16x16 blocks per second) and advanced features (B-frames). Depending on your target application or device, the DivX Codec will automatically select the best default settings to ensure the highest visual quality and compatibility with DivX Certified Devices.

If a DivX Profile is chosen, the codec will automatically use the new VBV rate control algorithm explained in the Rate Control section of this document. The VBV ensures that the peak bitrate never exceeds the user's "maximum peak" bitrate value and that the encoded stream never violates an MPEG-4 compliant decoder's buffer. This helps to prevent decoding failure in both hardware and software where memory may be limited yet compliant with the MPEG-4 standard as defined in ISO/IEC 14496-2:2001(E). If DivX video is to be successfully delivered over a restricted channel in real time to a decoder, then the encoder's rate control must ensure that the decoder's buffer is not violated. If this is done properly then overflow and underflow will never occur and the encoder is said to be "VBV compliant". It makes no difference whether the video is encoded in one pass or many, in real time or offline; it is the encoder's rate control that must ensure compliance.

Qpel and GMC are never enabled when a DivX Profile is chosen. After a thorough analysis of the current incomplete state of quarter pel within the MPEG-4 standard, coupled with an evaluation of the technical factors necessary to support these features at the chip level, we have determined that support for quarter pel and GMC is not necessary for the first generation of DivX Certified devices.

Qpel, GMC and the DivX "Original Rate Control" modes can be accessed by unselecting "Choose Profile".

Profile          Max Resolution at 30 fps  Max Resolution at 25 fps  Max Number of 16x16 Blocks/sec
Handheld         176x144                   192x144                   1,485
Portable         352x240                   352x288                   9,900
Home Theater     720x480                   720x576                   40,500
High Definition  1920x1080                 1920x1080                 108,000
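The block-rate figure used to separate the profiles can be estimated from the frame size and frame rate. The sketch below assumes that partial blocks at the frame edge are rounded up to a whole 16x16 block; the function names are hypothetical, and the limits are simply copied from the table above. It checks only the block rate, not the maximum resolution.

```python
import math

# Limits copied from the profile table (16x16 blocks per second).
PROFILE_LIMITS = {
    "Handheld": 1485,
    "Portable": 9900,
    "Home Theater": 40500,
    "High Definition": 108000,
}


def macroblocks_per_second(width, height, fps):
    """Count 16x16 blocks per frame (partial blocks rounded up) times fps."""
    blocks_per_frame = math.ceil(width / 16) * math.ceil(height / 16)
    return blocks_per_frame * fps


def fits_profile(profile, width, height, fps):
    """True if the clip's block rate does not exceed the profile's limit."""
    return macroblocks_per_second(width, height, fps) <= PROFILE_LIMITS[profile]


print(macroblocks_per_second(352, 240, 30))       # 9900
print(macroblocks_per_second(720, 576, 25))       # 40500
print(fits_profile("Portable", 720, 480, 30))     # False
```

Under this assumption, 352x240 at 30 fps and 352x288 at 25 fps both come to 9,900 blocks per second, which matches the Portable row.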

Advanced Parameters

The following section discusses additional parameters that can be set via the command line or via the Windows Registry.

Data Rate Control Parameters (RC)

The Data Rate Control parameters can only be changed through the operating system registry. The DivX Codec uses a patent-pending dual asymmetric rate control. It uses dual-period control loops to strike the best balance between reacting and adjusting to variations on a short time scale and controlling and averaging the bitrate on a long time scale.

Essentially, it is well balanced, as it adapts dynamically to the content of the scene, providing optimal allocation of bandwidth. It is flexible and easily adjustable for different application scenarios. The DivX Rate Control algorithm was created by testing many real full-length movies against the DivX codec in multiple user environments (e.g. TV, PC, PDA).

There are several settings that may be experimented with. We highly recommend that only experienced users change these settings, since minor changes can have significant effects.

Maximum and Minimum Quantizers

The quantizer is one of the most important parameters in video coding. It controls how finely the encoder codes the video sequence. The rule of thumb is: for the same frame, a smaller quantizer gives better quality and higher bit consumption, while a larger quantizer gives lower bit consumption and inferior quality. Because every frame has a different level of complexity, different frames can appear subjectively equal in quality even when their quantizers differ. Basically, the quantizer is what the rate control operates on. Balancing video quality against bit consumption can be quite an art form.
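The rule of thumb above can be seen with a toy uniform quantizer. This is a simplified illustration under stated assumptions, not the MPEG-4 or H.263 quantization formula: the coefficient values are invented, and the number of non-zero levels is used only as a crude stand-in for bit consumption.

```python
def quantize(coefficients, quantizer):
    """Map each coefficient to an integer level using a uniform step size."""
    return [round(c / quantizer) for c in coefficients]


def dequantize(levels, quantizer):
    """Reconstruct approximate coefficients from their levels."""
    return [level * quantizer for level in levels]


def total_error(coefficients, quantizer):
    """Sum of absolute reconstruction errors for a given quantizer."""
    restored = dequantize(quantize(coefficients, quantizer), quantizer)
    return sum(abs(c - r) for c, r in zip(coefficients, restored))


def nonzero_levels(coefficients, quantizer):
    """Crude proxy for bit cost: how many levels still need to be coded."""
    return sum(1 for level in quantize(coefficients, quantizer) if level != 0)


sample = [37, -12, 5, 3, 0, -1]   # made-up transform coefficients
for q in (2, 8):
    print(q, total_error(sample, q), nonzero_levels(sample, q))
# 2 4 4
# 8 14 3
```

The smaller quantizer reconstructs the sample more accurately but leaves more non-zero levels to code; the larger one does the opposite.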

Note: RC settings are truly "for the adventurous souls". The default settings should give near-optimum results.

RC Averaging

RC Averaging controls how fast the RC forgets the rate history. Larger values usually result in better high-motion scenes and worse low-motion scenes.

Rate Control Down/Up Reaction

RC Down/Up Reaction controls the relative sensitivity of the reaction to high- or low-motion scenes. Larger values usually result in better high-motion scenes but larger bit consumption.

All these parameters are interrelated. The effect of each setting is approximate and often depends on the settings of the other parameters.

Data Partitioning

Data partitioning may be useful in any situation where transmission errors may occur, such as streaming or broadcast environments. Data Partitioning is a different way of organizing data in the stream. A frame is composed of adjacent macroblocks, and each macroblock usually includes motion vector (prediction) and texture information. With data partitioning, the motion vectors and the texture are kept separate (not interleaved within each macroblock) and grouped into video packets, which makes the stream more resilient to transmission errors. Each video packet is an independent entity inside the stream and can be decoded separately from the others. Data Partitioning also permits the activation of a series of tools that allow for error recovery and packet resynchronization.
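The difference between the two layouts can be sketched as follows. This is a conceptual illustration only: the dictionary fields, the tuple tags, and the marker entry are hypothetical stand-ins and do not reflect the actual bitstream syntax.

```python
def interleaved_layout(macroblocks):
    """Default layout: each macroblock's motion and texture sit together."""
    stream = []
    for mb in macroblocks:
        stream.append(("motion", mb["motion"]))
        stream.append(("texture", mb["texture"]))
    return stream


def partitioned_layout(macroblocks):
    """Partitioned packet: all motion data, a marker, then all texture."""
    motion = [("motion", mb["motion"]) for mb in macroblocks]
    texture = [("texture", mb["texture"]) for mb in macroblocks]
    return motion + [("marker", None)] + texture


packet = [
    {"motion": (1, 0), "texture": "t0"},
    {"motion": (0, 2), "texture": "t1"},
]
print(interleaved_layout(packet))
print(partitioned_layout(packet))
```

In the partitioned layout, a decoder that loses the tail of a packet may still hold every motion vector for that packet, which it can use to conceal the missing texture.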

Performance/Quality

There are 5 settings available for Performance/Quality. Essentially, if more quality is desired, more CPU is needed. You should rarely need to pick any setting other than "Slowest", as it produces the best quality. At the faster settings, accuracy in motion estimation is sacrificed to increase encoding performance. With today's CPUs and the efficiency of the DivX Codec, encoding at up to "Full Screen" resolutions at real-time speeds is possible. However, lower quality settings can be useful when there is not enough CPU power and a sacrifice in quality can be justified. Generally, real-time or faster-than-real-time encoding speeds are only necessary when broadcasting real-time video feeds, yet the faster the encoder, the lower the cost of encoding. Leave this setting at "Slowest" unless otherwise necessary.

DivX® Advanced Research Centre (DARC) Team

Digital Video Engineering Team

  • Eugene "Sparky" Kuznetsov
  • Andrea "e7abe7a" Graziani
  • John "eagle" Funnel
  • Adam "c0redumb" Li
  • Mac® OS development by Adrian "AdrianB" Bourke
  • Brian "Woody" Fudge

Management & QA

  • Jérôme "Gej" Rota
  • Darrius "Junto" Thompson
  • Ben "TheKid" Côté

Copyright and Trademarks

The DivX Codec and DivX Pro Software are Copyright © 2000-2003 DivXNetworks, Inc. MMX iDCT and fDCT implementations are © Intel Corp., 1998-2000.

DivX® and DivX Pro™ are trademarks of DivXNetworks, Inc.

Appendix

MPEG-4 Tools, Profiles and Levels

Visual Tools                       Advanced Simple Profile  Simple
Basic                              X                        X
  I-VOP
  P-VOP
  AC/DC Prediction
  4-MV, Unrestricted MV
Error Resilience                   X                        X
  Slice Resynchronization
  Data Partitioning
  Reversible VLC
Short Header                       X                        X
B-VOP                              X
Global Motion Compensation         X
Quarter-Pel Motion Compensation    X

Visual profile           Level  Typical visual session size  Maximum bitrate (kbit/s)
Advanced Simple Profile  L0     176x144                      128
Advanced Simple Profile  L1     176x144                      128
Advanced Simple Profile  L2     352x288                      384
Advanced Simple Profile  L3     352x288                      768
Advanced Simple Profile  L4     352x576                      3000
Advanced Simple Profile  L5     720x576                      8000
Simple                   L3     CIF                          384
Simple                   L2     CIF                          128
Simple                   L1     QCIF                         64

About the Electrokompressiongraph™ (EKG) Application

What is the EKG?

The EKG enables an advanced video compressionist to reallocate data to specific scenes within a video clip. Although the Rate Control algorithm within the DivX Codec is very good, there is no "perfect" rate control, since quality is very subjective. The EKG gives a compressionist the ability to change or reallocate data to specific areas within a DivX-encoded video that may need more or less data.

The DivX Codec generates a log file recording information about the source video and the decisions made by the Rate Control algorithm. During 2-pass encoding this data is further analyzed so that bits may be distributed more accurately during the second pass of encoding. In the first pass, the rate control module looks through the whole video sequence and records the complexity of each video frame. In the second pass, the rate control module adjusts the quantizer for each individual frame based on its complexity and the overall averages obtained during the first pass.

For every frame the following data is recorded:

  • Time offset of video frame
  • Coding type (I-P-B)
  • Motion complexity
  • Texture complexity
  • Frame Size
  • Modulation Parameter

When a second (Nth) pass encode is running, the rate control will try to achieve three goals. First, it must be sure not to exceed the specified maximum bitrate. It will do whatever it takes to prevent this, even dropping frames in the most extreme cases. Second, it will try to meet the desired target bitrate as closely as possible. Third, it will try to distribute bits between frames to achieve equal quality, or to respect the user's modulation settings.

Although the codec is efficient, there is always room for improvement and fine-tuning, since visual quality is so subjective. The Electrokompressiongraph enables the user to manually reallocate data within a video file by adjusting the Modulation Parameter, which is set to 1.0 by default. The modulation parameter is a value that is multiplied by a base quantizer to determine a new quantizer, which is used to increase or decrease quality for a specified frame. For example, if the codec thinks every frame should be encoded with a quantizer of 4.0 and the user sets the modulation to 1.25 for some frames, then these frames will be re-encoded using a quantizer of 5.0 (New Q = Modulation * Existing Q). The EKG will always aim to meet the target bitrate as closely as possible. If you increase data allocation for a group of frames, this data will be borrowed fairly from other frames, and if you decrease data for a group of frames, this data will be given fairly to other frames, aiming for the target bitrate and the best quality possible.

An example of how a user might use the EKG is to decrease data for the credits of a movie and increase it for the high-action sequences. A user might set modulation = 2.0 for the movie credits. For the rest of the movie the codec will use the quantizer needed to hit the user's target bitrate (let's say it's 3.3), and when it reaches the credits it will switch to 6.6, which saves bits on the credits. This can become rather confusing, since a lower modulation level increases quality (0.50 = 200% and 2.0 = 50%). To keep things simple, modulation in the EKG is listed as a quality percentage. Since the modulation parameter is a multiplier of the quantizer, its range is limited so that quality can be reduced by half or doubled. This is done by limiting the modulation parameter to the range 0.50 to 2.0.
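The relationship between modulation, quantizer and the displayed quality percentage can be sketched as follows. The function names are hypothetical, and the inverse mapping to a percentage is an assumption inferred from the two data points given in the text (0.50 = 200% and 2.0 = 50%), not a documented formula.

```python
MIN_MODULATION = 0.5
MAX_MODULATION = 2.0


def clamp_modulation(modulation):
    """Limit the modulation parameter to the range 0.5 to 2.0."""
    return max(MIN_MODULATION, min(MAX_MODULATION, modulation))


def modulated_quantizer(base_quantizer, modulation):
    """New Q = Modulation * Existing Q, with modulation kept in range."""
    return clamp_modulation(modulation) * base_quantizer


def quality_percent(modulation):
    """Assumed display value: the inverse of modulation, as a percentage."""
    return 100.0 / clamp_modulation(modulation)


print(modulated_quantizer(4.0, 1.25))   # 5.0
print(quality_percent(0.5))             # 200.0
print(quality_percent(2.0))             # 50.0
```

With these definitions, the credits example works out the same way: a base quantizer of 3.3 with modulation 2.0 gives a quantizer of about 6.6.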

1. Modulation Parameter range: 0.50 to 2.0

Modulation  Q (base Q = 2)  Q (base Q = 12)
0.50        1               6
0.60        1.2             7.2
0.70        1.4             8.4
0.80        1.6             9.6
0.90        1.8             10.8
1           2               12
1.1         2.2             13.2
1.2         2.4             14.4
1.3         2.6             15.6
1.4         2.8             16.8
1.5         3               18
1.6         3.2             19.2
1.7         3.4             20.4
1.8         3.6             21.6
1.9         3.8             22.8
2           4               24
Fig 1

With this we can see that the quantizer can be reduced to 50% or increased to 200% of its base value (0.5 to 2). To keep things simple and easily understood, the EKG plots the "Modulation Parameter" (Quality) as a percentage instead of as the multiplier. An example of how this is done is shown in Fig 1.

How Do I Use the EKG?

  1. Decide where your log files will be written by the DivX Pro codec and accessed by the EKG for editing. Do this through the Bitrate Control menu, in the Multipass encoding files section of the DivX Pro Codec. To prevent confusion later, give your log file a unique name that corresponds to the video file (e.g. TheMatrix.log).
     
  2. Encode your content as usual using Multipass Nth Pass Encoding. Make sure you run a full 2-pass encode. You can encode your file to any destination, but it is suggested that you save the file in the same folder your log file has been written to.

  3. After the second pass of encoding is completed open the log file with the EKG.

  4. After the log file is opened you will be prompted to open the corresponding video file.



    Once the file is opened you'll immediately see your video along with a graph plotting the required information:

    • Coding type (I-P-B)
    • Motion Complexity
    • Texture Complexity
    • Frame Size
    • Modulation Parameter

    The thin vertical black line represents the frame you are currently viewing.

  5. You can go to a specific sequence and see how data was allocated by using either the slider, play button or by using the hand icon to grab frames and slide along the graph. Determine a frame or sequence of frames where you want to increase or decrease quality. You can do this in the default screen or if you'd like to see data more prominently select the "Swap" button.
  6. Select the edit icon.
     
  7. After deciding which frames or sequences of frames you would like to modify use your mouse pointer to select a frame or group of frames. To select multiple frames or a sequence of frames hold down the "Shift" button. Once selected the frames will turn light blue and you will see a thick blue horizontal line representing the modulation parameter.
     
  8. Increase or decrease the quality for the selected frames by moving the modulation parameter, which is represented by the thick horizontal line. 100% represents the file's current quality level. Moving the modulation to 150% will increase the quality by 50% of the original value, and moving it down to 50% will decrease the quality to 50% of its original value. After you change the modulation parameter, the thin beige horizontal line will change to represent your new modulation parameter.
  9. Once satisfied with your changes, go to the EKG menu and select File > Save Changes.
     
  10. Go back to your encoding application and run another pass of encoding using "Nth-Pass". The DivX Pro Codec will use the modified log file to re-encode your video, changing the quality of the areas you have selected while intelligently adding data to or subtracting data from other areas, so that your selected changes are possible while still aiming at an accurate target bitrate and high quality. Make sure that you are using the correct log file.
     
  11. After running another encoding pass you can open the file again with EKG and see that the quality and data rate allocation of your video has been changed. You can modify it further or just decide that you're done and enjoy your video.

You can also change any of the bar graphs to line graphs, show or hide individual parameters, disable the video preview, change the information plotted on the X axis, and zoom in and out of your data.

DivX 5.0.4 CLI parameters

The command line interface parameters will be automatically updated as you use the GUI to change the codec parameters. The opposite is also true. When you type parameters into the CLI, the GUI will show the changes after you press the tab or enter key. This makes the CLI an easy shorthand for advanced users to use to manage their settings.

DivX Certification Profile

-Profile Profile_Number
    Profile_Number = 0   Free
                     1   Handheld Profile
                     2   Portable Profile
                     3   Home Theater Profile
                     4   High Definition Profile

Bitrate Mode

-b Bitrate
    Bitratemode = -bv1    1 pass mode (DivX Certified)
                  -bvn1   Multipass 1st pass (DivX Certified)
                  -bvnn   Multipass Nth pass (DivX Certified)
                  -b1     original 1 pass CBR mode (Non DivX Certified)
                  -b1q    1 pass Q based mode (Non DivX Certified)
                  -b21    original 2 pass 1st pass (Non DivX Certified)
                  -b22    original 2 pass 2nd pass (Non DivX Certified)

Examples:
    -bv1   512         1 pass followed by bitrate
    -bvn1  768         1st pass followed by bitrate
    -bvnn  768         Nth pass followed by bitrate
    -b1    780         original 1 pass followed by bitrate
    -b1q   50%,1000    1 pass Q based followed by a percentage (if % is
                       present); the optional second parameter is the
                       Max Bitrate
    -b1q   10.2        1 pass Q based followed by a quantizer
    -b21   800         original 2 pass 1st pass followed by bitrate
    -b22   800         original 2 pass 2nd pass followed by bitrate

VBV settings

-vbv VBV_Bitrate,VBV_Size,VBV_Occupancy

Here are the default values for each profile:
    Handheld          -vbv 128000,262144,196608
    Portable          -vbv 768000,1048576,786432
    Home Theater      -vbv 4000000,3145728,2359296
    High Definition   -vbv 8000000,6291456,4718592

    Do not change these values unless you are absolutely sure of what you are doing.
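To see how the three -vbv fields relate to one another, the default strings can be split and compared. This is a small sketch for inspection only: the helper names are hypothetical, and since this document does not state the units of the fields, the code compares only the ratio of occupancy to size.

```python
# Default -vbv arguments copied from the list above.
DEFAULT_VBV = {
    "Handheld": "128000,262144,196608",
    "Portable": "768000,1048576,786432",
    "Home Theater": "4000000,3145728,2359296",
    "High Definition": "8000000,6291456,4718592",
}


def parse_vbv(argument):
    """Split 'VBV_Bitrate,VBV_Size,VBV_Occupancy' into three integers."""
    bitrate, size, occupancy = (int(part) for part in argument.split(","))
    return bitrate, size, occupancy


def occupancy_ratio(argument):
    """Fraction of the buffer size given as the occupancy value."""
    _, size, occupancy = parse_vbv(argument)
    return occupancy / size


for profile, argument in DEFAULT_VBV.items():
    print(profile, occupancy_ratio(argument))
```

For all four defaults, the occupancy value works out to 75% of the buffer size.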

Crop

-c left,right,top,bottom

Resize

-r width,height,Quality
    Quality = 1   Bilinear
              2   Soft Bicubic
              3   Normal Bicubic
              4   Sharp Bicubic

Pre Processing Source

-pre [Strength] or [TemporalLevel,SpatialLevel,SpatialPasses]
    Strength = 1   Light
               2   Normal
               3   Strong
               4   Extreme

    TemporalLevel = from 0.0 to 1.0
    SpatialLevel  = from 0.0 to 1.0
    SpatialPasses = from 1 to 3

Psycho Visual Enhancements

-psy [Strength] or [FrameLevel,MacroblockLevel]
    Strength = 1   Light
               2   Normal
               3   Strong

    FrameLevel      = from 0.0 to 1.0
    MacroblockLevel = from 0.0 to 1.0

Maximum Key Frame interval

-key MaxInterval

Two pass encoding log and mv

-log LogFileName
-mv MVFileName     If the -mv option is specified during the 2nd pass, the
                   mv file will be used; if no -mv option is specified
                   during the 2nd pass, the MVs will be recalculated

Protect log/mv file

-p

Source Interlace/deinterlace

-d Method
    Method = 1   All Frames are progressive
             2   All Frames are interlaced
             3   Intelligent IVTC/deinterlace

Basic Video Deinterlace

-bvd

MPEG4 Tools

Quarter Pixel
-q
GMC
-g
Bi-directional encoding
-b

Original RC Data Rate

-dr MaxQ,MinQ,RCAveragingPeriod,RCReactionPeriod,RCDownUpRatio
Example
 	-dr 12,2,2000,10,20

Scene Change Threshold

-sc SceneChangePercent
Example
    -sc 50
    -sc 70%

Performance/Quality

-pq Value
    Value = 1   Fastest
            2   Fast
            3   Medium
            4   Slow
            5   Slowest
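
Because the CLI is a plain text string, a parameter string can be assembled from individual settings. The helper below is a hypothetical convenience sketch, not part of the DivX software: the function name and its arguments are invented, it covers only a few of the options listed above, and it assumes that options are separated by single spaces and can appear in this order.

```python
def build_parameters(profile=None, bitrate_mode=None, bitrate=None,
                     crop=None, resize=None, key_interval=None,
                     performance=None):
    """Join selected options into one parameter string.

    Hypothetical helper: option names come from the reference above;
    spacing and ordering are assumptions.
    """
    parts = []
    if profile is not None:
        parts.append(f"-Profile {profile}")
    if bitrate_mode is not None:
        parts.append(f"-{bitrate_mode} {bitrate}")
    if crop is not None:
        # crop = (left, right, top, bottom)
        parts.append("-c " + ",".join(str(value) for value in crop))
    if resize is not None:
        # resize = (width, height, quality)
        parts.append("-r " + ",".join(str(value) for value in resize))
    if key_interval is not None:
        parts.append(f"-key {key_interval}")
    if performance is not None:
        parts.append(f"-pq {performance}")
    return " ".join(parts)


print(build_parameters(profile=3, bitrate_mode="bvnn", bitrate=768,
                       resize=(640, 480, 3), key_interval=300,
                       performance=5))
# -Profile 3 -bvnn 768 -r 640,480,3 -key 300 -pq 5
```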