Reading a file

Most users simply want read access for a player toy or something. A good place to start is before opening the file, making sure it is Quicktime with quicktime_check_sig().


This returns 1 if it looks like a Quicktime file or 0 if it doesn't. Then you can open the file as described in opening.html.

Next get the number of tracks for each media type in the file:

int quicktime_video_tracks(quicktime_t *file);
int quicktime_audio_tracks(quicktime_t *file);

While Quicktime can store multiple video tracks, the audio track count is a bit more complicated. Usually you'll only encounter a single audio track. Inside the audio track is a variable number of channels. To get the channel count call:

int quicktime_track_channels(quicktime_t *file, int track);

With the track parameter set to track 0. Many routines require a track parameter to specify the track to operate on. Tracks are always numbered from 0 to the total number of tracks - 1 for the particular media type.

Audio tracks are numbered from 0 to the total number of audio tracks - 1. But like I said, you'll probably never encounter an audio track higher than 0. Other routines you might find useful for getting audio information are:

long quicktime_sample_rate(quicktime_t *file, int track);
long quicktime_audio_length(quicktime_t *file, int track);

quicktime_audio_length gives you the total number of samples. The sample rate is samples per second.

Routines you'll never use unless you want to write a codec are:

char* quicktime_audio_compressor(quicktime_t *file, int track);
int quicktime_audio_bits(quicktime_t *file, int track);

The audio compressor call returns a 4 byte array identifying the data compression of the track. These identifiers are 4 alphanumeric characters which go along with one of the #defines in quicktime.h. The bits function returns the number of bits in a sample, usually meaningless.

The most interesting contents of a Quicktime file are of course the video tracks. Quicktime stores multiple video tracks.

The available queries for each video track are:

long quicktime_video_length(quicktime_t *file, int track);
int quicktime_video_width(quicktime_t *file, int track);
int quicktime_video_height(quicktime_t *file, int track);
float quicktime_frame_rate(quicktime_t *file, int track);
long quicktime_frame_size(quicktime_t *file, long frame, int track);
int quicktime_video_depth(quicktime_t *file, int track);
quicktime_reads_cmodel(quicktime_t *file, int colormodel, int track);

Tracks are numbered 0 to the total number of tracks - 1. The video length is in frames. The width and height are in pixels. The frame rate is in frames per second. Depth returns the total number of bits per pixel. The only two values Quicktime for Linux returns are 24 and 32 and the 32 bit depth is only returned when the format has an alpha channel. There's no reason to use 16 or 8.

quicktime_reads_cmodel allows you to determine the optimum color model for decompression output. It requires a colormodel #define from colormodels.h. If the codec can generate the desired colormodel without downsampling it returns 1. If downsampling is required it returns 0. You can assume all colormodels in colormodels.h are supported, whether they require downsampling or not.

To get the four byte compressor type for the track issue:

char* quicktime_video_compressor(quicktime_t *file, int track);

Unless you get a really nihilistic file for reading, you can safely assume the encoding scheme for track 0 of audio or video is the same for all tracks.

Decoding video

The library decodes compressed video frames into a buffer in whatever colormodel you desire but before then you should issue

int quicktime_supported_video(quicktime_t *file, int track);

to find out if the data for the track can be decoded by the library. This returns 1 if it is and 0 if it isn't supported.

Then use

long quicktime_decode_video(quicktime_t *file,
	unsigned char **row_pointers,
	int track);

to decompress a frame at the current position of the track into **row_pointers and advance the current position. The array of rows must have enough space allocated for the entire frame, depending on the colormodel. Planar colormodels use only the first 3 row pointers, each pointing to one of the planes.

Several parameters determine the decoder output. They may be set before the call to quicktime_decode_video.

void quicktime_set_cmodel(quicktime_t *file, int colormodel);

Set the colormodel of the output frame to a value in colormodels.h. The default is BC_RGB888.

void quicktime_set_row_span(quicktime_t *file, int row_span);

Set the number of bytes in a row. The default is the row width * bytes per pixel.

void quicktime_set_window(quicktime_t *file,
	int in_x,                    /* Location of input frame to take picture */
	int in_y,
	int in_w,
	int in_h,
	int out_w,                   /* Dimensions of output frame */
	int out_h);

The decoder "sees" a region of the movie screen defined by in_x, in_y, in_w, in_h and transfers it to the frame buffer defined by **row_pointers. The size of the frame buffer is defined by out_w, out_h. The default is a 1:1 transfer from the codec to the output frame.

For more about the track's current position go to positioning


There are other routines for reading compressed data and chunks without a codec. These allow you to perform direct copying of video data from one movie to the other after editing it, without recompressing it.

Decoding audio

For reading audio, first use:

int quicktime_supported_audio(quicktime_t *file, int track);

To determine if the audio can be decompressed by the library. This returns 1 if it is and 0 if it isn't supported. Then use

int quicktime_decode_audio(quicktime_t *file, int16_t *output_i, float *output_f, long samples, int channel);

To read a buffer's worth of samples for a single channel starting at the current position in the track. Notice this command takes a channel argument not a track argument. The channel argument is automatically converted into a track and channel. Positioning information is automatically taken from the appropriate track and advanced for all the channels in the track.

Notice the int16_t* and float* parameters. This call can either return a buffer of int16 samples or float samples. The argument for the data format you want should be passed a preallocated buffer big enough to contain the sample range while the undesired format should be passed NULL. For a buffer of float samples you would say

result = quicktime_decode_audio(file, NULL, output_f, samples, channel);

For a buffer of signed int16 samples you would say

result = quicktime_decode_audio(file, output_i, NULL, samples, channel);

The data format you don't want should be passed a NULL. The decoder automatically fills the appropriate buffer. Floating point samples are from -1 to 0 to 1.

Reading raw video

long quicktime_read_frame(quicktime_t *file, unsigned char *video_buffer, int track);

quicktime_read_frame reads one frame worth of raw data from your current position on the specified video track and returns the number of bytes in the frame. You have to make sure the buffer is big enough for the frame. A return value of 0 means error.

long quicktime_frame_size(quicktime_t *file, long frame, int track);

gives up the number of bytes in the specified frame in the specified track even if you haven't read the frame yet. Frame numbers start on 0.

Accessing Keyframes

Quicktime offers very simple support for keyframes: a table of all the keyframe numbers in a track. There are two things you can do with the keyframe table: insert keyframe numbers and retrieve keyframe numbers.

long quicktime_get_keyframe_before(quicktime_t *file, long frame, int track)

Gets the keyframe number before the frame argument. The frames start on 0.

long quicktime_get_keyframe_after(quicktime_t *file, long frame, int track);

Gets the keyframe number after the frame argument. The frames start on 0.

void quicktime_insert_keyframe(quicktime_t *file, long frame, int track)

Inserts a keyframe into the table. The frame argument starts on 0.

int quicktime_has_keyframes(quicktime_t *file, int track);

Returns TRUE if the track has keyframes. The track starts on 0.

Reading raw audio

There is no simple read or write access to raw audio. Due to vagaries in the audio indexing and the lack of benefit in direct audio copying, you're better off using a codec.