Structure of a .wav file

The .riff chunk descriptor

Now that you're ready to create your application, let's discuss the .wav file structure and format.

The .wav file format is derived from the Resource Interchange File Format (.riff) specification, which is used primarily to store multimedia data. The structure of a .riff file is based on chunks and subchunks. Each chunk and subchunk has a type that is represented by a four-character tag in big-endian byte order. Multiple kinds of data can be stored in a .riff file, but we'll describe the canonical .wav file format here. The first element in the file is the .riff chunk descriptor, which describes the kind of .riff file that we have.

  • Chunk identifier (4 bytes, big-endian): This field contains the letters "RIFF" in ASCII form. Each character is 1 byte.
  • Chunk size (4 bytes, little-endian): This field contains the size of the chunk data that appears immediately after this field. The total file size is the value of this field plus 8 bytes (for the chunk identifier and chunk size fields).
  • Format (4 bytes, big-endian): This field contains the letters "WAVE" in ASCII form. Each character is 1 byte.
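The descriptor maps naturally onto a small C structure. Here's a minimal sketch, assuming a little-endian host (so the chunk size can be read directly) and no padding between these naturally aligned fields; the struct and field names are illustrative, not part of any standard API:

    #include <stdint.h>

    /* Sketch of the 12-byte .riff chunk descriptor (illustrative names). */
    struct riff_descriptor {
        char     chunk_id[4];  /* "RIFF" in ASCII, not NUL-terminated */
        uint32_t chunk_size;   /* little-endian: total file size minus 8 bytes */
        char     format[4];    /* "WAVE" in ASCII, not NUL-terminated */
    };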

Format subchunk

Since the format value of the .riff file is specified as "WAVE", we know that this .riff file contains .wav data. The .wav file format requires that both the format and data subchunks be present and handled.

The format subchunk appears after the .riff chunk descriptor and describes metadata about the .wav file, such as the sample rate, sample width, and other important audio information. It is structured as follows:

  • Subchunk identifier (4 bytes, big-endian): This field contains the characters "fmt " (including the trailing space) in ASCII form. Each character is 1 byte.
  • Subchunk size (4 bytes, little-endian): This field contains the size, in bytes, of the remaining format subchunk that appears immediately after this field, which is 16 bytes for pulse code modulation (PCM). The total format subchunk size is this value plus 8 bytes (for the subchunk identifier and subchunk size fields).
  • Audio format (2 bytes, little-endian): This field identifies the encoding of the audio data. For uncompressed PCM, the value is 1; values other than 1 indicate some form of compression.
  • Number of channels (2 bytes, little-endian): This field is the number of separate streams of audio information. For example, if you use audio data for one channel (mono), the value is 1.
  • Sample rate (4 bytes, little-endian): This field is the number of samples per second (for example, 44100).
  • Byte rate (4 bytes, little-endian): This field is the number of bytes of audio data per second: the sample rate multiplied by the number of channels and by the number of bits per sample divided by eight (8 bits per byte).
  • Block align (2 bytes, little-endian): This field is the number of bytes per sample frame: the number of bits per sample divided by eight, multiplied by the number of channels.
  • Bits per sample (2 bytes, little-endian): This field is the number of bits used for each sample. For example, if you use 8 bits per sample, the value of this field is 8.

In PCM format, the subchunk ends here and there are no extra parameters. Non-PCM formats append extra parameters after the bits-per-sample field; when they are present, they must be read or skipped so that the rest of the file can be parsed correctly.
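Putting these fields together, a minimal sketch of the 16-byte PCM portion of the format subchunk could look like the structure below (illustrative names, little-endian host assumed); the byte rate and block align can be recomputed from the other fields as a consistency check:

    #include <stdint.h>

    /* Sketch of the 16-byte PCM format subchunk body (illustrative names);
     * the "fmt " identifier and the subchunk size field precede it. */
    struct wav_fmt {
        uint16_t audio_format;     /* 1 = uncompressed PCM */
        uint16_t num_channels;     /* 1 = mono, 2 = stereo, ... */
        uint32_t sample_rate;      /* samples per second, e.g. 44100 */
        uint32_t byte_rate;        /* sample_rate * num_channels * bits_per_sample / 8 */
        uint16_t block_align;      /* num_channels * bits_per_sample / 8 */
        uint16_t bits_per_sample;  /* e.g. 8 or 16 */
    };

For example, 16-bit stereo audio at 44100 samples per second gives a byte rate of 44100 * 2 * 16 / 8 = 176400 bytes per second and a block align of 2 * 16 / 8 = 4 bytes per sample frame.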

Data subchunk

After the format subchunk comes the data subchunk, which contains the size of the audio data (in bytes) followed by the sound data itself. Here is the structure of the data subchunk:

  • Subchunk identifier (4 bytes, big-endian): This field contains the letters "data" in ASCII form. Each character is 1 byte.
  • Subchunk size (4 bytes, little-endian): This field contains the size, in bytes, of the audio data that appears immediately after this field. This size is also the number of samples multiplied by the number of channels and by the number of bits per sample divided by eight.
  • Data (subchunk size bytes, little-endian): This field contains the audio samples themselves. The data is read based on the metadata obtained above.
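Some .wav files place other optional chunks (such as LIST or fact) between the format and data subchunks, so a robust reader walks the chunks by identifier and size instead of assuming a fixed layout. The helper below is an illustrative sketch of that scan, assuming a little-endian host; the function name and error handling are not from any particular API:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Scan forward for the "data" subchunk, starting just after the
     * 12-byte .riff chunk descriptor. Returns the data size in bytes,
     * or 0 if the subchunk isn't found. (Illustrative sketch.) */
    static uint32_t find_data_chunk(FILE *fp)
    {
        char     id[4];
        uint32_t size;

        while (fread(id, 1, 4, fp) == 4 && fread(&size, 4, 1, fp) == 1) {
            if (memcmp(id, "data", 4) == 0) {
                return size;  /* the file position is now at the first sample */
            }
            /* Not the data subchunk: skip its payload (chunks are
             * word-aligned, so odd sizes are followed by a pad byte). */
            if (fseek(fp, (long)(size + (size & 1)), SEEK_CUR) != 0) {
                break;
            }
        }
        return 0;
    }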

Handling .wav audio

Now we can start to handle the .wav file. Your application may include other features, but for handling .wav files, you can follow a process that's similar to the following (a skeleton of this flow is sketched after the list):

  • Parse the metadata from the .wav file
  • Set up the libasound mixer and PCM components
  • Set up the main audio and event loop
  • Handle audio event changes in the loop
  • Set up file descriptor sets
  • Fill the sample buffer and write the samples to the audio device
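As a rough skeleton, the flow above might be organized like this; the parsing steps correspond to the sketches earlier in this section, while the audio-device steps are left as comments because the exact mixer and PCM calls depend on the libasound variant on your target:

    #include <stdio.h>

    /* Skeleton of the playback flow (illustrative; the libasound-specific
     * steps are placeholders, not actual API calls). */
    int play_wav(const char *path)
    {
        FILE *fp = fopen(path, "rb");
        if (fp == NULL) {
            return -1;
        }

        /* 1. Parse the metadata: read the .riff chunk descriptor and the
         *    format subchunk, then locate the data subchunk and its size. */

        /* 2. Set up the libasound mixer and PCM components using the
         *    channel count, sample rate, and bits per sample you parsed. */

        /* 3. Main audio and event loop:
         *    - build the file descriptor sets and wait for activity
         *    - handle audio event changes (for example, mixer events)
         *    - fill the sample buffer from the data subchunk and write
         *      the samples to the audio device until the data runs out. */

        fclose(fp);
        return 0;
    }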

