Structure of .wav file
Now that you are ready to create your application, let's discuss the .wav file structure and format.
Structure of .wav file format
The .wav file format is derived from the Resource Interchange File Format (.riff) specification, which can hold different kinds of data but is used primarily for multimedia data. The structure of a .riff file is based on chunks and sub-chunks. Each chunk or sub-chunk has a type that is represented by a four-character tag in big-endian byte order. Because multiple kinds of data can be stored in the .riff format, we describe only the canonical .wav file format here. The file begins with the .riff chunk descriptor, which identifies the kind of .riff file that we have.
The .riff chunk descriptor
| Field name and description | Field size in bytes | Endian byte order |
|---|---|---|
| Chunk identifier: This field contains the letters "RIFF" in ASCII form. Each character is a single byte. | 4 | Big |
| Chunk size: This field contains the size of the memory chunk that follows after this field. The total file size, in bytes, is the value of this field plus 8 bytes (chunk identifier and chunk size). | 4 | Little |
| Format: This field contains the letters "WAVE" in ASCII form. Each character is a single byte. | 4 | Big |
Since the format value of the .riff file is specified as "WAVE", we know this .riff file contains .wav data only. The .wav file format requires two sub-chunks to be handled: format and data.
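
As a rough illustration of the descriptor layout above, the following sketch validates the first 12 bytes of a file and derives the expected total file size. The helper names `read_le32` and `parse_riff_descriptor` are hypothetical and not part of any library; the sketch assumes the 12 header bytes have already been read into memory.

```c
#include <stdint.h>
#include <string.h>

/* Decode a 32-bit little-endian value byte by byte, so the result
 * does not depend on the byte order of the host CPU. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: checks the 12-byte .riff chunk descriptor.
 * Returns 0 and stores the expected total file size on success,
 * or -1 if the tags are not "RIFF" and "WAVE". */
static int parse_riff_descriptor(const uint8_t hdr[12], uint32_t *file_size)
{
    if (memcmp(hdr, "RIFF", 4) != 0)        /* chunk identifier */
        return -1;
    if (memcmp(hdr + 8, "WAVE", 4) != 0)    /* format */
        return -1;
    /* chunk size + 8 bytes for the identifier and size fields */
    *file_size = read_le32(hdr + 4) + 8;
    return 0;
}
```

The tags are compared as raw bytes in file order, which is what "big-endian" means for a four-character tag. Only the size field needs byte-order conversion.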
Format sub-chunk
Following the .riff chunk descriptor is the format sub-chunk. The format sub-chunk holds metadata about the .wav file, such as sample rate, sample width, and other important audio information. The structure of the format sub-chunk is as follows:
| Field name and description | Field size in bytes | Endian byte order |
|---|---|---|
| Sub-chunk identifier: This field contains the characters "fmt " (including the space) in ASCII form. Each character is a single byte. | 4 | Big |
| Sub-chunk size: This field is the size, in bytes, of the remainder of the format sub-chunk that follows this field, which is 16 bytes for Pulse-Code Modulation (PCM). The total format sub-chunk size is this value plus 8 bytes (4 for the sub-chunk identifier and 4 for this size field). | 4 | Little |
| Audio format: This field determines the audio format. For PCM, the value is 1. Values other than 1 indicate some form of compression. | 2 | Little |
| Number of channels: This field is the number of separate streams of audio information. For example, if your audio data has 1 channel (mono), the value is 1. | 2 | Little |
| Sample rate: This field is the number of samples per unit of time, typically per second. | 4 | Little |
| Byte rate: This field is the sample rate multiplied by the number of channels and by the number of bits per sample divided by eight (8 bits per byte). | 4 | Little |
| Block align: This field is the number of bits per sample divided by eight and multiplied by the number of channels. | 2 | Little |
| Bits per sample: This field is the number of bits used for each sample. For example, if you're using 8 bits per sample, the value of this field is 8. | 2 | Little |
| Extra parameters: In PCM format, this space is empty. In non-PCM formats, it holds extra parameters, which must be skipped to read the rest of the file correctly. | * | * |
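
To make the field layout concrete, here is a minimal sketch that decodes the 16 bytes following the sub-chunk size field and checks that the two derived fields agree with the others. The struct `wav_format` and the helpers `decode_fmt_body` and `fmt_is_consistent` are hypothetical names chosen for illustration, and the consistency check assumes uncompressed PCM.

```c
#include <stdint.h>

/* Little-endian decoders that work regardless of host byte order. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical container for the format sub-chunk fields. */
struct wav_format {
    uint16_t audio_format;     /* 1 for PCM */
    uint16_t channels;
    uint32_t sample_rate;
    uint32_t byte_rate;
    uint16_t block_align;
    uint16_t bits_per_sample;
};

/* Decode the 16 bytes that follow the sub-chunk size field. */
static void decode_fmt_body(const uint8_t body[16], struct wav_format *f)
{
    f->audio_format    = read_le16(body + 0);
    f->channels        = read_le16(body + 2);
    f->sample_rate     = read_le32(body + 4);
    f->byte_rate       = read_le32(body + 8);
    f->block_align     = read_le16(body + 12);
    f->bits_per_sample = read_le16(body + 14);
}

/* Returns 1 if block align and byte rate match the values derived
 * from the channel count, sample width, and sample rate (PCM only). */
static int fmt_is_consistent(const struct wav_format *f)
{
    uint32_t block_align = (uint32_t)f->channels * f->bits_per_sample / 8;
    return f->block_align == block_align &&
           f->byte_rate == f->sample_rate * block_align;
}
```

For 16-bit stereo audio at 44100 samples per second, the block align is 2 × 16 / 8 = 4 bytes and the byte rate is 44100 × 4 = 176400 bytes per second.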
Data sub-chunk
After the format sub-chunk is the data sub-chunk. The data sub-chunk contains the size of the audio data (in bytes) followed by the audio data itself. Here is the structure of the data sub-chunk:
| Field name and description | Field size in bytes | Endian byte order |
|---|---|---|
| Sub-chunk identifier: This field contains the letters "data" in ASCII form. Each character is a single byte. | 4 | Big |
| Sub-chunk size: This field is the size, in bytes, of the audio data that follows this field. This size is equal to the number of bits per sample divided by eight, multiplied by the number of samples per channel and by the number of channels. | 4 | Little |
| Data: This field holds the audio data. The data is read according to the metadata obtained from the format sub-chunk. | * | Little |
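
One way to locate the format and data sub-chunks is to walk the sub-chunks that follow the 12-byte .riff chunk descriptor, using each size field to jump to the next tag. The sketch below assumes the whole file is already in a memory buffer. The names `find_subchunk` and `frame_count` are hypothetical. Skipping by the size field also steps over any extra format parameters and any sub-chunks that the application does not handle.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: searches buf[0..len) for the sub-chunk with the
 * given four-character tag. Returns the offset of the sub-chunk's
 * contents and stores its size in *size, or returns 0 if not found. */
static size_t find_subchunk(const uint8_t *buf, size_t len,
                            const char tag[4], uint32_t *size)
{
    size_t pos = 12;                       /* skip the .riff descriptor */

    while (pos + 8 <= len) {
        uint32_t sz = read_le32(buf + pos + 4);

        if (memcmp(buf + pos, tag, 4) == 0) {
            *size = sz;
            return pos + 8;
        }
        if ((size_t)sz > len - pos - 8)    /* size runs past the buffer */
            break;
        /* RIFF pads odd-sized chunks with one byte to keep alignment. */
        pos += 8 + (size_t)sz + (sz & 1u);
    }
    return 0;
}

/* Number of sample frames (one sample for every channel) in the data. */
static uint32_t frame_count(uint32_t data_size, uint16_t block_align)
{
    return block_align ? data_size / block_align : 0;
}
```

A caller would typically look up `"fmt "` first, decode the format fields, and then look up `"data"` to obtain the offset and size of the audio samples.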
Handling .wav audio
Now, we can start to handle the .wav file. Your application may include other features, but for handling .wav files, you can follow a process that's similar to the following:
- Parse the metadata from the .wav file
- Set up the libasound mixer and PCM components
- Set up the main audio and event loop
- Handle audio event changes inside the loop
- Set up file descriptor sets
- Fill the sample buffer and write the samples to the audio device
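
The last step in the list can be sketched independently of the audio library. In the sketch below, the device write is represented by a hypothetical callback type, `write_fn`, rather than a real libasound call, and `count_write` is a stand-in sink that only counts bytes. The point is to show the buffer being filled in whole frames (multiples of the block align value) before each write.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the device write. In a real application,
 * the PCM write call of your audio library would go here. It returns
 * the number of bytes accepted, or 0 to stop. */
typedef size_t (*write_fn)(const uint8_t *samples, size_t nbytes, void *ctx);

/* Example sink used for illustration: counts the bytes it receives. */
static size_t count_write(const uint8_t *samples, size_t nbytes, void *ctx)
{
    (void)samples;
    *(size_t *)ctx += nbytes;
    return nbytes;
}

/* Copies audio data into buf in pieces that hold whole frames and
 * passes each piece to write_cb. Returns the number of bytes written. */
static size_t play_samples(const uint8_t *data, size_t data_size,
                           size_t block_align,
                           uint8_t *buf, size_t buf_size,
                           write_fn write_cb, void *ctx)
{
    size_t chunk, done = 0;

    if (block_align == 0)
        return 0;
    chunk = buf_size - buf_size % block_align;   /* whole frames only */
    if (chunk == 0)
        return 0;

    while (done < data_size) {
        size_t n = data_size - done;
        size_t written;

        if (n > chunk)
            n = chunk;
        memcpy(buf, data + done, n);             /* fill the sample buffer */
        written = write_cb(buf, n, ctx);         /* write to the device */
        if (written == 0)
            break;
        done += written;
    }
    return done;
}
```

In an event-driven application, each iteration of this loop would normally run only when the audio device reports that it is ready for more data.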