Recording audio

Recording audio is one of the fundamental tasks that many multimedia apps perform. You can record audio with Cascades, the multimedia renderer service, or the QNX Sound Architecture (QSA).

Prerequisites

To use the multimedia renderer client API, you must include the renderer.h header file. This API exposes functions that you can use to connect to mm-renderer, create contexts, attach inputs and outputs, and manage playback.

For more information about the prerequisites for multimedia, see Multimedia.

Permissions

Before your app can start recording or playing audio, there are two permissions that you must add to its bar-descriptor.xml file. The first permission is the record_audio permission, which allows your app to access the device microphone to record audio. The second permission is the access_shared permission, which lets your app access files that are located in the shared areas of the device. This access permission is required to play those media files.

You can set these permissions in the Momentics IDE by opening the project's bar-descriptor.xml file and selecting the Microphone check box and the Shared Files check box on the Application tab.

The microphone permissions check box.

Alternatively, you can manually edit the bar-descriptor.xml file on the Source tab and add the record_audio and access_shared elements there. Here's a code sample that shows you how to add the record_audio and access_shared elements to the XML of the bar-descriptor file:

//...

<permission>record_audio</permission>
<permission>access_shared</permission>

For more information, see The bar-descriptor.xml file.

For more information about the shared data areas of the device, see File system access.

Libraries for the QSA

The QSA only supports the Advanced Linux Sound Architecture (ALSA) 5 drivers, which are accessed using the libasound.so library. Direct use of ioctl() commands is not supported, because of the requirements of the ALSA API.

The ALSA API uses ioctl() commands in ways that are not permitted in the BlackBerry 10 OS. For example, passing a structure that contains a pointer by using an ioctl() function call is not supported in the BlackBerry 10 OS.

The asound library is required when using the QSA. The asound library is licensed under the GNU Lesser General Public License (LGPL). You must include the asound library in your apps as a shared library (libasound.so), not as a static library.

For more information about using libraries, see Using libraries.

Recording audio with Cascades

Your app must create an AudioRecorder before you can record audio with Cascades. After an AudioRecorder has been created, it must set an output target to save the recorded content. Specifically, the output target is the name of a file where the recording is saved. The AudioRecorder class has no visual element, which means that you must create a UI to allow your app to record audio. The AudioRecorder can be used together with the NowPlayingConnection to display information about the media that is currently active on the device. Like the AudioRecorder class, the NowPlayingConnection class doesn't have a visual element, but it's capable of sending data about the media to the volume overlay where it can be displayed to the user. The NowPlayingConnection can also be used to receive media control event notifications.

When using AudioRecorder and NowPlayingConnection, you're responsible for connecting all signals and slots to your UI controls. After you've added controls to your UI and connected your signals and slots to them, you can begin to put in the code that makes everything work.

Setting up a recording app

After you've created an AudioRecorder, you can use it to call any one of the following functions:
  • prepare(): Your app can call this function to acquire the necessary resources for recording media content, without actually recording a track. When prepare() is called, the recorder acquires the necessary resources to record the track and then emits the prepared() signal.
  • record(): Your app can call this function to begin recording your track. If your app calls this function without first calling prepare(), then the recorder will call prepare() automatically. When recording begins, the recorder emits the recording() signal.
  • pause(): Your app can call this function to pause the recording task. When the recording is paused, the recorder emits the paused() signal. Calling this function while the recording is already paused does nothing.
  • reset(): Your app can call this function to release any resources that are currently held by the recorder, and move the recorder into the unprepared state. A call to reset() causes the recorder to emit the mediaStateChanged() signal, which can be used to notify your app that the recorder is in the unprepared state and no longer holds the resources required to record.

The following code samples assume that you have a UI that contains a set of Button controls, which can be used to call the AudioRecorder functions needed to start or stop audio recording.

For more information on supported media formats for your recordings, see BlackBerry 10 media support.

Sources of sound

With Cascades, playing sounds in an app requires some setup and a few lines of code. In addition to media files such as music, movie, web, or voice audio files, there are also BlackBerry 10 OS system sounds and custom sounds.

BlackBerry 10 OS system sounds

Every device comes preloaded with BlackBerry 10 OS system sounds. The operating system uses these sounds to indicate various events. For example, a General Notification sound alerts the user to an important system event, a Camera Shutter Event sound indicates that the device camera has been activated, and a Low Battery Event sound indicates that the battery power level is low. You can play these system sounds in your apps by using the SystemSound class.

Custom sounds

You can create one or more custom sounds, save them in any one of the most popular sound file formats, and play them in your apps using the MediaPlayer class and a few lines of code. The MediaPlayer class can play many sound file formats including .mp3, .wav, and .wma to name a few.

For more information on the file formats supported, see BlackBerry media support at a glance.

Recording audio with mm-renderer

You can record audio in mm-renderer by attaching the input to an audio recording device and directing the output to a file instead of a device.

The mm-renderer can record audio but not video. To record video content, use the Camera C APIs. For more information, see Camera Library.

You can record audio for as long as you like, but you must ensure that your client app's output file can hold all the content you want to record. The size of the generated output depends on many settings, including the sampling rate and number of channels. For example, you can increase the sampling rate to get better audio quality, such as using the standard CD sampling rate of 44.1 kHz (frate=44100). However, increasing the sampling rate also increases the size of the generated output.
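As a rough guide to sizing the output file, the following sketch estimates the size of uncompressed PCM audio from the sampling rate, the number of channels, the sample width, and the duration. The pcm_output_bytes() function is a hypothetical helper, not part of mm-renderer, and the estimate assumes uncompressed samples with no container overhead. Compressed formats such as AMR produce much smaller files.

```c
#include <stdint.h>

/* Hypothetical helper: estimates the size, in bytes, of uncompressed
 * PCM audio. Assumes no compression and no container overhead. */
uint64_t pcm_output_bytes(uint32_t frate, uint32_t nchan,
                          uint32_t bits_per_sample, uint32_t seconds)
{
    /* One frame holds one sample for every channel. */
    uint64_t bytes_per_frame = (uint64_t)nchan * (bits_per_sample / 8);

    return (uint64_t)frate * bytes_per_frame * seconds;
}
```

For example, one minute of 16-bit stereo audio at 44100 Hz comes to 10,584,000 bytes, while 30 seconds of 16-bit mono audio at 8000 Hz comes to 480,000 bytes.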

Record audio with Cascades or mm-renderer

Each platform has its own way of recording audio. For example, in Cascades you can use QML or C++ with signals and slots. In the mm-renderer service you can make various C API function calls, and in the QNX Sound Architecture (QSA) you can use its C API to set up PCM devices to use for recording audio.

Here are some code samples that demonstrate how to record audio. The first two (in the QML and C++ tabs) show you how to record audio in Cascades. The last code sample (in the C tab) shows you how to record audio with the multimedia renderer service.

A code sample that shows you how to record audio with the QSA wasn't included here because recording audio with that platform is a process that provides you with many options. One brief code sample could not adequately represent all of the options available for recording audio with the QSA. For more information about using the QSA, see QNX Sound Architecture APIs.

Here's a code sample that uses signals and slots to start and stop a recording by responding to the button control's clicked() signal. When the user clicks the btnRecord button, its onClicked slot is called to begin recording. When the user clicks the btnStop button, its onClicked slot is called and the recording is stopped.

The code sample shows you how to set the path (and file name) for the outputUrl attribute of the AudioRecorder. The outputUrl attribute determines where the recording will be saved. In this case, the recording is saved in a file called recording.m4a within the misc folder of the shared area on the device.

import bb.multimedia 1.4

// ... 
    
attachedObjects: [ 
    AudioRecorder { 
        id: recorder 
        outputUrl: "file:///accounts/1000/shared/misc/recording.m4a" 
    } 
]

// ...

Button {
    id: btnRecord
    text: "Record"
    
    onClicked: { 
        recorder.record(); 
    }
}

Button {
    id: btnStop
    text: "Stop"
    
    onClicked: { 
        recorder.reset(); 
    }
}

You must call the setOutputUrl() function, and set a valid path to the location where you want the recorder to save your recording. You must call the setOutputUrl() function before calling the prepare() function. The path that you set in the setOutputUrl() function should point to the location of a local file on the device. This parameter represents the path to the target file where the recording is saved. The setOutputUrl() function takes a QUrl as its only parameter.

The recording process is started by calling the record() function and it's stopped by calling the reset() function. Using button control signals and slots is a good way to start and stop audio recordings. For example, when the user clicks a button called btnRecord, its btnRecordOnClick() slot is called. The btnRecordOnClick() slot calls the record() function to begin recording. When the user clicks a button called btnStop, its btnStopOnClick() slot is called and audio recording is stopped by calling the reset() function.

ApplicationUI::ApplicationUI()
{
    //...
    
    bool result;
    Q_UNUSED(result);
    
    result = connect(btnRecord, SIGNAL(clicked()), 
                     this, SLOT(btnRecordOnClick()));
                  
    Q_ASSERT(result);
    
    result = connect(btnStop, SIGNAL(clicked()), 
                     this, SLOT(btnStopOnClick()));
    
    Q_ASSERT(result);
    
    // ...
}

// ...

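void ApplicationUI::btnRecordOnClick()
{
    // recorder is assumed to be an AudioRecorder member of 
    // ApplicationUI. A local AudioRecorder would be destroyed 
    // when this slot returns, and btnStopOnClick() could not 
    // reach it.
    
    // Set the path to the location of the recording
    recorder.setOutputUrl(QUrl("file:///accounts/1000/shared/misc/recording.m4a")); 
    
    // Start the recorder
    recorder.record();
}

void ApplicationUI::btnStopOnClick()
{
    // Stop the recorder and release its resources
    recorder.reset();
}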

// ...

The following mm-renderer code sample shows you how to record audio from a microphone and store it in a file. The code sample gives mm-renderer an input URL of type snd: to select and configure an audio capture device (microphone) and sets an output URL type of file: to target a file. The code sample starts and stops playback to record audio content to the targeted file. The snd: input URL format works only with the file: output type, so your code must follow this design.

The code sample records in mono by specifying one channel (nchan=1) in the input URL. Depending on your platform, your microphone device might have two recorders, so you could record in stereo by setting two channels (nchan=2).

For more information on the available device options, see the list of URL parameters for audio capture devices.
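If your app lets the user choose between mono and stereo, you might build the input URL at run time instead of hard-coding it. The following sketch formats an snd: URL for the pcmPreferredc device with the nchan and frate parameters described above. The build_capture_url() function is a hypothetical helper, and the sketch doesn't check whether the device actually supports the requested values.

```c
#include <stdio.h>

/* Hypothetical helper: formats an snd: input URL for the preferred
 * capture device. Returns 0 on success, or -1 if the buffer is too
 * small to hold the whole URL. */
int build_capture_url(char *buf, size_t size, int nchan, int frate)
{
    int n = snprintf(buf, size,
                     "snd:/dev/snd/pcmPreferredc?nchan=%d&frate=%d",
                     nchan, frate);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The resulting string can then be passed as the input URL in place of the literal string used in the code sample.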

You can modify the code sample to record a voice call by using "snd:/dev/snd/voicebandc" as the input URL to mmr_input_attach() instead of "snd:/dev/snd/pcmPreferredc?nchan=1&frate=8000". Unless you specify different values, mm-renderer uses a 48 kHz sampling rate and 2 channels, which is equivalent to using "snd:/dev/snd/voicebandc?frate=48000&nchan=2" as the input URL.

This code sample uses an AMR file for the output, but mm-renderer supports other formats, such as wideband AMR (see the list of supported output file formats).

The BlackBerry 10 Device Simulator only has codecs that support .wav files for the output. To record audio in other file formats (including .amr files), you must use real hardware that supports that format.

#include <mm/renderer.h>
#include <sys/stat.h>   // S_IRWXU
#include <unistd.h>     // sleep()

void record_AMR_file() 
{
  mmr_connection_t *connection;
  mmr_context_t *context;
  const char* context_name = "AnyNameYouWant";
  int output = 0;
  const char* outputFile = "/tmp/testFile.amr";
  int input = 0;

  connection = mmr_connect(NULL);

  if (connection) {
      context = mmr_context_create( connection, 
                                    context_name, 
                                    0, 
                                    S_IRWXU );

      if (context) {
          // Specify a file output so the audio is 
          // not played but recorded in a file
          output = mmr_output_attach( context,
                                      outputFile,
                                      "file" );
                                     
          // Specify the audio device under /dev/snd you want to 
          // use for the recording and the recording details 
          // (in this case, we use a sampling rate of 8000 Hz and 
          // 1 channel for mono (not stereo) recording)
          input = mmr_input_attach( context,
                  "snd:/dev/snd/pcmPreferredc?nchan=1&frate=8000",
                                   "track" );

          // Start recording
          mmr_play(context);

          // Delay for the length of time you want to record 
          // (in this case, 30 seconds)
          sleep(30);             

          // Stop recording
          mmr_stop(context);
          
          // Clean up the context
          mmr_input_detach(context);
          mmr_output_detach(context, output); 
          mmr_context_destroy(context); 
      }

      mmr_disconnect(connection); 
  }
}

Recording audio with QSA

Before you can record audio, you have to open and configure a PCM recording device and prepare the PCM subchannel.

Working with PCM devices

The software processes for recording or playing audio are similar. This section describes the common steps.

Open the PCM device

In order to play or record sound, you must open a connection to a PCM playback or recording device.

The functions used to open a PCM device are:

snd_pcm_open_name()
Use this function when you want to open a specific hardware device and you know its name.
snd_pcm_open()
Use this function when you want to open a specific hardware device and you know its card and device number.
snd_pcm_open_preferred()
Use this function to open the user's preferred device. Using this function makes your app more flexible because you don't need to know the card and device numbers. The function can provide you with the card and device that it opened.

All of these functions set a PCM connection handle that you can use as an argument to all other PCM functions. This handle is analogous to a file stream handle. It's a pointer to a snd_pcm_t structure, which is an opaque data type.

These functions, like others in the QSA API, work for both record and playback channels. They take as an argument a channel direction, which is one of:

  • SND_PCM_OPEN_CAPTURE
  • SND_PCM_OPEN_PLAYBACK

The following code sample uses these functions to open a playback device:

When a card and device number are specified, the code opens a connection to that specific PCM playback device. When a card is not specified, the code creates a connection to the preferred PCM playback device, then snd_pcm_open_preferred() stores the card and device numbers in the given variables.

if (card == -1)
{
    if ((rtn = snd_pcm_open_preferred (&pcm_handle,
                  &card, &dev,
                  SND_PCM_OPEN_PLAYBACK)) < 0)
        return err ("device open");
}
else
{
    if ((rtn = snd_pcm_open (&pcm_handle, card, dev,
                  SND_PCM_OPEN_PLAYBACK)) < 0)
        return err ("device open");
}

Configure the PCM device

Before you can begin playing or capturing a sound stream, you must let the device know the format of the data that you're about to send to it or that you want to receive from it. You can define the data format by creating a structure and using it to call snd_pcm_channel_params() or snd_pcm_plugin_params(). For more information, see PCM plugin converters.

Both of these functions will fail when the device cannot support the data parameters that you set, or when all of the subchannels of the device are currently in use.

The functions used to determine the current capabilities of a PCM device are:

snd_pcm_plugin_info()
Use the plugin converters. When the hardware has a free subchannel, the capabilities that are returned are extensive because the plugin converters make any necessary conversion.
snd_pcm_channel_info()
Access the hardware directly. This function returns only what the hardware capabilities are.

Both of these functions take a pointer to a snd_pcm_channel_info_t structure as an argument. You must set the channel member of this structure to the desired direction (SND_PCM_CHANNEL_CAPTURE or SND_PCM_CHANNEL_PLAYBACK) before calling the functions. The functions fill in the other members of the structure.

When you configure the channel, a subchannel is allocated to your app. Stated another way, hundreds of apps can open a handle to a PCM device with only one subchannel, but only one can configure it. After a client app allocates a subchannel, it isn't returned to the free pool until the handle is closed. One result of this mechanism is that, from moment to moment, the capabilities of a PCM device change as other applications allocate and free subchannels. Additionally, configuring and allocating a subchannel changes its state from SND_PCM_STATUS_NOTREADY to SND_PCM_STATUS_READY.

If the function succeeds, all specified parameters are accepted and are guaranteed to be in effect, except for the frag_size parameter, which is only a suggestion to the hardware. The hardware may adjust the fragment size, based on hardware requirements. For example, if the hardware can't deal with fragments crossing 64-kilobyte boundaries, and the suggested frag_size is 60 KB, the driver adjusts it to 64 KB.

Another aspect of configuration is determining how big to make the hardware buffer. This size determines how much latency the app has when sending data to the driver or reading data from it. The hardware buffer size is determined by multiplying the frag_size by the max_frags parameter, so for the app to know the buffer size, it must determine the actual frag_size that the driver is using.

You can determine the actual frag_size that the driver is using by calling snd_pcm_channel_setup() or snd_pcm_plugin_setup(), depending on whether or not your application is using the plugin converters. Both of these functions take a pointer to a snd_pcm_channel_setup_t structure that they fill with information about how the channel is configured, including the true frag_size.
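The buffer arithmetic described above can be sketched as follows. The function names are hypothetical, and the frag_size that you pass in is assumed to be the actual value reported by the setup call, not the value that you suggested.

```c
#include <stdint.h>

/* Hardware buffer size in bytes: the actual frag_size multiplied
 * by max_frags. */
uint32_t hw_buffer_bytes(uint32_t frag_size, uint32_t max_frags)
{
    return frag_size * max_frags;
}

/* Length of audio, in milliseconds, that a buffer of the given size
 * holds for the given format. Returns 0 if the format is invalid. */
uint32_t hw_buffer_ms(uint32_t buffer_bytes, uint32_t rate,
                      uint32_t voices, uint32_t bytes_per_sample)
{
    uint64_t bytes_per_second = (uint64_t)rate * voices * bytes_per_sample;

    if (bytes_per_second == 0)
        return 0;
    return (uint32_t)(((uint64_t)buffer_bytes * 1000) / bytes_per_second);
}
```

For example, four fragments of 4096 bytes give a 16384-byte buffer, which holds 1024 ms of 16-bit mono audio at 8000 Hz.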

Control voice conversion

The libasound library supports devices with up to eight voices. Configuration of the libasound library is based on the maximum number of voices that are supported in hardware. If the numbers of source and destination voices are different, then snd_pcm_plugin_params() creates an instance of a voice converter.

The default voice conversion behavior is as follows:

From      To          Conversion
Mono      Stereo      Replicate channel 1 (left) to channel 2 (right)
Stereo    Mono        Remove channel 2 (right)
Mono      4-channel   Replicate channel 1 to all other channels
Stereo    4-channel   Replicate channel 1 (front left) to channel 3 (rear left), and channel 2 (front right) to channel 4 (rear right)

Previous versions of libasound converted stereo to mono by averaging the left and right channels to generate the mono stream. Now, by default, the right channel is dropped.

You can use the voice conversion API to configure the conversion behavior and place any source channel in any destination channel slot:

snd_pcm_plugin_get_voice_conversion()
Get the current voice conversion structure for a channel
snd_pcm_plugin_set_voice_conversion()
Set the current voice conversion structure for a channel

The snd_pcm_voice_conversion_t structure, which is defined below, controls the actual conversion:

typedef struct snd_pcm_voice_conversion
{       
   uint32_t     app_voices;
   uint32_t     hw_voices;
   uint32_t     matrix[32];
} snd_pcm_voice_conversion_t;

The matrix member forms a 32-by-32-bit array that specifies how to convert the voices. The array is ranked with rows representing app voices, with voice 0 first. The columns represent hardware voices, with the low voice being Least Significant Bit (LSB) aligned and increasing right to left.

For example, consider a mono app stream sent to a 4-voice hardware device. A bit array of:

matrix[0] = 0x1;  //  00000001

causes the sound to be output on only the first hardware channel. A bit array of:

matrix[0] = 0x9;   // 00001001

causes the sound to be output on the first and last hardware channel.

Another example is a stereo app stream to a 6-channel (5.1) output device. A bit array of:

matrix[0] = 0x1;  //  00000001
matrix[1] = 0x2;  //  00000010

causes the sound to be output on only the front two channels, while:

matrix[0] = 0x5;  //  00000101
matrix[1] = 0xA;  //  00001010

causes the stream signal to be output on the first four channels (likely the front and rear pairs, but not on the center or Low-Frequency Effects (LFE) channels). The bitmap that's used to describe the hardware (the columns) depends on the hardware. Further, the actual hardware that it's running on determines how the channels are mapped. For example:

  • When the hardware arranges the channels such that the center channel is the third channel, then bit 2 represents the center.
  • When the hardware arranges the channels such that the rear left is the third channel, then bit 2 represents the rear left.

When the number of source voices matches the number of destination voices, the converter isn't invoked and the channels cannot be rerouted. When you're playing a stereo file on stereo hardware, you can't use the voice matrix to swap the channels, because the voice converter isn't used in this case.

When you call snd_pcm_plugin_get_voice_conversion() or snd_pcm_plugin_set_voice_conversion() before the voice conversion plugin has been instantiated, the functions fail and return -ENOENT.
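To make the bit layout of the matrix member concrete, here's a minimal sketch that sets and tests routing bits. It uses a local stand-in type with the same layout as the snd_pcm_voice_conversion_t structure shown above so that it compiles without the asound headers. The type and function names are hypothetical. In a real app, you would fill in a snd_pcm_voice_conversion_t structure and pass it to snd_pcm_plugin_set_voice_conversion().

```c
#include <stdint.h>
#include <string.h>

/* Local stand-in that mirrors the layout of snd_pcm_voice_conversion_t. */
typedef struct {
    uint32_t app_voices;
    uint32_t hw_voices;
    uint32_t matrix[32];
} voice_conversion_sketch_t;

/* Clears every routing bit and records the voice counts. */
void voice_matrix_init(voice_conversion_sketch_t *vc,
                       uint32_t app_voices, uint32_t hw_voices)
{
    memset(vc, 0, sizeof *vc);
    vc->app_voices = app_voices;
    vc->hw_voices = hw_voices;
}

/* Routes app voice `app` (the row) to hardware voice `hw` (the bit
 * position, with bit 0 as the lowest hardware voice). */
void voice_matrix_route(voice_conversion_sketch_t *vc,
                        unsigned app, unsigned hw)
{
    vc->matrix[app] |= (uint32_t)1 << hw;
}

/* Returns 1 if app voice `app` is routed to hardware voice `hw`. */
int voice_matrix_is_routed(const voice_conversion_sketch_t *vc,
                           unsigned app, unsigned hw)
{
    return (int)((vc->matrix[app] >> hw) & 1u);
}
```

Routing app voice 0 of a mono stream to hardware voices 0 and 3 sets matrix[0] to 0x9, which matches the earlier example of output on the first and last channels of a 4-voice device.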

Prepare the PCM subchannel

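Before you can begin playing or recording a sound stream, you must prepare the allocated PCM subchannel to run. Call one of the following functions to prepare the allocated PCM subchannel:

  • snd_pcm_capture_prepare()
  • snd_pcm_playback_prepare()
  • snd_pcm_channel_prepare()
  • snd_pcm_plugin_prepare()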

The snd_pcm_channel_prepare() function calls snd_pcm_capture_prepare() or snd_pcm_playback_prepare(), depending on the channel direction that you specify.

This step and the SND_PCM_STATUS_PREPARED state are required to correctly handle buffer underrun conditions when playing a sound stream and buffer overrun conditions when capturing a sound stream. For more information, see PCM subchannel stops during playback or PCM subchannel stops during recording.

Close the PCM subchannel

When you've finished playing or recording audio, you must close the subchannel by calling snd_pcm_close(). The call to snd_pcm_close() releases the subchannel and closes the handle.

Select what to record

Most sound cards allow only one analog signal to be connected to the analog-to-digital converter (ADC). This means that in order to record audio, an input source must be selected.

Some sound cards allow multiple signals to be connected to the ADC. In this case, you must make sure that the signal that you want to record is connected to the ADC. Use the snd_mixer_group_write() function to control the mixer. Using this function allows your app to set up the correct signal. For more information, see Working with audio mixers.

Recording states

Let's consider the state transitions for PCM devices during record. The state diagram for a PCM device during recording is shown below.

State diagram showing state transitions for PCM devices during recording.

The transition between SND_PCM_STATUS_* states can occur as the result of having made a function call, or can occur because of conditions that exist in the hardware:

From                                   To          Cause
NOTREADY                               READY       Calling snd_pcm_channel_params() or snd_pcm_plugin_params()
READY                                  PREPARED    Calling snd_pcm_capture_prepare(), snd_pcm_channel_prepare(), or snd_pcm_plugin_prepare()
PREPARED                               RUNNING     Calling snd_pcm_read() or snd_pcm_plugin_read(), calling select() against the recording file descriptors, or calling snd_pcm_channel_go() or snd_pcm_capture_go()
RUNNING                                PAUSED      Calling snd_pcm_capture_pause() or snd_pcm_channel_pause()
PAUSED                                 RUNNING     Calling snd_pcm_capture_resume() or snd_pcm_channel_resume()
RUNNING                                OVERRUN     The hardware buffer became full during recording; snd_pcm_read() and snd_pcm_plugin_read() fail
RUNNING                                UNSECURE    The app marked the stream as protected, the hardware level supports a secure transport (such as HDCP for HDMI), and authentication was lost
RUNNING                                CHANGED     The stream has changed
PAUSED                                 CHANGED     The stream has changed or an event has occurred
PREPARED                               CHANGED     The stream has changed or an event has occurred
RUNNING                                ERROR       A hardware error has occurred
OVERRUN, UNSECURE, CHANGED, or ERROR   PREPARED    Calling snd_pcm_capture_prepare(), snd_pcm_channel_prepare(), or snd_pcm_plugin_prepare()
RUNNING                                PREEMPTED   Audio is blocked because a new libasound session has begun playback, and the audio driver has determined that the new session has a higher priority

For more information, see the Audio Library.
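If your app tracks the recording state itself, for example to enable and disable UI controls, the transitions listed for recording can be encoded as a small lookup function. The following is a sketch only. The enum values are hypothetical stand-ins for the SND_PCM_STATUS_* constants, which the asound headers define and which may have different numeric values.

```c
/* Hypothetical stand-ins for the SND_PCM_STATUS_* recording states. */
typedef enum {
    REC_NOTREADY, REC_READY, REC_PREPARED, REC_RUNNING, REC_PAUSED,
    REC_OVERRUN, REC_UNSECURE, REC_CHANGED, REC_ERROR, REC_PREEMPTED
} rec_state_t;

/* Returns 1 if a transition from `from` to `to` is one of the
 * recording transitions listed in this section, or 0 otherwise. */
int rec_transition_listed(rec_state_t from, rec_state_t to)
{
    switch (from) {
    case REC_NOTREADY:
        return to == REC_READY;
    case REC_READY:
        return to == REC_PREPARED;
    case REC_PREPARED:
        return to == REC_RUNNING || to == REC_CHANGED;
    case REC_RUNNING:
        return to == REC_PAUSED || to == REC_OVERRUN ||
               to == REC_UNSECURE || to == REC_CHANGED ||
               to == REC_ERROR || to == REC_PREEMPTED;
    case REC_PAUSED:
        return to == REC_RUNNING || to == REC_CHANGED;
    case REC_OVERRUN:
    case REC_UNSECURE:
    case REC_CHANGED:
    case REC_ERROR:
        /* These states are left only by preparing the channel again. */
        return to == REC_PREPARED;
    default:
        /* No outgoing transition is listed for this state. */
        return 0;
    }
}
```

The function reports, for example, that an overrun subchannel can move to the prepared state but can't move directly back to the running state.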

Receive data from the PCM subchannel

Which function you use to receive data from the subchannel depends on whether or not you're using plug-in converters.

snd_pcm_read()
The number of bytes read must be a multiple of the fragment size, or the read does not succeed.
snd_pcm_plugin_read()
The plug-in reads an entire fragment from the driver and fulfills requests for partial reads from that buffer until another full fragment has to be read.
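Because snd_pcm_read() needs a byte count that is a multiple of the fragment size, an app that isn't using the plug-in converters might round its request down before reading. The following sketch shows that rounding. The whole_fragment_bytes() function is a hypothetical helper, and frag_size is assumed to be the actual fragment size that the driver is using.

```c
#include <stddef.h>

/* Hypothetical helper: returns the largest multiple of frag_size
 * that fits in `available` bytes, or 0 if frag_size is 0. */
size_t whole_fragment_bytes(size_t available, size_t frag_size)
{
    if (frag_size == 0)
        return 0;
    return (available / frag_size) * frag_size;
}
```

For example, with a 4096-byte fragment and a 10000-byte destination buffer, the app would request 8192 bytes.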

A full nonblocking read mode is supported when your app cannot afford to be blocked on the PCM subchannel. You can enable nonblocking mode when you open the handle or by using the snd_pcm_nonblock_mode() function.

Using this approach results in a polled operation mode, which is not recommended.

Another approach that your app can use to avoid blocking while reading is to use select() to wait until the PCM subchannel has more data. This technique allows the app to wait on user input while receiving the record data from the PCM subchannel.

To get the file descriptor to pass to select(), call snd_pcm_file_descriptor().

With this technique, select() returns when the number of bytes in the subchannel is equal to frag_size. If your app tries to read more data than this amount, it may block on the call.
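The select() technique can be sketched with a small wrapper that waits for a file descriptor to become readable, with a timeout. The wait_readable() function is a hypothetical helper that uses only standard POSIX calls. In a recording app, the descriptor is assumed to be the one returned by snd_pcm_file_descriptor(), and you could add other descriptors to the set to wait on user input at the same time.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Waits up to timeout_ms milliseconds for fd to become readable.
 * Returns 1 if fd is readable, 0 on timeout, or -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int rc;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

When the wrapper returns 1, the app can read one fragment without blocking. When it returns 0, the app can do other work and try again.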

PCM subchannel stops during recording

When recording audio, the PCM subchannel stops when the hardware buffer is full. This situation can happen if the app can't consume data at the same rate as the hardware is producing it. A real-world example of this case is when the app is preempted by a higher-priority process. If this preemption continues long enough, the data buffer may be filled before the app can consume any data.

When the data buffer is filled before the app can consume any data, the PCM subchannel changes its state to SND_PCM_STATUS_OVERRUN. In this state, the PCM subchannel doesn't provide any more data, calls to snd_pcm_read() and snd_pcm_plugin_read() fail, and the PCM subchannel doesn't restart recording.

The only way to move out of this state is to close the PCM subchannel or to prepare it again, as you did before you started recording. Preparing the PCM subchannel forces the app to recognize the overrun state and try to get out of it.

This approach is useful for apps that want to synchronize audio with something else. Consider the difficulties with synchronization if the PCM subchannel were to move back into the SND_PCM_STATUS_RUNNING state from an overrun state when space became available. The recorded sample would not be continuous as expected.

Stop recording

When your app wants to stop recording audio, it can stop reading data and let the PCM subchannel overrun. However, there's a better way to stop recording audio. You can call one of the flush functions (snd_pcm_capture_flush(), snd_pcm_channel_flush(), or snd_pcm_plugin_flush()) when you want to stop recording audio immediately and delete any unread data from the hardware buffer.

Synchronize with the PCM subchannel

The QSA provides some basic synchronization capabilities. An app can retrieve the current position of the recording hardware in the stream. The hardware driver is responsible for resolving this position.

You can use the snd_pcm_channel_status() or snd_pcm_plugin_status() function to work with the current position of the recording hardware.

Both functions fill in a snd_pcm_channel_status_t data structure. You can access the following members of this data structure:

scount
The hardware recording position, in bytes, relative to the start of the stream since you last prepared the channel. Preparing a channel resets this count.
count
The recording position, in bytes, in the hardware buffer.

The count member is not used with the mmap plug-in. To disable the mmap plug-in, call snd_pcm_plugin_set_disable().
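Because scount is a byte count, synchronizing with something else usually means converting it to a time. The following sketch shows one way to do that conversion. The recorded_ms() function is a hypothetical helper, and the format values are assumed to match the ones that you configured for the channel.

```c
#include <stdint.h>

/* Hypothetical helper: converts a byte position such as scount into
 * elapsed milliseconds for the given format. Returns 0 if the format
 * is invalid. */
uint64_t recorded_ms(uint64_t scount, uint32_t rate,
                     uint32_t voices, uint32_t bytes_per_sample)
{
    uint64_t bytes_per_second = (uint64_t)rate * voices * bytes_per_sample;

    if (bytes_per_second == 0)
        return 0;
    return (scount * 1000) / bytes_per_second;
}
```

For example, an scount of 480000 bytes for 16-bit mono audio at 8000 Hz corresponds to 30000 ms of recorded audio.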

Working with audio mixers

Audio mixers allow you to control many of the audio properties of the files that you record or play. Audio mixers can be created from a small number of components. Each of these components performs a specific mixing function. A summary of these components is shown below:

Input
A connection point where an external analog signal is brought into the mixer
Output
A connection point where an analog signal is taken from the mixer
ADC
An element that converts analog signals to digital samples
DAC
An element that converts digital samples to analog signals
Switch
An element that can connect two or more points together. A switch may be used as a mute control. More complicated switches can mute the channels of a stream individually, or can even form crossbar matrices where n input signals can be connected to n output signals.
Volume
An element that adjusts the amplitude level of a signal by applying attenuation or gain
Accumulator
An element that adds all signals input to it and produces an output signal
Multiplexer
An element that selects the signal from one of its inputs and forwards it to a single output line

Building a sound card mixer

Using the elements shown above, you can build a sound card mixer. The following diagram shows a simplified representation of the Audio Codec '97 mixer:

Diagram showing a simplified representation of the Audio Codec '97 mixer.

In the diagram, the mute figures are switches, and the MIC and CD are input elements.

It's possible to control these mixer elements directly using the snd_mixer_element_read() and snd_mixer_element_write() functions, but this approach isn't recommended because:

  • The arguments to these functions are dependent on the element type.
  • Controlling many elements to change mixer functionality is difficult with this approach.
  • There's a better way to do it.

The element interface is the lowest level of control for a mixer and is complicated to control. One solution to this complexity is to arrange elements that are associated with a function into a mixer group. To further refine this idea, groups are classified as either playback groups or capture (record) groups. To simplify creating and managing groups, a hard set of rules was developed for how groups are built from elements:

  • A playback group contains at most one volume element and one switch element (as a mute).
  • A capture group contains at most one each of a volume element, switch element (as a mute), and record selection element. The record selection element may be a multiplexer or a switch.

If you apply these rules to the mixer in the above diagram, you get the following:

Playback Group PCM
Elements B (volume) and C (switch)
Playback Group MIC
Elements E (volume) and F (switch)
Playback Group CD
Elements L (volume) and M (switch)
Playback Group MASTER
Elements H (volume) and I (switch)
Capture Group MIC
Element N (multiplexer); there's no volume or switch
Capture Group CD
Element N (multiplexer); there's no volume or switch
Capture Group INPUT
Elements O (volume) and P (switch)

By separating the elements into groups, you reduce the complexity of control (there are 7 groups instead of 17 elements), and each group maps well to what apps typically want to control.
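The grouping above can be written down as a small table. The following is a minimal sketch in plain C. The group_model_t type, the ac97_groups table, and the helper functions are hypothetical names invented for illustration and aren't part of the QSA API. The element letters are the ones listed for the diagram.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model types for illustration only; these are not QSA types. */
typedef enum { GROUP_PLAYBACK, GROUP_CAPTURE } group_kind_t;

typedef struct {
    const char  *name;
    group_kind_t kind;
    char         volume;    /* element letter, or 0 if the group has none */
    char         mute;      /* switch element used as a mute, or 0 */
    char         selector;  /* record selection element, or 0 */
} group_model_t;

/* The seven groups derived from the simplified diagram. */
const group_model_t ac97_groups[] = {
    { "PCM",    GROUP_PLAYBACK, 'B', 'C', 0   },
    { "MIC",    GROUP_PLAYBACK, 'E', 'F', 0   },
    { "CD",     GROUP_PLAYBACK, 'L', 'M', 0   },
    { "MASTER", GROUP_PLAYBACK, 'H', 'I', 0   },
    { "MIC",    GROUP_CAPTURE,  0,   0,   'N' },
    { "CD",     GROUP_CAPTURE,  0,   0,   'N' },
    { "INPUT",  GROUP_CAPTURE,  'O', 'P', 0   },
};

#define AC97_GROUP_COUNT (sizeof ac97_groups / sizeof ac97_groups[0])

/* Only capture groups may carry a record selection element. */
int group_follows_rules(const group_model_t *g)
{
    assert(g != NULL);
    return g->kind == GROUP_CAPTURE || g->selector == 0;
}

/* Count the groups of one kind in the table. */
size_t count_groups(group_kind_t kind)
{
    size_t i, n = 0;

    for (i = 0; i < AC97_GROUP_COUNT; i++)
        if (ac97_groups[i].kind == kind)
            n++;
    return n;
}
```

Because each group has only one slot per element type, the "at most one" part of the rules is enforced by the structure itself. The only check left for group_follows_rules() is that playback groups have no record selection element.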

Open the mixer device

To open a connection to the mixer device, call snd_mixer_open(). You can select the card and mixer device number to open by passing them as parameters to the function. Most sound cards have one mixer, but there may be additional mixers in special cases.

The snd_mixer_open() function returns a mixer handle that you can use as an argument for additional function calls that are applied to this device. The mixer handle is a pointer to a snd_mixer_t structure, which is an opaque data type.

Control a mixer group

The best way to control a mixer group is to use the read-modify-write technique. Using this technique, you can examine the group capabilities and ranges before adjusting the group.

To read the properties and settings of a mixer group, your app must identify the group. Every mixer group has a name, but because two groups may have the same name, a name alone isn't enough to identify a specific mixer group. In order to make groups unique, mixer groups are identified by the combination of name and index. The index is an integer that represents the instance number of the name. In most cases, the index is 0. In the case of two mixer groups with the same name, the first has an index of 0, and the second has an index of 1.
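As a rough illustration of identifying a group by name and index, the sketch below searches a table of group identifiers for an exact match on both fields. It's plain C that runs without the sound library. The gid_model_t type is a hypothetical stand-in for the name and index fields of snd_mixer_gid_t (the 32-byte name size is an assumption), and gid_equal() and find_gid() are illustrative helpers, not QSA functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the name and index fields of snd_mixer_gid_t.
   The 32-byte name size is an assumption, not taken from this document. */
typedef struct {
    char name[32];
    int  index;
} gid_model_t;

/* Two IDs refer to the same group only if both name and index match. */
int gid_equal(const gid_model_t *a, const gid_model_t *b)
{
    assert(a != NULL && b != NULL);
    return a->index == b->index &&
           strncmp(a->name, b->name, sizeof a->name) == 0;
}

/* Return the position of the matching ID in the table, or -1 if absent. */
int find_gid(const gid_model_t *table, size_t count,
             const char *name, int index)
{
    gid_model_t wanted;
    size_t i;

    assert(table != NULL && name != NULL);

    /* Zero the key first so that the name is always NUL-terminated. */
    memset(&wanted, 0, sizeof wanted);
    strncpy(wanted.name, name, sizeof wanted.name - 1);
    wanted.index = index;

    for (i = 0; i < count; i++)
        if (gid_equal(&table[i], &wanted))
            return (int) i;
    return -1;
}
```

With a table that holds two groups named "Mic", find_gid(table, count, "Mic", 1) selects the second instance, which a lookup by name alone couldn't distinguish from the first.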

To read a mixer group, call the snd_mixer_group_read() function. The arguments to this function are the mixer handle and the group control structure. The group control structure is of type snd_mixer_group_t.

To read a particular group, you must set its name and index in the gid substructure (see snd_mixer_gid_t) before making the call. If the call to snd_mixer_group_read() succeeds, the function fills in the structure with the group's capabilities and current settings.

Now that you have the group capabilities and current settings, you can change them before you write them back to the mixer group.

To write the changes to the mixer group, call snd_mixer_group_write(), passing as arguments the mixer handle and the group control structure.
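The "modify" step of read-modify-write usually means computing a new value inside the range that the read reported. The sketch below shows only that arithmetic, in plain C that runs without the sound library. The group_range_model_t type is a hypothetical stand-in for the range and volume information in a group control structure, and the min, max, and volume member names are assumptions for illustration. In a real app, the read and write steps are the snd_mixer_group_read() and snd_mixer_group_write() calls described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the range and volume information that a
   group read fills in; the real group control structure has more members. */
typedef struct {
    int min;      /* lowest volume value the group accepts */
    int max;      /* highest volume value the group accepts */
    int volume;   /* current setting, as returned by the read */
} group_range_model_t;

/* Modify step: map a 0-100 percentage onto [min, max].
   Out-of-range percentages are clamped rather than rejected. */
int percent_to_volume(const group_range_model_t *g, int percent)
{
    assert(g != NULL && g->min <= g->max);

    if (percent < 0)
        percent = 0;
    if (percent > 100)
        percent = 100;
    return g->min + ((g->max - g->min) * percent) / 100;
}

/* Apply the new value to the local copy of the settings, which would
   then be written back. Returns the previous volume. */
int set_volume_percent(group_range_model_t *g, int percent)
{
    int previous;

    assert(g != NULL);
    previous = g->volume;
    g->volume = percent_to_volume(g, percent);
    return previous;
}
```

Working from the range that was read, instead of a hard-coded scale, keeps the value valid on hardware with a different number of volume steps.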

The best mixer group for your PCM subchannel

In a typical mixer, there are many playback mixer group controls. Several of these controls manage the volume and mute functionality of the stream that your app is playing.

Consider what happens when you increase the volume of your .wav file by using the Master group: other streams, such as one that's playing an .mp3 file, are affected as well. The best group to use is the PCM subchannel group because it affects only your stream. However, on some cards, a subchannel group may not exist, so you need a reliable way to find the best group.

The most reliable way to find the best group for a PCM subchannel is to let the driver (in effect, the driver author) do it for you. You can obtain the identity of the best mixer group for a PCM subchannel by calling snd_pcm_channel_setup() or snd_pcm_plugin_setup(), as shown:

memset (&setup, 0, sizeof (setup));
memset (&group, 0, sizeof (group));
setup.channel = SND_PCM_CHANNEL_PLAYBACK;
setup.mixer_gid = &group.gid;
if ((rtn = snd_pcm_plugin_setup (pcm_handle, &setup)) < 0)
{
    return -1;
}

Your app must initialize the setup structure to zero and then set the mixer_gid pointer to a storage location for the group identifier.

The best group may change depending on the state of the PCM subchannel. The PCM subchannels aren't allocated to an app until the parameters of the channel are established. Similarly, the subchannel mixer group isn't available until the subchannel is allocated. Using the Sound Blaster Live! example, the best mixer group before the subchannel is allocated is the PCM group and, after allocation, the PCM subchannel group.

Find all mixer groups

You can get a complete list of mixer groups by calling snd_mixer_groups(). You call snd_mixer_groups() twice: once to get the total number of mixer groups, then a second time to read their IDs.

The arguments to the call are the mixer handle and an snd_mixer_groups_t structure. The structure contains a pointer to where the groups' identifiers are stored, which is an array of snd_mixer_gid_t structures, and the size of that array. The call fills in the structure with the number of identifiers that were stored, and indicates how many couldn't be stored because they would exceed the storage size.

Here's an example where the snd_strerror() function prints error messages for the sound functions:

while (1)
{
    /* First call: no storage is supplied, so the call only reports
       how many group IDs exist (in groups_over). */
    memset (&groups, 0, sizeof (groups));
    if ((ret = snd_mixer_groups (mixer_handle, &groups)) < 0)
    {
        fprintf (stderr, "snd_mixer_groups function - %s\n",
                 snd_strerror (ret));
        break;
    }

    mixer_n_groups = groups.groups_over;
    if (mixer_n_groups <= 0)
        break;                  /* There are no groups to read. */

    groups.groups_size = mixer_n_groups;
    groups.pgroups = (snd_mixer_gid_t *) malloc (
        sizeof (snd_mixer_gid_t) * mixer_n_groups);

    if (groups.pgroups == NULL)
    {
        fprintf (stderr, "Unable to malloc group array - %s\n",
                 strerror (errno));
        break;
    }

    groups.groups_over = 0;
    groups.groups = 0;

    /* Second call: read the group IDs into the array. */
    if ((ret = snd_mixer_groups (mixer_handle, &groups)) < 0)
    {
        fprintf (stderr, "No Mixer Groups - %s\n", snd_strerror (ret));
        free (groups.pgroups);
        break;
    }

    if (groups.groups_over > 0)
    {
        /* More groups appeared between the two calls; start over. */
        free (groups.pgroups);
        continue;
    }

    printf ("sorting GID table \n");
    snd_mixer_sort_gid_table (groups.pgroups, mixer_n_groups,
        snd_mixer_default_weights);
    break;
}

Mixer event notification

By default, all mixer apps are required to keep current with all mixer changes. To make this possible, the driver queues a mixer-change event for every app that has an open mixer handle, other than the app that made the change. An app can use the snd_mixer_set_filter() function to mask out events that it's not interested in.

Apps use the snd_mixer_read() function to read the queued mixer events. The arguments to this function are the mixer handle and a structure of callback functions to call based on the event type.

You can use the select() function to determine when to call snd_mixer_read(). To get the file descriptor to pass to select(), call snd_mixer_file_descriptor().

static void mixer_callback_group (void *private_data,
                                  int cmd,
                                  snd_mixer_gid_t * gid)
{
    switch (cmd)
    {
    case SND_MIXER_READ_GROUP_VALUE:
        printf ("Mixer group %s %d changed value \n",
                gid->name, gid->index);
        break;

    case SND_MIXER_READ_GROUP_ADD:
        break;

    case SND_MIXER_READ_GROUP_REMOVE:
        break;
    }
}

/* Declared at file scope because mixer_update() and main() share it. */
static snd_mixer_t *mixer_handle;

int mixer_update (int fd, void *data, unsigned mode)
{
    snd_mixer_callbacks_t callbacks = { 0, 0, 0, 0 };

    /* Dispatch the queued events to the group callback. */
    callbacks.group = mixer_callback_group;
    snd_mixer_read (mixer_handle, &callbacks);
    return (Pt_CONTINUE);
}

int main (void)
{
    int ret;

    if ((ret = snd_mixer_open (&mixer_handle, 0, 0)) < 0)
    {
        printf ("Unable to open/read mixer - %s\n",
                snd_strerror (ret));
        return -1;
    }

    PtAppAddFd (NULL,
                snd_mixer_file_descriptor (mixer_handle),
                Pt_FD_READ, mixer_update, NULL);
    ...
}
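The example above hands the file descriptor to an event loop. If you wait with select() directly, the waiting step might look like the following minimal sketch. The wait_readable() helper is a hypothetical name, not part of the QSA API, and it works on any POSIX file descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait until fd is readable or timeout_ms elapses.
   Returns 1 if fd is readable, 0 on timeout, and -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int rc;

    /* FD_SET() is only defined for descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE && timeout_ms >= 0);

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec  = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

In a mixer app, fd would be the value that snd_mixer_file_descriptor() returns, and a result of 1 would be the cue to call snd_mixer_read() with your callback structure.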

Close the mixer device

Closing the mixer device frees all of the resources that are associated with the mixer handle and shuts down the connection to the sound mixer interface. To close the mixer handle, call the snd_mixer_close() function.

Last modified: 2015-07-24


