Synchronous message passing

Synchronous messaging is the main form of IPC in the BlackBerry 10 OS. A thread that does a MsgSend() to another thread (which could be within another process) is blocked until the target thread does a MsgReceive(), processes the message, and executes a MsgReply(). If a thread executes a MsgReceive() without a previously sent message pending, it blocks until another thread executes a MsgSend().

In BlackBerry 10 OS, a server thread typically loops, waiting to receive a message from a client thread. As described earlier, a thread—whether a server or a client—is in the READY state if it can use the CPU. It might not actually be getting any CPU time because of its and other threads' priority and scheduling policy, but the thread isn't blocked.

Let's look first at the client thread:

Diagram showing the changes of state for a client thread in a send-receive-reply transaction.

  • If the client thread calls MsgSend(), and the server thread hasn't yet called MsgReceive(), then the client thread becomes SEND blocked. When the server thread calls MsgReceive(), the kernel changes the client thread's state to be REPLY blocked, which means that server thread has received the message and now must reply. When the server thread calls MsgReply(), the client thread becomes READY.
  • If the client thread calls MsgSend(), and the server thread is already blocked on the MsgReceive(), then the client thread immediately becomes REPLY blocked, skipping the SEND-blocked state completely.
  • If the server thread fails, exits, or disappears, the client thread becomes READY, with MsgSend() indicating an error.

Next, let's consider the server thread:

Diagram showing the changes of state for a server thread in a send-receive-reply transaction.

  • If the server thread calls MsgReceive(), and no other thread has sent to it, then the server thread becomes RECEIVE blocked. When another thread sends to it, the server thread becomes READY.
  • If the server thread calls MsgReceive(), and another thread has already sent to it, then MsgReceive() returns immediately with the message. In this case, the server thread doesn't block.
  • If the server thread calls MsgReply(), it doesn't become blocked.

This inherent blocking synchronizes the execution of the sending thread, since the act of requesting that the data be sent also causes the sending thread to be blocked and the receiving thread to be scheduled for execution. This happens without requiring explicit work by the kernel to determine which thread to run next (as would be the case with most other forms of IPC). Execution and data move directly from one context to another.

Data-queuing capabilities are omitted from these messaging primitives because queueing could be implemented when needed within the receiving thread. The sending thread is often prepared to wait for a response; queueing is unnecessary overhead and complexity (that is, it slows down the nonqueued case). As a result, the sending thread doesn't need to make a separate, explicit blocking call to wait for a response (as it would if some other IPC form had been used).

While the send and receive operations are blocking and synchronous, MsgReply() (or MsgError()) doesn't block. Since the client thread is already blocked waiting for the reply, no additional synchronization is required, so a blocking MsgReply() isn't needed. This allows a server to reply to a client and continue processing while the kernel and/or networking code asynchronously passes the reply data to the sending thread and marks it ready for execution. Since most servers tend to do some processing to prepare to receive the next request (at which point they block again), this works out well.

MsgReply() vs MsgError()

The MsgReply() function is used to return a status and zero or more bytes to the client. MsgError(), on the other hand, is used to return only a status to the client. Both functions unblock the client from its MsgSend().

Message copying

Since our messaging services copy a message directly from the address space of one thread to another without intermediate buffering, the message-delivery performance approaches the memory bandwidth of the underlying hardware. The kernel attaches no special meaning to the content of a message—the data in a message has meaning only as mutually defined by sender and receiver. However, well-defined message types are also provided so that user-written processes or threads can augment or substitute for system-supplied services.

The messaging primitives support multipart transfers, so that a message delivered from the address space of one thread to another needn't pre-exist in a single, contiguous buffer. Instead, both the sending and receiving threads can specify a vector table that indicates where the sending and receiving message fragments reside in memory. Note that the size of the various parts can be different for the sender and receiver.

Multipart transfers allow messages that have a header block separate from the data block to be sent without performance-consuming copying of the data to create a contiguous message. In addition, if the underlying data structure is a ring buffer, specifying a three-part message allows a header and two disjoint ranges within the ring buffer to be sent as a single atomic message. A hardware equivalent of this concept would be that of a scatter/gather DMA facility.

Diagram showing a BlackBerry 10 OS message-pass vector.

The multipart transfers are also used extensively by file systems. On a read, the data is copied directly from the file system cache into the application using a message with n parts for the data. Each part points into the cache and compensates for the fact that cache blocks aren't contiguous in memory with a read starting or ending within a block.

For example, with a cache block size of 512 bytes, a read of 1454 bytes can be satisfied with a five-part message:

Diagram showing a five-part IOV.

Since message data is explicitly copied between address spaces (rather than by doing page table manipulations), messages can be easily allocated on the stack instead of from a special pool of page-aligned memory for MMU page flipping. As a result, many of the library routines that implement the API between client and server processes can be trivially expressed, without elaborate IPC-specific memory allocation calls.

For example, the code used by a client thread to request that the file system manager execute lseek on its behalf is implemented as follows:

#include <unistd.h>
#include <errno.h>
#include <sys/iomsg.h>

off64_t lseek64(int fd, off64_t offset, int whence) {
    io_lseek_t                            msg;
    off64_t                               off;

    msg.i.type = _IO_LSEEK;
    msg.i.combine_len = sizeof msg.i;
    msg.i.offset = offset;
    msg.i.whence = whence;
    msg.i.zero = 0;
    if(MsgSend(fd, &msg.i, sizeof msg.i, &off, sizeof off) == -1) {
        return -1;
    }
    return off;
}

off64_t tell64(int fd) {
    return lseek64(fd, 0, SEEK_CUR);
}

off_t lseek(int fd, off_t offset, int whence) {
    return lseek64(fd, offset, whence);
}

off_t tell(int fd) {
    return lseek64(fd, 0, SEEK_CUR);
}

This code essentially builds a message structure on the stack, populates it with various constants and passed parameters from the calling thread, and sends it to the file system manager associated with fd. The reply indicates the success or failure of the operation.

This implementation doesn't prevent the kernel from detecting large message transfers and choosing to implement page flipping for those cases. Since most messages passed are quite tiny, copying messages is often faster than manipulating MMU page tables. For bulk data transfer, shared memory between processes (with message-passing or the other synchronization primitives for notification) is also a viable option.

Simple messages

For simple single-part messages, the OS provides functions that take a pointer directly to a buffer without the need for an IOV (input/output vector). In this case, the number of parts is replaced by the size of the message directly pointed to. In the case of the message send primitive—which takes a send and a reply buffer—this introduces four variations:

Function Send message Reply message
MsgSend() Simple Simple
MsgSendsv() Simple IOV
MsgSendvs() IOV Simple
MsgSendv() IOV IOV

The other messaging primitives that take a direct message simply drop the trailing v in their names:

Channels and connections

In the BlackBerry 10 OS, message passing is directed towards channels and connections, rather than targeted directly from thread to thread. A thread that wishes to receive messages first creates a channel; another thread that wishes to send a message to that thread must first make a connection by attaching to that channel. Channels are required by the message kernel calls and are used by servers to MsgReceive() messages on. Connections are created by client threads to connect to the channels made available by servers. When connections are established, clients can MsgSend() messages over them. If a number of threads in a process all attach to the same channel, then the connections all map to the same kernel object for efficiency. Channels and connections are named within a process by a small integer identifier. Client connections map directly into file descriptors.

Architecturally, this is a key point. By having client connections map directly into FDs, we have eliminated yet another layer of translation. We don't need to figure out where to send a message based on the file descriptor (for example, via a read(fd) call). Instead, we can send a message directly to the file descriptor (such as the connection ID).

Function Description
ChannelCreate() Create a channel to receive messages on.
ChannelDestroy() Destroy a channel.
ConnectAttach() Create a connection to send messages on.
ConnectDetach() Detach a connection.
Diagram showing BlackBerry 10 OS channels and connections.

A process acting as a server would implement an event loop to receive and process messages as follows:

chid = ChannelCreate(flags);
SETIOV(&iov, &msg, sizeof(msg));
for(;;) {
    rcv_id = MsgReceivev( chid, &iov, parts, &info );

    switch( msg.type ) {
        /* Perform message processing here */
    }

    MsgReplyv( rcv_id, &iov, rparts );
}

This loop allows the thread to receive messages from any thread that had a connection to the channel.

The server can also use name_attach() to create a channel and associate a name with it. The sender process can then use name_open() to locate that name and create a connection to it.

The channel has several lists of messages associated with it:

Receive
A LIFO queue of threads waiting for messages.
Send
A priority FIFO queue of threads that have sent messages that haven't yet been received.
Reply
An unordered list of threads that have sent messages that have been received, but not yet replied to.

While in any of these lists, the waiting thread is blocked (that is, RECEIVE-, SEND-, or REPLY-blocked). Multiple threads and multiple clients may wait on one channel.

Pulses

In addition to the synchronous Send/Receive/Reply services, the OS also supports fixed-size, nonblocking messages. These are referred to as pulses and carry a small payload (four bytes of data plus a single byte code).

Pulses pack a relatively small payload—eight bits of code and 32 bits of data. Pulses are often used as a notification mechanism within interrupt handlers. They also allow servers to signal clients without blocking on them.

Diagram showing a pulse's payload.

Priority inheritance and messages

A server process receives messages and pulses in priority order. As the threads within the server receive requests, they then inherit the priority (but not the scheduling policy) of the sending thread. As a result, the relative priorities of the threads requesting work of the server are preserved, and the server work is executed at the appropriate priority. This message-driven priority inheritance avoids priority-inversion problems. For example, suppose the system includes the following:

  • A server thread, at priority 22
  • A client thread, T1, at priority 13
  • A client thread, T2, at priority 10

Without priority inheritance, if T2 sends a message to the server, it's effectively getting work done for it at priority 22, so T2's priority has been inverted.

What actually happens is that when the server receives a message, its effective priority changes to that of the highest-priority sender. In this case, T2's priority is lower than the server's, so the change in the server's effective priority takes place when the server receives the message.

Next, suppose that T1 sends a message to the server while it's still at priority 10. Since T1's priority is higher than the server's current priority, the change in the server's priority happens when T1 sends the message.

The change happens before the server receives the message to avoid another case of priority inversion. If the server's priority remains unchanged at 10, and another thread, T3, starts to run at priority 11, the server has to wait until T3 lets it have some CPU time so that it can eventually receive T1's message. So, T1 would would be delayed by a lower-priority thread, T3.

You can turn off priority inheritance by specifying the _NTO_CHF_FIXED_PRIORITY flag when you call ChannelCreate().

Message-passing API

The message-passing API consists of the following functions:

Function Description
MsgSend() Send a message and block until reply.
MsgReceive() Wait for a message.
MsgReceivePulse() Wait for a tiny, nonblocking message (pulse).
MsgReply() Reply to a message.
MsgError() Reply only with an error status. No message bytes are transferred.
MsgRead() Read additional data from a received message.
MsgWrite() Write additional data to a reply message.
MsgInfo() Obtain info on a received message.
MsgSendPulse() Send a tiny, nonblocking message (pulse).
MsgDeliverEvent() Deliver an event to a client.
MsgKeyData() Key a message to allow security checks.

For information about messages from the programming point of view, see Message passing.

Robust implementations with send/receive/reply

Architecting a BlackBerry 10 OS application as a team of cooperating threads and processes via Send/Receive/Reply results in a system that uses synchronous notification. IPC thus occurs at specified transitions within the system, rather than asynchronously.A significant problem with asynchronous systems is that event notification requires signal handlers to be run. Asynchronous IPC can make it difficult to thoroughly test the operation of the system and make sure that no matter when the signal handler runs, that processing continues as intended. Applications often try to avoid this scenario by relying on a window explicitly opened and shut, during which signals are tolerated.

With a synchronous, nonqueued system architecture built around Send/Receive/Reply, robust application architectures can be very readily implemented and delivered.

Avoiding deadlock situations is another difficult problem when constructing applications from various combinations of queued IPC, shared memory, and miscellaneous synchronization primitives. For example, suppose thread A doesn't release mutex 1 until thread B releases mutex 2. Unfortunately, if thread B is in the state of not releasing mutex 2 until thread A releases mutex 1, a standoff results. Simulation tools are often invoked to ensure that deadlock won't occur as the system runs.

The Send/Receive/Reply IPC primitives allow the construction of deadlock-free systems with the observation of only these simple rules:

  1. Never have two threads send to each other.
  2. Always arrange your threads in a hierarchy, with sends going up the tree.

The first rule is an obvious avoidance of the standoff situation, but the second rule requires further explanation. The team of cooperating threads and processes is arranged as follows:

Diagram showing a thread tree.

Here the threads at any given level in the hierarchy never send to each other, but send only upwards instead.

One example of this might be a client application that sends to a database server process, which in turn sends to a file system process. Since the sending threads block and wait for the target thread to reply, and since the target thread isn't send-blocked on the sending thread, deadlock can't happen.

B ut how does a higher-level thread notify a lower-level thread that it has the results of a previously requested operation? (Assume the lower-level thread didn't want to wait for the replied results when it last sent.)

The BlackBerry 10 OS provides a very flexible architecture with the MsgDeliverEvent() kernel call to deliver nonblocking events. All of the common asynchronous services can be implemented with this. For example, the server-side of the select() call is an API that an application can use to allow a thread to wait for an I/O event to complete on a set of file descriptors. In addition to an asynchronous notification mechanism being needed as a "back channel" for notifications from higher-level threads to lower-level threads, we can also build a reliable notification system for timers, hardware interrupts, and other event sources around this.

Diagram showing a pulse event.

A related issue is the problem of how a higher-level thread can request work of a lower-level thread without sending to it, risking deadlock. The lower-level thread is present only to serve as a worker thread for the higher-level thread, doing work on request. The lower-level thread would send to report for work, but the higher-level thread wouldn't reply then. It would defer the reply until the higher-level thread had work to be done, and it would reply (which is a nonblocking operation) with the data describing the work. In effect, the reply is being used to initiate work, not the send, which neatly side-steps rule #1.

Last modified: 2015-05-07



Got questions about leaving a comment? Get answers from our Disqus FAQ.

comments powered by Disqus