Bound multiprocessing (BMP)

Bound multiprocessing provides the scheduling control of an asymmetric multiprocessing model, while preserving the hardware abstraction and management of symmetric multiprocessing. BMP is similar to SMP, but you can specify which processors a thread can run on. You can use both SMP and BMP on the same system, allowing some threads to migrate from one processor to another, while other threads are restricted to one or more processors.

As with SMP, a single copy of the OS maintains an overall view of all system resources, allowing them to be dynamically allocated and shared among applications. However, during application initialization, a setting determined by the system designer forces all of an application's threads to execute only on a specified CPU.

Compared to full, floating SMP operation, this approach offers several advantages:

  • It can eliminate the cache thrashing that reduces performance in an SMP system, because applications that share the same data set can run exclusively on the same CPU.
  • It offers simpler application debugging than SMP since all execution threads within an application run on a single CPU.
  • It helps legacy applications that use poor techniques for synchronizing shared data to run correctly, again by letting them run on a single CPU.

With BMP, an application locked to one CPU can't use other CPUs, even if they're idle. However, the BlackBerry 10 OS lets you dynamically change the designated CPU without having to checkpoint, stop, and restart the application.

BlackBerry 10 OS supports the concept of hard processor affinity through a runmask. Each bit that's set in the runmask represents a processor that a thread can run on. By default, a thread's runmask is set to all ones, allowing it to run on any processor. A value of 0x01 would allow a thread to execute only on the first processor.
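The bit layout described above can be sketched in a few lines of C. This is only an illustration of the mask arithmetic: runmask_for_cpu() and runmask_allows() are hypothetical helpers, not OS functions, and the sketch assumes a 32-bit mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers (not part of any OS API) that model a runmask as a
 * 32-bit value: bit n set means the thread may run on processor n. */

/* Return a mask that allows only the given processor (0-based index). */
uint32_t runmask_for_cpu(unsigned cpu)
{
    assert(cpu < 32);               /* this sketch models at most 32 CPUs */
    return (uint32_t)1 << cpu;
}

/* Return nonzero if the mask allows the given processor. */
int runmask_allows(uint32_t mask, unsigned cpu)
{
    return cpu < 32 && ((mask >> cpu) & 1u) != 0;
}
```

ORing the masks for individual processors gives a mask that allows any of them; for example, combining the masks for the first and third processors yields 0x05.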

By default, a process's or thread's children don't inherit the runmask; there's a separate inherit mask.

By careful use of these masks, a systems designer can further optimize the runtime performance of a system (for example, by relegating nonrealtime processes to a specific processor). In general, however, this shouldn't be necessary, because our realtime scheduler always preempts a lower-priority thread immediately when a higher-priority thread becomes ready. Processor locking likely affects only the efficiency of the cache, since threads can be prevented from migrating.

You can specify the runmask for a new thread or process by:

  • Setting the runmask member of the inheritance structure and specifying the SPAWN_EXPLICIT_CPU flag when you call spawn().
  • Using the -C or -R option to the on utility when you launch a program. This also sets the process's inherit mask to the same value.
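As a rough sketch of the spawn() approach, the following helper starts a program with its runmask set explicitly. The name spawn_bound() is hypothetical, and the sketch assumes the Neutrino spawn() prototype and struct inheritance from <spawn.h>; on other platforms it compiles but fails with ENOSYS. Check the spawn() entry in the C Library Reference before relying on the details.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

#ifdef __QNXNTO__
#include <spawn.h>
extern char **environ;
#endif

/* Hypothetical helper: start `path` with its threads restricted to the
 * processors whose bits are set in `runmask`.
 * Returns the child's process ID, or -1 with errno set. */
pid_t spawn_bound(const char *path, char *const argv[], unsigned long runmask)
{
#ifdef __QNXNTO__
    struct inheritance inherit = { 0 };

    inherit.flags = SPAWN_EXPLICIT_CPU;  /* ask spawn() to apply runmask */
    inherit.runmask = runmask;           /* e.g. 0x01: first processor only */

    /* fd_count of 0: no explicit file-descriptor map is supplied.
     * The parent's environment is passed through unchanged. */
    return spawn(path, 0, NULL, &inherit, argv, environ);
#else
    /* Assumption: outside Neutrino there is no equivalent call here. */
    (void)path;
    (void)argv;
    (void)runmask;
    errno = ENOSYS;
    return -1;
#endif
}
```

A caller would pass a mask such as 0x01 to bind the new process to the first processor.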

You can change the runmask for an existing thread or process by:

  • Using the _NTO_TCTL_RUNMASK or _NTO_TCTL_RUNMASK_GET_AND_SET_INHERIT command to the ThreadCtl() kernel call.
  • Using the -C or -R option to the slay utility. If you also use the -i option, slay sets the inherit mask to the same value.
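For the ThreadCtl() route, a minimal sketch might look like the following. The helper bind_self_to_cpu() is hypothetical; it assumes that the _NTO_TCTL_RUNMASK command takes the mask by value in the data argument and applies to the calling thread. On platforms other than Neutrino the sketch compiles but fails with ENOSYS.

```c
#include <errno.h>

#ifdef __QNXNTO__
#include <stdint.h>
#include <sys/neutrino.h>
#endif

/* Hypothetical helper: restrict the calling thread to a single processor
 * (0-based index). Returns 0 on success, or -1 with errno set. */
int bind_self_to_cpu(unsigned cpu)
{
    if (cpu >= 32) {            /* this sketch handles a 32-bit runmask only */
        errno = EINVAL;
        return -1;
    }
#ifdef __QNXNTO__
    unsigned long runmask = 1UL << cpu;

    /* The mask itself, not a pointer to it, is passed as the data argument. */
    if (ThreadCtl(_NTO_TCTL_RUNMASK, (void *)(uintptr_t)runmask) == -1) {
        return -1;              /* errno was set by ThreadCtl() */
    }
    return 0;
#else
    /* Assumption: outside Neutrino there is no equivalent call here. */
    errno = ENOSYS;
    return -1;
#endif
}
```

This sketch changes only the runmask. To set the inherit mask as well, the _NTO_TCTL_RUNMASK_GET_AND_SET_INHERIT command is the one to look at; it isn't shown here.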

For more information, see Multicore processing.

A viable migration strategy

As a midway point between AMP and SMP, BMP offers a viable migration strategy if you want to move towards full SMP, but you're concerned that your existing code may operate incorrectly in a truly concurrent execution model. You can port legacy code to a multicore system and initially bind it to a single CPU to ensure correct operation. By judiciously binding applications (and possibly single threads) to specific CPUs, you can isolate potential concurrency issues down to the application and thread level. Resolving these issues allows the application to run fully concurrently, thereby maximizing the performance gains provided by the multiple processors.

Last modified: 2015-03-31
