Pathname management

As other resource managers adopt their respective domains of authority, procnto becomes responsible for maintaining a pathname tree to track the processes that own portions of the pathname space. An adopted pathname is sometimes referred to as a prefix because it prefixes any pathnames that lie beneath it; prefixes can be arranged in a hierarchy called a prefix tree. The adopted pathname is also called a mountpoint, because that's where a server mounts into the pathname.

This approach to pathname space management is what allows BlackBerry 10 OS to preserve the POSIX semantics for device and file access, while making the presence of those services optional for small embedded systems.

At startup, procnto populates the pathname space with the following pathname prefixes:

Prefix Description
/ Root of the file system.
/proc/boot/ Some of the files from the boot image presented as a flat file system.
/proc/pid The running processes, each represented by its process ID (PID). For more information, see Controlling processes using the /proc file system.
/dev/zero A device that always returns zero. Used for allocating zero-filled pages using the mmap() function.
/dev/mem A device that represents all physical memory.
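As a minimal sketch of the /dev/zero use mentioned in the table, the following C fragment maps zero-filled pages with mmap(). The helper name alloc_zero_pages() is hypothetical, and the choice of a private, read/write mapping is an assumption made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map `len` bytes of zero-filled, private memory
 * backed by /dev/zero. Returns NULL on failure. */
unsigned char *alloc_zero_pages(size_t len)
{
    assert(len > 0);                 /* mmap() rejects zero-length mappings */

    int fd = open("/dev/zero", O_RDWR);
    if (fd == -1)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);                       /* the mapping stays valid after close() */

    return (p == MAP_FAILED) ? NULL : (unsigned char *)p;
}
```

A caller would release the pages with munmap() using the same length.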

Resolving pathnames

When a process opens a file, the POSIX-compliant open() library routine first sends the pathname to procnto, where the pathname is compared against the prefix tree to determine which resource managers should be sent the open() message. The prefix tree may contain identical or partially overlapping regions of authority—multiple servers can register the same prefix. If the regions are identical, the order of resolution can be specified (see Ordering mountpoints). If the regions are overlapping, the responses from the path manager are ordered with the longest prefixes first; for prefixes of equal length, the same specified order of resolution applies as for identical regions.

For example, suppose we have these prefixes registered:

Prefix Description
/ QNX 4 disk-based file system
/dev/ser1 Serial device manager (devc-ser*)
/dev/ser2 Serial device manager (devc-ser*)
/dev/hd0 Raw disk volume

The file system manager has registered a prefix for a mounted QNX 4 file system (that is, /). The block device driver has registered a prefix for a block special file that represents an entire physical hard drive (that is, /dev/hd0). The serial device manager has registered two prefixes for the two PC serial ports.

The following table illustrates the longest-match rule for pathname resolution:

This pathname: matches: and resolves to:
/dev/ser1 /dev/ser1 devc-ser*
/dev/ser2 /dev/ser2 devc-ser*
/dev/ser / fs-qnx4.so
/dev/hd0 /dev/hd0 devb-eide.so
/usr/jhsmith/test / fs-qnx4.so
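The longest-match rule in the table can be sketched in C as follows. This is only an illustration, not procnto's actual implementation: the prefix table, the function names, and the rule that a prefix must end on a whole path component are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical prefix table mirroring the example above. */
static const char *const prefixes[] = {
    "/", "/dev/ser1", "/dev/ser2", "/dev/hd0"
};

/* Return nonzero if `prefix` covers `path` on a whole-component boundary. */
int prefix_covers(const char *prefix, const char *path)
{
    size_t n = strlen(prefix);

    if (strncmp(prefix, path, n) != 0)
        return 0;
    if (n > 0 && prefix[n - 1] == '/')   /* "/" covers every absolute path */
        return 1;
    return path[n] == '\0' || path[n] == '/';
}

/* Return the longest registered prefix that covers `path`, or NULL. */
const char *longest_prefix(const char *path)
{
    const char *best = NULL;

    assert(path != NULL);
    for (size_t i = 0; i < sizeof prefixes / sizeof prefixes[0]; i++) {
        if (prefix_covers(prefixes[i], path) &&
            (best == NULL || strlen(prefixes[i]) > strlen(best)))
            best = prefixes[i];
    }
    return best;
}
```

With this table, longest_prefix("/dev/ser") falls back to "/", because neither serial prefix covers it.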

Ordering mountpoints

Generally the order of resolving a filename is the order in which you mounted the file systems at the same mountpoint (that is, new mounts go on top of or in front of any existing ones). You can specify the order of resolution when you mount the file system. For example, you can use:

  • the before and after keywords for block I/O (devb-*) drivers, in the blk options

You can also use the -o option to mount with these keywords:

before
Mount the file system so that it's resolved before any other file systems mounted at the same pathname (in other words, it's placed in front of any existing mount). When you access a file, the system looks on this file system first.
after
Mount the file system so that it's resolved after any other file systems mounted at the same pathname (in other words, it's placed behind any existing mounts). When you access a file, the system looks on this file system last, and only if the file wasn't found on any other file systems.

If you specify the appropriate before option, the file system floats in front of any other file systems mounted at the same mountpoint, except those that you later mount with before. If you specify after, the file system goes behind any other file systems mounted at the same mountpoint, except those that are already mounted with after. So, the search order for these file systems is:

  1. Those mounted with before
  2. Those mounted with no flags
  3. Those mounted with after

with each list searched in order of mount requests. The first server to claim the name gets it. You would typically use after to have a file system wait at the back and pick up things that no one else is handling, and before to make sure a file system looks first at filenames.

Single-device mountpoints

Consider an example with three servers:

Server A
A QNX 4 file system. Its mountpoint is /. It contains the files bin/true and bin/false.
Server B
A flash file system. Its mountpoint is /bin. It contains the files ls and echo.
Server C
A single device that generates numbers. Its mountpoint is /dev/random.

At this point, the process manager's internal mount table would look like this:

Mountpoint Server
/ Server A (QNX 4 file system)
/bin Server B (flash file system)
/dev/random Server C (device)

Of course, each server name is actually an abbreviation for the nd,pid,chid (node descriptor, process ID, and channel ID) of that particular server channel.

Now suppose a client wants to send a message to Server C. The client's code might look like this:

int fd;
fd = open("/dev/random", ...);
read(fd, ...);
close(fd);

In this case, the C library asks the process manager for the servers that could potentially handle the path /dev/random. The process manager would return a list of servers:

  • Server C (most likely; longest path match)
  • Server A (least likely; shortest path match)

From this information, the library then contacts each server in turn and sends it an open message, including the component of the path that the server should validate:

  1. Server C receives a null path, since the request came in on the same path as the mountpoint.
  2. Server A receives the path dev/random, since its mountpoint was /.

As soon as one server positively acknowledges the request, the library won't contact the remaining servers. This means Server A is contacted only if Server C denies the request.
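The client-side sequence just described can be modeled in a few lines of C. This is a simplified sketch, not the C library's real code: struct candidate, path_remainder(), first_accepting(), and the two stand-in servers are hypothetical names, and a function pointer takes the place of the open message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one candidate returned by the process manager. */
struct candidate {
    const char *mountpoint;
    int (*accepts)(const char *remainder);   /* nonzero if the server accepts */
};

/* Return the part of `path` below `mountpoint`, that is, the component the
 * server is asked to validate. Assumes `mountpoint` is a prefix of `path`. */
const char *path_remainder(const char *mountpoint, const char *path)
{
    size_t n = strlen(mountpoint);

    assert(strncmp(mountpoint, path, n) == 0);
    const char *rest = path + n;
    while (*rest == '/')
        rest++;                               /* skip the separating slash */
    return rest;
}

/* Contact candidates in order (longest match first); return the index of
 * the first one that accepts, or -1 if every server denies the request. */
int first_accepting(const struct candidate *list, size_t count,
                    const char *path)
{
    for (size_t i = 0; i < count; i++) {
        if (list[i].accepts(path_remainder(list[i].mountpoint, path)))
            return (int)i;
    }
    return -1;
}

/* Stand-in servers, for illustration only. */
int deny_all(const char *remainder)   { (void)remainder; return 0; }
int accept_all(const char *remainder) { (void)remainder; return 1; }
```

For the path /dev/random, path_remainder() yields an empty string for the /dev/random mountpoint and dev/random for the / mountpoint, matching the two steps listed above.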

This process is fairly straightforward with single device entries, where the first server is generally the server that handles the request. Where it becomes interesting is in the case of unioned file system mountpoints.

Unioned file system mountpoints

Let's assume we have two servers set up as before:

Server A
A QNX 4 file system. Its mountpoint is /. It contains the files bin/true and bin/false.
Server B
A flash file system. Its mountpoint is /bin. It contains the files ls and echo.

Note that each server has a /bin directory, but with different contents.

When both servers are mounted, you would see the following due to the unioning of the mountpoints:

/
Server A
/bin
Servers A and B
/bin/echo
Server B
/bin/false
Server A
/bin/ls
Server B
/bin/true
Server A

What's happening here is that the resolution for the path /bin takes place as before, but rather than limit the return to just one connection ID, all the servers are contacted and asked about their handling for the path:

DIR *dirp;
dirp = opendir("/bin");
closedir(dirp);

which results in:

  1. Server B receives a null path, since the request came in on the same path as the mountpoint.
  2. Server A receives the path bin, since its mountpoint was /.

The result now is that we have a collection of file descriptors to the servers that handle the path /bin (in this case two servers); the actual directory name entries are read in turn when readdir() is called. If any of the names in the directory are accessed with a regular open, then the normal resolution procedure takes place and only one server is accessed.
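From the client's point of view, reading a unioned directory uses only the standard POSIX directory calls. The following sketch lists every entry of a directory; the function name list_directory() is hypothetical, and the code itself does nothing specific to unioned mountpoints, since any merging happens behind opendir() and readdir().

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>

/* Print every name in `path` and return the number of entries read,
 * or -1 if the directory can't be opened. */
int list_directory(const char *path)
{
    assert(path != NULL);

    DIR *dirp = opendir(path);
    if (dirp == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dirp)) != NULL) {
        printf("%s\n", entry->d_name);
        count++;
    }
    closedir(dirp);
    return count;
}
```

In the example above, calling list_directory("/bin") would be expected to print names supplied by both Server A and Server B.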

Why overlay mountpoints?

This overlaying of mountpoints is a very handy feature when doing field updates, servicing, and so on. It also makes for a more unified system: pathnames result in connections to servers regardless of what services those servers provide, so clients see one consistent API.

Symbolic prefixes

We've discussed prefixes that map to a resource manager. A second form of prefix, known as a symbolic prefix, is a simple string substitution for a matched prefix. You create symbolic prefixes using the POSIX ln (link) command. This command is typically used to create hard links or, with the -s option, symbolic links on a file system. If you also specify the -P option, then a symbolic link is created in the in-memory prefix space of procnto.

Command Description
ln -s existing_file symbolic_link Create a file system symbolic link.
ln -Ps existing_file symbolic_link Create a prefix tree symbolic link.

Note that a prefix tree symbolic link always takes precedence over a file system symbolic link.

Creating special device names

You can also use symbolic prefixes to create special device names. For example, if your modem was on /dev/ser1, you could create a symbolic prefix of /dev/modem as follows:

ln -Ps /dev/ser1 /dev/modem

Any request to open /dev/modem is replaced with /dev/ser1. This mapping would allow the modem to be changed to a different serial port simply by changing the symbolic prefix and without affecting any applications.

Relative pathnames

Pathnames need not start with a slash. In such cases, the path is considered relative to the current working directory. The OS maintains the current working directory as a character string. Relative pathnames are always converted to full network pathnames by prepending the current working directory string to the relative pathname.
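The prepending step can be illustrated with a small C helper. This is a sketch under simplifying assumptions: make_absolute() is a hypothetical name, the current working directory is passed in as a string, and the default network root is not modeled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build an absolute pathname in `out` by prepending `cwd` to `path`.
 * Paths that already start with a slash are copied unchanged.
 * Returns 0 on success, or -1 if `out` is too small. */
int make_absolute(const char *cwd, const char *path, char *out, size_t size)
{
    int n;

    assert(cwd != NULL && path != NULL && out != NULL);

    if (path[0] == '/')
        n = snprintf(out, size, "%s", path);
    else if (cwd[0] != '\0' && cwd[strlen(cwd) - 1] == '/')
        n = snprintf(out, size, "%s%s", cwd, path);   /* cwd already ends in / */
    else
        n = snprintf(out, size, "%s/%s", cwd, path);

    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

For instance, with a working directory of /usr/home/dan, the relative pathname doc becomes /usr/home/dan/doc.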

Note that different behaviors result when your current working directory starts with a slash versus starting with a network root.

Network root

If the current working directory begins with a network root in the form /net/node_name, it's said to be specific and locked to the pathname space of the specified node. If you don't specify a network root, the default one is prepended. For example, this command:

cd /net/percy

is an example of the first (specific) form, and would lock future relative pathname evaluation to be on node percy, no matter what your default network root happens to be. Subsequently entering cd dev would put you in /net/percy/dev.

On the other hand, this command:

cd /

would be of the second form, where the default network root would affect the relative pathname resolution. For example, if your default network root were /net/florence, then entering cd dev would put you in /net/florence/dev. Since the current working directory doesn't start with a node override, the default network root is prepended to create a fully specified network pathname.

To run a command with a specific network root, use the on command, specifying the -f option:

on -f /net/percy command

This runs the given command with /net/percy as the network root; that is, it searches for the command—and any files with relative paths specified as arguments—on /net/percy and runs the command on /net/percy. In contrast, this:

on -n /net/percy command

searches for the given command—and any files with relative paths—on your local node and runs the command on /net/percy.

In a program, you can specify a network root when you call chroot().

This really isn't as complicated as it may seem. Most of the time, you don't specify a network root, and everything you do works within your namespace (defined by your default network root). Most users log in, accept the normal default network root (that is, the namespace of their own node), and work within that environment.

A note about cd

In some traditional UNIX systems, the cd (change directory) command modifies the pathname given to it if that pathname contains symbolic links. As a result, the pathname of the new current working directory may differ from the one given to cd. In BlackBerry 10 OS, however, cd doesn't modify the pathname—aside from collapsing .. references. For example:

cd /usr/home/dan/test/../doc

would result in a current working directory (which you can display with pwd) of /usr/home/dan/doc, even if some of the elements in the pathname were symbolic links.
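A purely textual collapse of .. references, with no symbolic links consulted, might look like the following C sketch. The function name collapse_dotdot() is hypothetical, and dropping . components and repeated slashes is an extra assumption beyond what the text describes.

```c
#include <assert.h>
#include <string.h>

/* Collapse ".." (and ".") components of an absolute pathname in place,
 * working on the string only; the file system is never consulted. */
void collapse_dotdot(char *path)
{
    assert(path != NULL && path[0] == '/');

    char *out = path;            /* end of the collapsed path built so far */
    const char *in = path + 1;   /* next unread character */

    while (*in != '\0') {
        size_t len = strcspn(in, "/");   /* length of the next component */
        const char *next = in + len;
        while (*next == '/')
            next++;                      /* skip separators */

        if (len == 2 && in[0] == '.' && in[1] == '.') {
            if (out > path) {
                do {
                    out--;               /* drop the last kept component */
                } while (*out != '/');
            }
        } else if (len > 0 && !(len == 1 && in[0] == '.')) {
            *out++ = '/';
            memmove(out, in, len);       /* source and target may overlap */
            out += len;
        }
        in = next;
    }
    if (out == path)
        out++;                           /* everything collapsed: keep "/" */
    *out = '\0';
}
```

Applied to a buffer holding /usr/home/dan/test/../doc, the function leaves /usr/home/dan/doc in the buffer.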

File descriptor namespace

When an I/O resource has been opened, a different namespace comes into play. The open() returns an integer referred to as a file descriptor (FD), which is used to direct all further I/O requests to that resource manager. Unlike the pathname space, the file descriptor namespace is completely local to each process. The resource manager uses the combination of a SCOID (server connection ID) and FD (file descriptor/connection ID) to identify the control structure associated with the previous open() call. This structure is referred to as an open control block (OCB) and is contained within the resource manager.

The following diagram shows an I/O manager taking some SCOID, FD pairs and mapping them to OCBs.

Figure showing open control blocks.

Open control blocks

The open control block (OCB) contains active information about the open resource. For example, the file system keeps the current seek point within the file here. Each open() creates a new OCB. Therefore, if a process opens the same file twice, any calls to lseek() using one FD do not affect the seek point of the other FD. The same is true for different processes opening the same file.
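The independence of seek points across separate opens can be checked with a short C experiment. This is a sketch: the function name is hypothetical, and it assumes a writable /tmp directory for its scratch file.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Open one scratch file twice, move the first descriptor's seek point to
 * `where`, and return the second descriptor's seek point (-1 on error). */
off_t second_open_offset_after_seek(off_t where)
{
    assert(where >= 0);

    char name[] = "/tmp/ocb_open_XXXXXX";   /* assumes /tmp is writable */
    int fd1 = mkstemp(name);                /* first open: first OCB */
    if (fd1 == -1)
        return -1;

    int fd2 = open(name, O_RDONLY);         /* second open: second OCB */
    off_t result = -1;
    if (fd2 != -1 && lseek(fd1, where, SEEK_SET) == where)
        result = lseek(fd2, 0, SEEK_CUR);   /* query fd2 without moving it */

    if (fd2 != -1)
        close(fd2);
    close(fd1);
    unlink(name);
    return result;
}
```

Because each open() has its own OCB, the second descriptor is expected to report offset 0 no matter where the first one was moved.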

The following diagram shows two processes, in which one opens the same file twice, and the other opens it once. There are no shared FDs.

Figure showing two processes opening the same file

FDs are a process resource, not a thread resource.

Several file descriptors in one or more processes can refer to the same OCB. This is accomplished by two means:

  • A process may use the dup(), dup2(), or fcntl() functions to create a duplicate file descriptor that refers to the same OCB.
  • When a new process is created via vfork(), fork(), posix_spawn(), or spawn(), all open file descriptors are by default inherited by the new process; these inherited descriptors refer to the same OCBs as the corresponding file descriptors in the parent process.

When several FDs refer to the same OCB, then any change in the state of the OCB is immediately seen by all processes that have file descriptors linked to the same OCB.

For example, if one process uses the lseek() function to change the position of the seek point, then reading or writing takes place from the new position no matter which linked file descriptor is used.
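The shared seek point can be observed with dup() in a few lines of C. As before, this is only a sketch: the function name is hypothetical and a writable /tmp directory is assumed for the scratch file.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Duplicate a descriptor, move the original's seek point to `where`, and
 * return the seek point seen through the duplicate (-1 on error). */
off_t dup_offset_after_seek(off_t where)
{
    assert(where >= 0);

    char name[] = "/tmp/ocb_dup_XXXXXX";    /* assumes /tmp is writable */
    int fd1 = mkstemp(name);
    if (fd1 == -1)
        return -1;

    int fd2 = dup(fd1);                     /* same OCB as fd1 */
    off_t result = -1;
    if (fd2 != -1 && lseek(fd1, where, SEEK_SET) == where)
        result = lseek(fd2, 0, SEEK_CUR);   /* query fd2 without moving it */

    if (fd2 != -1)
        close(fd2);
    close(fd1);
    unlink(name);
    return result;
}
```

Here the duplicate is expected to report the same offset that was set through the original descriptor.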

The following diagram shows two processes in which one opens a file twice, then does a dup() to get a third FD. The process then creates a child that inherits all open files.

Figure showing a process using the dup() function to open a file twice

You can prevent a file descriptor from being inherited when you posix_spawn(), spawn(), or exec*() by calling the fcntl() function and setting the FD_CLOEXEC flag.
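A minimal sketch of setting the flag with fcntl() follows. The helper name set_cloexec() is hypothetical; reading the current descriptor flags first, so that any other flags are preserved, is a common idiom rather than something the text above requires.

```c
#include <assert.h>
#include <fcntl.h>

/* Mark `fd` close-on-exec while preserving its other descriptor flags.
 * Returns 0 on success, -1 on failure. */
int set_cloexec(int fd)
{
    assert(fd >= 0);

    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1) ? -1 : 0;
}
```

A process would typically call this right after open() on any descriptor that a spawned or exec'ed program shouldn't receive.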

Last modified: 2015-05-07


