Writing shell scripts

Shell scripting, at its most basic, means taking a series of commands that you might type at a command line and putting them into a file, so that you can run them again later or repeat them without retyping them. You can use scripts to automate repeated tasks, and to handle complex tasks that might be difficult to get right without several attempts, some rework, or both.

Available shells

For scripting under BlackBerry 10 OS you can use the ksh shell, a public-domain implementation of the Korn shell. The sh command is usually a symbolic link to ksh. For more information about this shell, see:

  • The Using the command line chapter
  • The entry for ksh in Utilities
  • Rosenblatt, Bill, and Arnold Robbins. 2002. Learning the Korn Shell, 2nd Edition. Sebastopol, CA: O'Reilly & Associates. ISBN 0-596-00195-9

BlackBerry 10 OS also supplies or uses some other scripting environments:

  • An OS buildfile has a script file section tagged by +script. The mkifs utility parses this script, but procnto executes it at boot time. This provides a very simple scripting environment that can run a series of commands with a small amount of synchronization.
  • sed is a stream editor, which makes it most useful for performing repeated changes to a file, or set of files. It's often used for scripts, or as a utility within other scripts.
  • gawk (GNU awk) is a programming language for pattern matching and working with the contents of files. You can also use it for scripting or call it from within scripts.

In general, a shell script is most useful and powerful when working with the execution of programs or modifying files in the context of the file system, whereas sed, gawk, and perl are primarily for working with the contents of files. For more information, see:

  • the entries for gawk and sed in the Utilities Reference
  • Robbins, Arnold, and Dale Dougherty. 1997. sed & awk, 2nd Edition. Sebastopol, CA: O'Reilly & Associates. ISBN 1-56592-225-5
  • Schwartz, Randal L., and Tom Phoenix. 2001. Learning Perl. Sebastopol, CA: O'Reilly & Associates. ISBN 0-59600-132-0

Running a shell script

You can execute a shell script in these ways:

  • Invoke another shell with the name of your shell script as an argument:
    sh myscript
  • Load your script as a dot file into the current shell:
    . myscript
  • Use chmod to make the shell script executable, and then invoke it, like this:
    chmod 744 myscript
    ./myscript

    In this instance, your shell automatically invokes a new shell to execute the shell script.
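The difference between running a script in a new shell and loading it as a dot file can be seen in a short sketch. The file name myscript and the variable GREETING are hypothetical, and the sketch assumes that the mktemp utility is available:

```shell
#!/bin/sh
# Sketch: a child shell versus a dot file. GREETING and the temporary
# copy of myscript are hypothetical names used only for illustration.

demo_dir=$(mktemp -d) || exit 1
cat > "$demo_dir/myscript" <<'EOF'
GREETING=hello
EOF

# Run in a child shell: the assignment is lost when the child exits.
unset GREETING
sh "$demo_dir/myscript"
child_result=${GREETING:-unset}

# Load as a dot file: the assignment happens in the current shell.
. "$demo_dir/myscript"
dot_result=${GREETING:-unset}

echo "after sh myscript: $child_result"
echo "after . myscript:  $dot_result"
rm -rf "$demo_dir"
```

Only the dot-file form leaves GREETING set in the invoking shell, because no new shell is started.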

The first line

The first line of a script can identify the interpreter to use. The first line of many—if not most—shell scripts is in this form:

#! interpreter [arg]

For example, a Korn shell script likely starts with:

#! /bin/sh

The line starts with a #, which indicates a comment, so the line is ignored by the shell processing this script. The initial two characters, #!, aren't important to the shell, but the loader code in procnto recognizes them as an instruction to load the specified interpreter and pass it:

  1. The path to the interpreter
  2. The optional argument specified on the first line of the script
  3. The path to the script
  4. Any arguments you pass to the script

For example, if your script is called my_script, and you invoke it as:

./my_script my_arg1 my_arg2 ...

then procnto loads:

interpreter [arg] ./my_script my_arg1 my_arg2 ...

  • The interpreter can't be another #! script.
  • The kernel ignores any setuid and setgid permissions on the script; the child still has the same user and group IDs as its parent. (For more information, see Setuid and setgid.)
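One way to see the argument list described above is to name a program that simply prints its arguments as the interpreter. The following is a minimal sketch, assuming that /bin/echo exists and that the temporary directory permits execution; show_args and extra_arg are hypothetical names:

```shell
#!/bin/sh
# Sketch: use /bin/echo as the "interpreter" so that it prints the
# arguments it receives. Assumes /bin/echo exists and that files in the
# temporary directory may be executed; show_args is a hypothetical name.

show_loader_args() {
    dir=$(mktemp -d) || return 1
    printf '#! /bin/echo extra_arg\n' > "$dir/show_args"
    chmod 744 "$dir/show_args"
    # Invoke the script by a relative path, with two arguments.
    (cd "$dir" && ./show_args my_arg1 my_arg2)
    rc=$?
    rm -rf "$dir"
    return $rc
}

show_loader_args
```

On a system that follows the rules above, this should print extra_arg ./show_args my_arg1 my_arg2: the optional argument, the path to the script, and then the arguments passed to the script.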

Some interpreters adjust the list of arguments:

  • ksh removes itself from the arguments
  • gawk changes its own path to be simply gawk
  • perl removes itself and the name of the script from the arguments, and puts the name of the script into the $0 variable

For example, let's look at some simple scripts that echo their own arguments.

Arguments to a ksh script

Suppose we have a script called ksh_script that looks like this:

#! /bin/sh
# Print the name the script was invoked as, then one argument per line.
echo "$0"
for arg in "$@" ; do
  echo "$arg"
done

If you invoke it as ./ksh_script one two three, the loader invokes it as /bin/sh ./ksh_script one two three, and then ksh removes itself from the argument list. The output looks like this:

./ksh_script
one
two
three
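The loop in this script iterates over "$@", in double quotes, so that each argument stays intact even if it contains spaces. As a small sketch of the difference (print_args and print_words are hypothetical helper functions):

```shell
#!/bin/sh
# Sketch: quoted "$@" versus unquoted $*.
# print_args and print_words are hypothetical helpers.

print_args() {
    # "$@" expands to one word per original argument.
    for arg in "$@" ; do
        echo "[$arg]"
    done
}

print_words() {
    # Unquoted $* is split again on whitespace.
    for arg in $* ; do
        echo "[$arg]"
    done
}

print_args one "two three"
print_words one "two three"
```

The first call prints two lines, [one] and [two three]; the second prints three, because the shell splits two three into separate words.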

Arguments to a gawk script

Next, let's consider the gawk version, gawk_script, which looks like this:

#!/usr/bin/gawk -f
BEGIN {
        for (i = 0; i < ARGC; i++)
                print ARGV[i]
}

The -f argument is important; it tells gawk to read its script from the given file. Without -f, this script doesn't work as expected.

If you run this script as ./gawk_script one two three, the loader invokes it as /usr/bin/gawk -f ./gawk_script one two three, and then gawk changes its full path to gawk. The output looks like this:

gawk
one
two
three

Arguments to a perl script

The perl version of the script, perl_script, looks like this:

#! /usr/bin/perl
for ($i = 0; $i <= $#ARGV; $i++) {
    print "$ARGV[$i]\n";
}

If you invoke it as ./perl_script one two three, the loader invokes it as /usr/bin/perl ./perl_script one two three, and then perl removes itself and the name of the script from the argument list. The output looks like this:

one
two
three

Example of a Korn shell script

Let's look at a script that searches C source and header files in the current directory tree for a string passed on the command line:

#!/bin/sh
#
# tfind:
# script to look for strings in various files and dump to less

case $# in
1)
    find . -name '*.[ch]' | xargs grep $1 | less
    exit 0   # good status
esac

echo "Use tfind stuff_to_find                               "
echo "      where : stuff_to_find = search string           "
echo "                                                      "
echo "for example, tfind console_state looks through all files in  "    
echo "     the current directory and below and displays all "
echo "     instances of console_state."
exit 1    # bad status

As described above, the first line identifies the program, /bin/sh, to run to interpret the script. The next few lines are comments that describe what the script does. Then we see:

case $# in
1)
  ...
esac

The case ... in construct is one of the branching structures provided by the Korn shell, and is similar to the C switch statement.

The $# is a shell variable. When you refer to a variable in a shell, put a $ before its name to tell the shell that it's a variable rather than a literal string. The shell variable, $#, is a special variable that represents the number of command-line arguments to the script.

The 1) is a pattern for one branch of the case, the equivalent of a case label in C. This code checks whether you've passed exactly one argument to the script.

The esac line completes and ends the case statement. Both the if and case commands use the command's name reversed to represent the end of the branching structure.
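A case statement can have several branches, each ended by ;;, and a final * pattern that matches anything else. Here's a minimal sketch; describe_count is a hypothetical function, and inside a function $# counts the arguments passed to that function:

```shell
#!/bin/sh
# Sketch: a case statement with several branches.
# describe_count is a hypothetical function name.

describe_count() {
    case $# in
    0)
        echo "no arguments"
        ;;
    1)
        echo "one argument: $1"
        ;;
    *)
        # The * pattern matches anything else, like default: in C.
        echo "$# arguments"
        ;;
    esac
}

describe_count
describe_count console_state
describe_count a b c
```

The three calls print no arguments, one argument: console_state, and 3 arguments.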

Inside the case we find:

find . -name '*.[ch]' | xargs grep $1 | less

This line does the bulk of the work, and breaks down into these pieces:

  • find . -name '*.[ch]'
  • xargs grep $1
  • less

which are joined by the | or pipe character. A pipe is one of the most powerful things in the shell; it takes the output of the program on the left, and makes it the input of the program to its right. The pipe lets you build complex operations from simpler building blocks. For more information, see Redirecting input and output.
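As a small sketch of building an operation from simpler pieces, the following pipeline sorts some words and removes duplicates. The function name list_unique and the sample words are made up for illustration:

```shell
#!/bin/sh
# Sketch: each stage of a pipeline reads the previous stage's output.
# list_unique is a hypothetical helper.

list_unique() {
    # sort groups identical lines together; uniq drops the repeats.
    sort | uniq
}

# printf writes one word per line into the pipeline.
printf '%s\n' pear apple pear fig apple | list_unique

# Adding another stage keeps only the first line of the result.
printf '%s\n' pear apple pear fig apple | list_unique | head -n 1
```

The first pipeline prints apple, fig, and pear on separate lines; the second prints only apple.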

The first piece, find . -name '*.[ch]', uses another powerful and commonly used command. Most file systems are organized as a hierarchy of directories, and find is a utility that descends through that hierarchy recursively. In this case, it searches for files whose names end in either .c or .h (that is, C source or header files) and prints out their names.

The filename wildcards are wrapped in single quotes (') because they're special characters to the shell. Without the quotes, the shell expands the wildcards in the current directory, but we want find to evaluate them, so we prevent the shell from evaluating them by quoting them. For more information, see Quoting special characters in Using the command line.
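To see what a command actually receives with and without the quotes, here's a sketch that prints its arguments in a scratch directory containing two files. The names show_args, quoting_demo, a.c, and b.h are hypothetical, and mktemp is assumed to be available:

```shell
#!/bin/sh
# Sketch: quoted versus unquoted wildcards.
# show_args and quoting_demo are hypothetical helpers.

show_args() {
    for arg in "$@" ; do
        echo "[$arg]"
    done
}

# The function body runs in a subshell, so the cd doesn't affect the caller.
quoting_demo() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    touch a.c b.h
    echo "quoted:"
    show_args '*.[ch]'    # the pattern reaches the command intact
    echo "unquoted:"
    show_args *.[ch]      # the shell expands the pattern first
    cd / && rm -rf "$dir"
)

quoting_demo
```

With the quotes, the command sees the single argument *.[ch]; without them, it sees a.c and b.h, which isn't what find's -name option expects.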

The next piece, xargs grep $1, does a couple of things:

  • grep is a file-contents search utility. It searches the files given on its command line for the first argument. The $1 is another special variable in the shell that represents the first argument we passed to the shell script (that is, the string we're looking for).
  • xargs is a utility that takes its input and turns it into command-line parameters for some other command that you give it. Here, it takes the list of files from find and makes them command-line arguments to grep. In this case, we're using xargs primarily for efficiency; we could do something similar with just find:
    find . -name '*.[ch]' -exec grep $1 {} \; | less

    which loads and runs the grep program for every file found. The command that we actually used:

    find . -name '*.[ch]' | xargs grep $1 | less

    runs grep only when xargs has accumulated enough files to fill a command line, generally resulting in far fewer invocations of grep and a more efficient script.
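The batching that xargs does can be sketched by substituting echo for grep, so that each line of output shows one command line that would have been run. The file names and the word pattern are placeholders:

```shell
#!/bin/sh
# Sketch: how xargs groups its input into command lines. echo stands in
# for grep, so each output line represents one invocation.

batched() {
    # All of the names fit on one command line: a single invocation.
    printf '%s\n' a.c b.c c.h | xargs echo grep pattern
}

one_at_a_time() {
    # -n 1 limits each invocation to one name, much like running the
    # command once for every file found.
    printf '%s\n' a.c b.c c.h | xargs -n 1 echo grep pattern
}

batched
one_at_a_time
```

The first function prints a single line, grep pattern a.c b.c c.h, while the second prints three lines, one per file.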

The final piece, less, is an output pager. The entire command may generate a lot of output that might scroll off the terminal, so less presents this to you a page at a time, with the ability to move backwards and forwards through the data.

The case statement also includes the following after the find command:

exit 0   # good status

This returns a value of 0 from this script. In shell programming, zero means true or success, and anything nonzero means false or failure. (This is the opposite of the meanings in the C language.)
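A caller can test this status directly with if, or read it from the special variable $?. Here's a minimal sketch; check_status is a hypothetical helper that runs whatever command it's given:

```shell
#!/bin/sh
# Sketch: acting on a command's exit status.
# check_status is a hypothetical helper.

check_status() {
    # if treats an exit status of zero as true.
    if "$@" ; then
        echo "success"
    else
        # In the else branch, $? still holds the command's status.
        echo "failure ($?)"
    fi
}

check_status true
check_status false
check_status sh -c 'exit 3'
```

These calls print success, failure (1), and failure (3).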

The final block:

echo "Use tfind stuff_to_find                               "
echo "      where : stuff_to_find = search string           "
echo "                                                      "
echo "for example, tfind console_state looks through all files in  "
echo "     the current directory and below and displays all "
echo "     instances of console_state."
exit 1    # bad status

is just a bit of help; if you pass incorrect arguments to the script, it prints a description of how to use it, and then returns a failure code.
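The same help text could also be written as a here-document and sent to standard error, so that it doesn't get mixed into any output that's being piped elsewhere. This is an alternative sketch rather than what tfind does; usage is a hypothetical function name:

```shell
#!/bin/sh
# Sketch: the help text as a here-document written to standard error.
# usage is a hypothetical function name.

usage() {
    cat >&2 <<EOF
Use tfind stuff_to_find
      where : stuff_to_find = search string

for example, tfind console_state looks through all files in
     the current directory and below and displays all
     instances of console_state.
EOF
}

# A script would typically call it like this when the arguments are wrong:
#   usage
#   exit 1    # bad status
```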

Efficiency

In general, a script isn't as efficient as a custom-written C or C++ program, because it:

  • Is interpreted, not compiled
  • Does most of its work by running other programs

However, developing a script can take less time than writing a program, especially if you use pipes and existing utilities as building blocks in your script.

Caveat scriptor

If you need to write shell scripts, there are a few things to bear in mind.

  • To run a script as if it were a utility, you must make it executable by using the chmod command. For example, if you want anyone to be able to run your script, type:
    chmod a+x script_name

    Your script doesn't have to be executable if you plan to invoke it by passing it as a shell argument:

      ksh script_name

    or if you use it as a dot file, like this:

      . script_name

  • Just as for any executable, if your script isn't in one of the directories in your PATH, you have to specify the path to the script to run it. For example:
    ~/bin/my_script
  • When you run a script, it inherits its environment from the parent process. If your script executes a command that might not be in the PATH, you should either specify the path to the command or add the path to the script's PATH variable.
  • A script can't change its parent shell's environment or current directory, unless you run it as a dot file.
  • A script won't run if it contains DOS end-of-line characters. If you edit a BlackBerry 10 OS script on a Windows machine, use the textto utility with the -l option to convert the file to the format used by the Power-Safe file system.
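If you aren't sure whether a script contains DOS end-of-line characters, you can check for carriage returns with standard utilities. In the sketch below, has_cr and strip_cr are hypothetical helpers, and tr is used only as a portable stand-in for the conversion that textto -l performs:

```shell
#!/bin/sh
# Sketch: detect and strip carriage returns. has_cr and strip_cr are
# hypothetical helpers; tr is a stand-in for textto -l.

has_cr() {
    # Succeeds if any line of the file contains a carriage return.
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

strip_cr() {
    # Writes the file's contents to standard output without carriage returns.
    tr -d '\r' < "$1"
}

demo_file=$(mktemp) || exit 1
printf 'echo hello\r\n' > "$demo_file"
has_cr "$demo_file" && echo "DOS line endings found"
strip_cr "$demo_file" > "$demo_file.unix"
has_cr "$demo_file.unix" || echo "converted copy is clean"
rm -f "$demo_file" "$demo_file.unix"
```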

Last modified: 2014-11-17


