Unix Shell Programming

Revised February 3, 2001.


What is a shell?

A shell is an operating system command interpreter. A Unix shell is also a programming language for manipulating files and programs. You can put shell commands in a file called a shell script and then run the script just like any other program.

Why write shell scripts?

Shell scripts are useful for automating repetitive operations, such as running a program on a large collection of input files. You can use shell scripts as "wrappers" to adapt and customize compute-intensive programs (in C, Lisp, ...) for the local file system. Shell scripts start up and configure Unix itself when you turn on the computer; other shell scripts configure your session when you log in.

Which shell?

There are several Unix shells. sh came first; csh and then tcsh were popular for a time; now bash is the default login shell in most Linux installations, including the ACC lab machines at Evergreen. I strongly recommend bash for shell programming as well. It is like sh but also includes many conveniences inspired by csh and tcsh. And if you use the same shell for programming and interactive use, you can try out each command in your program just by typing it at the command prompt.

The sh/bash command prompt is usually a dollar sign $ while the csh/tcsh prompt is usually a percent sign %.

The chapter on shell programming (lesson 14) in the Ten Minutes textbook inexplicably uses csh/tcsh not sh/bash. Ignore that chapter and use the Nutshell book instead (chapters 6 and 7 are on bash).

Ordinary commands and built-in commands

There are two kinds of shell commands, which are described in two different places.

Most shell commands, for example ls, cat, and more, are just programs in some system directory such as /bin or /usr/bin. These work the same way in any shell. There is an alphabetic listing in chapter 3 of the Nutshell book, and you can get on-line information using man, for example man ls.

Other shell commands, for example cd and pwd, are built into the shell program itself. Some of these built-ins only exist in some shells, or work differently in different shells. There is an alphabetic listing of the built-in commands for bash in chapter 7 in the Nutshell book; to get on-line information you must use man bash.

Evaluation and echo

The shell evaluates some command arguments before passing them to the command. For example, it expands the asterisk or star * to a list of the files in the current working directory. The echo command simply prints its arguments, so you can use it to see the effects of evaluation. For example:

   $ echo hello
   hello
   $ echo *
   #shell.html# emacs-intro-dl.html emacs-intro.html emacs-intro.html~
   emacs-telnet files-outtakes.html fofc-announcements.html ...

Variables and substitution

Create and assign variables using the equal sign = (with no spaces around it). To substitute a variable (obtain its value), you must precede it with the dollar sign $.

   $ x=3
   $ echo x
   x
   $ echo $x
   3

Environment variables

Ordinary variables are like local variables: they are only recognized in the script where they are defined (more accurately, they are only recognized in the process where they are defined; usually each script and program executes in its own process). Environment variables are like global variables: they are recognized in other processes (scripts and programs) as well. Use the export command to make environment variables.

   export y=4

Unix provides a set of built-in environment variables that define the configuration for your session: your home directory, your path, etc. These variables are spelled in all capital letters. Use the printenv command to display your environment (no $ is needed before the name; omit the argument to see all environment variables).

   $ printenv y
   4
   $ printenv PATH
   /usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/home/jon/bin  
   $ printenv
   ...
   y=4
   PATH=/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/home/jon/bin
   HOME=/home/jon
   SHELL=/bin/bash         
   ...
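
The difference between ordinary and exported variables can be checked by starting a child process and asking it to print both. The following is only a sketch; the variable names local_only and shared are invented for this illustration.

```shell
#!/bin/bash
# Hypothetical variable names, chosen only for this illustration.
local_only=3          # ordinary variable: known only in this process
export shared=4       # environment variable: passed on to child processes

# Start a child bash process and have it print both variables.
# The single quotes keep the current shell from substituting them first.
child_output=`bash -c 'echo "local_only=[$local_only] shared=[$shared]"'`
echo "$child_output"
```

The child prints an empty value for local_only and 4 for shared.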

Quotes and escapes

Sometimes you need to suppress evaluation by the shell so you can use a string that contains special characters as an argument to a command. Use single quotes '...' to suppress all evaluation by the shell. Use double quotes "..." to suppress most evaluation including wildcard expansion but allow variable substitution. Use backslash \ to suppress ("escape") interpretation of individual characters.

   $ echo The value of x in *.1 is $x
   The value of x in script.1 is 3
   $ echo 'The value of x in *.1 is $x'
   The value of x in *.1 is $x
   $ echo "The value of x in *.1 is $x"
   The value of x in *.1 is 3
   $ echo "He said "The value of x in *.1 is $x""
   He said The value of x in script.1 is 3
   $ echo "He said \"The value of x in *.1 is $x\""
   He said "The value of x in *.1 is 3"       

Command interpolation and backquotes

Use backquotes `...` to interpolate the output from one command into another command.

   $ date
   Sat Feb  3 12:12:02 PST 2001
   $ echo Now it is `date`
   Now it is Sat Feb 3 12:12:28 PST 2001
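
bash also accepts the form $(...) for the same kind of interpolation, and that form is easier to nest. A brief sketch (the variable names are invented for illustration):

```shell
#!/bin/bash
# Both forms capture the output of a command in a variable.
old_style=`echo hello`
new_style=$(echo hello)
echo "$old_style $new_style"

# $(...) nests without any extra escaping: the inner command runs first,
# and its output becomes an argument of the outer command.
nested=$(echo outer $(echo inner))
echo "$nested"
```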

Command editing, command history, and multi-line commands

When using bash interactively, you can edit the command line using arrow keys, backspace, delete, and emacs-style commands (C-a to move to the beginning of the line, C-e to the end, C-k to cut, C-y to paste or "yank"). Use the up-arrow key to retrieve the previous command for execution or editing. Use the up- and down-arrow keys to scroll through the history of recent commands.

Use the backslash character at the end of the line to create long commands that extend over more than one line (this just uses backslash to escape the usual interpretation of the newline as a signal to execute the command). The shell prints a different prompt character (usually >) to indicate continuation lines.

   $ echo This command \
   > extends over \
   > several lines
   This command extends over several lines      

Return status

In addition to producing output, all commands return a status value, where 0 indicates success and any nonzero value indicates failure. The shell does not normally display the status value but the status returned by the last command is also the value of the variable $?. Status values are used to control the execution of shell scripts.

  $ ls index.html
  index.html
  $ echo $?
  0
  $ ls nowhere
  ls: nowhere: No such file or directory
  $ echo $?
  1                          
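
Because every command sets $?, the value has to be saved right away if it is needed later. A minimal sketch, assuming that the made-up path /nonexistent-example-path does not exist on your system:

```shell
#!/bin/bash
# The path below is hypothetical and is expected not to exist.
ls /nonexistent-example-path 2>/dev/null   # 2>/dev/null hides the error message
status=$?                                  # save the status of ls immediately
echo "ls returned status $status"          # nonzero, because ls failed

# The echo above succeeded, so by now $? has been overwritten with 0.
after_echo=$?
echo "after echo, the status is $after_echo"
```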

Conditional execution

The conditional operators && (and) and || (or) test command status to determine whether to execute the next command. In cmd1 && cmd2, cmd2 executes only if cmd1 succeeds. In cmd1 || cmd2, cmd2 executes only if cmd1 fails. In cmd1 && cmd2 || cmd3, cmd2 executes if cmd1 succeeds and cmd3 executes if cmd1 fails (cmd3 also executes if cmd2 runs and fails).

   $ file=index.html
   $ ls $file && echo All is well || echo Something is wrong
   index.html
   All is well
   $ file=nowhere.html
   $ ls $file && echo All is well || echo Something is wrong
   ls: nowhere.html: No such file or directory
   Something is wrong      
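
One caveat about the three-part form cmd1 && cmd2 || cmd3: cmd3 runs whenever the part before || fails, including the case where cmd1 succeeds but cmd2 fails. A small sketch using the standard commands true and false, which do nothing except succeed and fail:

```shell
#!/bin/bash
# cmd1 (true) succeeds, so cmd2 (false) runs; cmd2 fails, so cmd3 runs too.
result=`true && false || echo "cmd3 ran anyway"`
echo "$result"

# When cmd2 succeeds, cmd3 is skipped and nothing is printed by it.
skipped=`true && true || echo "cmd3 ran"`
echo "[$skipped]"
```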

Test command

The test command is used to control execution of shell scripts. It computes and returns a status value but produces no output. Use command options to select the condition to test. For example, the -a option tests if a file exists (-e is an equivalent and more portable spelling).

The command test condition can be abbreviated [ condition ]. The spaces inside the brackets are required.

   $ file=index.html
   $ test -a $file && echo $file exists || echo $file does not exist
   index.html exists
   $ file=nowhere.html
   $ [ -a $file ] && echo $file exists || echo $file does not exist
   nowhere.html does not exist      
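
test has many other options besides -a. The sketch below shows a few common ones; the variable names and values are invented for illustration. Putting double quotes around a substituted variable keeps a value that is empty or contains spaces together as one argument.

```shell
#!/bin/bash
# Sample values, invented for this sketch.
dir=/           # the root directory, which always exists
name=""
count=5

[ -d "$dir" ]      && echo "$dir is a directory"      # -d: is a directory
[ -z "$name" ]     && echo "name is empty"            # -z: string has zero length
[ "$count" -gt 3 ] && echo "count is greater than 3"  # -gt: numeric comparison
[ "$count" = 5 ]   && echo "count is the string 5"    # =: string comparison
```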

If command

The if command is another way to get conditional execution. You can put many commands in each block (after the then or after the else). The shell can tell when commands like this are incomplete, so you needn't put backslashes at the end of lines. The placement of line breaks does matter, however: the keywords then, else, and fi must each begin a new line or follow a semicolon.

  $ if [ -a $file ]
  > then
  >    echo $file exists
  > else
  >    echo $file does not exist
  > fi
  nowhere.html does not exist  
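
An if command can also test several conditions in turn with elif, and it can be written on one line if semicolons take the place of the line breaks. A sketch, with variable names invented for illustration:

```shell
#!/bin/bash
file=/                    # sample value: the root directory
if [ -d "$file" ]
then
    kind=directory
elif [ -f "$file" ]
then
    kind="regular file"
else
    kind=missing
fi
echo "$file: $kind"

# The same kind of command on one line, with semicolons before then, else, fi.
if [ -d "$file" ]; then echo yes; else echo no; fi
```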

Writing shell scripts

To write a shell script, just put commands in a file (using some text editor such as emacs). There are several shells so the first line in the file should identify the shell language you use in your script. For bash, the line is

   #!/bin/bash

After the first line, the pound sign # indicates a comment. The shell ignores everything starting at the pound sign through the end of the line.

Usually, shell script files do not have any extension (any dot etc.) at the end of the name.
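
Putting these pieces together, a complete script file might look like the following sketch. The name hello-script and its contents are invented for illustration.

```shell
#!/bin/bash
# hello-script: a hypothetical first script.
# Everything from a pound sign to the end of the line is a comment.

greeting="Today is"              # assign a variable (no spaces around =)
today=`date`                     # interpolate the output of the date command
echo "$greeting $today"          # print the message
echo "Working directory: `pwd`"  # backquotes also work inside double quotes
```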

Running shell scripts

There are two ways to run shell scripts.

The easiest way is to source the script. Just execute the command source file (or just . file). You can type this command in your interactive session or put this command in another script. This causes the shell to execute the commands in file in the current process, just as if you had typed them then and there, so ordinary (un-exported) variables defined in the script can be used by subsequent commands in the same process. You only need read permission for file (you don't need execute permission) and file need not be in a directory that is in your path.

Another way is to treat the script as a full-fledged Unix command. Just execute the command file. This causes Unix to execute file in its own process, so unexported variables cannot be used by subsequent commands in the process where the command was issued. You must have execute permission for file, and file must be in a directory that is in your path (or you must type a path to it, such as ./file).

In most Linux setups, your working directory . is not in your path (there are security reasons for this), but your own ~/bin directory is. This is a good place to put your shell scripts once you've got them working the way you want.
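
The difference between the two ways of running a script can be seen by running the same small script both ways. This is only a sketch: the script name setvar and the variable z are invented, and the script is created in a temporary directory and run by its full path, so it does not need to be in your path.

```shell
#!/bin/bash
# Create a tiny throwaway script (hypothetical name: setvar).
tmpdir=`mktemp -d`
script="$tmpdir/setvar"
echo '#!/bin/bash' >  "$script"
echo 'z=5'         >> "$script"
chmod +x "$script"        # execute permission is needed to run it as a command

unset z
"$script"                 # run as a command: executes in its own process
after_run="[$z]"          # z is still unset in this process
echo "after running:  z=$after_run"

. "$script"               # source it (same as: source "$script")
after_source="[$z]"       # z was set in the current process
echo "after sourcing: z=$after_source"

rm -r "$tmpdir"           # clean up
```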

For command

The for command enables you to repeat a sequence of commands for each item in some collection.

  $ for i in 1 2 3
  > do
  >   echo $i
  > done
  1
  2
  3                            

You can use for to perform an operation on every file in some collection by using the file glob character star *.

   $ for file in *.html
   > do
   >  echo This is file $file
   > done    
   This is file emacs-intro-dl.html
   This is file emacs-intro.html
   This is file files-outtakes.html
   ...

Command arguments

You can invoke scripts with arguments. The special variables $1, $2, $3, etc. are the first, second, third etc. command line arguments.
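
A sketch of a script that reports its arguments follows. The script name show-args is invented, and the set -- line is there only so the sketch prints something when run without arguments; it simulates invoking the script as show-args alpha beta.

```shell
#!/bin/bash
# show-args: a hypothetical script that reports its command line arguments.

# For this sketch only: set -- replaces the arguments with sample values,
# as if the script had been invoked as: show-args alpha beta
set -- alpha beta

echo "Script name: $0"            # $0 is the name the script was invoked by
echo "Number of arguments: $#"    # $# is the argument count
echo "First argument: $1"
echo "Second argument: $2"
echo "All arguments: $*"          # $* is all the arguments together
```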

Working with filenames: dirname and basename

You often need to take apart and create filenames in scripts. The dirname command outputs the directory part. The basename command outputs the filename part, and takes an optional argument that can remove the extension.

  $ file=/usr/users4/fofc/z2html/tex/dcs.tex
  $ dirname $file
  /usr/users4/fofc/z2html/tex             
  $ basename $file   
  dcs.tex
  $ basename $file .tex
  dcs

These commands are often used with backquotes to create new filenames.

  $ echo We have `basename $file` but we want `basename $file .tex`.html
  We have dcs.tex but we want dcs.html   
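
In a script, the pieces are usually saved in variables and then reassembled. A sketch that builds an output filename in the same directory as the input, using the sample path from the session above (the variable names are invented; the file need not exist, because dirname and basename only work on the text of the name):

```shell
#!/bin/bash
file=/usr/users4/fofc/z2html/tex/dcs.tex
dir=`dirname $file`            # directory part of the name
stem=`basename $file .tex`     # filename part with the .tex extension removed
newfile=$dir/$stem.html        # same directory, new extension
echo "$file -> $newfile"
```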

Other commands: read, while, etc.

The read command reads one line from the standard input and assigns its contents to variables. The read command returns 0 (success) until it reaches end of file, so it can be used with the while command to read files one line at a time.
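
A sketch of the while and read combination follows. It first creates a small sample file, whose contents are invented for illustration, and then numbers its lines.

```shell
#!/bin/bash
# Make a small sample file to read.
tmpfile=`mktemp`
echo "first line"  >  "$tmpfile"
echo "second line" >> "$tmpfile"

count=0
# read puts one line of input into the variable line on each pass;
# the loop ends when read reaches end of file and returns nonzero.
while read line
do
    count=`expr $count + 1`
    echo "$count: $line"
done < "$tmpfile"       # < makes the file the standard input of the loop

rm "$tmpfile"           # clean up
```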

There are other control structures including case and until.
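
A brief sketch of case and until, with a file name and variable names invented for illustration:

```shell
#!/bin/bash
# case chooses a branch by matching a string against glob-style patterns.
file=notes.html          # hypothetical file name; the file need not exist
case $file in
    *.html) kind="HTML page" ;;
    *.tex)  kind="TeX source" ;;
    *)      kind="unknown" ;;     # * matches anything not matched above
esac
echo "$file is a $kind"

# until repeats its body for as long as the test command fails.
n=1
until [ $n -gt 3 ]
do
    echo "n is $n"
    n=`expr $n + 1`
done
```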


Jon Jacky, jackyj@evergreen.edu