Go to the previous, next section.
Processes are the primitive units for allocation of system resources. Each process has its own address space and (usually) one thread of control. A process executes a program; you can have multiple processes executing the same program, but each process has its own copy of the program within its own address space and executes it independently of the other copies.
This chapter explains what your program should do to handle the startup of a process, to terminate its process, and to receive information (arguments and the environment) from the parent process.
The system starts a C program by calling the function main
. It
is up to you to write a function named main
---otherwise, you
won't even be able to link your program without errors.
You can define main
either to take no arguments, or to take two
arguments that represent the command line arguments to the program, like
this:
int main (int argc, char *argv[])
The command line arguments are the whitespace-separated tokens given in
the shell command used to invoke the program; thus, in `cat foo
bar', the arguments are `foo' and `bar'. The only way a
program can look at its command line arguments is via the arguments of
main
. If main
doesn't take arguments, then you cannot get
at the command line.
The value of the argc argument is the number of command line
arguments. The argv argument is a vector of C strings; its
elements are the individual command line argument strings. The file
name of the program being run is also included in the vector as the
first element; the value of argc counts this element. A null
pointer always follows the last element: argv[argc]
is this null pointer.
For the command `cat foo bar', argc is 3 and argv has
three elements, "cat"
, "foo"
and "bar"
.
If the syntax for the command line arguments to your program is simple
enough, you can simply pick the arguments off from argv by hand.
But unless your program takes a fixed number of arguments, or all of the
arguments are interpreted in the same way (as file names, for example),
you are usually better off using getopt
to do the parsing.
POSIX recommends these conventions for command line arguments.
getopt
(see section Parsing Program Options) makes it easy to implement them.
isalnum
;
see section Classification of Characters).
ld
command requires an argument--an output file name.
The implementation of getopt
in the GNU C library normally makes
it appear as if all the option arguments were specified before all the
non-option arguments for the purposes of parsing, even if the user of
your program intermixed option and non-option arguments. It does this
by reordering the elements of the argv array. This behavior is
nonstandard; if you want to suppress it, define the
_POSIX_OPTION_ORDER
environment variable. See section Standard Environment Variables.
GNU adds long options to these conventions. Long options consist of `--' followed by a name made of alphanumeric characters and dashes. Option names are typically one to three words long, with hyphens to separate words. Users can abbreviate the option names as long as the abbreviations are unique.
To specify an argument for a long option, write `--name=value'. This syntax enables a long option to accept an argument that is itself optional.
Eventually, the GNU system will provide completion for long option names in the shell.
Here are the details about how to call the getopt
function. To
use this facility, your program must include the header file
`unistd.h'.
If the value of this variable is nonzero, then getopt
prints an
error message to the standard error stream if it encounters an unknown
option character or an option with a missing required argument. This is
the default behavior. If you set this variable to zero, getopt
does not print any messages, but it still returns the character ?
to indicate an error.
When getopt
encounters an unknown option character or an option
with a missing required argument, it stores that option character in
this variable. You can use this for providing your own diagnostic
messages.
This variable is set by getopt
to the index of the next element
of the argv array to be processed. Once getopt
has found
all of the option arguments, you can use this variable to determine
where the remaining non-option arguments begin. The initial value of
this variable is 1
.
This variable is set by getopt
to point at the value of the
option argument, for those options that accept arguments.
Function: int getopt (int argc, char **argv, const char *options)
The getopt
function gets the next option argument from the
argument list specified by the argv and argc arguments.
Normally these values come directly from the arguments received by
main
.
The options argument is a string that specifies the option characters that are valid for this program. An option character in this string can be followed by a colon (`:') to indicate that it takes a required argument.
If the options argument string begins with a hyphen (`-'), this is treated specially. It permits arguments that are not options to be returned as if they were associated with option character `\0'.
The getopt
function returns the option character for the next
command line option. When no more option arguments are available, it
returns -1
. There may still be more non-option arguments; you
must compare the external variable optind
against the argc
parameter to check this.
If the option has an argument, getopt
returns the argument by
storing it in the varables optarg. You don't ordinarily need to
copy the optarg
string, since it is a pointer into the original
argv array, not into a static area that might be overwritten.
If getopt
finds an option character in argv that was not
included in options, or a missing option argument, it returns
`?' and sets the external variable optopt
to the actual
option character. If the first character of options is a colon
(`:'), then getopt
returns `:' instead of `?' to
indicate a missing option argument. In addition, if the external
variable opterr
is nonzero (which is the default), getopt
prints an error message.
getopt
Here is an example showing how getopt
is typically used. The
key points to notice are:
getopt
is called in a loop. When getopt
returns
-1
, indicating no more options are present, the loop terminates.
switch
statement is used to dispatch on the return value from
getopt
. In typical use, each case just sets a variable that
is used later in the program.
#include <unistd.h> #include <stdio.h> int main (int argc, char **argv) { int aflag = 0; int bflag = 0; char *cvalue = NULL; int index; int c; opterr = 0; while ((c = getopt (argc, argv, "abc:")) != -1) switch (c) { case 'a': aflag = 1; break; case 'b': bflag = 1; break; case 'c': cvalue = optarg; break; case '?': if (isprint (optopt)) fprintf (stderr, "Unknown option `-%c'.\n", optopt); else fprintf (stderr, "Unknown option character `\\x%x'.\n", optopt); return 1; default: abort (); } printf ("aflag = %d, bflag = %d, cvalue = %s\n", aflag, bflag, cvalue); for (index = optind; index < argc; index++) printf ("Non-option argument %s\n", argv[index]); return 0; }
Here are some examples showing what this program prints with different combinations of arguments:
% testopt aflag = 0, bflag = 0, cvalue = (null) % testopt -a -b aflag = 1, bflag = 1, cvalue = (null) % testopt -ab aflag = 1, bflag = 1, cvalue = (null) % testopt -c foo aflag = 0, bflag = 0, cvalue = foo % testopt -cfoo aflag = 0, bflag = 0, cvalue = foo % testopt arg1 aflag = 0, bflag = 0, cvalue = (null) Non-option argument arg1 % testopt -a arg1 aflag = 1, bflag = 0, cvalue = (null) Non-option argument arg1 % testopt -c foo arg1 aflag = 0, bflag = 0, cvalue = foo Non-option argument arg1 % testopt -a -- -b aflag = 1, bflag = 0, cvalue = (null) Non-option argument -b % testopt -a - aflag = 1, bflag = 0, cvalue = (null) Non-option argument -
To accept GNU-style long options as well as single-character options,
use getopt_long
instead of getopt
. You should make every
program accept long options if it uses any options, for this takes
little extra work and helps beginners remember how to use the program.
This structure describes a single long option name for the sake of
getopt_long
. The argument longopts must be an array of
these structures, one for each long option. Terminate the array with an
element containing all zeros.
The struct option
structure has these fields:
const char *name
int has_arg
no_argument
,
required_argument
and optional_argument
.
int *flag
int val
If flag
is a null pointer, then the val
is a value which
identifies this option. Often these values are chosen to uniquely
identify particular long options.
If flag
is not a null pointer, it should be the address of an
int
variable which is the flag for this option. The value in
val
is the value to store in the flag to indicate that the option
was seen.
Function: int getopt_long (int argc, char **argv, const char *shortopts, struct option *longopts, int *indexptr)
Decode options from the vector argv (whose length is argc).
The argument shortopts describes the short options to accept, just as
it does in getopt
. The argument longopts describes the long
options to accept (see above).
When getopt_long
encounters a short option, it does the same
thing that getopt
would do: it returns the character code for the
option, and stores the options argument (if it has one) in optarg
.
When getopt_long
encounters a long option, it takes actions based
on the flag
and val
fields of the definition of that
option.
If flag
is a null pointer, then getopt_long
returns the
contents of val
to indicate which option it found. You should
arrange distinct values in the val
field for options with
different meanings, so you can decode these values after
getopt_long
returns. If the long option is equivalent to a short
option, you can use the short option's character code in val
.
If flag
is not a null pointer, that means this option should just
set a flag in the program. The flag is a variable of type int
that you define. Put the address of the flag in the flag
field.
Put in the val
field the value you would like this option to
store in the flag. In this case, getopt_long
returns 0
.
For any long option, getopt_long
tells you the index in the array
longopts of the options definition, by storing it into
*indexptr
. You can get the name of the option with
longopts[*indexptr].name
. So you can distinguish among
long options either by the values in their val
fields or by their
indices. You can also distinguish in this way among long options that
set flags.
When a long option has an argument, getopt_long
puts the argument
value in the variable optarg
before returning. When the option
has no argument, the value in optarg
is a null pointer. This is
how you can tell whether an optional argument was supplied.
When getopt_long
has no more options to handle, it returns
-1
, and leaves in the variable optind
the index in
argv of the next remaining argument.
#include <stdio.h> /* Flag set by `--verbose'. */ static int verbose_flag; int main (argc, argv) int argc; char **argv; { int c; while (1) { static struct option long_options[] = { /* These options set a flag. */ {"verbose", 0, &verbose_flag, 1}, {"brief", 0, &verbose_flag, 0}, /* These options don't set a flag. We distinguish them by their indices. */ {"add", 1, 0, 0}, {"append", 0, 0, 0}, {"delete", 1, 0, 0}, {"create", 0, 0, 0}, {"file", 1, 0, 0}, {0, 0, 0, 0} }; /*getopt_long
stores the option index here. */ int option_index = 0; c = getopt_long (argc, argv, "abc:d:", long_options, &option_index); /* Detect the end of the options. */ if (c == -1) break; switch (c) { case 0: /* If this option set a flag, do nothing else now. */ if (long_options[option_index].flag != 0) break; printf ("option %s", long_options[option_index].name); if (optarg) printf (" with arg %s", optarg); printf ("\n"); break; case 'a': puts ("option -a\n"); break; case 'b': puts ("option -b\n"); break; case 'c': printf ("option -c with value `%s'\n", optarg); break; case 'd': printf ("option -d with value `%s'\n", optarg); break; case '?': /*getopt_long
already printed an error message. */ break; default: abort (); } } /* Instead of reporting `--verbose' and `--brief' as they are encountered, we report the final status resulting from them. */ if (verbose_flag) puts ("verbose flag is set"); /* Print any remaining command line arguments (not options). */ if (optind < argc) { printf ("non-option ARGV-elements: "); while (optind < argc) printf ("%s ", argv[optind++]); putchar ('\n'); } exit (0); }
When a program is executed, it receives information about the context in
which it was invoked in two ways. The first mechanism uses the
argv and argc arguments to its main
function, and is
discussed in section Program Arguments. The second mechanism uses
environment variables and is discussed in this section.
The argv mechanism is typically used to pass command-line arguments specific to the particular program being invoked. The environment, on the other hand, keeps track of information that is shared by many programs, changes infrequently, and that is less frequently accessed.
The environment variables discussed in this section are the same
environment variables that you set using assignments and the
export
command in the shell. Programs executed from the shell
inherit all of the environment variables from the shell.
Standard environment variables are used for information about the user's home directory, terminal type, current locale, and so on; you can define additional variables for other purposes. The set of all environment variables that have values is collectively known as the environment.
Names of environment variables are case-sensitive and must not contain the character `='. System-defined environment variables are invariably uppercase.
The values of environment variables can be anything that can be represented as a string. A value must not contain an embedded null character, since this is assumed to terminate the string.
The value of an environment variable can be accessed with the
getenv
function. This is declared in the header file
`stdlib.h'.
Function: char * getenv (const char *name)
This function returns a string that is the value of the environment
variable name. You must not modify this string. In some systems
not using the GNU library, it might be overwritten by subsequent calls
to getenv
(but not by any other library function).
If the environment variable name is not defined, the value is a
null pointer.
Function: int putenv (const char *string)
The putenv
function adds or removes definitions from the environment.
If the string is of the form `name=value', the
definition is added to the environment. Otherwise, the string is
interpreted as the name of an environment variable, and any definition
for this variable in the environment is removed.
The GNU library provides this function for compatibility with SVID; it may not be available in other systems.
You can deal directly with the underlying representation of environment objects to add more variables to the environment (for example, to communicate with another program you are about to execute; see section Executing a File).
The environment is represented as an array of strings. Each string is of the format `name=value'. The order in which strings appear in the environment is not significant, but the same name must not appear more than once. The last element of the array is a null pointer.
This variable is declared in the header file `unistd.h'.
If you just want to get the value of an environment variable, use
getenv
.
These environment variables have standard meanings. This doesn't mean that they are always present in the environment; but if these variables are present, they have these meanings, and that you shouldn't try to use these environment variable names for some other purpose.
HOME
This is a string representing the user's home directory, or initial default working directory.
The user can set HOME
to any value.
If you need to make sure to obtain the proper home directory
for a particular user, you should not use HOME
; instead,
look up the user's name in the user database (see section User Database).
For most purposes, it is better to use HOME
, precisely because
this lets the user specify the value.
LOGNAME
This is the name that the user used to log in. Since the value in the
environment can be tweaked arbitrarily, this is not a reliable way to
identify the user who is running a process; a function like
getlogin
(see section Identifying Who Logged In) is better for that purpose.
For most purposes, it is better to use LOGNAME
, precisely because
this lets the user specify the value.
PATH
A path is a sequence of directory names which is used for
searching for a file. The variable PATH
holds a path used
for searching for programs to be run.
The execlp
and execvp
functions (see section Executing a File)
use this environment variable, as do many shells and other utilities
which are implemented in terms of those functions.
The syntax of a path is a sequence of directory names separated by colons. An empty string instead of a directory name stands for the current directory (see section Working Directory).
A typical value for this environment variable might be a string like:
:/bin:/etc:/usr/bin:/usr/new/X11:/usr/new:/usr/local:/usr/local/bin
This means that if the user tries to execute a program named foo
,
the system will look for files named `foo', `/bin/foo',
`/etc/foo', and so on. The first of these files that exists is
the one that is executed.
TERM
This specifies the kind of terminal that is receiving program output.
Some programs can make use of this information to take advantage of
special escape sequences or terminal modes supported by particular kinds
of terminals. Many programs which use the termcap library
(see section 'Finding a Terminal Description' in The Termcap Library Manual) use the TERM
environment variable, for example.
TZ
This specifies the time zone. See section Specifying the Time Zone with TZ
, for information about
the format of this string and how it is used.
LANG
This specifies the default locale to use for attribute categories where
neither LC_ALL
nor the specific environment variable for that
category is set. See section Locales and Internationalization, for more information about
locales.
LC_COLLATE
This specifies what locale to use for string sorting.
LC_CTYPE
This specifies what locale to use for character sets and character classification.
LC_MONETARY
This specifies what locale to use for formatting monetary values.
LC_NUMERIC
This specifies what locale to use for formatting numbers.
LC_TIME
This specifies what locale to use for formatting date/time values.
_POSIX_OPTION_ORDER
If this environment variable is defined, it suppresses the usual
reordering of command line arguments by getopt
. See section Program Argument Syntax Conventions.
The usual way for a program to terminate is simply for its main
function to return. The exit status value returned from the
main
function is used to report information back to the process's
parent process or shell.
A program can also terminate normally by calling the exit
function.
In addition, programs can be terminated by signals; this is discussed in
more detail in section Signal Handling. The abort
function causes
a signal that kills the program.
A process terminates normally when the program calls exit
.
Returning from main
is equivalent to calling exit
, and
the value that main
returns is used as the argument to exit
.
Function: void exit (int status)
The exit
function terminates the process with status
status. This function does not return.
Normal termination causes the following actions:
atexit
or on_exit
functions are called in the reverse order of their registration. This
mechanism allows your application to specify its own "cleanup" actions
to be performed at program termination. Typically, this is used to do
things like saving program state information in a file, or unlocking
locks in shared data bases.
tmpfile
function are removed; see section Temporary Files.
_exit
is called, terminating the program. See section Termination Internals.
When a program exits, it can return to the parent process a small
amount of information about the cause of termination, using the
exit status. This is a value between 0 and 255 that the exiting
process passes as an argument to exit
.
Normally you should use the exit status to report very broad information about success or failure. You can't provide a lot of detail about the reasons for the failure, and most parent processes would not want much detail anyway.
There are conventions for what sorts of status values certain programs should return. The most common convention is simply 0 for success and 1 for failure. Programs that perform comparison use a different convention: they use status 1 to indicate a mismatch, and status 2 to indicate an inability to compare. Your program should follow an existing convention if an existing convention makes sense for it.
A general convention reserves status values 128 and up for special purposes. In particular, the value 128 is used to indicate failure to execute another program in a subprocess. This convention is not universally obeyed, but it is a good idea to follow it in your programs.
Warning: Don't try to use the number of errors as the exit status. This is actually not very useful; a parent process would generally not care how many errors occurred. Worse than that, it does not work, because the status value is truncated to eight bits. Thus, if the program tried to report 256 errors, the parent would receive a report of 0 errors--that is, success.
For the same reason, it does not work to use the value of errno
as the exit status--these can exceed 255.
Portability note: Some non-POSIX systems use different
conventions for exit status values. For greater portability, you can
use the macros EXIT_SUCCESS
and EXIT_FAILURE
for the
conventional status value for success and failure, respectively. They
are declared in the file `stdlib.h'.
This macro can be used with the exit
function to indicate
successful program completion.
On POSIX systems, the value of this macro is 0
. On other
systems, the value might be some other (possibly non-constant) integer
expression.
This macro can be used with the exit
function to indicate
unsuccessful program completion in a general sense.
On POSIX systems, the value of this macro is 1
. On other
systems, the value might be some other (possibly non-constant) integer
expression. Other nonzero status values also indicate future. Certain
programs use different nonzero status values to indicate particular
kinds of "non-success". For example, diff
uses status value
1
to mean that the files are different, and 2
or more to
mean that there was difficulty in opening the files.
Your program can arrange to run its own cleanup functions if normal
termination happens. If you are writing a library for use in various
application programs, then it is unreliable to insist that all
applications call the library's cleanup functions explicitly before
exiting. It is much more robust to make the cleanup invisible to the
application, by setting up a cleanup function in the library itself
using atexit
or on_exit
.
Function: int atexit (void (*function) (void))
The atexit
function registers the function function to be
called at normal program termination. The function is called with
no arguments.
The return value from atexit
is zero on success and nonzero if
the function cannot be registered.
Function: int on_exit (void (*function)(int status, void *arg), void *arg)
This function is a somewhat more powerful variant of atexit
. It
accepts two arguments, a function function and an arbitrary
pointer arg. At normal program termination, the function is
called with two arguments: the status value passed to exit
,
and the arg.
This function is included in the GNU C library only for compatibility for SunOS, and may not be supported by other implementations.
Here's a trivial program that illustrates the use of exit
and
atexit
:
#include <stdio.h> #include <stdlib.h> void bye (void) { puts ("Goodbye, cruel world...."); } int main (void) { atexit (bye); exit (EXIT_SUCCESS); }
When this program is executed, it just prints the message and exits.
You can abort your program using the abort
function. The prototype
for this function is in `stdlib.h'.
The abort
function causes abnormal program termination. This
does not execute cleanup functions registered with atexit
or
on_exit
.
This function actually terminates the process by raising a
SIGABRT
signal, and your program can include a handler to
intercept this signal; see section Signal Handling.
Future Change Warning: Proposed Federal censorship regulations may prohibit us from giving you information about the possibility of calling this function. We would be required to say that this is not an acceptable way of terminating a program.
The _exit
function is the primitive used for process termination
by exit
. It is declared in the header file `unistd.h'.
Function: void _exit (int status)
The _exit
function is the primitive for causing a process to
terminate with status status. Calling this function does not
execute cleanup functions registered with atexit
or
on_exit
.
When a process terminates for any reason--either by an explicit termination call, or termination as a result of a signal--the following things happen:
wait
or waitpid
; see
section Process Completion.
init
process, with process ID 1.)
SIGCHLD
signal is sent to the parent process.
SIGHUP
signal is sent to each process in the foreground job,
and the controlling terminal is disassociated from that session.
See section Job Control.
SIGHUP
signal and a SIGCONT
signal are sent to each process in the
group. See section Job Control.
Go to the previous, next section.