
@node Processes, Job Control, Signal Handling, Top
@chapter Processes

@cindex process
@dfn{Processes} are the primitive units for allocation of system
resources.  Each process has its own address space and (usually) one
thread of control.  A process executes a program; you can have multiple
processes executing the same program, but each process has its own copy
of the program within its own address space and executes it
independently of the other copies.

Processes are organized hierarchically.  Child processes are created by
a parent process, and inherit many of their attributes from the parent
process.

This chapter describes how a program can create, terminate, and control
child processes.

@menu
* Program Arguments::           Parsing the command-line arguments to
				 a program.
* Environment Variables::       How to access parameters inherited from
				 a parent process.
* Program Termination::         How to cause a process to terminate and
				 return status information to its parent.
* Creating New Processes::      Running other programs.
@end menu


@node Program Arguments, Environment Variables,  , Processes
@section Program Arguments
@cindex program arguments
@cindex command line arguments

@cindex @code{main} function
When your C program starts, it begins by executing the function called
@code{main}.  You can define @code{main} either to take no arguments,
or to take two arguments that represent the command line arguments
to the program, like this:

@example
int main (int @var{argc}, char *@var{argv}[])
@end example

@cindex argc (program argument count)
@cindex argv (program argument vector)
The command line arguments are the whitespace-separated tokens typed by
the user to the shell in invoking the program.  The value of the
@var{argc} argument is the number of command line arguments.  The
@var{argv} argument is a vector of pointers to @code{char}; sometimes it
is also declared as @samp{char **@var{argv}}.  The elements of
@var{argv} are the individual command line argument strings.  By
convention, @code{@var{argv}[0]} is the file name of the program being
run, and @code{@var{argv}[@var{argc}]} is a null pointer.
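
As a minimal sketch of these conventions, the following function walks
an argument vector using only the guarantee that the element after the
last argument is a null pointer.  The name @code{print_arguments} is
made up for this illustration; it is not a library function.

```c
#include <stdio.h>

/* Print each argument on its own line and return how many were seen.
   The loop relies on argv[argc] being a null pointer, so it does not
   need argc at all.  (print_arguments is a hypothetical helper.)  */
int
print_arguments (char **argv)
{
  int count = 0;

  for (char **p = argv; *p != NULL; p++)
    {
      printf ("argv[%d] = %s\n", count, *p);
      count++;
    }
  return count;
}
```

Called from @code{main} as @code{print_arguments (argv)}, it returns
the same number that @code{main} received in @var{argc}.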

If the syntax for the command line arguments to your program is simple
enough, you can simply pick the arguments off from @var{argv} by hand.
But unless your program takes a fixed number of arguments, or all of the
arguments are interpreted in the same way (as file names, for example),
you are usually better off using @code{getopt} to do the parsing.

@menu
* Argument Syntax Conventions::  By convention, program options are
                                  specified by a leading hyphen.
* Parsing Program Arguments::    The @code{getopt} function.
* Example Using getopt::         An example of @code{getopt}.
@end menu

@node Argument Syntax Conventions, Parsing Program Arguments,  , Program Arguments
@subsection Program Argument Syntax Conventions
@cindex program argument syntax
@cindex syntax, for program arguments
@cindex command argument syntax

The @code{getopt} function decodes options following the usual
conventions for POSIX utilities:

@itemize @bullet
@item
Arguments are options if they begin with a hyphen delimiter (@samp{-}).

@item
Multiple options may follow a hyphen delimiter in a single token if
the options do not take arguments.  Thus, @samp{-abc} is equivalent to
@samp{-a -b -c}.

@item
Option names are single alphanumeric characters (as for @code{isalnum};
see @ref{Classification of Characters}).

@item
Certain options require an argument.  For example, the @samp{-o}
option of the @code{ld} command requires an argument---an output file name.

@item
An option and its argument may or may not appear as separate tokens.  (In
other words, the whitespace separating them is optional.)  Thus,
@samp{-o foo} and @samp{-ofoo} are equivalent.

@item
Options typically precede other non-option arguments.

The implementation of @code{getopt} in the GNU C library normally makes
it appear as if all the option arguments were specified before all the
non-option arguments for the purposes of parsing, even if the user of
your program intermixed option and non-option arguments.  It does this
by reordering the elements of the @var{argv} array.  This behavior is
nonstandard; if you want to suppress it, define the
@code{_POSIX_OPTION_ORDER} environment variable.  @xref{Standard
Environment Variables}.

@item
The argument @samp{--} terminates all options; any following arguments
are treated as non-option arguments, even if they begin with a hyphen.

@item
A token consisting of a single hyphen character is interpreted as an
ordinary non-option argument.  By convention, it is used to specify
input from or output to the standard input and output streams.

@item
Options may be supplied in any order, or appear multiple times.  The
interpretation is left up to the particular application program.
@end itemize

@node Parsing Program Arguments, Example Using getopt, Argument Syntax Conventions, Program Arguments
@subsection Parsing Program Arguments
@cindex program arguments, parsing
@cindex command arguments, parsing
@cindex parsing program arguments

Here are the details about how to call the @code{getopt} function.  To
use this facility, your program must include the header file
@file{unistd.h}.
@pindex unistd.h

@comment unistd.h
@comment POSIX.2
@deftypevar int opterr
If the value of this variable is nonzero, then @code{getopt} prints an
error message to the standard error stream if it encounters an unknown
option character or an option with a missing required argument.  This is
the default behavior.  If you set this variable to zero, @code{getopt}
does not print any messages, but it still returns @code{?} to indicate
an error.
@end deftypevar

@comment unistd.h
@comment POSIX.2
@deftypevar int optopt
When @code{getopt} encounters an unknown option character or an option
with a missing required argument, it stores that option character in
this variable.  You can use this for providing your own diagnostic
messages.
@end deftypevar

@comment unistd.h
@comment POSIX.2
@deftypevar int optind
This variable is set by @code{getopt} to the index of the next element
of the @var{argv} array to be processed.  Once @code{getopt} has found
all of the option arguments, you can use this variable to determine
where the remaining non-option arguments begin.  The initial value of
this variable is @code{1}.
@end deftypevar

@comment unistd.h
@comment POSIX.2
@deftypevar {char *} optarg
This variable is set by @code{getopt} to point at the value of the
option argument, for those options that accept arguments.
@end deftypevar

@comment unistd.h
@comment POSIX.2
@deftypefun int getopt (int @var{argc}, char **@var{argv}, const char *@var{options})
The @code{getopt} function gets the next option argument from the
argument list specified by the @var{argv} and @var{argc} arguments.
Normally these arguments' values come directly from the arguments of
@code{main}.

The @var{options} argument is a string that specifies the option
characters that are valid for this program.  An option character in this
string can be followed by a colon (@samp{:}) to indicate that it takes a
required argument.

If the @var{options} argument string begins with a hyphen (@samp{-}), this
is treated specially.  It permits arguments without an option to be
returned as if they were associated with option character @samp{\0}.

The @code{getopt} function returns the option character for the next
command line option.  When no more option arguments are available, it
returns @code{-1}.  There may still be more non-option arguments; you
must compare the external variable @code{optind} against the @var{argc}
parameter to check this.

If the option has an argument, @code{getopt} returns the argument by
storing it in the variable @code{optarg}.  You don't ordinarily need to
copy the @code{optarg} string, since it is a pointer into the original
@var{argv} array, not into a static area that might be overwritten.

If @code{getopt} finds an option character in @var{argv} that was not
included in @var{options}, or a missing option argument, it returns
@samp{?} and sets the external variable @code{optopt} to the actual
option character.  In addition, if the external variable @code{opterr}
is nonzero, @code{getopt} prints an error message.
@end deftypefun

@node Example Using getopt,  , Parsing Program Arguments, Program Arguments
@subsection Example of Parsing Program Arguments

Here is an example showing how @code{getopt} is typically used.  The
key points to notice are:

@itemize @bullet
@item
Normally, @code{getopt} is called in a loop.  When @code{getopt} returns
@code{-1}, indicating no more options are present, the loop terminates.

@item
A @code{switch} statement is used to dispatch on the return value from
@code{getopt}.  In typical use, each case just sets a variable that
is used later in the program.

@item
A second loop is used to process the remaining non-option arguments.
@end itemize

@example
@include testopt.c.texi
@end example

Here are some examples showing what this program prints with different
combinations of arguments:

@example
% testopt
aflag = 0, bflag = 0, cvalue = (null)

% testopt -a -b
aflag = 1, bflag = 1, cvalue = (null)

% testopt -ab
aflag = 1, bflag = 1, cvalue = (null)

% testopt -c foo
aflag = 0, bflag = 0, cvalue = foo

% testopt -cfoo
aflag = 0, bflag = 0, cvalue = foo

% testopt arg1
aflag = 0, bflag = 0, cvalue = (null)
Non-option argument arg1

% testopt -a arg1
aflag = 1, bflag = 0, cvalue = (null)
Non-option argument arg1

% testopt -c foo arg1
aflag = 0, bflag = 0, cvalue = foo
Non-option argument arg1

% testopt -a -- -b
aflag = 1, bflag = 0, cvalue = (null)
Non-option argument -b

% testopt -a -
aflag = 1, bflag = 0, cvalue = (null)
Non-option argument -
@end example

@node Environment Variables, Program Termination, Program Arguments, Processes
@section Environment Variables

@cindex environment variable
When a program is executed, it receives information about the context in
which it was invoked in two ways.  The first mechanism uses the
@var{argv} and @var{argc} arguments to its @code{main} function, and is
discussed in @ref{Program Arguments}.  The second mechanism
uses @dfn{environment variables} and is discussed in this section.

The @var{argv} mechanism is typically used to pass command-line
arguments specific to the particular program being invoked.  The
environment, on the other hand, keeps track of information that is
shared by many programs, changes infrequently, and is less
frequently accessed.

The environment variables discussed in this section are the same
environment variables that you set using the assignments and the
@code{export} command in the shell.  Programs executed from the shell
inherit all of the environment variables from the shell.

@cindex environment
Standard environment variables are used for information about the user's
home directory, terminal type, current locale, and so on; you can define
additional variables for other purposes.  The set of all environment
variables that have values is collectively known as the
@dfn{environment}.

Names of environment variables are case-sensitive and must not contain
the character @samp{=}.  System-defined environment variables are
invariably uppercase.

The values of environment variables can be anything that can be
represented as a string.  A value must not contain an embedded null
character, since this is assumed to terminate the string.


@menu
* Environment Access::          How to get and set the values of
					 environment variables.
* Standard Environment Variables::  These environment variables have
					 standard interpretations.
@end menu

@node Environment Access, Standard Environment Variables,  , Environment Variables
@subsection Environment Access
@cindex environment access
@cindex environment representation

The value of an environment variable can be accessed with the
@code{getenv} function.  This is declared in the header file
@file{stdlib.h}.
@pindex stdlib.h

@comment stdlib.h
@comment ANSI
@deftypefun {char *} getenv (const char *@var{name})
This function returns a string that is the value of the environment
variable @var{name}.  You must not modify this string.  In some systems
not using the GNU library, it might be overwritten by subsequent calls
to @code{getenv} (but not by any other library function).  If the
environment variable @var{name} is not defined, the value is a null
pointer.
@end deftypefun
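
Because @code{getenv} returns a null pointer for an undefined variable,
callers usually need a default.  Here is a small sketch of that
pattern; @code{getenv_or} is a helper invented for this example, and
treating an empty value like an unset one is a design choice of the
sketch, not something @code{getenv} does.

```c
#include <stdlib.h>

/* Return the value of NAME, or FALLBACK if NAME is unset or empty.
   The result must be treated as read-only, like any string obtained
   from getenv.  (getenv_or is a hypothetical helper.)  */
const char *
getenv_or (const char *name, const char *fallback)
{
  const char *value = getenv (name);

  return (value != NULL && value[0] != '\0') ? value : fallback;
}
```

For instance, @code{getenv_or ("TERM", "dumb")} yields a usable
terminal type even when @code{TERM} is not set.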


@comment stdlib.h
@comment SVID
@deftypefun int putenv (const char *@var{string})
The @code{putenv} function adds or removes definitions from the environment.
If the @var{string} is of the form @samp{@var{name}=@var{value}}, the
definition is added to the environment.  Otherwise, the @var{string} is
interpreted as the name of an environment variable, and any definition
for this variable in the environment is removed.

The GNU library provides this function for compatibility with SVID; it
may not be available in other systems.
@end deftypefun

You can deal directly with the underlying representation of environment
objects to add more variables to the environment (for example, to
communicate with another program you are about to execute; see
@ref{Executing a File}).

@comment unistd.h
@comment POSIX.1
@deftypevar {char **} environ
The environment is represented as an array of strings.  Each string is
of the format @samp{@var{name}=@var{value}}.  The order in which
strings appear in the environment is not significant, but the same
@var{name} must not appear more than once.  The last element of the
array is a null pointer.

This variable is not declared in any header file, but if you declare it
in your own program as @code{extern}, the right thing will happen.

If you just want to get the value of an environment variable, use
@code{getenv}.
@end deftypevar

@node Standard Environment Variables,  , Environment Access, Environment Variables
@subsection Standard Environment Variables
@cindex standard environment variables

These environment variables have standard meanings.
This doesn't mean that they are always present in the
environment, though; it just means that if these variables @emph{are}
present, they have these meanings, and that you shouldn't try to use
these environment variable names for some other purpose.

@table @code
@item HOME
@cindex HOME environment variable
@cindex home directory
This is a string representing the user's @dfn{home directory}, or
initial default working directory.  @xref{User Database}, for a
more secure way of determining this information.

@comment RMS says to explay why HOME is better, but I don't know why.

@item LOGNAME
@cindex LOGNAME environment variable
This is the name that the user used to log in.  Since the value in the
environment can be tweaked arbitrarily, this is not a reliable way to
identify the user who is running a process; a function like
@code{getlogin} (@pxref{User Identification Functions}) is better for
that purpose.

@comment RMS says to explay why LOGNAME is better, but I don't know why.

@item PATH
@cindex PATH environment variable
A @dfn{path} is a sequence of directory names which is used for
searching for a file.  The variable @code{PATH} holds a path.  The
@code{execlp} and @code{execvp} functions (@pxref{Executing a File})
use this environment variable, as do many shells and other utilities
which are implemented in terms of those functions.

The syntax of a path is a sequence of directory names separated by
colons.  An empty string instead of a directory name stands for the
current directory.  (@xref{Working Directory}.)

A typical value for this environment variable might be a string like:

@example
.:/bin:/etc:/usr/bin:/usr/new/X11:/usr/new:/usr/local:/usr/local/bin
@end example

This means that if the user tries to execute a program named @code{foo},
the system will look for files named @file{./foo}, @file{/bin/foo},
@file{/etc/foo}, and so on.  The first of these files that exists is
the one that is executed.

@item TERM
@cindex TERM environment variable
This specifies the kind of terminal that is receiving program output.
Some programs can make use of this information to take advantage of
special escape sequences or terminal modes supported by particular kinds
of terminals.  Many programs which use the termcap library
(@pxref{Finding a Terminal Description,Find,,termcap,The Termcap Library
Manual}) use the @code{TERM} environment variable, for example.

@item TZ
@cindex TZ environment variable
This specifies the time zone.  @xref{Time Zone}, for information about
the format of this string and how it is used.

@item LANG
@cindex LANG environment variable
This specifies the default locale to use for attribute categories where
neither @code{LC_ALL} nor the specific environment variable for that
category is set.  @xref{Locales}, for more information about
locales.

@item LC_ALL
@cindex LC_ALL environment variable
This is similar to the @code{LANG} environment variable.  However, its
value takes precedence over any values provided for the individual
attribute category environment variables, or for the @code{LANG}
environment variable.

@item LC_COLLATE
@cindex LC_COLLATE environment variable
This specifies what locale to use for string sorting.

@item LC_CTYPE
@cindex LC_CTYPE environment variable
This specifies what locale to use for character sets and character
classification.

@item LC_MONETARY
@cindex LC_MONETARY environment variable
This specifies what locale to use for formatting monetary values.

@item LC_NUMERIC
@cindex LC_NUMERIC environment variable
This specifies what locale to use for formatting numbers.

@item LC_TIME
@cindex LC_TIME environment variable
This specifies what locale to use for formatting date/time values.

@item _POSIX_OPTION_ORDER
@cindex _POSIX_OPTION_ORDER environment variable
If this environment variable is defined, it suppresses the usual
reordering of command line arguments by @code{getopt}.  @xref{Argument
Syntax Conventions}.
@end table
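
The syntax of @code{PATH} described in the table above (directory
names separated by colons, with an empty name standing for the current
directory) can be decoded with a short loop.  The following is only a
sketch; the function name @code{print_path_directories} is made up, and
printing an empty component as @file{.} is a choice of this example.

```c
#include <stdio.h>
#include <string.h>

/* Print the directories named in PATH_VALUE, one per line, and return
   how many there are.  An empty component stands for the current
   directory and is printed as ".".  */
int
print_path_directories (const char *path_value)
{
  int count = 0;
  const char *start = path_value;

  for (;;)
    {
      size_t len = strcspn (start, ":");   /* length up to ':' or end */

      if (len == 0)
        puts (".");
      else
        printf ("%.*s\n", (int) len, start);
      count++;

      if (start[len] == '\0')
        break;
      start += len + 1;                    /* skip the colon */
    }
  return count;
}
```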

@node Program Termination, Creating New Processes, Environment Variables, Processes
@section Program Termination
@cindex program termination
@cindex process termination

@cindex exit status value
The usual way for a program to terminate is simply for its @code{main}
function to return.  The @dfn{exit status value} returned from the
@code{main} function is used to report information back to the process's
parent process or shell.

A program can also terminate normally by calling the @code{exit}
function.

In addition, programs can be terminated by signals; this is discussed in
more detail in @ref{Signal Handling}.  The @code{abort} function causes
a signal that kills the program.

@menu
* Normal Program Termination::
* Exit Status::                 Exit Status
* Cleanups on Exit::            Cleanups on Exit
* Aborting a Program::
* Termination Internals::       Termination Internals
@end menu

@node Normal Program Termination, Exit Status,  , Program Termination
@subsection Normal Program Termination

@comment stdlib.h
@comment ANSI
@deftypefun void exit (int @var{status})
The @code{exit} function causes normal program termination with status
@var{status}.  This function does not return.
@end deftypefun

When a program terminates normally by returning from its @code{main}
function or by calling @code{exit}, the following actions occur in
sequence:

@enumerate
@item
Functions that were registered with the @code{atexit} or @code{on_exit}
functions are called in the reverse order of their registration.  This
mechanism allows your application to specify its own ``cleanup'' actions
to be performed at program termination.  Typically, this is used to do
things like saving program state information in a file, or unlocking
locks in shared databases.

@item
All open streams are closed, writing out any buffered output data.  See
@ref{Opening and Closing Streams}.  In addition, temporary files opened
with the @code{tmpfile} function are removed; see @ref{Temporary Files}.

@item
@code{_exit} is called.  @xref{Termination Internals}.
@end enumerate
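
The reverse-order rule in the first step can be observed with a small
experiment.  In this sketch a child process registers two handlers and
exits, and each handler writes one character to a pipe so that the
parent can see the order in which they ran.  The names
@code{observe_atexit_order}, @code{first_registered}, and
@code{second_registered} are inventions of this example; @code{pipe},
@code{fork}, and @code{waitpid} are POSIX functions.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd;   /* write end of the pipe, used by the handlers */

static void
note (char c)
{
  ssize_t unused = write (report_fd, &c, 1);
  (void) unused;
}

static void first_registered (void)  { note ('1'); }
static void second_registered (void) { note ('2'); }

/* Fork a child that registers two handlers and calls exit.  Store in
   ORDER the characters the handlers wrote, in the order they ran.
   Return 0 on success and -1 if the pipe or fork failed.  */
int
observe_atexit_order (char order[3])
{
  int fds[2];
  pid_t pid;
  ssize_t n = 0, got;

  if (pipe (fds) < 0)
    return -1;
  fflush (NULL);                /* don't let the child repeat our output */
  pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      close (fds[0]);
      report_fd = fds[1];
      atexit (first_registered);
      atexit (second_registered);
      exit (EXIT_SUCCESS);      /* runs the handlers, then terminates */
    }
  close (fds[1]);
  while (n < 2 && (got = read (fds[0], order + n, 2 - n)) > 0)
    n += got;
  order[n] = '\0';
  close (fds[0]);
  waitpid (pid, NULL, 0);
  return 0;
}
```

After a successful call the buffer holds the string @samp{21}: the
handler registered last runs first.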

@node Exit Status, Cleanups on Exit, Normal Program Termination, Program Termination
@subsection Exit Status
@cindex exit status

When a program exits, it can return to the parent process a small
amount of information about the cause of termination, using the
@dfn{exit status}.  This is a value between 0 and 255 that the exiting
process passes as an argument to @code{exit}.

Normally you should use the exit status to report very broad information
about success or failure.  You can't provide a lot of detail about the
reasons for the failure, and most parent processes would not want much
detail anyway.

There are conventions for what sorts of status values certain programs
should return.  The most common convention is simply 0 for success and 1
for failure.  Programs that perform comparison use a different
convention: they use status 1 to indicate a mismatch, and status 2 to
indicate an inability to compare.  Your program should follow an
existing convention if an existing convention makes sense for it.

A general convention reserves status values 128 and up for special
purposes.  In particular, the value 128 is used to indicate failure to
execute another program in a subprocess.  This convention is not
universally obeyed, but it is a good idea to follow it in your programs.

@strong{Warning:} Don't try to use the number of errors as the exit
status.  This is actually not very useful; a parent process would
generally not care how many errors occurred.  Worse than that, it does
not work, because the status value is truncated to eight bits.
Thus, if the program tried to report 256 errors, the parent would
receive a report of 0 errors---that is, success.

For the same reason, it does not work to use the value of @code{errno}
as the exit status---these can exceed 255.
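
The truncation is easy to demonstrate.  The sketch below creates a
child process that terminates with a given status and returns the
value that the parent actually receives.  The function name
@code{status_seen_by_parent} is made up for this example, and the child
uses @code{_exit} rather than @code{exit} only so that it skips the
parent's cleanup actions.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that terminates with STATUS and return the exit status
   the parent receives, or -1 on error.  */
int
status_seen_by_parent (int status)
{
  int wstatus;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit (status);             /* child: terminate immediately */
  if (waitpid (pid, &wstatus, 0) < 0 || !WIFEXITED (wstatus))
    return -1;
  return WEXITSTATUS (wstatus); /* only the low-order 8 bits survive */
}
```

With this function, a child that exits with status 3 is reported as 3,
but one that exits with status 256 is reported as 0.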

@strong{Portability note:} Some non-POSIX systems use different
conventions for exit status values.  For greater portability, you can
use the macros @code{EXIT_SUCCESS} and @code{EXIT_FAILURE} for the
conventional status value for success and failure, respectively.  They
are declared in the file @file{stdlib.h}.
@pindex stdlib.h

@comment stdlib.h
@comment ANSI
@deftypevr Macro int EXIT_SUCCESS
This macro can be used with the @code{exit} function to indicate
successful program completion.

On POSIX systems, the value of this macro is @code{0}.  On other
systems, the value might be some other (possibly non-constant) integer
expression.
@end deftypevr

@comment stdlib.h
@comment ANSI
@deftypevr Macro int EXIT_FAILURE
This macro can be used with the @code{exit} function to indicate
unsuccessful program completion in a general sense.

On POSIX systems, the value of this macro is @code{1}.  On other
systems, the value might be some other (possibly non-constant) integer
expression.  Other nonzero status values also indicate failure.  Certain
programs use different nonzero status values to indicate particular
kinds of ``non-success''.  For example, @code{diff} uses status value
@code{1} to mean that the files are different, and @code{2} or more to
mean that there was difficulty in opening the files.
@end deftypevr

@node Cleanups on Exit, Aborting a Program, Exit Status, Program Termination
@subsection Cleanups on Exit

@comment stdlib.h
@comment ANSI
@deftypefun int atexit (void (*@var{function}) (void))
The @code{atexit} function registers the function @var{function} to be
called at normal program termination.  The @var{function} is called with
no arguments.

The return value from @code{atexit} is zero on success and nonzero if
the function cannot be registered.
@end deftypefun

@comment stdlib.h
@comment GNU
@deftypefun int on_exit (void (*@var{function})(int @var{status}, void *@var{arg}), void *@var{arg})
This function is a somewhat more powerful variant of @code{atexit}.  It
accepts two arguments, a function @var{function} and an arbitrary
pointer @var{arg}.  At normal program termination, the @var{function} is
called with two arguments:  the @var{status} value passed to @code{exit},
and the @var{arg}.

This function is a GNU extension, and may not be supported by other
implementations.
@end deftypefun

Here's a trivial program that illustrates the use of @code{exit} and
@code{atexit}:

@example
#include <stdio.h>
#include <stdlib.h>

void bye (void)
@{
  printf ("Goodbye, cruel world....\n");
@}

int main (void)
@{
  atexit (bye);
  exit (EXIT_SUCCESS);
@}
@end example

@noindent
When this program is executed, it just prints the message and exits.


@node Aborting a Program, Termination Internals, Cleanups on Exit, Program Termination
@subsection Aborting a Program
@cindex aborting a program

You can abort your program using the @code{abort} function.  The prototype
for this function is in @file{stdlib.h}.
@pindex stdlib.h

@comment stdlib.h
@comment ANSI
@deftypefun void abort (void)
The @code{abort} function causes abnormal program termination, without
executing functions registered with @code{atexit} or @code{on_exit}.

This function actually terminates the process by raising a
@code{SIGABRT} signal, and your program can include a handler to
intercept this signal; see @ref{Signal Handling}.

@strong{Incomplete:}  Why would you want to define such a handler?
@end deftypefun

@node Termination Internals,  , Aborting a Program, Program Termination
@subsection Termination Internals

The @code{_exit} function is the primitive used for process termination
by @code{exit}.  It is declared in the header file @file{unistd.h}.
@pindex unistd.h

@comment unistd.h
@comment POSIX.1
@deftypefun void _exit (int @var{status})
The @code{_exit} function is the primitive for causing a process to
terminate with status @var{status}.  Calling this function does not
execute cleanup functions registered with @code{atexit} or
@code{on_exit}.
@end deftypefun

When a process terminates for any reason---either by an explicit
termination call, or termination as a result of a signal---the
following things happen:

@itemize @bullet
@item
All open file descriptors in the process are closed.  @xref{Low-Level
Input/Output}.

@item
The low-order 8 bits of the return status code are saved to be reported
back to the parent process via @code{wait} or @code{waitpid}; see
@ref{Process Completion}.

@item
Any child processes of the process being terminated are assigned a new
parent process.  (This is the @code{init} process, with process ID 1.)

@item
A @code{SIGCHLD} signal is sent to the parent process.

@item
If the process is a session leader that has a controlling terminal, then
a @code{SIGHUP} signal is sent to each process in the foreground job,
and the controlling terminal is disassociated from that session.
@xref{Job Control}.

@item
If termination of a process causes a process group to become orphaned,
and any member of that process group is stopped, then a @code{SIGHUP}
signal and a @code{SIGCONT} signal are sent to each process in the
group.  @xref{Job Control}.
@end itemize

@node Creating New Processes,  , Program Termination, Processes
@section Creating New Processes

This section describes how your program can cause other programs to be
executed.  Actually, there are three distinct operations involved:
creating a new child process, causing the new process to execute a
program, and coordinating the completion of the child process with the
original program.

The @code{system} function provides a simple, portable mechanism for
running another program; it does all three steps automatically.  If you
need more control over the details of how this is done, you can use the
primitive functions to do each step individually instead.

@menu
* Running a Command::           The easy way to run another program.
* Process Creation Concepts::   An overview of the hard way to do it.
* Process Identification::      How to get the process ID of a process.
* Creating a Process::          How to fork a child process.
* Executing a File::            How to get a process to execute another
				         program.
* Process Completion::          How to tell when a child process has
				         completed.
* Process Completion Status::   How to interpret the status value
                                         returned from a child process.
* BSD wait Functions::  More functions, for backward
                                         compatibility.
* Process Creation Example::    A complete example program.
@end menu


@node Running a Command, Process Creation Concepts,  , Creating New Processes
@subsection Running a Command
@cindex running a command

The easy way to run another program is to use the @code{system}
function.  This function does all the work of running a subprogram, but
it doesn't give you much control over the details: you have to wait
until the subprogram terminates before you can do anything else.

@pindex stdlib.h

@comment stdlib.h
@comment ANSI
@deftypefun int system (const char *@var{command})
This function executes @var{command} as a shell command.  In the GNU C
library, it always uses the default shell @code{sh} to run the command.
In particular, it searches the directories in @code{PATH} to find
programs to execute.  The return value is @code{-1} if it wasn't
possible to create the shell process, and otherwise is the status of the
shell process.  @xref{Process Completion}, for details on how this
status code can be interpreted.
@pindex sh
@end deftypefun

The @code{system} function is declared in the header file
@file{stdlib.h}.

@strong{Portability Note:} Some C implementations may not have any
notion of a command processor that can execute other programs.  You can
determine whether a command processor exists by executing
@code{system (NULL)}; in this case the return value is nonzero if and
only if such a processor is available.
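
Here is a sketch that combines the availability check with a call to
@code{system} and decodes the result.  The wrapper name
@code{run_command} is an assumption of this example; the macros
@code{WIFEXITED} and @code{WEXITSTATUS} come from @file{sys/wait.h} on
POSIX systems.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Run COMMAND through the shell and return its exit status, or -1 if
   no command processor is available, the shell could not be started,
   or the command did not terminate normally.  */
int
run_command (const char *command)
{
  int status;

  if (system (NULL) == 0)
    return -1;                  /* no command processor */
  status = system (command);
  if (status == -1 || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```

For example, @code{run_command ("exit 3")} returns 3 on a system where
@code{sh} is available.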

The @code{popen} and @code{pclose} functions (@pxref{Pipe to a
Subprocess}) are closely related to the @code{system} function.  They
allow the parent process to communicate with the standard input and
output channels of the command being executed.

@node Process Creation Concepts, Process Identification, Running a Command, Creating New Processes
@subsection Process Creation Concepts

This section gives an overview of processes and of the steps involved in
creating a process and making it run another program.

@cindex process ID
@cindex process lifetime
Each process is named by a @dfn{process ID} number.  A unique process ID
is allocated to each process when it is created.  The @dfn{lifetime} of
a process ends when its termination is reported to its parent process;
at that time, all of the process resources, including its process ID,
are freed.

@cindex creating a process
@cindex forking a process
@cindex child process
@cindex parent process
Processes are created with the @code{fork} system call (so the operation
of creating a new process is sometimes called @dfn{forking} a process).
The @dfn{child process} created by @code{fork} is an exact clone of the
original @dfn{parent process}, except that it has its own process ID.

After forking a child process, both the parent and child processes
continue to execute normally.  If you want your program to wait for a
child process to finish executing before continuing, you must do this
explicitly after the fork operation.  This is done with the @code{wait}
or @code{waitpid} functions (@pxref{Process Completion}).  These
functions give the parent information about why the child
terminated---for example, its exit status code.

A newly forked child process continues to execute the same program as
its parent process, at the point where the @code{fork} call returns.
You can use the return value from @code{fork} to tell whether the program
is running in the parent process or the child.

@cindex process image
Having all processes run the same program is usually not very useful.
But the child can execute another program using one of the @code{exec}
functions; see @ref{Executing a File}.  The program that the process is
executing is called its @dfn{process image}.  Starting execution of a
new program causes the process to forget all about its current process
image; when the new program exits, the process exits too, instead of
returning to the previous process image.


@node Process Identification, Creating a Process, Process Creation Concepts, Creating New Processes
@subsection Process Identification

The @code{pid_t} data type represents process IDs.  You can get the
process ID of the current process by calling @code{getpid}.  The function
@code{getppid} returns the process ID of the parent of the current
process (this is also known as the @dfn{parent process ID}).
Your program should include the header files @file{unistd.h} and
@file{sys/types.h} to use these functions.
@pindex sys/types.h
@pindex unistd.h

@comment sys/types.h
@comment POSIX.1
@deftp {Data Type} pid_t
The @code{pid_t} data type is a signed integer type which is capable
of representing a process ID.  In the GNU library, this is an @code{int}.
@end deftp

@comment unistd.h
@comment POSIX.1
@deftypefun pid_t getpid (void)
The @code{getpid} function returns the process ID of the current process.
@end deftypefun

@comment unistd.h
@comment POSIX.1
@deftypefun pid_t getppid (void)
The @code{getppid} function returns the process ID of the parent of the
current process.
@end deftypefun
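
As a minimal sketch of how these two functions might be used together,
the following helper prints both IDs.  The name @code{report_ids} is
hypothetical and is not part of the library; the cast to @code{long} is
just one portable way to print a @code{pid_t} value.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: print the process ID of the calling process
   and of its parent, and return the caller's own process ID.  */
pid_t
report_ids (void)
{
  pid_t self = getpid ();
  pid_t parent = getppid ();

  /* pid_t is a signed integer type, so converting it to long for
     printing does not lose information on typical systems.  */
  printf ("process %ld, parent %ld\n", (long) self, (long) parent);
  return self;
}
```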

@node Creating a Process, Executing a File, Process Identification, Creating New Processes
@subsection Creating a Process

The @code{fork} function is the primitive for creating a process.
It is declared in the header file @file{unistd.h}.
@pindex unistd.h

@comment unistd.h
@comment POSIX.1
@deftypefun pid_t fork (void)
The @code{fork} function creates a new process.

If the operation is successful, there are then both parent and child
processes and both see @code{fork} return, but with different values: it
returns a value of @code{0} in the child process and returns the child's
process ID in the parent process.  If the child process could not be
created, a value of @code{-1} is returned in the parent process.  The
following @code{errno} error conditions are defined for this function:

@table @code
@item EAGAIN
There aren't enough system resources to create another process, or the
user already has too many processes running.

@item ENOMEM
The process requires more space than the system can supply.
@end table
@end deftypefun
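
The three possible return values of @code{fork} can be told apart as in
the following sketch.  The helper name @code{fork_and_collect} is
hypothetical; it also relies on @code{waitpid} and the status macros,
which are described later in this chapter, and on @code{_exit} so that
the child does not flush output buffers it shares with the parent.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a child that terminates at once with
   exit status CODE.  Returns the status the parent observed, or -1
   if the child could not be created or did not exit normally.  */
int
fork_and_collect (int code)
{
  int status;
  pid_t pid = fork ();

  if (pid < 0)
    {
      /* No child was created; errno says why.  */
      perror ("fork");
      return -1;
    }
  if (pid == 0)
    /* fork returned 0, so this is the child process.  */
    _exit (code);

  /* fork returned the child's process ID, so this is the parent.  */
  if (waitpid (pid, &status, 0) != pid)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```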

The specific attributes of the child process that differ from the
parent process are:

@itemize @bullet
@item
The child process has its own unique process ID.

@item
The parent process ID of the child process is the process ID of its
parent process.

@item
The child process gets its own copies of the parent process's open file
descriptors.  Subsequently changing attributes of the file descriptors
in the parent process won't affect the file descriptors in the child,
and vice versa.  @xref{Control Operations}.

@item
The elapsed processor times for the child process are set to zero;
see @ref{Processor Time}.

@item
The child doesn't inherit file locks set by the parent process.
@xref{Control Operations}.

@item
The child doesn't inherit alarms set by the parent process.
@xref{Setting an Alarm}.

@item
The set of pending signals (@pxref{Delivery of Signal}) for the child
process is cleared.  (The child process inherits its mask of blocked
signals and signal actions from the parent process.)
@end itemize
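
The first two items in this list can be checked directly.  In the
following sketch (the function name @code{child_ids_look_right} is
hypothetical), the parent records its own process ID before forking, and
the child compares that value with what @code{getpid} and
@code{getppid} return afterwards.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a forked child has a process ID
   of its own and sees the caller as its parent, 0 if not, and -1 if
   the check could not be carried out.  */
int
child_ids_look_right (void)
{
  pid_t parent = getpid ();     /* Recorded before the fork.  */
  int status;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;
  if (pid == 0)
    /* Child: report the result through the exit status.  */
    _exit (getpid () != parent && getppid () == parent ? 0 : 1);

  /* Parent: stay alive until the child has been collected, so that
     the child's parent process ID cannot change underneath it.  */
  if (waitpid (pid, &status, 0) != pid || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status) == 0;
}
```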


@comment unistd.h
@comment BSD
@deftypefun pid_t vfork (void)
The @code{vfork} function is similar to @code{fork} but more efficient;
however, there are restrictions you must follow to use it safely.

While @code{fork} makes a complete copy of the calling process's address
space and allows both the parent and child to execute independently,
@code{vfork} does not make this copy.  Instead, the child process
created with @code{vfork} shares its parent's address space until it calls
one of the @code{exec} functions.  In the meantime, the parent process
suspends execution.

You must be very careful not to allow the child process created with
@code{vfork} to modify any global data or even local variables shared
with the parent.  Furthermore, the child process cannot return from (or
do a long jump out of) the function that called @code{vfork}!  This
would leave the parent process's control information very confused.  If
in doubt, use @code{fork} instead.

Some operating systems don't really implement @code{vfork}.  The GNU C
library permits you to use @code{vfork} on all systems, but actually
executes @code{fork} if @code{vfork} isn't available.
@end deftypefun

@node Executing a File, Process Completion, Creating a Process, Creating New Processes
@subsection Executing a File
@cindex executing a file
@cindex @code{exec} functions

This section describes the @code{exec} family of functions, for executing
a file as a process image.  You can use these functions to make a child
process execute a new program after it has been forked.

The functions in this family differ in how you specify the arguments,
but otherwise they all do the same thing.  They are declared in the
header file @file{unistd.h}.
@pindex unistd.h

@comment unistd.h
@comment POSIX.1
@deftypefun int execv (const char *@var{filename}, char *const @var{argv}@t{[]})
The @code{execv} function executes the file named by @var{filename} as a
new process image.

The @var{argv} argument is an array of null-terminated strings that is
used to provide a value for the @code{argv} argument to the @code{main}
function of the program to be executed.  The last element of this array
must be a null pointer.  @xref{Program Arguments}, for information on
how programs can access these arguments.

The environment for the new process image is taken from the
@code{environ} variable of the current process image; see @ref{Environment
Variables}, for information about environments.
@end deftypefun

@comment unistd.h
@comment POSIX.1
@deftypefun int execl (const char *@var{filename}, const char *@var{arg0}, @dots{})
This is similar to @code{execv}, but the @var{argv} strings are
specified individually instead of as an array.  A null pointer must be
passed as the last such argument.
@end deftypefun

@comment unistd.h
@comment POSIX.1
@deftypefun int execve (const char *@var{filename}, char *const @var{argv}@t{[]}, char *const @var{env}@t{[]})
This is similar to @code{execv}, but permits you to specify the environment
for the new program explicitly as the @var{env} argument.  This should
be an array of strings in the same format as for the @code{environ}
variable; see @ref{Environment Access}.
@end deftypefun

@comment unistd.h
@comment POSIX.1
@deftypefun int execle (const char *@var{filename}, const char *@var{arg0}, @dots{}, char *const @var{env}@t{[]})
This is similar to @code{execl}, but permits you to specify the
environment for the new program explicitly.  The environment argument is
passed following the null pointer that marks the last @var{argv}
argument, and should be an array of strings in the same format as for
the @code{environ} variable.
@end deftypefun
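
The position of the environment argument in a call to @code{execle} is
easy to get wrong, so here is a sketch of one way to use it.  The helper
name @code{run_with_variable} is hypothetical, and the example assumes
that a shell is installed as @file{/bin/sh}.  The new program sees an
environment containing only the single string passed as
@var{assignment}.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run COMMAND with /bin/sh (an assumed path) in
   an environment holding only ASSIGNMENT, a string of the form
   "NAME=value".  Returns the command's exit status, or -1.  */
int
run_with_variable (const char *assignment, const char *command)
{
  char *env[2];
  int status;
  pid_t pid;

  env[0] = (char *) assignment;
  env[1] = NULL;                /* The array ends with a null pointer.  */

  pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      /* The environment comes after the null pointer that ends the
         argument list.  */
      execle ("/bin/sh", "sh", "-c", command, (char *) NULL, env);
      _exit (127);              /* Reached only if execle failed.  */
    }
  if (waitpid (pid, &status, 0) != pid || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```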

@comment unistd.h
@comment POSIX.1
@deftypefun int execvp (const char *@var{filename}, char *const @var{argv}@t{[]})
The @code{execvp} function is similar to @code{execv}, except that it
searches the directories listed in the @code{PATH} environment variable
(@pxref{Standard Environment Variables}) to find the full file name of a
file from @var{filename} if @var{filename} does not contain a slash.

This function is useful for executing installed system utility programs,
so that the user can control where to look for them.  It is also useful
in shells, for executing commands typed by the user.
@end deftypefun

@comment unistd.h
@comment POSIX.1
@deftypefun int execlp (const char *@var{filename}, const char *@var{arg0}, @dots{})
This function is like @code{execl}, except that it performs the same
file name searching as the @code{execvp} function.
@end deftypefun
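
As an illustration of the file name search, the following sketch runs a
program named by the first element of an argument vector.  The helper
name @code{run_program} is hypothetical, and the use of 127 as the
failure status merely follows a common shell convention; neither comes
from the library.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program named by ARGV[0], searching
   the directories in PATH if the name contains no slash.  ARGV must
   end with a null pointer.  Returns the program's exit status, 127
   if it could not be executed, or -1 on other failures.  */
int
run_program (char *const argv[])
{
  int status;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      execvp (argv[0], argv);
      /* execvp returns only if it failed.  */
      _exit (127);
    }
  if (waitpid (pid, &status, 0) != pid)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

A caller might build the vector as
@code{char *args[] = @{ "sh", "-c", "exit 3", NULL @}} and pass it to
this helper.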


The size of the argument list and environment list taken together must not
be greater than @code{ARG_MAX} bytes.  @xref{System Parameters}.

@strong{Incomplete:}  The POSIX.1 standard requires some statement here
about how null terminators, null pointers, and alignment requirements
affect the total size of the argument and environment lists.

These functions normally don't return, since execution of a new program
causes the currently executing program to go away completely.  A value
of @code{-1} is returned in the event of a failure.  In addition to the
usual file name syntax errors (@pxref{File Name Errors}), the following
@code{errno} error conditions are defined for these functions:

@table @code
@item E2BIG
The combined size of the new program's argument list and environment list
is larger than @code{ARG_MAX} bytes.

@item ENOEXEC
The specified file can't be executed because it isn't in the right format.

@item ENOMEM
Executing the specified file requires more storage than is available.
@end table

If execution of the new file is successful, the access time field of the
file is updated as if the file had been opened.  @xref{File Times}, for
more details about access times of files.

The point at which the file is closed again is not specified, but
is at some point before the process exits or before another process
image is executed.

Executing a new process image completely changes the contents of memory,
except for the arguments and the environment, but many other attributes
of the process are unchanged:

@itemize @bullet
@item
The process ID and the parent process ID.  @xref{Process Creation Concepts}.

@item
Session and process group membership.  @xref{Job Control Concepts}.

@item
Real user ID and group ID, and supplementary group IDs.  @xref{User/Group
IDs of a Process}.

@item
Pending alarms.  @xref{Setting an Alarm}.

@item
Current working directory and root directory.  @xref{Working Directory}.

@item
File mode creation mask.  @xref{Setting Permissions}.

@item
Process signal mask; see @ref{Process Signal Mask}.

@item
Pending signals; see @ref{Blocking Signals}.

@item
Elapsed processor time associated with the process; see @ref{Processor Time}.
@end itemize

If the set-user-ID and set-group-ID mode bits of the process image file
are set, this affects the effective user ID and effective group ID
(respectively) of the process.  These concepts are discussed in detail
in @ref{User/Group IDs of a Process}.

Signals that are set to be ignored in the existing process image are
also set to be ignored in the new process image.  All other signals are
set to the default action in the new process image.  For more
information about signals, see @ref{Signal Handling}.

File descriptors open in the existing process image remain open in the
new process image, unless they have the @code{FD_CLOEXEC}
(close-on-exec) flag set.  The files that remain open inherit all
attributes of the open file description from the existing process image,
including file locks.  File descriptors are discussed in @ref{Low-Level
Input/Output}.
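
If a descriptor should not be passed on to the new process image, the
close-on-exec flag can be set before calling an @code{exec} function.
Here is a minimal sketch using @code{fcntl}; the helper name
@code{set_cloexec} is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: set the close-on-exec flag on descriptor FD
   while leaving its other descriptor flags alone.  Returns 0 on
   success and -1 on failure.  */
int
set_cloexec (int fd)
{
  int flags = fcntl (fd, F_GETFD);

  if (flags < 0)
    return -1;
  return fcntl (fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}
```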

Streams, by contrast, cannot survive through @code{exec} functions,
because they are located in the memory of the process itself.  The new
process image has no streams except those it creates afresh.  Each of
the streams in the pre-@code{exec} process image has a descriptor inside
it, and these descriptors do survive through @code{exec} (provided that
they do not have @code{FD_CLOEXEC} set).  The new process image can
reconnect these to new streams using @code{fdopen}.

@node Process Completion, Process Completion Status, Executing a File, Creating New Processes
@subsection Process Completion
@cindex process completion
@cindex waiting for completion of child process
@cindex testing exit status of child process

The functions described in this section are used to wait for a child
process to terminate or stop, and determine its status.  These functions
are declared in the header file @file{sys/wait.h}.
@pindex sys/wait.h

@comment sys/wait.h
@comment POSIX.1
@deftypefun pid_t waitpid (pid_t @var{pid}, int *@var{status_ptr}, int @var{options})
The @code{waitpid} function is used to request status information from a
child process whose process ID is @var{pid}.  Normally, the calling
process is suspended until the child process makes status information
available by terminating.

Other values for the @var{pid} argument have special interpretations.  A
value of @code{-1} or @code{WAIT_ANY} requests status information for
any child process; a value of @code{0} or @code{WAIT_MYPGRP} requests
information for any child process in the same process group as the
calling process; and any other negative value @minus{} @var{pgid}
requests information for any child process whose process group ID is
@var{pgid}.

If status information for a child process is available immediately, this
function returns immediately without waiting.  If more than one eligible
child process has status information available, one of them is chosen
randomly, and its status is returned immediately.  To get the status
of the other child processes, you need to call @code{waitpid} again.

The @var{options} argument is a bit mask.  Its value should be the
bitwise OR (that is, the @samp{|} operator) of zero or more of the
@code{WNOHANG} and @code{WUNTRACED} flags.  You can use the
@code{WNOHANG} flag to indicate that the parent process shouldn't wait;
and the @code{WUNTRACED} flag to request status information from stopped
processes as well as processes that have terminated.

The status information from the child process is stored in the object
that @var{status_ptr} points to, unless @var{status_ptr} is a null pointer.

The return value is normally the process ID of the child process whose
status is reported.  If the @code{WNOHANG} option was specified and no
child process is waiting to be noticed, a value of zero is returned.  A
value of @code{-1} is returned in case of error.  The following
@code{errno} error conditions are defined for this function:

@table @code
@item EINTR
The function was interrupted by delivery of a signal to the calling
process.

@item ECHILD
There are no child processes to wait for, or the specified @var{pid}
is not a child of the calling process.

@item EINVAL
An invalid value was provided for the @var{options} argument.
@end table
@end deftypefun
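
The @code{WNOHANG} flag makes it possible to poll for a child instead
of blocking.  The following sketch checks once per second until a time
limit runs out.  The helper name @code{wait_with_timeout} and the
one-second polling interval are assumptions made for illustration.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: poll for the termination of child PID for at
   most TIMEOUT seconds.  Returns 1 if the child terminated (its
   status is then stored in *STATUS_PTR), 0 if it is still running
   when the time is up, and -1 on error.  */
int
wait_with_timeout (pid_t pid, int *status_ptr, unsigned int timeout)
{
  for (;;)
    {
      pid_t result = waitpid (pid, status_ptr, WNOHANG);

      if (result == pid)
        return 1;
      if (result < 0 && errno != EINTR)
        return -1;
      /* A zero result means the child has nothing to report yet.  */
      if (timeout == 0)
        return 0;
      sleep (1);
      timeout--;
    }
}
```

With a @var{timeout} of zero, the helper performs a single non-blocking
check.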

These symbolic constants are defined as values for the @var{pid} argument
to the @code{waitpid} function.

@table @code
@item WAIT_ANY
This constant macro (whose value is @code{-1}) specifies that
@code{waitpid} should return status information about any child process.

@item WAIT_MYPGRP
This constant (with value @code{0}) specifies that @code{waitpid} should
return status information about any child process in the same process
group as the calling process.

@end table

These symbolic constants are defined as flags for the @var{options}
argument to the @code{waitpid} function.  You can bitwise-OR the flags
together to obtain a value to use as the argument.

@table @code

@item WNOHANG
This flag specifies that @code{waitpid} should return immediately
instead of waiting if there is no child process ready to be noticed.

@item WUNTRACED
This macro is used to specify that @code{waitpid} should also report the
status of any child processes that have been stopped as well as those
that have terminated.
@end table

@comment sys/wait.h
@comment POSIX.1
@deftypefun pid_t wait (int *@var{status_ptr})
This is a simplified version of @code{waitpid}, and is used to wait
until any one child process terminates.

@example
wait (&status)
@end example

@noindent
is equivalent to:

@example
waitpid (-1, &status, 0)
@end example

Here's an example of how to use @code{waitpid} to get the status from
all child processes that have terminated, without ever waiting.  This
function is designed to be used as a handler for @code{SIGCHLD}, the
signal that indicates that at least one child process has terminated.

@example
void
sigchld_handler (int signum)
@{
  pid_t pid;
  int status;
  while (1) @{
    pid = waitpid (WAIT_ANY, &status, WNOHANG);
    if (pid < 0) @{
      perror ("waitpid");
      break;
    @}
    if (pid == 0)
      break;
    notice_termination (pid, status);
  @}
@}
@end example
@end deftypefun
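
When a program simply wants to wait for all of its children, a loop
around @code{wait} is enough.  The following sketch creates a few
children that terminate immediately and then collects them.  The helper
name @code{spawn_and_reap} is hypothetical, and the sketch assumes the
caller has no other child processes.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create COUNT children that terminate at once,
   then wait for every child of the calling process.  Returns the
   number of children collected.  */
int
spawn_and_reap (int count)
{
  int reaped = 0;
  int i;

  for (i = 0; i < count; i++)
    {
      pid_t pid = fork ();

      if (pid < 0)
        break;                  /* Collect the children created so far.  */
      if (pid == 0)
        _exit (0);              /* Child: terminate immediately.  */
    }

  /* Once no children remain, wait returns -1 with errno set to
     ECHILD, which ends the loop.  */
  for (;;)
    {
      int status;
      pid_t pid = wait (&status);

      if (pid > 0)
        reaped++;
      else if (errno != EINTR)
        break;
    }
  return reaped;
}
```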

@node Process Completion Status, BSD wait Functions, Process Completion, Creating New Processes
@subsection Process Completion Status

If the exit status value (@pxref{Program Termination}) of the child
process is zero, then the status value reported by @code{waitpid} or
@code{wait} is also zero.  You can test for other kinds of information
encoded in the returned status value using the following macros.
These macros are defined in the header file @file{sys/wait.h}.
@pindex sys/wait.h

@comment sys/wait.h
@comment POSIX.1
@deftypefn Macro int WIFEXITED (int @var{status})
This macro returns a non-zero value if the child process terminated
normally with @code{exit} or @code{_exit}.
@end deftypefn

@comment sys/wait.h
@comment POSIX.1
@deftypefn Macro int WEXITSTATUS (int @var{status})
If @code{WIFEXITED} is true of @var{status}, this macro returns the
low-order 8 bits of the exit status value from the child process.
@end deftypefn

@comment sys/wait.h
@comment POSIX.1
@deftypefn Macro int WIFSIGNALED (int @var{status})
This macro returns a non-zero value if the child process terminated
by receiving a signal that was not handled.
@end deftypefn

@comment sys/wait.h
@comment POSIX.1
@deftypefn Macro int WTERMSIG (int @var{status})
If @code{WIFSIGNALED} is true of @var{status}, this macro returns the
number of the signal that terminated the child process.
@end deftypefn

@comment sys/wait.h
@comment BSD
@deftypefn Macro int WCOREDUMP (int @var{status})
This macro returns a non-zero value if the child process terminated
and produced a core dump.
@end deftypefn

@comment sys/wait.h
@comment POSIX.1
@deftypefn Macro int WIFSTOPPED (int @var{status})
This macro returns a non-zero value if the child process is stopped.
@end deftypefn

@comment sys/wait.h
@comment POSIX.1
@deftypefn Macro int WSTOPSIG (int @var{status})
If @code{WIFSTOPPED} is true of @var{status}, this macro returns the
number of the signal that caused the child process to stop.
@end deftypefn
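
Taken together, these macros let a program sort a status value into one
of a few cases.  The sketch below shows one possible way to do so.  The
names @code{child_fate}, @code{classify_status} and
@code{classify_command} are hypothetical.  The second function assumes,
as POSIX specifies, that @code{system} returns a status in the same
format that @code{waitpid} reports.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical classification of a status value.  */
enum child_fate { CHILD_EXITED, CHILD_SIGNALED, CHILD_STOPPED, CHILD_UNKNOWN };

/* Classify STATUS as reported by wait or waitpid.  The exit status or
   signal number, whichever applies, is stored in *DETAIL.  */
enum child_fate
classify_status (int status, int *detail)
{
  if (WIFEXITED (status))
    {
      *detail = WEXITSTATUS (status);
      return CHILD_EXITED;
    }
  if (WIFSIGNALED (status))
    {
      *detail = WTERMSIG (status);
      return CHILD_SIGNALED;
    }
  if (WIFSTOPPED (status))
    {
      *detail = WSTOPSIG (status);
      return CHILD_STOPPED;
    }
  *detail = 0;
  return CHILD_UNKNOWN;
}

/* Run COMMAND with system and classify the resulting status.  */
enum child_fate
classify_command (const char *command, int *detail)
{
  int status = system (command);

  if (status == -1)
    {
      /* system could not create or collect the child process.  */
      *detail = 0;
      return CHILD_UNKNOWN;
    }
  return classify_status (status, detail);
}
```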


@node BSD wait Functions, Process Creation Example, Process Completion Status, Creating New Processes
@subsection BSD Process Completion Functions

The GNU library also provides these related facilities for compatibility
with BSD Unix.  BSD uses the @code{union wait} data type to represent
status values rather than an @code{int}.  The two representations are
actually interchangeable; they describe the same bit patterns.  The macros
such as @code{WEXITSTATUS} are defined so that they will work on either
kind of object, and the @code{wait} function is defined to accept either
type of pointer as its @var{status_ptr} argument.

These functions are declared in @file{sys/wait.h}.
@pindex sys/wait.h

@comment sys/wait.h
@comment BSD
@deftp {union Type} wait
This data type represents program termination status values.  It has
the following members:

@table @code
@item int w_termsig
This member is equivalent to the @code{WTERMSIG} macro.

@item int w_coredump
This member is equivalent to the @code{WCOREDUMP} macro.

@item int w_retcode
This member is equivalent to the @code{WEXITSTATUS} macro.

@item int w_stopsig
This member is equivalent to the @code{WSTOPSIG} macro.
@end table

Instead of accessing these members directly, you should use the
equivalent macros.
@end deftp

@comment sys/wait.h
@comment BSD
@deftypefun pid_t wait3 (union wait *@var{status_ptr}, int @var{options}, void *@var{usage})
If @var{usage} is a null pointer, this function is equivalent to
@code{waitpid (-1, @var{status_ptr}, @var{options})}.

The @var{usage} argument may also be a pointer to a
@code{struct rusage} object.  Information about system resources used by
terminated processes (but not stopped processes) is returned in this
structure.

@strong{Incomplete:}  The description of the @code{struct rusage} structure
hasn't been written yet.  Put in a cross-reference here.
@end deftypefun

@comment sys/wait.h
@comment BSD
@deftypefun pid_t wait4 (pid_t @var{pid}, union wait *@var{status_ptr}, int @var{options}, void *@var{usage})
If @var{usage} is a null pointer, this function is equivalent to
@code{waitpid (@var{pid}, @var{status_ptr}, @var{options})}.

The @var{usage} argument may also be a pointer to a
@code{struct rusage} object.  Information about system resources used by
terminated processes (but not stopped processes) is returned in this
structure.

@strong{Incomplete:}  The description of the @code{struct rusage} structure
hasn't been written yet.  Put in a cross-reference here.
@end deftypefun

@node Process Creation Example,  , BSD wait Functions, Creating New Processes
@subsection Process Creation Example

Here is an example program showing how you might write a function
similar to the library function @code{system}.  It executes its
@var{command} argument using the equivalent of @samp{sh -c @var{command}}.

@example
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* @r{Execute the command using this shell program.}  */
#define SHELL "/bin/sh"

int
my_system (char *command)
@{
  int status;
  pid_t pid;

  pid = fork ();
  if (pid == 0) @{
    /* @r{This is the child process.  Execute the shell command.} */
    execl (SHELL, SHELL, "-c", command, NULL);
    exit (EXIT_FAILURE);
  @}
  else if (pid < 0)
    /* @r{The fork failed.  Report failure.}  */
    status = -1;
  else @{
    /* @r{This is the parent process.  Wait for the child to complete.}  */
    if (waitpid (pid, &status, 0) != pid)
      status = -1;
  @}
  return status;
@}
@end example

@comment Yes, this example has been tested.

There are a couple of things you should pay attention to in this
example.

Remember that the first @code{argv} argument supplied to the program
represents the name of the program being executed.  That is why, in the
call to @code{execl}, @code{SHELL} is supplied once to name the program
to execute and a second time to supply a value for @code{argv[0]}.

The @code{execl} call in the child process doesn't return if it is
successful.  If it fails, you must do something to make the child
process terminate.  Just returning a bad status code with @code{return}
would leave two processes running the original program.  Instead, the
right behavior is for the child process to report failure to its parent
process.  To do this, @code{exit} is called with a failure status.