A process is a program in execution. The operating system represents each process with a process control block (PCB). In Linux the PCB is a C structure called task_struct, declared in <linux/sched.h>. It stores all the information and resources associated with a process, including:
- Process state
  - new - the process is being created
  - running - instructions are being executed
  - waiting - the process is waiting for some event to occur
  - ready - the process is waiting to be assigned to a processor
  - terminated - the process has finished execution
- CPU registers - all registers used during execution
- Program counter - the register that holds the address of the next instruction to execute
- CPU-scheduling information - process priority and other scheduling parameters
- Memory-management information
- Accounting information - amount of CPU and real time used, time limits, process number
- I/O status information - a list of open files
All PCBs are stored in an array called the process table.
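The fields listed above can be pictured with a small sketch. This is a hypothetical, heavily simplified PCB and process table, not the real task_struct; every name in it (struct pcb, pcb_alloc, pcb_find, MAX_PROCS) is an assumption made for illustration.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, simplified PCB. The real Linux task_struct is far larger. */
enum proc_state { PROC_NEW, PROC_READY, PROC_RUNNING, PROC_WAITING, PROC_TERMINATED };

#define MAX_PROCS 64        /* assumed table size */
#define MAX_OPEN_FILES 16   /* assumed per-process limit */

struct pcb {
    pid_t pid;                        /* process number */
    enum proc_state state;            /* process state */
    unsigned long program_counter;    /* address of the next instruction */
    unsigned long registers[16];      /* saved CPU registers */
    int priority;                     /* CPU-scheduling information */
    unsigned long cpu_time_used;      /* accounting information */
    int open_files[MAX_OPEN_FILES];   /* I/O status information */
    int in_use;                       /* 1 if this table slot is occupied */
};

/* The process table as a fixed-size array of PCBs. */
static struct pcb process_table[MAX_PROCS];

/* Claim a free slot for a new process; returns NULL if the table is full. */
struct pcb *pcb_alloc(pid_t pid, int priority) {
    for (size_t i = 0; i < MAX_PROCS; i++) {
        if (!process_table[i].in_use) {
            struct pcb *p = &process_table[i];
            *p = (struct pcb){0};
            p->in_use = 1;
            p->pid = pid;
            p->priority = priority;
            p->state = PROC_NEW;
            return p;
        }
    }
    return NULL;
}

/* Look up a PCB by pid; returns NULL if no such process is in the table. */
struct pcb *pcb_find(pid_t pid) {
    for (size_t i = 0; i < MAX_PROCS; i++)
        if (process_table[i].in_use && process_table[i].pid == pid)
            return &process_table[i];
    return NULL;
}
```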
Each new process is initially put in a job queue, which contains all the processes in the system. Processes that are ready to execute are kept in a ready queue, stored as a linked list. The list of processes waiting for a particular device is called a device queue.
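A ready queue stored as a linked list might look like the following minimal sketch. The names (struct task, struct ready_queue, rq_enqueue, rq_dequeue) are hypothetical, and a real kernel keeps much more state per entry.

```c
#include <stddef.h>

/* Hypothetical queue entry: only the pid and the link to the next entry. */
struct task {
    int pid;
    struct task *next;
};

/* FIFO queue with head and tail pointers, so both operations are O(1). */
struct ready_queue {
    struct task *head;
    struct task *tail;
};

/* Append a task at the tail of the queue. */
void rq_enqueue(struct ready_queue *q, struct task *t) {
    t->next = NULL;
    if (q->tail != NULL)
        q->tail->next = t;
    else
        q->head = t;        /* queue was empty */
    q->tail = t;
}

/* Remove and return the task at the head, or NULL if the queue is empty. */
struct task *rq_dequeue(struct ready_queue *q) {
    struct task *t = q->head;
    if (t == NULL)
        return NULL;
    q->head = t->next;
    if (q->head == NULL)
        q->tail = NULL;     /* queue became empty */
    t->next = NULL;
    return t;
}
```

A device queue could reuse the same structure, with one queue per device.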
The process scheduler is in charge of choosing which process executes next; it is the component that moves a process among the various queues. There are usually at least two schedulers:
- short-term scheduler - selects among the processes that are already loaded in memory. It must run very frequently, since a process may execute for only a few milliseconds before waiting for some event. A new process is typically selected at least once every 100 milliseconds.
- long-term scheduler - selects the processes to be loaded into memory, and in this way controls the degree of multiprogramming. To keep that degree stable, the average rate of process creation should equal the average rate of process termination.
- medium-term scheduler - (not so common) decreases the degree of multiprogramming by removing some processes from memory. Later they can be brought back into memory and their execution resumed. This scheme is called swapping.
The long-term scheduler must choose carefully which process to load next: it is important to keep a good mix of I/O-bound processes (those that spend more of their time doing I/O than computation) and CPU-bound processes (those that spend more of their time doing computation and rarely issue I/O requests).
When an interrupt occurs, the CPU must handle it, so the system needs to save the context of the running process (stored in its PCB) in order to restore it later. Saving the state of one process and restoring the state of another is called a context switch. The technique of granting the CPU to different processes in turn is called time sharing.
Operations on Processes
During its execution, a process may create several new processes (a parent process creates child processes). Each of these child processes may in turn create other processes, forming a tree of processes. To keep the structure well organized, each process is identified by a unique integer called the process identifier (pid). The init process, which is the root of the process tree, has a pid of 1. When a process creates a child, the child obtains its resources either from the operating system or from the parent process.
The fork() function creates a new process that is an almost exact copy of the parent process. Both processes continue execution at the instruction after fork() with a different return value: 0 in the child and the pid of the child in the parent. On error, -1 is returned in the parent and no child is created.
Modern implementations don’t perform a complete copy of the parent’s data; instead, a technique called copy-on-write is used. Initially the memory is shared between parent and child; if either process tries to modify a shared region, the kernel makes a copy of that piece of memory only.
A nice usage of the fork call might put the fork inside a
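One common way to handle the three possible return values of fork() is a switch statement. The sketch below is a minimal illustration under that assumption; the helper name fork_and_collect is hypothetical, and the parent uses wait() only so that it can report what the child did.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with child_code.
 * Returns the child's exit code as seen by the parent, or -1 on failure. */
int fork_and_collect(int child_code) {
    pid_t pid = fork();

    switch (pid) {
    case -1:
        /* fork failed: no child was created */
        return -1;
    case 0:
        /* child: fork returned 0 */
        _exit(child_code);
    default: {
        /* parent: fork returned the pid of the child */
        int status;
        if (wait(&status) == -1)
            return -1;
        return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }
    }
}
```

The child calls _exit() rather than returning, so it never runs the parent's remaining code.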
The fork function also duplicates all the file descriptors that are open in the parent. There are, in general, two ways of handling the descriptors after a fork:
- Parent and child go their own ways. In this case each of them keeps open only the descriptors of the files it will use and closes the others (usually the child executes another program with the
- The parent waits for the child to complete using the wait() system call. In this case the file offsets seen by the parent are updated according to the operations the child performed on the files.
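The second case can be observed directly: a descriptor duplicated by fork() shares its file offset with the original. The following is a sketch under a few assumptions: the helper name offset_after_child_write is hypothetical, a temporary file is created under /tmp, and waitpid() is used in place of wait().

```c
#define _POSIX_C_SOURCE 200809L  /* for mkstemp() under strict compiler modes */
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child writes len bytes through an inherited
 * descriptor; the parent then reports its own offset on that descriptor.
 * Returns the offset, or -1 on failure. */
off_t offset_after_child_write(const char *msg, size_t len) {
    char path[] = "/tmp/fork_offset_XXXXXX";   /* assumed writable location */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);            /* the file disappears once fd is closed */

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* child: write through the duplicated descriptor */
        ssize_t n = write(fd, msg, len);
        _exit(n == (ssize_t)len ? 0 : 1);
    }

    /* parent: wait for the child, then read the shared offset */
    int status;
    off_t offset = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        offset = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return offset;
}
```

The parent never writes to the file itself, yet its offset has advanced by the number of bytes the child wrote.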
The vfork() function creates a new process for the purpose of executing a different program. It creates a new process as fork() does, but without copying the address space of the parent into the child, because the child is not expected to need it.
However, until the child calls exec(), it runs in the address space of its parent. Any change the child makes to the data, heap or stack segment will be visible to the parent once it resumes. After a vfork(), the parent is suspended until the child calls either exec() or exit. It is also guaranteed that after a vfork() the child is scheduled before the parent (this is not guaranteed for fork()).
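A minimal sketch of the vfork() pattern follows. It assumes a shell exists at /bin/sh, and the helper name run_shell_with_vfork is hypothetical. Because the child runs in the parent's address space, it does nothing except call execve() and, if that fails, _exit().

```c
#define _DEFAULT_SOURCE  /* glibc: expose vfork() under strict compiler modes */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `sh -c command` in a vfork()ed child.
 * Returns the exit code of the shell, or -1 on failure. */
int run_shell_with_vfork(const char *command) {
    /* Everything the child needs is prepared before vfork(), so the child
     * does not have to modify any memory it shares with the parent. */
    char *const argv[] = { "sh", "-c", (char *)command, NULL };
    char *const envp[] = { NULL };

    pid_t pid = vfork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execve("/bin/sh", argv, envp);   /* assumed shell location */
        _exit(127);                      /* only reached if execve() failed */
    }

    /* The parent resumes here once the child has called execve() or _exit(). */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```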
Executing new program
After a child process is forked, it is usually used to execute a completely different program. A new program can be loaded into a child process with the execve() system call.
execve() takes as parameters pathname (the path of the new program to be loaded into memory), argv (the command-line arguments) and envp (the environment list for the new program). A successful execve() never returns; if an error occurs, the return value is -1.
Several library functions wrap the execve() system call and are used in different scenarios.
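The three parameters of execve() can be seen in a short fork-then-exec sketch. The helper name run_program is hypothetical, and the exit code 127 for a failed execve() is a convention borrowed from shells rather than something execve() itself defines.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at `path` with the given argument
 * vector and environment list in a child process.
 * Returns the program's exit code, 127 if execve() failed, or -1 on error. */
int run_program(const char *path, char *const argv[], char *const envp[]) {
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execve(path, argv, envp);
        /* execve() only returns on error (with -1) */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing "/bin/sh" as path (assuming a shell is installed there), { "sh", "-c", "test \"$GREETING\" = hello", NULL } as argv and { "GREETING=hello", NULL } as envp makes the shell see exactly the environment given in envp.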
The wait() system call allows the parent to wait until a child process completes. When a process terminates, it always sends the parent a
The wait() system call returns the pid of the child that terminated, or -1 on error, and takes a pointer to an int in which the termination status is stored. It is possible to be notified when the child is stopped by a signal (SIGTTIN) or when it is resumed (
The waitpid() system call provides more functionality. It can behave like wait() by setting pid to -1 (in this case it waits for any child process).
In addition, if pid is greater than 0, it waits for that specific process to change state. This system call also accepts useful options as a bitmask:
- WUNTRACED - also return information when a process is stopped.
- WCONTINUED - also return information when a stopped process is resumed (by a
- WNOHANG - if no child specified by pid has yet changed state, return immediately with a value of 0.
<sys/wait.h> defines macros that can be used to examine a wait status value:
- WIFEXITED(status) - returns true when the child process exited normally.
- WIFSIGNALED(status) - returns true when the child process was killed by a signal; in that case WTERMSIG(status) returns the number of the signal that terminated the process.
- WIFSTOPPED(status) - returns true when the child process was stopped by a signal; in that case WSTOPSIG(status) returns the number of the signal that stopped the process.
- WIFCONTINUED(status) - returns true when the child process was resumed by
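The WNOHANG option and the status macros can be combined as in the following sketch. The helper names spawn_child and wait_and_decode are hypothetical, as are the 1 ms polling interval and the convention of returning a negative signal number.

```c
#define _POSIX_C_SOURCE 200809L  /* for nanosleep() under strict compiler modes */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that either sends itself `sig`
 * (when sig != 0) or exits with exit_code. Returns the child's pid, or -1. */
pid_t spawn_child(int exit_code, int sig) {
    pid_t pid = fork();
    if (pid == 0) {
        if (sig != 0)
            raise(sig);          /* e.g. SIGKILL terminates the child here */
        _exit(exit_code);
    }
    return pid;
}

/* Hypothetical helper: poll the child with WNOHANG, then decode its status.
 * Returns the exit code (>= 0) after a normal exit, minus the signal number
 * if the child was killed by a signal, or -1000 on error. */
int wait_and_decode(pid_t pid) {
    const struct timespec pause = { 0, 1000000 };   /* 1 ms between polls */
    int status;
    pid_t r;

    /* waitpid() returns 0 while the child has not yet changed state. */
    while ((r = waitpid(pid, &status, WNOHANG)) == 0)
        nanosleep(&pause, NULL);
    if (r == -1)
        return -1000;

    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);
    return -1000;
}
```

In a real program the parent would do useful work between polls instead of sleeping.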
A process terminates when it calls the exit() system call, when it reaches the return statement of main, or when there is an abnormal termination (it calls abort() or receives certain signals). On termination the kernel closes all open descriptors and releases the memory the process was using.
In practice, programs usually call instead the library function
When a process ends, its entry in the process table (its PCB) must remain stored until its parent calls wait() (or waitpid()), because it holds the process’s exit status. A process that has terminated but whose parent has not yet called wait() is called a zombie process. It may also happen that the parent terminates without calling wait(); in that case the child becomes an orphan process. An orphan process is assigned as a child to the init process, which periodically invokes wait() to free the resources.
On some systems, when a parent terminates all its children are terminated too; this is called cascading termination.
Interprocess Communication (IPC)
Process cooperation can drastically increase the performance of a multiprocess program. There are two fundamental models of IPC: message passing and shared memory.
Message passing is often used in distributed systems, where the communicating processes reside on different computers connected by a network. It is usually easier to implement than shared memory, but it is also slower.
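On a single machine, one simple form of message passing is a pipe between a parent and its child. This is only a sketch: the helper name receive_from_child is hypothetical, and processes on different computers would use network sockets rather than a pipe.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child sends msg through a pipe and the parent
 * receives it into buf. Returns the number of bytes received, or -1. */
ssize_t receive_from_child(const char *msg, char *buf, size_t buflen) {
    int fds[2];                    /* fds[0] = read end, fds[1] = write end */
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* child: the sender only needs the write end */
        close(fds[0]);
        size_t len = strlen(msg);
        ssize_t n = write(fds[1], msg, len);
        _exit(n == (ssize_t)len ? 0 : 1);
    }

    /* parent: close the write end so read() reports end-of-file
     * once the child has closed its copy */
    close(fds[1]);
    size_t total = 0;
    ssize_t n;
    while (total < buflen &&
           (n = read(fds[0], buf + total, buflen - total)) > 0)
        total += (size_t)n;
    close(fds[0]);
    waitpid(pid, NULL, 0);         /* reap the child */
    return (ssize_t)total;
}
```

The kernel copies the data from sender to receiver, so the two processes never touch each other's memory.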
Shared memory requires the communicating processes to establish a region of shared memory. Since several processes might access that region simultaneously, each process should access it only under some synchronization mechanism (mutex, semaphore).
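A shared region can be sketched with an anonymous shared mapping created before fork(). Note that this example swaps in a C11 atomic counter for synchronization instead of a mutex or semaphore, purely to stay short; it assumes atomic_long is lock-free on the target platform, and the helper name shared_counter_demo is hypothetical.

```c
#define _DEFAULT_SOURCE  /* glibc: expose MAP_ANONYMOUS under strict modes */
#include <stdatomic.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: parent and child each add 1 to a shared counter
 * `increments` times. Returns the final value, or -1 on failure. */
long shared_counter_demo(long increments) {
    /* MAP_SHARED makes the region visible to both processes after fork(). */
    atomic_long *counter = mmap(NULL, sizeof *counter,
                                PROT_READ | PROT_WRITE,
                                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (counter == MAP_FAILED)
        return -1;
    atomic_init(counter, 0);

    pid_t pid = fork();
    if (pid == -1) {
        munmap(counter, sizeof *counter);
        return -1;
    }

    /* Both processes run this loop; the atomic add keeps concurrent
     * updates from being lost. */
    for (long i = 0; i < increments; i++)
        atomic_fetch_add(counter, 1);

    if (pid == 0)
        _exit(0);

    /* parent: wait for the child before reading the final value */
    int status;
    long result = -1;
    if (waitpid(pid, &status, 0) == pid)
        result = atomic_load(counter);
    munmap(counter, sizeof *counter);
    return result;
}
```

With a plain long and no synchronization, some increments could be lost when both processes update the counter at the same time.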