A process is basically an invocation of a program. Whenever you type `ls', a new process is created (and destroyed when it's finished). The shell you typed `ls' in is a process. The xterm the shell is running in is a process. Your window manager is one or more processes. X is a process.
Your window manager (usually) provides lots of different ways to create processes (run programs), plus some ways of managing them (you can close or destroy the window the process is using, which often results in the process terminating as well). Your shell provides more: it also allows processes to be suspended, restarted, put "in the background", etc.
But sometimes this isn't enough, which is where tools such as ps, top, nice, kill, etc. come in.
Back in the "good old days" when everybody used character-based terminals, it was convenient to be able to "drop out" of a running program and return to the shell, run a few other commands, and return to the program you were using. Thus most shells recognise a "suspend" character (typically Ctrl-Z) which allows one to temporarily stop a running program and return to the shell. When one is ready to return to the program, one simply types `fg' (foreground) and the last suspended program is resumed.
Since one can have multiple programs suspended, it is useful to be able to get a list of them. That's where the `jobs' command comes in:
wharvey@bowman% jobs
[1]  + Suspended     vi Mmakefile
[2]    Suspended     info make
[3]  - Exit 1        cvs diff -u |&
       Suspended     less
[4]    Suspended     info texinfo
[5]    Suspended     vi ~/TODO
[8]    Suspended     info -f /usr/info/gcc
[9]    Suspended     info -f ~/mercury/mercury/info/mercury
wharvey@bowman%
The first column gives the "job number". The `+' in the second column marks the most recently suspended job, while the `-' marks the next most recently suspended job. The third column gives the process status, while the fourth lists the command executed to create the job. Note that job three consists of two processes connected by a pipe, and the status of each is given independently.
Of course a list of jobs isn't much good unless you can do more than just resume the last one, so it should be no surprise that you can indeed do more than this. `%n' resumes job number `n'. `%' and `%%' are synonyms for `fg' and resume the "current" job. `%-' resumes the "previous" job. You can also use `fg %n' (and `fg %', `fg %-', etc.) if you like.
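If you have no more use for a job and wish to terminate it, you can use the `kill' command: `kill %n' kills job `n'. This is similar to typing `Ctrl-C' when the program is in the foreground, except that `kill' sends the `TERM' signal rather than the `INT' signal generated by `Ctrl-C'. It is particularly handy when a program ignores `Ctrl-C' but will recognise `Ctrl-Z': suspend the program, then kill the job.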
You can also tell the shell to make a process continue to run "in the background" --- that is, if a program does not need to interact with the user, it can be "detached" from the terminal and continue to run while you do other things (though if it produces lots of output and you haven't redirected it elsewhere, it still gets printed on the screen, which can make doing other things with that shell awkward). To make job `n' run in the background, type `bg %n'.
Note that if you wish to have the program run in the background from the beginning, you can just put a `&' at the end of the command and the shell will start it in the background for you. Still, I often forget, and it's convenient to be able to just hit `Ctrl-Z' and type `bg' to fix it.
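As a rough sketch of the `&' form used from a script, assuming a Bourne-style shell (`sh' or `bash') rather than the `tcsh' shown in the session above: `$!' holds the process ID of the most recently started background job, and `wait' blocks until that job finishes. The `sleep 1' command is only a stand-in for a long-running program.

```shell
# Start a job in the background from the beginning with a trailing `&'.
sleep 1 &
bg_pid=$!             # PID of the most recent background job
echo "started background job with PID $bg_pid"

# The shell is free to do other work here while the job runs.

wait "$bg_pid"        # block until the background job finishes
bg_status=$?          # exit status of the job that was waited for
echo "background job exited with status $bg_status"
```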
One kind of program you'll often want to run in the background is an X program (assuming you don't start it directly from the window manager). For instance I have an alias `xvi' which runs `xterm' with appropriate options so it opens a new window and starts editing a file. If this `xterm' were just run in the background (with `&'), then each of these edit sessions would appear in my jobs list (as `Running'). I often start a dozen or so of these edit sessions from the one shell, and since I typically don't want to suspend or resume them (after all, they're "running" in a separate window), I'd prefer it if they didn't clutter up the job list. The trick here is to start such programs in a "sub shell". E.g. `(netscape &)', `(exmh &)'.
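The effect of the sub shell trick can be checked with a small sketch like the following. It assumes a Bourne-style shell, and uses `sleep' as a stand-in for an X program.

```shell
# Start a stand-in program from a sub shell. The sub shell exits at once,
# so the current shell never records the program as one of its jobs.
(sleep 1 &)

# The job list of the current shell is therefore still empty.
job_list=$(jobs)
echo "jobs listed: ${job_list:-none}"
```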
Sometimes the process management facilities of the shell are not sufficient;
for example when the process you're wanting to manage is not a job created
by a shell, or the shell which created the job is no longer around or not
accessible. This is where tools such as `ps' and `top' start becoming really
useful.
ps
`ps' is like `jobs', except that it can list more detailed information, and can give information about all processes running on the machine, not just those jobs being managed by the shell. `ps' has many options, and I encourage you to read the man page to find out about some of them. But here are the ones I use the most:
    l    "long" format
    u    "user" format
    a    include processes owned by all users, not just your own
    x    include processes without a controlling terminal
An example extract of the "long" format:
FLAGS  UID   PID  PPID PRI NI   SIZE   RSS WCHAN       STA TTY  TIME COMMAND
  100  776   325     1   0  0   1500     0 wait4       SW    1  0:00 (login)
   40  776  8386     1   0  0    928   168 wait_for_co S    p5  5:37 /home/wha
    0  776 15257     1   0  0 112680 17428 do_select   S    p2 28:02 /usr/lib/
    0  776 18531   325   0  0   1600     0 read_chan   SW    1  0:00 (tcsh)
  100  776   433   419   0  0   1704     0 read_chan   SW   p2  0:00 (tcsh)
  100  776   436   420  13  0   1704   580 sigsuspend  S    p5  0:01 (tcsh)
  100  776   435   424   0  0   1932     0 do_select   SW   p4  0:06 (ssh)
The same processes using the "user" format:
USER       PID %CPU %MEM   SIZE   RSS TTY STAT START   TIME COMMAND
wharvey    325  0.0  0.0   1500     0   1 SW   Jul  6  0:00 (login)
wharvey    433  0.0  0.0   1704     0  p2 SW   Jul  6  0:00 (tcsh)
wharvey    435  0.0  0.0   1932     0  p4 SW   Jul  6  0:06 (ssh)
wharvey    436  0.0  0.1   1704   580  p5 S    Jul  6  0:01 (tcsh)
wharvey   8386  0.1  0.0    928   168  p5 S    Jul 30  5:37 /home/wharvey/goofey
wharvey  15257  0.0  4.5 112680 17448  p2 S    Jul  9 28:02 /usr/lib/netscape/net
wharvey  18531  0.0  0.0   1600     0   1 SW   Jul 12  0:00 (tcsh)
If you find your machine thrashing, often you can find out why by
running a `ps aux' or `ps alx' and looking at the `RSS' fields to see which
processes are consuming lots of memory. `ps' can also be extremely useful
for diagnosing problems with system services: e.g. it could be that you're
having trouble with NFS because the `rpc.mountd' daemon isn't running.
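
Both checks can be scripted. The following is only a sketch: it uses the POSIX `-e' and `-o' options instead of the BSD-style `aux' and `alx' shown above, on the assumption that your `ps' supports them, and it takes `rpc.mountd' from the NFS example as the daemon to look for.

```shell
# Show the five processes with the largest resident set size (RSS, in
# kilobytes). A trailing `=' on each -o field suppresses the header line.
top_rss=$(ps -e -o rss= -o pid= -o comm= | sort -rn | head -n 5)
echo "$top_rss"

# Check whether a particular daemon is running, here rpc.mountd.
if ps -e -o comm= | grep -q 'rpc\.mountd'; then
    mountd_state="running"
else
    mountd_state="not running"
fi
echo "rpc.mountd is $mountd_state"
```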
top
`top' is useful for monitoring those processes consuming the most CPU at any given time, as well as various other system statistics:
11:27pm up 27 days, 8:05, 47 users, load average: 0.00, 0.00, 0.00
186 processes: 158 sleeping, 2 running, 0 zombie, 26 stopped
CPU states: 0.3% user, 5.1% system, 0.0% nice, 94.6% idle
Mem: 387688K av, 290820K used, 96868K free, 41808K shrd, 120544K buff
Swap: 521676K av, 267072K used, 254604K free 69656K cached
PID USER PRI NI SIZE RSS SHARE STAT LIB %CPU %MEM TIME COMMAND
19002 wharvey 12 0 776 776 564 R 0 4.3 0.2 0:00 top
335 root 7 0 24688 15M 1288 R 0 1.1 3.9 611:30 X
1 root 0 0 108 68 52 S 0 0.0 0.0 0:04 init
2 root 0 0 0 0 0 SW 0 0.0 0.0 0:16 kflushd
3 root 0 0 0 0 0 SW 0 0.0 0.0 30:07 kswapd
65 root 0 0 84 48 36 S 0 0.0 0.0 0:03 kerneld
188 bin 0 0 80 0 0 SW 0 0.0 0.0 0:00 portmap
192 root 0 0 0 0 0 SW 0 0.0 0.0 0:05 rpciod
The `proc' filesystem can be a great source of information about processes (as well as many other system resources). All of the information available to `ps', `top', etc. can also be found here --- in its raw form. There's one subdirectory per process under `/proc', plus a whole bunch of other files and directories containing system-wide information.
wharvey@bowman% ls -Flag /proc/8386
total 0
dr-xr-xr-x   3 wharvey  staff   0 Aug  3 00:24 ./
dr-xr-xr-x 199 root     root    0 Jul  6 15:21 ../
-r--r--r--   1 wharvey  staff   0 Aug  3 00:24 cmdline
-r--r--r--   1 wharvey  staff   0 Aug  3 00:24 cpu
lrwx------   1 wharvey  staff   0 Aug  3 00:24 cwd -> /home/wharvey/src/
-r--------   1 wharvey  staff   0 Aug  3 00:24 environ
lrwx------   1 wharvey  staff   0 Aug  3 00:24 exe -> /home/wharvey/goofey*
dr-x------   2 wharvey  staff   0 Aug  3 00:24 fd/
pr--r--r--   1 wharvey  staff   0 Aug  3 00:24 maps|
-rw-------   1 wharvey  staff   0 Aug  3 00:24 mem
lrwx------   1 wharvey  staff   0 Aug  3 00:24 root -> //
-r--r--r--   1 wharvey  staff   0 Aug  3 00:24 stat
-r--r--r--   1 wharvey  staff   0 Aug  3 00:24 statm
-r--r--r--   1 wharvey  staff   0 Aug  3 00:24 status
wharvey@bowman% cat /proc/partitions
major minor  #blocks  name
   3     0   6353235  hda
   3     1   2048256  hda1
   3     2         1  hda2
   3     5   4160803  hda5
   3     6    128488  hda6
wharvey@bowman% ls -Flag /proc/net
total 0
dr-xr-xr-x   3 root     root    0 Aug  3 00:34 ./
dr-xr-xr-x 199 root     root    0 Jul  6 15:21 ../
-r--r--r--   1 root     root    0 Aug  3 00:34 arp
-r--r--r--   1 root     root    0 Aug  3 00:34 dev
-r--r--r--   1 root     root    0 Aug  3 00:34 dev_mcast
-r--r--r--   1 root     root    0 Aug  3 00:34 dev_stat
-r--r--r--   1 root     root    0 Aug  3 00:34 netlink
-r--r--r--   1 root     root    0 Aug  3 00:34 netstat
-r--r--r--   1 root     root    0 Aug  3 00:34 raw
-r--r--r--   1 root     root    0 Aug  3 00:34 route
dr-xr-xr-x   2 root     root    0 Aug  3 00:34 rpc/
-r--r--r--   1 root     root    0 Aug  3 00:34 rt_cache
-r--r--r--   1 root     root    0 Aug  3 00:34 snmp
-r--r--r--   1 root     root    0 Aug  3 00:34 sockstat
-r--r--r--   1 root     root    0 Aug  3 00:34 tcp
-r--r--r--   1 root     root    0 Aug  3 00:34 udp
-r--r--r--   1 root     root    0 Aug  3 00:34 unix
wharvey@bowman%
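The raw files can be read with ordinary tools. The sketch below inspects the shell's own entry; it assumes a Linux `/proc', a `readlink' command, and the `Name', `State' and `VmRSS' fields that current kernels put in `status' (the exact set of files and fields varies between kernel versions).

```shell
pid=$$    # the shell's own process ID; any PID you own works too

# The command line is stored as NUL-separated arguments.
cmdline=$(tr '\0' ' ' < "/proc/$pid/cmdline")
echo "command line: $cmdline"

# cwd is a symbolic link to the process's current working directory.
proc_cwd=$(readlink "/proc/$pid/cwd")
echo "working directory: $proc_cwd"

# status is a human-readable summary; pick out a few fields.
grep -E '^(Name|State|VmRSS):' "/proc/$pid/status"

# System-wide files sit directly under /proc, e.g. the load averages.
cat /proc/loadavg
```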
`kill' does more than kill processes or jobs. It is actually a generic tool for sending signals to processes. It's just that the default signal happens to be `TERM' (terminate)...
There are many signals one can send to a process:
HUP INT QUIT ILL TRAP ABRT BUS FPE KILL USR1 SEGV USR2 PIPE ALRM TERM STKFLT CHLD CONT STOP TSTP TTIN TTOU URG XCPU XFSZ VTALRM PROF WINCH POLL PWR UNUSED
`INT', `QUIT', `TERM' and `KILL' are all different ways of terminating a process, with slightly different semantics. `HUP' also by default terminates a process, but many processes trap this signal and perform some special operation instead (e.g. re-read configuration files). `STOP' and `CONT' can be used to temporarily suspend and then resume a process (like `Ctrl-Z' and `fg'/`bg' in a shell).
More information about the various signals can be found in the man
pages: `man 7 signal'.
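
A short sketch of these signals in action, assuming a Bourne-style shell and using `sleep' as a stand-in for a real program. The first half traps `HUP' the way a daemon might; the second half suspends, resumes and terminates a background process by process ID.

```shell
# Trap HUP in the current shell, the way a daemon might trap it to
# re-read its configuration, then send that signal to ourselves.
got_hup=no
trap 'got_hup=yes' HUP
kill -HUP $$
echo "HUP trapped: $got_hup"

# Suspend, resume and finally terminate a background process.
sleep 30 &
pid=$!
kill -STOP "$pid"     # like Ctrl-Z
kill -CONT "$pid"     # like bg
kill -TERM "$pid"     # same as a plain `kill'
wait "$pid"
status=$?
echo "exit status: $status"    # shells report 128 + the signal number
```

Unlike `TERM', the `KILL' and `STOP' signals cannot be trapped or ignored, which is why `kill -KILL' is the last resort for a process that will not die.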
nice
Sometimes you want processes to run with different priorities. For example, suppose you have a CPU-intensive program which is going to take several hours to complete its task, and you (or others) will be trying to get other work done in the meantime. Unless the long-running job is extremely urgent, it is better to let interactive jobs have a higher priority. The process scheduler will do this automatically to a limited extent, but a better response can be achieved if you designate the long-running jobs as having a lower priority. This can be done using the `nice' command when starting the job. E.g. `nice make' will execute `make' with a lower priority.
You can specify how nice to make the job with an appropriate command-line argument. Note that many shells implement `nice' as a built-in, and that the syntax can differ from the system-supplied executable. E.g. `nice -10' means the opposite if invoked under `tcsh' when compared to `/bin/nice': it tries to *raise* the job's priority (which only works if you have root privileges).
You may wish to modify a job's priority after it starts, if you forgot to `nice' it, or if you didn't realise how long it was going to run for. You can use the `renice' command for this, but you need to know the process ID (ps and top are your friends). If the job contains multiple processes, you will have to renice them all individually (child processes normally inherit their parent's priority, but if the parent's priority is modified later, this does not affect the child).
Note that once a process's priority is lowered, it cannot be raised
again, except by the super user.
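
The following sketch shows both commands from a Bourne-style shell, where `nice' is the system executable rather than a built-in. It relies on two assumptions: that `nice' with no command prints the current niceness (as the GNU version does), and that `renice' accepts the traditional `renice priority -p pid' form. `sleep' stands in for the long-running job.

```shell
# With no command, GNU `nice' prints the current niceness.
base=$(nice)
# The same query run through `nice -n 5' reports the lowered priority.
lowered=$(nice -n 5 nice)
echo "niceness: $base normally, $lowered under nice -n 5"

# Lower the priority of a job that is already running.
sleep 5 &
pid=$!
renice 19 -p "$pid" > /dev/null
renice_status=$?
echo "renice exit status: $renice_status"

kill "$pid"    # tidy up the stand-in job
```

A larger niceness value means a lower priority, so 19 is as polite as a process can be.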
Miscellaneous
Some other tools useful for monitoring one's system are `uptime' and `xload'.