Examining a process in Linux.

I've been meaning to write a blog post about the Linux tools and commands for inspecting processes. Let's take a look at some of them.

The process that we'll be looking at is a webserver that I wrote some time ago to practice my C and to write some network-related code. This webserver runs a threadpool in which N threads wait for incoming requests and then handle them.

So let's start the server:

$ ./server
Running on port: 8000

Great. So which process is this server running as? We can use the pidof command to find that out. Its output looks like this:

$ pidof server
8876

If we had other processes which were running an executable with that name, we'd see more process ids, but since we only have one, we see one process id.
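To get a feel for where such a lookup can come from, here is a minimal Python sketch that scans /proc for processes with a given name. It is not how pidof is implemented; it only compares against /proc/<pid>/comm, which the kernel truncates to 15 characters. The function name pids_by_name and the proc_root parameter are made up for this example.

```python
import os

def pids_by_name(name, proc_root="/proc"):
    """Return the sorted PIDs whose /proc/<pid>/comm equals `name`."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # the process may have exited while we were scanning
        if comm == name:
            pids.append(int(entry))
    return sorted(pids)
```

Calling pids_by_name("server") on the machine from this post should return a list containing 8876, plus the ids of any other processes with the same name.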

What next? Let's see how the process is laid out in memory. To do this, we can use the pmap command. Its output looks like this:

$ sudo pmap -p 8876
8876:   ./server
0000000000400000     16K r-x-- /home/petko/work/github/webserver/server
0000000000603000      4K r---- /home/petko/work/github/webserver/server
0000000000604000      4K rw--- /home/petko/work/github/webserver/server
000000000110f000    132K rw---   [ anon ]
00007fd5ca731000      4K -----   [ anon ]
00007fd5ca732000   8192K rw---   [ anon ]
00007fd5caf32000      4K -----   [ anon ]
00007fd5caf33000   8192K rw---   [ anon ]
00007fd5cb733000      4K -----   [ anon ]
00007fd5cb734000   8192K rw---   [ anon ]
00007fd5cbf34000      4K -----   [ anon ]
00007fd5cbf35000   8192K rw---   [ anon ]
00007fd5cc735000   1792K r-x-- /lib/x86_64-linux-gnu/libc-2.23.so
00007fd5cc8f5000   2044K ----- /lib/x86_64-linux-gnu/libc-2.23.so
00007fd5ccaf4000     16K r---- /lib/x86_64-linux-gnu/libc-2.23.so
00007fd5ccaf8000      8K rw--- /lib/x86_64-linux-gnu/libc-2.23.so
00007fd5ccafa000     16K rw---   [ anon ]
00007fd5ccafe000     96K r-x-- /lib/x86_64-linux-gnu/libpthread-2.23.so
00007fd5ccb16000   2044K ----- /lib/x86_64-linux-gnu/libpthread-2.23.so
00007fd5ccd15000      4K r---- /lib/x86_64-linux-gnu/libpthread-2.23.so
00007fd5ccd16000      4K rw--- /lib/x86_64-linux-gnu/libpthread-2.23.so
00007fd5ccd17000     16K rw---   [ anon ]
00007fd5ccd1b000    152K r-x-- /lib/x86_64-linux-gnu/ld-2.23.so
00007fd5ccf22000     12K rw---   [ anon ]
00007fd5ccf3e000      8K rw---   [ anon ]
00007fd5ccf40000      4K r---- /lib/x86_64-linux-gnu/ld-2.23.so
00007fd5ccf41000      4K rw--- /lib/x86_64-linux-gnu/ld-2.23.so
00007fd5ccf42000      4K rw---   [ anon ]
00007ffca0861000    132K rw---   [ stack ]
00007ffca09eb000      8K r----   [ anon ]
00007ffca09ed000      8K r-x--   [ anon ]
ffffffffff600000      4K r-x--   [ anon ]
total            39316K

What you see here are virtual memory addresses. For example, let's take a look at this line:

00007fd5cc735000 1792K r-x-- /lib/x86_64-linux-gnu/libc-2.23.so

This is the code for libc, the C standard library. The x flag means that this is executable memory, and the size is roughly the same as the size of the .so file on disk. The library is memory mapped into a region of our process starting at address 00007fd5cc735000, but the code is shared between all the processes that need it, so in physical memory it's only stored in one place. To learn more about memory in Linux, here's a great post going into detail about it.
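If you want to work with this output in a script, each mapping line splits into four columns: start address, size, permissions and name. The sketch below parses a few lines copied from the output above; parse_pmap_line and SAMPLE are names invented for this example, and the parser assumes the exact column layout shown here.

```python
# A few mapping lines copied from the pmap output above.
SAMPLE = """\
0000000000400000     16K r-x-- /home/petko/work/github/webserver/server
00007fd5ca732000   8192K rw---   [ anon ]
00007fd5cc735000   1792K r-x-- /lib/x86_64-linux-gnu/libc-2.23.so
00007ffca0861000    132K rw---   [ stack ]
"""

def parse_pmap_line(line):
    """Split one pmap mapping line into its four columns."""
    addr, size, perms, name = line.split(None, 3)
    return {
        "start": int(addr, 16),                # virtual start address
        "size": int(size.rstrip("K")) * 1024,  # pmap prints sizes in KiB
        "perms": perms,
        "name": name.strip(),
    }

mappings = [parse_pmap_line(line) for line in SAMPLE.splitlines()]

# Executable regions are the ones with the x flag set.
executable = [m["name"] for m in mappings if "x" in m["perms"]]
```

Here executable ends up holding the server binary and libc, the two sample lines with the x flag.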

Another interesting command is lsof. lsof stands for "list open files". Let's see its output:

$ sudo lsof -p 8876
COMMAND  PID  USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
server  8876 petko  cwd    DIR    8,1     4096  262299 /home/petko/work/github/webserver
server  8876 petko  rtd    DIR    8,1     4096       2 /
server  8876 petko  txt    REG    8,1    25536  306491 /home/petko/work/github/webserver/server
server  8876 petko  mem    REG    8,1  1864888 1184834 /lib/x86_64-linux-gnu/libc-2.23.so
server  8876 petko  mem    REG    8,1   138744 1184980 /lib/x86_64-linux-gnu/libpthread-2.23.so
server  8876 petko  mem    REG    8,1   162632 1184806 /lib/x86_64-linux-gnu/ld-2.23.so
server  8876 petko    0u   CHR  136,9      0t0      12 /dev/pts/9
server  8876 petko    1u   CHR  136,9      0t0      12 /dev/pts/9
server  8876 petko    2u   CHR  136,9      0t0      12 /dev/pts/9
server  8876 petko    3u  IPv4  81993      0t0     TCP *:8000 (LISTEN)

As you can see, we have file descriptors 0, 1 and 2, which are stdin, stdout and stderr. They are linked to the terminal in which the process is running. You can write to that terminal, by the way. Just type echo "hello world" > /dev/pts/9 in another terminal and you'll see that text in the terminal where your webserver is running. File descriptor 3 is our listening socket, which accepts connections on port 8000.
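The numbered descriptors can also be read straight from the kernel: on Linux, /proc/<pid>/fd contains one symlink per open file descriptor, and sockets show up there as socket:[inode]. Here is a small sketch; open_fds and its proc_root parameter are names chosen for this example, and reading the directory of a process you don't own needs root.

```python
import os

def open_fds(pid, proc_root="/proc"):
    """Map each open file descriptor of `pid` to the target of its symlink."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    fds = {}
    for name in os.listdir(fd_dir):
        # Each entry is a symlink named after the descriptor number.
        fds[int(name)] = os.readlink(os.path.join(fd_dir, name))
    return fds
```

For our server, open_fds(8876) should map descriptors 0, 1 and 2 to /dev/pts/9 and descriptor 3 to a socket entry.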

Another interesting way to inspect processes is the ps command. Its basic output looks like this:

$ ps --pid 8876
  PID TTY          TIME CMD
 8876 pts/9    00:00:00 server

This is simple. We can also show the threads inside a process, like this:

$ ps  m --pid 8876 -o pid,tid,cmd
  PID   TID CMD
 8876     - ./server
    -  8876 -
    -  8877 -
    -  8878 -
    -  8879 -
    -  8880 -

We have five threads here. One is our main thread and the other four are the threadpool threads. The m option tells ps to show the threads of a process. The -o option specifies fields to output. We can even get fancy and output the addresses of the threads' stack pointers and instruction pointers, like this:

$ ps  m --pid 8876 -o pid,tid,cmd,esp,eip
  PID   TID CMD                              ESP      EIP
 8876     - ./server                           -        -
    -  8876 -                           a0880b70 ccb0e7ad
    -  8877 -                           cc733ec0 ccb0b3a0
    -  8878 -                           cbf32ec0 ccb0b3a0
    -  8879 -                           cb731ec0 ccb0b3a0
    -  8880 -                           caf30ec0 ccb0b3a0

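So the four threadpool threads (8877 to 8880) are all at the same instruction, while the main thread is somewhere else. Every thread has a different stack pointer, which makes sense, since each thread has its own stack. Note that the values shown here are only 8 hex digits; they line up with the low 32 bits of the 64-bit addresses in the pmap output. If I execute something on one of the threads, both its ESP and EIP can change.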
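We can check where these pointers land by combining them with the pmap output from earlier. The sketch below assumes that the ESP and EIP columns hold the low 32 bits of the real addresses and borrows the upper bits from each candidate region; REGIONS is copied by hand from the pmap output, and find_region is a name made up for this example. It ignores regions that straddle a 4 GiB boundary.

```python
# (start address, size in bytes, label), copied from the pmap output above.
REGIONS = [
    (0x00007fd5ca732000, 8192 * 1024, "anon"),
    (0x00007fd5caf33000, 8192 * 1024, "anon"),
    (0x00007fd5cb734000, 8192 * 1024, "anon"),
    (0x00007fd5cbf35000, 8192 * 1024, "anon"),
    (0x00007fd5ccafe000, 96 * 1024, "libpthread-2.23.so"),
    (0x00007ffca0861000, 132 * 1024, "stack"),
]

def find_region(low32, regions=REGIONS):
    """Return (start, label) of the region containing the truncated address."""
    for start, size, label in regions:
        # Assume the address shares its upper 32 bits with the region start.
        candidate = (start & ~0xFFFFFFFF) | low32
        if start <= candidate < start + size:
            return start, label
    return None
```

With these inputs, find_region(0xa0880b70) lands in the [ stack ] region, each worker's ESP lands in a different 8192K anonymous region, and both EIP values land in the executable part of libpthread. That suggests the 8192K regions are the stacks of the four threadpool threads.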

A lot of data about processes lives in the proc filesystem, located in /proc. For each running process, there's a subdirectory of /proc named after the process id. For example, in the directory for our process 8876, there's a status file which lists various information about the process. Let's look at it:

$ cat /proc/8876/status
Name:   server
State:  S (sleeping)
Tgid:   8876
Ngid:   0
Pid:    8876
PPid:   2604
TracerPid:      0
Uid:    1000    1000    1000    1000
Gid:    1000    1000    1000    1000
FDSize: 256
Groups: 4 24 27 30 46 113 128 999 1000 
NStgid: 8876
NSpid:  8876
NSpgid: 8876
NSsid:  2604
VmPeak:    39316 kB
VmSize:    39316 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:       800 kB
VmRSS:       800 kB
VmData:    32988 kB
VmStk:       136 kB
VmExe:        16 kB
VmLib:      2040 kB
VmPTE:        48 kB
VmPMD:        12 kB
VmSwap:        0 kB
HugetlbPages:          0 kB
Threads:        5
SigQ:   0/7848
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
Seccomp:        0
Cpus_allowed:   1
Cpus_allowed_list:      0
Mems_allowed:   00000000,00000001
Mems_allowed_list:      0
voluntary_ctxt_switches:        3
nonvoluntary_ctxt_switches:     2

There's a lot of data in here. Remember how we used ps to count the threads in this process? That number is also available here, on the line saying Threads: 5.
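Since every line of the status file has the form "Key: value", it's easy to read from a script. Here is a minimal sketch that turns the text into a dictionary; parse_status and SAMPLE_STATUS are names invented for this example, and the sample only contains a few of the lines shown above.

```python
# A few lines copied from the status output above.
SAMPLE_STATUS = """\
Name:   server
State:  S (sleeping)
Pid:    8876
VmRSS:       800 kB
Threads:        5
"""

def parse_status(text):
    """Turn the 'Key: value' lines of a status file into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore anything that doesn't look like 'Key: value'
            fields[key.strip()] = value.strip()
    return fields

status = parse_status(SAMPLE_STATUS)
thread_count = int(status["Threads"])
```

For a live process you would pass in the contents of /proc/<pid>/status instead of the sample text.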

Our last command is pidstat. pidstat shows statistics about a running process, which can be updated at a regular time interval. A possible invocation can be:

$ pidstat -p 8876 1
Linux 4.4.0-64-generic (virtbox)        03/01/2017      _x86_64_        (1 CPU)

12:22:00 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
12:22:01 PM  1000      8876    0.00    0.00    0.00    0.00     0  server
12:22:02 PM  1000      8876    0.00    0.00    0.00    0.00     0  server

Our server is not doing anything right now, so you see a lot of zeroes.
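Numbers like these can be derived from /proc/<pid>/stat, where fields 14 and 15 hold the user and system CPU time of the process in clock ticks. The sketch below computes a CPU percentage from two samples taken some seconds apart. It is an illustration, not pidstat's actual code: cpu_ticks and cpu_percent are made-up names, and the default of 100 ticks per second is an assumption (os.sysconf("SC_CLK_TCK") gives the real value).

```python
def cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split after the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so fields 14 and 15 are rest[11] and rest[12].
    return int(rest[11]), int(rest[12])

def cpu_percent(before, after, interval_s, ticks_per_s=100):
    """CPU usage in percent between two (utime, stime) samples."""
    used = (after[0] - before[0]) + (after[1] - before[1])
    return 100.0 * used / (ticks_per_s * interval_s)
```

An idle process like our server has the same tick counts in both samples, so cpu_percent returns 0.0, which matches the zeroes above.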

There are many other interesting commands that you can use to figure out what processes are doing. strace shows the system calls made by a process. ltrace shows dynamic library calls. tcpdump captures network traffic, which you can filter down to the port our server is listening on.

So, that's all for today. Happy running of processes.
