Monitoring Disk I/O on Linux.
Today I wrote a little piece of code that generates a random array of numbers, stores it into disk and sorts it on disk, using bubble sort. This uses O(1) memory, but it's obviously very slow. I did this for fun mostly. The code is here.
I wanted to know, if I'm running this code, what Linux tools are going to show an increase in disk I/O.
So, I run this:
$ ./disksort 150000
This starts the sorting. It takes a long time to sort 150k numbers on disk, at least with that algorithm.
So the first command that we can run is iostat. Let's see what it is outputting before I start disksort
$ sudo iostat 5 2
Linux 4.4.0-31-generic (petko-VirtualBox) 11/16/2016 _x86_64_ (1 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
14.68 0.19 38.34 0.05 0.00 46.73
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 5.60 83.17 63.96 539970 415252
avg-cpu: %user %nice %system %iowait %steal %idle
3.03 0.00 0.00 0.00 0.00 96.97
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 0.00 0.00 0.00 0 0
The command line invocation is a bit strange. The 5 2 part says "aggregate IO data for five seconds and show me two reports". The idea is that you can get continuous report output every five seconds. But I need only one report. The first one is a default one, which shows aggregated data for I'm not sure what period. But the second one is interesting. Notice kB_read/s and kB_wrtn/s. They're zeros. So it's all quiet. I now run disksort and run the same command.
$ iostat 5 2
Linux 4.4.0-31-generic (petko-VirtualBox) 11/16/2016 _x86_64_ (1 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
15.32 0.17 40.33 0.05 0.00 44.14
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 5.04 73.65 59.42 540254 435868
avg-cpu: %user %nice %system %iowait %steal %idle
26.81 0.00 73.19 0.00 0.00 0.00
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 0.80 0.00 157.64 0 588
Notice how kB_wrtn/s spiked up. kB_read is zero and I assume that's because of the buffer cache. After all, I'm reading the same file again and again.
Another command that we can use is iotop. iotop is similar to top, but shows I/O stats. We'll run it using iotop -oa. The -o parameter makes it show only processes that do I/O. The -a flag aggregates the data during the time this command is running. So after running it for a few seconds I'm seeing this:
Total DISK READ : 0.00 B/s | Total DISK WRITE : 83.77 K/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
545 be/3 root 0.00 B 4.00 K 0.00 % 0.08 % [jbd2/sda1-8]
7717 be/4 root 0.00 B 0.00 B 0.00 % 0.01 % [kworker/u2:3]
7865 be/4 root 0.00 B 0.00 B 0.00 % 0.01 % [kworker/u2:1]
7714 be/4 petko 0.00 B 1144.00 K 0.00 % 0.00 % ./disksort 150000
7097 be/4 root 0.00 B 0.00 B 0.00 % 0.00 % [kworker/u2:2]
7758 be/4 root 0.00 B 0.00 B 0.00 % 0.00 % [kworker/u2:0]
Notice how disksort with pid of 7714 is well ahead of everything else shown.
So that's it, happy debugging.