Linux Load Average Values - How to explain the mystery?


I have an Ubuntu 16.04 LTS box (kernel 4.4, Intel i3, 2 cores) here running 4 Docker containers with one process each:
a) nginx
b) postgresql
c) some proprietary data accepting client
d) some proprietary data processing client

The “data” comes in at moderate rates via a Gigabit interface.

If I compare the load case (data is coming in) with the idle case (no input data, everything idle), I see pretty much the same output from top; only the load averages differ.
Usually, I would see something like processes in state “D” or increased “IO” / “WAIT” / “SIRQ” / “HIRQ” values that explain the difference in load.

But not here.
By the way, the (high) load of 6.42 is not at all unusual; I have also seen values of 12.

Does anybody have an idea how such a high load average can be explained without any evidence in the metrics shown by top?

root@mybox:~# top -b -n 1 | awk '{if (NR <=7) print; else if ($8 == "D") {print; count++} } END {print "Total status D (I/O wait probably): "count}'
top - 06:29:56 up 6 days, 13:19,  2 users,  load average: 6.42, 4.58, 2.25
Tasks: 153 total,   3 running, 150 sleeping,   0 stopped,   0 zombie
%Cpu(s):  7.1 us,  2.6 sy,  2.7 ni, 87.1 id,  0.0 wa,  0.0 hi,  0.4 si,  0.0 st
KiB Mem :  3963320 total,  1917248 free,  1147236 used,   898836 buff/cache
KiB Swap:  4110332 total,  3969812 free,   140520 used.  2381128 avail Mem

Total status D (I/O wait probably):

root@mybox:~# top -b -n 1 | awk '{if (NR <=7) print; else if ($8 == "D") {print; count++} } END {print "Total status D (I/O wait probably): "count}'
top - 06:50:03 up 6 days, 13:40,  2 users,  load average: 0.18, 2.55, 3.56
Tasks: 145 total,   1 running, 144 sleeping,   0 stopped,   0 zombie
%Cpu(s):  7.3 us,  2.6 sy,  2.7 ni, 86.9 id,  0.0 wa,  0.0 hi,  0.4 si,  0.0 st
KiB Mem :  3963320 total,  1348120 free,  1428480 used,  1186720 buff/cache
KiB Swap:  4110332 total,  3979252 free,   131080 used.  2096020 avail Mem

Total status D (I/O wait probably):


In the high-load scenario there are two more running processes. What are they doing?

(David G. Miller) #3

Load for Linux is the number of processes currently running plus the number of processes in the run queue. Load average is just that number averaged over time. If your data digestion means some processes are CPU bound, the load is just indicating that other processes are waiting to get a CPU.
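To make "averaged over time" concrete, here is a minimal sketch of an exponentially damped moving average of the kind the kernel uses for the 1-, 5- and 15-minute figures. It is a floating-point simplification (the kernel works in fixed-point arithmetic), and the names `update_load` and `simulate` are made up for this illustration.

```python
import math

# Assumption for this sketch: the task count is sampled about every 5 seconds.
SAMPLE_INTERVAL_S = 5.0


def update_load(load, active, window_s, interval_s=SAMPLE_INTERVAL_S):
    """Apply one sampling step of an exponentially damped moving average.

    load     -- previous load average value
    active   -- number of tasks counted at this sample
    window_s -- averaging window in seconds (60, 300 or 900)
    """
    decay = math.exp(-interval_s / window_s)
    return load * decay + active * (1.0 - decay)


def simulate(active, seconds, window_s=60.0, start=0.0):
    """Feed a constant task count into the average for `seconds` seconds."""
    load = start
    for _ in range(int(seconds / SAMPLE_INTERVAL_S)):
        load = update_load(load, active, window_s)
    return load
```

With a steady count of six tasks starting from zero, `simulate(6, 60)` gives roughly 3.8, not 6: the 1-minute figure lags the real task count, and in the same way it takes a while to decay once the load stops.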



One of them receives the data and passes it to postgresql.
The other more or less post-processes the data.


After reading,
I doubt that.

Brendan states that

In 1993, a Linux engineer found a nonintuitive case with load averages, and with a three-line patch changed them forever from “CPU load averages” to what one might call “system load averages.” His change included tasks in the uninterruptible state, so that load averages reflected demand for disk resources and not just CPUs.

And he refers to a code snippet:

for(p = &LAST_TASK; p > &FIRST_TASK; --p)
       if (*p && ((*p)->state == TASK_RUNNING ||
                  (*p)->state == TASK_UNINTERRUPTIBLE ||
                  (*p)->state == TASK_SWAPPING))
            nr += FIXED_1;
    return nr;

But we see no “State D” tasks.
And we see no major change in swap usage or in the number of processes.
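As a cross-check on the awk filter over top's output, the task states can also be tallied directly from /proc. The following is only a sketch: the helper names are hypothetical, and it reads the per-process `/proc/<pid>/stat` entries only, so individual threads (which live under `/proc/<pid>/task/`) are not covered.

```python
import glob
from collections import Counter


def parse_state(stat_line):
    """Return the one-letter state (R, S, D, ...) from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')' characters, so split at the LAST ')'.
    _, _, rest = stat_line.rpartition(")")
    return rest.split()[0]


def count_states(stat_lines):
    """Tally how many lines are in each state."""
    return Counter(parse_state(line) for line in stat_lines)


def read_all_stat_lines():
    """Read the stat line of every process currently visible in /proc."""
    lines = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as fh:
                lines.append(fh.read())
        except OSError:
            # The process exited between listing and reading; skip it.
            continue
    return lines
```

On a Linux box, `count_states(read_all_stat_lines())` gives a snapshot such as a count for "S", "R" and, if present, "D". A single snapshot can still miss short-lived D states, so sampling it repeatedly during the load case is more telling.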

Thanks to both of you for your thoughts anyhow!

(David G. Miller) #6

I think I ran across this article last year when the question came up:

I think you’ll find that the two ways of looking at “load average” are equivalent. TASK_RUNNING is the state for processes that either are running or are waiting for a time slice to run. TASK_UNINTERRUPTIBLE is rare but just means the task is running, and TASK_SWAPPING means you have too little memory to run your runnable processes. Add them up and you get the number of running processes plus the number of processes that need a CPU in order to run, aka the run queue depth.

I needed an explanation that worked for people whose eyes glaze over when they have to look at code. :smiley:



Agreed. Thanks for the link and for your efforts.

Meanwhile I found the solution to my problem; it lies in the way top works:
top basically reads /proc/stat, which contains the metrics as values accumulated since boot.
It stores each reading as the “last value” and, after the next refresh interval, displays the difference between the current reading and the last one.

As a result, the very first reading is always misleading, because it shows the averages since boot rather than over the given interval.
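The difference calculation can be sketched as follows, assuming the usual field order of the aggregate `cpu` line in /proc/stat (user, nice, system, idle, iowait, irq, softirq, steal). The function names are made up for this example; the labels mirror the ones top prints.

```python
# top's labels, in the order the counters appear on the "cpu" line
# of /proc/stat: user, nice, system, idle, iowait, irq, softirq, steal.
FIELDS = ("us", "ni", "sy", "id", "wa", "hi", "si", "st")


def parse_cpu_line(line):
    """Return the first eight cumulative counters of the aggregate cpu line."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    return [int(value) for value in parts[1:1 + len(FIELDS)]]


def cpu_percentages(prev_line, cur_line):
    """Percentages for the interval between two readings of the cpu line."""
    prev = parse_cpu_line(prev_line)
    cur = parse_cpu_line(cur_line)
    deltas = [c - p for p, c in zip(prev, cur)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("no ticks elapsed between the two readings")
    return {name: 100.0 * delta / total for name, delta in zip(FIELDS, deltas)}
```

Passing a line of all zeros as `prev_line` reproduces the "since boot" view that a single batch iteration shows; passing two real readings taken some seconds apart gives the figures for that interval only.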

Running top interactively, you will never notice that.
But in batch mode with a maximum of one refresh (-b -n 1), I always get exactly these values, which I am not interested in.

Instead, I need to run top -b -n 2 -d [LongAsPossibleInterval] and ignore the first of the two resulting readings.
What is shown then matches what can be calculated from /proc/loadavg and /proc/stat, and thus looks far more trustworthy :slight_smile:
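Dropping the first reading can be automated. This is a rough sketch that assumes every batch iteration starts with a line beginning with "top - ", as in the output quoted earlier in this thread; the helper names are hypothetical.

```python
import subprocess


def split_top_iterations(output):
    """Split top batch output into one text block per iteration."""
    iterations = []
    for line in output.splitlines():
        if line.startswith("top - "):
            # A new summary header marks the start of the next iteration.
            iterations.append([])
        if iterations:
            iterations[-1].append(line)
    return ["\n".join(block) for block in iterations]


def last_iteration(output):
    """Return only the final iteration, discarding the since-boot one."""
    blocks = split_top_iterations(output)
    if not blocks:
        raise ValueError("no 'top - ' header found in the output")
    return blocks[-1]


def run_top(delay_s):
    """Run top for two batch iterations and keep the second one."""
    result = subprocess.run(
        ["top", "-b", "-n", "2", "-d", str(delay_s)],
        capture_output=True, text=True, check=True,
    )
    return last_iteration(result.stdout)
```

For example, `print(run_top(10))` blocks for about ten seconds and then prints a summary whose %Cpu(s) line covers just that interval.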

(David G. Miller) #8

I always took the load averages presented by top as being one step above an “idiot light” for the system load (single CPU): less than one, things are good; between one and five, loaded but probably managing; greater than five, overloaded. I never tried to get anything more than that out of it. Appreciate you sharing the details.