2.6 upgrade, high high CS and RunQ - where is it coming from?

Submitted by lizard2004
on October 18, 2007 - 2:29pm

Hi Yall,

I am running into an issue, wondering if anyone's heard of it before, or even if I should be worried.

I recently reinstalled half of a Apache/JBoss/Java 1.5 server farm to FedoraCore3 (don't ask, I'd rather go higher but test/dev holding me back) and now notice that although throughput is fine, CPU utilization is much lower than before, but the context switches and run queues are uncomfortably high:

# vmstat 5
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
22 0 5792 293644 128628 660736 0 0 0 26 1245 27878 2 1 98 0
0 0 5792 294092 128628 660736 0 0 0 14 1678 24973 15 2 83 0
0 0 5792 294100 128628 660736 0 0 0 17 1465 22947 9 1 90 0
19 0 5792 294164 128628 660736 0 0 0 22 1327 26252 4 1 95 0
14 0 5792 293724 128628 660736 0 0 0 34 1337 27446 17 2 81 0
13 0 5792 311268 128628 660736 0 0 0 30 1409 20033 13 2 85 0
3 0 5792 311668 128628 660736 0 0 0 30 1572 25517 8 2 90 0

# uptime
16:12:39 up 85 days, 1:42, 1 user, load average: 9.62, 7.01, 6.18

Same load is being directed to our 2.4 servers, but never a queued process and very low CS, mostly high User Time:

# vmstat 5
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
0 0 0 44348 241796 48072 445640 0 1 2 3 0 1 1 1 0
0 0 0 44348 242140 48072 445656 0 0 0 30 786 4843 32 4 63
0 0 0 44348 242748 48072 445676 0 0 0 20 561 3481 33 1 65
0 0 0 44348 242672 48072 445736 0 0 0 30 1694 5264 87 4 9
0 0 0 44348 242744 48072 445776 0 0 0 32 933 5058 44 2 54
0 0 0 44348 242760 48076 445792 0 0 0 25 499 4741 8 11 81
0 0 0 44348 242800 48076 445816 0 0 0 22 584 4818 23 20 57

# uptime
16:21:34 up 321 days, 22:47, 1 user, load average: 0.51, 0.72, 0.94

Is there a good way to tell which threads are in the run queue so I can see if they are from Java as opposed to the kernel?

BTW: here is the CPUInfo as I hear there may be Xeon issues:

processor : 3 (this is a 2-CPU SMT-enabled machine).
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Xeon(TM) CPU 3.06GHz
stepping : 5
cpu MHz : 3049.382
cache size : 512 KB

Thanks for any insights!
-Gary

Is there a good way to tell

intgr
on
October 19, 2007 - 3:01am

Is there a good way to tell which threads are in the run queue so I can see if they are from Java as opposed to the kernel?

Have you looked at the "top" command? Runnable threads/processes are marked as "R" in the "S" (state) column. (you can use something like htop, but you explicitly need to enable showing kernel threads).

But 4 thousand context switches per second is pretty bad, especially Java programs which tend to be more cache-hostile than normal ones.

You forgot to tell us which kernel version you're running. If it's 2.6.23, or any kernel with the new CFS scheduler, it sounds likely that this is caused by the programs calling yield() — which is a hack and should get replaced by proper semaphores.

The sched_yield() call has different semantics in the old scheduler and the new one. What this call should actually do is largely undefined; many operating systems have never supported the old-style Linux yield() behavior.

As a workaround, and to see if this is really the problem, you can try this:

echo 1 > /proc/sys/kernel/sched_yield_bug_workaround

or alternatively you can try a less recent kernel that doesn't yet use the CFS scheduler (such as vanilla 2.6.22 or earlier).

For more details on this problem, read the recent KernelTrap article on yield() in CFS: http://kerneltrap.org/Linux/Using_sched_yield_Improperly

Hope this helps!


P.S. you should keep alignment-sensitive console output, such as from iostat, between <pre></pre> tags, otherwise they are extremely hard to read. Just for anyone else who might be wondering:

# vmstat 5
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
22  0   5792 293644 128628 660736    0    0     0    26 1245 27878  2  1 98  0
 0  0   5792 294092 128628 660736    0    0     0    14 1678 24973 15  2 83  0
 0  0   5792 294100 128628 660736    0    0     0    17 1465 22947  9  1 90  0
19  0   5792 294164 128628 660736    0    0     0    22 1327 26252  4  1 95  0
14  0   5792 293724 128628 660736    0    0     0    34 1337 27446 17  2 81  0
13  0   5792 311268 128628 660736    0    0     0    30 1409 20033 13  2 85  0
 3  0   5792 311668 128628 660736    0    0     0    30 1572 25517  8  2 90  0

[...]
# vmstat 5
   procs                      memory      swap          io     system      cpu
 r  b  w   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id
 0  0  0  44348 241796  48072 445640    0    1     2     3    0     1  1  1  0
 0  0  0  44348 242140  48072 445656    0    0     0    30  786  4843 32  4 63
 0  0  0  44348 242748  48072 445676    0    0     0    20  561  3481 33  1 65
 0  0  0  44348 242672  48072 445736    0    0     0    30 1694  5264 87  4  9
 0  0  0  44348 242744  48072 445776    0    0     0    32  933  5058 44  2 54
 0  0  0  44348 242760  48076 445792    0    0     0    25  499  4741  8 11 81
 0  0  0  44348 242800  48076 445816    0    0     0    22  584  4818 23 20 57

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.