How hard is tasklist_lock hit on these systems?
How hard is the pid hash hit on these systems?
My hunch is that if you are doing a lot of forking and exiting,
zombie reaping isn't the only problem you are observing.

Thinking about it, I do agree with Linus that two lists sounds like the
right solution, because it ensures we always have O(1) time when
waiting for a zombie. I'd like to place the list head for the zombie
list in the signal_struct and not in the task_struct, so our
performance continues to be O(1) when we have a threaded process.

The big benefit of the zombie list over your proposed list reordering
is that waitpid can return immediately when we have lots of children
but no zombies to wait for. So it looks like a universal benefit, and
about as good as it is possible to make zombie handling in waitpid.
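
To make the idea concrete, here is a rough userspace sketch of the
two-list scheme. It is not kernel code: sketch_process stands in for
the signal_struct, sketch_child for a child task, and every name in it
is made up for illustration. The only point is that the wait path looks
at the head of the zombie list and never walks the live children.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *n) { n->prev = n->next = n; }
static int list_empty(const struct list_node *h) { return h->next == h; }

static void list_add_tail(struct list_node *n, struct list_node *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical stand-in for signal_struct: shared by all threads. */
struct sketch_process {
	struct list_node live;    /* children still running */
	struct list_node zombies; /* children that have exited */
};

/* Hypothetical stand-in for a child task. */
struct sketch_child {
	struct list_node link;    /* on exactly one of the two lists */
	int pid;
	int exit_code;
};

void sketch_process_init(struct sketch_process *p)
{
	list_init(&p->live);
	list_init(&p->zombies);
}

void sketch_fork(struct sketch_process *p, struct sketch_child *c, int pid)
{
	c->pid = pid;
	c->exit_code = 0;
	list_add_tail(&c->link, &p->live);
}

/* Exit moves the child from the live list to the zombie list: O(1). */
void sketch_exit(struct sketch_process *p, struct sketch_child *c, int code)
{
	assert(!list_empty(&p->live));
	c->exit_code = code;
	list_del(&c->link);
	list_add_tail(&c->link, &p->zombies);
}

/*
 * Non-blocking "wait for any child". Returns the reaped pid, 0 if there
 * are children but no zombies, or -1 if there are no children at all.
 * Only the head of the zombie list is examined, so the cost is O(1)
 * no matter how many live children exist.
 */
int sketch_wait_any(struct sketch_process *p, int *status)
{
	struct list_node *n;
	struct sketch_child *c;

	if (list_empty(&p->zombies))
		return list_empty(&p->live) ? -1 : 0;

	n = p->zombies.next;
	c = (struct sketch_child *)((char *)n - offsetof(struct sketch_child, link));
	list_del(n);
	if (status)
		*status = c->exit_code;
	return c->pid;
}
```

With a thousand live children and no zombies, sketch_wait_any() does
two list_empty() checks and returns 0. A real implementation would
still need locking and pid/pgrp matching, which this leaves out.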
Eric