This is a narrow special-case where the spin-rwlock is safe, and the
rwsem is unsafe.
But it should work for rwlocks - it always worked and the networking
code always relied on that AFAIK.
Here's the x86 assembly code of the write-side slowpath:
ENTRY(__write_lock_failed)
CFI_STARTPROC
LOCK_PREFIX
addl $RW_LOCK_BIAS,(%rdi)
1: rep
nop
cmpl $RW_LOCK_BIAS,(%rdi)
jne 1b
LOCK_PREFIX
subl $RW_LOCK_BIAS,(%rdi)
jnz __write_lock_failed
ret
CFI_ENDPROC
the fastpath decreased the value with RW_LOCK_BIAS, and when we
enter this function we undo that effect by adding RW_LOCK_BIAS. Then
we spin (without signalling our write-intent) passively until the
count reaches RW_LOCK_BIAS. Then we try to lock it again and bring
it to zero (meaning no other readers or writers - we got the lock).
This is pretty much the most unfair strategy possible for writers -
but this is how rwlocks always behaved - and they do so mostly for
recursive use within networking.
This is why the tasklist_lock was always so suspect to insane
starvation symptoms on really large SMP systems, and this is why
write_lock_irq(&tasklist_lock) was always a dangerous operation to
do. (it can spin for a _long_ time with irqs off.)
It's not the most optimal of situations. Some patches are in the
works to fix the irqs-off artifact (on ia64 - no x86 patches yet
AFAICS) - but that's just papering it over.
Ingo
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html