[prev in list] [next in list] [prev in thread] [next in thread] 

List:       linux-kernel
Subject:    Re: [Patch] shm bug introduced with pagecache in 2.3.11
From:       Manfred Spraul <manfreds () colorfullife ! com>
Date:       1999-11-21 9:15:55
[Download RAW message or body]

Linus Torvalds wrote:
>  - if a writer is waiting for another writer (contention_ww case), it will
>    have to increment the "reader" part of the semaphore value, in order to
>    get the other writer to wake it up on "write_up()".
> 
>  - if a reader is waiting for a writer, then the reader will have
>    incremented the semaphore, and the writer will know to wake it up
>    becasue the semaphore value won't be zero after the "write_up()".

Only one thread must do that, otherwise you couldn't distinguish between
"multiple writers are waiting, the lock is free" and "one writer is
waiting, one reader owns the lock".

> All other races should be trivially handled by just having the spinlock,
> so the only really hard cases are the fast-path stuff where we cannot get
> the semaphore because it is too expensive.

I think there is a problem: neither "write_lock_trylock()" nor
"read_lock_trylock()" [ie the inline part of your rw-semaphore
operations] are atomic, both change a value, and if a certain flag is
set, then they undo their change.
I think there is a race with 2 CPUs [1*write,1*read] where noone owns
the semaphore, but both fail to acquire the semaphore:

CPU1: down_write()	CPU2: down_read()
CPU1:			CPU2:
lock bts RWL
			lock inc RWL
test RWL ,0x7ff
jne contention_wr[taken]	
			js [taken]
			<hardware interrupt>
contention_wr:
spinlock RWL_LOCK

if <we are the first sleeper>
	inc RWL <executed>
if RWL == 0x8000 0001 <ie lock is free>
	goto got_semaphore; <not taken, value is 0x8000 0002>
spin_unlock RWL_LOCK
schedule()


			lock dec RWL;
			spinlock RWL_LOCK

			if <we are the first sleeper>
				inc RWL <not executed>			
			
			if RWL >0 <no writer>
				goto got_semaphore <not taken>

			spin_unlock RWL_LOCK
			schedule()

--> both threads called schedule(), and noone is going to wake them up.

A dirty hack would be that contention_wr stores the information
"highest bit set because a writer is waiting, the semaphore is not
acquired by a writer" in "struct rw_semaphore.ugly_hack", and
__down_shared_failed() always succeeds if ugly_hack is set.

But now we would starve writers.

--
	Manfred


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic