'Re: [Patch] shm bug introduced with pagecache in 2.3.11'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       linux-kernel
Subject:    Re: [Patch] shm bug introduced with pagecache in 2.3.11
From:       Linus Torvalds <torvalds () transmeta ! com>
Date:       1999-11-21 0:33:49
[Download RAW message or body]



On Sat, 20 Nov 1999, Marcelo Tosatti wrote:
>
> http://bazar.conectiva.com.br/~marcelo/rwsem-2.3.18ac7.patch
> This code is a Linux "port" of the psedo-code implementation found in the
> "Unix Kernel Internals" book i wrote some time ago.

Well, if it's a port of that, then it won't have the 2-instruction
fast-path that is pretty much required, imho.

I'll see if I can get a free afternoon some day and try to port the
current x86 semaphore code over to a rw version too. The plan was
something like this:

 - read_down():

	lock ; incl mem
	js	contention_rw

 - read_up():

	lock ; decl mem
	js	wake_up_writer

 - write_down():

	lock ; btsl $31,mem
	jc	contention_ww
	testl $0x7fffffff,mem
	jne	contention_wr

 - write_up():

	lock ; andl $0x7fffffff,mem
	jne	wake_up_reader_or_writer

where all the three contention cases grab a "contention spinlock" before
they then start sorting things out. The only interesting part is making
sure that the contention case gets the wakeups, and the above counts on:

 - if a writer is waiting for readers (contention_wr), then the writer
   will have already set the high bit, and a reader will know to wake it
   up because the rw-semaphore value will be negative when it does
   read_up().

 - if a reader is waiting for a writer, then the reader will have
   incremented the semaphore, and the writer will know to wake it up
   becasue the semaphore value won't be zero after the "write_up()".

 - if a writer is waiting for another writer (contention_ww case), it will
   have to increment the "reader" part of the semaphore value, in order to
   get the other writer to wake it up on "write_up()".

All other races should be trivially handled by just having the spinlock,
so the only really hard cases are the fast-path stuff where we cannot get
the semaphore because it is too expensive.

Does anybody see any holes in the above pseudo-implementation? Please take
a look at the way the current x86 semaphores are implemented: they use
exactly the above kinds of single-atomic-instruction-plus-condition-codes
trickery to get the non-contention case without _any_ extra instructions.

		Linus


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/

[prev in list] [next in list] [prev in thread] [next in thread]
Configure | About | News | Add a list | Sponsored by KoreLogic