[prev in list] [next in list] [prev in thread] [next in thread]
List: linux-parisc
Subject: Re: [PATCH v2] parisc: Rewrite light-weight syscall and futex code
From: John David Anglin <dave.anglin () bell ! net>
Date: 2021-12-23 21:36:53
Message-ID: 4c4608c7-0c24-9bf7-02f1-27acad108aaf () bell ! net
[Download RAW message or body]
On 2021-12-23 4:13 p.m., Helge Deller wrote:
> On 12/23/21 21:14, John David Anglin wrote:
> > On 2021-12-23 2:31 p.m., Helge Deller wrote:
> > > On 12/23/21 20:17, John David Anglin wrote:
> > > > On 2021-12-23 1:47 p.m., Helge Deller wrote:
> > > > > ...
> > > > > > In order to do this, we need a mechanism to trigger COW breaks outside \
> > > > > > the critical region. Fortunately, parisc has the "stbys,e" instruction. \
> > > > > > When the leftmost byte of a word is addressed, this instruction triggers \
> > > > > > all the exceptions of a normal store but it does not write to memory. \
> > > > > > Thus, we can use it to trigger COW breaks outside the critical region \
> > > > > > without modifying the data that is to be updated atomically.
> > > > > ...
> > > > > > diff --git a/arch/parisc/include/asm/futex.h \
> > > > > > b/arch/parisc/include/asm/futex.h index 9cd4dd6e63ad..8f97db995b04 100644
> > > > > > --- a/arch/parisc/include/asm/futex.h
> > > > > > +++ b/arch/parisc/include/asm/futex.h
> > > > > ...
> > > > > > +static inline bool _futex_force_interruptions(unsigned long ua)
> > > > > > +{
> > > > > > + bool result;
> > > > > > +
> > > > > > + __asm__ __volatile__(
> > > > > > + "1:\tldw 0(%1), %0\n"
> > > > > > + "2:\tstbys,e %%r0, 0(%1)\n"
> > > > > > + "\tcomclr,= %%r0, %%r0, %0\n"
> > > > > > + "3:\tldi 1, %0\n"
> > > > > > + ASM_EXCEPTIONTABLE_ENTRY(1b, 3b)
> > > > > > + ASM_EXCEPTIONTABLE_ENTRY(2b, 3b)
> > > > > > + : "=&r" (result) : "r" (ua) : "memory"
> > > > > > + );
> > > > > > + return result;
> > > > > I wonder if we can get rid of the comclr,= instruction by using
> > > > > ASM_EXCEPTIONTABLE_ENTRY_EFAULT instead of ASM_EXCEPTIONTABLE_ENTRY,
> > > > > e.g.:
> > > > >
> > > > > diff --git a/arch/parisc/include/asm/futex.h \
> > > > > b/arch/parisc/include/asm/futex.h index 8f97db995b04..ea052f013865 100644
> > > > > --- a/arch/parisc/include/asm/futex.h
> > > > > +++ b/arch/parisc/include/asm/futex.h
> > > > > @@ -21,20 +21,21 @@ static inline unsigned long _futex_hash_index(unsigned \
> > > > > long ua)
> > > > > * if load and store fault.
> > > > > */
> > > > >
> > > > > -static inline bool _futex_force_interruptions(unsigned long ua)
> > > > > +static inline unsigned long _futex_force_interruptions(unsigned long ua)
> > > > > {
> > > > > - bool result;
> > > > > + register unsigned long error __asm__ ("r8") = 0;
> > > > > + register unsigned long temp;
> > > > >
> > > > > __asm__ __volatile__(
> > > > > - "1:\tldw 0(%1), %0\n"
> > > > > - "2:\tstbys,e %%r0, 0(%1)\n"
> > > > > - "\tcomclr,= %%r0, %%r0, %0\n"
> > > > > - "3:\tldi 1, %0\n"
> > > > > - ASM_EXCEPTIONTABLE_ENTRY(1b, 3b)
> > > > > - ASM_EXCEPTIONTABLE_ENTRY(2b, 3b)
> > > > > - : "=&r" (result) : "r" (ua) : "memory"
> > > > > + "1:\tldw 0(%2), %0\n"
> > > > > + "2:\tstbys,e %%r0, 0(%2)\n"
> > > > > + "3:\n"
> > > > > + ASM_EXCEPTIONTABLE_ENTRY_EFAULT(1b, 3b)
> > > > > + ASM_EXCEPTIONTABLE_ENTRY_EFAULT(2b, 3b)
> > > > > + : "=r" (temp), "=r" (error)
> > > > > + : "r" (ua), "1" (error) : "memory"
> > > > > );
> > > > > - return result;
> > > > > + return error;
> > > > > }
> > > > I don't think this is a win.
> > > >
> > > > 1) Register %r8 needs to get loaded with 0. So, that's one instruction.
> > > True. On the other hand you don't need the "ldi 1,%0" then, which brings \
> > > parity.
> > > > 2) Register %r8 is a caller saves register. Using it will cause gcc to save \
> > > > and restore it from stack. This may cause a stack frame to be created when \
> > > > one isn't needed. The save and restore instructions are more expensive, \
> > > > particularly if they cause a TLB miss.
> > > Because of this reason, wouln't it make sense to switch the uaccess functions \
> > > away from r8 too, and use another temp register in both?
> > Yes. I think it would be best to use %r28. Then, error will be in correct \
> > register for return.
> r29 seems to generate smaller code than r28, so I choosed r29.
Okay. Maybe it won't matter in an inline function.
>
> > The variables temp and error can combined.
> Then I need the copy %r0,%error in asm code.
> It's no option to use ldw with target register %r0 ?
Correct. It's a cache prefetch instruction and doesn't trigger exceptions.
I'm rewriting routine.
Dave
--
John David Anglin dave.anglin@bell.net
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic