List: linux-mips
Subject: Re: [PATCH] MIPS: tlbex: fix a missing statement for HUGETLB
From: Aurelien Jarno <aurelien () aurel32 ! net>
Date: 2014-08-02 21:35:38
Message-ID: 20140802213538.GC19066 () hall ! aurel32 ! net
On Thu, Jul 31, 2014 at 10:33:55AM -0700, David Daney wrote:
> On 07/31/2014 04:54 AM, James Hogan wrote:
> >Hi,
> >
> >On 31/07/14 02:13, David Daney wrote:
> >>On 07/30/2014 05:48 PM, Huacai Chen wrote:
> >>>For non-Octeon CPU, htlb_info.huge_pte is equal to K0, but I don't
> >>>know much about Octeon. So I think you know whether we should use K0
> >>> or htlb_info.huge_pte here, since you are the original author.
> >>>
> >>
> >>This is why I requested that somebody show me a disassembly of the
> >>faulty handler. I cannot tell where the problem is unless I see that.
Here is the faulty handler, dumped on a machine affected by the bug:
| #define _PAGE_PRESENT_SHIFT 0
| #define _PAGE_READ_SHIFT 1
| #define _PAGE_WRITE_SHIFT 2
| #define _PAGE_ACCESSED_SHIFT 3
| #define _PAGE_MODIFIED_SHIFT 4
| #define _PAGE_HUGE_SHIFT 5
| #define _PAGE_SPLITTING_SHIFT 6
| #define _PAGE_GLOBAL_SHIFT 7
| #define _PAGE_VALID_SHIFT 8
| #define _PAGE_DIRTY_SHIFT 9
| #define _PFN_SHIFT 13
|
| 0000000000000000 <r4000_tlb_refill>:
| 0: 001ad1fa dsrl k0,k0,0x7
| 4: 40ba1000 dmtc0 k0,c0_entrylo0
| 8: 675a4000 daddiu k0,k0,16384
| c: 40ba1800 dmtc0 k0,c0_entrylo1
| 10: 3c1a001f lui k0,0x1f
| 14: 375ae000 ori k0,k0,0xe000
| 18: 409a2800 mtc0 k0,c0_pagemask
| 1c: 42000006 tlbwr
| 20: 10000031 b e8 <r4000_tlb_refill+0xe8>
| 24: 40802800 mtc0 zero,c0_pagemask
| 28: 10000019 b 90 <r4000_tlb_refill+0x90>
| 2c: 3c1b8095 lui k1,0x8095
| ...
| 80: 403a4000 dmfc0 k0,c0_badvaddr
| 84: 0740ffe8 bltz k0,28 <r4000_tlb_refill+0x28>
| 88: 3c1b8095 lui k1,0x8095
| 8c: df7b4fb0 ld k1,20400(k1)
| 90: 001ad6fa dsrl k0,k0,0x1b
| 94: 335a1ff8 andi k0,k0,0x1ff8
| 98: 037ad82d daddu k1,k1,k0
| 9c: 403a4000 dmfc0 k0,c0_badvaddr
| a0: df7b0000 ld k1,0(k1)
| a4: 001ad4ba dsrl k0,k0,0x12
| a8: 335a0ff8 andi k0,k0,0xff8
| ac: 037ad82d daddu k1,k1,k0
| b0: df7a0000 ld k0,0(k1)
| b4: 335a0020 andi k0,k0,0x20
| b8: 1740ffd1 bnez k0,0 <r4000_tlb_refill>
| bc: 403aa000 dmfc0 k0,c0_xcontext
| c0: df7b0000 ld k1,0(k1)
| c4: 335a0ff0 andi k0,k0,0xff0
| c8: 037ad82d daddu k1,k1,k0
| cc: df7a0000 ld k0,0(k1)
| d0: df7b0008 ld k1,8(k1)
| d4: 001ad1fa dsrl k0,k0,0x7
| d8: 40ba1000 dmtc0 k0,c0_entrylo0
| dc: 001bd9fa dsrl k1,k1,0x7
| e0: 40bb1800 dmtc0 k1,c0_entrylo1
| e4: 42000006 tlbwr
| e8: 42000018 eret
| ...
> >>Really I think the problem is in build_is_huge_pte(), where we are
> >>clobbering 'tmp' which is K0.
Indeed, that is the problem. In the above code build_is_huge_pte()
corresponds to addresses b4 and b8. It needs huge_pte loaded in K0, but
at the same time it clobbers it. That's the reason why prior to commit
2c8c53e28f1, k0 was reloaded before calling build_huge_update_entries.
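To make the clobbering concrete, here is a minimal C sketch that models the register flow of the dump above. It is only a model under stated assumptions, not kernel code: the function names and the xcontext parameter are hypothetical, and only the bit positions and instruction order are taken from the dump. Note that the dmfc0 at bc sits in the delay slot of the bnez, so it also overwrites k0 before the huge page path at address 0 runs.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions copied from the dump above. */
#define PAGE_HUGE_SHIFT   5
#define PAGE_GLOBAL_SHIFT 7
#define PAGE_HUGE         (UINT64_C(1) << PAGE_HUGE_SHIFT)

/* Hypothetical model of the faulty path: b0..bc, then the code at 0. */
static uint64_t entrylo0_clobbered(uint64_t pte, uint64_t xcontext)
{
	uint64_t k0 = pte;               /* b0: ld   k0,0(k1)           */
	k0 &= PAGE_HUGE;                 /* b4: andi k0,k0,0x20         */
	assert(k0 != 0);                 /* b8: bnez taken (huge PTE)   */
	k0 = xcontext;                   /* bc: delay slot, dmfc0       */
	return k0 >> PAGE_GLOBAL_SHIFT;  /*  0: dsrl k0,k0,0x7          */
}

/* Same path with k0 reloaded from *(k1) before the shift. */
static uint64_t entrylo0_reloaded(uint64_t pte, uint64_t xcontext)
{
	uint64_t k0 = pte;
	k0 &= PAGE_HUGE;
	assert(k0 != 0);
	k0 = xcontext;                   /* delay slot still executes   */
	k0 = pte;                        /* reload: ld k0,0(k1)         */
	return k0 >> PAGE_GLOBAL_SHIFT;
}
```

In this model the clobbered variant derives EntryLo0 from whatever is left in k0, while the reloaded variant derives it from the PTE, which is what the huge page path expects.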
> >>So you could reload tmp/K0 in build_is_huge_pte().
> >
> >b4 is apparently where it branches back to the huge page case at the
> >beginning. In that case the at register (htlb_info.huge_pte) is set to
> >*(k1+at) instead of *(k1), so loading to htlb_info.huge_pte instead of
> >k0 would I think be bad and change the behaviour. So forget my suggestion!
> >
> >On the other hand loading the pte to k0 is redundant when
> >build_fast_tlb_refill_handler is used (which depends on bbit1), and also
> >in the other case if bbit1 is available since it won't get clobbered by
> >build_is_huge_pte().
> >
> >Maybe the reload should simply be conditional on !use_bbit_insns()?
> >
>
> That was kind of my suggestion. What happens if you do something
> like (untested):
>
> --- a/arch/mips/mm/tlbex.c
> +++ b/arch/mips/mm/tlbex.c
> @@ -716,6 +716,7 @@ build_is_huge_pte(u32 **p, struct uasm_reloc **r, unsigned int tmp,
>  	} else {
>  		uasm_i_andi(p, tmp, tmp, _PAGE_HUGE);
>  		uasm_il_bnez(p, r, tmp, lid);
> +		UASM_i_LW(p, tmp, 0, pmd);
>  	}
>  }
I haven't tested it, but in my opinion this patch won't work, or at least
is suboptimal, given that build_is_huge_pte is also used to build the
r4000_tlb_load, r4000_tlb_store and r4000_tlb_modify handlers.
> or
>
> diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c
> index f99ec587..341add1 100644
> --- a/arch/mips/mm/tlbex.c
> +++ b/arch/mips/mm/tlbex.c
> @@ -1299,6 +1299,8 @@ static void build_r4000_tlb_refill_handler(void)
>  	}
>  #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT
>  	uasm_l_tlb_huge_update(&l, p);
> +	if (!use_bbit_insns())
> +		UASM_i_LW(&p, K0, 0, K1);
>  	build_huge_update_entries(&p, htlb_info.huge_pte, K1);
>  	build_huge_tlb_write_entry(&p, &l, &r, K0, tlb_random,
>  				   htlb_info.restore_scratch);
This patch fixes the issue, thanks. That said, it doesn't look fully
correct. The test should be done the same way as for
build_fast_tlb_refill_handler. For example, the fast handler is not
called on a 32-bit machine with bbit instructions, so that case would
still need to reload K0.
Now I do wonder if we should add yet another test there, or simply move
and duplicate these 3 or 4 lines in the fast and non-fast branches. The
latter would at least improve readability.
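To show where the two candidate tests differ, here is a small C sketch. It is a simplified model, not the code in tlbex.c: the function names are hypothetical, and fast_handler_used() assumes the fast handler needs only a 64-bit kernel and bbit instructions, whereas the real condition may contain further terms.

```c
#include <stdbool.h>

/* Assumption: simplified stand-in for the fast-handler selection. */
static bool fast_handler_used(bool is_64bit, bool has_bbit)
{
	return is_64bit && has_bbit;
}

/* Reload condition as written in the second quoted patch. */
static bool reload_per_patch(bool has_bbit)
{
	return !has_bbit;
}

/* Reload condition tied to the fast-handler test instead. */
static bool reload_per_fast_test(bool is_64bit, bool has_bbit)
{
	return !fast_handler_used(is_64bit, has_bbit);
}
```

Under this model the two conditions agree everywhere except on a 32-bit machine with bbit instructions, which is the case mentioned above.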
Aurelien
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net