List:       qemu-devel
Subject:    Re: [PATCH 2/3] target/hppa: mask offset bits in gva
From:       Sven Schnelle <svens () stackframe ! org>
Date:       2024-03-24 18:41:28
Message-ID: 87o7b31nhj.fsf () t14 ! stackframe ! org

Hi Richard,

Richard Henderson <richard.henderson@linaro.org> writes:

> In particular Figure 2-14 for "data translation disabled" may be
> instructive.  Suppose the cpu does not implement all of the physical
> address lines (true for all extant pa-risc cpus; qemu implements 40
> bits to match pa-8500 iirc).  Suppose when reporting a trap with
> translation disabled, it is a truncated physical address that is used
> as input to Figure 2-14.
> 
> If that is so, then the fix might be in hppa_set_ior_and_isr.  Perhaps
> 
> -    env->cr[CR_ISR] &= 0x3fffffff;
> +    env->cr[CR_ISR] &= 0x301fffff;
> 
> Though my argument would suggest the mask should be 0xff for the
> 40-bit physical address, which is not what you see at all, so perhaps
> the thing is moot.  I am at a loss to explain why or how HP-UX gets a
> 7-bit hole in the ISR result.
> 
> On the other hand, there are some not-well-documented shenanigans (aka
> implementation defined behaviour) between Figure H-8 and Figure H-11,
> where the 62-bit absolute address is expanded to a 64-bit logical
> physical address and then compacted to a 40-bit implementation
> physical address.
> 
> We've already got hacks in place for this in hppa_abs_to_phys_pa2_w1,
> which just truncates everything down to 40 bits.  But that's probably
> not what the processor is really doing.
> 
> Anyhow, will you please try the hppa_set_ior_and_isr change and see if
> that fixes your HP-UX problems?

The problem occurs with data address translation enabled - it works
without, which is not surprising because no exception can happen there.
But as soon as the kernel enables address translation it will hit a data
TLB miss exception because it can't find 0xfffffffffffb0500 in the page
tables. Trying to truncate the ISR in hppa_set_ior_and_isr() for the
data translation enabled case leads to this loop:

hppa_tlb_fill_excp env=0x55bf06e976e0 addr=0x3ffffffffffb0500 size=4 type=0 mmu_idx=9
hppa_tlb_find_entry env=0x55bf06e976e0 ent=0x55bf06e97b30 valid=1 va_b=0x200000 \
                va_e=0x2fffff pa=0x200000
hppa_tlb_get_physical_address env=0x55bf06e976e0 ret=-1 prot=5 addr=0x26170c \
                phys=0x26170c
hppa_tlb_flush_ent env=0x55bf06e976e0 ent=0x55bf06e97bf0 va_b=0x301ffffffffb0000 \
                va_e=0x301ffffffffb0fff pa=0xfffffffffffb0000
hppa_tlb_itlba env=0x55bf06e976e0 ent=0x55bf06e97bf0 va_b=0x301ffffffffb0000 \
                va_e=0x301ffffffffb0fff pa=0xfffffffffffb0000
hppa_tlb_itlbp env=0x55bf06e976e0 ent=0x55bf06e97bf0 access_id=0 u=1 pl2=0 pl1=0 \
                type=1 b=0 d=0 t=0

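So qemu is looking up 0x3ffffffffffb0500 in the TLB, can't find it, and
raises an exception. HP-UX says: "ah nice, I have a translation for
you", but the entry it inserts (va_b=0x301ffffffffb0000) doesn't match
the lookup address, because we're only stripping the bits in the ISR
and not in the address used for the TLB lookup.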

As I was a bit puzzled at first about what was going on, I dumped the
page tables and wrote a small program to decode them:

680000: val=000f47ff301fffff r2=110e0f0000000001 r1=01ffffffffe8ffe0 \
                phys=fffffffff47ff000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
680020: val=000f47fe301fffff r2=110e0f0000000001 r1=01ffffffffe8ffc0 \
                phys=fffffffff47fe000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
680060: val=000f47fc301fffff r2=110e0f0000000001 r1=01ffffffffe8ff80 \
                phys=fffffffff47fc000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5860: val=000fed3c301fffff r2=010e000000000001 r1=01fffffffffda780 \
                phys=fffffffffed3c000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d58e0: val=000fed38301fffff r2=010e000000000001 r1=01fffffffffda700 \
                phys=fffffffffed38000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d59a0: val=000fed32301fffff r2=010e000000000001 r1=01fffffffffda640 \
                phys=fffffffffed32000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d59e0: val=000fed30301fffff r2=110e0f0000000001 r1=01fffffffffda600 \
                phys=fffffffffed30000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5a00: val=000fed2f301fffff r2=010e000000000001 r1=01fffffffffda5e0 \
                phys=fffffffffed2f000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5a20: val=000fed2e301fffff r2=010e000000000001 r1=01fffffffffda5c0 \
                phys=fffffffffed2e000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5a40: val=000fed2d301fffff r2=010e000000000001 r1=01fffffffffda5a0 \
                phys=fffffffffed2d000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5a60: val=000fed2c301fffff r2=010e000000000001 r1=01fffffffffda580 \
                phys=fffffffffed2c000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5a80: val=000fed2b301fffff r2=010e000000000001 r1=01fffffffffda560 \
                phys=fffffffffed2b000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5aa0: val=000fed2a301fffff r2=010e000000000001 r1=01fffffffffda540 \
                phys=fffffffffed2a000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5ac0: val=000fed29301fffff r2=010e000000000001 r1=01fffffffffda520 \
                phys=fffffffffed29000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5ae0: val=000fed28301fffff r2=010e000000000001 r1=01fffffffffda500 \
                phys=fffffffffed28000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5b00: val=000fed27301fffff r2=010e000000000001 r1=01fffffffffda4e0 \
                phys=fffffffffed27000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5b20: val=000fed26301fffff r2=010e000000000001 r1=01fffffffffda4c0 \
                phys=fffffffffed26000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5b40: val=000fed25301fffff r2=010e000000000001 r1=01fffffffffda4a0 \
                phys=fffffffffed25000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5b60: val=000fed24301fffff r2=010e000000000001 r1=01fffffffffda480 \
                phys=fffffffffed24000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5b80: val=000fed23301fffff r2=010e000000000001 r1=01fffffffffda460 \
                phys=fffffffffed23000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5ba0: val=000fed22301fffff r2=110e0f0000000001 r1=01fffffffffda440 \
                phys=fffffffffed22000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5bc0: val=000fed21301fffff r2=010e000000000001 r1=01fffffffffda420 \
                phys=fffffffffed21000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5be0: val=000fed20301fffff r2=010e000000000001 r1=01fffffffffda400 \
                phys=fffffffffed20000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5de0: val=000fed10301fffff r2=010e000000000001 r1=01fffffffffda200 \
                phys=fffffffffed10000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5fe0: val=000fed00301fffff r2=110e0f0000000001 r1=01fffffffffda000 \
                phys=fffffffffed00000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7f07e0: val=000fffc0301fffff r2=010e000000000001 r1=01fffffffffff800 \
                phys=fffffffffffc0000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7f09e0: val=000fffb0301fffff r2=110e0f0000000001 r1=01fffffffffff600 \
                phys=fffffffffffb0000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)

'val' is the value constructed from IOR/ISR, r1/r2 are the args for the
idtlbt instruction, while the GVA just stays in IOR/ISR. If you look
at the val value, you'll recognize the 301fffff part in its low 32 bits.
At first I assumed a bug in how the page tables were created, but
dumping the page tables on my C3750/J6750 showed the same values.

The fastpath of the fault handler is:

         $i_dtlb_miss_2_0      
$TLB$:0002a1e0 02 a0 08 a9    mfctl      IOR,r9
$TLB$:0002a1e4 d9 21 0a 6c    extrd,u,*  r9,51,20,r1
$TLB$:0002a1e8 02 80 08 a8    mfctl      ISR,r8
$TLB$:0002a1ec 35 18 00 00    copy       r8,r24
$TLB$:0002a1f0 f0 28 06 96    depd,*     r8,43,10,r1
$TLB$:0002a1f4 d9 11 1a aa    extrd,u,*  r8,53,54,r17
         dtlb_bl_patch_2_0                     
$TLB$:0002a1f8 e8 00 18 80    b          dtlbmss_PCXU
$TLB$:0002a1fc 0a 21 02 91    xor        r1,r17,r17
         dtlbmss_PCXU                              
$TLB$:0002ae40 d9 19 03 e0    extrd,u,*  r8,31,32,r25
$TLB$:0002ae44 0b 21 02 99    xor        r1,r25,r25
$TLB$:0002ae48 f3 19 0c 0c    depd,*     r25,31,20,r24
         pdir_base_patch_017                            
$TLB$:0002ae4c 20 20 00 0a    ldil       0x500000,r1
         pdir_shift_patch_017
$TLB$:0002ae50 f0 21 00 00    depd,z,*   r1,0x3f,0x20,r1
         pdir_mask_patch_017
$TLB$:0002ae54 f0 31 04 a8    depd,*     r17,58,24,r1
$TLB$:0002ae58 0c 20 10 d1    ldd        0x0(r1),r17
$TLB$:0002ae5c bf 11 20 5a    cmpb,*<>,n r17,r24,d_target_miss_PCXU
$TLB$:0002ae60 50 29 00 20    ldd        0x10(r1),r9
$TLB$:0002ae64 0c 30 10 c8    ldd        0x8(r1),r8
$TLB$:0002ae68 d9 10 02 de    extrd,u,*  r8,0x16,0x2,r16
$TLB$:0002ae6c 8e 06 20 12    cmpib,<>,n 0x3,r16,make_nop_if_split_TLB_2_0_7
$TLB$:0002ae70 05 09 18 00    idtlbt     r9,r8
$TLB$:0002ae74 00 00 0c a0    rfi,r
$TLB$:0002ae78 08 00 02 40    nop

So the patch above was the only thing I could come up with. If you have
a better idea, let me know.

I also patched Linux to execute exactly the same instruction with the
same address (space is 0), and I saw different ISR/IOR values than
the ones presented when HP-UX is running. I think the only explanation
is that HP-UX or the firmware switches the behaviour at runtime.

