List:       linux-pci
Subject:    Re: Disabling ASPM L1 sub-states selectively from drivers?
From:       Heiner Kallweit <hkallweit1 () gmail ! com>
Date:       2019-02-27 23:58:57
Message-ID: eb125be0-e3c8-750a-fa75-cb8dbf374b62 () gmail ! com

On 26.02.2019 23:41, Bjorn Helgaas wrote:
> Hi Heiner,
> 
> On Fri, Feb 22, 2019 at 12:16 AM Heiner Kallweit <hkallweit1@gmail.com> wrote:
>>
>> I face the issue that a PCIe network chip misses RX packets if ASPM L1
>> sub-states are enabled. It seems that the RX FIFO is too small to buffer
>> all incoming packets during ASPM exit latency.
>>
>> So far pci_disable_link_state() only allows disabling L1 completely.
>> Would it make sense to extend this function to allow disabling
>> L1 sub-states selectively? Looking at pcie_config_aspm_link() this
>> seems to be possible.
> 
> We could certainly explore the option of selectively disabling L1 substates.
> 
> But before we do that, let's look at a couple things, because there
> are some Linux issues in that area, and it's possible we could make a
> generic fix that wouldn't require disabling the substates completely.
> 
> One problem is the ASPM L1.2 state depends on LTR information, and we
> don't support LTR correctly.  There are a couple patches in -next to
> fix some problems, but we still don't handle cases where the BIOS
> doesn't program the LTR latencies and the LTR_L1.2_Threshold.
> 
> Can you open a report at bugzilla.kernel.org and attach the complete
> dmesg log (booted with "pci=earlydump") and the "sudo lspci -vvvxx"
> output?
> 
I don't have a system where I can reproduce the issue; it was reported
by a user: https://bugzilla.redhat.com/show_bug.cgi?id=1671958
Comments 61 and 65 there contain the lspci -vv output for the network
chip.

> Have you figured out which L1 substate specifically causes problems?
> If not, maybe we can use setpci to fiddle with things manually and
> narrow it down.
> 
So far all I can say is that with ASPM disabled completely the issue
doesn't occur. I'd have to see how to disable L1.2 only with setpci
and then let the user test.
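
A minimal sketch of how the L1.2 enable bits could be masked out for such
a test. The bit values are those of the L1 PM Substates Control 1 register
(offset 0x08 in extended capability 0x001e) as defined in the kernel's
pci_regs.h. The setpci register spelling "ECAP1e+8.l" and the device
address used in the usage note are assumptions and should be checked
against the setpci man page and the user's lspci output before anyone
runs the command.

```python
# Enable bits in L1 PM Substates Control 1 (names follow
# include/uapi/linux/pci_regs.h).
PCI_L1SS_CTL1_PCIPM_L1_2 = 0x1
PCI_L1SS_CTL1_PCIPM_L1_1 = 0x2
PCI_L1SS_CTL1_ASPM_L1_2 = 0x4
PCI_L1SS_CTL1_ASPM_L1_1 = 0x8

# Both L1.2 flavours (PCI-PM and ASPM); the L1.1 bits are left untouched.
L1_2_MASK = PCI_L1SS_CTL1_PCIPM_L1_2 | PCI_L1SS_CTL1_ASPM_L1_2


def clear_l1_2(ctl1: int) -> int:
    """Return the 32-bit register value with both L1.2 enable bits cleared."""
    return ctl1 & ~L1_2_MASK & 0xFFFFFFFF


def setpci_command(device: str) -> str:
    """Build a setpci call that writes 0 to the masked L1.2 bits only.

    Assumption: setpci accepts "ECAP1e" as the name of extended
    capability 0x1e and "value:mask" for a masked write.
    """
    return f"setpci -s {device} ECAP1e+8.l=0:{L1_2_MASK:x}"
```

For a hypothetical device at 03:00.0, setpci_command("03:00.0") yields
"setpci -s 03:00.0 ECAP1e+8.l=0:5". The same change would presumably be
needed on the upstream port of the link as well.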

> Bjorn
> 
Heiner