'Re: [RFC] tentative fix for drm/i915/gt regression on preempt-rt'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       linux-rt-users
Subject:    Re: [RFC] tentative fix for drm/i915/gt regression on preempt-rt
From:       Sebastian Andrzej Siewior <bigeasy () linutronix ! de>
Date:       2023-06-30 13:09:49
Message-ID: 20230630130949.coN0sVU4 () linutronix ! de
[Download RAW message or body]

On 2023-06-22 20:57:50 [-0400], Paul Gortmaker wrote:
[ longer report about what is broken.]

Commit ade8a0f598443 ("drm/i915: Make all GPU resets atomic") introduces
a preempt_disable() section around the invocation of the reset callback.
I can't find an explanation why this is needed. There was a comment
saying
| * We want to perform per-engine reset from atomic context (e.g.
| * softirq), which imposes the constraint that we cannot sleep.

but it does not state the problem with being preempted while waiting for
the reset. The commit itself does not explain why an atomic reset is
needed, it just states that it is a requirement now. On !RT we could
pull packets from a NICs and forward them other NICs for 2ms.

I've been looking over the reset callbacks and gen8_reset_engines() +
gen6_reset_engines() acquire a spinlock_t. Since this becomes a sleeping
lock on PREEMPT_RT it must not be acquired with disabled preemption.
i915_do_reset() acquires no lock but then has two udelay()s of 50us
each. Not good latency wise in a preempt-off section.

Could someone please explain why atomic is needed during reset, what
problems are introduced by a possible preemption?

Sebastian
[prev in list] [next in list] [prev in thread] [next in thread]
Configure | About | News | Add a list | Sponsored by KoreLogic