List:       linux-kernel
Subject:    Re: [PATCH -next for tip:x86/pti] x86/tlb: drop unneeded local vars in enable_l1d_flush_for_task()
From:       "Singh, Balbir" <sblbir@amazon.com>
Date:       2020-09-30 23:49:30
Message-ID: 19f57cbe-ba33-17d5-440c-2765e670782f@amazon.com

On 1/10/20 7:38 am, Thomas Gleixner wrote:

> 
> On Wed, Sep 30 2020 at 20:35, Peter Zijlstra wrote:
> > On Wed, Sep 30, 2020 at 08:00:59PM +0200, Thomas Gleixner wrote:
> > > On Wed, Sep 30 2020 at 19:03, Peter Zijlstra wrote:
> > > > On Wed, Sep 30, 2020 at 05:40:08PM +0200, Thomas Gleixner wrote:
> > > > Also, that preempt_disable() in there doesn't actually do anything.
> > > > Worse, preempt_disable(); for_each_cpu(); is an anti-pattern. It mixes
> > > > static_cpu_has() and boot_cpu_has() in the same bloody condition and has
> > > > a pointless ret variable.
> > 
> > Also, I forgot to add, it accesses ->cpus_mask without the proper
> > locking, so it could be reading intermediate state from whatever cpumask
> > operation that's in progress.
> 
> Yes. I saw that after hitting send. :(
> 
> > > I absolutely agree and I really missed it when looking at it before
> > > merging. cpus_read_lock()/unlock() is the right thing to do if at all.
> > > 
> > > > It's shoddy code that only works if you align the planets right. We
> > > > really shouldn't provide interfaces that are this bad.
> > > > 
> > > > Its correct operation is only by accident.
> > > 
> > > True :(
> > > 
> > > I understand Balbir's problem and it makes some sense to provide a
> > > solution. We can:
> > > 
> > >   1) reject set_affinity() if the task has that flush muck enabled
> > >      and user space tries to move it to an SMT enabled core
> > > 
> > >   2) disable the muck if it detects that it suddenly runs on an SMT
> > >      enabled core (hotplug says hello)
> > > 
> > >      This one is nasty because there is no feedback to user space
> > >      about the wreckage.
> > 
> > That's an AND, right, not an OR, because 1) deals with sched_setaffinity()
> > and 2) deals with hotplug.
> 
> It was meant as AND of course.
> 
> > Now 1) requires an arch hook in sched_setaffinity(), something I'm not
> > keen on providing; once we provide it, who knows what strange and
> > wonderful things archs will dream up.
> 
> I don't think so. We can have that magic in core:
> 
> #ifdef CONFIG_HAS_PARANOID_L1D_FLUSH
> static bool paranoid_l1d_valid(struct task_struct *tsk,
> 			       const struct cpumask *msk)
> {
> 	if (!test_tsk_thread_flag(tsk, TIF_SPEC_L1D_FLUSH))
> 		return true;
> 	/* Do magic stuff */
> 	return res;
> }
> #else
> static bool paranoid_l1d_valid(struct task_struct *tsk,
> 			       const struct cpumask *msk)
> {
> 	return true;
> }
> #endif
> 
> It's a pretty well defined problem and having the magic in core code
> prevents an arch hook which allows abuse of all sorts.
> 
> And the same applies to enable_l1d_flush_for_task(). The only
> architecture specific nonsense are the checks whether the CPU bug is
> there and whether the hardware supports L1D flushing.
> 
> So we can have:
> 
> #ifdef CONFIG_HAS_PARANOID_L1D_FLUSH
> int paranoid_l1d_enable(struct task_struct *tsk)
> {
> 	/* Do the SMT validation under the proper locks */
> 	if (!res)
> 		set_tsk_thread_flag(tsk, TIF_SPEC_L1D_FLUSH);
> 	return res;
> }
> #endif
> 
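That split makes sense to me. Just to spell out what the "magic stuff" could look like, below is a rough, untested sketch of the core-side check. The SMT test via topology_sibling_cpumask() is only my guess at keeping the core code arch-neutral, and it assumes the helper runs under the same locks that protect ->cpus_mask updates (i.e. from the sched_setaffinity() path):

#include <linux/cpumask.h>
#include <linux/sched.h>
#include <linux/topology.h>

/*
 * Rough sketch only: validate a new affinity mask for a task that has
 * opted in to L1D flushing.  Returns false if any CPU in the mask has
 * an online SMT sibling.  Assumed to be called under the locks that
 * protect ->cpus_mask updates, so the mask cannot change underneath us.
 */
static bool paranoid_l1d_valid(struct task_struct *tsk,
			       const struct cpumask *msk)
{
	int cpu;

	if (!test_tsk_thread_flag(tsk, TIF_SPEC_L1D_FLUSH))
		return true;

	for_each_cpu(cpu, msk) {
		/* More than one online sibling => SMT is active on this core */
		if (cpumask_weight(topology_sibling_cpumask(cpu)) > 1)
			return false;
	}
	return true;
}
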
> > And 2) also happens on hot-un-plug, when the task's affinity gets
> > forced because it became empty. No user feedback there either, and
> > information is lost.
> 
> Of course. It's both that suddenly SMT gets enabled on a core which was
> isolated and when the last isolated core in the task's CPU mask goes
> offline.
> 
> > I suppose we can do 2) but send a signal. That would cover all cases and
> > keep it in arch code. But yes, that's pretty terrible too.
> 
> Bah. I just looked at the condition to flush:
> 
> 	if (sched_smt_active() && !this_cpu_read(cpu_info.smt_active) &&
> 	    (prev_mm & LAST_USER_MM_L1D_FLUSH))
> 		l1d_flush_hw();
> 
> That fails to flush when SMT is disabled globally. Balbir?
> 
> Of course this should be:
> 
> 	if (!this_cpu_read(cpu_info.smt_active) && (prev_mm & LAST_USER_MM_L1D_FLUSH))
> 		l1d_flush_hw();
> 
> Now we can make this:
> 
> 	if (unlikely(prev_mm & LAST_USER_MM_L1D_FLUSH)) {
> 		if (!this_cpu_read(cpu_info.smt_active))
> 			l1d_flush_hw();
> 		else
> 			task_work_add(...);
> 	}
> 
> And that task work clears the flag and sends a signal. We're not going
> to send a signal from switch_mm() ....
> 
> Thanks,
> 


So this is the change I am playing with. I don't like the idea of killing the task, but it's better than
silently not flushing; I guess system administrators will learn with time to set the affinity of tasks
flushing L1D correctly. For the affinity bits, not being able to change the affinity is better, but not
being able to give feedback as to why is a bit weird as well. I do wonder if there are other cases
where we might want to lock the affinity of a task for its lifetime.

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 6b0f4c88b07c..6b0d0a9cd447 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -320,26 +320,15 @@ int enable_l1d_flush_for_task(struct task_struct *tsk)
 
 	/*
 	 * Do not enable L1D_FLUSH_OUT if
-	 * b. The CPU is not affected by the L1TF bug
-	 * c. The CPU does not have L1D FLUSH feature support
-	 * c. The task's affinity is on cores with SMT on.
+	 * a. The CPU is not affected by the L1TF bug
+	 * b. The CPU does not have L1D FLUSH feature support
 	 */
 
 	if (!boot_cpu_has_bug(X86_BUG_L1TF) ||
-			!static_cpu_has(X86_FEATURE_FLUSH_L1D))
+		!boot_cpu_has(X86_FEATURE_FLUSH_L1D))
 		return -EINVAL;
 
-	cpu = get_cpu();
-
-	for_each_cpu(i, &tsk->cpus_mask) {
-		if (cpu_data(i).smt_active == true) {
-			put_cpu();
-			return -EINVAL;
-		}
-	}
-
 	set_ti_thread_flag(&tsk->thread_info, TIF_SPEC_L1D_FLUSH);
-	put_cpu();
 	return ret;
 }
 
@@ -349,6 +338,12 @@ int disable_l1d_flush_for_task(struct task_struct *tsk)
 	return 0;
 }
 
+static void l1d_flush_kill(struct callback_head *ch)
+{
+	clear_ti_thread_flag(&current->thread_info, TIF_SPEC_L1D_FLUSH);
+	force_sig(SIGBUS);
+}
+
 void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 	       struct task_struct *tsk)
 {
@@ -443,12 +438,15 @@ static void cond_mitigation(struct task_struct *next)
 	}
 
 	/*
-	 * Flush only if SMT is disabled as per the contract, which is checked
-	 * when the feature is enabled.
+	 * Flush only if SMT is disabled. If flushing is enabled and we
+	 * are on an SMT enabled core, kill the task instead.
 	 */
-	if (sched_smt_active() && !this_cpu_read(cpu_info.smt_active) &&
-		(prev_mm & LAST_USER_MM_L1D_FLUSH))
-		l1d_flush_hw();
+	if (unlikely(prev_mm & LAST_USER_MM_L1D_FLUSH)) {
+		if (!this_cpu_read(cpu_info.smt_active))
+			l1d_flush_hw();
+		else
+			task_work_add(prev, l1d_flush_kill, true);
+	}
 
 	this_cpu_write(cpu_tlbstate.last_user_mm_spec, next_mm);
 }
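
The task_work bits in the diff are still hand-wavy: task_work_add() wants a persistent struct callback_head rather than a bare function pointer, the SIGBUS should be raised from the task's own context rather than from switch_mm(), and cond_mitigation() only sees prev_mm and next, so which task_struct to queue the work on still needs sorting out. Roughly what I think the plumbing would look like -- untested, names are illustrative, and the ->l1d_flush_work field in task_struct is made up for the sketch:

#include <linux/sched.h>
#include <linux/sched/signal.h>
#include <linux/task_work.h>

/*
 * Sketch only: runs via task_work in the context of the task itself on
 * its way back to user space, so force_sig() is safe here (unlike in
 * switch_mm()).  Clears the opt-in flag and then raises SIGBUS.
 */
static void l1d_flush_kill(struct callback_head *ch)
{
	clear_ti_thread_flag(&current->thread_info, TIF_SPEC_L1D_FLUSH);
	force_sig(SIGBUS);
}

/*
 * The callback_head must stay valid until the work runs, so it has to
 * live somewhere stable; ->l1d_flush_work is a hypothetical field in
 * task_struct used only for this sketch.
 */
static void queue_l1d_flush_kill(struct task_struct *tsk)
{
	init_task_work(&tsk->l1d_flush_work, l1d_flush_kill);
	/* last argument asks task_work to notify the task so the work runs soon */
	task_work_add(tsk, &tsk->l1d_flush_work, true);
}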

