List:       oss-security
Subject:    [oss-security] Xen Security Advisory 292 v3 (CVE-2019-17346) - x86: insufficient TLB flushing when u
From:       Xen.org security team <security () xen ! org>
Date:       2019-10-25 11:10:38
Message-ID: E1iNxUM-0002l3-Ex () xenbits ! xenproject ! org

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

            Xen Security Advisory CVE-2019-17346 / XSA-292
                              version 3

            x86: insufficient TLB flushing when using PCID

UPDATES IN VERSION 3
====================

CVE assigned.

ISSUE DESCRIPTION
=================

Use of Process Context Identifiers (PCID) was introduced into Xen in
order to improve performance after XSA-254 (and in particular its
Meltdown sub-issue).  This enablement implied changes to the TLB
flushing logic.  The particular case of context switch to a vCPU of a
PCID-enabled guest left open a time window between the full TLB flush,
and the actual address space switch, during which additional TLB
entries (from the address space about to be switched away from) can be
accumulated, which will not subsequently be purged.
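
The window described above can be illustrated with a toy model.  This
is only a sketch under loudly stated assumptions: the names toy_tlb,
toy_fill, toy_write_cr3 and toy_context_switch are hypothetical and do
not exist in Xen, the "root" field is bookkeeping that a real TLB does
not have, and the model ignores global pages and CR4 entirely.  It
shows only that entries created between the full flush and the CR3
write survive whenever that write does not flush the PCID they are
tagged with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_TLB_SLOTS 16

/* One cached translation, tagged with the PCID that was active when it
 * was created.  "root" records the originating address space purely for
 * illustration; hardware TLBs keep no such field. */
struct toy_entry {
    bool valid;
    unsigned int pcid;
    unsigned int root;
};

struct toy_tlb {
    struct toy_entry slot[TOY_TLB_SLOTS];
    unsigned int cur_pcid;   /* PCID currently loaded in the toy CR3 */
    unsigned int cur_root;   /* address space currently loaded */
};

/* Full flush: drop every entry regardless of its tag. */
static void toy_flush_all(struct toy_tlb *t)
{
    for (size_t i = 0; i < TOY_TLB_SLOTS; i++)
        t->slot[i].valid = false;
}

/* A memory access (e.g. from an NMI handler) caches one translation,
 * tagged with whatever PCID and address space are current right now. */
static void toy_fill(struct toy_tlb *t)
{
    for (size_t i = 0; i < TOY_TLB_SLOTS; i++) {
        if (!t->slot[i].valid) {
            t->slot[i] = (struct toy_entry){ true, t->cur_pcid, t->cur_root };
            return;
        }
    }
}

/* Toy CR3 write: without the no-flush bit, entries tagged with the NEW
 * PCID are dropped; with it, nothing is dropped at all. */
static void toy_write_cr3(struct toy_tlb *t, unsigned int root,
                          unsigned int pcid, bool noflush)
{
    if (!noflush)
        for (size_t i = 0; i < TOY_TLB_SLOTS; i++)
            if (t->slot[i].pcid == pcid)
                t->slot[i].valid = false;
    t->cur_root = root;
    t->cur_pcid = pcid;
}

/* Full flush, then window_fills accesses in the old address space, then
 * the address space switch.  Returns how many entries from the old
 * address space are still cached afterwards. */
static size_t toy_context_switch(struct toy_tlb *t, unsigned int new_root,
                                 unsigned int new_pcid, bool noflush,
                                 size_t window_fills)
{
    size_t left = 0;

    assert(window_fills <= TOY_TLB_SLOTS);
    toy_flush_all(t);
    for (size_t i = 0; i < window_fills; i++)
        toy_fill(t);
    toy_write_cr3(t, new_root, new_pcid, noflush);

    for (size_t i = 0; i < TOY_TLB_SLOTS; i++)
        left += t->slot[i].valid;
    return left;
}
```

In this model, switching to a different address space that uses the
same PCID with the no-flush bit set leaves every entry from the window
in place.  Switching to a different PCID leaves them in place as well,
tagged with the old PCID, whether or not the no-flush bit is set.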

IMPACT
======

Malicious PV guests may be able to cause a host crash (Denial of
Service) or to gain access to data pertaining to other guests.
Privilege escalation opportunities cannot be ruled out.

Additionally, vulnerable configurations are likely to be unstable even
in the absence of an attack.

VULNERABLE SYSTEMS
==================

Only x86 systems are vulnerable.  ARM systems are not vulnerable.

Only systems running x86 PV guests are vulnerable.  Systems running
only x86 HVM or PVH guests are not vulnerable.

Only systems with at least one PCID-enabled PV guest are vulnerable.

Systems where PCID or INVPCID are unavailable or entirely disabled are
not vulnerable.

Note that PCID is enabled by default for both 64-bit dom0 and 64-bit
domU when hardware supports it.  PCID acceleration has been backported
to the following versions:
 - Xen 4.11.x,
 - Xen 4.10.2 and onwards,
 - Xen 4.9.3 and onwards,
 - Xen 4.8.4 and onwards,
 - Xen 4.7.6.

To exploit this vulnerability, problematic TLB entries must be created
between the full TLB flush and the address space switch.  The NMI
watchdog handler (enabled via the "watchdog" command line option) is
known to create such entries; other vectors cannot be ruled out.

MITIGATION
==========

Running only HVM or PVH guests will avoid this vulnerability.

Running only 32-bit PV guests alongside the other two types mentioned
above will also avoid this vulnerability, provided Dom0 is also 32-bit
or is not using PCID.  Making a 64-bit Dom0 not use PCID can be achieved
by e.g. "xpti=no-dom0 pcid=xpti".

Disabling use of PCID entirely, by passing "pcid=0" or "invpcid=0" as a
command line option to the hypervisor, will also avoid this
vulnerability (albeit re-introducing the XPTI performance regression
use of PCID was intended to reduce).

Disabling the watchdog timer will remove the only known way of reliably
creating problematic TLB entries, potentially reducing the risk of a
successful attack.

CREDITS
=======

This issue was discovered by Sergey Dyasli and Andrew Cooper of Citrix.

RESOLUTION
==========

Applying the attached patch resolves this issue.

xsa292.patch           xen-unstable, Xen 4.11.x ... Xen 4.7.6

$ sha256sum xsa292*
c515e98e5ae8a16bc5c894741eea5523a7e568f81ee8a570626dcc0f58f40b40  xsa292.meta
f42cb5e1eae5a5c6f0fd84e38df4db9f09a4e1176905c37f292fef9855c82fea  xsa292.patch
$

DEPLOYMENT DURING EMBARGO
=========================

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.

(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-----BEGIN PGP SIGNATURE-----

iQFABAEBCAAqFiEEI+MiLBRfRHX6gGCng/4UyVfoK9kFAl2y1+cMHHBncEB4ZW4u
b3JnAAoJEIP+FMlX6CvZV48H/i1Wi6DV90quHvewv0j792crdJojnHgq/8V3+hfT
lXWcmfW5IQLi02o4aG7XjUYwRTQ6clRgF4AZDZyrAY15QyVCz9diusvWOUzaq7Pd
hrvuIMeaB3+ba2OY7bB3P0sCekhhj6MwqKEhGVlbLEB8A0vGq9XjZBuTmws6QA2J
6Il8fxEVupdtETsf3KlYfxvJOubN/B+tByaIpdWU0C2M66EVa4pcijSLcvoylGxi
YS7jJrSMcqg4Sx/e/HnzCJ7jrvzhxSDHeyhPy1/NrwlQz2NQjd+FoFownsH48LuH
6LA6GGTIk5v+a/GtNVpb8Wwfg0UleabF+8S30C6QasUO70E=
=Pk5K
-----END PGP SIGNATURE-----

["xsa292.meta" (application/octet-stream)]
["xsa292.patch" (application/octet-stream)]

From: Jan Beulich <jbeulich@suse.com>
Subject: x86/mm: properly flush TLB in switch_cr3_cr4()

The CR3 values used for contexts run with PCID enabled uniformly have
CR3.NOFLUSH set, so the CR3 write itself does not cause any flushing
at all. When the second CR4 write is skipped or doesn't do any
flushing, there's nothing so far which would purge TLB entries which may
have accumulated again if the PCID doesn't change; the "just in case"
flush only affects the case where the PCID actually changes. (There may
be particularly many TLB entries re-accumulated in case of a watchdog
NMI kicking in during the critical time window.)

Suppress the no-flush behavior of the CR3 write in this particular case.

Similarly the second CR4 write may not cause any flushing of TLB entries
established again while the original PCID was still in use - it may get
performed because of unrelated bits changing. The flush of the old PCID
needs to happen nevertheless.

At the same time also eliminate a possible race with lazy context
switch: Just like for CR4, CR3 may change at any time while interrupts
are enabled, due to the __sync_local_execstate() invocation from the
flush IPI handler. It is for that reason that the CR3 read, just like
the CR4 one, must happen only after interrupts have been turned off.

This is XSA-292.

Reported-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v3: Adjust comments. Drop old_cr4 from the PGE check in the expression
    controlling the invocation of invpcid_flush_single_context(), as PGE
    is always clear there.
v2: Decouple invpcid_flush_single_context() from 2nd CR4 write.

--- a/xen/arch/x86/flushtlb.c
+++ b/xen/arch/x86/flushtlb.c
@@ -103,9 +103,8 @@ static void do_tlb_flush(void)
 
 void switch_cr3_cr4(unsigned long cr3, unsigned long cr4)
 {
-    unsigned long flags, old_cr4;
+    unsigned long flags, old_cr4, old_pcid;
     u32 t;
-    unsigned long old_pcid = cr3_pcid(read_cr3());
 
     /* This non-reentrant function is sometimes called in interrupt context. */
     local_irq_save(flags);
@@ -133,15 +132,38 @@ void switch_cr3_cr4(unsigned long cr3, u
          */
         invpcid_flush_all_nonglobals();
 
+    /*
+     * If we don't change PCIDs, the CR3 write below needs to flush this very
+     * PCID, even when a full flush was performed above, as we are currently
+     * accumulating TLB entries again from the old address space.
+     * NB: Clearing the bit when we don't use PCID is benign (as it is clear
+     * already in that case), but allows the if() to be more simple.
+     */
+    old_pcid = cr3_pcid(read_cr3());
+    if ( old_pcid == cr3_pcid(cr3) )
+        cr3 &= ~X86_CR3_NOFLUSH;
+
     write_cr3(cr3);
 
     if ( old_cr4 != cr4 )
         write_cr4(cr4);
-    else if ( old_pcid != cr3_pcid(cr3) )
-        /*
-         * Make sure no TLB entries related to the old PCID created between
-         * flushing the TLB and writing the new %cr3 value remain in the TLB.
-         */
+
+    /*
+     * Make sure no TLB entries related to the old PCID created between
+     * flushing the TLB and writing the new %cr3 value remain in the TLB.
+     *
+     * The write to CR4 just above has performed a wider flush in certain
+     * cases, which therefore get excluded here. Since that write is
+     * conditional, note in particular that it won't be skipped if PCIDE
+     * transitions from 1 to 0. This is because the CR4 write further up will
+     * have been skipped in this case, as PCIDE and PGE won't both be set at
+     * the same time.
+     *
+     * Note also that PGE is always clear in old_cr4.
+     */
+    if ( old_pcid != cr3_pcid(cr3) &&
+         !(cr4 & X86_CR4_PGE) &&
+         (old_cr4 & X86_CR4_PCIDE) <= (cr4 & X86_CR4_PCIDE) )
         invpcid_flush_single_context(old_pcid);
 
     post_flush(t);
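
The two decisions the patch adds to switch_cr3_cr4() can be restated as
a side-effect-free function, which may make the conditions easier to
check by hand.  This is a minimal sketch, not Xen code: struct
switch_plan and plan_switch() are hypothetical names, the bit positions
are the architectural ones (CR3 bits 11:0 for the PCID, CR3 bit 63 for
no-flush, CR4 bit 7 for PGE, CR4 bit 17 for PCIDE), and old_cr3 is
passed in as a parameter, whereas the real function reads CR3 only
after interrupts have been disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X86_CR3_PCID_MASK 0xfffULL       /* CR3 bits 11:0 */
#define X86_CR3_NOFLUSH   (1ULL << 63)   /* CR3 bit 63 */
#define X86_CR4_PGE       (1ULL << 7)
#define X86_CR4_PCIDE     (1ULL << 17)

/* Hypothetical result type: what the patched logic would do. */
struct switch_plan {
    uint64_t cr3_to_write;   /* CR3 value after the NOFLUSH adjustment */
    bool write_cr4;          /* whether the second CR4 write happens */
    bool flush_old_pcid;     /* whether the old PCID is flushed explicitly */
};

static uint64_t cr3_pcid(uint64_t cr3)
{
    return cr3 & X86_CR3_PCID_MASK;
}

static struct switch_plan plan_switch(uint64_t old_cr3, uint64_t old_cr4,
                                      uint64_t cr3, uint64_t cr4)
{
    struct switch_plan p;
    uint64_t old_pcid = cr3_pcid(old_cr3);

    /* The patch notes that PGE is always clear in old_cr4 at this point. */
    assert(!(old_cr4 & X86_CR4_PGE));

    /* Same PCID: the CR3 write itself must flush it, so drop NOFLUSH. */
    if (old_pcid == cr3_pcid(cr3))
        cr3 &= ~X86_CR3_NOFLUSH;

    p.cr3_to_write = cr3;
    p.write_cr4 = (old_cr4 != cr4);

    /* Different PCID: flush the old one explicitly, except in the cases
     * the patch excludes because the CR4 write flushes more widely. */
    p.flush_old_pcid = old_pcid != cr3_pcid(cr3) &&
                       !(cr4 & X86_CR4_PGE) &&
                       (old_cr4 & X86_CR4_PCIDE) <= (cr4 & X86_CR4_PCIDE);
    return p;
}
```

For example, with PCIDE set in both CR4 values and the PCID unchanged,
the plan clears the no-flush bit and requests no separate flush.  With
a changed PCID it keeps the no-flush bit and requests a flush of the
old PCID instead.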

