
List:       oss-security
Subject:    [oss-security] Xen Security Advisory 247 - Missing p2m error checking in PoD code
From:       Xen.org security team <security@xen.org>
Date:       2017-11-28 12:00:24
Message-ID: E1eJeYq-0000OT-94@xenbits.xenproject.org

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

                    Xen Security Advisory XSA-247
                              version 2

                 Missing p2m error checking in PoD code

UPDATES IN VERSION 2
====================

Public release.

ISSUE DESCRIPTION
=================

Certain actions require modification of entries in a guest's P2M
(Physical-to-Machine) table.  When large pages are in use for this
table, such an operation may incur a memory allocation (to replace a
large mapping with individual smaller ones).  If this allocation
fails, the p2m_set_entry() function will return an error.

Unfortunately, several places in the populate-on-demand code don't
check the return value of p2m_set_entry() to see if it succeeded.

In some cases, the operation was meant to remove an entry from the p2m
table.  If this removal fails, a malicious guest can arrange for the
page to be returned to the Xen free list, making it available for
allocation to another domain, while the guest retains a writable
mapping to the page.

In other cases, the operation was meant to remove special
populate-on-demand entries; if this removal fails, the internal
accounting becomes inconsistent and may eventually hit a BUG().

The allocation involved comes from a separate pool of memory, set up
when the domain is created; under normal operating conditions this
allocation never fails, but a malicious guest may be able to engineer
situations in which the pool is exhausted.
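
As a concrete illustration, here is a minimal, self-contained C sketch
of the failure pattern (the names and data structures are invented for
the example; this is not Xen source code).  It models p2m_set_entry()
as a fallible update whose error return is silently dropped, so a
failed removal leaves the guest's mapping in place while the page is
treated as freed:

    #include <stdio.h>
    #include <stdbool.h>

    static bool pool_exhausted;     /* models the per-domain p2m memory pool */
    static int  guest_entry = 42;   /* models a guest's writable p2m entry   */

    /* Models p2m_set_entry(): fails (nonzero) when splitting a large
     * mapping would require an allocation and the pool is empty; the
     * entry is left unchanged on failure. */
    static int set_entry(int new_val)
    {
        if ( pool_exhausted )
            return -1;
        guest_entry = new_val;
        return 0;
    }

    int main(void)
    {
        pool_exhausted = true;

        /* Buggy pattern: the return value is ignored, so the "removed"
         * page can be handed back to the allocator while the guest
         * still holds a writable mapping to it. */
        set_entry(0);
        if ( guest_entry != 0 )
            printf("unchecked: guest still maps the \"freed\" page\n");

        /* Fixed pattern (what the XSA-247 patches add): check the
         * return value and bail out rather than freeing the page. */
        if ( set_entry(0) )
            printf("checked: removal failed, page not freed\n");

        return 0;
    }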

IMPACT
======

An unprivileged guest can retain a writable mapping of freed memory.
Depending on how this page is then used, it could result in either an
information leak or full privilege escalation.

Alternatively, an unprivileged guest can cause Xen to hit a BUG(),
resulting in a clean crash, i.e. a host-wide denial of service (DoS).

VULNERABLE SYSTEMS
==================

All systems running Xen 3.4 or later are vulnerable.

Only x86 systems are vulnerable.  ARM is not vulnerable.

x86 PV VMs cannot leverage the vulnerability.

Only systems with 2MiB or 1GiB HAP pages enabled are vulnerable.

The vulnerability is largely restricted to HVM guests which have been
constructed in Populate-on-Demand mode (i.e. with memory < maxmem).

x86 HVM domains without PoD (i.e. started with memory == maxmem, or
without mentioning "maxmem" in the guest config file) also cannot
leverage the vulnerability, in recent enough Xen versions:
  4.8.x and later: all versions safe if PoD not configured
  4.7.x: 4.7.1 and later safe if PoD not configured
  4.6.x: 4.6.4 and later safe if PoD not configured
  4.5.x: 4.5.4 and later safe if PoD not configured
  4.4.x and earlier: all versions vulnerable even if PoD not configured

The commit required to prevent this vulnerability when PoD is not
configured is 2a99aa99fc84a45f505f84802af56b006d14c52e
  xen/physmap: Do not permit a guest to populate PoD pages for itself
and the corresponding backports.

MITIGATION
==========

Running only PV guests will avoid this issue.

Running HVM guests only in non-PoD mode (maxmem == memory) will also
avoid this issue.  NOTE: in the older Xen releases listed above, an
HVM guest can create PoD entries itself, so this mitigation is not
effective there.

Specifying "hap_1gb=0 hap_2mb=0" on the hypervisor command line will
also avoid the vulnerability.

Alternatively, running all x86 HVM guests in shadow mode will also
avoid this vulnerability.  (For example, by specifying "hap=0" in the
xl domain configuration file.)
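
As an illustration of where these options live (the memory values are
examples only), the guest settings go in the xl domain configuration
file:

    # xl domain configuration: avoid PoD by making memory equal to maxmem
    memory = 2048
    maxmem = 2048

    # or force shadow mode for this guest instead of HAP
    hap = 0

The "hap_1gb=0 hap_2mb=0" options, by contrast, are appended to the
boot command line of the Xen hypervisor itself in the bootloader
configuration, not to any guest configuration file.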

CREDITS
=======

This issue was discovered by George Dunlap of Citrix.

RESOLUTION
==========

Applying the appropriate attached patch resolves this issue.

xsa247/*.patch           xen-unstable
xsa247-4.9/*.patch       Xen 4.9.x
xsa247-4.8/*.patch       Xen 4.8.x
xsa247-4.7/*.patch       Xen 4.7.x
xsa247-4.6/*.patch       Xen 4.6.x
xsa247-4.5/*.patch       Xen 4.5.x

$ sha256sum xsa247* xsa247*/*
e8fc454c35f429ab60b94c0e812f86fd2b3b37edfff2bfdcc13a7e13ebc2efbe  xsa247.meta
59e977d81ad85c25572b79db48d62b4f040026e88f51fe61051b7d30e97fad06  xsa247-4.5/0001-p2m-Always-check-to-see-if-removing-a-p2m-entry-actu.patch
6221f5fc7899253888a1711e83436f1b8ddc51046ec920d83b7ea2f4266d13f7  xsa247-4.5/0002-p2m-Check-return-value-of-p2m_set_entry-when-decreas.patch
f54c4984731f9138e522685e98359a0bb409146091fedb8b7beaac48b3460c22  xsa247-4.6/0001-p2m-Always-check-to-see-if-removing-a-p2m-entry-actu.patch
258aaa76e164d70fbfead9de1370577c328dff78c09b81ac7b708fd5c530859a  xsa247-4.6/0002-p2m-Check-return-value-of-p2m_set_entry-when-decreas.patch
85f0d5f3940bb27f84867b9ac227636a786519dfc1b35ad82f402f9c044ecac9  xsa247-4.7/0001-p2m-Always-check-to-see-if-removing-a-p2m-entry-actu.patch
8f0d45b617e0b4c0c1ff490e84c6415f1444696d2afce09eeaa970fbedb8f4c3  xsa247-4.7/0002-p2m-Check-return-value-of-p2m_set_entry-when-decreas.patch
580771a125aa577ff4c7607679ef5d8d6c668446f4573bf11e4fe6829d02d157  xsa247-4.8/0001-p2m-Always-check-to-see-if-removing-a-p2m-entry-actu.patch
f88d252305d8229374f3fe25bae3c9ea165acab28be9908a1a9a816ae85170ac  xsa247-4.8/0002-p2m-Check-return-value-of-p2m_set_entry-when-decreas.patch
5fe123448b8ea63f96495462a274d986016264acef4a81e555848ae0d38bd035  xsa247-4.9/0001-p2m-Always-check-to-see-if-removing-a-p2m-entry-actu.patch
f6e061be4c6355a8d949d258bf1180ff607ce95ca40213cc0e2ee112db435ebd  xsa247-4.9/0002-p2m-Check-return-value-of-p2m_set_entry-when-decreas.patch
4cfa8a89f0d2c4a7bf09f31df1e3050b3cce685efd01a17196e991e02d8dd61d  xsa247/0001-p2m-Always-check-to-see-if-removing-a-p2m-entry-actu.patch
37d7f639af2c857d5232ab69aa9c576c37e6a46a7b246d0cd1cc6d05a93360ba  xsa247/0002-p2m-Check-return-value-of-p2m_set_entry-when-decreas.patch
$
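
As a usage sketch (version and paths illustrative), verifying and
applying the patches to a 4.9 source tree might look like:

  $ sha256sum xsa247-4.9/*.patch   # compare against the list above
  $ cd xen-4.9                     # root of the matching Xen tree
  $ git am ../xsa247-4.9/*.patch   # the patches are git format-patch output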

DEPLOYMENT DURING EMBARGO
=========================

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators, with ONE exception:

Removing the ability to boot in populate-on-demand mode is NOT
permitted during the embargo on public cloud systems.  This is because
doing so might alert attackers to the nature of the vulnerability.
Deployment of this mitigation is permitted only AFTER the embargo
ends.

Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.

(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decision-making.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBCAAGBQJaHU9KAAoJEIP+FMlX6CvZ8jAH/2mwq7lDxVJVMIVXWZg1b3jU
eVk+yXKh3x1piFUa5jyjiSzpWW7IDnYL3eYbTKoYbAfvWFmM+VPdr6cyhq35T8pv
KVc6Ml+t2gHrBPE6kAsHiKUWTW0pYP/vbfQVRrGx0br7+d0+LVITi8NNc9uRLwqp
eJOhTe3h/Cxoy9Du38AlabDBVXOieKHzJOPNoCLCRkhKL1Jyhg8uEBtZyytc/pOx
xM88uiZ/NIk+nu28fiFiy1EARzS1hhNfb4t2QsmKxFtw0Kkyq7D025Tg/tpKi/Uh
qx2ek6NYMQbIgozyWLlrZOdfxWvFJpIj/5ZhHL/rpawAOfJk3+CwW8OCzfEBKeY=
=q8YR
-----END PGP SIGNATURE-----


["xsa247.meta" (application/octet-stream)]
["xsa247-4.5/0001-p2m-Always-check-to-see-if-removing-a-p2m-entry-actu.patch" (application/octet-stream)]

From 325189906bec1affb6c472f9cce711af1701c602 Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Fri, 10 Nov 2017 16:53:54 +0000
Subject: [PATCH 1/2] p2m: Always check to see if removing a p2m entry actually
 worked

The PoD zero-check functions speculatively remove memory from the p2m,
then check to see if it's completely zeroed, before putting it in the
cache.

Unfortunately, the p2m_set_entry() calls may fail if the underlying
pagetable structure needs to change and the domain has exhausted its
p2m memory pool: for instance, if we're removing a 2MiB region out of
a 1GiB entry (in the p2m_pod_zero_check_superpage() case), or a 4k
region out of a 2MiB or larger entry (in the p2m_pod_zero_check()
case); and the return value is not checked.

The underlying mfn will then be added into the PoD cache, and at some
point mapped into another location in the p2m.  If the guest
afterwards balloons out this memory, it will be freed to the hypervisor
and potentially reused by another domain, in spite of the fact that
the original domain still has writable mappings to it.

There are several places where p2m_set_entry() shouldn't be able to
fail, as it is guaranteed to write an entry of the same order that
succeeded before.  Add a backstop of crashing the domain just in case,
and an ASSERT_UNREACHABLE() to flag up the broken assumption on debug
builds.

While we're here, use PAGE_ORDER_2M rather than a magic constant.

This is part of XSA-247.

Reported-by: XXX PERSON <XXX EMAIL>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v4:
- Removed some trailing whitespace
v3:
- Reformat reset clause to be more compact
- Make sure to set map[i] = NULL when unmapping in case we need to bail
v2:
- Crash a domain if a p2m_set_entry we think cannot fail fails anyway.
---
 xen/arch/x86/mm/p2m-pod.c | 76 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 60 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 9c08797a9d..d2931c5306 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -730,8 +730,9 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
     }
 
     /* Try to remove the page, restoring old mapping if it fails. */
-    p2m_set_entry(p2m, gfn, _mfn(INVALID_MFN), PAGE_ORDER_2M,
-                  p2m_populate_on_demand, p2m->default_access);
+    if ( p2m_set_entry(p2m, gfn, _mfn(INVALID_MFN), PAGE_ORDER_2M,
+                       p2m_populate_on_demand, p2m->default_access) )
+        goto out;
 
     /* Make none of the MFNs are used elsewhere... for example, mapped
      * via the grant table interface, or by qemu.  Allow one refcount for
@@ -787,9 +788,18 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
     ret = SUPERPAGE_PAGES;
 
 out_reset:
-    if ( reset )
-        p2m_set_entry(p2m, gfn, mfn0, 9, type0, p2m->default_access);
-    
+    /*
+     * This p2m_set_entry() call shouldn't be able to fail, since the same order
+     * on the same gfn succeeded above.  If that turns out to be false, crashing
+     * the domain should be the safest way of making sure we don't leak memory.
+     */
+    if ( reset && p2m_set_entry(p2m, gfn, mfn0, PAGE_ORDER_2M,
+                                type0, p2m->default_access) )
+    {
+        ASSERT_UNREACHABLE();
+        domain_crash(d);
+    }
+
 out:
     gfn_unlock(p2m, gfn, SUPERPAGE_ORDER);
     return ret;
@@ -846,19 +856,30 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
         }
 
         /* Try to remove the page, restoring old mapping if it fails. */
-        p2m_set_entry(p2m, gfns[i], _mfn(INVALID_MFN), PAGE_ORDER_4K,
-                      p2m_populate_on_demand, p2m->default_access);
+        if ( p2m_set_entry(p2m, gfns[i], _mfn(INVALID_MFN), PAGE_ORDER_4K,
+                           p2m_populate_on_demand, p2m->default_access) )
+            goto skip;
 
         /* See if the page was successfully unmapped.  (Allow one refcount
          * for being allocated to a domain.) */
         if ( (mfn_to_page(mfns[i])->count_info & PGC_count_mask) > 1 )
         {
+            /*
+             * If the previous p2m_set_entry call succeeded, this one shouldn't
+             * be able to fail.  If it does, crashing the domain should be safe.
+             */
+            if ( p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
+                               types[i], p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unmap;
+            }
+
+        skip:
             unmap_domain_page(map[i]);
             map[i] = NULL;
 
-            p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
-                types[i], p2m->default_access);
-
             continue;
         }
     }
@@ -875,12 +896,25 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
 
         unmap_domain_page(map[i]);
 
-        /* See comment in p2m_pod_zero_check_superpage() re gnttab
-         * check timing.  */
-        if ( j < PAGE_SIZE/sizeof(*map[i]) )
+        map[i] = NULL;
+
+        /*
+         * See comment in p2m_pod_zero_check_superpage() re gnttab
+         * check timing.
+         */
+        if ( j < (PAGE_SIZE / sizeof(*map[i])) )
         {
-            p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
-                types[i], p2m->default_access);
+            /*
+             * If the previous p2m_set_entry call succeeded, this one shouldn't
+             * be able to fail.  If it does, crashing the domain should be safe.
+             */
+            if ( p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
+                               types[i], p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unmap;
+            }
         }
         else
         {
@@ -904,7 +938,17 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
             p2m->pod.entry_count++;
         }
     }
-    
+
+    return;
+
+out_unmap:
+    /*
+     * Something went wrong, probably crashing the domain.  Unmap
+     * everything and return.
+     */
+    for ( i = 0; i < count; i++ )
+        if ( map[i] )
+            unmap_domain_page(map[i]);
 }
 
 #define POD_SWEEP_LIMIT 1024
-- 
2.15.0


["xsa247-4.5/0002-p2m-Check-return-value-of-p2m_set_entry-when-decreas.patch" (application/octet-stream)]

From 6a483ec69f93597aa5edd7e6979cd297c1294edb Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Fri, 10 Nov 2017 16:53:55 +0000
Subject: [PATCH 2/2] p2m: Check return value of p2m_set_entry() when
 decreasing reservation

If the entire range specified to p2m_pod_decrease_reservation() is marked
populate-on-demand, then it will make a single p2m_set_entry() call,
reducing its PoD entry count.

Unfortunately, in the right circumstances, this p2m_set_entry() call
may fail.  In that case, repeated calls to decrease_reservation() may
cause p2m->pod.entry_count to fall below zero, potentially tripping
over BUG_ON()s to the contrary.

Instead, check to see if the entry succeeded, and return false if not.
The caller will then call guest_remove_page() on the gfns, which will
return -EINVAL upon finding no valid memory there to return.

Unfortunately if the order > 0, the entry may have partially changed.
A domain_crash() is probably the safest thing in that case.

Other p2m_set_entry() calls in the same function should be fine,
because they are writing the entry at its current order.  Nonetheless,
check the return value and crash if our assumption turns out to be
wrong.

This is part of XSA-247.

Reported-by: XXX PERSON <XXX EMAIL>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v2: Crash the domain if we're not sure it's safe (or if we think it
can't happen)
---
 xen/arch/x86/mm/p2m-pod.c | 42 +++++++++++++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index d2931c5306..b29043c6a1 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -560,11 +560,23 @@ recount:
 
     if ( !nonpod )
     {
-        /* All PoD: Mark the whole region invalid and tell caller
-         * we're done. */
-        p2m_set_entry(p2m, gpfn, _mfn(INVALID_MFN), order, p2m_invalid,
-                      p2m->default_access);
-        p2m->pod.entry_count-=(1<<order);
+        /*
+         * All PoD: Mark the whole region invalid and tell caller
+         * we're done.
+         */
+        if ( p2m_set_entry(p2m, gpfn, _mfn(INVALID_MFN), order, p2m_invalid,
+                           p2m->default_access) )
+        {
+            /*
+             * If this fails, we can't tell how much of the range was changed.
+             * Best to crash the domain unless we're sure a partial change is
+             * impossible.
+             */
+            if ( order != 0 )
+                domain_crash(d);
+            goto out_unlock;
+        }
+        p2m->pod.entry_count -= 1UL << order;
         BUG_ON(p2m->pod.entry_count < 0);
         ret = 1;
         goto out_entry_check;
@@ -596,8 +608,14 @@ recount:
         mfn = p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL);
         if ( t == p2m_populate_on_demand )
         {
-            p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), 0, p2m_invalid,
-                          p2m->default_access);
+            /* This shouldn't be able to fail */
+            if ( p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), 0,
+                               p2m_invalid, p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unlock;
+            }
             p2m->pod.entry_count--;
             BUG_ON(p2m->pod.entry_count < 0);
             pod--;
@@ -610,8 +628,14 @@ recount:
 
             page = mfn_to_page(mfn);
 
-            p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), 0, p2m_invalid,
-                          p2m->default_access);
+            /* This shouldn't be able to fail */
+            if ( p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), 0,
+                               p2m_invalid, p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unlock;
+            }
             set_gpfn_from_mfn(mfn_x(mfn), INVALID_M2P_ENTRY);
 
             p2m_pod_cache_add(p2m, page, 0);
-- 
2.15.0


["xsa247-4.6/0001-p2m-Always-check-to-see-if-removing-a-p2m-entry-actu.patch" (application/octet-stream)]

From 6208d2d761ca4cec3560322222532c4a5ba1b375 Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Fri, 10 Nov 2017 16:53:54 +0000
Subject: [PATCH 1/2] p2m: Always check to see if removing a p2m entry actually
 worked

The PoD zero-check functions speculatively remove memory from the p2m,
then check to see if it's completely zeroed, before putting it in the
cache.

Unfortunately, the p2m_set_entry() calls may fail if the underlying
pagetable structure needs to change and the domain has exhausted its
p2m memory pool: for instance, if we're removing a 2MiB region out of
a 1GiB entry (in the p2m_pod_zero_check_superpage() case), or a 4k
region out of a 2MiB or larger entry (in the p2m_pod_zero_check()
case); and the return value is not checked.

The underlying mfn will then be added into the PoD cache, and at some
point mapped into another location in the p2m.  If the guest
afterwards balloons out this memory, it will be freed to the hypervisor
and potentially reused by another domain, in spite of the fact that
the original domain still has writable mappings to it.

There are several places where p2m_set_entry() shouldn't be able to
fail, as it is guaranteed to write an entry of the same order that
succeeded before.  Add a backstop of crashing the domain just in case,
and an ASSERT_UNREACHABLE() to flag up the broken assumption on debug
builds.

While we're here, use PAGE_ORDER_2M rather than a magic constant.

This is part of XSA-247.

Reported-by: XXX PERSON <XXX EMAIL>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v4:
- Removed some trailing whitespace
v3:
- Reformat reset clause to be more compact
- Make sure to set map[i] = NULL when unmapping in case we need to bail
v2:
- Crash a domain if a p2m_set_entry we think cannot fail fails anyway.
---
 xen/arch/x86/mm/p2m-pod.c | 76 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 60 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 519b80cc3d..b1f0abe02d 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -729,8 +729,9 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
     }
 
     /* Try to remove the page, restoring old mapping if it fails. */
-    p2m_set_entry(p2m, gfn, _mfn(INVALID_MFN), PAGE_ORDER_2M,
-                  p2m_populate_on_demand, p2m->default_access);
+    if ( p2m_set_entry(p2m, gfn, _mfn(INVALID_MFN), PAGE_ORDER_2M,
+                       p2m_populate_on_demand, p2m->default_access) )
+        goto out;
 
     /* Make none of the MFNs are used elsewhere... for example, mapped
      * via the grant table interface, or by qemu.  Allow one refcount for
@@ -786,9 +787,18 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
     ret = SUPERPAGE_PAGES;
 
 out_reset:
-    if ( reset )
-        p2m_set_entry(p2m, gfn, mfn0, 9, type0, p2m->default_access);
-    
+    /*
+     * This p2m_set_entry() call shouldn't be able to fail, since the same order
+     * on the same gfn succeeded above.  If that turns out to be false, crashing
+     * the domain should be the safest way of making sure we don't leak memory.
+     */
+    if ( reset && p2m_set_entry(p2m, gfn, mfn0, PAGE_ORDER_2M,
+                                type0, p2m->default_access) )
+    {
+        ASSERT_UNREACHABLE();
+        domain_crash(d);
+    }
+
 out:
     gfn_unlock(p2m, gfn, SUPERPAGE_ORDER);
     return ret;
@@ -845,19 +855,30 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
         }
 
         /* Try to remove the page, restoring old mapping if it fails. */
-        p2m_set_entry(p2m, gfns[i], _mfn(INVALID_MFN), PAGE_ORDER_4K,
-                      p2m_populate_on_demand, p2m->default_access);
+        if ( p2m_set_entry(p2m, gfns[i], _mfn(INVALID_MFN), PAGE_ORDER_4K,
+                           p2m_populate_on_demand, p2m->default_access) )
+            goto skip;
 
         /* See if the page was successfully unmapped.  (Allow one refcount
          * for being allocated to a domain.) */
         if ( (mfn_to_page(mfns[i])->count_info & PGC_count_mask) > 1 )
         {
+            /*
+             * If the previous p2m_set_entry call succeeded, this one shouldn't
+             * be able to fail.  If it does, crashing the domain should be safe.
+             */
+            if ( p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
+                               types[i], p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unmap;
+            }
+
+        skip:
             unmap_domain_page(map[i]);
             map[i] = NULL;
 
-            p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
-                types[i], p2m->default_access);
-
             continue;
         }
     }
@@ -874,12 +895,25 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
 
         unmap_domain_page(map[i]);
 
-        /* See comment in p2m_pod_zero_check_superpage() re gnttab
-         * check timing.  */
-        if ( j < PAGE_SIZE/sizeof(*map[i]) )
+        map[i] = NULL;
+
+        /*
+         * See comment in p2m_pod_zero_check_superpage() re gnttab
+         * check timing.
+         */
+        if ( j < (PAGE_SIZE / sizeof(*map[i])) )
         {
-            p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
-                types[i], p2m->default_access);
+            /*
+             * If the previous p2m_set_entry call succeeded, this one shouldn't
+             * be able to fail.  If it does, crashing the domain should be safe.
+             */
+            if ( p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
+                               types[i], p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unmap;
+            }
         }
         else
         {
@@ -903,7 +937,17 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
             p2m->pod.entry_count++;
         }
     }
-    
+
+    return;
+
+out_unmap:
+    /*
+     * Something went wrong, probably crashing the domain.  Unmap
+     * everything and return.
+     */
+    for ( i = 0; i < count; i++ )
+        if ( map[i] )
+            unmap_domain_page(map[i]);
 }
 
 #define POD_SWEEP_LIMIT 1024
-- 
2.15.0


["xsa247-4.6/0002-p2m-Check-return-value-of-p2m_set_entry-when-decreas.patch" (application/octet-stream)]

From d65a029d34e3d6157c87ac343dc8eefa1b12818e Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Fri, 10 Nov 2017 16:53:55 +0000
Subject: [PATCH 2/2] p2m: Check return value of p2m_set_entry() when
 decreasing reservation

If the entire range specified to p2m_pod_decrease_reservation() is marked
populate-on-demand, then it will make a single p2m_set_entry() call,
reducing its PoD entry count.

Unfortunately, in the right circumstances, this p2m_set_entry() call
may fail.  In that case, repeated calls to decrease_reservation() may
cause p2m->pod.entry_count to fall below zero, potentially tripping
over BUG_ON()s to the contrary.

Instead, check to see if the entry succeeded, and return false if not.
The caller will then call guest_remove_page() on the gfns, which will
return -EINVAL upon finding no valid memory there to return.

Unfortunately if the order > 0, the entry may have partially changed.
A domain_crash() is probably the safest thing in that case.

Other p2m_set_entry() calls in the same function should be fine,
because they are writing the entry at its current order.  Nonetheless,
check the return value and crash if our assumption turns out to be
wrong.

This is part of XSA-247.

Reported-by: XXX PERSON <XXX EMAIL>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v2: Crash the domain if we're not sure it's safe (or if we think it
can't happen)
---
 xen/arch/x86/mm/p2m-pod.c | 42 +++++++++++++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index b1f0abe02d..9324f16c91 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -559,11 +559,23 @@ recount:
 
     if ( !nonpod )
     {
-        /* All PoD: Mark the whole region invalid and tell caller
-         * we're done. */
-        p2m_set_entry(p2m, gpfn, _mfn(INVALID_MFN), order, p2m_invalid,
-                      p2m->default_access);
-        p2m->pod.entry_count-=(1<<order);
+        /*
+         * All PoD: Mark the whole region invalid and tell caller
+         * we're done.
+         */
+        if ( p2m_set_entry(p2m, gpfn, _mfn(INVALID_MFN), order, p2m_invalid,
+                           p2m->default_access) )
+        {
+            /*
+             * If this fails, we can't tell how much of the range was changed.
+             * Best to crash the domain unless we're sure a partial change is
+             * impossible.
+             */
+            if ( order != 0 )
+                domain_crash(d);
+            goto out_unlock;
+        }
+        p2m->pod.entry_count -= 1UL << order;
         BUG_ON(p2m->pod.entry_count < 0);
         ret = 1;
         goto out_entry_check;
@@ -595,8 +607,14 @@ recount:
         mfn = p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL, NULL);
         if ( t == p2m_populate_on_demand )
         {
-            p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), 0, p2m_invalid,
-                          p2m->default_access);
+            /* This shouldn't be able to fail */
+            if ( p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), 0,
+                               p2m_invalid, p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unlock;
+            }
             p2m->pod.entry_count--;
             BUG_ON(p2m->pod.entry_count < 0);
             pod--;
@@ -609,8 +627,14 @@ recount:
 
             page = mfn_to_page(mfn);
 
-            p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), 0, p2m_invalid,
-                          p2m->default_access);
+            /* This shouldn't be able to fail */
+            if ( p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), 0,
+                               p2m_invalid, p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unlock;
+            }
             set_gpfn_from_mfn(mfn_x(mfn), INVALID_M2P_ENTRY);
 
             p2m_pod_cache_add(p2m, page, 0);
-- 
2.15.0


["xsa247-4.7/0001-p2m-Always-check-to-see-if-removing-a-p2m-entry-actu.patch" (application/octet-stream)]

From f345ca185e0c042ed12bf929a9e93efaf33397bb Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Fri, 10 Nov 2017 16:53:54 +0000
Subject: [PATCH 1/2] p2m: Always check to see if removing a p2m entry actually
 worked

The PoD zero-check functions speculatively remove memory from the p2m,
then check to see if it's completely zeroed, before putting it in the
cache.

Unfortunately, the p2m_set_entry() calls may fail if the underlying
pagetable structure needs to change and the domain has exhausted its
p2m memory pool: for instance, if we're removing a 2MiB region out of
a 1GiB entry (in the p2m_pod_zero_check_superpage() case), or a 4k
region out of a 2MiB or larger entry (in the p2m_pod_zero_check()
case); and the return value is not checked.

The underlying mfn will then be added into the PoD cache, and at some
point mapped into another location in the p2m.  If the guest
afterwards balloons out this memory, it will be freed to the hypervisor
and potentially reused by another domain, in spite of the fact that
the original domain still has writable mappings to it.

There are several places where p2m_set_entry() shouldn't be able to
fail, as it is guaranteed to write an entry of the same order that
succeeded before.  Add a backstop of crashing the domain just in case,
and an ASSERT_UNREACHABLE() to flag up the broken assumption on debug
builds.

While we're here, use PAGE_ORDER_2M rather than a magic constant.

This is part of XSA-247.

Reported-by: XXX PERSON <XXX EMAIL>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v4:
- Removed some trailing whitespace
v3:
- Reformat reset clause to be more compact
- Make sure to set map[i] = NULL when unmapping in case we need to bail
v2:
- Crash a domain if a p2m_set_entry we think cannot fail fails anyway.
---
 xen/arch/x86/mm/p2m-pod.c | 77 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 61 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 87082cf65f..5ec8a37949 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -754,8 +754,10 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
     }
 
     /* Try to remove the page, restoring old mapping if it fails. */
-    p2m_set_entry(p2m, gfn, _mfn(INVALID_MFN), PAGE_ORDER_2M,
-                  p2m_populate_on_demand, p2m->default_access);
+    if ( p2m_set_entry(p2m, gfn, _mfn(INVALID_MFN), PAGE_ORDER_2M,
+                       p2m_populate_on_demand, p2m->default_access) )
+        goto out;
+
     p2m_tlb_flush_sync(p2m);
 
     /* Make none of the MFNs are used elsewhere... for example, mapped
@@ -812,9 +814,18 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
     ret = SUPERPAGE_PAGES;
 
 out_reset:
-    if ( reset )
-        p2m_set_entry(p2m, gfn, mfn0, 9, type0, p2m->default_access);
-    
+    /*
+     * This p2m_set_entry() call shouldn't be able to fail, since the same order
+     * on the same gfn succeeded above.  If that turns out to be false, crashing
+     * the domain should be the safest way of making sure we don't leak memory.
+     */
+    if ( reset && p2m_set_entry(p2m, gfn, mfn0, PAGE_ORDER_2M,
+                                type0, p2m->default_access) )
+    {
+        ASSERT_UNREACHABLE();
+        domain_crash(d);
+    }
+
 out:
     gfn_unlock(p2m, gfn, SUPERPAGE_ORDER);
     return ret;
@@ -871,19 +882,30 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
         }
 
         /* Try to remove the page, restoring old mapping if it fails. */
-        p2m_set_entry(p2m, gfns[i], _mfn(INVALID_MFN), PAGE_ORDER_4K,
-                      p2m_populate_on_demand, p2m->default_access);
+        if ( p2m_set_entry(p2m, gfns[i], _mfn(INVALID_MFN), PAGE_ORDER_4K,
+                           p2m_populate_on_demand, p2m->default_access) )
+            goto skip;
 
         /* See if the page was successfully unmapped.  (Allow one refcount
          * for being allocated to a domain.) */
         if ( (mfn_to_page(mfns[i])->count_info & PGC_count_mask) > 1 )
         {
+            /*
+             * If the previous p2m_set_entry call succeeded, this one shouldn't
+             * be able to fail.  If it does, crashing the domain should be safe.
+             */
+            if ( p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
+                               types[i], p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unmap;
+            }
+
+        skip:
             unmap_domain_page(map[i]);
             map[i] = NULL;
 
-            p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
-                types[i], p2m->default_access);
-
             continue;
         }
     }
@@ -902,12 +924,25 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
 
         unmap_domain_page(map[i]);
 
-        /* See comment in p2m_pod_zero_check_superpage() re gnttab
-         * check timing.  */
-        if ( j < PAGE_SIZE/sizeof(*map[i]) )
+        map[i] = NULL;
+
+        /*
+         * See comment in p2m_pod_zero_check_superpage() re gnttab
+         * check timing.
+         */
+        if ( j < (PAGE_SIZE / sizeof(*map[i])) )
         {
-            p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
-                types[i], p2m->default_access);
+            /*
+             * If the previous p2m_set_entry call succeeded, this one shouldn't
+             * be able to fail.  If it does, crashing the domain should be safe.
+             */
+            if ( p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
+                               types[i], p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unmap;
+            }
         }
         else
         {
@@ -931,7 +966,17 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
             p2m->pod.entry_count++;
         }
     }
-    
+
+    return;
+
+out_unmap:
+    /*
+     * Something went wrong, probably crashing the domain.  Unmap
+     * everything and return.
+     */
+    for ( i = 0; i < count; i++ )
+        if ( map[i] )
+            unmap_domain_page(map[i]);
 }
 
 #define POD_SWEEP_LIMIT 1024
-- 
2.15.0


["xsa247-4.7/0002-p2m-Check-return-value-of-p2m_set_entry-when-decreas.patch" (application/octet-stream)]

From 01feeda5363dd8d2fea8395c2c435203751c8ba5 Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Fri, 10 Nov 2017 16:53:55 +0000
Subject: [PATCH 2/2] p2m: Check return value of p2m_set_entry() when
 decreasing reservation

If the entire range specified to p2m_pod_decrease_reservation() is marked
populate-on-demand, then it will make a single p2m_set_entry() call,
reducing its PoD entry count.

Unfortunately, in the right circumstances, this p2m_set_entry() call
may fail.  In that case, repeated calls to decrease_reservation() may
cause p2m->pod.entry_count to fall below zero, potentially tripping
over BUG_ON()s to the contrary.

Instead, check to see if the entry succeeded, and return false if not.
The caller will then call guest_remove_page() on the gfns, which will
return -EINVAL upon finding no valid memory there to return.

Unfortunately if the order > 0, the entry may have partially changed.
A domain_crash() is probably the safest thing in that case.

Other p2m_set_entry() calls in the same function should be fine,
because they are writing the entry at its current order.  Nonetheless,
check the return value and crash if our assumption turns out to be
wrong.

This is part of XSA-247.

Reported-by: XXX PERSON <XXX EMAIL>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v2: Crash the domain if we're not sure it's safe (or if we think it
can't happen)
---
 xen/arch/x86/mm/p2m-pod.c | 42 +++++++++++++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 5ec8a37949..91d309647e 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -557,11 +557,23 @@ p2m_pod_decrease_reservation(struct domain *d,
 
     if ( !nonpod )
     {
-        /* All PoD: Mark the whole region invalid and tell caller
-         * we're done. */
-        p2m_set_entry(p2m, gpfn, _mfn(INVALID_MFN), order, p2m_invalid,
-                      p2m->default_access);
-        p2m->pod.entry_count-=(1<<order);
+        /*
+         * All PoD: Mark the whole region invalid and tell caller
+         * we're done.
+         */
+        if ( p2m_set_entry(p2m, gpfn, _mfn(INVALID_MFN), order, p2m_invalid,
+                           p2m->default_access) )
+        {
+            /*
+             * If this fails, we can't tell how much of the range was changed.
+             * Best to crash the domain unless we're sure a partial change is
+             * impossible.
+             */
+            if ( order != 0 )
+                domain_crash(d);
+            goto out_unlock;
+        }
+        p2m->pod.entry_count -= 1UL << order;
         BUG_ON(p2m->pod.entry_count < 0);
         ret = 1;
         goto out_entry_check;
@@ -602,8 +614,14 @@ p2m_pod_decrease_reservation(struct domain *d,
         n = 1UL << cur_order;
         if ( t == p2m_populate_on_demand )
         {
-            p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), cur_order,
-                          p2m_invalid, p2m->default_access);
+            /* This shouldn't be able to fail */
+            if ( p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), cur_order,
+                               p2m_invalid, p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unlock;
+            }
             p2m->pod.entry_count -= n;
             BUG_ON(p2m->pod.entry_count < 0);
             pod -= n;
@@ -624,8 +642,14 @@ p2m_pod_decrease_reservation(struct domain *d,
 
             page = mfn_to_page(mfn);
 
-            p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), cur_order,
-                          p2m_invalid, p2m->default_access);
+            /* This shouldn't be able to fail */
+            if ( p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), cur_order,
+                               p2m_invalid, p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unlock;
+            }
             p2m_tlb_flush_sync(p2m);
             for ( j = 0; j < n; ++j )
                 set_gpfn_from_mfn(mfn_x(mfn), INVALID_M2P_ENTRY);
-- 
2.15.0


["xsa247-4.8/0001-p2m-Always-check-to-see-if-removing-a-p2m-entry-actu.patch" (application/octet-stream)]

From 0a004cf322940d99432b84284b22f3a9ea67a282 Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Fri, 10 Nov 2017 16:53:54 +0000
Subject: [PATCH 1/2] p2m: Always check to see if removing a p2m entry actually
 worked

The PoD zero-check functions speculatively remove memory from the p2m,
then check to see if it's completely zeroed, before putting it in the
cache.

Unfortunately, the p2m_set_entry() calls may fail if the underlying
pagetable structure needs to change and the domain has exhausted its
p2m memory pool: for instance, if we're removing a 2MiB region out of
a 1GiB entry (in the p2m_pod_zero_check_superpage() case), or a 4k
region out of a 2MiB or larger entry (in the p2m_pod_zero_check()
case); and the return value is not checked.

The underlying mfn will then be added into the PoD cache, and at some
point mapped into another location in the p2m.  If the guest
afterwards balloons out this memory, it will be freed to the hypervisor
and potentially reused by another domain, in spite of the fact that
the original domain still has writable mappings to it.

There are several places where p2m_set_entry() shouldn't be able to
fail, as it is guaranteed to write an entry of the same order that
succeeded before.  Add a backstop of crashing the domain just in case,
and an ASSERT_UNREACHABLE() to flag up the broken assumption on debug
builds.

While we're here, use PAGE_ORDER_2M rather than a magic constant.

This is part of XSA-247.

Reported-by: XXX PERSON <XXX EMAIL>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v4:
- Removed some trailing whitespace
v3:
- Reformat reset clause to be more compact
- Make sure to set map[i] = NULL when unmapping in case we need to bail
v2:
- Crash a domain if a p2m_set_entry we think cannot fail fails anyway.
---
 xen/arch/x86/mm/p2m-pod.c | 77 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 61 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 0e15290390..d73a86dde0 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -754,8 +754,10 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
     }
 
     /* Try to remove the page, restoring old mapping if it fails. */
-    p2m_set_entry(p2m, gfn, INVALID_MFN, PAGE_ORDER_2M,
-                  p2m_populate_on_demand, p2m->default_access);
+    if ( p2m_set_entry(p2m, gfn, INVALID_MFN, PAGE_ORDER_2M,
+                       p2m_populate_on_demand, p2m->default_access) )
+        goto out;
+
     p2m_tlb_flush_sync(p2m);
 
     /* Make none of the MFNs are used elsewhere... for example, mapped
@@ -812,9 +814,18 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
     ret = SUPERPAGE_PAGES;
 
 out_reset:
-    if ( reset )
-        p2m_set_entry(p2m, gfn, mfn0, 9, type0, p2m->default_access);
-    
+    /*
+     * This p2m_set_entry() call shouldn't be able to fail, since the same order
+     * on the same gfn succeeded above.  If that turns out to be false, crashing
+     * the domain should be the safest way of making sure we don't leak memory.
+     */
+    if ( reset && p2m_set_entry(p2m, gfn, mfn0, PAGE_ORDER_2M,
+                                type0, p2m->default_access) )
+    {
+        ASSERT_UNREACHABLE();
+        domain_crash(d);
+    }
+
 out:
     gfn_unlock(p2m, gfn, SUPERPAGE_ORDER);
     return ret;
@@ -871,19 +882,30 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
         }
 
         /* Try to remove the page, restoring old mapping if it fails. */
-        p2m_set_entry(p2m, gfns[i], INVALID_MFN, PAGE_ORDER_4K,
-                      p2m_populate_on_demand, p2m->default_access);
+        if ( p2m_set_entry(p2m, gfns[i], INVALID_MFN, PAGE_ORDER_4K,
+                           p2m_populate_on_demand, p2m->default_access) )
+            goto skip;
 
         /* See if the page was successfully unmapped.  (Allow one refcount
          * for being allocated to a domain.) */
         if ( (mfn_to_page(mfns[i])->count_info & PGC_count_mask) > 1 )
         {
+            /*
+             * If the previous p2m_set_entry call succeeded, this one shouldn't
+             * be able to fail.  If it does, crashing the domain should be safe.
+             */
+            if ( p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
+                               types[i], p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unmap;
+            }
+
+        skip:
             unmap_domain_page(map[i]);
             map[i] = NULL;
 
-            p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
-                types[i], p2m->default_access);
-
             continue;
         }
     }
@@ -902,12 +924,25 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
 
         unmap_domain_page(map[i]);
 
-        /* See comment in p2m_pod_zero_check_superpage() re gnttab
-         * check timing.  */
-        if ( j < PAGE_SIZE/sizeof(*map[i]) )
+        map[i] = NULL;
+
+        /*
+         * See comment in p2m_pod_zero_check_superpage() re gnttab
+         * check timing.
+         */
+        if ( j < (PAGE_SIZE / sizeof(*map[i])) )
         {
-            p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
-                types[i], p2m->default_access);
+            /*
+             * If the previous p2m_set_entry call succeeded, this one shouldn't
+             * be able to fail.  If it does, crashing the domain should be safe.
+             */
+            if ( p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
+                               types[i], p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unmap;
+            }
         }
         else
         {
@@ -931,7 +966,17 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
             p2m->pod.entry_count++;
         }
     }
-    
+
+    return;
+
+out_unmap:
+    /*
+     * Something went wrong, probably crashing the domain.  Unmap
+     * everything and return.
+     */
+    for ( i = 0; i < count; i++ )
+        if ( map[i] )
+            unmap_domain_page(map[i]);
 }
 
 #define POD_SWEEP_LIMIT 1024
-- 
2.15.0


["xsa247-4.8/0002-p2m-Check-return-value-of-p2m_set_entry-when-decreas.patch" (application/octet-stream)]

From f01b21460bdd5205e1a92552d37a276866f64f1f Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Fri, 10 Nov 2017 16:53:55 +0000
Subject: [PATCH 2/2] p2m: Check return value of p2m_set_entry() when
 decreasing reservation

If the entire range specified to p2m_pod_decrease_reservation() is marked
populate-on-demand, then it will make a single p2m_set_entry() call,
reducing its PoD entry count.

Unfortunately, in the right circumstances, this p2m_set_entry() call
may fail.  In that case, repeated calls to decrease_reservation() may
cause p2m->pod.entry_count to fall below zero, potentially tripping
over BUG_ON()s to the contrary.

Instead, check to see if the entry succeeded, and return false if not.
The caller will then call guest_remove_page() on the gfns, which will
return -EINVAL upon finding no valid memory there to return.

Unfortunately if the order > 0, the entry may have partially changed.
A domain_crash() is probably the safest thing in that case.

Other p2m_set_entry() calls in the same function should be fine,
because they are writing the entry at its current order.  Nonetheless,
check the return value and crash if our assumption turns out to be
wrong.

This is part of XSA-247.

Reported-by: XXX PERSON <XXX EMAIL>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v2: Crash the domain if we're not sure it's safe (or if we think it
can't happen)
---
 xen/arch/x86/mm/p2m-pod.c | 42 +++++++++++++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index d73a86dde0..c750d0d8cc 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -557,11 +557,23 @@ p2m_pod_decrease_reservation(struct domain *d,
 
     if ( !nonpod )
     {
-        /* All PoD: Mark the whole region invalid and tell caller
-         * we're done. */
-        p2m_set_entry(p2m, gpfn, INVALID_MFN, order, p2m_invalid,
-                      p2m->default_access);
-        p2m->pod.entry_count-=(1<<order);
+        /*
+         * All PoD: Mark the whole region invalid and tell caller
+         * we're done.
+         */
+        if ( p2m_set_entry(p2m, gpfn, INVALID_MFN, order, p2m_invalid,
+                           p2m->default_access) )
+        {
+            /*
+             * If this fails, we can't tell how much of the range was changed.
+             * Best to crash the domain unless we're sure a partial change is
+             * impossible.
+             */
+            if ( order != 0 )
+                domain_crash(d);
+            goto out_unlock;
+        }
+        p2m->pod.entry_count -= 1UL << order;
         BUG_ON(p2m->pod.entry_count < 0);
         ret = 1;
         goto out_entry_check;
@@ -602,8 +614,14 @@ p2m_pod_decrease_reservation(struct domain *d,
         n = 1UL << cur_order;
         if ( t == p2m_populate_on_demand )
         {
-            p2m_set_entry(p2m, gpfn + i, INVALID_MFN, cur_order,
-                          p2m_invalid, p2m->default_access);
+            /* This shouldn't be able to fail */
+            if ( p2m_set_entry(p2m, gpfn + i, INVALID_MFN, cur_order,
+                               p2m_invalid, p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unlock;
+            }
             p2m->pod.entry_count -= n;
             BUG_ON(p2m->pod.entry_count < 0);
             pod -= n;
@@ -624,8 +642,14 @@ p2m_pod_decrease_reservation(struct domain *d,
 
             page = mfn_to_page(mfn);
 
-            p2m_set_entry(p2m, gpfn + i, INVALID_MFN, cur_order,
-                          p2m_invalid, p2m->default_access);
+            /* This shouldn't be able to fail */
+            if ( p2m_set_entry(p2m, gpfn + i, INVALID_MFN, cur_order,
+                               p2m_invalid, p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unlock;
+            }
             p2m_tlb_flush_sync(p2m);
             for ( j = 0; j < n; ++j )
                 set_gpfn_from_mfn(mfn_x(mfn), INVALID_M2P_ENTRY);
-- 
2.15.0


["xsa247-4.9/0001-p2m-Always-check-to-see-if-removing-a-p2m-entry-actu.patch" (application/octet-stream)]

From ad208b8b7e45fb2b7c572b86c61c26412609e82d Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Fri, 10 Nov 2017 16:53:54 +0000
Subject: [PATCH 1/2] p2m: Always check to see if removing a p2m entry actually
 worked

The PoD zero-check functions speculatively remove memory from the p2m,
then check to see if it's completely zeroed, before putting it in the
cache.

Unfortunately, the p2m_set_entry() calls may fail if the underlying
pagetable structure needs to change and the domain has exhausted its
p2m memory pool: for instance, if we're removing a 2MiB region out of
a 1GiB entry (in the p2m_pod_zero_check_superpage() case), or a 4k
region out of a 2MiB or larger entry (in the p2m_pod_zero_check()
case); and the return value is not checked.

The underlying mfn will then be added into the PoD cache, and at some
point mapped into another location in the p2m.  If the guest
afterwards balloons out this memory, it will be freed to the hypervisor
and potentially reused by another domain, in spite of the fact that
the original domain still has writable mappings to it.

There are several places where p2m_set_entry() shouldn't be able to
fail, as it is guaranteed to write an entry of the same order that
succeeded before.  Add a backstop of crashing the domain just in case,
and an ASSERT_UNREACHABLE() to flag up the broken assumption on debug
builds.

While we're here, use PAGE_ORDER_2M rather than a magic constant.

This is part of XSA-247.

Reported-by: XXX PERSON <XXX EMAIL>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v4:
- Removed some trailing whitespace
v3:
- Reformat reset clause to be more compact
- Make sure to set map[i] = NULL when unmapping in case we need to bail
v2:
- Crash a domain if a p2m_set_entry we think cannot fail fails anyway.
---
 xen/arch/x86/mm/p2m-pod.c | 77 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 61 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 730a48f928..f2ed751892 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -752,8 +752,10 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
     }
 
     /* Try to remove the page, restoring old mapping if it fails. */
-    p2m_set_entry(p2m, gfn, INVALID_MFN, PAGE_ORDER_2M,
-                  p2m_populate_on_demand, p2m->default_access);
+    if ( p2m_set_entry(p2m, gfn, INVALID_MFN, PAGE_ORDER_2M,
+                       p2m_populate_on_demand, p2m->default_access) )
+        goto out;
+
     p2m_tlb_flush_sync(p2m);
 
     /* Make none of the MFNs are used elsewhere... for example, mapped
@@ -810,9 +812,18 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
     ret = SUPERPAGE_PAGES;
 
 out_reset:
-    if ( reset )
-        p2m_set_entry(p2m, gfn, mfn0, 9, type0, p2m->default_access);
-    
+    /*
+     * This p2m_set_entry() call shouldn't be able to fail, since the same order
+     * on the same gfn succeeded above.  If that turns out to be false, crashing
+     * the domain should be the safest way of making sure we don't leak memory.
+     */
+    if ( reset && p2m_set_entry(p2m, gfn, mfn0, PAGE_ORDER_2M,
+                                type0, p2m->default_access) )
+    {
+        ASSERT_UNREACHABLE();
+        domain_crash(d);
+    }
+
 out:
     gfn_unlock(p2m, gfn, SUPERPAGE_ORDER);
     return ret;
@@ -869,19 +880,30 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
         }
 
         /* Try to remove the page, restoring old mapping if it fails. */
-        p2m_set_entry(p2m, gfns[i], INVALID_MFN, PAGE_ORDER_4K,
-                      p2m_populate_on_demand, p2m->default_access);
+        if ( p2m_set_entry(p2m, gfns[i], INVALID_MFN, PAGE_ORDER_4K,
+                           p2m_populate_on_demand, p2m->default_access) )
+            goto skip;
 
         /* See if the page was successfully unmapped.  (Allow one refcount
          * for being allocated to a domain.) */
         if ( (mfn_to_page(mfns[i])->count_info & PGC_count_mask) > 1 )
         {
+            /*
+             * If the previous p2m_set_entry call succeeded, this one shouldn't
+             * be able to fail.  If it does, crashing the domain should be safe.
+             */
+            if ( p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
+                               types[i], p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unmap;
+            }
+
+        skip:
             unmap_domain_page(map[i]);
             map[i] = NULL;
 
-            p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
-                types[i], p2m->default_access);
-
             continue;
         }
     }
@@ -900,12 +922,25 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
 
         unmap_domain_page(map[i]);
 
-        /* See comment in p2m_pod_zero_check_superpage() re gnttab
-         * check timing.  */
-        if ( j < PAGE_SIZE/sizeof(*map[i]) )
+        map[i] = NULL;
+
+        /*
+         * See comment in p2m_pod_zero_check_superpage() re gnttab
+         * check timing.
+         */
+        if ( j < (PAGE_SIZE / sizeof(*map[i])) )
         {
-            p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
-                types[i], p2m->default_access);
+            /*
+             * If the previous p2m_set_entry call succeeded, this one shouldn't
+             * be able to fail.  If it does, crashing the domain should be safe.
+             */
+            if ( p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
+                               types[i], p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unmap;
+            }
         }
         else
         {
@@ -929,7 +964,17 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
             p2m->pod.entry_count++;
         }
     }
-    
+
+    return;
+
+out_unmap:
+    /*
+     * Something went wrong, probably crashing the domain.  Unmap
+     * everything and return.
+     */
+    for ( i = 0; i < count; i++ )
+        if ( map[i] )
+            unmap_domain_page(map[i]);
 }
 
 #define POD_SWEEP_LIMIT 1024
-- 
2.15.0
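
The hunks above all apply the same shape of fix: a p2m_set_entry()
call whose result used to be discarded is now checked, and a failure
on a path believed impossible is escalated with ASSERT_UNREACHABLE()
plus domain_crash() rather than silently leaving a stale mapping.  As
a standalone illustration of that backstop pattern (plain C; the
functions set_entry(), assert_unreachable() and crash_domain() are
hypothetical stand-ins, not the real Xen interfaces):

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the hypervisor primitives. */
static int set_entry(unsigned long gfn, int order)   /* ~ p2m_set_entry() */
{
    (void)gfn; (void)order;
    return 0;                /* nonzero would signal allocation failure */
}

static void assert_unreachable(void)                 /* ~ ASSERT_UNREACHABLE() */
{
    fprintf(stderr, "reached a path believed impossible\n");
}

static void crash_domain(void)                       /* ~ domain_crash() */
{
    exit(EXIT_FAILURE);
}

/*
 * Speculatively remove an entry, then roll it back.  The rollback
 * writes the same gfn at the same order that just succeeded, so it
 * "shouldn't fail"; the result is checked anyway, because leaking a
 * writable mapping to freed memory is worse than crashing the guest.
 */
static int zero_check(unsigned long gfn, int order)
{
    if ( set_entry(gfn, order) )  /* speculative removal; may legitimately fail */
        return -1;

    if ( set_entry(gfn, order) )  /* rollback at the same gfn and order */
    {
        assert_unreachable();
        crash_domain();
    }

    return 0;
}

int main(void)
{
    return zero_check(0x1000, 9);
}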


["xsa247-4.9/0002-p2m-Check-return-value-of-p2m_set_entry-when-decreas.patch" (application/octet-stream)]

From d4bc7833707351a5341a6bdf04c752a028d9560d Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Fri, 10 Nov 2017 16:53:55 +0000
Subject: [PATCH 2/2] p2m: Check return value of p2m_set_entry() when
 decreasing reservation

If the entire range specified to p2m_pod_decrease_reservation() is marked
populate-on-demand, then it will make a single p2m_set_entry() call,
reducing its PoD entry count.

Unfortunately, in the right circumstances, this p2m_set_entry() call
may fail.  In that case, repeated calls to decrease_reservation() may
cause p2m->pod.entry_count to fall below zero, potentially tripping
over BUG_ON()s to the contrary.

Instead, check whether the entry update succeeded, and return false if not.
The caller will then call guest_remove_page() on the gfns, which will
return -EINVAL upon finding no valid memory there to return.

Unfortunately, if the order is greater than 0, the entry may have been
partially changed.  A domain_crash() is probably the safest thing in
that case.

Other p2m_set_entry() calls in the same function should be fine,
because they are writing the entry at its current order.  Nonetheless,
check the return value and crash if our assumption turns out to be
wrong.

This is part of XSA-247.

Reported-by: XXX PERSON <XXX EMAIL>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v2: Crash the domain if we're not sure it's safe (or if we think it
can't happen)
---
 xen/arch/x86/mm/p2m-pod.c | 42 +++++++++++++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index f2ed751892..473d6a6dbf 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -555,11 +555,23 @@ p2m_pod_decrease_reservation(struct domain *d,
 
     if ( !nonpod )
     {
-        /* All PoD: Mark the whole region invalid and tell caller
-         * we're done. */
-        p2m_set_entry(p2m, gpfn, INVALID_MFN, order, p2m_invalid,
-                      p2m->default_access);
-        p2m->pod.entry_count-=(1<<order);
+        /*
+         * All PoD: Mark the whole region invalid and tell caller
+         * we're done.
+         */
+        if ( p2m_set_entry(p2m, gpfn, INVALID_MFN, order, p2m_invalid,
+                           p2m->default_access) )
+        {
+            /*
+             * If this fails, we can't tell how much of the range was changed.
+             * Best to crash the domain unless we're sure a partial change is
+             * impossible.
+             */
+            if ( order != 0 )
+                domain_crash(d);
+            goto out_unlock;
+        }
+        p2m->pod.entry_count -= 1UL << order;
         BUG_ON(p2m->pod.entry_count < 0);
         ret = 1;
         goto out_entry_check;
@@ -600,8 +612,14 @@ p2m_pod_decrease_reservation(struct domain *d,
         n = 1UL << cur_order;
         if ( t == p2m_populate_on_demand )
         {
-            p2m_set_entry(p2m, gpfn + i, INVALID_MFN, cur_order,
-                          p2m_invalid, p2m->default_access);
+            /* This shouldn't be able to fail */
+            if ( p2m_set_entry(p2m, gpfn + i, INVALID_MFN, cur_order,
+                               p2m_invalid, p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unlock;
+            }
             p2m->pod.entry_count -= n;
             BUG_ON(p2m->pod.entry_count < 0);
             pod -= n;
@@ -622,8 +640,14 @@ p2m_pod_decrease_reservation(struct domain *d,
 
             page = mfn_to_page(mfn);
 
-            p2m_set_entry(p2m, gpfn + i, INVALID_MFN, cur_order,
-                          p2m_invalid, p2m->default_access);
+            /* This shouldn't be able to fail */
+            if ( p2m_set_entry(p2m, gpfn + i, INVALID_MFN, cur_order,
+                               p2m_invalid, p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unlock;
+            }
             p2m_tlb_flush_sync(p2m);
             for ( j = 0; j < n; ++j )
                 set_gpfn_from_mfn(mfn_x(mfn), INVALID_M2P_ENTRY);
-- 
2.15.0
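
The accounting hazard this patch closes can be modelled in a few
lines: if the range is assumed removed even though p2m_set_entry()
failed, the PoD entry count is decremented for entries that still
exist, and repeated calls walk it below zero until a BUG_ON() fires.
A minimal sketch of the fixed bookkeeping (plain C; the names model
the real p2m structures but are hypothetical):

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static long pod_entry_count = 512;     /* ~ p2m->pod.entry_count */
static bool pool_exhausted;            /* guest has drained the p2m pool */

static int mark_region_invalid(unsigned int order)  /* ~ p2m_set_entry() */
{
    (void)order;
    return pool_exhausted ? -1 : 0;
}

/*
 * Returns 1 when the whole PoD range was dropped, 0 to make the caller
 * fall back to guest_remove_page(), as the commit message describes.
 */
static int pod_decrease_reservation(unsigned int order)
{
    if ( mark_region_invalid(order) )
        return 0;                      /* count is NOT touched on failure */

    pod_entry_count -= 1L << order;
    assert(pod_entry_count >= 0);      /* ~ BUG_ON() in the hypervisor */
    return 1;
}

int main(void)
{
    pool_exhausted = true;             /* the scenario a guest can engineer */
    printf("handled=%d count=%ld\n",
           pod_decrease_reservation(9), pod_entry_count);
    return 0;
}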


["xsa247/0001-p2m-Always-check-to-see-if-removing-a-p2m-entry-actu.patch" (application/octet-stream)]

From dc9317d25da03a17f82cca7723edc59f2306599f Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Fri, 10 Nov 2017 16:53:54 +0000
Subject: [PATCH 1/2] p2m: Always check to see if removing a p2m entry actually
 worked

The PoD zero-check functions speculatively remove memory from the p2m,
then check to see if it's completely zeroed, before putting it in the
cache.

Unfortunately, the p2m_set_entry() calls may fail if the underlying
pagetable structure needs to change and the domain has exhausted its
p2m memory pool: for instance, if we're removing a 2MiB region out of
a 1GiB entry (in the p2m_pod_zero_check_superpage() case), or a 4k
region out of a 2MiB or larger entry (in the p2m_pod_zero_check()
case); and the return value is not checked.

The underlying mfn will then be added into the PoD cache, and at some
point mapped into another location in the p2m.  If the guest
afterwards balloons out this memory, it will be freed to the hypervisor
and potentially reused by another domain, in spite of the fact that
the original domain still has writable mappings to it.

There are several places where p2m_set_entry() shouldn't be able to
fail, as it is guaranteed to write an entry of the same order that
succeeded before.  Add a backstop of crashing the domain just in case,
and an ASSERT_UNREACHABLE() to flag up the broken assumption on debug
builds.

While we're here, use PAGE_ORDER_2M rather than a magic constant.

This is part of XSA-247.

Reported-by: XXX PERSON <XXX EMAIL>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v4:
- Removed some trailing whitespace
v3:
- Reformat reset clause to be more compact
- Make sure to set map[i] = NULL when unmapping in case we need to bail
v2:
- Crash a domain if a p2m_set_entry we think cannot fail fails anyway.
---
 xen/arch/x86/mm/p2m-pod.c | 65 +++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 54 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 7ba56b14ab..cc8e3fb845 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -771,8 +771,10 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, gfn_t gfn)
     }
 
     /* Try to remove the page, restoring old mapping if it fails. */
-    p2m_set_entry(p2m, gfn, INVALID_MFN, PAGE_ORDER_2M,
-                  p2m_populate_on_demand, p2m->default_access);
+    if ( p2m_set_entry(p2m, gfn, INVALID_MFN, PAGE_ORDER_2M,
+                       p2m_populate_on_demand, p2m->default_access) )
+        goto out;
+
     p2m_tlb_flush_sync(p2m);
 
     /*
@@ -833,8 +835,17 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, gfn_t gfn)
     ret = SUPERPAGE_PAGES;
 
 out_reset:
-    if ( reset )
-        p2m_set_entry(p2m, gfn, mfn0, 9, type0, p2m->default_access);
+    /*
+     * This p2m_set_entry() call shouldn't be able to fail, since the same order
+     * on the same gfn succeeded above.  If that turns out to be false, crashing
+     * the domain should be the safest way of making sure we don't leak memory.
+     */
+    if ( reset && p2m_set_entry(p2m, gfn, mfn0, PAGE_ORDER_2M,
+                                type0, p2m->default_access) )
+    {
+        ASSERT_UNREACHABLE();
+        domain_crash(d);
+    }
 
 out:
     gfn_unlock(p2m, gfn, SUPERPAGE_ORDER);
@@ -900,8 +911,9 @@ p2m_pod_zero_check(struct p2m_domain *p2m, const gfn_t *gfns, int count)
         }
 
         /* Try to remove the page, restoring old mapping if it fails. */
-        p2m_set_entry(p2m, gfns[i], INVALID_MFN, PAGE_ORDER_4K,
-                      p2m_populate_on_demand, p2m->default_access);
+        if ( p2m_set_entry(p2m, gfns[i], INVALID_MFN, PAGE_ORDER_4K,
+                           p2m_populate_on_demand, p2m->default_access) )
+            goto skip;
 
         /*
          * See if the page was successfully unmapped.  (Allow one refcount
@@ -909,12 +921,22 @@ p2m_pod_zero_check(struct p2m_domain *p2m, const gfn_t *gfns, int count)
          */
         if ( (mfn_to_page(mfns[i])->count_info & PGC_count_mask) > 1 )
         {
+            /*
+             * If the previous p2m_set_entry call succeeded, this one shouldn't
+             * be able to fail.  If it does, crashing the domain should be safe.
+             */
+            if ( p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
+                               types[i], p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unmap;
+            }
+
+        skip:
             unmap_domain_page(map[i]);
             map[i] = NULL;
 
-            p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
-                types[i], p2m->default_access);
-
             continue;
         }
     }
@@ -933,14 +955,25 @@ p2m_pod_zero_check(struct p2m_domain *p2m, const gfn_t *gfns, int count)
 
         unmap_domain_page(map[i]);
 
+        map[i] = NULL;
+
         /*
          * See comment in p2m_pod_zero_check_superpage() re gnttab
          * check timing.
          */
         if ( j < (PAGE_SIZE / sizeof(*map[i])) )
         {
-            p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
-                          types[i], p2m->default_access);
+            /*
+             * If the previous p2m_set_entry call succeeded, this one shouldn't
+             * be able to fail.  If it does, crashing the domain should be safe.
+             */
+            if ( p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
+                               types[i], p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unmap;
+            }
         }
         else
         {
@@ -965,6 +998,16 @@ p2m_pod_zero_check(struct p2m_domain *p2m, const gfn_t *gfns, int count)
         }
     }
 
+    return;
+
+out_unmap:
+    /*
+     * Something went wrong, probably crashing the domain.  Unmap
+     * everything and return.
+     */
+    for ( i = 0; i < count; i++ )
+        if ( map[i] )
+            unmap_domain_page(map[i]);
 }
 
 #define POD_SWEEP_LIMIT 1024
-- 
2.15.0
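
The v3 note above ("make sure to set map[i] = NULL when unmapping in
case we need to bail") is what makes the new out_unmap error path
safe: each slot is cleared the moment it is unmapped, so the bail-out
loop unmaps exactly the mappings still live.  A standalone sketch of
that cleanup discipline (plain C; map_page() and unmap_page() are
hypothetical stand-ins for map_domain_page() and unmap_domain_page()):

#include <stddef.h>
#include <stdlib.h>

#define COUNT 4

static void *map_page(void)      { return malloc(64); } /* ~ map_domain_page() */
static void  unmap_page(void *p) { free(p); }           /* ~ unmap_domain_page() */

static int zero_check(void)
{
    void *map[COUNT] = { NULL };
    size_t i;

    for ( i = 0; i < COUNT; i++ )
        map[i] = map_page();

    for ( i = 0; i < COUNT; i++ )
    {
        if ( i == 2 )            /* stand-in for a failing p2m_set_entry() */
            goto out_unmap;

        unmap_page(map[i]);
        map[i] = NULL;           /* so the error path below skips this slot */
    }

    return 0;

 out_unmap:
    /* Unmap everything still live; cleared slots are skipped. */
    for ( i = 0; i < COUNT; i++ )
        if ( map[i] )
            unmap_page(map[i]);

    return -1;
}

int main(void)
{
    return zero_check() ? EXIT_FAILURE : EXIT_SUCCESS;
}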


["xsa247/0002-p2m-Check-return-value-of-p2m_set_entry-when-decreas.patch" (application/octet-stream)]

From 510a3313bbac437c82c7091c04c973cffb1f3be3 Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Fri, 10 Nov 2017 16:53:55 +0000
Subject: [PATCH 2/2] p2m: Check return value of p2m_set_entry() when
 decreasing reservation

If the entire range specified to p2m_pod_decrease_reservation() is marked
populate-on-demand, then it will make a single p2m_set_entry() call,
reducing its PoD entry count.

Unfortunately, in the right circumstances, this p2m_set_entry() call
may fail.  In that case, repeated calls to decrease_reservation() may
cause p2m->pod.entry_count to fall below zero, potentially tripping
over BUG_ON()s to the contrary.

Instead, check whether the entry update succeeded, and return false if not.
The caller will then call guest_remove_page() on the gfns, which will
return -EINVAL upon finding no valid memory there to return.

Unfortunately, if the order is greater than 0, the entry may have been
partially changed.  A domain_crash() is probably the safest thing in
that case.

Other p2m_set_entry() calls in the same function should be fine,
because they are writing the entry at its current order.  Nonetheless,
check the return value and crash if our assumption turns out to be
wrong.

This is part of XSA-247.

Reported-by: XXX PERSON <XXX EMAIL>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v2: Crash the domain if we're not sure it's safe (or if we think it
can't happen)
---
 xen/arch/x86/mm/p2m-pod.c | 34 ++++++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index cc8e3fb845..e8d561b97e 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -565,8 +565,18 @@ p2m_pod_decrease_reservation(struct domain *d, gfn_t gfn, unsigned int order)
          * All PoD: Mark the whole region invalid and tell caller
          * we're done.
          */
-        p2m_set_entry(p2m, gfn, INVALID_MFN, order, p2m_invalid,
-                      p2m->default_access);
+        if ( p2m_set_entry(p2m, gfn, INVALID_MFN, order, p2m_invalid,
+                           p2m->default_access) )
+        {
+            /*
+             * If this fails, we can't tell how much of the range was changed.
+             * Best to crash the domain unless we're sure a partial change is
+             * impossible.
+             */
+            if ( order != 0 )
+                domain_crash(d);
+            goto out_unlock;
+        }
         p2m->pod.entry_count -= 1UL << order;
         BUG_ON(p2m->pod.entry_count < 0);
         ret = 1;
@@ -609,8 +619,14 @@ p2m_pod_decrease_reservation(struct domain *d, gfn_t gfn, unsigned int order)
         n = 1UL << cur_order;
         if ( t == p2m_populate_on_demand )
         {
-            p2m_set_entry(p2m, gfn_add(gfn, i), INVALID_MFN, cur_order,
-                          p2m_invalid, p2m->default_access);
+            /* This shouldn't be able to fail */
+            if ( p2m_set_entry(p2m, gfn_add(gfn, i), INVALID_MFN, cur_order,
+                               p2m_invalid, p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unlock;
+            }
             p2m->pod.entry_count -= n;
             BUG_ON(p2m->pod.entry_count < 0);
             pod -= n;
@@ -631,8 +647,14 @@ p2m_pod_decrease_reservation(struct domain *d, gfn_t gfn, unsigned int order)
 
             page = mfn_to_page(mfn);
 
-            p2m_set_entry(p2m, gfn_add(gfn, i), INVALID_MFN, cur_order,
-                          p2m_invalid, p2m->default_access);
+            /* This shouldn't be able to fail */
+            if ( p2m_set_entry(p2m, gfn_add(gfn, i), INVALID_MFN, cur_order,
+                               p2m_invalid, p2m->default_access) )
+            {
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+                goto out_unlock;
+            }
             p2m_tlb_flush_sync(p2m);
             for ( j = 0; j < n; ++j )
                 set_gpfn_from_mfn(mfn_x(mfn), INVALID_M2P_ENTRY);
-- 
2.15.0
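
For context on the "return false" path above: when
p2m_pod_decrease_reservation() declines to handle the range, its
caller falls back to removing pages one at a time via
guest_remove_page(), which reports -EINVAL for gfns with no valid
memory behind them.  A hedged sketch of that caller-side shape (plain
C; simplified stand-ins, not the actual code of Xen's memory-op
handler):

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the real Xen functions. */
static bool pod_decrease_reservation(unsigned long gfn, unsigned int order)
{
    (void)gfn; (void)order;
    return false;              /* pretend the PoD fast path declined */
}

static int guest_remove_page(unsigned long gfn)
{
    printf("removing gfn %#lx\n", gfn);
    return 0;                  /* would be -EINVAL with no memory there */
}

static void decrease_reservation(unsigned long gfn, unsigned int order)
{
    unsigned long j;

    if ( pod_decrease_reservation(gfn, order) )
        return;                /* PoD dropped the whole range */

    /* The fallback the commit message describes: per-page removal. */
    for ( j = 0; j < (1UL << order); j++ )
        guest_remove_page(gfn + j);
}

int main(void)
{
    decrease_reservation(0x1000, 2);   /* a 4-page extent, for example */
    return 0;
}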


