[prev in list] [next in list] [prev in thread] [next in thread] 

List:       xen-users
Subject:    [Xen-users] Xen Security Advisory 218 - Races in the grant table unmap code
From:       Xen.org security team <security () xen ! org>
Date:       2017-06-20 12:03:22
Message-ID: E1dNHsQ-0007FF-Hf () xenbits ! xenproject ! org
[Download RAW message or body]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

                    Xen Security Advisory XSA-218
                              version 4

                 Races in the grant table unmap code

UPDATES IN VERSION 4
====================

Adjust last patch description and add review tag.

Public release.

ISSUE DESCRIPTION
=================

We have discovered two bugs in the code unmapping grant references.

* When a grant had been mapped twice by a backend domain, and then
unmapped by two concurrent unmap calls, the frontend may be informed
that the page had no further mappings when the first call completed rather
than when the second call completed.

* A race triggerable by an unprivileged guest could cause a grant
maptrack entry for grants to be "freed" twice.  The ultimate effect of
this would be for maptrack entries for a single domain to be re-used.

IMPACT
======

For the first issue, for a short window of time, a malicious backend
could still read and write memory that the frontend thought was its
own again.  Depending on the usage, this could be either an
information leak, or a backend-to-frontend privilege escalation.

The second issue is more difficult to analyze. It can probably cause
reference counts to leak, preventing memory from being freed on domain
destruction (denial-of-service), but information leakage or host
privilege escalation cannot be ruled out.

VULNERABLE SYSTEMS
==================

All versions of Xen are vulnerable.

Both ARM and x86 are vulnerable.

On x86, systems with either PV or HVM guests are vulnerable.

MITIGATION
==========

None.

CREDITS
=======

This issue was discovered by Jann Horn of Google Project Zero.

RESOLUTION
==========

Applying the appropriate set of attached patches resolves this issue.

xsa218-unstable/*.patch    xen-unstable
xsa218-4.8/*.patch         Xen 4.8.x
xsa218-4.7/*.patch         Xen 4.7.x
xsa218-4.6/*.patch         Xen 4.6.x
xsa218-4.5/*.patch         Xen 4.5.x

$ sha256sum xsa218*/*
6f5e588edb6d3f0a37b89235e95cdcc7ca73cdff236d86b65e6f608bd15b03ec  \
xsa218-unstable/0001-gnttab-fix-unmap-pin-accounting-race.patch \
5cb85f0aaa19ff343fc51b08addbf37d62352774115acd28eb18a73f67507e21  \
xsa218-unstable/0002-gnttab-Avoid-potential-double-put-of-maptrack-entry.patch \
f5f3d27ce2829b3aa5e09b216bf9afcb1dc6b1f9f3b3a0f3ebfe5a68b4948aef  \
xsa218-unstable/0003-gnttab-correct-maptrack-table-accesses.patch \
fafb8773957bbffb21ab43c7a3559efe15f52d234afba5f2ad2739411946c021  \
xsa218-4.5/0001-IOMMU-handle-IOMMU-mapping-and-unmapping-failures.patch \
4398ad7111421dbf954ede651cb7f9acd83c654c7fa93d54a4e5f9b7b25fe918  \
xsa218-4.5/0002-gnttab-fix-unmap-pin-accounting-race.patch \
9d23946afb96a70c574b8c7ff42ed8b30b72e9a1f751ff617a7578c79645c094  \
xsa218-4.5/0003-gnttab-Avoid-potential-double-put-of-maptrack-entry.patch \
27d92c6f4d89de3fd9e9311337823370303c1ef985cce2bd9bea28f00cd6c184  \
xsa218-4.5/0004-gnttab-correct-maptrack-table-accesses.patch \
99ac090d7955a46c6c9c73ca62b64cef6b8f05439961e52278c662f030a36ee2  \
xsa218-4.6/0001-IOMMU-handle-IOMMU-mapping-and-unmapping-failures.patch \
e0f0839336e055c1422cf0f76c37f6d9cc8474b0140ffef2451dca6697a9f20f  \
xsa218-4.6/0002-gnttab-fix-unmap-pin-accounting-race.patch \
5f6f63211b18bb6ec157353b9e8b844abe3fd767ef1780e6d28731e935559fbc  \
xsa218-4.6/0003-gnttab-Avoid-potential-double-put-of-maptrack-entry.patch \
6a786a8c4b916b6f99092598bd4d60381907cd7e728c98a79e999afeec4f45a6  \
xsa218-4.6/0004-gnttab-correct-maptrack-table-accesses.patch \
58354eec5f4f0b87640c702c6e1ce0eeb57dffbd09394a96e88bd6ff42c53e7e  \
xsa218-4.7/0001-IOMMU-handle-IOMMU-mapping-and-unmapping-failures.patch \
0683d7ffdbe60dc8e1d161adeb0c5465df1840e86353b5cbb96dd204f2dbb526  \
xsa218-4.7/0002-gnttab-fix-unmap-pin-accounting-race.patch \
6bfef9e1653a305e49653c5b81acb57ca41ee8410ea085d49c9bc7e4ccd31e54  \
xsa218-4.7/0003-gnttab-Avoid-potential-double-put-of-maptrack-entry.patch \
b4ede29e3a94d9e7992c90b8b7c8d489e071764218b28962b5755a444040e1ae  \
xsa218-4.7/0004-gnttab-correct-maptrack-table-accesses.patch \
c2a1b40e76764333f3ee34dd9bc7d3e34bab91f8b44eaae7aa6f187bbddb358f  \
xsa218-4.8/0001-gnttab-fix-unmap-pin-accounting-race.patch \
a210ff17a0ca1a81f2c98cce84a104ac7dd2f1a72fa3855ca5f3b3d13e95468c  \
xsa218-4.8/0002-gnttab-Avoid-potential-double-put-of-maptrack-entry.patch \
0b8fa3d6a0f3ccb43c8134db2240867d5a850ee0821d4124a1642596b4d6cb5a  \
xsa218-4.8/0003-gnttab-correct-maptrack-table-accesses.patch $

DEPLOYMENT DURING EMBARGO
=========================

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.

(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBCAAGBQJZSQ8GAAoJEIP+FMlX6CvZMtoH/jDvFJKVeBCD3p/6sg8XiMR9
saDiNfB0ULOIw6ffqFMC8SKYS7cRh+ijmt66ehsPZ/Azv17P5L19bmXQlW2y0Ea9
sAoJ3OJGI7FUz2O8SVhLmN+wSxKmfwmzuK6Rn3xX6VE9UPL8yfzdZIDT1K+Oparz
0mw0IAp7xukXdB0LlWePf1RLTl+0tZAoqsOIQXmM58kz1zvXIgOuSbI/ULUb/vnY
cj6BJKdyARM+7Kgpdj7bw1cunjo5RiH2aSeji9/T6QJbO6sVv3cb7qZfV94SRfJL
eaem+3awJjo39R5itO/UgL55K77/7yqtKt8ZUvhndKgmXeWyQsTp7HQx+lE8zv4=
=sS5B
-----END PGP SIGNATURE-----


["xsa218-unstable/0001-gnttab-fix-unmap-pin-accounting-race.patch" (application/octet-stream)]

From 18bddb81714c87dffccabde8261c551abd15a07a Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich@suse.com>
Date: Fri, 2 Jun 2017 12:22:42 +0100
Subject: [PATCH 1/3] gnttab: fix unmap pin accounting race

Once all {writable} mappings of a grant entry have been unmapped, the
hypervisor informs the guest that the grant entry has been released by
clearing the _GTF_{reading,writing} usage flags in the guest's grant
table as appropriate.

Unfortunately, at the moment, the code that updates the accounting
happens in a different critical section than the one which updates the
usage flags; this means that under the right circumstances, there may be
a window in time after the hypervisor reported the grant as being free
during which the grant referee still had access to the page.

Move the grant accounting code into the same critical section as the
reporting code to make sure this kind of race can't happen.

This is part of XSA-218.

Reported-by: Jann Horn <jannh.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
 xen/common/grant_table.c | 32 +++++++++++++++++---------------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 4fe9544..147afe9 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1150,15 +1150,8 @@ __gnttab_unmap_common(
             PIN_FAIL(act_release_out, GNTST_general_error,
                      "Bad frame number doesn't match gntref. (%lx != %lx)\n",
                      op->frame, act->frame);
-        if ( op->flags & GNTMAP_device_map )
-        {
-            ASSERT(act->pin & (GNTPIN_devw_mask | GNTPIN_devr_mask));
-            op->map->flags &= ~GNTMAP_device_map;
-            if ( op->flags & GNTMAP_readonly )
-                act->pin -= GNTPIN_devr_inc;
-            else
-                act->pin -= GNTPIN_devw_inc;
-        }
+
+        op->map->flags &= ~GNTMAP_device_map;
     }
 
     if ( (op->host_addr != 0) && (op->flags & GNTMAP_host_map) )
@@ -1168,12 +1161,7 @@ __gnttab_unmap_common(
                                               op->flags)) < 0 )
             goto act_release_out;
 
-        ASSERT(act->pin & (GNTPIN_hstw_mask | GNTPIN_hstr_mask));
         op->map->flags &= ~GNTMAP_host_map;
-        if ( op->flags & GNTMAP_readonly )
-            act->pin -= GNTPIN_hstr_inc;
-        else
-            act->pin -= GNTPIN_hstw_inc;
     }
 
  act_release_out:
@@ -1266,6 +1254,12 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
             else
                 put_page_and_type(pg);
         }
+
+        ASSERT(act->pin & (GNTPIN_devw_mask | GNTPIN_devr_mask));
+        if ( op->flags & GNTMAP_readonly )
+            act->pin -= GNTPIN_devr_inc;
+        else
+            act->pin -= GNTPIN_devw_inc;
     }
 
     if ( (op->host_addr != 0) && (op->flags & GNTMAP_host_map) )
@@ -1274,7 +1268,9 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
         {
             /*
              * Suggests that __gntab_unmap_common failed in
-             * replace_grant_host_mapping() so nothing further to do
+             * replace_grant_host_mapping() or IOMMU handling, so nothing
+             * further to do (short of re-establishing the mapping in the
+             * latter case).
              */
             goto act_release_out;
         }
@@ -1285,6 +1281,12 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
                 put_page_type(pg);
             put_page(pg);
         }
+
+        ASSERT(act->pin & (GNTPIN_hstw_mask | GNTPIN_hstr_mask));
+        if ( op->flags & GNTMAP_readonly )
+            act->pin -= GNTPIN_hstr_inc;
+        else
+            act->pin -= GNTPIN_hstw_inc;
     }
 
     if ( (op->map->flags & (GNTMAP_device_map|GNTMAP_host_map)) == 0 )
-- 
2.1.4


["xsa218-unstable/0002-gnttab-Avoid-potential-double-put-of-maptrack-entry.patch" (application/octet-stream)]

From 01b0102634c08e12723ae67512326b6b9e4abd4e Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Thu, 15 Jun 2017 12:05:14 +0100
Subject: [PATCH 2/3] gnttab: Avoid potential double-put of maptrack entry

Each grant mapping for a particular domain is tracked by an in-Xen
"maptrack" entry.  This entry is is referenced by a "handle", which is
given to the guest when it calls gnttab_map_grant_ref().

There are two types of mapping a particular handle can refer to:
GNTMAP_host_map and GNTMAP_device_map.  A given
gnttab_unmap_grant_ref() call can remove either only one or both of
these entries.  When a particular handle has no entries left, it must
be freed.

gnttab_unmap_grant_ref() loops through its grant unmap request list
twice.  It first removes entries from any host pagetables and (if
appropraite) iommus; then it does a single domain TLB flush; then it
does the clean-up, including telling the granter that entries are no
longer being used (if appropriate).

At the moment, it's during the first pass that the maptrack flags are
cleared, but the second pass that the maptrack entry is freed.

Unfortunately this allows the following race, which results in a
double-free:

 A: (pass 1) clear host_map
 B: (pass 1) clear device_map
 A: (pass 2) See that maptrack entry has no mappings, free it
 B: (pass 2) See that maptrack entry has no mappings, free it #

Unfortunately, unlike the active entry pinning update, we can't simply
move the maptrack flag changes to the second half, because the
maptrack flags are used to determine if iommu entries need to be
added: a domain's iommu must never have fewer permissions than the
maptrack flags indicate, or a subsequent map_grant_ref() might fail to
add the necessary iommu entries.

Instead, free the maptrack entry in the first pass if there are no
further mappings.

This is part of XSA-218.

Reported-by: Jan Beulich <jbeulich.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
 xen/common/grant_table.c | 77 +++++++++++++++++++++++++++++++++---------------
 1 file changed, 53 insertions(+), 24 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 147afe9..7098af7 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -98,8 +98,8 @@ struct gnttab_unmap_common {
     /* Shared state beteen *_unmap and *_unmap_complete */
     u16 flags;
     unsigned long frame;
-    struct grant_mapping *map;
     struct domain *rd;
+    grant_ref_t ref;
 };
 
 /* Number of unmap operations that are done between each tlb flush */
@@ -1079,6 +1079,8 @@ __gnttab_unmap_common(
     struct grant_table *lgt, *rgt;
     struct active_grant_entry *act;
     s16              rc = 0;
+    struct grant_mapping *map;
+    bool put_handle = false;
 
     ld = current->domain;
     lgt = ld->grant_table;
@@ -1092,11 +1094,11 @@ __gnttab_unmap_common(
         return;
     }
 
-    op->map = &maptrack_entry(lgt, op->handle);
+    map = &maptrack_entry(lgt, op->handle);
 
     grant_read_lock(lgt);
 
-    if ( unlikely(!read_atomic(&op->map->flags)) )
+    if ( unlikely(!read_atomic(&map->flags)) )
     {
         grant_read_unlock(lgt);
         gdprintk(XENLOG_INFO, "Zero flags for handle %#x\n", op->handle);
@@ -1104,7 +1106,7 @@ __gnttab_unmap_common(
         return;
     }
 
-    dom = op->map->domid;
+    dom = map->domid;
     grant_read_unlock(lgt);
 
     if ( unlikely((rd = rcu_lock_domain_by_id(dom)) == NULL) )
@@ -1129,16 +1131,43 @@ __gnttab_unmap_common(
 
     grant_read_lock(rgt);
 
-    op->flags = read_atomic(&op->map->flags);
-    if ( unlikely(!op->flags) || unlikely(op->map->domid != dom) )
+    op->rd = rd;
+    op->ref = map->ref;
+
+    /*
+     * We can't assume there was no racing unmap for this maptrack entry,
+     * and hence we can't assume map->ref is valid for rd. While the checks
+     * below (with the active entry lock held) will reject any such racing
+     * requests, we still need to make sure we don't attempt to acquire an
+     * invalid lock.
+     */
+    smp_rmb();
+    if ( unlikely(op->ref >= nr_grant_entries(rgt)) )
     {
         gdprintk(XENLOG_WARNING, "Unstable handle %#x\n", op->handle);
         rc = GNTST_bad_handle;
-        goto unmap_out;
+        goto unlock_out;
     }
 
-    op->rd = rd;
-    act = active_entry_acquire(rgt, op->map->ref);
+    act = active_entry_acquire(rgt, op->ref);
+
+    /*
+     * Note that we (ab)use the active entry lock here to protect against
+     * multiple unmaps of the same mapping here. We don't want to hold lgt's
+     * lock, and we only hold rgt's lock for reading (but the latter wouldn't
+     * be the right one anyway). Hence the easiest is to rely on a lock we
+     * hold anyway; see docs/misc/grant-tables.txt's "Locking" section.
+     */
+
+    op->flags = read_atomic(&map->flags);
+    smp_rmb();
+    if ( unlikely(!op->flags) || unlikely(map->domid != dom) ||
+         unlikely(map->ref != op->ref) )
+    {
+        gdprintk(XENLOG_WARNING, "Unstable handle %#x\n", op->handle);
+        rc = GNTST_bad_handle;
+        goto act_release_out;
+    }
 
     if ( op->frame == 0 )
     {
@@ -1151,7 +1180,7 @@ __gnttab_unmap_common(
                      "Bad frame number doesn't match gntref. (%lx != %lx)\n",
                      op->frame, act->frame);
 
-        op->map->flags &= ~GNTMAP_device_map;
+        map->flags &= ~GNTMAP_device_map;
     }
 
     if ( (op->host_addr != 0) && (op->flags & GNTMAP_host_map) )
@@ -1161,14 +1190,23 @@ __gnttab_unmap_common(
                                               op->flags)) < 0 )
             goto act_release_out;
 
-        op->map->flags &= ~GNTMAP_host_map;
+        map->flags &= ~GNTMAP_host_map;
+    }
+
+    if ( !(map->flags & (GNTMAP_device_map|GNTMAP_host_map)) )
+    {
+        map->flags = 0;
+        put_handle = true;
     }
 
  act_release_out:
     active_entry_release(act);
- unmap_out:
+ unlock_out:
     grant_read_unlock(rgt);
 
+    if ( put_handle )
+        put_maptrack_handle(lgt, op->handle);
+
     if ( rc == GNTST_okay && gnttab_need_iommu_mapping(ld) )
     {
         unsigned int kind;
@@ -1205,7 +1243,6 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
     grant_entry_header_t *sha;
     struct page_info *pg;
     uint16_t *status;
-    bool_t put_handle = 0;
 
     if ( rd == NULL )
     { 
@@ -1226,13 +1263,13 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
     if ( rgt->gt_version == 0 )
         goto unlock_out;
 
-    act = active_entry_acquire(rgt, op->map->ref);
-    sha = shared_entry_header(rgt, op->map->ref);
+    act = active_entry_acquire(rgt, op->ref);
+    sha = shared_entry_header(rgt, op->ref);
 
     if ( rgt->gt_version == 1 )
         status = &sha->flags;
     else
-        status = &status_entry(rgt, op->map->ref);
+        status = &status_entry(rgt, op->ref);
 
     if ( unlikely(op->frame != act->frame) ) 
     {
@@ -1289,9 +1326,6 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
             act->pin -= GNTPIN_hstw_inc;
     }
 
-    if ( (op->map->flags & (GNTMAP_device_map|GNTMAP_host_map)) == 0 )
-        put_handle = 1;
-
     if ( ((act->pin & (GNTPIN_devw_mask|GNTPIN_hstw_mask)) == 0) &&
          !(op->flags & GNTMAP_readonly) )
         gnttab_clear_flag(_GTF_writing, status);
@@ -1304,11 +1338,6 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
  unlock_out:
     grant_read_unlock(rgt);
 
-    if ( put_handle )
-    {
-        op->map->flags = 0;
-        put_maptrack_handle(ld->grant_table, op->handle);
-    }
     rcu_unlock_domain(rd);
 }
 
-- 
2.1.4


["xsa218-unstable/0003-gnttab-correct-maptrack-table-accesses.patch" (application/octet-stream)]

From d61cad5ed9a2c0b06f5cff5c5723c5862942d15d Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich@suse.com>
Date: Thu, 15 Jun 2017 12:05:29 +0100
Subject: [PATCH 3/3] gnttab: correct maptrack table accesses

In order to observe a consistent (limit,pointer-table) pair, the reader
needs to either hold the maptrack lock (in line with documentation) or
both sides need to order their accesses suitably (the writer side
barrier was removed by commit dff515dfea ["gnttab: use per-VCPU
maptrack free lists"], and a read side barrier has never been there).

Make the writer publish a new table page before limit (for bounds
checks to work), and new list head last (for racing maptrack_entry()
invocations to work). At the same time add read barriers to lockless
readers.

Additionally get_maptrack_handle() must not assume ->maptrack_head to
not change behind its back: Another handle may be put (updating only
->maptrack_tail) and then got or stolen (updating ->maptrack_head).

This is part of XSA-218.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
---
 xen/common/grant_table.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 7098af7..927fd2b 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -395,7 +395,7 @@ get_maptrack_handle(
     struct grant_table *lgt)
 {
     struct vcpu          *curr = current;
-    int                   i;
+    unsigned int          i, head;
     grant_handle_t        handle;
     struct grant_mapping *new_mt;
 
@@ -451,17 +451,20 @@ get_maptrack_handle(
         new_mt[i].ref = handle + i + 1;
         new_mt[i].vcpu = curr->vcpu_id;
     }
-    new_mt[i - 1].ref = curr->maptrack_head;
 
     /* Set tail directly if this is the first page for this VCPU. */
     if ( curr->maptrack_tail == MAPTRACK_TAIL )
         curr->maptrack_tail = handle + MAPTRACK_PER_PAGE - 1;
 
-    write_atomic(&curr->maptrack_head, handle + 1);
-
     lgt->maptrack[nr_maptrack_frames(lgt)] = new_mt;
+    smp_wmb();
     lgt->maptrack_limit += MAPTRACK_PER_PAGE;
 
+    do {
+        new_mt[i - 1].ref = read_atomic(&curr->maptrack_head);
+        head = cmpxchg(&curr->maptrack_head, new_mt[i - 1].ref, handle + 1);
+    } while ( head != new_mt[i - 1].ref );
+
     spin_unlock(&lgt->maptrack_lock);
 
     return handle;
@@ -727,6 +730,7 @@ static unsigned int mapkind(
     for ( handle = 0; !(kind & MAPKIND_WRITE) &&
                       handle < lgt->maptrack_limit; handle++ )
     {
+        smp_rmb();
         map = &maptrack_entry(lgt, handle);
         if ( !(map->flags & (GNTMAP_device_map|GNTMAP_host_map)) ||
              map->domid != rd->domain_id )
@@ -1094,6 +1098,7 @@ __gnttab_unmap_common(
         return;
     }
 
+    smp_rmb();
     map = &maptrack_entry(lgt, op->handle);
 
     grant_read_lock(lgt);
-- 
2.1.4


["xsa218-4.5/0001-IOMMU-handle-IOMMU-mapping-and-unmapping-failures.patch" (application/octet-stream)]

From 4ca403ecff82d3efdd838e08b258cb4dd3062c60 Mon Sep 17 00:00:00 2001
From: Quan Xu <quan.xu@intel.com>
Date: Fri, 2 Jun 2017 12:30:34 +0100
Subject: [PATCH 1/4] IOMMU: handle IOMMU mapping and unmapping failures

Treat IOMMU mapping and unmapping failures as a fatal to the DomU
If IOMMU mapping and unmapping failed, crash the DomU and propagate
the error up to the call trees.

No spamming of the log can occur. For DomU, we avoid logging any
message for already dying domains. For Dom0, that'll still be more
verbose than we'd really like, but it at least wouldn't outright
flood the console.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/drivers/passthrough/iommu.c | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index f2841de..5af95ce 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -200,21 +200,47 @@ int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
                    unsigned int flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
+    int rc;
 
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    return hd->platform_ops->map_page(d, gfn, mfn, flags);
+    rc = hd->platform_ops->map_page(d, gfn, mfn, flags);
+    if ( unlikely(rc) )
+    {
+        if ( !d->is_shutting_down && printk_ratelimit() )
+            printk(XENLOG_ERR
+                   "d%d: IOMMU mapping gfn %#lx to mfn %#lx failed: %d\n",
+                   d->domain_id, gfn, mfn, rc);
+
+        if ( !is_hardware_domain(d) )
+            domain_crash(d);
+    }
+
+    return rc;
 }
 
 int iommu_unmap_page(struct domain *d, unsigned long gfn)
 {
     const struct domain_iommu *hd = dom_iommu(d);
+    int rc;
 
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    return hd->platform_ops->unmap_page(d, gfn);
+    rc = hd->platform_ops->unmap_page(d, gfn);
+    if ( unlikely(rc) )
+    {
+        if ( !d->is_shutting_down && printk_ratelimit() )
+            printk(XENLOG_ERR
+                   "d%d: IOMMU unmapping gfn %#lx failed: %d\n",
+                   d->domain_id, gfn, rc);
+
+        if ( !is_hardware_domain(d) )
+            domain_crash(d);
+    }
+
+    return rc;
 }
 
 static void iommu_free_pagetables(unsigned long unused)
-- 
2.1.4


["xsa218-4.5/0002-gnttab-fix-unmap-pin-accounting-race.patch" (application/octet-stream)]

From abbbccaa119bf8b05f0828bd9138104f08c4f8df Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich@suse.com>
Date: Fri, 2 Jun 2017 12:22:42 +0100
Subject: [PATCH 2/4] gnttab: fix unmap pin accounting race

Once all {writable} mappings of a grant entry have been unmapped, the
hypervisor informs the guest that the grant entry has been released by
clearing the _GTF_{reading,writing} usage flags in the guest's grant
table as appropriate.

Unfortunately, at the moment, the code that updates the accounting
happens in a different critical section than the one which updates the
usage flags; this means that under the right circumstances, there may be
a window in time after the hypervisor reported the grant as being free
during which the grant referee still had access to the page.

Move the grant accounting code into the same critical section as the
reporting code to make sure this kind of race can't happen.

This is part of XSA-218.

Reported-by: Jann Horn <jannh.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
 xen/common/grant_table.c | 32 +++++++++++++++++---------------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 935034c..9ef0a6f 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -962,15 +962,8 @@ __gnttab_unmap_common(
             PIN_FAIL(unmap_out, GNTST_general_error,
                      "Bad frame number doesn't match gntref. (%lx != %lx)\n",
                      op->frame, act->frame);
-        if ( op->flags & GNTMAP_device_map )
-        {
-            ASSERT(act->pin & (GNTPIN_devw_mask | GNTPIN_devr_mask));
-            op->map->flags &= ~GNTMAP_device_map;
-            if ( op->flags & GNTMAP_readonly )
-                act->pin -= GNTPIN_devr_inc;
-            else
-                act->pin -= GNTPIN_devw_inc;
-        }
+
+        op->map->flags &= ~GNTMAP_device_map;
     }
 
     if ( (op->host_addr != 0) && (op->flags & GNTMAP_host_map) )
@@ -980,12 +973,7 @@ __gnttab_unmap_common(
                                               op->flags)) < 0 )
             goto unmap_out;
 
-        ASSERT(act->pin & (GNTPIN_hstw_mask | GNTPIN_hstr_mask));
         op->map->flags &= ~GNTMAP_host_map;
-        if ( op->flags & GNTMAP_readonly )
-            act->pin -= GNTPIN_hstr_inc;
-        else
-            act->pin -= GNTPIN_hstw_inc;
     }
 
     if ( gnttab_need_iommu_mapping(ld) )
@@ -1072,6 +1060,12 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
             else
                 put_page_and_type(pg);
         }
+
+        ASSERT(act->pin & (GNTPIN_devw_mask | GNTPIN_devr_mask));
+        if ( op->flags & GNTMAP_readonly )
+            act->pin -= GNTPIN_devr_inc;
+        else
+            act->pin -= GNTPIN_devw_inc;
     }
 
     if ( (op->host_addr != 0) && (op->flags & GNTMAP_host_map) )
@@ -1080,7 +1074,9 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
         {
             /*
              * Suggests that __gntab_unmap_common failed in
-             * replace_grant_host_mapping() so nothing further to do
+             * replace_grant_host_mapping() or IOMMU handling, so nothing
+             * further to do (short of re-establishing the mapping in the
+             * latter case).
              */
             goto unmap_out;
         }
@@ -1091,6 +1087,12 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
                 put_page_type(pg);
             put_page(pg);
         }
+
+        ASSERT(act->pin & (GNTPIN_hstw_mask | GNTPIN_hstr_mask));
+        if ( op->flags & GNTMAP_readonly )
+            act->pin -= GNTPIN_hstr_inc;
+        else
+            act->pin -= GNTPIN_hstw_inc;
     }
 
     if ( (op->map->flags & (GNTMAP_device_map|GNTMAP_host_map)) == 0 )
-- 
2.1.4


["xsa218-4.5/0003-gnttab-Avoid-potential-double-put-of-maptrack-entry.patch" (application/octet-stream)]

From d188b4bf7c6d17fae1a7d14867aaeb72f972d3ac Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Fri, 2 Jun 2017 12:40:04 +0100
Subject: [PATCH 3/4] gnttab: Avoid potential double-put of maptrack entry

Each grant mapping for a particular domain is tracked by an in-Xen
"maptrack" entry.  This entry is is referenced by a "handle", which is
given to the guest when it calls gnttab_map_grant_ref().

There are two types of mapping a particular handle can refer to:
GNTMAP_host_map and GNTMAP_device_map.  A given
gnttab_unmap_grant_ref() call can remove either only one or both of
these entries.  When a particular handle has no entries left, it must
be freed.

gnttab_unmap_grant_ref() loops through its grant unmap request list
twice.  It first removes entries from any host pagetables and (if
appropraite) iommus; then it does a single domain TLB flush; then it
does the clean-up, including telling the granter that entries are no
longer being used (if appropriate).

At the moment, it's during the first pass that the maptrack flags are
cleared, but the second pass that the maptrack entry is freed.

Unfortunately this allows the following race, which results in a
double-free:

 A: (pass 1) clear host_map
 B: (pass 1) clear device_map
 A: (pass 2) See that maptrack entry has no mappings, free it
 B: (pass 2) See that maptrack entry has no mappings, free it #

Unfortunately, unlike the active entry pinning update, we can't simply
move the maptrack flag changes to the second half, because the
maptrack flags are used to determine if iommu entries need to be
added: a domain's iommu must never have fewer permissions than the
maptrack flags indicate, or a subsequent map_grant_ref() might fail to
add the necessary iommu entries.

Instead, free the maptrack entry in the first pass if there are no
further mappings.

This is part of XSA-218.

Reported-by: Jan Beulich <jbeulich.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
 xen/common/grant_table.c | 63 +++++++++++++++++++++++++++---------------------
 1 file changed, 35 insertions(+), 28 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 9ef0a6f..5203929 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -98,8 +98,8 @@ struct gnttab_unmap_common {
     /* Shared state beteen *_unmap and *_unmap_complete */
     u16 flags;
     unsigned long frame;
-    struct grant_mapping *map;
     struct domain *rd;
+    grant_ref_t ref;
 };
 
 /* Number of unmap operations that are done between each tlb flush */
@@ -893,6 +893,8 @@ __gnttab_unmap_common(
     struct grant_table *lgt, *rgt;
     struct active_grant_entry *act;
     s16              rc = 0;
+    struct grant_mapping *map;
+    bool_t put_handle = 0;
 
     ld = current->domain;
     lgt = ld->grant_table;
@@ -906,10 +908,10 @@ __gnttab_unmap_common(
         return;
     }
 
-    op->map = &maptrack_entry(lgt, op->handle);
+    map = &maptrack_entry(lgt, op->handle);
     spin_lock(&lgt->lock);
 
-    if ( unlikely(!op->map->flags) )
+    if ( unlikely(!map->flags) )
     {
         spin_unlock(&lgt->lock);
         gdprintk(XENLOG_INFO, "Zero flags for handle (%d).\n", op->handle);
@@ -917,7 +919,7 @@ __gnttab_unmap_common(
         return;
     }
 
-    dom = op->map->domid;
+    dom = map->domid;
     spin_unlock(&lgt->lock);
 
     if ( unlikely((rd = rcu_lock_domain_by_id(dom)) == NULL) )
@@ -941,8 +943,8 @@ __gnttab_unmap_common(
     rgt = rd->grant_table;
     double_gt_lock(lgt, rgt);
 
-    op->flags = op->map->flags;
-    if ( unlikely(!op->flags) || unlikely(op->map->domid != dom) )
+    op->flags = map->flags;
+    if ( unlikely(!op->flags) || unlikely(map->domid != dom) )
     {
         gdprintk(XENLOG_WARNING, "Unstable handle %u\n", op->handle);
         rc = GNTST_bad_handle;
@@ -950,7 +952,8 @@ __gnttab_unmap_common(
     }
 
     op->rd = rd;
-    act = &active_entry(rgt, op->map->ref);
+    op->ref = map->ref;
+    act = &active_entry(rgt, map->ref);
 
     if ( op->frame == 0 )
     {
@@ -963,7 +966,7 @@ __gnttab_unmap_common(
                      "Bad frame number doesn't match gntref. (%lx != %lx)\n",
                      op->frame, act->frame);
 
-        op->map->flags &= ~GNTMAP_device_map;
+        map->flags &= ~GNTMAP_device_map;
     }
 
     if ( (op->host_addr != 0) && (op->flags & GNTMAP_host_map) )
@@ -973,31 +976,44 @@ __gnttab_unmap_common(
                                               op->flags)) < 0 )
             goto unmap_out;
 
-        op->map->flags &= ~GNTMAP_host_map;
+        map->flags &= ~GNTMAP_host_map;
     }
 
-    if ( gnttab_need_iommu_mapping(ld) )
+    if ( !(map->flags & (GNTMAP_device_map|GNTMAP_host_map)) )
+    {
+        map->flags = 0;
+        put_handle = 1;
+    }
+
+ unmap_out:
+    double_gt_unlock(lgt, rgt);
+
+    if ( put_handle )
+        put_maptrack_handle(lgt, op->handle);
+
+    if ( rc == GNTST_okay && gnttab_need_iommu_mapping(ld) )
     {
         unsigned int wrc, rdc;
         int err = 0;
+
+        double_gt_lock(lgt, rgt);
+
         mapcount(lgt, rd, op->frame, &wrc, &rdc);
         if ( (wrc + rdc) == 0 )
             err = iommu_unmap_page(ld, op->frame);
         else if ( wrc == 0 )
             err = iommu_map_page(ld, op->frame, op->frame, IOMMUF_readable);
+
+        double_gt_unlock(lgt, rgt);
+
         if ( err )
-        {
             rc = GNTST_general_error;
-            goto unmap_out;
-        }
     }
 
     /* If just unmapped a writable mapping, mark as dirtied */
-    if ( !(op->flags & GNTMAP_readonly) )
+    if ( rc == GNTST_okay && !(op->flags & GNTMAP_readonly) )
          gnttab_mark_dirty(rd, op->frame);
 
- unmap_out:
-    double_gt_unlock(lgt, rgt);
     op->status = rc;
     rcu_unlock_domain(rd);
 }
@@ -1011,7 +1027,6 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
     grant_entry_header_t *sha;
     struct page_info *pg;
     uint16_t *status;
-    bool_t put_handle = 0;
 
     if ( rd == NULL )
     { 
@@ -1032,13 +1047,13 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
     if ( rgt->gt_version == 0 )
         goto unmap_out;
 
-    act = &active_entry(rgt, op->map->ref);
-    sha = shared_entry_header(rgt, op->map->ref);
+    act = &active_entry(rgt, op->ref);
+    sha = shared_entry_header(rgt, op->ref);
 
     if ( rgt->gt_version == 1 )
         status = &sha->flags;
     else
-        status = &status_entry(rgt, op->map->ref);
+        status = &status_entry(rgt, op->ref);
 
     if ( unlikely(op->frame != act->frame) ) 
     {
@@ -1095,9 +1110,6 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
             act->pin -= GNTPIN_hstw_inc;
     }
 
-    if ( (op->map->flags & (GNTMAP_device_map|GNTMAP_host_map)) == 0 )
-        put_handle = 1;
-
     if ( ((act->pin & (GNTPIN_devw_mask|GNTPIN_hstw_mask)) == 0) &&
          !(op->flags & GNTMAP_readonly) )
         gnttab_clear_flag(_GTF_writing, status);
@@ -1107,11 +1119,6 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
 
  unmap_out:
     spin_unlock(&rgt->lock);
-    if ( put_handle )
-    {
-        op->map->flags = 0;
-        put_maptrack_handle(ld->grant_table, op->handle);
-    }
     rcu_unlock_domain(rd);
 }
 
-- 
2.1.4


["xsa218-4.5/0004-gnttab-correct-maptrack-table-accesses.patch" (application/octet-stream)]

From 1b0cf99d046286c601a28af0ef7c1a17eb3eb2cb Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich@suse.com>
Date: Thu, 15 Jun 2017 12:05:29 +0100
Subject: [PATCH 4/4] gnttab: correct maptrack table accesses

In order to observe a consistent (limit,pointer-table) pair, the reader
needs to either hold the grant table lock or both sides need to order
their accesses suitably (the writer side barrier is already there). Add
the missing barrier.

This is part of XSA-218.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
---
 xen/common/grant_table.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 5203929..ac98aef 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -908,7 +908,9 @@ __gnttab_unmap_common(
         return;
     }
 
+    smp_rmb();
     map = &maptrack_entry(lgt, op->handle);
+
     spin_lock(&lgt->lock);
 
     if ( unlikely(!map->flags) )
-- 
2.1.4


["xsa218-4.6/0001-IOMMU-handle-IOMMU-mapping-and-unmapping-failures.patch" (application/octet-stream)]

From d5f169f853cc6ae4295565e3c4765a270278c5c0 Mon Sep 17 00:00:00 2001
From: Quan Xu <quan.xu@intel.com>
Date: Fri, 2 Jun 2017 12:30:34 +0100
Subject: [PATCH 1/4] IOMMU: handle IOMMU mapping and unmapping failures

Treat IOMMU mapping and unmapping failures as a fatal to the DomU
If IOMMU mapping and unmapping failed, crash the DomU and propagate
the error up to the call trees.

No spamming of the log can occur. For DomU, we avoid logging any
message for already dying domains. For Dom0, that'll still be more
verbose than we'd really like, but it at least wouldn't outright
flood the console.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/drivers/passthrough/iommu.c | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index dfb0fdd..6394b24 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -228,21 +228,47 @@ int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
                    unsigned int flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
+    int rc;
 
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    return hd->platform_ops->map_page(d, gfn, mfn, flags);
+    rc = hd->platform_ops->map_page(d, gfn, mfn, flags);
+    if ( unlikely(rc) )
+    {
+        if ( !d->is_shutting_down && printk_ratelimit() )
+            printk(XENLOG_ERR
+                   "d%d: IOMMU mapping gfn %#lx to mfn %#lx failed: %d\n",
+                   d->domain_id, gfn, mfn, rc);
+
+        if ( !is_hardware_domain(d) )
+            domain_crash(d);
+    }
+
+    return rc;
 }
 
 int iommu_unmap_page(struct domain *d, unsigned long gfn)
 {
     const struct domain_iommu *hd = dom_iommu(d);
+    int rc;
 
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    return hd->platform_ops->unmap_page(d, gfn);
+    rc = hd->platform_ops->unmap_page(d, gfn);
+    if ( unlikely(rc) )
+    {
+        if ( !d->is_shutting_down && printk_ratelimit() )
+            printk(XENLOG_ERR
+                   "d%d: IOMMU unmapping gfn %#lx failed: %d\n",
+                   d->domain_id, gfn, rc);
+
+        if ( !is_hardware_domain(d) )
+            domain_crash(d);
+    }
+
+    return rc;
 }
 
 static void iommu_free_pagetables(unsigned long unused)
-- 
2.1.4


["xsa218-4.6/0002-gnttab-fix-unmap-pin-accounting-race.patch" (application/octet-stream)]

From 951da330060745d7c5ffe3628ca78be34d0b325b Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich@suse.com>
Date: Fri, 2 Jun 2017 12:22:42 +0100
Subject: [PATCH 2/4] gnttab: fix unmap pin accounting race

Once all {writable} mappings of a grant entry have been unmapped, the
hypervisor informs the guest that the grant entry has been released by
clearing the _GTF_{reading,writing} usage flags in the guest's grant
table as appropriate.

Unfortunately, at the moment, the code that updates the accounting
happens in a different critical section than the one which updates the
usage flags; this means that under the right circumstances, there may be
a window in time after the hypervisor reported the grant as being free
during which the grant referee still had access to the page.

Move the grant accounting code into the same critical section as the
reporting code to make sure this kind of race can't happen.

This is part of XSA-218.

Reported-by: Jann Horn <jannh.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
 xen/common/grant_table.c | 32 +++++++++++++++++---------------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 2b449d5..bd62339 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1137,15 +1137,8 @@ __gnttab_unmap_common(
             PIN_FAIL(act_release_out, GNTST_general_error,
                      "Bad frame number doesn't match gntref. (%lx != %lx)\n",
                      op->frame, act->frame);
-        if ( op->flags & GNTMAP_device_map )
-        {
-            ASSERT(act->pin & (GNTPIN_devw_mask | GNTPIN_devr_mask));
-            op->map->flags &= ~GNTMAP_device_map;
-            if ( op->flags & GNTMAP_readonly )
-                act->pin -= GNTPIN_devr_inc;
-            else
-                act->pin -= GNTPIN_devw_inc;
-        }
+
+        op->map->flags &= ~GNTMAP_device_map;
     }
 
     if ( (op->host_addr != 0) && (op->flags & GNTMAP_host_map) )
@@ -1155,12 +1148,7 @@ __gnttab_unmap_common(
                                               op->flags)) < 0 )
             goto act_release_out;
 
-        ASSERT(act->pin & (GNTPIN_hstw_mask | GNTPIN_hstr_mask));
         op->map->flags &= ~GNTMAP_host_map;
-        if ( op->flags & GNTMAP_readonly )
-            act->pin -= GNTPIN_hstr_inc;
-        else
-            act->pin -= GNTPIN_hstw_inc;
     }
 
  act_release_out:
@@ -1253,6 +1241,12 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
             else
                 put_page_and_type(pg);
         }
+
+        ASSERT(act->pin & (GNTPIN_devw_mask | GNTPIN_devr_mask));
+        if ( op->flags & GNTMAP_readonly )
+            act->pin -= GNTPIN_devr_inc;
+        else
+            act->pin -= GNTPIN_devw_inc;
     }
 
     if ( (op->host_addr != 0) && (op->flags & GNTMAP_host_map) )
@@ -1261,7 +1255,9 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
         {
             /*
              * Suggests that __gntab_unmap_common failed in
-             * replace_grant_host_mapping() so nothing further to do
+             * replace_grant_host_mapping() or IOMMU handling, so nothing
+             * further to do (short of re-establishing the mapping in the
+             * latter case).
              */
             goto act_release_out;
         }
@@ -1272,6 +1268,12 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
                 put_page_type(pg);
             put_page(pg);
         }
+
+        ASSERT(act->pin & (GNTPIN_hstw_mask | GNTPIN_hstr_mask));
+        if ( op->flags & GNTMAP_readonly )
+            act->pin -= GNTPIN_hstr_inc;
+        else
+            act->pin -= GNTPIN_hstw_inc;
     }
 
     if ( (op->map->flags & (GNTMAP_device_map|GNTMAP_host_map)) == 0 )
-- 
2.1.4


["xsa218-4.6/0003-gnttab-Avoid-potential-double-put-of-maptrack-entry.patch" (application/octet-stream)]

From 12edc6f9a567808b1e1f8fb1bc8ef0d1611c756a Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Thu, 15 Jun 2017 12:05:14 +0100
Subject: [PATCH 3/4] gnttab: Avoid potential double-put of maptrack entry

Each grant mapping for a particular domain is tracked by an in-Xen
"maptrack" entry.  This entry is is referenced by a "handle", which is
given to the guest when it calls gnttab_map_grant_ref().

There are two types of mapping a particular handle can refer to:
GNTMAP_host_map and GNTMAP_device_map.  A given
gnttab_unmap_grant_ref() call can remove either only one or both of
these entries.  When a particular handle has no entries left, it must
be freed.

gnttab_unmap_grant_ref() loops through its grant unmap request list
twice.  It first removes entries from any host pagetables and (if
appropraite) iommus; then it does a single domain TLB flush; then it
does the clean-up, including telling the granter that entries are no
longer being used (if appropriate).

At the moment, it's during the first pass that the maptrack flags are
cleared, but the second pass that the maptrack entry is freed.

Unfortunately this allows the following race, which results in a
double-free:

 A: (pass 1) clear host_map
 B: (pass 1) clear device_map
 A: (pass 2) See that maptrack entry has no mappings, free it
 B: (pass 2) See that maptrack entry has no mappings, free it #

Unfortunately, unlike the active entry pinning update, we can't simply
move the maptrack flag changes to the second half, because the
maptrack flags are used to determine if iommu entries need to be
added: a domain's iommu must never have fewer permissions than the
maptrack flags indicate, or a subsequent map_grant_ref() might fail to
add the necessary iommu entries.

Instead, free the maptrack entry in the first pass if there are no
further mappings.

This is part of XSA-218.

Reported-by: Jan Beulich <jbeulich.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
 xen/common/grant_table.c | 79 +++++++++++++++++++++++++++++++++---------------
 1 file changed, 54 insertions(+), 25 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index bd62339..747bf1a 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -98,8 +98,8 @@ struct gnttab_unmap_common {
     /* Shared state beteen *_unmap and *_unmap_complete */
     u16 flags;
     unsigned long frame;
-    struct grant_mapping *map;
     struct domain *rd;
+    grant_ref_t ref;
 };
 
 /* Number of unmap operations that are done between each tlb flush */
@@ -1066,6 +1066,8 @@ __gnttab_unmap_common(
     struct grant_table *lgt, *rgt;
     struct active_grant_entry *act;
     s16              rc = 0;
+    struct grant_mapping *map;
+    bool_t put_handle = 0;
 
     ld = current->domain;
     lgt = ld->grant_table;
@@ -1079,11 +1081,11 @@ __gnttab_unmap_common(
         return;
     }
 
-    op->map = &maptrack_entry(lgt, op->handle);
+    map = &maptrack_entry(lgt, op->handle);
 
     read_lock(&lgt->lock);
 
-    if ( unlikely(!read_atomic(&op->map->flags)) )
+    if ( unlikely(!read_atomic(&map->flags)) )
     {
         read_unlock(&lgt->lock);
         gdprintk(XENLOG_INFO, "Zero flags for handle (%d).\n", op->handle);
@@ -1091,7 +1093,7 @@ __gnttab_unmap_common(
         return;
     }
 
-    dom = op->map->domid;
+    dom = map->domid;
     read_unlock(&lgt->lock);
 
     if ( unlikely((rd = rcu_lock_domain_by_id(dom)) == NULL) )
@@ -1116,16 +1118,43 @@ __gnttab_unmap_common(
 
     read_lock(&rgt->lock);
 
-    op->flags = read_atomic(&op->map->flags);
-    if ( unlikely(!op->flags) || unlikely(op->map->domid != dom) )
+    op->rd = rd;
+    op->ref = map->ref;
+
+    /*
+     * We can't assume there was no racing unmap for this maptrack entry,
+     * and hence we can't assume map->ref is valid for rd. While the checks
+     * below (with the active entry lock held) will reject any such racing
+     * requests, we still need to make sure we don't attempt to acquire an
+     * invalid lock.
+     */
+    smp_rmb();
+    if ( unlikely(op->ref >= nr_grant_entries(rgt)) )
     {
-        gdprintk(XENLOG_WARNING, "Unstable handle %u\n", op->handle);
+        gdprintk(XENLOG_WARNING, "Unstable handle %#x\n", op->handle);
         rc = GNTST_bad_handle;
-        goto unmap_out;
+        goto unlock_out;
     }
 
-    op->rd = rd;
-    act = active_entry_acquire(rgt, op->map->ref);
+    act = active_entry_acquire(rgt, op->ref);
+
+    /*
+     * Note that we (ab)use the active entry lock here to protect against
+     * multiple unmaps of the same mapping here. We don't want to hold lgt's
+     * lock, and we only hold rgt's lock for reading (but the latter wouldn't
+     * be the right one anyway). Hence the easiest is to rely on a lock we
+     * hold anyway; see docs/misc/grant-tables.txt's "Locking" section.
+     */
+
+    op->flags = read_atomic(&map->flags);
+    smp_rmb();
+    if ( unlikely(!op->flags) || unlikely(map->domid != dom) ||
+         unlikely(map->ref != op->ref) )
+    {
+        gdprintk(XENLOG_WARNING, "Unstable handle %#x\n", op->handle);
+        rc = GNTST_bad_handle;
+        goto act_release_out;
+    }
 
     if ( op->frame == 0 )
     {
@@ -1138,7 +1167,7 @@ __gnttab_unmap_common(
                      "Bad frame number doesn't match gntref. (%lx != %lx)\n",
                      op->frame, act->frame);
 
-        op->map->flags &= ~GNTMAP_device_map;
+        map->flags &= ~GNTMAP_device_map;
     }
 
     if ( (op->host_addr != 0) && (op->flags & GNTMAP_host_map) )
@@ -1148,14 +1177,23 @@ __gnttab_unmap_common(
                                               op->flags)) < 0 )
             goto act_release_out;
 
-        op->map->flags &= ~GNTMAP_host_map;
+        map->flags &= ~GNTMAP_host_map;
+    }
+
+    if ( !(map->flags & (GNTMAP_device_map|GNTMAP_host_map)) )
+    {
+        map->flags = 0;
+        put_handle = 1;
     }
 
  act_release_out:
     active_entry_release(act);
- unmap_out:
+ unlock_out:
     read_unlock(&rgt->lock);
 
+    if ( put_handle )
+        put_maptrack_handle(lgt, op->handle);
+
     if ( rc == GNTST_okay && gnttab_need_iommu_mapping(ld) )
     {
         unsigned int kind;
@@ -1192,7 +1230,6 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
     grant_entry_header_t *sha;
     struct page_info *pg;
     uint16_t *status;
-    bool_t put_handle = 0;
 
     if ( rd == NULL )
     { 
@@ -1213,13 +1250,13 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
     if ( rgt->gt_version == 0 )
         goto unlock_out;
 
-    act = active_entry_acquire(rgt, op->map->ref);
-    sha = shared_entry_header(rgt, op->map->ref);
+    act = active_entry_acquire(rgt, op->ref);
+    sha = shared_entry_header(rgt, op->ref);
 
     if ( rgt->gt_version == 1 )
         status = &sha->flags;
     else
-        status = &status_entry(rgt, op->map->ref);
+        status = &status_entry(rgt, op->ref);
 
     if ( unlikely(op->frame != act->frame) ) 
     {
@@ -1276,9 +1313,6 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
             act->pin -= GNTPIN_hstw_inc;
     }
 
-    if ( (op->map->flags & (GNTMAP_device_map|GNTMAP_host_map)) == 0 )
-        put_handle = 1;
-
     if ( ((act->pin & (GNTPIN_devw_mask|GNTPIN_hstw_mask)) == 0) &&
          !(op->flags & GNTMAP_readonly) )
         gnttab_clear_flag(_GTF_writing, status);
@@ -1291,11 +1325,6 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
  unlock_out:
     read_unlock(&rgt->lock);
 
-    if ( put_handle )
-    {
-        op->map->flags = 0;
-        put_maptrack_handle(ld->grant_table, op->handle);
-    }
     rcu_unlock_domain(rd);
 }
 
-- 
2.1.4


["xsa218-4.6/0004-gnttab-correct-maptrack-table-accesses.patch" (application/octet-stream)]

From 44b38e8fa323252238fa6a55111001389aff2412 Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich@suse.com>
Date: Thu, 15 Jun 2017 12:05:29 +0100
Subject: [PATCH 4/4] gnttab: correct maptrack table accesses

In order to observe a consistent (limit,pointer-table) pair, the reader
needs to either hold the maptrack lock (in line with documentation) or
both sides need to order their accesses suitably (the writer side
barrier was removed by commit dff515dfea ["gnttab: use per-VCPU
maptrack free lists"], and a read side barrier has never been there).

Make the writer publish a new table page before limit (for bounds
checks to work), and new list head last (for racing maptrack_entry()
invocations to work). At the same time add read barriers to lockless
readers.

Additionally get_maptrack_handle() must not assume ->maptrack_head to
not change behind its back: Another handle may be put (updating only
->maptrack_tail) and then got or stolen (updating ->maptrack_head).

This is part of XSA-218.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
---
 xen/common/grant_table.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 747bf1a..a642763 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -387,7 +387,7 @@ get_maptrack_handle(
     struct grant_table *lgt)
 {
     struct vcpu          *curr = current;
-    int                   i;
+    unsigned int          i, head;
     grant_handle_t        handle;
     struct grant_mapping *new_mt;
 
@@ -443,17 +443,20 @@ get_maptrack_handle(
         new_mt[i].ref = handle + i + 1;
         new_mt[i].vcpu = curr->vcpu_id;
     }
-    new_mt[i - 1].ref = curr->maptrack_head;
 
     /* Set tail directly if this is the first page for this VCPU. */
     if ( curr->maptrack_tail == MAPTRACK_TAIL )
         curr->maptrack_tail = handle + MAPTRACK_PER_PAGE - 1;
 
-    write_atomic(&curr->maptrack_head, handle + 1);
-
     lgt->maptrack[nr_maptrack_frames(lgt)] = new_mt;
+    smp_wmb();
     lgt->maptrack_limit += MAPTRACK_PER_PAGE;
 
+    do {
+        new_mt[i - 1].ref = read_atomic(&curr->maptrack_head);
+        head = cmpxchg(&curr->maptrack_head, new_mt[i - 1].ref, handle + 1);
+    } while ( head != new_mt[i - 1].ref );
+
     spin_unlock(&lgt->maptrack_lock);
 
     return handle;
@@ -713,6 +716,7 @@ static unsigned int mapkind(
     for ( handle = 0; !(kind & MAPKIND_WRITE) &&
                       handle < lgt->maptrack_limit; handle++ )
     {
+        smp_rmb();
         map = &maptrack_entry(lgt, handle);
         if ( !(map->flags & (GNTMAP_device_map|GNTMAP_host_map)) ||
              map->domid != rd->domain_id )
@@ -1081,6 +1085,7 @@ __gnttab_unmap_common(
         return;
     }
 
+    smp_rmb();
     map = &maptrack_entry(lgt, op->handle);
 
     read_lock(&lgt->lock);
-- 
2.1.4


["xsa218-4.7/0001-IOMMU-handle-IOMMU-mapping-and-unmapping-failures.patch" (application/octet-stream)]

From 03f872b98f24e25cafb478b5d7c34e1eb18e1e4c Mon Sep 17 00:00:00 2001
From: Quan Xu <quan.xu@intel.com>
Date: Fri, 2 Jun 2017 12:30:34 +0100
Subject: [PATCH 1/4] IOMMU: handle IOMMU mapping and unmapping failures

Treat IOMMU mapping and unmapping failures as a fatal to the DomU
If IOMMU mapping and unmapping failed, crash the DomU and propagate
the error up to the call trees.

No spamming of the log can occur. For DomU, we avoid logging any
message for already dying domains. For Dom0, that'll still be more
verbose than we'd really like, but it at least wouldn't outright
flood the console.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/drivers/passthrough/iommu.c | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 1a315ee..927966f 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -239,21 +239,47 @@ int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
                    unsigned int flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
+    int rc;
 
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    return hd->platform_ops->map_page(d, gfn, mfn, flags);
+    rc = hd->platform_ops->map_page(d, gfn, mfn, flags);
+    if ( unlikely(rc) )
+    {
+        if ( !d->is_shutting_down && printk_ratelimit() )
+            printk(XENLOG_ERR
+                   "d%d: IOMMU mapping gfn %#lx to mfn %#lx failed: %d\n",
+                   d->domain_id, gfn, mfn, rc);
+
+        if ( !is_hardware_domain(d) )
+            domain_crash(d);
+    }
+
+    return rc;
 }
 
 int iommu_unmap_page(struct domain *d, unsigned long gfn)
 {
     const struct domain_iommu *hd = dom_iommu(d);
+    int rc;
 
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    return hd->platform_ops->unmap_page(d, gfn);
+    rc = hd->platform_ops->unmap_page(d, gfn);
+    if ( unlikely(rc) )
+    {
+        if ( !d->is_shutting_down && printk_ratelimit() )
+            printk(XENLOG_ERR
+                   "d%d: IOMMU unmapping gfn %#lx failed: %d\n",
+                   d->domain_id, gfn, rc);
+
+        if ( !is_hardware_domain(d) )
+            domain_crash(d);
+    }
+
+    return rc;
 }
 
 static void iommu_free_pagetables(unsigned long unused)
-- 
2.1.4


["xsa218-4.7/0002-gnttab-fix-unmap-pin-accounting-race.patch" (application/octet-stream)]

From 2c146b4f763f47180a0effb8d8045b0ebb93652c Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich@suse.com>
Date: Fri, 2 Jun 2017 12:22:42 +0100
Subject: [PATCH 2/4] gnttab: fix unmap pin accounting race

Once all {writable} mappings of a grant entry have been unmapped, the
hypervisor informs the guest that the grant entry has been released by
clearing the _GTF_{reading,writing} usage flags in the guest's grant
table as appropriate.

Unfortunately, at the moment, the code that updates the accounting
happens in a different critical section than the one which updates the
usage flags; this means that under the right circumstances, there may be
a window in time after the hypervisor reported the grant as being free
during which the grant referee still had access to the page.

Move the grant accounting code into the same critical section as the
reporting code to make sure this kind of race can't happen.

This is part of XSA-218.

Reported-by: Jann Horn <jannh.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
 xen/common/grant_table.c | 32 +++++++++++++++++---------------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 8b22299..cfc483f 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1150,15 +1150,8 @@ __gnttab_unmap_common(
             PIN_FAIL(act_release_out, GNTST_general_error,
                      "Bad frame number doesn't match gntref. (%lx != %lx)\n",
                      op->frame, act->frame);
-        if ( op->flags & GNTMAP_device_map )
-        {
-            ASSERT(act->pin & (GNTPIN_devw_mask | GNTPIN_devr_mask));
-            op->map->flags &= ~GNTMAP_device_map;
-            if ( op->flags & GNTMAP_readonly )
-                act->pin -= GNTPIN_devr_inc;
-            else
-                act->pin -= GNTPIN_devw_inc;
-        }
+
+        op->map->flags &= ~GNTMAP_device_map;
     }
 
     if ( (op->host_addr != 0) && (op->flags & GNTMAP_host_map) )
@@ -1168,12 +1161,7 @@ __gnttab_unmap_common(
                                               op->flags)) < 0 )
             goto act_release_out;
 
-        ASSERT(act->pin & (GNTPIN_hstw_mask | GNTPIN_hstr_mask));
         op->map->flags &= ~GNTMAP_host_map;
-        if ( op->flags & GNTMAP_readonly )
-            act->pin -= GNTPIN_hstr_inc;
-        else
-            act->pin -= GNTPIN_hstw_inc;
     }
 
  act_release_out:
@@ -1266,6 +1254,12 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
             else
                 put_page_and_type(pg);
         }
+
+        ASSERT(act->pin & (GNTPIN_devw_mask | GNTPIN_devr_mask));
+        if ( op->flags & GNTMAP_readonly )
+            act->pin -= GNTPIN_devr_inc;
+        else
+            act->pin -= GNTPIN_devw_inc;
     }
 
     if ( (op->host_addr != 0) && (op->flags & GNTMAP_host_map) )
@@ -1274,7 +1268,9 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
         {
             /*
              * Suggests that __gntab_unmap_common failed in
-             * replace_grant_host_mapping() so nothing further to do
+             * replace_grant_host_mapping() or IOMMU handling, so nothing
+             * further to do (short of re-establishing the mapping in the
+             * latter case).
              */
             goto act_release_out;
         }
@@ -1285,6 +1281,12 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
                 put_page_type(pg);
             put_page(pg);
         }
+
+        ASSERT(act->pin & (GNTPIN_hstw_mask | GNTPIN_hstr_mask));
+        if ( op->flags & GNTMAP_readonly )
+            act->pin -= GNTPIN_hstr_inc;
+        else
+            act->pin -= GNTPIN_hstw_inc;
     }
 
     if ( (op->map->flags & (GNTMAP_device_map|GNTMAP_host_map)) == 0 )
-- 
2.1.4


["xsa218-4.7/0003-gnttab-Avoid-potential-double-put-of-maptrack-entry.patch" (application/octet-stream)]

From 39b704785a8d330c02e8e2d2368c80dbaf679bc0 Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Thu, 15 Jun 2017 12:05:14 +0100
Subject: [PATCH 3/4] gnttab: Avoid potential double-put of maptrack entry

Each grant mapping for a particular domain is tracked by an in-Xen
"maptrack" entry.  This entry is is referenced by a "handle", which is
given to the guest when it calls gnttab_map_grant_ref().

There are two types of mapping a particular handle can refer to:
GNTMAP_host_map and GNTMAP_device_map.  A given
gnttab_unmap_grant_ref() call can remove either only one or both of
these entries.  When a particular handle has no entries left, it must
be freed.

gnttab_unmap_grant_ref() loops through its grant unmap request list
twice.  It first removes entries from any host pagetables and (if
appropraite) iommus; then it does a single domain TLB flush; then it
does the clean-up, including telling the granter that entries are no
longer being used (if appropriate).

At the moment, it's during the first pass that the maptrack flags are
cleared, but the second pass that the maptrack entry is freed.

Unfortunately this allows the following race, which results in a
double-free:

 A: (pass 1) clear host_map
 B: (pass 1) clear device_map
 A: (pass 2) See that maptrack entry has no mappings, free it
 B: (pass 2) See that maptrack entry has no mappings, free it #

Unfortunately, unlike the active entry pinning update, we can't simply
move the maptrack flag changes to the second half, because the
maptrack flags are used to determine if iommu entries need to be
added: a domain's iommu must never have fewer permissions than the
maptrack flags indicate, or a subsequent map_grant_ref() might fail to
add the necessary iommu entries.

Instead, free the maptrack entry in the first pass if there are no
further mappings.

This is part of XSA-218.

Reported-by: Jan Beulich <jbeulich.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
 xen/common/grant_table.c | 79 +++++++++++++++++++++++++++++++++---------------
 1 file changed, 54 insertions(+), 25 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index cfc483f..81a1a8b 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -98,8 +98,8 @@ struct gnttab_unmap_common {
     /* Shared state beteen *_unmap and *_unmap_complete */
     u16 flags;
     unsigned long frame;
-    struct grant_mapping *map;
     struct domain *rd;
+    grant_ref_t ref;
 };
 
 /* Number of unmap operations that are done between each tlb flush */
@@ -1079,6 +1079,8 @@ __gnttab_unmap_common(
     struct grant_table *lgt, *rgt;
     struct active_grant_entry *act;
     s16              rc = 0;
+    struct grant_mapping *map;
+    bool_t put_handle = 0;
 
     ld = current->domain;
     lgt = ld->grant_table;
@@ -1092,11 +1094,11 @@ __gnttab_unmap_common(
         return;
     }
 
-    op->map = &maptrack_entry(lgt, op->handle);
+    map = &maptrack_entry(lgt, op->handle);
 
     grant_read_lock(lgt);
 
-    if ( unlikely(!read_atomic(&op->map->flags)) )
+    if ( unlikely(!read_atomic(&map->flags)) )
     {
         grant_read_unlock(lgt);
         gdprintk(XENLOG_INFO, "Zero flags for handle (%d).\n", op->handle);
@@ -1104,7 +1106,7 @@ __gnttab_unmap_common(
         return;
     }
 
-    dom = op->map->domid;
+    dom = map->domid;
     grant_read_unlock(lgt);
 
     if ( unlikely((rd = rcu_lock_domain_by_id(dom)) == NULL) )
@@ -1129,16 +1131,43 @@ __gnttab_unmap_common(
 
     grant_read_lock(rgt);
 
-    op->flags = read_atomic(&op->map->flags);
-    if ( unlikely(!op->flags) || unlikely(op->map->domid != dom) )
+    op->rd = rd;
+    op->ref = map->ref;
+
+    /*
+     * We can't assume there was no racing unmap for this maptrack entry,
+     * and hence we can't assume map->ref is valid for rd. While the checks
+     * below (with the active entry lock held) will reject any such racing
+     * requests, we still need to make sure we don't attempt to acquire an
+     * invalid lock.
+     */
+    smp_rmb();
+    if ( unlikely(op->ref >= nr_grant_entries(rgt)) )
     {
-        gdprintk(XENLOG_WARNING, "Unstable handle %u\n", op->handle);
+        gdprintk(XENLOG_WARNING, "Unstable handle %#x\n", op->handle);
         rc = GNTST_bad_handle;
-        goto unmap_out;
+        goto unlock_out;
     }
 
-    op->rd = rd;
-    act = active_entry_acquire(rgt, op->map->ref);
+    act = active_entry_acquire(rgt, op->ref);
+
+    /*
+     * Note that we (ab)use the active entry lock here to protect against
+     * multiple unmaps of the same mapping here. We don't want to hold lgt's
+     * lock, and we only hold rgt's lock for reading (but the latter wouldn't
+     * be the right one anyway). Hence the easiest is to rely on a lock we
+     * hold anyway; see docs/misc/grant-tables.txt's "Locking" section.
+     */
+
+    op->flags = read_atomic(&map->flags);
+    smp_rmb();
+    if ( unlikely(!op->flags) || unlikely(map->domid != dom) ||
+         unlikely(map->ref != op->ref) )
+    {
+        gdprintk(XENLOG_WARNING, "Unstable handle %#x\n", op->handle);
+        rc = GNTST_bad_handle;
+        goto act_release_out;
+    }
 
     if ( op->frame == 0 )
     {
@@ -1151,7 +1180,7 @@ __gnttab_unmap_common(
                      "Bad frame number doesn't match gntref. (%lx != %lx)\n",
                      op->frame, act->frame);
 
-        op->map->flags &= ~GNTMAP_device_map;
+        map->flags &= ~GNTMAP_device_map;
     }
 
     if ( (op->host_addr != 0) && (op->flags & GNTMAP_host_map) )
@@ -1161,14 +1190,23 @@ __gnttab_unmap_common(
                                               op->flags)) < 0 )
             goto act_release_out;
 
-        op->map->flags &= ~GNTMAP_host_map;
+        map->flags &= ~GNTMAP_host_map;
+    }
+
+    if ( !(map->flags & (GNTMAP_device_map|GNTMAP_host_map)) )
+    {
+        map->flags = 0;
+        put_handle = 1;
     }
 
  act_release_out:
     active_entry_release(act);
- unmap_out:
+ unlock_out:
     grant_read_unlock(rgt);
 
+    if ( put_handle )
+        put_maptrack_handle(lgt, op->handle);
+
     if ( rc == GNTST_okay && gnttab_need_iommu_mapping(ld) )
     {
         unsigned int kind;
@@ -1205,7 +1243,6 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
     grant_entry_header_t *sha;
     struct page_info *pg;
     uint16_t *status;
-    bool_t put_handle = 0;
 
     if ( rd == NULL )
     { 
@@ -1226,13 +1263,13 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
     if ( rgt->gt_version == 0 )
         goto unlock_out;
 
-    act = active_entry_acquire(rgt, op->map->ref);
-    sha = shared_entry_header(rgt, op->map->ref);
+    act = active_entry_acquire(rgt, op->ref);
+    sha = shared_entry_header(rgt, op->ref);
 
     if ( rgt->gt_version == 1 )
         status = &sha->flags;
     else
-        status = &status_entry(rgt, op->map->ref);
+        status = &status_entry(rgt, op->ref);
 
     if ( unlikely(op->frame != act->frame) ) 
     {
@@ -1289,9 +1326,6 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
             act->pin -= GNTPIN_hstw_inc;
     }
 
-    if ( (op->map->flags & (GNTMAP_device_map|GNTMAP_host_map)) == 0 )
-        put_handle = 1;
-
     if ( ((act->pin & (GNTPIN_devw_mask|GNTPIN_hstw_mask)) == 0) &&
          !(op->flags & GNTMAP_readonly) )
         gnttab_clear_flag(_GTF_writing, status);
@@ -1304,11 +1338,6 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
  unlock_out:
     grant_read_unlock(rgt);
 
-    if ( put_handle )
-    {
-        op->map->flags = 0;
-        put_maptrack_handle(ld->grant_table, op->handle);
-    }
     rcu_unlock_domain(rd);
 }
 
-- 
2.1.4


["xsa218-4.7/0004-gnttab-correct-maptrack-table-accesses.patch" (application/octet-stream)]

From bb765f7863e5d19eebcfb29c117e2909bce241e7 Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich@suse.com>
Date: Thu, 15 Jun 2017 12:05:29 +0100
Subject: [PATCH 4/4] gnttab: correct maptrack table accesses

In order to observe a consistent (limit,pointer-table) pair, the reader
needs to either hold the maptrack lock (in line with documentation) or
both sides need to order their accesses suitably (the writer side
barrier was removed by commit dff515dfea ["gnttab: use per-VCPU
maptrack free lists"], and a read side barrier has never been there).

Make the writer publish a new table page before limit (for bounds
checks to work), and new list head last (for racing maptrack_entry()
invocations to work). At the same time add read barriers to lockless
readers.

Additionally get_maptrack_handle() must not assume ->maptrack_head to
not change behind its back: Another handle may be put (updating only
->maptrack_tail) and then got or stolen (updating ->maptrack_head).

This is part of XSA-218.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
---
 xen/common/grant_table.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 81a1a8b..c4d73af 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -395,7 +395,7 @@ get_maptrack_handle(
     struct grant_table *lgt)
 {
     struct vcpu          *curr = current;
-    int                   i;
+    unsigned int          i, head;
     grant_handle_t        handle;
     struct grant_mapping *new_mt;
 
@@ -451,17 +451,20 @@ get_maptrack_handle(
         new_mt[i].ref = handle + i + 1;
         new_mt[i].vcpu = curr->vcpu_id;
     }
-    new_mt[i - 1].ref = curr->maptrack_head;
 
     /* Set tail directly if this is the first page for this VCPU. */
     if ( curr->maptrack_tail == MAPTRACK_TAIL )
         curr->maptrack_tail = handle + MAPTRACK_PER_PAGE - 1;
 
-    write_atomic(&curr->maptrack_head, handle + 1);
-
     lgt->maptrack[nr_maptrack_frames(lgt)] = new_mt;
+    smp_wmb();
     lgt->maptrack_limit += MAPTRACK_PER_PAGE;
 
+    do {
+        new_mt[i - 1].ref = read_atomic(&curr->maptrack_head);
+        head = cmpxchg(&curr->maptrack_head, new_mt[i - 1].ref, handle + 1);
+    } while ( head != new_mt[i - 1].ref );
+
     spin_unlock(&lgt->maptrack_lock);
 
     return handle;
@@ -727,6 +730,7 @@ static unsigned int mapkind(
     for ( handle = 0; !(kind & MAPKIND_WRITE) &&
                       handle < lgt->maptrack_limit; handle++ )
     {
+        smp_rmb();
         map = &maptrack_entry(lgt, handle);
         if ( !(map->flags & (GNTMAP_device_map|GNTMAP_host_map)) ||
              map->domid != rd->domain_id )
@@ -1094,6 +1098,7 @@ __gnttab_unmap_common(
         return;
     }
 
+    smp_rmb();
     map = &maptrack_entry(lgt, op->handle);
 
     grant_read_lock(lgt);
-- 
2.1.4


["xsa218-4.8/0001-gnttab-fix-unmap-pin-accounting-race.patch" (application/octet-stream)]

From 25263d50f1440e3c1ff7782892e81f2612bcfce1 Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich@suse.com>
Date: Fri, 2 Jun 2017 12:22:42 +0100
Subject: [PATCH 1/3] gnttab: fix unmap pin accounting race

Once all {writable} mappings of a grant entry have been unmapped, the
hypervisor informs the guest that the grant entry has been released by
clearing the _GTF_{reading,writing} usage flags in the guest's grant
table as appropriate.

Unfortunately, at the moment, the code that updates the accounting
happens in a different critical section than the one which updates the
usage flags; this means that under the right circumstances, there may be
a window in time after the hypervisor reported the grant as being free
during which the grant referee still had access to the page.

Move the grant accounting code into the same critical section as the
reporting code to make sure this kind of race can't happen.

This is part of XSA-218.

Reported-by: Jann Horn <jannh.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
 xen/common/grant_table.c | 32 +++++++++++++++++---------------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index e2c4097..d80bd49 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1150,15 +1150,8 @@ __gnttab_unmap_common(
             PIN_FAIL(act_release_out, GNTST_general_error,
                      "Bad frame number doesn't match gntref. (%lx != %lx)\n",
                      op->frame, act->frame);
-        if ( op->flags & GNTMAP_device_map )
-        {
-            ASSERT(act->pin & (GNTPIN_devw_mask | GNTPIN_devr_mask));
-            op->map->flags &= ~GNTMAP_device_map;
-            if ( op->flags & GNTMAP_readonly )
-                act->pin -= GNTPIN_devr_inc;
-            else
-                act->pin -= GNTPIN_devw_inc;
-        }
+
+        op->map->flags &= ~GNTMAP_device_map;
     }
 
     if ( (op->host_addr != 0) && (op->flags & GNTMAP_host_map) )
@@ -1168,12 +1161,7 @@ __gnttab_unmap_common(
                                               op->flags)) < 0 )
             goto act_release_out;
 
-        ASSERT(act->pin & (GNTPIN_hstw_mask | GNTPIN_hstr_mask));
         op->map->flags &= ~GNTMAP_host_map;
-        if ( op->flags & GNTMAP_readonly )
-            act->pin -= GNTPIN_hstr_inc;
-        else
-            act->pin -= GNTPIN_hstw_inc;
     }
 
  act_release_out:
@@ -1266,6 +1254,12 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
             else
                 put_page_and_type(pg);
         }
+
+        ASSERT(act->pin & (GNTPIN_devw_mask | GNTPIN_devr_mask));
+        if ( op->flags & GNTMAP_readonly )
+            act->pin -= GNTPIN_devr_inc;
+        else
+            act->pin -= GNTPIN_devw_inc;
     }
 
     if ( (op->host_addr != 0) && (op->flags & GNTMAP_host_map) )
@@ -1274,7 +1268,9 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
         {
             /*
              * Suggests that __gntab_unmap_common failed in
-             * replace_grant_host_mapping() so nothing further to do
+             * replace_grant_host_mapping() or IOMMU handling, so nothing
+             * further to do (short of re-establishing the mapping in the
+             * latter case).
              */
             goto act_release_out;
         }
@@ -1285,6 +1281,12 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
                 put_page_type(pg);
             put_page(pg);
         }
+
+        ASSERT(act->pin & (GNTPIN_hstw_mask | GNTPIN_hstr_mask));
+        if ( op->flags & GNTMAP_readonly )
+            act->pin -= GNTPIN_hstr_inc;
+        else
+            act->pin -= GNTPIN_hstw_inc;
     }
 
     if ( (op->map->flags & (GNTMAP_device_map|GNTMAP_host_map)) == 0 )
-- 
2.1.4


["xsa218-4.8/0002-gnttab-Avoid-potential-double-put-of-maptrack-entry.patch" (application/octet-stream)]

From bb6d476b09e635baf5e9fb22540ab7c3530d1d98 Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Thu, 15 Jun 2017 12:05:14 +0100
Subject: [PATCH 2/3] gnttab: Avoid potential double-put of maptrack entry

Each grant mapping for a particular domain is tracked by an in-Xen
"maptrack" entry.  This entry is is referenced by a "handle", which is
given to the guest when it calls gnttab_map_grant_ref().

There are two types of mapping a particular handle can refer to:
GNTMAP_host_map and GNTMAP_device_map.  A given
gnttab_unmap_grant_ref() call can remove either only one or both of
these entries.  When a particular handle has no entries left, it must
be freed.

gnttab_unmap_grant_ref() loops through its grant unmap request list
twice.  It first removes entries from any host pagetables and (if
appropraite) iommus; then it does a single domain TLB flush; then it
does the clean-up, including telling the granter that entries are no
longer being used (if appropriate).

At the moment, it's during the first pass that the maptrack flags are
cleared, but the second pass that the maptrack entry is freed.

Unfortunately this allows the following race, which results in a
double-free:

 A: (pass 1) clear host_map
 B: (pass 1) clear device_map
 A: (pass 2) See that maptrack entry has no mappings, free it
 B: (pass 2) See that maptrack entry has no mappings, free it #

Unfortunately, unlike the active entry pinning update, we can't simply
move the maptrack flag changes to the second half, because the
maptrack flags are used to determine if iommu entries need to be
added: a domain's iommu must never have fewer permissions than the
maptrack flags indicate, or a subsequent map_grant_ref() might fail to
add the necessary iommu entries.

Instead, free the maptrack entry in the first pass if there are no
further mappings.

This is part of XSA-218.

Reported-by: Jan Beulich <jbeulich.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
 xen/common/grant_table.c | 77 +++++++++++++++++++++++++++++++++---------------
 1 file changed, 53 insertions(+), 24 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index d80bd49..ba10e76 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -98,8 +98,8 @@ struct gnttab_unmap_common {
     /* Shared state beteen *_unmap and *_unmap_complete */
     u16 flags;
     unsigned long frame;
-    struct grant_mapping *map;
     struct domain *rd;
+    grant_ref_t ref;
 };
 
 /* Number of unmap operations that are done between each tlb flush */
@@ -1079,6 +1079,8 @@ __gnttab_unmap_common(
     struct grant_table *lgt, *rgt;
     struct active_grant_entry *act;
     s16              rc = 0;
+    struct grant_mapping *map;
+    bool put_handle = false;
 
     ld = current->domain;
     lgt = ld->grant_table;
@@ -1092,11 +1094,11 @@ __gnttab_unmap_common(
         return;
     }
 
-    op->map = &maptrack_entry(lgt, op->handle);
+    map = &maptrack_entry(lgt, op->handle);
 
     grant_read_lock(lgt);
 
-    if ( unlikely(!read_atomic(&op->map->flags)) )
+    if ( unlikely(!read_atomic(&map->flags)) )
     {
         grant_read_unlock(lgt);
         gdprintk(XENLOG_INFO, "Zero flags for handle (%d).\n", op->handle);
@@ -1104,7 +1106,7 @@ __gnttab_unmap_common(
         return;
     }
 
-    dom = op->map->domid;
+    dom = map->domid;
     grant_read_unlock(lgt);
 
     if ( unlikely((rd = rcu_lock_domain_by_id(dom)) == NULL) )
@@ -1129,16 +1131,43 @@ __gnttab_unmap_common(
 
     grant_read_lock(rgt);
 
-    op->flags = read_atomic(&op->map->flags);
-    if ( unlikely(!op->flags) || unlikely(op->map->domid != dom) )
+    op->rd = rd;
+    op->ref = map->ref;
+
+    /*
+     * We can't assume there was no racing unmap for this maptrack entry,
+     * and hence we can't assume map->ref is valid for rd. While the checks
+     * below (with the active entry lock held) will reject any such racing
+     * requests, we still need to make sure we don't attempt to acquire an
+     * invalid lock.
+     */
+    smp_rmb();
+    if ( unlikely(op->ref >= nr_grant_entries(rgt)) )
     {
         gdprintk(XENLOG_WARNING, "Unstable handle %u\n", op->handle);
         rc = GNTST_bad_handle;
-        goto unmap_out;
+        goto unlock_out;
     }
 
-    op->rd = rd;
-    act = active_entry_acquire(rgt, op->map->ref);
+    act = active_entry_acquire(rgt, op->ref);
+
+    /*
+     * Note that we (ab)use the active entry lock here to protect against
+     * multiple unmaps of the same mapping here. We don't want to hold lgt's
+     * lock, and we only hold rgt's lock for reading (but the latter wouldn't
+     * be the right one anyway). Hence the easiest is to rely on a lock we
+     * hold anyway; see docs/misc/grant-tables.txt's "Locking" section.
+     */
+
+    op->flags = read_atomic(&map->flags);
+    smp_rmb();
+    if ( unlikely(!op->flags) || unlikely(map->domid != dom) ||
+         unlikely(map->ref != op->ref) )
+    {
+        gdprintk(XENLOG_WARNING, "Unstable handle %#x\n", op->handle);
+        rc = GNTST_bad_handle;
+        goto act_release_out;
+    }
 
     if ( op->frame == 0 )
     {
@@ -1151,7 +1180,7 @@ __gnttab_unmap_common(
                      "Bad frame number doesn't match gntref. (%lx != %lx)\n",
                      op->frame, act->frame);
 
-        op->map->flags &= ~GNTMAP_device_map;
+        map->flags &= ~GNTMAP_device_map;
     }
 
     if ( (op->host_addr != 0) && (op->flags & GNTMAP_host_map) )
@@ -1161,14 +1190,23 @@ __gnttab_unmap_common(
                                               op->flags)) < 0 )
             goto act_release_out;
 
-        op->map->flags &= ~GNTMAP_host_map;
+        map->flags &= ~GNTMAP_host_map;
+    }
+
+    if ( !(map->flags & (GNTMAP_device_map|GNTMAP_host_map)) )
+    {
+        map->flags = 0;
+        put_handle = true;
     }
 
  act_release_out:
     active_entry_release(act);
- unmap_out:
+ unlock_out:
     grant_read_unlock(rgt);
 
+    if ( put_handle )
+        put_maptrack_handle(lgt, op->handle);
+
     if ( rc == GNTST_okay && gnttab_need_iommu_mapping(ld) )
     {
         unsigned int kind;
@@ -1205,7 +1243,6 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
     grant_entry_header_t *sha;
     struct page_info *pg;
     uint16_t *status;
-    bool_t put_handle = 0;
 
     if ( rd == NULL )
     { 
@@ -1226,13 +1263,13 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
     if ( rgt->gt_version == 0 )
         goto unlock_out;
 
-    act = active_entry_acquire(rgt, op->map->ref);
-    sha = shared_entry_header(rgt, op->map->ref);
+    act = active_entry_acquire(rgt, op->ref);
+    sha = shared_entry_header(rgt, op->ref);
 
     if ( rgt->gt_version == 1 )
         status = &sha->flags;
     else
-        status = &status_entry(rgt, op->map->ref);
+        status = &status_entry(rgt, op->ref);
 
     if ( unlikely(op->frame != act->frame) ) 
     {
@@ -1289,9 +1326,6 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
             act->pin -= GNTPIN_hstw_inc;
     }
 
-    if ( (op->map->flags & (GNTMAP_device_map|GNTMAP_host_map)) == 0 )
-        put_handle = 1;
-
     if ( ((act->pin & (GNTPIN_devw_mask|GNTPIN_hstw_mask)) == 0) &&
          !(op->flags & GNTMAP_readonly) )
         gnttab_clear_flag(_GTF_writing, status);
@@ -1304,11 +1338,6 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
  unlock_out:
     grant_read_unlock(rgt);
 
-    if ( put_handle )
-    {
-        op->map->flags = 0;
-        put_maptrack_handle(ld->grant_table, op->handle);
-    }
     rcu_unlock_domain(rd);
 }
 
-- 
2.1.4


["xsa218-4.8/0003-gnttab-correct-maptrack-table-accesses.patch" (application/octet-stream)]

From 29f04a077972e07c86c9e911005220f6d691ffa6 Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich@suse.com>
Date: Thu, 15 Jun 2017 12:05:29 +0100
Subject: [PATCH 3/3] gnttab: correct maptrack table accesses

In order to observe a consistent (limit,pointer-table) pair, the reader
needs to either hold the maptrack lock (in line with documentation) or
both sides need to order their accesses suitably (the writer side
barrier was removed by commit dff515dfea ["gnttab: use per-VCPU
maptrack free lists"], and a read side barrier has never been there).

Make the writer publish a new table page before limit (for bounds
checks to work), and new list head last (for racing maptrack_entry()
invocations to work). At the same time add read barriers to lockless
readers.

Additionally get_maptrack_handle() must not assume ->maptrack_head to
not change behind its back: Another handle may be put (updating only
->maptrack_tail) and then got or stolen (updating ->maptrack_head).

This is part of XSA-218.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
---
 xen/common/grant_table.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index ba10e76..627947a 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -395,7 +395,7 @@ get_maptrack_handle(
     struct grant_table *lgt)
 {
     struct vcpu          *curr = current;
-    int                   i;
+    unsigned int          i, head;
     grant_handle_t        handle;
     struct grant_mapping *new_mt;
 
@@ -451,17 +451,20 @@ get_maptrack_handle(
         new_mt[i].ref = handle + i + 1;
         new_mt[i].vcpu = curr->vcpu_id;
     }
-    new_mt[i - 1].ref = curr->maptrack_head;
 
     /* Set tail directly if this is the first page for this VCPU. */
     if ( curr->maptrack_tail == MAPTRACK_TAIL )
         curr->maptrack_tail = handle + MAPTRACK_PER_PAGE - 1;
 
-    write_atomic(&curr->maptrack_head, handle + 1);
-
     lgt->maptrack[nr_maptrack_frames(lgt)] = new_mt;
+    smp_wmb();
     lgt->maptrack_limit += MAPTRACK_PER_PAGE;
 
+    do {
+        new_mt[i - 1].ref = read_atomic(&curr->maptrack_head);
+        head = cmpxchg(&curr->maptrack_head, new_mt[i - 1].ref, handle + 1);
+    } while ( head != new_mt[i - 1].ref );
+
     spin_unlock(&lgt->maptrack_lock);
 
     return handle;
@@ -727,6 +730,7 @@ static unsigned int mapkind(
     for ( handle = 0; !(kind & MAPKIND_WRITE) &&
                       handle < lgt->maptrack_limit; handle++ )
     {
+        smp_rmb();
         map = &maptrack_entry(lgt, handle);
         if ( !(map->flags & (GNTMAP_device_map|GNTMAP_host_map)) ||
              map->domid != rd->domain_id )
@@ -1094,6 +1098,7 @@ __gnttab_unmap_common(
         return;
     }
 
+    smp_rmb();
     map = &maptrack_entry(lgt, op->handle);
 
     grant_read_lock(lgt);
-- 
2.1.4


[Attachment #21 (text/plain)]

_______________________________________________
Xen-users mailing list
Xen-users@lists.xen.org
https://lists.xen.org/xen-users

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic