List: linux-api
Subject: Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory
From: Michael Roth <michael.roth () amd ! com>
Date: 2023-03-20 19:08:36
Message-ID: 20230320190836.z2rqrhybke3egiuu () amd ! com
On Thu, Feb 16, 2023 at 03:21:21PM +0530, Nikunj A. Dadhania wrote:
>
> > +static struct file *restrictedmem_file_create(struct file *memfd)
> > +{
> > +	struct restrictedmem_data *data;
> > +	struct address_space *mapping;
> > +	struct inode *inode;
> > +	struct file *file;
> > +
> > +	data = kzalloc(sizeof(*data), GFP_KERNEL);
> > +	if (!data)
> > +		return ERR_PTR(-ENOMEM);
> > +
> > +	data->memfd = memfd;
> > +	mutex_init(&data->lock);
> > +	INIT_LIST_HEAD(&data->notifiers);
> > +
> > +	inode = alloc_anon_inode(restrictedmem_mnt->mnt_sb);
> > +	if (IS_ERR(inode)) {
> > +		kfree(data);
> > +		return ERR_CAST(inode);
> > +	}
>
> alloc_anon_inode() uses new_pseudo_inode() to get the inode. As per the comment,
> new inode is not added to the superblock s_inodes list.

Another issue, somewhat related to alloc_anon_inode(), is that the shmem code
in some cases assumes the inode was allocated via shmem_alloc_inode(), which
allocates a struct shmem_inode_info: a larger structure that embeds the
struct inode alongside additional fields for things like spinlocks.

These additional fields don't get allocated/initialized in the case of
restrictedmem, so when restrictedmem_getattr() passes the inode on to the
shmem handler, it can cause a crash.

For instance, the following trace was seen when executing 'sudo lsof' while a
process/guest was running with an open memfd FD:
[24393.121409] general protection fault, probably for non-canonical address 0xfe9fb182fea3f077: 0000 [#1] PREEMPT SMP NOPTI
[24393.133546] CPU: 2 PID: 590073 Comm: lsof Tainted: G E 6.1.0-rc4-upm10b-host-snp-v8b+ #4
[24393.144125] Hardware name: AMD Corporation ETHANOL_X/ETHANOL_X, BIOS RXM1009B 05/14/2022
[24393.153150] RIP: 0010:native_queued_spin_lock_slowpath+0x3a3/0x3e0
[24393.160049] Code: f3 90 41 8b 04 24 85 c0 74 ea eb f4 c1 ea 12 83 e0 03 83 ea 01 48 c1 e0 05 48 63 d2 48 05 00 41 04 00 48 03 04 d5 e0 ea 8b 82 <48> 89 18 8b 43 08 85 c0 75 09 f3 90 8b 43 08 85 c0 74 f7 48 8b 13
[24393.181004] RSP: 0018:ffffc9006b6a3cf8 EFLAGS: 00010086
[24393.186832] RAX: fe9fb182fea3f077 RBX: ffff889fcc144100 RCX: 0000000000000000
[24393.194793] RDX: 0000000000003ffe RSI: ffffffff827acde9 RDI: ffffc9006b6a3cdf
[24393.202751] RBP: ffffc9006b6a3d20 R08: 0000000000000001 R09: 0000000000000000
[24393.210710] R10: 0000000000000000 R11: 000000000000ffff R12: ffff888179fa50e0
[24393.218670] R13: ffff889fcc144100 R14: 00000000000c0000 R15: 00000000000c0000
[24393.226629] FS: 00007f9440f45400(0000) GS:ffff889fcc100000(0000) knlGS:0000000000000000
[24393.235692] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[24393.242101] CR2: 000055c55a9cf088 CR3: 0008000220e9c003 CR4: 0000000000770ee0
[24393.250059] PKRU: 55555554
[24393.253073] Call Trace:
[24393.255797] <TASK>
[24393.258133] do_raw_spin_lock+0xc4/0xd0
[24393.262410] _raw_spin_lock_irq+0x50/0x70
[24393.266880] ? shmem_getattr+0x4c/0xf0
[24393.271060] shmem_getattr+0x4c/0xf0
[24393.275044] restrictedmem_getattr+0x34/0x40
[24393.279805] vfs_getattr_nosec+0xbd/0xe0
[24393.284178] vfs_getattr+0x37/0x50
[24393.287971] vfs_statx+0xa0/0x150
[24393.291668] vfs_fstatat+0x59/0x80
[24393.295462] __do_sys_newstat+0x35/0x70
[24393.299739] __x64_sys_newstat+0x16/0x20
[24393.304111] do_syscall_64+0x3b/0x90
[24393.308098] entry_SYSCALL_64_after_hwframe+0x63/0xcd

As a workaround we've been doing the following, but it's probably not the
proper fix:

  https://github.com/AMDESE/linux/commit/0378116b5c4e373295c9101727f2cb5112d6b1f4

-Mike