List: linux-aio
Subject: Re: [PATCH v1 00/11] mm: migrate: support poison recover from migrate folio
From: Kefeng Wang <wangkefeng.wang () huawei ! com>
Date: 2024-03-28 13:30:36
Message-ID: 1c8b52d2-485f-4972-aa46-0493b18186f9 () huawei ! com
Hi, there have been no changes since rfcv2. Kindly ping; any comments
are welcome. Thanks all.
On 2024/3/21 11:27, Kefeng Wang wrote:
> Folio migration is widely used in the kernel (memory compaction, memory
> hotplug, soft offline page, NUMA balancing, memory demotion/promotion,
> etc.), but if a poisoned source folio is accessed during migration, the
> kernel will panic.
>
> There is a mechanism in the kernel to recover from uncorrectable memory
> errors, ARCH_HAS_COPY_MC (Machine Check Safe Memory Copy), which is already
> used in NVDIMM and core-mm paths (e.g. CoW, khugepaged, coredump, ksm copy);
> see the callers of copy_mc_to_{user,kernel} and copy_mc_{user_}highpage.
>
> This series provides a recovery mechanism for the folio copy in the
> widely used folio migration path. Because folio migration is never
> guaranteed to succeed, we can choose to make it tolerant of memory
> failures: add folio_mc_copy(), a #MC version of folio_copy(), so that
> when a poisoned source folio is accessed we return an error and fail
> the migration. This avoids panics like the one shown below.
>
> CPU: 1 PID: 88343 Comm: test_softofflin Kdump: loaded Not tainted 6.6.0
> pc : copy_page+0x10/0xc0
> lr : copy_highpage+0x38/0x50
> ...
> Call trace:
> copy_page+0x10/0xc0
> folio_copy+0x78/0x90
> migrate_folio_extra+0x54/0xa0
> move_to_new_folio+0xd8/0x1f0
> migrate_folio_move+0xb8/0x300
> migrate_pages_batch+0x528/0x788
> migrate_pages_sync+0x8c/0x258
> migrate_pages+0x440/0x528
> soft_offline_in_use_page+0x2ec/0x3c0
> soft_offline_page+0x238/0x310
> soft_offline_page_store+0x6c/0xc0
> dev_attr_store+0x20/0x40
> sysfs_kf_write+0x4c/0x68
> kernfs_fop_write_iter+0x130/0x1c8
> new_sync_write+0xa4/0x138
> vfs_write+0x238/0x2d8
> ksys_write+0x74/0x110
>
> v1:
> - no change, resend and rebased on 6.9-rc1
>
> rfcv2:
> - Separate __migrate_device_pages() cleanup from patch "remove
> migrate_folio_extra()", suggested by Matthew
> - Split folio_migrate_mapping(), move refcount check/freeze out
> of folio_migrate_mapping(), suggested by Matthew
> - add RB
>
> Kefeng Wang (11):
> mm: migrate: simplify __buffer_migrate_folio()
> mm: migrate_device: use more folio in __migrate_device_pages()
> mm: migrate_device: unify migrate folio for MIGRATE_SYNC_NO_COPY
> mm: migrate: remove migrate_folio_extra()
> mm: remove MIGRATE_SYNC_NO_COPY mode
> mm: migrate: split folio_migrate_mapping()
> mm: add folio_mc_copy()
> mm: migrate: support poisoned recover from migrate folio
> fs: hugetlbfs: support poison recover from hugetlbfs_migrate_folio()
> mm: migrate: remove folio_migrate_copy()
> fs: aio: add explicit check for large folio in aio_migrate_folio()
>
> fs/aio.c | 15 ++--
> fs/hugetlbfs/inode.c | 5 +-
> include/linux/migrate.h | 3 -
> include/linux/migrate_mode.h | 5 --
> include/linux/mm.h | 1 +
> mm/balloon_compaction.c | 8 --
> mm/migrate.c | 157 +++++++++++++++++------------------
> mm/migrate_device.c | 28 +++----
> mm/util.c | 20 +++++
> mm/zsmalloc.c | 8 --
> 10 files changed, 115 insertions(+), 135 deletions(-)
>
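For anyone skimming the thread, the idea in the cover letter (copy page by page with a machine-check-safe primitive, and fail the migration instead of panicking) can be modelled roughly in userspace as below. This is only a sketch, not the code from the series: the model_* types and helpers are hypothetical stand-ins, the "poisoned" flag simulates an uncorrectable memory error, and returning -EHWPOISON is an assumption about the error code.

```c
/* Userspace model only: none of these types or helpers are the kernel's. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* fallback for systems whose errno.h lacks it */
#endif

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-in for a page; "poisoned" simulates a memory error. */
struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
	bool poisoned;
};

/* Hypothetical stand-in for a folio: a run of pages. */
struct model_folio {
	struct model_page *pages;
	size_t nr_pages;
};

/*
 * Stand-in for a machine-check-safe page copy: returns 0 on success and
 * non-zero when the source page cannot be read.
 */
static int model_copy_mc_page(struct model_page *dst,
			      const struct model_page *src)
{
	if (src->poisoned)
		return -1;
	memcpy(dst->data, src->data, MODEL_PAGE_SIZE);
	return 0;
}

/*
 * #MC-style folio copy: stop at the first unreadable page and report an
 * error to the caller. The error code is an assumption of this sketch.
 */
static int model_folio_mc_copy(struct model_folio *dst,
			       const struct model_folio *src)
{
	assert(dst->nr_pages == src->nr_pages);

	for (size_t i = 0; i < src->nr_pages; i++) {
		if (model_copy_mc_page(&dst->pages[i], &src->pages[i]))
			return -EHWPOISON;
	}
	return 0;
}

/*
 * Migration step in the model: a failed copy simply fails the migration
 * and the source folio stays where it is.
 */
static int model_migrate_folio(struct model_folio *dst,
			       const struct model_folio *src)
{
	int rc = model_folio_mc_copy(dst, src);

	if (rc)
		return rc;
	/* On success the real code would go on to transfer folio state. */
	return 0;
}
```

In this model, migrating a clean two-page folio returns 0 and the destination holds the copied bytes; marking either source page as poisoned makes model_migrate_folio() return -EHWPOISON, which is the "migration fails, no panic" behaviour the series aims for.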
--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org. For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: aart@kvack.org