List:       linux-aio
Subject:    Re: [PATCH v1 00/11] mm: migrate: support poison recover from migrate folio
From:       Kefeng Wang <wangkefeng.wang () huawei ! com>
Date:       2024-03-28 13:30:36
Message-ID: 1c8b52d2-485f-4972-aa46-0493b18186f9 () huawei ! com

Hi, there have been no further changes since rfcv2. Kindly ping; any
comments are welcome. Thanks, all.

On 2024/3/21 11:27, Kefeng Wang wrote:
> Folio migration is widely used in the kernel: memory compaction, memory
> hotplug, soft offline page, NUMA balancing, memory demotion/promotion, etc.
> However, if a poisoned source folio is accessed during migration, the
> kernel will panic.
> 
> There is a mechanism in the kernel to recover from uncorrectable memory
> errors, ARCH_HAS_COPY_MC (Machine Check Safe Memory Copy), which is already
> used in NVDIMM and core-mm paths (e.g. CoW, khugepaged, coredump, ksm copy);
> see the callers of copy_mc_to_{user,kernel} and copy_mc_{user_}highpage.
> 
> This series provides a recovery mechanism for the folio copy in the
> widely used folio migration path. Because folio migration is not
> guaranteed to succeed, we can choose to make it tolerant of memory
> failures: add folio_mc_copy(), a #MC version of folio_copy(), so that
> when a poisoned source folio is accessed we return an error and fail
> the migration. This avoids panics like the one shown below.
> 
>    CPU: 1 PID: 88343 Comm: test_softofflin Kdump: loaded Not tainted 6.6.0
>    pc : copy_page+0x10/0xc0
>    lr : copy_highpage+0x38/0x50
>    ...
>    Call trace:
>     copy_page+0x10/0xc0
>     folio_copy+0x78/0x90
>     migrate_folio_extra+0x54/0xa0
>     move_to_new_folio+0xd8/0x1f0
>     migrate_folio_move+0xb8/0x300
>     migrate_pages_batch+0x528/0x788
>     migrate_pages_sync+0x8c/0x258
>     migrate_pages+0x440/0x528
>     soft_offline_in_use_page+0x2ec/0x3c0
>     soft_offline_page+0x238/0x310
>     soft_offline_page_store+0x6c/0xc0
>     dev_attr_store+0x20/0x40
>     sysfs_kf_write+0x4c/0x68
>     kernfs_fop_write_iter+0x130/0x1c8
>     new_sync_write+0xa4/0x138
>     vfs_write+0x238/0x2d8
>     ksys_write+0x74/0x110
> 
> v1:
> - no changes; resent and rebased on 6.9-rc1
> 
> rfcv2:
> - Separate __migrate_device_pages() cleanup from patch "remove
>    migrate_folio_extra()", suggested by Matthew
> - Split folio_migrate_mapping(), move refcount check/freeze out
>    of folio_migrate_mapping(), suggested by Matthew
> - add RB
> 
> Kefeng Wang (11):
>    mm: migrate: simplify __buffer_migrate_folio()
>    mm: migrate_device: use more folio in __migrate_device_pages()
>    mm: migrate_device: unify migrate folio for MIGRATE_SYNC_NO_COPY
>    mm: migrate: remove migrate_folio_extra()
>    mm: remove MIGRATE_SYNC_NO_COPY mode
>    mm: migrate: split folio_migrate_mapping()
>    mm: add folio_mc_copy()
>    mm: migrate: support poisoned recover from migrate folio
>    fs: hugetlbfs: support poison recover from hugetlbfs_migrate_folio()
>    mm: migrate: remove folio_migrate_copy()
>    fs: aio: add explicit check for large folio in aio_migrate_folio()
> 
>   fs/aio.c                     |  15 ++--
>   fs/hugetlbfs/inode.c         |   5 +-
>   include/linux/migrate.h      |   3 -
>   include/linux/migrate_mode.h |   5 --
>   include/linux/mm.h           |   1 +
>   mm/balloon_compaction.c      |   8 --
>   mm/migrate.c                 | 157 +++++++++++++++++------------------
>   mm/migrate_device.c          |  28 +++----
>   mm/util.c                    |  20 +++++
>   mm/zsmalloc.c                |   8 --
>   10 files changed, 115 insertions(+), 135 deletions(-)
> 
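
For readers who want the idea behind folio_mc_copy() in miniature, here is
a rough userspace sketch of the behaviour the cover letter describes: copy
page by page and return an error on the first poisoned source page instead
of crashing. It is only a model under stated assumptions, not the code from
the series. The model_* names, the "poisoned" flag and MODEL_PAGE_SIZE are
all hypothetical stand-ins; the real kernel helper works on struct folio
and relies on the architecture's machine-check-safe copy.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#ifndef EHWPOISON
#define EHWPOISON 133 /* fallback for non-Linux hosts (asm-generic value) */
#endif

#define MODEL_PAGE_SIZE 4096 /* assumption: 4 KiB pages */

/* Hypothetical userspace stand-in for a page; not a kernel type. */
struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
	bool poisoned; /* simulates an uncorrectable memory error */
};

/* Hypothetical stand-in for a folio: a run of contiguous pages. */
struct model_folio {
	struct model_page *pages;
	size_t nr_pages;
};

/*
 * Models a machine-check-safe page copy: returns nonzero instead of
 * faulting when the source page is poisoned.
 */
static int model_copy_mc_page(struct model_page *dst,
			      const struct model_page *src)
{
	if (src->poisoned)
		return -1;
	memcpy(dst->data, src->data, MODEL_PAGE_SIZE);
	return 0;
}

/*
 * Models the error-returning folio copy: stop at the first poisoned
 * page and report -EHWPOISON so the caller can fail the migration.
 */
static int model_folio_mc_copy(struct model_folio *dst,
			       const struct model_folio *src)
{
	assert(dst->nr_pages == src->nr_pages);

	for (size_t i = 0; i < src->nr_pages; i++) {
		if (model_copy_mc_page(&dst->pages[i], &src->pages[i]))
			return -EHWPOISON;
	}
	return 0;
}
```

A caller in this model treats a nonzero return like any other migration
failure: the destination is discarded and the source folio is left in
place, which matches the "migration may fail" contract described above.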

--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org.  For more info on Linux AIO,
see: http://www.kvack.org/aio/