[prev in list] [next in list] [prev in thread] [next in thread]
List: ocfs2-devel
Subject: Re: [Ocfs2-devel] [PATCH] ocfs2: fix a NULL pointer dereference when call ocfs2_update_inode_fsync_t
From: wangyan <wangyan122 () huawei ! com>
Date: 2020-01-09 6:26:10
Message-ID: 027dc960-8904-8bcb-b778-a496172110e1 () huawei ! com
[Download RAW message or body]
On 2020/1/9 9:57, Changwei Ge wrote:
>
> On 1/8/20 5:23 PM, wangyan wrote:
>> I found a NULL pointer dereference in ocfs2_update_inode_fsync_trans(),
>> handle->h_transaction may be NULL in this situation:
>> ocfs2_file_write_iter
>> ->__generic_file_write_iter
>> ->generic_perform_write
>> ->ocfs2_write_begin
>> ->ocfs2_write_begin_nolock
>> ->ocfs2_write_cluster_by_desc
>> ->ocfs2_write_cluster
>> ->ocfs2_mark_extent_written
>> ->ocfs2_change_extent_flag
>> ->ocfs2_split_extent
>> ->ocfs2_try_to_merge_extent
>> ->ocfs2_extend_rotate_transaction
>> ->ocfs2_extend_trans
>> ->jbd2_journal_restart
>> ->jbd2__journal_restart
>> // handle->h_transaction is NULL here
>> ->handle->h_transaction = NULL;
>> ->start_this_handle
>> /* journal aborted due to storage
>> network disconnection, return error */
>> ->return -EROFS;
>> /* line 3806 in ocfs2_try_to_merge_extent (),
>> it will ignore ret error. */
>> ->ret = 0;
>> ->...
>> ->ocfs2_write_end
>> ->ocfs2_write_end_nolock
>> ->ocfs2_update_inode_fsync_trans
>> // NULL pointer dereference
>> ->oi->i_sync_tid = handle->h_transaction->t_tid;
>>
>> The information of NULL pointer dereference as follows:
>> JBD2: Detected IO errors while flushing file data on dm-11-45
>> Aborting journal on device dm-11-45.
>> JBD2: Error -5 detected when updating journal superblock for dm-11-45.
>> (dd,22081,3):ocfs2_extend_trans:474 ERROR: status = -30
>> (dd,22081,3):ocfs2_try_to_merge_extent:3877 ERROR: status = -30
>> Unable to handle kernel NULL pointer dereference at
>> virtual address 0000000000000008
>> Mem abort info:
>> ESR = 0x96000004
>> Exception class = DABT (current EL), IL = 32 bits
>> SET = 0, FnV = 0
>> EA = 0, S1PTW = 0
>> Data abort info:
>> ISV = 0, ISS = 0x00000004
>> CM = 0, WnR = 0
>> user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000e74e1338
>> [0000000000000008] pgd=0000000000000000
>> Internal error: Oops: 96000004 [#1] SMP
>> Process dd (pid: 22081, stack limit = 0x00000000584f35a9)
>> CPU: 3 PID: 22081 Comm: dd Kdump: loaded
>> Hardware name: Huawei TaiShan 2280 V2/BC82AMDD, BIOS 0.98 08/25/2019
>> pstate: 60400009 (nZCv daif +PAN -UAO)
>> pc : ocfs2_write_end_nolock+0x2b8/0x550 [ocfs2]
>> lr : ocfs2_write_end_nolock+0x2a0/0x550 [ocfs2]
>> sp : ffff0000459fba70
>> x29: ffff0000459fba70 x28: 0000000000000000
>> x27: ffff807ccf7f1000 x26: 0000000000000001
>> x25: ffff807bdff57970 x24: ffff807caf1d4000
>> x23: ffff807cc79e9000 x22: 0000000000001000
>> x21: 000000006c6cd000 x20: ffff0000091d9000
>> x19: ffff807ccb239db0 x18: ffffffffffffffff
>> x17: 000000000000000e x16: 0000000000000007
>> x15: ffff807c5e15bd78 x14: 0000000000000000
>> x13: 0000000000000000 x12: 0000000000000000
>> x11: 0000000000000000 x10: 0000000000000001
>> x9 : 0000000000000228 x8 : 000000000000000c
>> x7 : 0000000000000fff x6 : ffff807a308ed6b0
>> x5 : ffff7e01f10967c0 x4 : 0000000000000018
>> x3 : d0bc661572445600 x2 : 0000000000000000
>> x1 : 000000001b2e0200 x0 : 0000000000000000
>> Call trace:
>> ocfs2_write_end_nolock+0x2b8/0x550 [ocfs2]
>> ocfs2_write_end+0x4c/0x80 [ocfs2]
>> generic_perform_write+0x108/0x1a8
>> __generic_file_write_iter+0x158/0x1c8
>> ocfs2_file_write_iter+0x668/0x950 [ocfs2]
>> __vfs_write+0x11c/0x190
>> vfs_write+0xac/0x1c0
>> ksys_write+0x6c/0xd8
>> __arm64_sys_write+0x24/0x30
>> el0_svc_common+0x78/0x130
>> el0_svc_handler+0x38/0x78
>> el0_svc+0x8/0xc
>>
>> To prevent NULL pointer dereference in this situation, we use
>> is_handle_aborted() before using handle->h_transaction->t_tid.
>>
>> Signed-off-by: Yan Wang <wangyan122@huawei.com>
>> Reviewed-by: Jun Piao <piaojun@huawei.com>
>> ---
>> fs/ocfs2/journal.h | 8 +++++---
>> fs/ocfs2/namei.c | 3 +--
>> 2 files changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
>> index 3103ba7f97a2..bfe611ed1b1d 100644
>> --- a/fs/ocfs2/journal.h
>> +++ b/fs/ocfs2/journal.h
>> @@ -597,9 +597,11 @@ static inline void
>> ocfs2_update_inode_fsync_trans(handle_t *handle,
>> {
>> struct ocfs2_inode_info *oi = OCFS2_I(inode);
>>
>> - oi->i_sync_tid = handle->h_transaction->t_tid;
>> - if (datasync)
>> - oi->i_datasync_tid = handle->h_transaction->t_tid;
>> + if (!is_handle_aborted(handle)) {
>> + oi->i_sync_tid = handle->h_transaction->t_tid;
>> + if (datasync)
>> + oi->i_datasync_tid = handle->h_transaction->t_tid;
>> + }
>
>
> I don't think your way can fix the issue you reported completely.
>
> Even you check if the journal is ABORTED or not, you still face a race
> causing accessing NULL h_transaction.
>
> Otherwise, you need synchronization mechanism help.
Just one process, and handle is not shared by other processes, no race here.
>
> Besides, if journal is aborted, ocfs2 won't fence the machine by resetting?
Even journal is not aborted, it still has other wrong branch in
start_this_handle(), which will return error before assign value to
handle->h_transaction.
>
>
> Thanks,
>
> Changwei
>
>
>> }
>>
>> #endif /* OCFS2_JOURNAL_H */
>> diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
>> index 8ea51cf27b97..da65251ef815 100644
>> --- a/fs/ocfs2/namei.c
>> +++ b/fs/ocfs2/namei.c
>> @@ -586,8 +586,7 @@ static int __ocfs2_mknod_locked(struct inode *dir,
>> mlog_errno(status);
>> }
>>
>> - oi->i_sync_tid = handle->h_transaction->t_tid;
>> - oi->i_datasync_tid = handle->h_transaction->t_tid;
>> + ocfs2_update_inode_fsync_trans(handle, inode, 1);
>>
>> leave:
>> if (status < 0) {
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic