List:       linux-raid
Subject:    Re: [PATCH v6 4/4] md: raid456 add nowait support
From:       Vishal Verma <vverma () digitalocean ! com>
Date:       2021-12-26 21:20:20
Message-ID: 30463c91-f07b-ec4a-a519-1a2bf13c4242 () digitalocean ! com


On 12/25/21 9:02 PM, Vishal Verma wrote:
>
> On 12/25/21 5:07 PM, Song Liu wrote:
>> On Sat, Dec 25, 2021 at 2:13 PM Vishal Verma <vverma@digitalocean.com> wrote:
>>>
>>> On 12/25/21 12:28 AM, Vishal Verma wrote:
>>>>
>>>> On 12/24/21 7:14 PM, Song Liu wrote:
>>>>> On Tue, Dec 21, 2021 at 12:06 PM Vishal Verma <vverma@digitalocean.com> wrote:
>>>>>> Returns EAGAIN in case the raid456 driver would block
>>>>>> waiting for situations like:
>>>>>>
>>>>>>     - Reshape operation,
>>>>>>     - Discard operation.
>>>>>>
>>>>>> Signed-off-by: Vishal Verma<vverma@digitalocean.com>
>>>>> I think we will need the following fix for raid456:
>>>> Ack
>>>>> ============================ 8< ============================
>>>>>
>>>>> diff --git i/drivers/md/raid5.c w/drivers/md/raid5.c
>>>>> index 6ab22f29dacd..55d372ce3300 100644
>>>>> --- i/drivers/md/raid5.c
>>>>> +++ w/drivers/md/raid5.c
>>>>> @@ -5717,6 +5717,7 @@ static void make_discard_request(struct mddev *mddev, struct bio *bi)
>>>>>                           raid5_release_stripe(sh);
>>>>>                           /* Bail out if REQ_NOWAIT is set */
>>>>>                           if (bi->bi_opf & REQ_NOWAIT) {
>>>>> +                                 finish_wait(&conf->wait_for_overlap, &w);
>>>>>                                   bio_wouldblock_error(bi);
>>>>>                                   return;
>>>>>                           }
>>>>> @@ -5734,6 +5735,7 @@ static void make_discard_request(struct mddev *mddev, struct bio *bi)
>>>>>                                   raid5_release_stripe(sh);
>>>>>                                   /* Bail out if REQ_NOWAIT is set */
>>>>>                                   if (bi->bi_opf & REQ_NOWAIT) {
>>>>> +                                         finish_wait(&conf->wait_for_overlap, &w);
>>>>>                                           bio_wouldblock_error(bi);
>>>>>                                           return;
>>>>>                                   }
>>>>> @@ -5829,7 +5831,6 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
>>>>>           last_sector = bio_end_sector(bi);
>>>>>           bi->bi_next = NULL;
>>>>>
>>>>> -       md_account_bio(mddev, &bi);
>>>>>           /* Bail out if REQ_NOWAIT is set */
>>>>>           if ((bi->bi_opf & REQ_NOWAIT) &&
>>>>>               (conf->reshape_progress != MaxSector) &&
>>>>> @@ -5837,9 +5838,11 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
>>>>>               ? (logical_sector > conf->reshape_progress && logical_sector <= conf->reshape_safe)
>>>>>               : (logical_sector >= conf->reshape_safe && logical_sector < conf->reshape_progress))) {
>>>>>                   bio_wouldblock_error(bi);
>>>>> +               if (rw == WRITE)
>>>>> +                       md_write_end(mddev);
>>>>>                   return true;
>>>>>           }
>>>>> -
>>>>> +       md_account_bio(mddev, &bi);
>>>>>           prepare_to_wait(&conf->wait_for_overlap, &w, TASK_UNINTERRUPTIBLE);
>>>>>           for (; logical_sector < last_sector; logical_sector += RAID5_STRIPE_SECTORS(conf)) {
>>>>>                   int previous;
>>>>>
>>>>> ============================ 8< ============================
>>>>>
>>>>> Vishal, please try to trigger all these conditions (including raid1,
>>>>> raid10) and make sure they work properly.
>>>>>
>>>>> For example, I triggered raid5 reshape and used something like the
>>>>> following to make sure the logic is triggered:
>>>>>
>>>>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>>>>> index 55d372ce3300..e79de48a0027 100644
>>>>> --- a/drivers/md/raid5.c
>>>>> +++ b/drivers/md/raid5.c
>>>>> @@ -5840,6 +5840,11 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
>>>>>                   bio_wouldblock_error(bi);
>>>>>                   if (rw == WRITE)
>>>>>                           md_write_end(mddev);
>>>>> +               {
>>>>> +                       static int count = 0;
>>>>> +                       if (count++ < 10)
>>>>> +                               pr_info("%s REQ_NOWAIT return\n", __func__);
>>>>> +               }
>>>>>                   return true;
>>>>>           }
>>>>>           md_account_bio(mddev, &bi);
>>>>>
>>>>> Thanks,
>>>>> Song
>>>>>
>>>> Sure, will try this and verify for raid1/10.
>> Please also try test raid5 with discard. I haven't tested those two
>> conditions yet.
> Ack.

Do you have any suggestions on how to test this? For example, should I use
fstrim or something similar to issue discard ops to the raid5 array?
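
One option could be the BLKDISCARD ioctl (the same ioctl that util-linux
blkdiscard uses), which sends a discard for a byte range straight to the block
device without needing a mounted filesystem as fstrim does. A minimal sketch,
assuming a scratch array whose data can be thrown away; the helper name
discard_range and any device path passed to it are hypothetical. As far as I
can tell this ioctl path does not set REQ_NOWAIT on the bio, so it would
exercise make_discard_request() in general but probably not the new bail-out
branch:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* BLKDISCARD */

/*
 * Hypothetical helper: discard `len` bytes starting at byte `offset` on the
 * block device at `path`. Both values should be aligned to the device's
 * logical sector size. DESTROYS DATA in that range.
 * Returns 0 on success or a negative errno value.
 */
static int discard_range(const char *path, uint64_t offset, uint64_t len)
{
    uint64_t range[2] = { offset, len };   /* { start, length } in bytes */
    int ret = 0;
    int fd = open(path, O_WRONLY);

    if (fd < 0)
        return -errno;
    if (ioctl(fd, BLKDISCARD, range) < 0)
        ret = -errno;
    close(fd);
    return ret;
}
```

Calling this in a loop over stripe-sized ranges while other IO is in flight
might be enough to hit the overlap waits, but that is a guess rather than
something I have verified.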

>>
>>> I am running into an issue during raid10 reshape. I can see the nowait
>>> code getting triggered during reshape, but it seems like the reshape
>>> operation was stuck as soon as I issued write IO using FIO to the array
>>> during reshape.
>>> FIO also seem stuck i.e no IO went through...
>> Maybe the following could fix it?
>>
>> Thanks,
>> Song
> Hmm no luck, still the same issue.
It seems that both the iou-wrk thread and the md_reshape thread are hung during reshape:

[  247.889279] task:iou-wrk-9013    state:D stack:    0 pid: 9088 ppid:  8869 flags:0x00004000
[  247.889282] Call Trace:
[  247.889284]  <TASK>
[  247.889286]  __schedule+0x2d5/0x9b0
[  247.889292]  ? preempt_count_add+0x74/0xc0
[  247.889295]  schedule+0x58/0xd0
[  247.889298]  wait_barrier+0x1ad/0x270 [raid10]
[  247.889301]  ? wait_woken+0x60/0x60
[  247.889304]  regular_request_wait+0x42/0x1e0 [raid10]
[  247.889306]  ? default_wake_function+0x1a/0x30
[  247.889308]  ? autoremove_wake_function+0x12/0x40
[  247.889310]  raid10_write_request+0x85/0x670 [raid10]
[  247.889312]  ? r10bio_pool_alloc+0x26/0x30 [raid10]
[  247.889314]  ? md_write_start+0xa7/0x270
[  247.889318]  raid10_make_request+0xe8/0x170 [raid10]
[  247.889320]  md_handle_request+0x13d/0x1d0
[  247.889322]  ? submit_bio_checks+0x1f6/0x5a0
[  247.889325]  md_submit_bio+0x6d/0xa0
[  247.889326]  __submit_bio+0x94/0x140
[  247.889327]  submit_bio_noacct+0xe1/0x2a0
[  247.889329]  submit_bio+0x48/0x120
[  247.889330]  blkdev_direct_IO+0x19b/0x540
[  247.889332]  ? hctx_unlock+0x17/0x40
[  247.889335]  ? blk_mq_request_issue_directly+0x57/0x80
[  247.889338]  generic_file_direct_write+0x9f/0x190
[  247.889342]  __generic_file_write_iter+0x9d/0x1c0
[  247.889345]  blkdev_write_iter+0xe7/0x160
[  247.889347]  io_write+0x153/0x300
[  247.889350]  ? __this_cpu_preempt_check+0x13/0x20
[  247.889352]  ? __perf_event_task_sched_in+0x81/0x230
[  247.889355]  ? debug_smp_processor_id+0x17/0x20
[  247.889356]  ? __perf_event_task_sched_out+0x77/0x510
[  247.889359]  io_issue_sqe+0x387/0x19c0
[  247.889361]  ? _raw_spin_lock_irqsave+0x1d/0x50
[  247.889363]  ? lock_timer_base+0x72/0xa0
[  247.889367]  io_wq_submit_work+0x67/0x170
[  247.889369]  io_worker_handle_work+0x2b0/0x500
[  247.889372]  io_wqe_worker+0x1ca/0x360
[  247.889374]  ? _raw_spin_unlock+0x1a/0x30
[  247.889376]  ? preempt_count_add+0x74/0xc0
[  247.889377]  ? io_workqueue_create+0x60/0x60
[  247.889380]  ret_from_fork+0x1f/0x30

[  247.908367] task:md5_reshape     state:D stack:    0 pid: 9087 ppid:     2 flags:0x00004000
[  247.908369] Call Trace:
[  247.908370]  <TASK>
[  247.908371]  __schedule+0x2d5/0x9b0
[  247.908373]  schedule+0x58/0xd0
[  247.908375]  raise_barrier+0xb7/0x170 [raid10]
[  247.908377]  ? wait_woken+0x60/0x60
[  247.908378]  reshape_request+0x1b9/0x920 [raid10]
[  247.908380]  ? __this_cpu_preempt_check+0x13/0x20
[  247.908382]  ? __perf_event_task_sched_in+0x81/0x230
[  247.908384]  raid10_sync_request+0x1073/0x1640 [raid10]
[  247.908386]  ? _raw_spin_unlock+0x1a/0x30
[  247.908388]  ? __switch_to+0x12e/0x430
[  247.908390]  ? __schedule+0x2dd/0x9b0
[  247.908392]  ? blk_flush_plug+0xeb/0x120
[  247.908393]  ? preempt_count_add+0x74/0xc0
[  247.908394]  ? _raw_spin_lock_irqsave+0x1d/0x50
[  247.908396]  md_do_sync.cold+0x3fa/0x97f
[  247.908399]  ? wait_woken+0x60/0x60
[  247.908401]  md_thread+0xae/0x170
[  247.908402]  ? preempt_count_add+0x74/0xc0
[  247.908403]  ? _raw_spin_lock_irqsave+0x1d/0x50
[  247.908405]  kthread+0x177/0x1a0
[  247.908407]  ? md_start_sync+0x60/0x60
[  247.908408]  ? set_kthread_struct+0x40/0x40
[  247.908410]  ret_from_fork+0x1f/0x30
[  247.908412]  </TASK>
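
To take the io_uring worker thread out of the picture, the nowait path could
also be probed with a plain pwritev2() call. A rough sketch, assuming that
O_DIRECT writes issued with RWF_NOWAIT reach the md device as REQ_NOWAIT bios;
the helper name nowait_write_probe is hypothetical, and the write overwrites
whatever is stored at that offset:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue one O_DIRECT write of `len` bytes at `offset`
 * with RWF_NOWAIT. `len` and `offset` should be multiples of the logical
 * block size. OVERWRITES DATA on the target.
 * Returns the number of bytes written, or a negative errno value;
 * -EAGAIN means the kernel declined to block for this request.
 */
static long nowait_write_probe(const char *path, off_t offset, size_t len)
{
    void *buf = NULL;
    struct iovec iov;
    ssize_t n;
    long result;
    int fd = open(path, O_WRONLY | O_DIRECT);

    if (fd < 0)
        return -errno;
    /* O_DIRECT needs an aligned buffer; 4096 covers common block sizes. */
    if (posix_memalign(&buf, 4096, len) != 0) {
        close(fd);
        return -ENOMEM;
    }
    memset(buf, 0xa5, len);
    iov.iov_base = buf;
    iov.iov_len = len;

    n = pwritev2(fd, &iov, 1, offset, RWF_NOWAIT);
    result = (n < 0) ? -errno : (long)n;

    free(buf);
    close(fd);
    return result;
}
```

If the assumption holds, a write that lands in the reshape window should come
back as -EAGAIN instead of sleeping in wait_barrier(), and -EOPNOTSUPP would
suggest the device does not advertise nowait support at all.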

>> diff --git i/drivers/md/raid10.c w/drivers/md/raid10.c
>> index e2c524d50ec0..291eceaeb26c 100644
>> --- i/drivers/md/raid10.c
>> +++ w/drivers/md/raid10.c
>> @@ -1402,14 +1402,14 @@ static void raid10_write_request(struct mddev *mddev, struct bio *bio,
>>               : (bio->bi_iter.bi_sector + sectors > conf->reshape_safe &&
>>                  bio->bi_iter.bi_sector < conf->reshape_progress))) {
>>                  /* Need to update reshape_position in metadata */
>> -               mddev->reshape_position = conf->reshape_progress;
>> -               set_mask_bits(&mddev->sb_flags, 0,
>> -                             BIT(MD_SB_CHANGE_DEVS) | BIT(MD_SB_CHANGE_PENDING));
>> -               md_wakeup_thread(mddev->thread);
>>                  if (bio->bi_opf & REQ_NOWAIT) {
>>                          bio_wouldblock_error(bio);
>>                          return;
>>                  }
>> +               mddev->reshape_position = conf->reshape_progress;
>> +               set_mask_bits(&mddev->sb_flags, 0,
>> +                             BIT(MD_SB_CHANGE_DEVS) | BIT(MD_SB_CHANGE_PENDING));
>> +               md_wakeup_thread(mddev->thread);
>>                  raid10_log(conf->mddev, "wait reshape metadata");
>>                  wait_event(mddev->sb_wait,
>>                             !test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags));