List: scst-devel
Subject: Re: [Scst-devel] 3.4.x Hung Tasks
From: Bart Van Assche <bvanassche () acm ! org>
Date: 2020-11-12 3:45:33
Message-ID: 6c9edbce-6a24-1cf0-a5a9-5f048bf688f7 () acm ! org
On 11/11/20 7:36 AM, Marc Smith wrote:
> On Sun, Nov 8, 2020 at 10:06 PM Marc Smith <msmith626@gmail.com> wrote:
>> On Thu, Nov 5, 2020 at 11:19 PM Bart Van Assche <bvanassche@acm.org> wrote:
>>> On 11/4/20 11:35 AM, Marc Smith wrote:
[ ... ]
>>>> [ 4597.468834] Call Trace:
>>>> [ 4597.468837] __schedule+0x46e/0x4b5
>>>> [ 4597.468839] ? __switch_to_asm+0x40/0x70
>>>> [ 4597.468841] ? __switch_to_asm+0x34/0x70
>>>> [ 4597.468844] schedule+0x67/0x81
>>>> [ 4597.468846] rwsem_down_read_slowpath+0x292/0x2f1
>>>> [ 4597.468848] ? __switch_to_asm+0x34/0x70
>>>> [ 4597.468852] ? __switch_to+0x2a7/0x354
>>>> [ 4597.468855] dlm_lock+0x82/0x183
>
> (gdb) list *(dlm_lock+0x82)
> 0xffffffff8123cb1f is in dlm_lock (fs/dlm/lock.c:3432).
>
> dlm_lock_recovery(ls);
>
> static inline void dlm_lock_recovery(struct dlm_ls *ls)
> {
> down_read(&ls->ls_in_recovery);
> }
>
> This is 'static inline', so we don't see it in the call trace, right? I
> see rwsem_down_read_slowpath() above it, so I assume that if I followed
> down_read() I would end up there.
>
>
>>>> [ 4597.468866] ? scst_dlm_post_ast+0x1/0x1 [scst]
>>>> [ 4597.468868] ? usleep_range+0x7a/0x7a
>>>> [ 4597.468871] ? schedule+0x67/0x81
>>>> [ 4597.468872] ? schedule_timeout+0x2c/0xe5
>>>> [ 4597.468882] scst_dlm_lock_wait+0x72/0x10a [scst]
>
> (gdb) list *(scst_dlm_lock_wait+0x72)
> 0x22ade is in scst_dlm_lock_wait
> (/sources/scst-3.4.x_r9170/scst/src/scst_dlm.c:95).
>
> So this indicates we are here:
> res = dlm_lock(ls, mode, &lksb->lksb, flags,
> (void *)name, name ? strlen(name) : 0, 0,
> scst_dlm_ast, lksb, bast);
>
> Is it possibly getting stuck in dlm_lock() itself? I understand it's
> async and supposed to return immediately, but perhaps something is
> wrong in dlm_lock()? Or something is seriously wrong on my machine
> when this happens. =)
Hi Marc,
That might be what is going on. dlm_lock_recovery() is indeed not
visible in the call trace because it has been inlined. But
rwsem_down_read_slowpath() is visible in the call trace, which means that
dlm_lock() is waiting for down_read() to finish. Is it possible to
reproduce this hang with lockdep enabled? If so, does lockdep provide
more information about the context that is holding the lock too long?
I have checked the v5.9 DLM source code with a static analyzer, but that
did not yield any interesting results:
make M=fs/dlm W=1 C=2 CHECK="smatch -p=kernel"
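For reference, lockdep can typically be enabled with something like the
following before rebuilding the kernel. The option names are my
assumption; verify them against your kernel's Kconfig:

```shell
# Hedged example: enable lockdep and related debugging in .config
# (option names without the CONFIG_ prefix, as scripts/config expects)
scripts/config --enable PROVE_LOCKING \
               --enable DEBUG_LOCK_ALLOC \
               --enable DEBUG_ATOMIC_SLEEP \
               --enable DETECT_HUNG_TASK
make olddefconfig
```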
Thanks,
Bart.
_______________________________________________
Scst-devel mailing list
https://lists.sourceforge.net/lists/listinfo/scst-devel