List: ssic-linux-devel
Subject: [SSI-devel] [ ssic-linux-Bugs-1811510 ] deadlock on loop mounted fs
From: "SourceForge.net" <noreply () sourceforge ! net>
Date: 2008-06-19 6:32:10
Message-ID: E1K9Dgo-0001lf-0h () b55xhf1 ! ch3 ! sourceforge ! com
Bugs item #1811510, was opened at 2007-10-11 08:22
Message generated for change (Comment added) made by rogertsang
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=405834&aid=1811510&group_id=32541
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Filesystem
Group: v1.9.3
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: John Hughes (hughesj)
Assigned to: Roger Tsang (rogertsang)
Summary: deadlock on loop mounted fs
Initial Comment:
1. Make a sparse file
perl -e 'open BIGFILE, ">BIGFILE"; seek BIGFILE, 1024 * 1024 * 1024, 0; print BIGFILE "big"'
2. make a filesystem on it
losetup /dev/loop/0 BIGFILE
mkfs -t ext3 /dev/loop/0
3. mount it
mount -t ext3 /dev/loop/0 /mnt
4. write a lot of files to it
cd /mnt
dump 0f - / | restore rf -
Eventually the node where we are writing to the loopback-mounted fs
deadlocks. It is still up as far as the cluster is concerned, but any
attempt to start a process on it blocks.
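
Step 1 can also be done without Perl. The following is a minimal sketch,
assuming a POSIX shell with dd and wc available; the helper names
make_sparse and apparent_size are made up for illustration and are not part
of OpenSSI. Like the Perl one-liner, it seeks to an offset and writes the
three bytes "big" there, so the apparent size is the offset plus 3 while the
skipped range stays a hole.

```shell
#!/bin/sh
# Sketch only: make_sparse and apparent_size are hypothetical helpers.
# Usage: make_sparse FILE [OFFSET_BYTES]   (default offset: 1 GiB)
make_sparse() {
    file=$1
    offset=${2:-1073741824}
    # Seek OFFSET bytes into the output and write "big" there, as the
    # Perl one-liner does. dd sizes the file to the seek offset first,
    # so the skipped range is left unallocated on filesystems that
    # support holes.
    printf 'big' | dd of="$file" bs=1 seek="$offset" 2>/dev/null
}

# Print the apparent size of FILE in bytes (offset + 3 for a file
# created by make_sparse).
apparent_size() {
    wc -c < "$1" | tr -d ' '
}
```

For example, "make_sparse BIGFILE" followed by "apparent_size BIGFILE"
should report 1073741827, after which steps 2 to 4 proceed unchanged.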
----------------------------------------------------------------------
> Comment By: Roger Tsang (rogertsang)
Date: 2008-06-19 02:32
Message:
Logged In: YES
user_id=1246761
Originator: NO
Try the attached patch.
More work would be needed to pass a flag to kernel space so that CFS uses
a different congestion bit when it runs on loopback. However, that
approach only works as long as you do not CFS-mount another loopback on
top of a CFS mount on loopback on CFS. So the simple fix is this patch:
loopback becomes a standard mount.
File Added: util-linux.1811510.patch
----------------------------------------------------------------------
Comment By: Roger Tsang (rogertsang)
Date: 2008-03-16 20:35
Message:
Logged In: YES
user_id=1246761
Originator: NO
Should be fixed in 2.0.0pre3...
----------------------------------------------------------------------
Comment By: Roger Tsang (rogertsang)
Date: 2007-10-20 21:58
Message:
Logged In: YES
user_id=1246761
Originator: NO
It looks like CFS ran out of memory. Try the latest checkin of
kernel/cluster/ssi/cfs code that re-enables commit for soft mounts.
----------------------------------------------------------------------
Comment By: Roger Tsang (rogertsang)
Date: 2007-10-20 14:42
Message:
Logged In: YES
user_id=1246761
Originator: NO
Does 2.6.10-ssi run into this bug?
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2007-10-16 10:33
Message:
Logged In: NO
Still looks the same as the old bug... This time it is stacked
generic_file_writev().
cfs_async (has i_sem)
loop0
pdflush
kjournald
cfs_async (waiting for i_sem)
----------------------------------------------------------------------
Comment By: John Hughes (hughesj)
Date: 2007-10-12 07:36
Message:
Logged In: YES
user_id=166336
Originator: YES
Here's some debugging. I've got to the point where the "restore" process
on node 1 seems hung. On node 2 I try an "onnode 1 pwd". It hangs.
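
A check like the "onnode 1 pwd" test can be wrapped so that it does not tie
up the shell on the healthy node. This is a rough sketch, assuming
timeout(1) from GNU coreutils is installed; the helper name probe is made
up for illustration. If the local command itself ends up in
uninterruptible sleep, signals will not remove it and the probe may not
return either.

```shell
#!/bin/sh
# Sketch only: probe is a hypothetical helper. It relies on timeout(1)
# from GNU coreutils, which exits with status 124 when the time limit
# is reached.
# Usage: probe SECONDS COMMAND [ARGS...]
probe() {
    limit=$1
    shift
    status=0
    # Run the command under a time limit and discard its output; only
    # the exit status is of interest here.
    timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
    if [ "$status" -eq 124 ]; then
        echo "hung"
    elif [ "$status" -eq 0 ]; then
        echo "ok"
    else
        echo "failed ($status)"
    fi
}
```

On node 2 this could be called as "probe 10 onnode 1 pwd"; the 10-second
limit is an arbitrary choice.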
On node 1:
Entering kdb (current=0xc0502bc0, pid 0) on processor 0 due to Keyboard Entry
[0]kdb> ps
1 idle process (state I) and 50 sleeping system daemon (state M) processes suppressed
Task Addr     Pid Parent [*] cpu State Thread     Command
0xcf82a5d0      5      2   0   0 R     0xcf82a7b0 events/0
0xcf68b990    117     11   0   0 D     0xcf68bb70 pdflush
0xcf68a310    121      2   0   0 D     0xcf68a4f0 cfs_async
0xcf6b99b0    122      2   0   0 D     0xcf6b9b90 cfs_async
0xcf6b9410    123      2   0   0 D     0xcf6b95f0 cfs_async
0xcf6b8e70    124      2   0   0 D     0xcf6b9050 cfs_async
0xcf6b88d0    125      2   0   0 D     0xcf6b8ab0 cfs_async
0xcf6b8330    126      2   0   0 D     0xcf6b8510 cfs_async
0xcf6c99d0    127      2   0   0 D     0xcf6c9bb0 cfs_async
0xcf6c9430    128      2   0   0 D     0xcf6c9610 cfs_async
0xce92b730      1      0   0   0 D     0xce92b910 init
[...]
0xce90b150  67763      2   0   0 D     0xce90b330 loop0
0xce90d170  67820      2   0   0 D     0xce90d350 kjournald
0xce8f96d0  67822  67636   0   0 S     0xce8f98b0 dump
0xce8f8b90  67823  67636   0   0 D     0xce8f8d70 restore
0xcf13f970  67824  67822   0   0 S     0xcf13fb50 dump
0xcf13f3d0  67825  67824   0   0 S     0xcf13f5b0 dump
0xcf7861f0  67826  67824   0   0 S     0xcf7863d0 dump
0xcf786790  67827  67824   0   0 S     0xcf786970 dump
0xcf47d9b0 132773      2   0   0 D     0xcf47db90 onnode
[0]kdb> btp 132773
Stack traceback for pid 132773
0xcf47d9b0 132773 2 0 0 D 0xcf47db90 onnode
EBP EIP Function (args)
0xce879ba8 0xc046c2e6 schedule+0x3a6 (0xce879c10)
0xce879bb4 0xc046d348 io_schedule+0x28 (0xc1271c70)
0xce879bc0 0xc014aed5 sync_page+0x45 (0xc10c37f8, 0x0, 0xc014ae90,
0xcf47d9b0, 0xce879c10)
0xce879be0 0xc046d6fe __wait_on_bit_lock+0x5e (0x2, 0xc10c37f8,
0xc10c37f8, 0x0, 0x0)
0xce879c3c 0xc014b744 __lock_page+0x84 (0xc049efb5, 0xa7, 0xce7c31a0, 0x0,
0x1)
0xce879cc4 0xc014beeb do_generic_mapping_read+0x3db (0xce88ca00,
0xce7c31f0, 0xce7c31a0, 0xce879e00, 0xce879d00)
0xce879d1c 0xc014c3ed __generic_file_aio_read+0x1ed (0xce879dc4,
0xce879d34, 0x1, 0xce879e00, 0xcf06d600)
0xce879d48 0xc014c473 generic_file_aio_read+0x53 (0xce879dc4, 0xcf06d600,
0x80, 0x0, 0x0)
0xce879d84 0xc028375a __cfs_file_read+0xaa (0xce879dc4, 0x0, 0xcf06d600,
0x80, 0xce879da0)
0xce879da8 0xc0283828 cfs_file_aio_read+0x38 (0xce879dc4, 0xcf06d600,
0x80, 0x0, 0x0)
0xce879e50 0xc016c3b3 do_sync_read+0xa3 (0xce7c31a0, 0xcf06d600, 0x80,
0xce879e8c, 0xce879000)
0xce879e74 0xc016c490 vfs_read+0xb0 (0xce7c31a0, 0xcf06d600, 0x80,
0xce879e8c, 0x0)
0xce879e9c 0xc017895a kernel_read+0x4a (0xce7c31a0, 0x0, 0xcf06d600, 0x80,
0xcf06d600)
0xce879ec0 0xc017946a prepare_binprm+0xca (0xcf06d600, 0x7fff, 0xc13b4080,
0x0, 0x0)
0xce879eec 0xc0179a16 ssi_do_execve+0x1a6 (0xcf012920, 0xce6f8800,
0xcf6aa400, 0xce879fa0, 0x0)
0xce879f78 0xc0245c3a rexecve_server+0xea (0xcf50e000, 0xcf47d9b0,
0xcf012920, 0xce6f8800, 0xcf6aa400)
0xce879fec 0xc02454f5 rexecve_server_setup+0x55
0xc01023a5 kernel_thread_helper+0x5
[0]kdb> btp 67823
Stack traceback for pid 67823
0xce8f8b90 67823 67636 0 0 D 0xce8f8d70 restore
EBP EIP Function (args)
0xca2fcea0 0xc046c2e6 schedule+0x3a6 (0x0, 0xce8f8b90, 0xc013f0a0,
0xca2fced4, 0xca2fced4)
0xca2fcef4 0xc029f3ba cfs_wait_on_request+0x7a (0xc9a8c200, 0xca2fcf14,
0x0, 0x1, 0x0)
0xca2fcf24 0xc0285a9e cfs_wait_on_requests+0x8e (0xccb63be4, 0x0, 0x0,
0x0, 0xce7c3600)
0xca2fcf48 0xc0286f66 cfs_sync_inode+0x76 (0xccb63be4, 0x0, 0x0, 0x2,
0x0)
0xca2fcf80 0xc0283653 cfs_file_flush+0x93 (0xce7c3600, 0x81a4, 0xccdef200,
0x5, 0xccdef204)
0xca2fcf9c 0xc016bb3c filp_close+0x6c (0xce7c3600, 0xccdef200, 0xce7c3600,
0x5, 0x0)
0xca2fcfbc 0xc016bbce sys_close+0x6e
0xc0105a3b syscall_call+0x7
[0]kdb>
[0]kdb> btp 67763
Stack traceback for pid 67763
0xce90b150 67763 2 0 0 D 0xce90b330 loop0
EBP EIP Function (args)
0xca488db8 0xc046c2e6 schedule+0x3a6 (0xca488e20)
0xca488dc4 0xc046d348 io_schedule+0x28 (0xc12711e0)
0xca488dd0 0xc014aed5 sync_page+0x45 (0xc11d6be0, 0x0, 0xc014ae90,
0xce90b150, 0xca488e20)
0xca488df0 0xc046d6fe __wait_on_bit_lock+0x5e (0x2, 0xc11d6be0,
0xc11d6be0, 0x0, 0x0)
0xca488e4c 0xc014b744 __lock_page+0x84 (0xc049efb5, 0xa7, 0xcd6ca600,
0x38002, 0x1)
0xca488ed4 0xc014beeb do_generic_mapping_read+0x3db (0xcb632f40,
0xcd6ca650, 0xcd6ca600, 0xca488f58, 0xca488ef4)
0xca488f04 0xc014c61b generic_file_sendfile+0x5b (0xcd6ca600, 0xca488f58,
0x1000, 0xd08f15d0, 0xca488f60)
0xca488f3c 0xc02838bd cfs_file_sendfile+0x8d (0xcd6ca600, 0xca488f58,
0x1000, 0xd08f15d0, 0xca488f60)
0xca488f74 0xd08f16fc [loop]do_lo_receive+0x5c (0xc9353000, 0xc4279630,
0x1000, 0x38002000, 0x0)
0xca488fa4 0xd08f176e [loop]lo_receive+0x5e (0xc9353000, 0xc1ed33e0,
0x1000, 0x38002000, 0x0)
0xca488fc8 0xd08f17eb [loop]do_bio_filebacked+0x4b (0xc9353000,
0xc1ed33e0, 0x0, 0xc9353138, 0xd08f1a60)
0xca488fec 0xd08f1b3b [loop]loop_thread+0xdb
0xc01023a5 kernel_thread_helper+0x5
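
To pull the blocked tasks out of a kdb "ps" listing like the one in this
comment, something along these lines may help. It is a small sketch,
assuming the listing has been saved to a file and uses the column order
shown here (task address, pid, parent, [*], cpu, state, thread, command);
the helper name blocked_tasks is made up for illustration.

```shell
#!/bin/sh
# Sketch only: blocked_tasks is a hypothetical helper. It reads a kdb
# "ps" listing on stdin and assumes this column order:
# task address, pid, parent, [*], cpu, state, thread, command.
blocked_tasks() {
    # Keep rows that look like task entries (hex address, numeric pid)
    # and whose state column is D (uninterruptible sleep); print the
    # pid and the command name.
    awk '$1 ~ /^0x/ && $2 ~ /^[0-9]+$/ && NF >= 8 && $6 == "D" { print $2, $8 }'
}
```

Called as "blocked_tasks < kdb-ps.txt" (the file name is only an example).
If the saved text also contains "btp" output, the task line that kdb
repeats at the top of each traceback matches too, so piping the result
through "sort -u" removes the duplicates.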
----------------------------------------------------------------------
Comment By: Roger Tsang (rogertsang)
Date: 2007-10-11 22:21
Message:
Logged In: YES
user_id=1246761
Originator: NO
Sounds like [ 686748 ] Filesystem stacking deadlock.
----------------------------------------------------------------------
Comment By: John Hughes (hughesj)
Date: 2007-10-11 08:22
Message:
Logged In: YES
user_id=166336
Originator: YES
This is with the 2.6.11 kernel
----------------------------------------------------------------------