List: opensolaris-lvm-discuss
Subject: [lvm-discuss] Resynchronization anomaly
From: Sanjay.Nadkarni@sun.com (Sanjay Nadkarni)
Date: 2005-08-18 21:54:09
Message-ID: 43056669.7010605@sun.com
The fact that it went into maintenance is a bit disturbing. However,
what bothers me more is that you have Last Erred on one submirror and
Okay on the other. This is not good. Were there any messages on the
console? Is this reproducible?
The correct states should have been SM0 in Maintenance and SM1 in Last
Erred. Under this condition, when a resync is done, the contents of the
last-erred submirror are read and copied to the submirror in
maintenance. The assumption here is that the submirror that went into
Last Erred has the latest set of changes. However, you should know that
the integrity of the entire mirror is *not* guaranteed whenever a
mirror has gone into the Last Erred state, since this means the mirror
saw an error on the only remaining side... i.e., all bets are off. If
the last write that failed was a metadata update, the logging
functionality of UFS can typically handle this error. But by no means
should this be taken as a guarantee.
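For reference, the usual sequence for clearing these states once the
underlying disk has been checked is metareplace -e. A sketch, using the
device names from your metastat output below; verify the disk first,
since -e re-enables the component in place:

```shell
# Check the system log for md/driver I/O errors around the failure.
grep -i "md:" /var/adm/messages | tail

# After verifying (or repairing) the disk, re-enable the errored
# component; this clears Maintenance/Last Erred and starts a resync.
metareplace -e d0 c0t0d0s0
```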
You mention that removing the md_resync_bufsz setting fixed the
problem. Were the other settings left in place? If so, can you check
whether the same problem occurs if the only setting is
md_mirror:md_resync_bufsz = 2048, i.e., remove set maxphys and
md:md_maxphys?
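In other words, the /etc/system for this test would look roughly like
the following (a sketch; the maxphys values are whatever you had set,
which I don't know from this thread):

```
* /etc/system -- test with only the resync buffer tuning in place:
set md_mirror:md_resync_bufsz = 2048
*
* Comment out (or remove) the other two tunables for this test:
* set maxphys = ...
* set md:md_maxphys = ...
```

A reboot is needed for /etc/system changes to take effect.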
-Sanjay
Matty wrote:
>
> Sorry to keep pinging the list, but I came across another odd issue as
> part of my testing. If a device is in the "Needs maintenance" state,
> how can it be used to synchronize another sub mirror?:
>
> d0: Mirror
> Submirror 0: d10
> State: Needs maintenance
> Submirror 1: d20
> State: Resyncing
> Resync in progress: 0 % done
> Pass: 1
> Read option: roundrobin (default)
> Write option: parallel (default)
> Size: 36962352 blocks (17 GB)
>
> d10: Submirror of d0
> State: Needs maintenance
> Size: 36962352 blocks (17 GB)
> Stripe 0:
> Device Start Block Dbase State Reloc Hot Spare
> c0t0d0s0 0 No Last Erred Yes
>
>
> d20: Submirror of d0
> State: Resyncing
> Size: 36962352 blocks (17 GB)
> Stripe 0:
> Device Start Block Dbase State Reloc Hot Spare
> c0t1d0s0 0 No Okay Yes
>
> $ iostat -zxnM 5
> extended device statistics
> r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device
> 0.0 100.4 0.0 6.3 0.0 0.4 0.0 4.1 0 41 c0t1d0
> 100.4 0.0 6.3 0.0 0.0 0.6 0.0 5.8 0 58 c0t0d0
> 100.4 100.4 6.3 6.3 0.0 1.0 0.0 5.0 0 100 d0
> 100.4 0.0 6.3 0.0 0.0 0.6 0.0 5.8 0 58 d10
> 0.0 100.4 0.0 6.3 0.0 0.4 0.0 4.1 0 41 d20
> extended device statistics
> r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device
> 0.0 100.4 0.0 6.3 0.0 0.4 0.0 4.1 0 41 c0t1d0
> 100.6 0.0 6.3 0.0 0.0 0.6 0.0 5.8 0 58 c0t0d0
> 100.6 100.4 6.3 6.3 0.0 1.0 0.0 5.0 0 100 d0
> 100.6 0.0 6.3 0.0 0.0 0.6 0.0 5.8 0 58 d10
> 0.0 100.4 0.0 6.3 0.0 0.4 0.0 4.1 0 41 d20
>
> This clearly shows that md is reading from SM0 and writing to SM1. Is
> this normal? If a device is in trouble (e.g., Needs maintenance), I
> would think that it shouldn't be synchronizing data to other
> subvolumes. Am I completely off base here?
>
> Thanks,
> - Ryan
> _______________________________________________
> lvm-discuss mailing list
> lvm-discuss@opensolaris.org