
List:       gluster-users
Subject:    Re: [Gluster-users] Missing files on one of the bricks
From:       Frederic Harmignies <frederic.harmignies () elementai ! com>
Date:       2017-11-16 18:13:21
Message-ID: CAMJDJ6TL5O1u_-KWLuC7LhNxsDTFY8Ny31M1MkzSY+cqq6Euew () mail ! gmail ! com

Hello, it looks like the full heal fixed the problem; I was just impatient :)

[2017-11-16 15:04:34.102010] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-data01-replicate-0: performing metadata selfheal on 9612ecd2-106d-42f2-95eb-fef495c1d8ab
[2017-11-16 15:04:34.186781] I [MSGID: 108026] [afr-self-heal-common.c:1255:afr_log_selfheal] 0-data01-replicate-0: Completed metadata selfheal on 9612ecd2-106d-42f2-95eb-fef495c1d8ab. sources=[1]  sinks=0
[2017-11-16 15:04:38.776070] I [MSGID: 108026] [afr-self-heal-common.c:1255:afr_log_selfheal] 0-data01-replicate-0: Completed data selfheal on 7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54. sources=[1]  sinks=0
[2017-11-16 15:04:38.811744] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-data01-replicate-0: performing metadata selfheal on 7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54
[2017-11-16 15:04:38.867474] I [MSGID: 108026] [afr-self-heal-common.c:1255:afr_log_selfheal] 0-data01-replicate-0: Completed metadata selfheal on 7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54. sources=[1]  sinks=0

# gluster volume heal data01 info
Brick 192.168.186.11:/mnt/AIDATA/data
Status: Connected
Number of entries: 0

Brick 192.168.186.12:/mnt/AIDATA/data
Status: Connected
Number of entries: 0
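
As a quick sanity check (a sketch, not from the original commands in this
thread: it assumes the standard .glusterfs/<aa>/<bb>/<gfid> layout on the
brick, with the two gfids taken from the earlier heal info output), the
healed files should now also be present on Brick1:

# stat /mnt/AIDATA/data/.glusterfs/7e/85/7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54
# stat /mnt/AIDATA/data/.glusterfs/96/12/9612ecd2-106d-42f2-95eb-fef495c1d8ab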

Thank you for your fast response!

On Thu, Nov 16, 2017 at 10:13 AM, Frederic Harmignies <
frederic.harmignies@elementai.com> wrote:

> Hello, we are using glusterfs 3.10.3.
>
> We currently have a full heal (gluster volume heal data01 full) running;
> the crawl is still in progress.
>
> Starting time of crawl: Tue Nov 14 15:58:35 2017
>
> Crawl is in progress
> Type of crawl: FULL
> No. of entries healed: 0
> No. of entries in split-brain: 0
> No. of heal failed entries: 0
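>
> For reference, the crawl status above is the output of the heal
> statistics command, i.e. (assuming the same volume name):
>
> # gluster volume heal data01 statistics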
>
> getfattr from both files:
>
> # getfattr -d -m . -e hex /mnt/AIDATA/data//ishmaelb/experiments/omie/omieali/cifar10/donsker_grad_reg_ali_dcgan_stat_dcgan_ac_True/omieali_cifar10_zdim_100_enc_dcgan_dec_dcgan_stat_dcgan_posterior_propagated_enc_beta1.0_dec_beta_1.0_info_metric_donsker_varadhan_info_lam_0.334726025306_222219-23_10_17/data/data_gen_iter_86000.pkl
> getfattr: Removing leading '/' from absolute path names
> # file: mnt/AIDATA/data//ishmaelb/experiments/omie/omieali/cifar10/donsker_grad_reg_ali_dcgan_stat_dcgan_ac_True/omieali_cifar10_zdim_100_enc_dcgan_dec_dcgan_stat_dcgan_posterior_propagated_enc_beta1.0_dec_beta_1.0_info_metric_donsker_varadhan_info_lam_0.334726025306_222219-23_10_17/data/data_gen_iter_86000.pkl
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> trusted.afr.data01-client-0=0x000000000000000100000000
> trusted.gfid=0x7e8513f4d4e24e66b0ba2dbe4c803c54
>
> # getfattr -d -m . -e hex /mnt/AIDATA/data/home/allac/experiments/171023_105655_mini_imagenet_projection_size_mixing_depth_num_filters_filter_size_block_depth_Explore\ architecture\ capacity/Explore\ architecture\ capacity\(projection_size\=32\;mixing_depth\=0\;num_filters\=64\;filter_size\=3\;block_depth\=3\)/model.ckpt-70001.data-00000-of-00001.tempstate1629411508065733704
> getfattr: Removing leading '/' from absolute path names
> # file: mnt/AIDATA/data/home/allac/experiments/171023_105655_mini_imagenet_projection_size_mixing_depth_num_filters_filter_size_block_depth_Explore architecture capacity/Explore architecture capacity(projection_size=32;mixing_depth=0;num_filters=64;filter_size=3;block_depth=3)/model.ckpt-70001.data-00000-of-00001.tempstate1629411508065733704
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> trusted.afr.data01-client-0=0x000000000000000000000000
> trusted.bit-rot.version=0x02000000000000005979d278000af1e7
> trusted.gfid=0x9612ecd2106d42f295ebfef495c1d8ab
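>
> To decode those trusted.afr values (a sketch; assuming the usual AFR
> layout of three big-endian 32-bit pending-operation counters in the
> order data/metadata/entry):
>
> # xattr=000000000000000100000000   # value from the first file, 0x prefix dropped
> # echo "data=$((16#${xattr:0:8})) metadata=$((16#${xattr:8:8})) entry=$((16#${xattr:16:8}))"
> data=0 metadata=1 entry=0
>
> i.e. one pending metadata operation recorded against client-0 (Brick1).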
>
>
> # gluster volume heal data01
> Launching heal operation to perform index self heal on volume data01 has been successful
> Use heal info commands to check status
> # cat /var/log/glusterfs/glustershd.log
> [2017-11-12 08:39:01.907287] I [glusterfsd-mgmt.c:1789:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
> [2017-11-15 08:18:02.084766] I [MSGID: 100011] [glusterfsd.c:1414:reincarnate] 0-glusterfsd: Fetching the volume file from server...
> [2017-11-15 08:18:02.085718] I [glusterfsd-mgmt.c:1789:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
> [2017-11-15 19:13:42.005307] W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: <gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54> (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]
> The message "W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: <gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54> (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]" repeated 5 times between [2017-11-15 19:13:42.005307] and [2017-11-15 19:13:42.166579]
> [2017-11-15 19:23:43.041956] W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: <gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54> (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]
> The message "W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: <gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54> (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]" repeated 5 times between [2017-11-15 19:23:43.041956] and [2017-11-15 19:23:43.235831]
> [2017-11-15 19:30:22.726808] W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: <gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54> (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]
> The message "W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: <gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54> (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]" repeated 4 times between [2017-11-15 19:30:22.726808] and [2017-11-15 19:30:22.827631]
> [2017-11-16 15:04:34.102010] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-data01-replicate-0: performing metadata selfheal on 9612ecd2-106d-42f2-95eb-fef495c1d8ab
> [2017-11-16 15:04:34.186781] I [MSGID: 108026] [afr-self-heal-common.c:1255:afr_log_selfheal] 0-data01-replicate-0: Completed metadata selfheal on 9612ecd2-106d-42f2-95eb-fef495c1d8ab. sources=[1]  sinks=0
> [2017-11-16 15:04:38.776070] I [MSGID: 108026] [afr-self-heal-common.c:1255:afr_log_selfheal] 0-data01-replicate-0: Completed data selfheal on 7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54. sources=[1]  sinks=0
> [2017-11-16 15:04:38.811744] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-data01-replicate-0: performing metadata selfheal on 7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54
> [2017-11-16 15:04:38.867474] I [MSGID: 108026] [afr-self-heal-common.c:1255:afr_log_selfheal] 0-data01-replicate-0: Completed metadata selfheal on 7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54. sources=[1]  sinks=0
>
>
>
>
> On Thu, Nov 16, 2017 at 7:14 AM, Ravishankar N <ravishankar@redhat.com>
> wrote:
>
>>
>>
>> On 11/16/2017 04:12 PM, Nithya Balachandran wrote:
>>
>>
>>
>> On 15 November 2017 at 19:57, Frederic Harmignies <
>> frederic.harmignies@elementai.com> wrote:
>>
>>> Hello, we have two files that are missing from one of the bricks. No idea
>>> how to fix this.
>>>
>>> Details:
>>>
>>> # gluster volume info
>>>
>>> Volume Name: data01
>>> Type: Replicate
>>> Volume ID: 39b4479c-31f0-4696-9435-5454e4f8d310
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x 2 = 2
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: 192.168.186.11:/mnt/AIDATA/data
>>> Brick2: 192.168.186.12:/mnt/AIDATA/data
>>> Options Reconfigured:
>>> performance.cache-refresh-timeout: 30
>>> client.event-threads: 16
>>> server.event-threads: 32
>>> performance.readdir-ahead: off
>>> performance.io-thread-count: 32
>>> performance.cache-size: 32GB
>>> transport.address-family: inet
>>> nfs.disable: on
>>> features.trash: off
>>> features.trash-max-filesize: 500MB
>>>
>>> # gluster volume heal data01 info
>>> Brick 192.168.186.11:/mnt/AIDATA/data
>>> Status: Connected
>>> Number of entries: 0
>>>
>>> Brick 192.168.186.12:/mnt/AIDATA/data
>>> <gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54>
>>> <gfid:9612ecd2-106d-42f2-95eb-fef495c1d8ab>
>>> Status: Connected
>>> Number of entries: 2
>>>
>>> # gluster volume heal data01 info split-brain
>>> Brick 192.168.186.11:/mnt/AIDATA/data
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
>>> Brick 192.168.186.12:/mnt/AIDATA/data
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
>>>
>>> Both files are missing from the folder on Brick1, and the corresponding
>>> gfid files are also missing from the .glusterfs directory on that same
>>> Brick1. Brick2 has both the files and their gfid files in .glusterfs.
>>>
>>> We already tried:
>>>
>>> # gluster volume heal data01 full
>>> Running stat and ls -l on both files from a mounted client to try to
>>> trigger a heal
>>>
>>> Would a rebalance fix this? Any guidance would be greatly appreciated!
>>>
>>
>> A rebalance would not help here as this is a replicate volume. Ravi, any
>> idea what could be going wrong here?
>>
>> No, an explicit lookup should have healed the file on the missing brick,
>> unless the lookup did not reach AFR and was served from the caching
>> translators. Frederic, what version of gluster are you running? Can you
>> launch 'gluster volume heal <volname>' and check the glustershd logs for
>> possible warnings? Use the DEBUG client-log-level if you have to. Also,
>> instead of stat, try a getfattr on the file from the mount.
>>
>> -Ravi
>>
>>
>> Regards,
>> Nithya
>>
>>>
>>> Thank you in advance!
>>>
>>> --
>>>
>>> *Frederic Harmignies*
>>> *High Performance Computer Administrator*
>>>
>>> www.elementai.com
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org
>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>
>>
>>
>
>
> --
>
> *Frederic Harmignies*
> *High Performance Computer Administrator*
>
> www.elementai.com
>



-- 

*Frederic Harmignies*
*High Performance Computer Administrator*

www.elementai.com

_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
