List:       vdsm-devel
Subject:    [ovirt-devel] Re: [CQ 20589] engine reports storage domain does not exist
From:       Benny Zlotnik <bzlotnik () redhat ! com>
Date:       2020-02-25 14:40:16
Message-ID: CAH+QXvVrmvCmmdGQ=ut3ykJ+caQrP516FwyxUdWFjw+hX7cQdA () mail ! gmail ! com

I think it is an issue with the newer kernel: it looks like devices are not
available after `iscsiadm -l`. I've seen a similar reproducible issue in my
env (in a different flow, though), so I submitted a bug [1].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1807050
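
A rough way to check for this symptom on a host, as a sketch only: poll for
the expected device path after the iSCSI login and report whether it showed
up in time. The helper name `wait_for_device` and the default timeout and
interval are assumptions for illustration, not anything taken from VDSM or
from the bug report.

```python
import os
import time


def wait_for_device(path, timeout=10.0, interval=0.5,
                    exists=os.path.exists, sleep=time.sleep):
    """Poll until `path` exists or `timeout` seconds have elapsed.

    Returns True if the path appeared, False otherwise. `exists` and
    `sleep` are injectable so the helper can be exercised without real
    devices.
    """
    deadline = time.monotonic() + timeout
    while True:
        if exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Called right after the login with a path such as /dev/sdi, a False result
would match the "devices are not available" behaviour described above.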

On Tue, Feb 25, 2020 at 10:55 AM Sandro Bonazzola <sbonazzo@redhat.com>
wrote:

> https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/20589/
> 
> 
> Engine fails at:
> 
> 2020-02-25 02:53:53,144-05 DEBUG [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainInfoVDSCommand] (default task-1) [56f93c90-3868-43f8-920f-bc1fccc72a27] Exception: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to HSMGetStorageDomainInfoVDS, error = Storage domain does not exist: ('8c9f3762-3bf1-48fa-9237-b6587d0268ab',), code = 358
> 
> https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/20589/artifact/basic-suite.el7.x86_64/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/engine.log
>
> corresponding failure in VDSM is:
> 
> 2020-02-25 02:53:53,128-0500 ERROR (jsonrpc/7) [storage.TaskManager.Task] (Task='752b540e-fbfc-4602-8eeb-3c357fb7f5a2') Unexpected error (task:874)
> Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 881, in _run
>     return fn(*args, **kargs)
>   File "<decorator-gen-129>", line 2, in getStorageDomainInfo
>   File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
>     ret = func(*args, **kwargs)
>   File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 2752, in getStorageDomainInfo
>     dom = self.validateSdUUID(sdUUID)
>   File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 309, in validateSdUUID
>     sdDom = sdCache.produce(sdUUID=sdUUID)
>   File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 115, in produce
>     domain.getRealDomain()
>   File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 51, in getRealDomain
>     return self._cache._realProduce(self._sdUUID)
>   File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 139, in _realProduce
>     domain = self._findDomain(sdUUID)
>   File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 156, in _findDomain
>     return findMethod(sdUUID)
>   File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 186, in _findUnfetchedDomain
>     raise se.StorageDomainDoesNotExist(sdUUID)
> vdsm.storage.exception.StorageDomainDoesNotExist: Storage domain does not exist: ('8c9f3762-3bf1-48fa-9237-b6587d0268ab',)
> 2020-02-25 02:53:53,128-0500 INFO  (jsonrpc/7) [storage.TaskManager.Task] (Task='752b540e-fbfc-4602-8eeb-3c357fb7f5a2') aborting: Task is aborted: "value=Storage domain does not exist: ('8c9f3762-3bf1-48fa-9237-b6587d0268ab',) abortedcode=358" (task:1184)
> 2020-02-25 02:53:53,128-0500 ERROR (jsonrpc/7) [storage.Dispatcher] FINISH getStorageDomainInfo error=Storage domain does not exist: ('8c9f3762-3bf1-48fa-9237-b6587d0268ab',) (dispatcher:83)
> 2020-02-25 02:53:53,129-0500 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call StorageDomain.getInfo failed (error 358) in 0.28 seconds (__init__:312)
> 
> https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/20589/artifact/basic-suite.el7.x86_64/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host-0/_var_log/vdsm/vdsm.log
>
> Corresponding var log messages:
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:2: Attached scsi generic sg8 type 0
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:2: alua: port group 00 state A non-preferred supports TOlUSNA
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: scsi 3:0:0:1: Direct-Access     LIO-ORG  lun1_bdev        4.0  PQ: 0 ANSI: 5
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:2: [sdi] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB)
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: scsi 3:0:0:1: alua: supports implicit and explicit TPGS
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: scsi 3:0:0:1: alua: device naa.600140541e20cc9dba94e4fa46a57322 port group 0 rel port 1
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:2: [sdi] Write Protect is off
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:2: [sdi] Write cache: enabled, read cache: enabled, supports DPO and FUA
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:1: Attached scsi generic sg9 type 0
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:1: [sdj] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB)
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:0: [sdf] Attached SCSI disk
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:1: [sdj] Write Protect is off
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:1: [sdj] Write cache: enabled, read cache: enabled, supports DPO and FUA
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:4: [sdg] Attached SCSI disk
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:3: [sdh] Attached SCSI disk
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:2: [sdi] Attached SCSI disk
> Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:1: [sdj] Attached SCSI disk
> Feb 25 02:53:52 lago-basic-suite-master-host-0 systemd[1]: Started Session c67 of user root.
> Feb 25 02:53:52 lago-basic-suite-master-host-0 systemd[1]: Starting LVM event activation on device 8:128...
> Feb 25 02:53:52 lago-basic-suite-master-host-0 lvm[30164]:  pvscan[30164] PV /dev/sdi online, VG 8c9f3762-3bf1-48fa-9237-b6587d0268ab is complete.
> Feb 25 02:53:52 lago-basic-suite-master-host-0 lvm[30164]:  pvscan[30164] VG 8c9f3762-3bf1-48fa-9237-b6587d0268ab skip autoactivation.
> Feb 25 02:53:52 lago-basic-suite-master-host-0 systemd[1]: Started LVM event activation on device 8:128.
> 
> 
> https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/20589/artifact/basic-suite.el7.x86_64/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host-0/_var_log/messages
>
> 
> 
> --
> 
> Sandro Bonazzola
> 
> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
> 
> Red Hat EMEA <https://www.redhat.com/>
> 
> sbonazzo@redhat.com
> <https://www.redhat.com/>*Red Hat respects your work life balance.
> Therefore there is no need to answer this email out of your office hours.*
> 


_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/SBRWPQGVMCL723VGSSGG6V6YS3N2P64M/



