List:       openembedded-core
Subject:    Re: [OE-core] [PATCH 0/2] Avoid build failures due to setscene errors
From:       Martin Jansa <martin.jansa () gmail ! com>
Date:       2019-11-29 16:48:50
Message-ID: CA+chaQdRJQ8i+a2Ko3_D7Gp=yAJvA5M5wCmZstMykip-x-FYbg () mail ! gmail ! com

On Wed, Aug 30, 2017 at 9:54 AM Martin Jansa <martin.jansa@gmail.com> wrote:

> I agree with this patchset, and it would be OK with an IGNORE_SETSCENE_ERRORS
> conditional as well.
>
> We're also sometimes seeing these errors: sometimes anticipated, when
> cleaning the shared sstate-cache on the NFS server, and sometimes unexpected,
> when NFS or the network goes down for a minute and, for some builds, that
> happens between sstate_checkhashes() and using the sstate.
>
> We normally stop all jenkins builds until the cleanup is complete (there
> is a jenkins job doing the cleanup, so it puts jenkins into stop mode, waits
> for all current jobs to finish, which can take hours, then performs the
> cleanup and cancels the stop mode). But we cannot stop hundreds of
> developers using the same sstate-cache in local builds (especially when we
> cannot really know when exactly the job will have a free jenkins to perform
> the cleanup). Luckily, in local builds it doesn't hurt so badly, because the
> developers are more likely to ignore the error as long as the image was
> created. In jenkins builds, however, when bitbake returns an error we cannot
> easily distinguish the case of "RP is intentionally warning us that something
> went wrong with sstate, but everything was built correctly in the end" from
> "something failed in the build and we weren't able to recover from that,
> maybe even the image wasn't created", so we don't trigger the follow-up
> actions like announcing new official builds, parsing release notes, or
> automated testing.
>
> Yes, we could add more logic to these CI jobs to grep the logs, decide
> whether this error was the only one which caused bitbake to return an error
> code, and ignore the returned error in that case. But a simple variable is
> easier to maintain (even at the cost of forking bitbake and oe-core) and
> will work for local builds as well.
>

I have been using these 2 changes in my forks of oe-core and bitbake since
they were sent to the list. Today, though, a build which unfortunately wasn't
using my forks produced a bunch of errors like this, and fellow developers
asked a few questions about why these errors aren't ignored. So I've finally
found time to improve our CI jobs to deal with this: they now ignore the
bitbake return code if it's reporting failure only because of these setscene
fetcher failures.

If someone needs a similar workaround for this bitbake behavior, here is
what I did:
https://github.com/webOS-ports/jenkins-jobs/pull/12
Yes, it's ugly, but it seems to work and is a bit better than forking
oe-core and bitbake just because of this issue.
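
For readers who want to see the general shape of such a check, here is a
minimal sketch. It is not the content of the pull request above. It assumes
that the CI job has captured the bitbake console output, that error lines in
it start with "ERROR:", and that setscene-related errors mention a task name
ending in "_setscene". Both patterns and both function names are
illustrative assumptions, so adjust them to what your logs actually contain.

```python
import re

# Assumption: every error line in the captured console log starts with "ERROR:".
ERROR_RE = re.compile(r"^ERROR:")
# Assumption: setscene-related errors mention a task name ending in "_setscene".
SETSCENE_RE = re.compile(r"_setscene\b")


def only_setscene_errors(log_lines):
    """Return True if the log has at least one ERROR line and every
    ERROR line mentions a setscene task."""
    errors = [line for line in log_lines if ERROR_RE.match(line)]
    return bool(errors) and all(SETSCENE_RE.search(line) for line in errors)


def effective_exit_code(bitbake_rc, log_lines):
    """Map a non-zero bitbake return code to 0 when the only errors in the
    log are setscene-related; otherwise pass the return code through."""
    if bitbake_rc != 0 and only_setscene_errors(log_lines):
        return 0
    return bitbake_rc
```

A CI wrapper could run bitbake, save its output to a file, and exit with
effective_exit_code(rc, lines) instead of the raw return code. A non-zero
return code with no matching ERROR lines is deliberately passed through
unchanged, so unexplained failures still stop the follow-up actions.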

Regards,

-- 
_______________________________________________
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.openembedded.org/mailman/listinfo/openembedded-core

