Frequent build timeouts for arm64 snap #587

Closed
ppd opened this issue Apr 25, 2020 · 14 comments

@ppd
Member

ppd commented Apr 25, 2020

Travis "deploy" stage for arm64 snap fails often with message: "No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself."

It always happens in the CMake install stage, which is invoked by snapcraft. There is no build error, just no output for ten minutes. There's nothing extraordinary going on and the hardware seems "OK fast", so it shouldn't choke on copying a font or an icon.

WIP, obviously...

@phkahler
Member

@ppd Thank you for spending so much time making SolveSpace available via Snap. I build from source, but the more ways there are for other people to get it the better!

@ppd
Member Author

ppd commented Apr 26, 2020

There's something wrong with stdout/stderr capturing on arm64. The build isn't really stalling: even output from simple echo loops stops showing up after a while.

One would think it would be simple to redirect snapcraft's output to a logfile, so that there's less output, which might help. But no: AppArmor confines its execution and makes delegation of file descriptors impossible, which means no piping etc.

It's not a huge deal. Travis will no doubt fix this on their side eventually. Until then, it's maybe wise to pause the edge builds for arm64. They will fail all the time for no good reason, and that's not so helpful, I think.

@whitequark
Contributor

@ppd By any chance, would this also help arm64?

@ppd
Member Author

ppd commented Apr 26, 2020

@whitequark Same problem as with the logfile, I think. AppArmor will shoot it down with something like: Log: apparmor="DENIED" operation="file_inherit" profile="/usr/lib/snapd/snap-confine" [...]

@whitequark
Contributor

> Until then, it's maybe wise to pause the edge builds for arm64.

Let's do that. Let's also remove the arm64 builds from the store so that people don't unintentionally download outdated, buggy versions. (There are probably very few people who can run it in the first place at the moment, too.)

@ppd
Member Author

ppd commented Apr 27, 2020

Support request regarding snapcraft output redirection: https://forum.snapcraft.io/t/redirecting-snapcraft-output/16948

@ppd
Member Author

ppd commented Apr 27, 2020

@ppd
Member Author

ppd commented Apr 27, 2020

@whitequark What about adding allow_failures to the Travis config to prevent this hiccup from polluting the CI results?

edge is by definition unstable or uncurated enough that we could tolerate the arm64 builds being a little out of date (sometimes) until the arm64 problem is resolved on the Travis side.
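
As a rough sketch of what such a Travis setting could look like (the job keys used in this repository's `.travis.yml` are not shown in this thread, so matching on `arch: arm64` is an assumption):

```yaml
# Hypothetical .travis.yml fragment; the job matcher below is an assumption.
jobs:
  allow_failures:
    # Jobs matching these keys may fail without failing the whole build.
    - arch: arm64
```

Jobs matched this way still run, but their result is not counted toward the overall build status.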

@whitequark
Contributor

Sounds fine.

ppd added a commit to ppd/solvespace that referenced this issue Apr 27, 2020
arm64 builds on Travis are not yet fully mature and this
causes a high failure rate that is not caused by our code.

Continue building and deploying arm64 to the edge channel,
but don't consider the result for the success of the whole
build job.

Alleviates solvespace#587
whitequark pushed a commit that referenced this issue Apr 27, 2020
arm64 builds on Travis are not yet fully mature and this
causes a high failure rate that is not caused by our code.

Continue building and deploying arm64 to the edge channel,
but don't consider the result for the success of the whole
build job.

Alleviates #587
@ppd
Member Author

ppd commented May 7, 2020

Looks better now. On the other hand, OSX is always on the verge of timing out.

@whitequark
Contributor

See #618.

@ppd
Member Author

ppd commented Oct 17, 2020

@ppd
Member Author

ppd commented Oct 18, 2020

This is likely to be resolved in the not-too-distant future with a move to arm64-graviton2 on travis-ci.com: #744 (comment)

ppd added a commit to ppd/solvespace that referenced this issue Oct 20, 2020
arm64 is extremely unreliable, as documented in solvespace#587.
Switch to this new arch, which has proven to be more stable in
my tests by a lot.
@ppd
Member Author

ppd commented Oct 26, 2020

With our move to arm64-graviton2, the original issue is resolved.
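
For reference, selecting the Graviton2 builders in a Travis job might look roughly like this. Only the `arch` value comes from this thread; the `virt` and `group` values are assumptions based on Travis's documentation at the time:

```yaml
# Hypothetical .travis.yml fragment for a single job.
os: linux
arch: arm64-graviton2
virt: vm        # assumption; lxd may also be an option
group: edge     # assumption; reportedly required for Graviton2 at the time
```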

@ppd ppd closed this as completed Oct 26, 2020