New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Helix crashes #611
Comments
Does not happen on my Fedora setup. Did it really die on copying the font of an entity? That sounds like unitialized memory or something. Couldn this be due to the recent change of memory allocator? |
@phkahler Have you tried to reproduce this issue on a normal build, or on a build with sanitizers enabled? |
If I go back to 74103ee (just before 521473e) then it crashes here: solvespace/src/platform/utilwin.cpp Line 60 in 74103ee
Which I think is this: If I checkout up to and including 521473e then the crash changes to the one above. So @phkaler - yes, I think 521473e removes that crash and introduces this one. |
I only run release builds, sorry. It's kind of siily I haven't adopted a decent IDE yet or used the debugger. gedit and the terminal over here... @ruevs yeah using OpenMP was a risk, but that first crash you had was in such a strange place, it doesn't seem related on the surface. That second crash (that you posted) looks like it's in the merging of the triangle lists. That was not done by separate threads prior to e84fd46 Does it crash if you go one prior to that? |
You don't need an IDE or a debugger to run a build with sanitizers. Try configuring with |
OK, probably not important but when running cmake:
Then immediately on running solvespace:
|
The first issue (CMake) is unimportant but something we should fix in the long run just to reduce noise. The second issue (new-delete-type-mismatch) is a major memory issue that needs to be fixed. |
@phkahler I forgot to mention that to get useful backtraces with sanitizer builds you should add |
It looks like a bug in libsigc++ (which is a transitive dependency through GTK 3): libsigcplusplus/libsigcplusplus#10. We can't do anything about it unfortunately so you should |
@whitequark Thanks I'll try that next. If I ignore the error it appears to run but then crashes if I try to Revolve or Helix (but not extrude) it's claiming a use-after-free in CopyEntity called from Group::Generate. I'm impressed that these tools are finding issues that have gone undetected so long. |
Sanitizers are extremely powerful and valuable tools and it is almost impossible to write reliable C++ code without using them; completely impossible if a team of people with a varying degree of understaning C++ is working on a shared codebase. (Before sanitizers there was Valgrind, which is still occasionally useful, but it can fundamentally discover fewer issues in C/C++ codebases, and is much slower.) The main downside is a small amount of false positives they have (e.g. TSan doesn't appear to be usable with the GNU OpenMP implementation, sadly), but almost everything they report tends to be a real issue after all. |
Oh, and we do already run sanitizers on CI, but the scarce test coverage means many issues go undetected anyway. |
Lathe is failing too. |
Oh, lots of memory leaks too without doing anything, just run it and close the window. |
There are quite a few allocations in GTK and OpenGL drivers that aren't freed, so leak checking is more usefully performed with Windows builds, unfortunately. The leaks are "real" in the sense that it is memory that is never freed and, in fact, cannot be freed, but they are "not real" in the sense that it's done on purpose by an upstream project such that this memory is gone with the process and isn't going to be fixed. |
The leaks look mostly GTK related too. The crashes are not... |
Yep, the crashes are exceedingly likely to point to a real issue. |
Oh.... I think I did something dumb. I need to review some history. And by dumb I don't mean the typical "wrote a bug dumb", but the "ignored the comments that warn you about a hazard" kinda dumb. |
@ruevs can you try this: I probably introduced this when I removed extra entity copies from those 3 tools. There is a comment right there not to use *e after copy because it may not be valid. I changed to code for clarity but ignored that comment. |
Yes, the bugs I just fixed have been there since: 45eb246 There seem to be a lot of people building from source but apparently not on Windows? |
|
Just did. Your phkahler/issue611 fixes the "first crash". So you can merge #612 |
Now I'll try with your patch, the heap change AND parallel SShell::TriangulateInto. Essentially the new master at 23dfd97 ... |
Oops. I linked this issue to the PR so it closed on merge. Please let us know if you still have the other problem. |
Parallel triangulation won't ever work on Windows without removing HEAP_NO_SERIALIZE. |
Is that done on master? |
Of course. |
It does not crash at all with current master 23dfd97 So @whitequark your removal of the Win32 heap seems to fix (hide?) the "OpenMP SShell::TriangulateInt" related crash. And @phkahler your #612 fixes the original crash in this issue. And by the way I like your "Split quads based on angle" - now that I can try it I will play some more :-) |
Awesome! |
It doesn't "hide" anything. It fixes the root cause of the crash, which is the lack of serialization on heap access. In this context, serialization means using a mutex to synchronize heap access, which would cause the threads to access it one at a time, or serially. |
System information
SolveSpace version: master e84fd46
Operating system: Windows 10
Microsoft Visual C++ 2019 x64
Expected behavior
Create a helix
Actual behavior
Crashes here https://github.com/solvespace/solvespace/blob/master/src/group.cpp#L1033
With
Call stack:
Additional information
Watch the movie ;-)
HelixCrash.zip
The text was updated successfully, but these errors were encountered: