Crash at language.d:431 during startup #121

Closed
emlai opened this issue Jul 16, 2016 · 11 comments
Labels
3-bug; norepro: I (Simon) cannot reproduce this. I run Arch Linux 64-bit, and can run Win32 or Win64 in Wine.

Comments


emlai commented Jul 16, 2016

I'm compiling this on macOS; installed dmd, dub, A5, and ran dub in the project root directory. The program immediately crashes. Here's the relevant stack trace from the crash log:

Thread 7 Crashed:
0   lix                             0x0000000105742ac7 D4file8language51loadUserLanguageAndIfNotExistSetUserOptionToEnglishFZv + 79 (language.d:431)
1   lix                             0x000000010569ed10 D6basics4init10initializeFxC6basics7cmdargs7CmdargsZv + 88 (init.d:43)
2   lix                             0x00000001057a7764 D4main4mainFAAyaZ9__lambda2MFZi + 28 (main.d:32)
3   lix                             0x00000001057c23f0 _D8allegro56system14al_run_allegroFMDFZiZ11main_runnerUiPPaZi + 40

Hope this helps.

Edit: I reproduced this with the latest master (fe114c5).

SimonN added the 3-bug and norepro labels Jul 16, 2016
Owner

SimonN commented Jul 16, 2016

Thanks for the report! I haven't tested on MacOS at all -- only Windows and Linux -- and this feedback is extremely valuable.

A fresh checkout of master (fe114c5) didn't reproduce the error here. I'll dissect the source, and keep you updated when I have a theory. :-)

Author

emlai commented Jul 16, 2016

Forgot to include information about the exception itself:

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000000

So it seems to be a null pointer dereference.

SimonN added a commit that referenced this issue Jul 17, 2016
Owner

SimonN commented Jul 17, 2016

I've added debugging asserts around the crashing source line, and around fileLanguage, the likely candidate for the null pointer. These asserts all pass on a clean build here.

Please pull master again, and run dub again, without any parameters. Do you get assertion failures, or does it segfault again?

Author

emlai commented Jul 17, 2016

I did a clean clone of the new master (revision 6aa3d49) and ran dub. The assertion on line 432 gets triggered.

Author

emlai commented Jul 18, 2016

Not entirely sure about this, but this might have something to do with threads. I printed the id of the current thread where fileLanguage is initialized in user.d and where it is used in language.d and got different values.
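Editor's note: the observation above matches how D treats module-level variables, which are thread-local unless marked shared or __gshared. The sketch below is only an illustration, not the Lix source: the Filename class and the path are hypothetical stand-ins. It assigns the variable on one thread and checks what a second thread sees.

```d
import core.thread : Thread;
import std.stdio : writeln;

// Hypothetical stand-in for the game's Filename class.
class Filename
{
    string path;
    this(string p) { path = p; }
}

// No `shared` or `__gshared` here, so every thread gets its own copy.
Filename fileLanguage;

// Assigns on the calling thread, then reports whether a second thread
// still sees its own copy as null.
bool otherThreadSeesNull()
{
    fileLanguage = new Filename("data/lang.txt"); // hypothetical path
    bool sawNull = false;
    auto worker = new Thread({ sawNull = fileLanguage is null; });
    worker.start();
    worker.join();
    return sawNull;
}

void main()
{
    writeln("other thread sees null: ", otherThreadSeesNull());
}
```

Because this module has no static this() that sets fileLanguage, the second thread's copy keeps its default value, which is the same symptom as the null dereference in the crash log.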

Owner

SimonN commented Jul 18, 2016

Thanks for the tips! I've changed some initialization code. Please pull master (fa9c3e4) again, and see if the bug still manifests.


Background info: Your findings fit into my mental image of where the bug may lie. The program works like this:

  1. run static module constructors,
  2. enter the D main(),
  3. run most of the program through a special function that Allegro 5 can hijack. I believe that either this or D's main() spawns a fresh thread.

I conjecture that lines like static const Filename = new Filename(...) do what I want on Linux and Windows, but on MacOS, they fail to initialize the global constants before the module constructors static this() run. With this change, I initialize the global constants in the module constructors themselves, and D guarantees that the module constructors run in the proper order.
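Editor's note: a minimal sketch of the two kinds of module constructor mentioned here, assuming a single-module program. The counter names and the helper function are made up for illustration. shared static this() runs once per process, while static this() runs once for every thread that the D runtime starts.

```d
import core.atomic : atomicLoad, atomicOp;
import core.thread : Thread;
import std.stdio : writeln;

shared int sharedCtorRuns; // bumped by the per-process constructor
shared int tlsCtorRuns;    // bumped by the per-thread constructor

shared static this() { atomicOp!"+="(sharedCtorRuns, 1); }
static this()        { atomicOp!"+="(tlsCtorRuns, 1); }

void noop() { }

// Starts one extra D thread and returns
// [per-process constructor runs, per-thread constructor runs].
int[2] ctorRunsAfterOneExtraThread()
{
    auto t = new Thread(&noop);
    t.start();
    t.join();
    return [atomicLoad(sharedCtorRuns), atomicLoad(tlsCtorRuns)];
}

void main()
{
    writeln(ctorRunsAfterOneExtraThread());
}
```

When this is called once from the main thread, the per-thread count should be one higher than the per-process count, since the extra thread runs static this() again but not shared static this().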

Author

emlai commented Jul 18, 2016

Pulled fa9c3e4: still crashes on language.d:432.

Author

emlai commented Jul 19, 2016

I found the following in the DAllegro5 readme:

al_run_allegro will block until your code returns. On some platforms it will run your code in a different thread (you generally don't need to worry about this). This is done for cross-platform (specifically OSX) compatibility.

Might this have something to do with our issue?

Author

emlai commented Jul 19, 2016

I found the bug! It's in DAllegro5. On macOS they call thread_attachThis, whose docs say:

NOTE:
This routine does not run thread-local static constructors when called. If full functionality as a D thread is desired, the following function must be called after thread_attachThis:

extern (C) void rt_moduleTlsCtor();

But they don't call rt_moduleTlsCtor so the thread-local static constructors aren't invoked. Going to submit a patch there.


After fixing this, Lix doesn't crash in language.d anymore and a window opens up :) yay.
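Editor's note: the sketch below illustrates the behaviour described in the quoted docs. It is not the DAllegro5 patch. It assumes a POSIX system, creates a thread behind the D runtime's back with pthread_create, and uses made-up names (answer, Observed, foreignEntry). The thread reads a thread-local variable once after thread_attachThis and once more after rt_moduleTlsCtor.

```d
import core.sys.posix.pthread;
import core.thread : thread_attachThis, thread_detachThis;
import std.stdio : writeln;

// Runtime hooks named in the thread_attachThis / thread_detachThis docs.
extern (C) void rt_moduleTlsCtor();
extern (C) void rt_moduleTlsDtor();

int answer;                     // thread-local by default
static this() { answer = 42; }  // runs per thread, if the runtime runs it

struct Observed { int beforeCtor; int afterCtor; }

// Entry point for a thread that the D runtime did not create.
extern (C) void* foreignEntry(void* arg) nothrow
{
    auto seen = cast(Observed*) arg;
    try
    {
        thread_attachThis();       // registers the thread, runs no static this()
        seen.beforeCtor = answer;  // still the default value
        rt_moduleTlsCtor();        // now the thread-local constructors run
        seen.afterCtor = answer;
        rt_moduleTlsDtor();        // matching teardown before detaching
        thread_detachThis();
    }
    catch (Exception) { }
    return null;
}

Observed observeForeignThread()
{
    Observed seen;
    pthread_t tid;
    pthread_create(&tid, null, &foreignEntry, &seen);
    pthread_join(tid, null);
    return seen;
}

void main()
{
    auto seen = observeForeignThread();
    writeln("before: ", seen.beforeCtor, ", after: ", seen.afterCtor);
}
```

The value read before rt_moduleTlsCtor should be the default 0, and the value read afterwards should be 42, which is the gap that left fileLanguage null on the Allegro thread.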

Author

emlai commented Jul 19, 2016

Btw this works fine now even without fa9c3e4, if you want to revert that.

Owner

SimonN commented Jul 19, 2016

Excellent debugging! Awesome to see it running on MacOS.

I didn't have any good new ideas yesterday. Overnight, I thought about replacing all module constructors with code that I call by hand.

Running all initialization manually might be better if the module constructors took a long time to run. My module constructors all run fast enough that running them twice, once for main(), once for the Allegro 5 thread, won't matter. I manually run the performance-heavy initializations anyway, because they rely on Allegro 5.
