Singleton on non-persistent Java type Java::JavaLang::NullPointerException #4304
Comments
It appears the worklist for the data flow analysis problem is getting populated with a null. That appears to be possible if |
Here's a patch that would at least prevent that null entry in the worklist, but I'm not sure why the DF problem tries to traverse a block that is no longer in the CFG. Perhaps this is related to the warning about caching the post-order traversal as @subbuss mentioned in CFG.postOrderList?
|
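To make the suspected failure mode concrete, here is a minimal sketch of how a cached traversal order can hand a null to a worklist after a block is deleted, and what a guard would look like. Everything in it (`ToyCfg`, `removeBlockKeepingCache`, `buildWorklist`) is a hypothetical stand-in, not JRuby's actual CFG or data flow code, and the traversal order is simplified to insertion order. As noted later in the thread, a guard like this only hides the symptom.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy CFG; names and structure are assumptions for illustration only.
final class ToyCfg {
    private final Map<Integer, String> blocksById = new LinkedHashMap<>();
    private List<Integer> cachedOrder; // cached traversal order (block ids)

    void addBlock(int id, String label) {
        blocksById.put(id, label);
        cachedOrder = null; // adding a block invalidates the cache
    }

    // Deliberately leaves the cache alone to mimic the suspected staleness.
    void removeBlockKeepingCache(int id) {
        blocksById.remove(id);
    }

    List<Integer> traversalOrder() {
        if (cachedOrder == null) {
            cachedOrder = new ArrayList<>(blocksById.keySet());
        }
        return cachedOrder;
    }

    // Resolves cached ids to blocks, skipping ids whose block has been deleted.
    Deque<String> buildWorklist() {
        Deque<String> worklist = new ArrayDeque<>();
        for (int id : traversalOrder()) {
            String block = blocksById.get(id); // null if the block was removed
            if (block != null) {
                worklist.add(block);
            }
        }
        return worklist;
    }
}
```

Without the `block != null` check, the stale id would resolve to null and the null would reach the worklist. Invalidating the cache on removal would address the staleness itself rather than its symptom.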
Could be. Not sure if the list is updated when a BB is deleted for whatever reason. And there is code that deletes BBs as far as I remember. |
For what it's worth, looking over the last week+ we haven't seen this re-occur. It's certainly not blocking us on anything in its current state. |
@headius @subbuss I think a null check will wallpaper over the actual problem. So I figured out two races:
I think this ends up being fairly rare because the race is probably not all that common. You need a closure and a parent (which could be closure/method/whatever) deciding to build at the same time but from different threads. Even if that does happen the CFG might not change in any meaningful way. CFG should be immutable? That certainly would get rid of the most obvious reported issue but is it a correct solution? If I run LVA on a parent and it assumes something about the child, but the child is optimizing and changing itself, can we depend on the older CFG? Maybe for LVA (conservatively speaking) but in general? |
After a month of no issues (hundreds of successful builds), we just saw another occurrence of this - almost exactly the same place, during asset precompilation. Exactly the same stacktrace as far as I can tell, although this time running Rake 12.0.0. |
@tobymurray-nanometrics Thanks for the update. We're still debating the right fix for this. |
This is a bad bug, but I am pushing it to 9.1.8. I will land my PR after 9.1.7.0 and we can beat on or improve the obvious concurrency issue we have when compiling child closures before their parents. The worst part of this issue is my inability to get a good test case. |
No complaints here about pushing the bug. It has only manifested for us during CI builds, and even then only twice that I can find, and we can just run the build again in the meantime. The asset precompilation stage of our build (the only place we've seen this) is pretty opaque, but is there anything we can do to help here? Any additional logs or information that would help build a reproduction? |
@tobymurray-nanometrics If I had a smallish script I could torture test, one which would manifest the problem on occasion, it would definitely help. My other thought is I could write Java which explicitly calls into our compiler from two threads to make the race much more likely. On your part, I am pretty sure I understand the actual issue. Verifying that any fix I have works is the harder part. My PR probably fixes the problem too, although I want it to bake first. |
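The two-thread idea above could be sketched roughly as follows. This is only a generic harness under stated assumptions: `RaceHarness` and its `run` method are hypothetical, and the two `Runnable`s stand in for whatever calls would build the parent scope and the child closure. It does not call into JRuby's compiler. A `CyclicBarrier` releases both threads at the same moment each round to widen the race window.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stress harness; the tasks are placeholders for real compiler calls.
final class RaceHarness {
    // Runs both tasks `rounds` times, starting each pair together.
    // Returns the number of rounds in which at least one task threw.
    static int run(Runnable parentTask, Runnable childTask, int rounds) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CyclicBarrier startLine = new CyclicBarrier(2);
        int failedRounds = 0;
        try {
            for (int i = 0; i < rounds; i++) {
                Future<?> parent = pool.submit(() -> {
                    startLine.await(); // wait until the other thread is ready
                    parentTask.run();
                    return null;
                });
                Future<?> child = pool.submit(() -> {
                    startLine.await();
                    childTask.run();
                    return null;
                });
                // Non-short-circuit OR so both futures are always awaited.
                if (failed(parent) | failed(child)) {
                    failedRounds++;
                }
            }
        } finally {
            pool.shutdownNow();
        }
        return failedRounds;
    }

    private static boolean failed(Future<?> future) {
        try {
            future.get();
            return false;
        } catch (ExecutionException e) {
            return true; // the task threw, e.g. a NullPointerException
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("harness interrupted", e);
        }
    }
}
```

A run that reports zero failed rounds does not prove the race is gone; it only means it was not hit in that many attempts.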
I can spend a little time torturing the precompile stage to see if it consistently fails occasionally, and if that works I can post it. Any ideas as to what may contribute to eliciting this (e.g. does RAM allocated to the process have any impact etc.)? |
@tobymurray-nanometrics In theory, anything which has one thread calling a parent scope (closure or method) while another thread calls a child closure. The error condition is explained in the PR description. Largely it is both compiling at the same time on different threads. The parent thinks something on the child is usable, but by the time it asks for it again it has changed. |
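The "asks for it again and it has changed" condition is a classic double-read race. Below is a deterministic sketch of it, with a hook standing in for the other thread's rebuild so the interleaving is reproducible. `ChildScope`, `ParentAnalysis`, and the `int[]` standing in for a CFG are all hypothetical and are not taken from JRuby or from the PR.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical child scope whose "CFG" (here just an int array) may be swapped out.
final class ChildScope {
    private final AtomicReference<int[]> cfg =
            new AtomicReference<>(new int[] {1, 2, 3});

    int[] currentCfg() {
        return cfg.get();
    }

    void rebuild(int[] replacement) {
        cfg.set(replacement);
    }
}

final class ParentAnalysis {
    // Racy: reads the child's state twice, so a rebuild between the
    // two reads leaves `lastIndex` pointing into the wrong array.
    static int lastBlockRacy(ChildScope child, Runnable betweenReads) {
        int lastIndex = child.currentCfg().length - 1;
        betweenReads.run(); // stands in for the other thread rebuilding
        return child.currentCfg()[lastIndex];
    }

    // Safer: take one snapshot and answer every question from it.
    // The answer may be stale, but it is internally consistent.
    static int lastBlockSnapshot(ChildScope child, Runnable betweenReads) {
        int[] snapshot = child.currentCfg();
        betweenReads.run();
        return snapshot[snapshot.length - 1];
    }
}
```

The snapshot version assumes the array is never mutated after it is published. Whether a stale but consistent view is acceptable depends on the analysis, which is the open question raised earlier in the thread.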
@tobymurray-nanometrics Just so you know, PR #4392 was merged to master a couple days ago, so master builds should have our expected fix for this. If you're able to test it out that would help us verify that the issue is fixed. |
Resolving based on feedback in a related issue. @tobymurray-nanometrics if you still see this then please reopen it. |
We just upgraded from 9.1.2.0 to 9.1.6.0, and we saw a CI build failure with the exception below. It's only failed once so far, but we like to stay on top of these things. Look familiar at all? Anything we can do to provide more helpful logs/details? I don't have a reproducible case, so I'm just looking for some guidance as to whether you think this is an us problem or a JRuby 9.1.6.0 problem.
Environment
JRuby 9.1.6.0
Rails 4.2.7.1
Command executed
Stacktrace