New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Coroutine stackoverflow detection for linux/posix; Issue #2408 #2740
Conversation
@hkaiser mixed up my unit test with an existing unit test. will rollback and create a new unit test. on that note, should the coroutine errors be printed using std::cerr or should they be printed out using the hpx compliment streaming output sink? |
@@ -140,7 +151,19 @@ namespace hpx { namespace threads { namespace coroutines | |||
|
|||
x86_linux_context_impl() | |||
: m_stack(nullptr) | |||
{} | |||
{ | |||
segv_stack.ss_sp = valloc(SEGV_STACK_SIZE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we have that depending on a define such that is configurable through cmake?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sithhell sure thing. How would you want to distinguish or designate the cmake variables? example: HPX_COROUTINE_STACKOVERFLOW_DETECTION
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sithhell is streaming to std::cerr acceptable or would ya'll prefer hpx::cerr?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually name our cmake config variable HPX_WITH_<FOO>
and the corresponding preprocessor constant HPX_HAVE_<FOO>
.
Also, hpx::cerr
might not really work as it would send things to the console which might not be operational under stack-overflow conditions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ct-clmsn I can see the code you have written and am a bit afraid as the code is somewhat similar to that of libsigsegv, it's a GPL'ed library so please be a bit careful about it before opening any PR. I will get back to you about why it's not working but kindly take care of the GPL issue. |
@ABresting I'm aware of the GPL concerns mentioned in issue #2408. I did not read and have not read the libsegsegv implementation source files. At the top of this PR's commentary thread, I noted the reference materials used to implement this PR. I'd strongly encourage reviewing the links provided at the top of this PR commentary thread if you have concerns regarding the origin of this PR's implementation. |
segv_stack.ss_flags = 0; | ||
segv_stack.ss_size = SEGV_STACK_SIZE; | ||
|
||
bzero(&action, sizeof(action)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May I suggest to use std::memset(&action), '\0', sizeof());
instead of bzero
(also #include <cstring>
for memset to be declared)? bzero
is not portable (not that we would need for it to be portable) and not part of the std C++ library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hkaiser sure thing! modifications made and pushed.
@ct-clmsn rethink-db's method is under Apache licensing, so even I am not so sure how safe it will be. |
The apache license allows redistribution under a new license, providing the original code keeps the apache license, so it would be compatible with HPX if any apache licensed code has been used directly. from https://www.whitesourcesoftware.com/whitesource-blog/top-10-apache-license-questions-answered/ "You can choose to release the modified or derived products under different licenses, the unmodified parts of the software, however, must retain the Apache License, and you cannot name your modified version in any way that suggests that the final product is endorsed or created by the ASF. Additionally, if you want to add a copyright statement about all the modifications that you’ve done to any Apache licensed software; you are free to do so. Since the Apache License doesn’t require you to release the modified code under the same license, you can choose to add specific license terms and conditions that govern how others use, reproduce, or distribute your modified code." |
@ABresting the implementation of this PR used tutorial information provided in the two blog posts (blogs provided examples and explanations - not licensed program source code) combined with my interpretation of posix/linux manual pages. @biddisco thanks! |
So to finish my thought on the subject - if we were to use the code, add the boost licence, add a note to say that the code was inspired by the blog posts #link1, #link2, &etc then since the author has not looked at the other library, it qualifies as not being a derivative work of the GPL'd code. It could be construed as a derivative of some apache stuff and we therefore note that, include the apache license and refer again to the articles, then I think we're ok. If anyone ever raises an objection, the code can be pulled until the issue is resolved - at that time. |
@biddisco: sounds like a plan. Better beg for forgiveness afterwards than to ask for permission upfront ;-) |
Thanks Chris. Code-wise this looks good now. I will try it out asap (I think @sithhell wanted to try it out as well). |
Great! Thank you! Should the error message include a suggestion about problem decomposition? |
hpx_option(HPX_WITH_THREAD_STACKOVERFLOW_DETECTION | ||
BOOL | ||
"Enable stackoverflow detection for HPX threads/coroutines. (default: ON)" | ||
ON ADVANCED) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make this OFF by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest leaving it ON
by default at least in Debug mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, would it make sense to make enable/disable a runtime property?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about the overheads this detection adds. I am leaning towards only enabling it on request (If posix/linux based platform), given that prerequiste, we can also safely enable it when the build type is debug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ct-clmsn Enable for debug builds by default, Disable by default for the remaining build types, I'd say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue we had back then was to supporting multi configuration builds. As those don't exist in any of the *nix ecosystems (or does WSL change the picture?), I would consider this a non problem for this particular feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh right, I withdraw my comment, then...
@@ -14,6 +14,7 @@ set(tests | |||
thread_launching | |||
thread_mf | |||
thread_stacksize | |||
thread_stacksize_overflow | |||
thread_suspension_executor | |||
thread_yield | |||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CMakeLists.txt here need to flag the added test as "will fail". I suggest something like this:
set_tests_properties(tests.unit.threads.thread_stacksize_overflow PROPERTIES
PASS_REGULAR_EXPRESSION "Stack overflow in coroutine at address 0x[0-9a-fA-F]*")
This will mark the test as passed when it fails with that output. That would require to add this test only when the overflow detection is activated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this should be enabled on Linux-based systems only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ct-clmsn I will do that. Let's go ahead with this as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once the comments are resolved, this PR is ready to be merged. This is a long needed addition.
<< "hpx.stacks.large_size, " | ||
<< "or hpx.stacks.huge_size runtime " | ||
<< std::endl | ||
<< "flags to configure coroutine heap sizes." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should read "stack sizes"
<< "hpx.stacks.large_size, " | ||
<< "or hpx.stacks.huge_size runtime " | ||
<< std::endl | ||
<< "flags to configure coroutine heap sizes." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
@hkaiser thank you for helping to finalize the merge! |
Initial implementation of stackoverflow detection for coroutines issue tracker #2408 . The source was derived from posix/linux system man pages, the rethinkdb link provided in the issue conversation thread, and a blog post about coroutine implementations.