Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cmake flag -DHPX_WITH_FAULT_TOLERANCE=ON (OFF by default) #3170

Merged
merged 8 commits into from Feb 20, 2018

Conversation

KADichev
Copy link
Contributor

@KADichev KADichev commented Feb 13, 2018

to ignore connection reset signals of connections, so HPX does not terminate if a node crashes.

…re connection reset signals of connections, so HPX does not terminate if a node crashes.
hpx/runtime.hpp Outdated
/// component type. This a purely internal function allowing to work
/// around certain library specific problems related to dynamic
/// loading of external libraries.
HPX_EXPORT bool tolerate_node_faults();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment does not match the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, copy/paste :-( I'll add a proper comment

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KADichev ping?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just updated the branch. Not sure if anything else is required of me here. Never did PR before.

@hkaiser
Copy link
Member

hkaiser commented Feb 13, 2018

What is the rationale of making this functionality optional? Wouldn't it be better to be always able to handle this kind of exception?

@hkaiser hkaiser changed the title Add cmake flag -DHPX_WITH_FAULT_TOLERANCE=ON (OFF by default) to igno… Add cmake flag -DHPX_WITH_FAULT_TOLERANCE=ON (OFF by default) Feb 13, 2018
@KADichev
Copy link
Contributor Author

The functionality is optional, because I'm not sure if ignoring a "connection_reset" signal due to suddenly failing remote endpoints is the right way of handling this -- it is a workaround to avoid HPX raising an exception, and terminating, and allowing to explore recovery (which we are working on).

@hkaiser
Copy link
Member

hkaiser commented Feb 13, 2018

The functionality is optional, because I'm not sure if ignoring a "connection_reset" signal due to suddenly failing remote endpoints is the right way of handling this -- it is a workaround to avoid HPX raising an exception, and terminating, and allowing to explore recovery (which we are working on).

In the end this should be the default, it shouldn't even be possible to disable this. I would have no objections to go ahead with merging this PR if that's the eventual goal.

@sithhell
Copy link
Member

The goal should be to have a fault tolerance HPX in the end. I am all for having it optional, since there's always a trade off between fault tolerance and performance. This is also the first step in that direction so it makes sense to have it defaulted to being turned off for now.

@hkaiser
Copy link
Member

hkaiser commented Feb 13, 2018

The goal should be to have a fault tolerance HPX in the end. I am all for having it optional, since there's always a trade off between fault tolerance and performance.

Ok. Fair enough.

Copy link
Member

@hkaiser hkaiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks!

@KADichev
Copy link
Contributor Author

I fixed the comment and pushed the changes to the branch Let me know if you need anything else.

@hkaiser
Copy link
Member

hkaiser commented Feb 17, 2018

@msimberg msimberg merged commit 72821e6 into STEllAR-GROUP:master Feb 20, 2018
@msimberg msimberg added this to the 1.1.0 milestone Mar 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants