New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
This patch makes sure that all parcels in a batch are properly handled #2652
Conversation
- the split_gids container is not a zombie anymore for anything but the first parcel
continue; | ||
|
||
// collect the necessary size the serialization operation | ||
hpx::serialization::detail::preprocess preprocess; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not work. preprocess needs to be kept alive as part of parcel_await. Once we have a future we need to wait on, the completion handlers are registered to exactly this object, once we leave the scope in line 63, we will get segfaults due to dangling pointers, for example here:
https://github.com/STEllAR-GROUP/hpx/blob/master/hpx/runtime/serialization/detail/preprocess.hpp#L71
That's also the reason, why we need to keep this alive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
archive_.reset(); | ||
archive_ << parcels_[idx_]; | ||
// send every parcel only once | ||
if (handled_[idx_]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's wrong. parcels need to get preprocessed as many times as there are no futures to be waited on. For example, a shared_futurehpx::id_type inside a parcels might need to be proprocessed three times. First, the future is not ready and needs to be waited on, second (now the future is ready) the id_type needs to be split (another future needs to be waited on), the third one is the last in the fixed point iteration to ensure everything is ready and the last round did indeed create a value and not an exception, that is, to calculate the final size of the data to be sent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this commit, you were right.
d490a92
to
4218539
Compare
Closing this as the problem has been addressed by #2619 |
@sithhell please verify