Speeding up accessing the resource partitioner and the topology info #3015
Conversation
hkaiser
commented
Nov 19, 2017
- this reduces the overheads in the scheduler significantly
@@ -47,7 +47,7 @@ namespace hpx { namespace threads { namespace detail
     ///////////////////////////////////////////////////////////////////////////
     mask_cref_type thread_pool_base::get_used_processing_units() const
     {
-        std::lock_guard<pu_mutex_type> l(used_processing_units_mtx_);
+        // std::lock_guard<pu_mutex_type> l(used_processing_units_mtx_);
That lock is needed for concurrent enabling/disabling of PUs
This function is called constantly from a hot loop inside the scheduler. If the lock is really needed we have to find a way not to use this function, as it accounts for about 50% of the speedup I was seeing (5% of the overall runtime).
Back when I first implemented it, I measured the same. The real question, however, is whether we support that use case or not. Leaving out the lock might speed things up, but it also eliminates a use case.
Well, the obvious thing to do is to have different flags set for the schedulers at creation time. If the user knows that they are not hot-swapping PUs from pools, then this functionality is not required; we can add it to the background_work and suchlike flags used at scheduler creation time.
We should add a 'dynamic_pool' flag (or similar) to the pool or scheduler when it is created, and if the user attempts to change it at runtime it can throw if necessary...
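The proposal above could look roughly like the following sketch. All names here (`pool_flags`, `dynamic_pool`, `enable_pu`) are illustrative assumptions, not HPX's actual API:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical flag set checked at pool creation time.
enum pool_flags : std::uint32_t
{
    none         = 0,
    dynamic_pool = 1    // pool allows enabling/disabling PUs at runtime
};

struct thread_pool
{
    std::uint32_t flags_;

    explicit thread_pool(std::uint32_t flags) : flags_(flags) {}

    void enable_pu(std::size_t /*pu*/)
    {
        // Throw if the user attempts a runtime PU change on a pool
        // that was not created as dynamic.
        if (!(flags_ & dynamic_pool))
        {
            throw std::logic_error(
                "pool was not created with dynamic_pool; "
                "runtime PU changes are not supported");
        }
        // ... otherwise perform the locked update of the PU mask ...
    }
};
```

A static pool would then skip the locking entirely on the hot path, since PU changes are rejected up front.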
Alternatively, we could cache the mask inside the scheduler and update it on an as-needed basis, i.e. only whenever PUs have been enabled/disabled.
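The caching idea might be sketched as follows. The mask type and member names are assumptions for illustration (a `std::bitset` standing in for HPX's mask type): the expensive recomputation happens only on the cold enable/disable path, while the hot path just copies the cached value out under the lock.

```cpp
#include <bitset>
#include <cstddef>
#include <mutex>

using mask_type = std::bitset<64>;    // stand-in for the HPX mask type

class scheduler
{
    mutable std::mutex mtx_;
    mask_type used_pus_;    // cached copy, updated only on PU changes

public:
    // Hot path: no recomputation from the topology, just a copy of
    // the cached mask taken under the lock.
    mask_type get_used_processing_units() const
    {
        std::lock_guard<std::mutex> l(mtx_);
        return used_pus_;    // returned by value, safe after unlock
    }

    // Cold path: runs only when a PU is enabled or disabled.
    void set_pu_enabled(std::size_t pu, bool enabled)
    {
        std::lock_guard<std::mutex> l(mtx_);
        used_pus_.set(pu, enabled);
    }
};
```

Note that returning the mask by value (rather than by reference) is what makes the lock meaningful here; the caller never reads shared state after the lock is released.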
Good work. Thanks for spotting this
Force-pushed from d0e95d2 to 7afcce6:
- this reduces the overheads in the scheduler significantly
I wanted to test this, so I fixed the commented-out mutex and amended the commit.
@biddisco, @sithhell: the current code protecting the mask is broken anyway, as the function returns a reference, see here: https://github.com/STEllAR-GROUP/hpx/blob/master/src/runtime/threads/detail/thread_pool_base.cpp#L48-L52. The only option I see now is to implement what @biddisco suggested, namely to introduce a flag for the pools enabling dynamic behavior on demand (besides fixing the way the variable is protected). Thoughts?
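The breakage being pointed out can be illustrated with a minimal sketch (types are stand-ins, not the actual HPX declarations): the `lock_guard` is destroyed when the function returns, so the caller reads through the returned reference with no protection at all, and the only safe variant is to copy the mask out while the lock is held.

```cpp
#include <bitset>
#include <mutex>

using mask_type = std::bitset<64>;        // stand-in for the HPX mask type
using mask_cref_type = mask_type const&;

struct pool_sketch
{
    mutable std::mutex mtx_;
    mask_type used_processing_units_;

    // Broken: the lock is released on return, but the caller still
    // reads through the reference afterwards, unprotected.
    mask_cref_type get_used_processing_units_broken() const
    {
        std::lock_guard<std::mutex> l(mtx_);
        return used_processing_units_;    // lock gone before caller reads
    }

    // Fixed: copy the mask out while the lock is still held.
    mask_type get_used_processing_units() const
    {
        std::lock_guard<std::mutex> l(mtx_);
        return used_processing_units_;    // returned by value
    }
};
```

In other words, the mutex in the current code buys nothing: it protects the duration of the call, not the caller's subsequent reads.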
@sithhell can we go ahead with applying this patch now?
LGTM
I suspect that this was merged using my version, which had the mutex put back - this goes against what @hkaiser wanted, I believe.
@biddisco I think we can merge this, as the mutex problem was there before and the patch improves performance as is. We need to solve it, though, as the current code is both slow and broken.