
Speeding up accessing the resource partitioner and the topology info #3015

Merged: 1 commit into master from performance_optimizations on Nov 21, 2017

Conversation

@hkaiser (Member) commented Nov 19, 2017:

  • this reduces the overheads in the scheduler significantly

```diff
@@ -47,7 +47,7 @@ namespace hpx { namespace threads { namespace detail
     ///////////////////////////////////////////////////////////////////////////
     mask_cref_type thread_pool_base::get_used_processing_units() const
     {
-        std::lock_guard<pu_mutex_type> l(used_processing_units_mtx_);
+        // std::lock_guard<pu_mutex_type> l(used_processing_units_mtx_);
```
Member commented:

That lock is needed for concurrent enabling/disabling of PUs.

@hkaiser (Member, Author) commented:

This function is called constantly from a hot loop inside the scheduler. If the lock is really needed, we have to find a way not to use this function, as it accounts for about 50% of the speedup I was seeing (5% of the overall runtime).

Member commented:

Back when I first implemented it, I measured the same. The real question, however, is whether we support that use case or not. Leaving out the lock might speed things up, but it also eliminates a use case.

Contributor commented:

Well, the obvious thing to do is to have different flags set for the schedulers at creation time. If the user knows that they are not hot-swapping PUs from pools, then this functionality is not required; we can add it to the background_work and similar flags used at scheduler creation time.

Contributor commented:

We should add a 'dynamic_pool' flag (or similar) to the pool or scheduler when it is created, and if the user attempts to change it at runtime it can throw if necessary; a sketch of this idea is shown below.
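A minimal sketch of that suggestion, assuming a boolean creation-time flag. All names here (`thread_pool_sketch`, `enable_processing_unit`) are hypothetical and only mirror the HPX code under discussion:

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <stdexcept>

using mask_type = std::uint64_t;

class thread_pool_sketch
{
public:
    explicit thread_pool_sketch(bool dynamic_pool)
      : dynamic_pool_(dynamic_pool)
    {}

    // Runtime PU changes throw unless the pool was created as dynamic.
    void enable_processing_unit(std::size_t pu)
    {
        if (!dynamic_pool_)
            throw std::runtime_error(
                "pool was not created with the dynamic_pool flag");
        std::lock_guard<std::mutex> l(mtx_);
        mask_ |= (mask_type(1) << pu);
    }

    // Static pools can skip the lock entirely on this hot path.
    mask_type get_used_processing_units() const
    {
        if (!dynamic_pool_)
            return mask_;
        std::lock_guard<std::mutex> l(mtx_);
        return mask_;    // returned by value, so no reference escapes
    }

private:
    bool const dynamic_pool_;
    mutable std::mutex mtx_;
    mask_type mask_ = 0;
};
```

With the flag fixed at construction, a static pool pays no synchronization cost on the hot path at all.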

@hkaiser (Member, Author) commented:

Alternatively, we could cache the mask inside the scheduler and update it on an as-needed basis, i.e. only whenever PUs have been enabled or disabled; see the sketch below.
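One way that caching could look, sketched with hypothetical names (`scheduler_sketch`, `set_processing_unit`) and an atomic snapshot so the scheduler's hot path never takes the lock:

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <mutex>

using mask_type = std::uint64_t;

class scheduler_sketch
{
public:
    // Hot path: lock-free read of the cached mask.
    mask_type used_processing_units() const
    {
        return cached_mask_.load(std::memory_order_acquire);
    }

    // Cold path: called only when a PU is enabled or disabled.
    void set_processing_unit(std::size_t pu, bool enabled)
    {
        std::lock_guard<std::mutex> l(mtx_);
        if (enabled)
            mask_ |= (mask_type(1) << pu);
        else
            mask_ &= ~(mask_type(1) << pu);
        cached_mask_.store(mask_, std::memory_order_release);
    }

private:
    std::mutex mtx_;                        // serializes writers only
    mask_type mask_ = 0;                    // authoritative copy
    std::atomic<mask_type> cached_mask_{0}; // snapshot read by the scheduler
};
```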

@biddisco (Contributor) left a comment:

Good work. Thanks for spotting this.

- this reduces the overheads in the scheduler significantly
@biddisco (Contributor) commented:

I wanted to test this, so I fixed the commented-out mutex and amended the commit.

@hkaiser (Member, Author) commented Nov 21, 2017:

@biddisco, @sithhell: the current code protecting the mask is broken anyway, as the function returns a reference; see https://github.com/STEllAR-GROUP/hpx/blob/master/src/runtime/threads/detail/thread_pool_base.cpp#L48-L52.

The only option I see now is to implement what @biddisco suggested, namely to introduce a flag for the pools that enables dynamic behavior on demand (besides fixing the way the variable is protected; the sketch below illustrates the problem and one possible fix). Thoughts?
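To make the reference problem concrete, a minimal sketch with simplified types (not the HPX source): the lock_guard is destroyed when the getter returns, so the caller reads through the escaped reference without any protection while another thread may be enabling or disabling a PU. Returning by value copies under the lock instead:

```cpp
#include <cstdint>
#include <mutex>

using mask_type = std::uint64_t;
using mask_cref_type = mask_type const&;
using pu_mutex_type = std::mutex;

struct pool_sketch
{
    // broken: the lock only covers the return statement, not the
    // caller's subsequent reads through the escaped reference
    mask_cref_type get_used_processing_units_ref() const
    {
        std::lock_guard<pu_mutex_type> l(used_processing_units_mtx_);
        return used_processing_units_;   // lock released here
    }

    // one possible fix: copy while the lock is held, return by value
    mask_type get_used_processing_units() const
    {
        std::lock_guard<pu_mutex_type> l(used_processing_units_mtx_);
        return used_processing_units_;   // copied under the lock
    }

    mutable pu_mutex_type used_processing_units_mtx_;
    mask_type used_processing_units_ = 0;
};
```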

@hkaiser (Member, Author) commented Nov 21, 2017:

@sithhell, can we go ahead with applying this patch now?

@sithhell (Member) left a comment:

LGTM

@sithhell merged commit d421371 into master on Nov 21, 2017
@sithhell deleted the performance_optimizations branch on Nov 21, 2017 at 06:51
@biddisco (Contributor) commented:

I suspect that this was merged using my version, which had the mutex put back; this goes against what @hkaiser wanted, I believe.

@hkaiser (Member, Author) commented Nov 21, 2017:

@biddisco, I think we can merge this, as the mutex problem was there before and the patch improves performance as is. We need to solve it, though, as the current code is both slow and broken.
