Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incorrect pool usage masks setup in RP/thread manager #2878

Merged
merged 2 commits into from Sep 20, 2017
Merged

Conversation

biddisco
Copy link
Contributor

Just a minor tweak to get the pool_usage_mask set correctly

Copy link
Member

@hkaiser hkaiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a test verifying the problem is fixed.

@@ -95,7 +95,7 @@ namespace hpx { namespace threads { namespace detail
auto const& rp = resource::get_partitioner();
for (std::size_t i = 0; i != pool_threads; ++i)
{
bit_or(used_processing_units_, rp.get_pu_mask(threads_offset + i));
used_processing_units_ |= rp.get_pu_mask(threads_offset + i);
Copy link
Member

@hkaiser hkaiser Aug 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not work for all configurations. I think the correct fix would be:

used_processing_units_ = 
    bit_or(used_processing_units_, rp.get_pu_mask(threads_offset + i));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.

bit_or(a,b)

returns a boolean true or false depending on the two masks and so it only sets the lowest bit. We actually want the true bitwise or operation and the submitted patch is correct. if CPU's <= 64 then a uint64_t is bitwise or'ed (ok), if CPUs>64, then a bitset<blah> is or'ed.
I believe my patch is correct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks for pointing this out. Still. there is a third option, namely if we use boost::dynamic_bitset. I'm not sure if operator|=() is provided in that case. IIRC, it provides only operator|().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I was mistaken about the small CPU count version.
boost::dynamic_bitset has the != operator and it has the signature
dynamic_bitset& operator|=(const dynamic_bitset& b);
which is exactly what we want in this case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Thanks for clarifying.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we can merge then yes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, could we still add that test, please?

@biddisco biddisco dismissed hkaiser’s stale review September 5, 2017 12:41

I believe the patch is good.

@sithhell
Copy link
Member

#2891 Does not specifically test for the used processing unit, however it uses the mask to test against the currently used processing units in the RP. Should we mark this as duplicate and consider the added test case as good enough to check for correctness of the mask?

@hkaiser
Copy link
Member

hkaiser commented Sep 11, 2017

#2891 Does not specifically test for the used processing unit, however it uses the mask to test against the currently used processing units in the RP. Should we mark this as duplicate and consider the added test case as good enough to check for correctness of the mask?

How much effort would it be to add an explicit check for this to this test?

@biddisco biddisco added this to Work in Progress in resource partitioner/manager work Sep 12, 2017
Copy link
Member

@sithhell sithhell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit test has been added. This looks good to me.

int main(int argc, char* argv[])
{
std::vector<std::string> cfg = {
"hpx.os_threads=4"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this is a good idea? The pu mask will be 1 for each pu used. If you run this test on a single core machine, then asking for 4 threads will set the pu flag, but only once and not 4 times. So bits in pu mask will not match num threads.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, the only thing that will fail in this scenario will the test on line 25 then... I guess having at least 4 PUs available is a prerequisite of this test (this is also true for the other RP test)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's sensible. If the machine being tested on has < N cores, - just skip the test and return pass. There is no point testing unless N is useful.

Copy link
Member

@hkaiser hkaiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@hkaiser hkaiser merged commit 2e05958 into master Sep 20, 2017
@hkaiser hkaiser deleted the fix_rp_again branch September 20, 2017 12:25
@biddisco biddisco moved this from Work in Progress to completed in resource partitioner/manager work Oct 11, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

3 participants