New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix scheduling loop exit #3069
Fix scheduling loop exit #3069
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
@@ -440,7 +440,7 @@ namespace hpx { namespace threads { namespace detail | |||
// threads exist or if some other thread has terminated | |||
HPX_ASSERT( | |||
!sched_->Scheduler::get_thread_count( | |||
unknown, thread_priority_default, thread_num) || | |||
suspended, thread_priority_default, thread_num) || | |||
sched_->Scheduler::get_state(thread_num) > state_stopping); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this looking just for suspended
threads and not for all of them as before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the exit condition for the scheduling loop checks for suspended
now (in addition to emptying the work queues). And the exit criteria changed to allow removing/suspending the OS thread on which the removing/suspending HPX thread was started. suspended
is a sufficient (but slightly overkill) condition to do this, while unknown
is too much (i.e. can lead to hangups).
The assert could be expanded to check that work_items_count
and new_tasks_count
are both 0 as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That might be a good idea, indeed. Thanks!
@@ -672,6 +672,7 @@ namespace hpx { namespace threads { namespace detail | |||
{ | |||
// clean up terminated threads one more time before sleeping | |||
bool can_exit = | |||
!running && | |||
scheduler.SchedulingPolicy::cleanup_terminated( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it intentional to call cleanup_terminated
only if !running
evaluates to true
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not 100%. It could be either way, but I put it first since when running
is true
the "idle" cleanup_terminated
will be called eventually if there's no work to do (at the end of scheduling_loop
).
This fixes occasional asserts firing on throttle tests. Because the suspension/throttling changes relaxed scheduling loop exit criteria, the assert after exiting also needs to be relaxed.