Support Checkpointing Components #3159

Merged: 5 commits merged into STEllAR-GROUP:master on Nov 1, 2019

Contributor

@aserio aserio commented Feb 8, 2018

Extended current implementation of hpx::util::checkpoint to properly handle components.

Summary:

  • Fixed checkpoint.cpp to make floats constant
    • Makes MSVC happy
  • Adding checkpoint_component.cpp
  • Raise abstraction level in checkpoint
    • Create new function arch_data that will be overloaded to properly
      handle components
  • Adding in components to test
  • Working component checkpointing!
    • Checkpoints a server
    • Still needs to create a new client
  • Create new client on restored server
  • Adding functionality to enable checkpointing of clients
    • Checkpoints the server the client points to
    • Still need to add functionality to restore which would
      create a new client with the resurrected server
  • Allowing components to be restored with provided client
    • Users can now use clients to checkpoint and restore
      servers
  • Updating Documentation
  • Preparing unit test for checkpointing components
  • Fixing 1d_stencil_4_checkpoint example for Windows/Mac
  • Fixing compilation on gcc
  • Adding documentation for checkpointing components
  • Clean up code with Clang-Format
    • Update year on license
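
The client-based flow in the summary (checkpoint the server a client points to, then restore into a provided client that is bound to a resurrected server) can be sketched roughly as follows. This is a conceptual illustration in plain C++ only: `counter_server`, `counter_client`, `save_client`, and `restore_client` are hypothetical stand-ins and not part of `hpx::util::checkpoint` or any HPX API.

```cpp
// Conceptual sketch only: plain C++ stand-ins, not HPX's actual API.
#include <cassert>
#include <cstring>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical "server": owns the state that gets checkpointed.
struct counter_server {
    int value = 0;
};

// Hypothetical "client": a lightweight handle that refers to a server.
struct counter_client {
    std::shared_ptr<counter_server> server;
};

// Hypothetical checkpoint: an opaque byte buffer.
using checkpoint_bytes = std::vector<unsigned char>;

// Saving a client stores the state of the server it points to,
// not the handle itself.
checkpoint_bytes save_client(counter_client const& c) {
    checkpoint_bytes bytes(sizeof(int));
    std::memcpy(bytes.data(), &c.server->value, sizeof(int));
    return bytes;
}

// Restoring resurrects a new server from the bytes and rebinds the
// caller-provided client to it.
void restore_client(checkpoint_bytes const& bytes, counter_client& c) {
    auto resurrected = std::make_shared<counter_server>();
    std::memcpy(&resurrected->value, bytes.data(), sizeof(int));
    c.server = std::move(resurrected);
}

// Round trip: true when the restored client sees the saved state
// through a server that is distinct from the original one.
bool round_trip_preserves_state() {
    counter_client original{std::make_shared<counter_server>()};
    original.server->value = 42;
    checkpoint_bytes bytes = save_client(original);
    original.server->value = 7;  // later changes must not affect the checkpoint
    counter_client restored;
    restore_client(bytes, restored);
    return restored.server->value == 42 &&
           restored.server != original.server;
}
```

The point of the sketch is the ownership split: the checkpoint holds server state, while the client is only the handle through which that state is saved and later reattached.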

hkaiser
hkaiser previously approved these changes Feb 9, 2018
Member

@hkaiser hkaiser left a comment

Very nice! Thanks!

Member

@sithhell sithhell left a comment

Serialization of futures should be handled correctly. This will also require implementing something like what we have for parcels.

@hkaiser
Member

hkaiser commented Feb 22, 2018

I looked through the code we have to see what we can do to come to a mutually acceptable solution for this. I think we can create a specialization of hpx::traits::serialization_access_data<> to be used for check-pointing that would fully handle the specifics. This trait already isolates the handling of (general) futures and of id_types from the actual serialization process. Providing a new specialization for check-pointing should allow us to do the dereferencing of id_types that we need.
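
To make the general idea concrete, here is a rough sketch of selecting id handling through a trait specialized on the archive type. It does not reproduce the actual interface of `hpx::traits::serialization_access_data<>`; every name below (`handle`, `registry`, `wire_archive`, `checkpoint_archive`, `handle_access`, `save`) is a hypothetical stand-in written in plain C++.

```cpp
// Conceptual sketch only: none of these names are HPX's.
#include <cassert>
#include <map>
#include <string>

// Hypothetical handle playing the role of an id.
struct handle {
    int key;
};

// Hypothetical registry mapping handles to the state they refer to.
inline std::map<int, std::string>& registry() {
    static std::map<int, std::string> r;
    return r;
}

// Two hypothetical archive kinds: ordinary serialization and checkpointing.
struct wire_archive {
    std::string out;
};
struct checkpoint_archive {
    std::string out;
};

// Primary trait: write the handle itself.
template <typename Archive>
struct handle_access {
    static void write(Archive& ar, handle h) {
        ar.out += "id:" + std::to_string(h.key);
    }
};

// Specialization for checkpointing: dereference the handle and write
// the state it refers to instead of the handle.
template <>
struct handle_access<checkpoint_archive> {
    static void write(checkpoint_archive& ar, handle h) {
        ar.out += "state:" + registry().at(h.key);
    }
};

// The serialization code stays generic; the trait decides how handles
// are treated for each archive kind.
template <typename Archive>
void save(Archive& ar, handle h) {
    handle_access<Archive>::write(ar, h);
}
```

With this shape, the generic `save` never needs to know about checkpointing; only the specialization carries the dereferencing logic.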

@msimberg msimberg removed this from the 1.1.0 milestone Mar 22, 2018
@msimberg
Contributor

What's the status of this PR?

@hkaiser
Member

hkaiser commented Oct 13, 2018

@msimberg this still needs some work on the serialization end.

@stale

stale bot commented Jul 4, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the tag: wontfix label Jul 4, 2019
@stale

stale bot commented Aug 3, 2019

This issue has been automatically closed. Please re-open if necessary.

@stale stale bot closed this Aug 3, 2019
@hkaiser
Member

hkaiser commented Aug 3, 2019

Let's keep this open for now. This is needed functionality.

@hkaiser hkaiser reopened this Aug 3, 2019
@stale stale bot removed the tag: wontfix label Aug 3, 2019
@hkaiser hkaiser added the tag: pinned Never close as stale label Aug 3, 2019
@hkaiser hkaiser mentioned this pull request Sep 30, 2019
@hkaiser
Member

hkaiser commented Oct 22, 2019

@aserio I have rebased and slightly cleaned up this branch for the new serialization module. We agreed with @sithhell that we will not support check-pointing managed id_types; if a user needs to check-point a component, they can use clients. Please verify and review.

@hkaiser hkaiser added this to the 1.4.0 milestone Oct 22, 2019
 - Fixed checkpoint.cpp to make floats constant
   -> Makes MSVC happy
 - Adding checkpoint_component.cpp
 - Raise abstraction level in checkpoint
   -> Create new function arch_data that will be overloaded to properly
      handle components
 - Adding in components to test
 - Working component checkpointing!
   -> Checkpoints a server
   -> Still needs to create a new client
 - Create new client on restored server
 - Adding functionality to enable checkpointing of clients
   -> Checkpoints the server the client points to
   -> Still need to add functionality to restore which would
      create a new client with the resurrected server
 - Allowing components to be restored with provided client
   -> Users can now use clients to checkpoint and restore
      servers
 - Updating Documentation
 - Preparing unit test for checkpointing components
 - Fixing 1d_stencil_4_checkpoint example for Win/Mac
 - Fixing compilation on gcc
 - Adding documentation for checkpointing components
 - Clean up code with Clang-Format
   -> Update year on license
 - Created `prep` function to handle clients
   -> Ensures `get_id` will be ready
   -> Passes future of `get_ptr` to dataflow
 - Removes a "bad example" from
   `checkpoint_component.cpp`
 - Adding headers for inspect
@hkaiser
Member

hkaiser commented Oct 31, 2019

This should be good to go now. Please review.

@aserio
Contributor Author

aserio commented Oct 31, 2019

LGTM!

@hkaiser
Member

hkaiser commented Nov 1, 2019

The errors on pycicle are unrelated.

@hkaiser hkaiser merged commit 9568da2 into STEllAR-GROUP:master Nov 1, 2019