-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changed Forkable::ancestry_ to an absl::InlinedVector. #2776
Conversation
[Automated message from GitHub Pull Request Builder] Answer "ok to test" to trigger testing of this PR. |
ok to test |
I don’t think you can see the CI results, so copying that here: the linter says
|
ok to test |
Previously it was a deque. This change elides allocations in most cases. Note that as part of the change, the direction of ancestry was reversed (i.e. front and back have been swapped). This is only a notational change and has no effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution.
Would you mind writing a small benchmark? Something that creates a trajectory, adds a few points, creates 2-3 forks, and iterates over the entire trajectory and the forks. I'd imagine that the performance difference would be fairly clear (or whatever variation thereof where the issue is the clearest). We don't have benchmarks that are very clean in terms of the performance of DiscreteTrajectory at the moment, so it would be good to make sure that we don't create performance regressions in the future.
physics/forkable.hpp
Outdated
TimelineConstIterator current_; | ||
std::deque<not_null<Tr4jectory const*>> ancestry_; // Pointers not owned. | ||
absl::InlinedVector<not_null<Tr4jectory const*>, 2> | ||
ancestry_; // Pointers not owned. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the comment, it's a relic of the distant past.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
physics/forkable_body.hpp
Outdated
ancestry_.erase(ancestry_.begin(), --ancestry_.end()); | ||
current_ = ancestry_.front()->timeline_end(); | ||
if (current_ == ancestry_.back()->timeline_end() && ancestry_.size() > 1) { | ||
ancestry_ = {ancestry_.front()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd imagine that ancestry_.resize(1)
would be more efficient in terms of heap management and a bit more readable IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that resize would be more readable, but it doesn't compile since not_null<> has no default constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let’s make it an erase(ancestry_.begin() + 1, ancestry_.end())
then.
physics/forkable.hpp
Outdated
TimelineConstIterator current_; | ||
std::deque<not_null<Tr4jectory const*>> ancestry_; // Pointers not owned. | ||
absl::InlinedVector<not_null<Tr4jectory const*>, 2> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would think that 3 would work better. While most trajectories only have 2 elements in this member, most trajectories will go up to 3 (history, psychohistory, prediction) at some point in their lifetime, we might as well reserve space for that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I added a comment with the reasoning.
physics/forkable_body.hpp
Outdated
@@ -285,7 +283,8 @@ LowerBound(Instant const& time) const { | |||
// We found an interesting timeline, i.e. one that is nonempty and | |||
// not forked at the fork point of its parent. Cut the ancestry and | |||
// return the beginning of that timeline. | |||
iterator.ancestry_.erase(iterator.ancestry_.begin(), ancestry_it); | |||
iterator.ancestry_ = {ancestry_it, iterator.ancestry_.rend()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems wrong. The construction using reverse iterators is going to effectively reverse the range, so the element at rend-1
, i.e., at begin
will land at the back. I would stick to erase
here. It would be good to add a test covering this situation: you'll need a trajectory with many forks, call LowerBound
with a time near the middle, and check that the resulting iterator gives you data in the right order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct; it is wrong. I added a new test and changed it to erase.
(Please ignore the CI failures, they are bogus.) |
I added a benchmark. Here are the results.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this impressive performance improvement; just a few nitpicks, looks good.
benchmarks/discrete_trajectory.cpp
Outdated
namespace { | ||
|
||
// Creates a trajectory with the given number of steps. | ||
std::unique_ptr<DiscreteTrajectory<World>> CreateTrajectory(int const steps) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: not_null<std::unique_ptr<DiscreteTrajectory<World>>>
benchmarks/discrete_trajectory.cpp
Outdated
auto trajectory = std::make_unique<DiscreteTrajectory<World>>(); | ||
DegreesOfFreedom<World> const d(World::origin, Velocity<World>()); | ||
Instant t; | ||
for (int i = 0; i < steps; i++, t += Second) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We tend to write 1 * Second
etc., it is more readable (and since are those are all constexpr the compilers are smart enough).
benchmarks/discrete_trajectory.cpp
Outdated
BENCHMARK(BM_DiscreteTrajectoryLowerBound)->Range(8, 1024); | ||
|
||
} // namespace physics | ||
} // namespace principia |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New line at end of file (the latest CI failure was the linter complaining about this).
benchmarks/discrete_trajectory.cpp
Outdated
// |parent| should be nonempty. | ||
// |pos| should be in [0, 1]. | ||
not_null<DiscreteTrajectory<World>*> ForkAt(DiscreteTrajectory<World>& parent, | ||
float const pos) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let’s not mix precisions, we use double
everywhere (it gets converted to double
where it is used on line 45 anyway).
physics/forkable.hpp
Outdated
@@ -84,10 +84,11 @@ class ForkableIterator { | |||
void CheckNormalizedIfEnd(); | |||
|
|||
// |ancestry_| is never empty. |current_| is an iterator in the timeline | |||
// for |ancestry_.back()|. |current_| may be at end. | |||
// for |ancestry_.back()|. |current_| may be at end. The inline size of 3 | |||
// for |ancestry_| is intended to cover a trajectory's history, psychohistory, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A vessel’s history, psychohistory, and prediction.
physics/forkable_body.hpp
Outdated
@@ -239,8 +239,8 @@ LowerBound(Instant const& time) const { | |||
// This queue is parallel to |iterator.ancestry_|, and each entry is an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a queue anymore.
physics/forkable_test.cpp
Outdated
fork->NewFork(fork->timeline_find(t2_)); | ||
fork2->push_back(t3_); | ||
|
||
auto lower_bound_it = fork2->LowerBound(t1_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const the iterators.
benchmarks/discrete_trajectory.cpp
Outdated
|
||
// Creates a trajectory with the given number of steps. | ||
std::unique_ptr<DiscreteTrajectory<World>> CreateTrajectory(int const steps) { | ||
auto trajectory = std::make_unique<DiscreteTrajectory<World>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make_not_null_unique
benchmarks/discrete_trajectory.cpp
Outdated
} // namespace | ||
|
||
void BM_DiscreteTrajectoryFront(benchmark::State& state) { | ||
std::unique_ptr<DiscreteTrajectory<World>> trajectory = CreateTrajectory(4); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const the pointers that are const (not_null<...> const
or auto const
, everywhere in this file).
physics/forkable_body.hpp
Outdated
ancestry_.erase(ancestry_.begin(), --ancestry_.end()); | ||
current_ = ancestry_.front()->timeline_end(); | ||
if (current_ == ancestry_.back()->timeline_end() && ancestry_.size() > 1) { | ||
ancestry_ = {ancestry_.front()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let’s make it an erase(ancestry_.begin() + 1, ancestry_.end())
then.
Previously it was a deque.
This change elides allocations in most cases.
Note that as part of the change, the direction of ancestry was reversed
(i.e. front and back have been swapped). This is only a notational
change and has no effect.
The inlined size of two was decided upon by analyzing a journal. Most (~80%) instances of ancestry_ eventually got to size 2, whereas only 10% got to size 3.
I also ran the benchmarks. I wasn't sure which ones to run, so I ran all of them. The results are
attached here.
Some context: At some point I noticed my game had become laggy and profiled it to see why. It turned out that a lot of time was being spent allocating/deallocating Forkable::ancestry_. Changing it to an InlinedVector reduced time per frame spent waiting on Principia from ~40ms to ~7ms.