Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash while descending towards the surface of a body #2811

Closed
RCrockford opened this issue Dec 4, 2020 · 7 comments · Fixed by #2817
Closed

Crash while descending towards the surface of a body #2811

RCrockford opened this issue Dec 4, 2020 · 7 comments · Fixed by #2817

Comments

@RCrockford
Copy link
Contributor

RCrockford commented Dec 4, 2020

A fairly repeatable crash while undergoing powered descent towards the surface of the Moon.

    @   00007FF82D1E640F  	google::LogMessageFatal::~LogMessageFatal [0x00007FF82D1E640E+46] (C:\Users\rcrockford\Documents\GitHub\Google\glog\src\logging.cc:2109)
    @   00007FFFF34A9214  	principia::astronomy::internal_orbit_recurrence::OrbitRecurrence::OrbitRecurrence [0x00007FFFF34A9213+675] (C:\Users\rcrockford\Documents\GitHub\Principia\astronomy\orbit_recurrence_body.hpp:27)
    @   00007FFFF34B93B3  	principia::ksp_plugin::internal_orbit_analyser::OrbitAnalyser::RepeatedlyAnalyseOrbit [0x00007FFFF34B93B2+4834] (C:\Users\rcrockford\Documents\GitHub\Principia\ksp_plugin\orbit_analyser.cpp:163)
    @   00007FFFF34B1DAF  	std::thread::_Invoke<std::tuple<`principia::base::internal_jthread::MakeStoppableThread<`principia::ksp_plugin::internal_orbit_analyser::OrbitAnalyser::Restart'::`2'::<lambda_1> >'::`2'::<lambda_1>,principia::base::internal_jthread::stop_token>,0,1> [0x00007FFFF34B1DAE+78] (C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29333\include\thread:43)
    @   00007FF84D8410B2  	beginthreadex [0x00007FF84D8410B1+321]
    @   00007FF84E637C24  	BaseThreadInitThunk [0x00007FF84E637C23+19]
    @   00007FF84FCCD4D1  	RtlUserThreadStart [0x00007FF84FCCD4D0+32]
F1204 18:28:54.317481 13504 orbit_recurrence_body.hpp:27] Check failed: Sign::OfNonZero(??) == sign_C?? (- vs. +) [-2147483648; -2147483648; 1]

Journal
Log

Note, this occurred on a locally created build.

@eggrobin
Copy link
Member

eggrobin commented Dec 5, 2020

I can’t trivially decode the stack with our usual script since this is a custom build, but since it runs on the machine where it was built it comes with symbols already so all is well.

We end up here with νₒ = Dᴛₒ = -2147483648, Cᴛₒ = 1, which is obviously a problem:

inline OrbitRecurrence::OrbitRecurrence(int const νₒ,
int const Dᴛₒ,
int const Cᴛₒ)
: νₒ_(νₒ), Dᴛₒ_(Dᴛₒ), Cᴛₒ_(Cᴛₒ) {
CHECK_NE(Cᴛₒ, 0) << *this;
Sign const sign_Cᴛₒ = Sign::OfNonZero(Cᴛₒ);
if (νₒ != 0) {
CHECK_EQ(Sign::OfNonZero(νₒ), sign_Cᴛₒ) << *this;
}

The call is here, so the -2147483648 is probably UB from a non-representable double to int.

int const νₒ = std::nearbyint(κ);
int const Dᴛₒ = std::nearbyint((κ - νₒ) * Cᴛₒ);
return OrbitRecurrence(νₒ, Dᴛₒ, Cᴛₒ);

We should probably be using 64-bit integers in general, but here there is something more worrying.

// νₒ is the number of orbits per day, rounded to the nearest integer.

At first glance it seems that 2147483648 orbits per day would still be 24855 orbits per second, which is too much (even if the orbit goes underground, since the spacecraft being analysed is above ground, the orbit cannot be shorter than the time required to fall to the centre of the Earth).

However, note that κ is computed as

AngularFrequency const& Ωʹ = nodal_precession;
AngularFrequency const& Ωʹᴛ = primary.angular_frequency();
// Nodal mean motion.
AngularFrequency const& nd = 2 * π * Radian / nodal_period;
// Daily recurrence frequency, see (7.41).
double const κ = nd / (Ωʹᴛ - Ωʹ);
and recall that
// In this class, the word “day” is used to denote a revolution of the primary
// in the reference frame fixing the orbital plane. Note that this day depends
// on the nodal precession of the satellite: it is not any of the usual
// definitions of the day:
// — for a sun-synchronous orbit (see sections 7.5.4, 7.5.5), this day is the
// mean solar day;
// — for an orbit with no nodal precession (i.e., a strictly polar orbit, see
// section 7.1.3), this day is the stellar day.
// These days are counted negatively for a body with retrograde rotation.
so that for an orbit with a very high nodal precession Ωʹᴛ, the “day” would be very long, and thus it is possible that we are indeed reaching those numbers with the underground orbits of descent.

@RCrockford
Copy link
Contributor Author

RCrockford commented Dec 5, 2020

Worth nothing that this occurred above the Moon as such the "day" is longer to start with, also the "orbit" had a high apoapsis approximately 10 Mm above the surface. I suspect that this ties in with your analysis.

@eggrobin
Copy link
Member

eggrobin commented Dec 6, 2020

Hm, I am very puzzled, replaying the journal (removing the pointer consistency checks and forcing all analysis requests to be executed by peppering the analyser with joins) I am unable to replicate the check failure, and the highest value of κ I observe is 6.2894…

@eggrobin
Copy link
Member

eggrobin commented Dec 6, 2020

On second thought the hypothesis according to which we are analysing underground orbits does not make much sense, considering that the integrator should stop when we are underground.

@eggrobin
Copy link
Member

eggrobin commented Dec 8, 2020

It seems that this is instead plain old UB, we are accessing a vector at index -1.

@eggrobin
Copy link
Member

eggrobin commented Dec 9, 2020

This in turn happens because the sidereal period is negative (-3.82112421336217958e+04 s, about -10 h 37 min).

@eggrobin
Copy link
Member

The sidereal period is negative because the trajectory is absurd.
In the figure below, the trajectory being analysed is shown as black points every 10 s, as computed; the sphere of radius min_radius (below which a collision is certain) is shown in white, and the sphere of radius 0.99 * min_radius (below which the integrator returns an erroneous status to avoid propagating a corrupt state) is shown in red.
2811
Far below the surface, the trajectory takes a sharp bend (the timestep is likely too small for the multistep integrator this deep in the gravity well; that kind of integrator does not behave nicely under those circumstances).

We should make the analyser robust to weird trajectories; we should also check if we can make the integrator stop when the force computation is erroneous (instead of continuing before returning an erroneous status) without running into performance issues.

Sorry, something went wrong.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants