Crash when craft breaks in atmosphere with explosions in KSP 1.3.1 with RP-1 (Erdos) #2056

Closed
nikolain opened this issue Jan 5, 2019 · 8 comments

Comments


nikolain commented Jan 5, 2019

When a craft starts breaking apart and exploding during atmospheric reentry, there is a chance that Principia Erdos crashes. This happens in fewer than 10% of such reentries.

Native call stack:

 	ucrtbase.dll!abort()	Unknown
 	libglog.dll!google::Symbolize() + 577 bytes	Unknown
 	libglog.dll!google::LogMessage::SendToLog() + 829 bytes	Unknown
 	libglog.dll!google::LogMessage::Flush() + 213 bytes	Unknown
 	libglog.dll!google::LogMessageFatal::~LogMessageFatal() + 135 bytes	Unknown
 	principia.dll!principia::ksp_plugin::internal_pile_up::PileUp::AdvanceTime(const principia::geometry::internal_point::Point<principia::quantities::internal_quantities::Quantity<principia::quantities::internal_dimensions::Dimensions<0,0,1,0,0,0,0,0> > > & t) Line 372	C++
 	principia.dll!principia::ksp_plugin::internal_pile_up::PileUp::DeformAndAdvanceTime(const principia::geometry::internal_point::Point<principia::quantities::internal_quantities::Quantity<principia::quantities::internal_dimensions::Dimensions<0,0,1,0,0,0,0,0> > > & t) Line 131	C++
 	[Inline Frame] principia.dll!principia::ksp_plugin::internal_plugin::Plugin::CatchUpLaggingVessels::__l6::<lambda_181fb935bfd596ed15d2291f87f03453>::operator()() Line 719	C++
 	[Inline Frame] principia.dll!std::_Invoker_functor::_Call(principia::ksp_plugin::internal_plugin::Plugin::CatchUpLaggingVessels::__l6::<lambda_181fb935bfd596ed15d2291f87f03453> &)	C++
 	[Inline Frame] principia.dll!std::invoke(principia::ksp_plugin::internal_plugin::Plugin::CatchUpLaggingVessels::__l6::<lambda_181fb935bfd596ed15d2291f87f03453> &)	C++
 	[Inline Frame] principia.dll!std::_Invoker_ret<principia::base::Status,0>::_Call(principia::ksp_plugin::internal_plugin::Plugin::CatchUpLaggingVessels::__l6::<lambda_181fb935bfd596ed15d2291f87f03453> &)	C++
 	principia.dll!std::_Func_impl_no_alloc<<lambda_181fb935bfd596ed15d2291f87f03453>,principia::base::Status>::_Do_call()	C++
 	[Inline Frame] principia.dll!std::_Func_class<principia::base::Status>::operator()() Line 15	C++
 	principia.dll!principia::base::internal_thread_pool::ExecuteAndSetValue<principia::base::Status>(const std::function<principia::base::Status __cdecl(void)> & function, std::promise<principia::base::Status> & promise) Line 15	C++
 	principia.dll!principia::base::ThreadPool<principia::base::Status>::DequeueCallAndExecute() Line 80	C++
>	[Inline Frame] principia.dll!std::_Invoker_pmf_pointer::_Call(void(principia::base::ThreadPool<principia::base::Status>::*)()) Line 230	C++
 	[Inline Frame] principia.dll!std::invoke(void(principia::base::ThreadPool<principia::base::Status>::*)() &) Line 230	C++
 	[Inline Frame] principia.dll!std::_Invoker_ret<std::_Unforced,0>::_Call(void(principia::base::ThreadPool<principia::base::Status>::*)() &) Line 230	C++
 	[Inline Frame] principia.dll!std::_Call_binder(std::_Invoker_ret<std::_Unforced,0>) Line 1858	C++
 	[Inline Frame] principia.dll!std::_Binder<std::_Unforced,void (__cdecl principia::base::ThreadPool<principia::base::Status>::*)(void),principia::base::ThreadPool<principia::base::Status> *>::operator()() Line 1914	C++
 	[Inline Frame] principia.dll!std::_Invoker_functor::_Call(std::_Binder<std::_Unforced,void (__cdecl principia::base::ThreadPool<principia::base::Status>::*)(void),principia::base::ThreadPool<principia::base::Status> *> &&) Line 230	C++
 	[Inline Frame] principia.dll!std::invoke(std::_Binder<std::_Unforced,void (__cdecl principia::base::ThreadPool<principia::base::Status>::*)(void),principia::base::ThreadPool<principia::base::Status> *> &&) Line 230	C++
 	[Inline Frame] principia.dll!std::_LaunchPad<std::unique_ptr<std::tuple<std::_Binder<std::_Unforced,void (__cdecl principia::base::ThreadPool<principia::base::Status>::*)(void),principia::base::ThreadPool<principia::base::Status> *> >,std::default_delete<std::tuple<std::_Binder<std::_Unforced,void (__cdecl principia::base::ThreadPool<principia::base::Status>::*)(void),principia::base::ThreadPool<principia::base::Status> *> > > > >::_Execute(std::tuple<std::_Binder<std::_Unforced,void (__cdecl principia::base::ThreadPool<principia::base::Status>::*)(void),principia::base::ThreadPool<principia::base::Status> *> > &) Line 238	C++
 	[Inline Frame] principia.dll!std::_LaunchPad<std::unique_ptr<std::tuple<std::_Binder<std::_Unforced,void (__cdecl principia::base::ThreadPool<principia::base::Status>::*)(void),principia::base::ThreadPool<principia::base::Status> *> >,std::default_delete<std::tuple<std::_Binder<std::_Unforced,void (__cdecl principia::base::ThreadPool<principia::base::Status>::*)(void),principia::base::ThreadPool<principia::base::Status> *> > > > >::_Run(std::_LaunchPad<std::unique_ptr<std::tuple<std::_Binder<std::_Unforced,void (__cdecl principia::base::ThreadPool<principia::base::Status>::*)(void),principia::base::ThreadPool<principia::base::Status> *> >,std::default_delete<std::tuple<std::_Binder<std::_Unforced,void (__cdecl principia::base::ThreadPool<principia::base::Status>::*)(void),principia::base::ThreadPool<principia::base::Status> *> > > > > *) Line 245	C++
 	principia.dll!std::_LaunchPad<std::unique_ptr<std::tuple<std::_Binder<std::_Unforced,void (__cdecl principia::base::ThreadPool<principia::base::Status>::*)(void),principia::base::ThreadPool<principia::base::Status> *> >,std::default_delete<std::tuple<std::_Binder<std::_Unforced,void (__cdecl principia::base::ThreadPool<principia::base::Status>::*)(void),principia::base::ThreadPool<principia::base::Status> *> > > > >::_Go() Line 230	C++
 	principia.dll!std::_Pad::_Call_func(void * _Data) Line 209	C++
 	ucrtbase.dll!thread_start<unsigned int (__cdecl*)(void * __ptr64)>()	Unknown
 	kernel32.dll!BaseThreadInitThunk()	Unknown
 	ntdll.dll!RtlUserThreadStart()	Unknown
@nikolain
Author

nikolain commented Jan 5, 2019

The relevant source code lines look to be:

    auto const a = intrinsic_force_ / mass_;
    // NOTE(phl): |a| used to be captured by copy below, which is the logical
    // thing to do.  However, since it contains an |R3Element|, it must be
    // aligned on a 16-byte boundary.  Unfortunately, VS2015 gets confused and
    // aligns the function object on an 8-byte boundary, resulting in an
    // addressing fault.  With a reference, VS2015 knows what to do.
    auto const intrinsic_acceleration = [&a](Instant const& t) { return a; };
    CHECK_OK(ephemeris_->FlowWithAdaptiveStep(
                 history_.get(),
                 intrinsic_acceleration,
                 t,
                 adaptive_step_parameters_,
                 Ephemeris<Barycentric>::unlimited_max_ephemeris_steps,
                 /*last_point_only=*/false));
    psychohistory_ = history_->NewForkAtLast();


@eggrobin
Member

eggrobin commented Jan 5, 2019

The last line in your stack trace is indeed

    CHECK_OK(ephemeris_->FlowWithAdaptiveStep(
                 history_.get(),
                 intrinsic_acceleration,
                 t,
                 adaptive_step_parameters_,
                 Ephemeris<Barycentric>::unlimited_max_ephemeris_steps,
                 /*last_point_only=*/false));

and then it's LogMessageFatal, so this looks like a CHECK_OK failure (FlowWithAdaptiveStep returned an error status).

Can you give us the corresponding INFO log (https://github.com/mockingbirdnest/Principia/wiki/Installing,-reporting-bugs,-and-frequently-asked-questions#windows-dialog-box-or-sigabrt)?
It should have the status, which may be informative, as well as additional information on spacecraft breakups (pile up changes are logged at the INFO severity).

Given that this is a reentry, OUT_OF_RANGE seems likely.

return Status(Error::OUT_OF_RANGE, "Collision detected");

We should handle OUT_OF_RANGE more gracefully here: just because the spacecraft is about to crash doesn't mean we should crash...

@nikolain
Author

nikolain commented Jan 5, 2019

A crash minidump with heap and the glog logs for 2018-12-30 are available at
https://1drv.ms/f/s!Ao0C2PjqjY3EiRcj1UCHXZzpwGCw

@nikolain
Author

nikolain commented Jan 5, 2019

Given the intermittent nature of the issue, and that not all disintegrations end in a KSP crash, I suspect some form of timing issue or race condition when the vehicle splits into multiple pieces after a fuel tank overheats and explodes.

As I've never seen this with the last 1.2.2 version, I speculate that the handling of objects in the atmosphere made it much more likely.

@eggrobin
Member

eggrobin commented Jan 5, 2019

Interesting; the log file ends with a singularity (FAILED_PRECONDITION), not a collision detection (OUT_OF_RANGE).

F1230 13:53:37.589620 124848 pile_up.cpp:378] Check failed:
    (ephemeris_->FlowWithAdaptiveStep(
         history_.get(), intrinsic_acceleration, t, adaptive_step_parameters_,
         Ephemeris<Barycentric>::unlimited_max_ephemeris_steps, false)) ==
    ::principia::base::Status::OK
    (FAILED_PRECONDITION: At time -1.52453025268020964e+09 s, step size is effectively zero.  
     Singularity or stiff system suspected.
    vs.
     OK) 

Decoded stack (same as yours above):

ksp_plugin/pile_up.cpp:131

base/thread_pool_body.hpp:15
base/thread_pool_body.hpp:80

This probably means that we have calculated that the vessel should have gone very close to the centre of the Earth (making it impossible for us to integrate its motion further), but KSP did not notice the collision and failed to kill the vessel.
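
To illustrate why an adaptive-step integrator gives up near the centre of attraction, here is a toy sketch; it is not Principia's integrator. It assumes a body falling radially onto a point mass with r' = v, v' = -mu / r^2, uses classical RK4 with step doubling for the error estimate, and reports failure once the step is so small that t + h == t. All names (`State`, `FlowResult`, `Rk4Step`, `Flow`), the initial step and the tolerance are hypothetical.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical toy types; none of this is Principia's API.
struct State {
  double r;  // Distance to the point mass.
  double v;  // Radial velocity.
};

struct FlowResult {
  bool ok;      // False if the step size collapsed before reaching t_end.
  double t;     // Time actually reached.
  State state;  // State at that time.
};

constexpr double mu = 1.0;  // Gravitational parameter of the point mass.

// One classical RK4 step of r' = v, v' = -mu / r^2.
State Rk4Step(State const& s, double const h) {
  auto const a = [](double const r) { return -mu / (r * r); };
  double const k1r = s.v;
  double const k1v = a(s.r);
  double const k2r = s.v + h / 2 * k1v;
  double const k2v = a(s.r + h / 2 * k1r);
  double const k3r = s.v + h / 2 * k2v;
  double const k3v = a(s.r + h / 2 * k2r);
  double const k4r = s.v + h * k3v;
  double const k4v = a(s.r + h * k3r);
  return {s.r + h / 6 * (k1r + 2 * k2r + 2 * k3r + k4r),
          s.v + h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)};
}

// Integrates from t = 0 towards t_end, halving the step whenever the
// step-doubling error estimate exceeds the tolerance.
FlowResult Flow(State s, double const t_end, double const tolerance) {
  double t = 0;
  double h = 0.01;
  while (t < t_end) {
    h = std::min(h, t_end - t);
    if (t + h == t) {
      // The step no longer advances time: singularity or stiff system.
      return {false, t, s};
    }
    State const coarse = Rk4Step(s, h);
    State const fine = Rk4Step(Rk4Step(s, h / 2), h / 2);
    // A NaN or infinite error fails the comparison below and is rejected.
    double const error =
        std::abs(fine.r - coarse.r) + std::abs(fine.v - coarse.v);
    if (fine.r > 0 && error <= tolerance) {
      s = fine;
      t += h;
      if (error < tolerance / 64) {
        h *= 2;  // The error is comfortably small: try a larger step.
      }
    } else {
      h /= 2;  // Too inaccurate, or the step crossed the centre: retry.
    }
  }
  return {true, t, s};
}
```

Starting at rest from r = 1, the exact solution reaches r = 0 at t = π/(2√2) ≈ 1.11, so `Flow({1.0, 0.0}, 2.0, 1e-9)` should stop short of t = 2 with `ok == false`, whereas `Flow({1.0, 0.0}, 0.5, 1e-9)` should complete normally.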

> As I've never seen this with the last 1.2.2 version, I speculate that the handling of objects in the atmosphere made it much more likely.

I think so too (you can get closer to the ground, making it more likely to be about to crash or run into the singularity).

@pleroy
Member

pleroy commented Jan 15, 2019

@nikolain: The change in #2064 removes the CHECK_OK, so it won't fail in that manner again. Instead it will bubble up the error status and kill the vessel (or rather, the vessel debris). However, there is a possibility that continuing after the error would violate an invariant somewhere and cause trouble. I have examined the code and feel that we are reasonably safe, but since we couldn't reproduce the failure it's hard to be sure.

There is a LOG(WARNING) for a FAILED_PRECONDITION which should be easy to find in the log files if the situation that caused the crash is encountered. Please try to reproduce the issue and tell us if you see that message. The change will be in the next release, Εὐκλείδης (around Feb. 4th); or if you feel like it, you can try building it from master.
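
For readers unfamiliar with the pattern, here is a minimal sketch of bubbling up a status instead of aborting on it. It is not the code from #2064: `Error`, `Status`, `Vessel`, the `singular` flag and the function bodies are hypothetical stand-ins, and the warning goes to `std::cerr` rather than to glog.

```cpp
#include <iostream>
#include <string>

// Hypothetical stand-ins, not Principia's actual types.
enum class Error { OK, OUT_OF_RANGE, FAILED_PRECONDITION };

struct Status {
  Error error = Error::OK;
  std::string message;
  bool ok() const { return error == Error::OK; }
};

// Stand-in for the integration; fails when |singular| is set.
Status FlowWithAdaptiveStep(bool const singular) {
  if (singular) {
    return {Error::FAILED_PRECONDITION, "step size is effectively zero"};
  }
  return {};
}

// Instead of aborting on a non-OK status, log it and return it to the caller.
Status AdvanceTime(bool const singular) {
  Status const status = FlowWithAdaptiveStep(singular);
  if (!status.ok()) {
    std::cerr << "WARNING: " << status.message << "\n";
  }
  return status;
}

struct Vessel {
  bool alive = true;
};

// The caller reacts to the error by removing the vessel, not the process.
void CatchUp(Vessel& vessel, bool const singular) {
  if (!AdvanceTime(singular).ok()) {
    vessel.alive = false;
  }
}
```

With this shape, a failed integration only marks the affected vessel as dead; whether the rest of the program's invariants still hold afterwards is exactly the open question raised above.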

@nikolain
Author

Awesome. I'll give the changed code a try, either after Feb 4th with the official build or in a private build if I get to it.

@eggrobin eggrobin added this to the Εὐκλείδης milestone Feb 10, 2019
@pleroy
Member

pleroy commented Mar 23, 2019

No news is good news. Closing.

@pleroy pleroy closed this as completed Mar 23, 2019