Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase the precision of Time and Time::Span to nanoseconds #5022

Merged
merged 1 commit into from Oct 2, 2017
Merged

Increase the precision of Time and Time::Span to nanoseconds #5022

merged 1 commit into from Oct 2, 2017

Conversation

asterite
Copy link
Member

This PR fixes the issue that the current Time and Time::Span types cannot hold nanosecond resolution times.

Internally, Time is now stored as:

struct Time
  @seconds : Int64
  @nanoseconds : Int32
  @kind : Kind # Int32
end

and Time::Span is now:

struct Time::Span
  @seconds : Int64
  @nanoseconds : Int32
end

so Time occupies 128 bits and Time::Span 92 bits (though because of padding it will probably occupy 128 bits too).

However, I don't intend this to be the final representation, but I'd like to start sending PRs fixing issues one by one (for example, I can imagine @kind ceasing to exist).

This also gets rid of the ticks property of both Time and Time::Span which leaks information about the internal representation. I also refactored the code a bit so if later we want to change the internal representation it should be easier to do. The idea is to finish shaping the API and later being able to improve the internals without needing to change the API again.

Note: on OSX, even though Time can now have nanosecond precision, it won't be the case because of the C calls we are making. This should be improved in a separate PR.

@asterite
Copy link
Member Author

This also lets Time::Span hold a maximum of 106751991167300 days (about 292471208677 years), where previously it was limited to 10675199 days (about 29247 years)

@RX14
Copy link
Contributor

RX14 commented Sep 22, 2017

I dislike splitting seconds and nanoseconds. In this PR, Time#+(Time::Span) requires converting to nanoseconds (presumably an int64) and then back to seconds + nanoseconds again. This is expensive and error-prone. Obviously, the maximum length Time::Span cannot fit into a UInt64 of nanoseconds, and it will silently overflow. This is buggy and wrong. Working in ticks (or nanoseconds since 0000) is much easier and avoids errors like this because all arithmetic is simply done in a single numerical value (we'd need to implement 128bit numbers for this though) and complexity is introduced at conversion time.

@asterite
Copy link
Member Author

@RX14 How do you propose to implement it without 128 bit integers? What if we decide to never add that type to the language?

@asterite
Copy link
Member Author

And also, we can add overflow checks where needed, it's not something that can't be fixed at all.

@RX14
Copy link
Contributor

RX14 commented Sep 22, 2017

@asterite Well, i propose to implement it with 128bit integers. It seems like the easiest way to store times.

We can add overflow checks but it doesn't allow us to add together arbitrary time values. We'd need to do more complicated maths with both seconds and nanoseconds for that to work. Using a 128bit integer vastly simplifies the design of Time, and they are likely useful elsewhere too.

@ysbaddaden
Copy link
Contributor

Since we don't have Int128 we must split nanoseconds from seconds. It would greatly simplify some patterns, but Adding support for Int128 just for Time seems a little overkill.

Yet, we should manually do the separated computation of seconds (Int64) and nanoseconds (Int32) then check for any overflow in nanoseconds and apply it to seconds, which is fine because the maximum nanoseconds precision (999_999_999)

@ysbaddaden ysbaddaden closed this Sep 22, 2017
@ysbaddaden ysbaddaden reopened this Sep 22, 2017
@asterite
Copy link
Member Author

I can implement 128 bit integers before this, but just like @ysbaddaden says, maybe it's a bit overkill. Also, I'm not sure 128 bit integers are supported in all platforms... are we sure of that?

@RX14
Copy link
Contributor

RX14 commented Sep 22, 2017

@asterite LLVM should generate code for 128bit integers for all the architectures we support. Rust uses this feature too and supports many more platforms. 128bit floats are harder (we shouldn't support them) because you'd want to implement most of the Math functions for them, which gets ugly. The only other thing I can think of is we might have to disallow them in C bindings because I don't know how they're represented in the ABI.

@ysbaddaden
Copy link
Contributor

Hit wrong button inadvertently...

I was saying that the maximum overflowed nanoseconds was 999_999_999*2 which is smaller than Int32::MAX (same for negative), so we can do the computation on separated values and apply overflows.

Strong 👍 from me, once we're sure computations don't overflow on large dates and spans.

@asterite
Copy link
Member Author

I'm not sure 128 bit integers are supported on all platforms. I can find many comments in the Rust repo that say this. For example rust-lang/rust#35118 (comment)

In the past I implemented it but CI failed on some platforms with a cryptic LLVM error. That's why I never pushed it as a real PR.

@asterite
Copy link
Member Author

Hmm... the second time the std specs are run, one spec fails. Super mysterious!

@RX14
Copy link
Contributor

RX14 commented Sep 22, 2017

@asterite If emscripten (which isn't upstream llvm) is the only target the rust people could find that doesn't support i128, then i don't see an issue. I don't think crystal will ever support a target that rust doesn't. There really isn't much market share on UNIX and windows OSes outside ARM and X86.

@@ -8,41 +8,41 @@ end

describe Time::Span do
it "initializes" do
t1 = Time::Span.new 1234567890
t1.to_s.should eq("00:02:03.4567890")
t1 = Time::Span.new nanoseconds: 123_456_789_000

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would/should changing 123_456_789_000 to 123_456_789_012 lead to 00:02:03.456789012 in the next line? [i.e.: to clarify precision more]

src/time/span.cr Outdated

# Make sure that if seconds is positive, nanoseconds is
# positive too. Likewise, if seconds is negtive, make
# sure tht nanoseconds is negative too.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typos: "negative" and "that".

src/time/span.cr Outdated
end

def nanoseconds
@nanoseconds.to_i
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nanoseconds is already an Int32.

src/time/span.cr Outdated
TicksPerMinute = TicksPerSecond * 60
TicksPerHour = TicksPerMinute * 60
TicksPerDay = TicksPerHour * 24
MaxValue = new seconds: Int64::MAX, nanoseconds: 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the maximum seconds: Int64::MAX, nanoseconds: 999_999_999?

src/time/span.cr Outdated
end

def total_milliseconds
ticks.to_f / TicksPerMillisecond
total_nanoseconds.to_f / NANOSECONDS_PER_MILLISECOND
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

total_nanoseconds is already a Float64.

src/time/span.cr Outdated
Span.new(ticks * number)
Span.new(
seconds: total_seconds_i * number,
nanoseconds: nanoseconds * number,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe convert nanoseconds to an Int64 here? It's fairly easy to overflow the maximum 999_999_999 with number a mere 2.15, whereas with an Int64 number can be as big as 9223372046.08.

src/time/span.cr Outdated
val = (value < 0 ? (value - 0.5) : (value + 0.5)).to_i64 # round away from zero
Span.new(val * TicksPerMillisecond)
# :nodoc:
def self.from(value, nanos_multiplicator) : self
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method may be the most problematic, it takes a maximum Int64::MAX nanoseconds after the multiplicator is applied, which is very far from being capable to render the full potential of Time::Span representations.

Most regular use cases won't overflow, but larger computations will, e.g. 106752.days.

To be honest, the overflow was already present, just with slightly larger values, e.g. 10675200.days but we raise an exception (Time::Span too big or to small). I have a feeling the overflow is now silenced?

I'm totally wrong here, since Time::Span.from is only ever called from Float so it should be fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. I'll move the method to Float and make it private, it makes more sense :-)

src/time.cr Outdated
end

def clone
self
end

def +(other : Span)
add_ticks other.ticks
add_nanoseconds other.total_nanoseconds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about using other.total_seconds_i and other.nanoseconds with manual nanosecond overflow checks but exact precision, instead of converting the span to a float, with division and modulo? Something like:

def +(span : Span)
  s = @seconds + span.total_seconds_i
  ns = @nanoseconds + span.nanoseconds
  if ns > NANOSECONDS_PER_SECOND
    s += 1
    ns -= NANOSECONDS_PER_SECOND
  end
  new(seconds: s, nanoseconds: ns, kind: kind)
end

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yes. That code is old because I originally implemented this with Time::Span just having a nanoseconds instance var (so max was around 290 years, a bit limited)

@asterite
Copy link
Member Author

@RX14 I think you are right, we can try to support 128 bit integers. I might try to do that later soon. In any case, once we have that support, changing the implementation of both Time and Time::Span should be trivial: for Time only a few constructors and #total_seconds and #nanosecond need to change, and for Time::Span only a few constructors and #to_i and #nanoseconds need to change.

However, I'd eventually like Time to include monotonic time too like in Go, when doing Time.now, and I think there's a way to efficiently implement that with bit fiddling so having 128 bit integers wouldn't be a lot of help (we can always implement this with a fixed static array of bytes).

src/time/span.cr Outdated
def self.zero
new(0)
new(nanoseconds: 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not return ZERO?

@RX14
Copy link
Contributor

RX14 commented Sep 23, 2017

@asterite If we're fiddling around with Time, how about we work out how many bits we'd like for monotonic time now and reserve the space? In fact, if we're adding monotonic time why do we need wall time in ns? Wall time is never going to be accurate to a nanosecond precision, is there any reason to make it be that precise?

@asterite
Copy link
Member Author

Because I'd like to implement (or someone else to implement) monotonic time in a separate PR. Having both things in a same PR will make it harder to review (just see the "YAML revamp" PR :-P)

@asterite
Copy link
Member Author

Specs for Float#days fail... but I don't know how to fix them or why they are only failing for 32 bits and after compiling the compiler.

Maybe we can just remove the methods on Float because they are not precise, 1.2345.days makes very little sense.

@Sija
Copy link
Contributor

Sija commented Sep 24, 2017

@asterite 2.5.days makes a perfect sense to me.

@RX14
Copy link
Contributor

RX14 commented Sep 24, 2017

@asterite I didn't say implement them, I just don't want to merge this PR, then find out that we want to change the precision of wall time again when we implement monotonic.

@asterite
Copy link
Member Author

@Sija what about 2.4983 days? I mean, yes, with .5 values it makes sense, but is it worth it to implement this functionality for every float? You can always do 2.days + 12.hours which is probably easier to understand and more precise to compute.

@RX14 Well, some internal representation will have to change in a subsequent PR. But that doesn't mean we need to change wall time precision. At least not leak it through the API. The user can always ask for nanoseconds but maybe the precision will be less than that (which is actually the cas now in OSX).

@RX14
Copy link
Contributor

RX14 commented Sep 24, 2017

@asterite We don't appear to document the precision of time or that it may change based on platform but I think we should.

@asterite
Copy link
Member Author

We can do that. But maybe there's a way to get good precision in mac too. I think there are some APIs for that, we are just not using them.

@asterite
Copy link
Member Author

I fixed the Float methods :-)

src/time.cr Outdated
KindMask = 0xc000000000000000
MAX_VALUE_TICKS = 3155378975999999999_i64
# :nodoc:
DAYS_MONTH = [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a side node: Couldn't these constants be tuples?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I changed it

src/time.cr Outdated
DP100 = 36524
DP4 = 1461
# :nodoc:
SECONDS_PER_MINUTE = 60
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why shouldn't this be documented? All these constants are "natural" constants and not internal values. They might be useful for time handling, so it I'd like to see them in the API docs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, they are part of the internal implementation. If the implementation changes I don't want to have to keep this constant. Anyone can compute this by themselves.

@asterite asterite added this to the Next milestone Sep 29, 2017
@bew
Copy link
Contributor

bew commented Oct 2, 2017

Is there something missing before merge? or does it need another review? or more thinking?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants