Std.os.time #933

tgschultz · 2018-04-18T22:58:09Z

I've added a namespace time to std.os and included within it basic UTC timestamp functions, a high-performance timer, and constants for various epochs. It made sense to me to move os.sleep into time, so I did that too.

tiehuis · 2018-04-18T23:20:40Z

std/os/epoch.zig

+pub const amiga       = 252460800;        //Jan 01, 1978 AD
+pub const pickos      = -63244800;        //Dec 31, 1967 AD
+pub const gps         = 315964800;        //Jan 06, 1980 AD
+pub const clr         = 62135769600;      //Jan 01, 0001 AD


Should this be negative instead?

Yes, it should.
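The sign question can be checked by computing each offset from its calendar date. The following is a minimal C sketch, not code from this PR: `days_from_civil` is Howard Hinnant's well-known proleptic Gregorian day-count algorithm, and `epoch_offset_seconds` is a hypothetical helper name introduced here for illustration.

```c
#include <stdint.h>

/* Days from 1970-01-01 to the given proleptic Gregorian date
   (Howard Hinnant's days_from_civil algorithm). */
static int64_t days_from_civil(int64_t y, unsigned m, unsigned d) {
    y -= (m <= 2);                                   /* treat Jan/Feb as part of the previous year */
    const int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = (unsigned)(y - era * 400);  /* year of era, [0, 399] */
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + (int64_t)doe - 719468;
}

/* Hypothetical helper: seconds from the Unix epoch to midnight UTC of a date.
   Dates before 1970-01-01 yield a negative offset. */
static int64_t epoch_offset_seconds(int64_t y, unsigned m, unsigned d) {
    return days_from_civil(y, m, d) * 86400;
}
```

Under these assumptions, `epoch_offset_seconds(1978, 1, 1)`, `(1967, 12, 31)` and `(1980, 1, 6)` reproduce the amiga, pickos and gps values in the diff, and `(1, 1, 1)` is negative. The proleptic Gregorian result for 0001-01-01 is -62135596800, which differs in magnitude from the diff's constant by 172800 seconds (two days); that gap may reflect a different calendar convention and is worth double-checking.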

tiehuis · 2018-04-18T23:33:01Z

std/os/time.zig

+            },
+            Os.linux => {
+                var ts: posix.timespec = undefined;
+                var result = posix.clock_getres(posix.CLOCK_MONOTONIC, &ts);


I think we want CLOCK_MONOTONIC_RAW here instead. The _RAW variant is the same but avoids any time skewing due to ntp and/or adjtime. This makes it more suitable as a high-performance timer since we won't get any sudden jumps/delays.

Windows' QueryPerformanceCounter isn't time-adjusted either, so we match this behavior across operating systems. I'm not 100% sure about macOS.

Note that CLOCK_MONOTONIC_RAW is only present on kernels 2.6.28 and newer and is Linux-specific, so we would need to keep that in mind:
https://linux.die.net/man/2/clock_gettime

I'm actually not 100% convinced about this change after doing some more reading. So happy to hear disagreements.

I had the same thought and did some reading on the subject. I went this way for two reasons:

CLOCK_MONOTONIC_RAW is Linux-specific from what I can tell. It's easy enough to switch our way through that, but:

Pretty much everyone I've seen having this discussion has ultimately decided on using regular MONOTONIC. The reasoning is that even though the interval of CLOCK_MONOTONIC is not stable, it will average out to being more correct in relation to the real passage of time.

Both are guaranteed to only increase (bugs aside) and neither should "jump" ahead; my understanding is that the skews applied at this level are small.

That said, I'm not entirely sure RAW isn't a better choice just because it matches the QPC behavior. My guess is that the Darwin behavior is also not subject to skew based on how it works.

I think CLOCK_MONOTONIC_RAW is best for Linux.

It's OK to depend on OS-specific things. We'll abstract Darwin, Linux, and Windows (and others when supported) into a platform-agnostic API.

The timer is trying to measure an absolute duration, rather than measure the difference between 2 clock times.

A thing to consider is whether handling NTP adjustments leads to more accurate time over small periods, i.e. whether the CPU oscillator is inaccurate enough that the constant skewing makes CLOCK_MONOTONIC more accurate on moderate intervals (say 1 second or more). Looking at ntpd specifically, it seems that the lower poll interval is 64 seconds by default, so I would presume that the onboard CPU clock is accurate enough for the time periods (sub-minute, likely second) we are interested in.

Hmm, I see.

This looks like a nice writeup: ros2/rcutils#43

In this case I'm happy with CLOCK_MONOTONIC.
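To see how far the two clocks actually diverge on a given machine, one can sample both around the same interval. This is a minimal C sketch against the POSIX call the Zig code wraps; `RAW_CLOCK_ID`, `clock_sample`, `read_ns` and `sample_clocks` are hypothetical names, and the fallback to CLOCK_MONOTONIC on systems without the _RAW variant is an assumption made here, not something decided in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* CLOCK_MONOTONIC_RAW is Linux-specific; fall back to CLOCK_MONOTONIC elsewhere. */
#ifdef CLOCK_MONOTONIC_RAW
#define RAW_CLOCK_ID CLOCK_MONOTONIC_RAW
#else
#define RAW_CLOCK_ID CLOCK_MONOTONIC
#endif

typedef struct {
    uint64_t mono_ns; /* reading of CLOCK_MONOTONIC (subject to NTP slewing) */
    uint64_t raw_ns;  /* reading of the raw clock (no slewing on Linux) */
} clock_sample;

/* Read clock `id` in nanoseconds. Returns 0 on success, -1 on failure. */
static int read_ns(clockid_t id, uint64_t *out) {
    struct timespec ts;
    if (clock_gettime(id, &ts) != 0) return -1;
    *out = (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
    return 0;
}

/* Take one reading of each clock, back to back. */
static int sample_clocks(clock_sample *s) {
    if (read_ns(CLOCK_MONOTONIC, &s->mono_ns) != 0) return -1;
    return read_ns(RAW_CLOCK_ID, &s->raw_ns);
}
```

Taking one sample before and one after a piece of work and subtracting field by field gives the same interval as seen by each clock; any difference between the two deltas is the adjustment applied to CLOCK_MONOTONIC in that window.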

I've been working on a distributed clock fault-tolerance problem the past week and spotted the comment in std/time.zig for the Timer implementation pointing here, so I thought I'd drop a comment if this is revisited again:

CLOCK_MONOTONIC trumps CLOCK_MONOTONIC_RAW for the reasons already given by @tgschultz and @andrewrk. It's definitely the better choice of the two. As far as I can see, CLOCK_MONOTONIC_RAW is mostly useful for programs like NTP that want to get at the raw data to measure errors with respect to drift (i.e. clocks that tick too fast or too slow) as opposed to skew (i.e. clocks that are ahead or behind the "true time"), but CLOCK_MONOTONIC_RAW is not what you want when you want to measure elapsed units of time. So we're good here.

However, for measuring elapsed time that corresponds to time passing across multiple systems, CLOCK_MONOTONIC is trumped again by CLOCK_BOOTTIME, which is exactly the same as CLOCK_MONOTONIC but which can measure elapsed time accurately, even across a suspend. That CLOCK_MONOTONIC fails to measure elapsed time accurately after a suspend was more an accidental historical bug in the kernel. There was an attempt to fix this in the kernel at one point to make CLOCK_MONOTONIC the same as CLOCK_BOOTTIME, but this broke applications. Compared to CLOCK_MONOTONIC, CLOCK_BOOTTIME allows applications to get a true suspend-aware monotonic clock.

Granted, some applications may only want to measure elapsed time while the process is running, so may prefer CLOCK_MONOTONIC but the choice would be application-specific, and it seems there are also quite a few applications (e.g. distributed databases) suggesting or using CLOCK_MONOTONIC for timers that govern leader leases where they probably should be using CLOCK_BOOTTIME.

Thanks for the comments @jorangreef. Based on this information I think we may want to consider an iteration on the std lib time API to take into consideration hardware suspension and perhaps force the user to make a choice.
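One way such a choice could be surfaced to the caller is an explicit flag that selects the clock. This is only a C sketch of the idea under stated assumptions, not a proposal for the Zig API: `elapsed_clock` and `now_ns` are hypothetical names, and falling back to CLOCK_MONOTONIC where CLOCK_BOOTTIME is not defined is an assumption made here.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Pick the clock for elapsed-time measurement. When count_suspend is true
   and the platform defines CLOCK_BOOTTIME, time spent suspended is included. */
static clockid_t elapsed_clock(bool count_suspend) {
#ifdef CLOCK_BOOTTIME
    if (count_suspend) return CLOCK_BOOTTIME;
#else
    (void)count_suspend; /* no suspend-aware clock available: assumption, fall back */
#endif
    return CLOCK_MONOTONIC;
}

/* Current reading of the selected clock in nanoseconds.
   Returns 0 on success, -1 on failure. */
static int now_ns(bool count_suspend, uint64_t *out) {
    struct timespec ts;
    if (clock_gettime(elapsed_clock(count_suspend), &ts) != 0) return -1;
    *out = (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
    return 0;
}
```

A lease timer of the kind mentioned above would pass `true`, while a benchmark that only cares about time spent running would pass `false`.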


bnoordhuis

Left some comments. Nice work!

bnoordhuis · 2018-04-19T09:17:24Z

std/os/index.zig

-    _ = @import("windows/index.zig");
-    _ = @import("test.zig");
+comptime {
+    if(builtin.is_test) {


Style nit: space before (. Happens in other places too, I won't point them out individually.

I missed that subtlety of the styling in the other std code. One day zig-fmt will save me from having to switch styles depending on what I'm working on.

Yeah let's not worry about style. Style concerns should be in the form of pull requests to zig fmt.

bnoordhuis · 2018-04-19T09:26:19Z

std/os/time.zig

+                req = rem;
+                continue;
+            },
+            else => return,


Shouldn't anything but EINTR be a fatal error? I mean, they shouldn't happen in a bug-free program.

// Sometimes Darwin returns EINVAL for no reason.

Interesting. I've never observed that but looking at the nanosleep implementation in libc and __semwait_signal_nocancel() in xnu, they can indeed fail with EINVAL for some runtime errors.

I guess treating it as EINTR is a bad idea, since that might result in busy-looping.

I didn't write the sleep code, I just moved it from std.os and substituted named constants, so I'm not sure of the reasoning.
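For reference, the handling discussed above (retry only on EINTR, surface everything else instead of looping) could look like the following in C. This is one possible sketch, not what the moved Zig code does; `sleep_ns` is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Sleep for `ns` nanoseconds, resuming after signal interruptions.
   Returns 0 on success, or the errno value for any error other than EINTR,
   so an unexpected EINVAL is reported instead of being retried in a loop. */
static int sleep_ns(uint64_t ns) {
    struct timespec req = {
        .tv_sec = (time_t)(ns / 1000000000u),
        .tv_nsec = (long)(ns % 1000000000u),
    };
    struct timespec rem;
    while (nanosleep(&req, &rem) != 0) {
        if (errno != EINTR) return errno;
        req = rem; /* continue with the time that was left */
    }
    return 0;
}
```

Returning the errno value leaves the decision to the caller, which avoids both silently ignoring the failure and busy-looping on it.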

bnoordhuis · 2018-04-19T09:32:56Z

std/os/time.zig

+    var err = darwin.gettimeofday(&tv, null);
+    debug.assert(err == 0);
+    const sec_ms = tv.tv_sec * ms_per_s;
+    const usec_ms = @divFloor(tv.tv_usec, (us_per_s / ms_per_s));


Somewhat inconsistent use of parens vis-a-vis line 95. (The ones on line 72 aren't necessary either, for that matter.)

bnoordhuis · 2018-04-19T09:36:02Z

std/os/time.zig

+
+/// Divisions of a second
+pub const ns_per_s = 1000000000;
+pub const us_per_s = 1000000;


Shame, could have benefited from #504...

I could use something like comptime u64(math.pow(f64, 10, 9)); here, but I'm not sure that's actually more readable.

No, I agree. It was more wistful thinking-out-aloud.
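An alternative to counting zeros or calling pow is to derive each constant from the previous one. The C sketch below illustrates that idea together with the millisecond conversion from the gettimeofday hunk above; the macro names and `milli_timestamp` are hypothetical, chosen only to mirror the Zig identifiers.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Divisions of a second, each derived from the previous one so the
   number of zeros never has to be checked by eye. */
#define MS_PER_S INT64_C(1000)
#define US_PER_S (MS_PER_S * 1000)
#define NS_PER_S (US_PER_S * 1000)

/* Milliseconds since the Unix epoch, or -1 if gettimeofday fails. */
static int64_t milli_timestamp(void) {
    struct timeval tv;
    if (gettimeofday(&tv, NULL) != 0) return -1;
    const int64_t sec_ms = (int64_t)tv.tv_sec * MS_PER_S;
    /* tv_usec is non-negative, so truncating division matches a floor division here. */
    const int64_t usec_ms = (int64_t)tv.tv_usec / (US_PER_S / MS_PER_S);
    return sec_ms + usec_ms;
}
```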

bnoordhuis · 2018-04-19T09:38:41Z

std/os/time.zig

+
+    //Initialize the timer structure.
+    //This gives us an oportunity to grab the counter frequency in windows.
+    //On Windows: QueryPerformanceCounter will succeed on anything >= XP/2000.


See libuv/libuv#1268 and the issues linked therein - not an argument per se against QueryPerformanceCounter() but it's not as stable as clock_gettime(CLOCK_MONOTONIC) is on the Unices.

Interesting. I'll add some asserts to cover things like this.

Actually, I don't need to add anything since zig has me covered with runtime checks in the appropriate modes.

bnoordhuis · 2018-04-19T09:45:30Z

std/os/time.zig

+            },
+            Os.linux => {
+                var ts: posix.timespec = undefined;
+                var result = posix.clock_getres(monotonic_clock_id, &ts);


Could benefit from caching. clock_getres() is not a hugely expensive system call (might not be a system call at all, often served from the vDSO) but it's still a few cycles.

The result is cached in the Timer struct. As long as you use the same Timer to do your timing you won't call it again. We could store it in a global and only ever call clock_getres/QPC/mach_timebase_info once, but it seems unintuitive to me that the time.monotonic_resolution (or whatever) would be 0 or some other invalid value until the first call to Timer.start(), so then I'd want to introduce a time.getMonotonicResolution() to retrieve the value and handle initialization if necessary.

I dunno, I guess I didn't consider this because there's too much Java in my past.

Another reason not to globally cache it: every call to getMonotonicResolution() would have to check for an error, instead of just once per start().
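The per-timer caching described here can be sketched in C as follows. This is an illustration of the design under stated assumptions, not a translation of the PR's Timer: `mono_timer`, `mono_timer_start` and `mono_timer_read` are hypothetical names, and the read path assumes clock_gettime keeps succeeding once start has succeeded.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

typedef struct {
    uint64_t resolution_ns; /* cached once by mono_timer_start */
    uint64_t start_ns;
} mono_timer;

static uint64_t ts_to_ns(const struct timespec *ts) {
    return (uint64_t)ts->tv_sec * UINT64_C(1000000000) + (uint64_t)ts->tv_nsec;
}

/* Query the resolution once and record the start time.
   Returns 0 on success, -1 if either clock call fails, so errors are
   checked in exactly one place. */
static int mono_timer_start(mono_timer *t) {
    struct timespec ts;
    if (clock_getres(CLOCK_MONOTONIC, &ts) != 0) return -1;
    t->resolution_ns = ts_to_ns(&ts);
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) return -1;
    t->start_ns = ts_to_ns(&ts);
    return 0;
}

/* Nanoseconds elapsed since start. Never calls clock_getres again. */
static uint64_t mono_timer_read(const mono_timer *t) {
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0); /* assumption: failure after a successful start is a bug */
    (void)rc;
    return ts_to_ns(&ts) - t->start_ns;
}
```

Keeping the resolution in the struct means there is no globally visible value that could be read before it has been initialized.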

bnoordhuis · 2018-04-19T09:46:29Z

std/os/time.zig

+                self.start_time = u64(ts.tv_sec * ns_per_s + ts.tv_nsec);
+            },
+            Os.macosx, Os.ios => {
+                darwin.mach_timebase_info(&self.frequency);


Could also benefit from caching. mach_timebase_info() is fairly expensive.

bnoordhuis · 2018-04-19T09:48:53Z

std/os/time.zig

+
+    fn clockLinux() u64 {
+        var ts: posix.timespec = undefined;
+        var result = posix.clock_gettime(monotonic_clock_id, &ts);


Don't know if it matters but seccomp2 sandboxes can reject clock_gettime(), although that's arguably a broken-beyond-reasonableness sandbox.

I hadn't considered that a system might allow clock_getres but not clock_gettime. It's possible and therefore should be considered. My first thought is to add another check in start() and only throw the error there, but my understanding is that seccomp could be used to reject syscalls using arbitrary criteria, meaning that just because a syscall succeeded once doesn't mean it will in the future, so the appropriate thing to do would be to make every call that makes a syscall potentially throw an error.

That seems really annoying, but edge cases matter, so I dunno. At the very least I can add a check to start() and make EACCES throw an error instead of unreachable.

Actually it turns out seccomp can return an arbitrary errno value. I feel like the best way to handle this is to just use asserts and unreachable because this kind of thing would be roughly on par with using a debugger to insert random values. Open to suggestions though.

Sounds reasonable to me.

bnoordhuis · 2018-04-19T09:51:22Z

std/os/time.zig

+        var ts: posix.timespec = undefined;
+        var result = posix.clock_gettime(monotonic_clock_id, &ts);
+        debug.assert(posix.getErrno(result) == 0);
+        return u64(ts.tv_sec * ns_per_s + ts.tv_nsec);


I think this can overflow on 32-bit systems since tv_sec and tv_nsec are of type isize? Happens in a few more places in this file.

Good point. I had cast the individual values to u64 in other places, but apparently was hasty and missed several of them.
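The overflow is easy to reproduce with a stand-in struct whose fields are 32 bits wide. In this C sketch, `timespec32`, `to_ns_narrow` and `to_ns_widened` are hypothetical names. The narrow version uses unsigned arithmetic so the wraparound is well defined in C; the original signed multiplication would overflow instead.

```c
#include <stdint.h>

/* Stand-in for a timespec on a 32-bit target (fields as wide as isize there). */
typedef struct {
    int32_t tv_sec;
    int32_t tv_nsec;
} timespec32;

/* Multiplies in 32 bits first and widens afterwards: the product wraps
   for any tv_sec above 4, since 5 * 1000000000 exceeds 2^32. */
static uint64_t to_ns_narrow(timespec32 ts) {
    return (uint32_t)ts.tv_sec * 1000000000u + (uint32_t)ts.tv_nsec;
}

/* Widens each operand to 64 bits before multiplying, which is the fix
   described above. */
static uint64_t to_ns_widened(timespec32 ts) {
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
}
```

For `{5, 7}` the widened version yields 5000000007, while the narrow one has already wrapped.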


tgschultz added 4 commits April 18, 2018 13:52

Added timestamp, high-perf. timer functions.

c90f936

Added unstaged changes.

8b66dd8

Fixed compiler errors around darwin code.

bf9cf28

fixed typos.

7cfe328

tiehuis reviewed Apr 18, 2018


tgschultz added 3 commits April 18, 2018 18:50

Fixed incorrect sign on epoch.clr

5c83d27

Added notes regarding CLOCK_MONOTONIC_RAW and made it easy to change our mind in the future. Updated std.os imported tests' block with lazy declaration workaround and added time.zig. Corrected some incorrect comments.

fdebe38

Fixed another incorrect comment

3c9b6f8

bnoordhuis reviewed Apr 19, 2018


tgschultz added 2 commits April 19, 2018 10:01

Style cleanups, u64 casts, Timer.start returns error instead of unreachable on unexpected errno.

89eade0

Use std.os.errorUnexpectedPosix if timer initialization encounters unexpected error

ca4053b

tiehuis mentioned this pull request Apr 20, 2018

Improving throughput test #935

Merged

andrewrk merged commit ca4053b into ziglang:master Apr 22, 2018

tgschultz deleted the std.os.time branch April 22, 2018 22:45

Brent-A mentioned this pull request Mar 28, 2019

Provide method to convert device timestamps to system timestamps microsoft/Azure-Kinect-Sensor-SDK#198

Closed

3 tasks

jorangreef Jun 22, 2021 • edited
