add f16 type #1159

bnoordhuis · 2018-06-25T23:07:17Z

Add support for half-precision floating point operations.

Introduce __extendhfsf2 and __truncsfhf2 in std/special/compiler_rt
and add __gnu_h2f_ieee and __gnu_f2h_ieee as aliases; they are used
in Windows builds.

The logic in std/special/compiler_rt/extendXfYf2.zig has been reworked
and can now operate on 16-bit floating point types.

closes #1122
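For readers unfamiliar with what an extend routine such as __extendhfsf2 has to do, here is a minimal sketch in C rather than Zig. It is not the code from this PR: the function name half_bits_to_float is hypothetical, and the branch structure is a straightforward textbook widening of IEEE 754 binary16 to binary32, assuming float is a 32-bit IEEE 754 type.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen raw binary16 bits to a binary32 value.
 * Assumes `float` is IEEE 754 binary32. */
float half_bits_to_float(uint16_t h) {
    const uint32_t sign = (uint32_t)(h & 0x8000u) << 16; /* sign moves to bit 31 */
    const uint32_t exp = (h >> 10) & 0x1Fu;               /* 5-bit biased exponent */
    uint32_t man = h & 0x3FFu;                            /* 10-bit significand */
    uint32_t bits;

    if (exp == 0x1Fu) {
        /* Infinity or NaN: all-ones exponent, payload shifted into place. */
        bits = sign | 0x7F800000u | (man << 13);
    } else if (exp != 0) {
        /* Normal number: rebias the exponent from 15 to 127 (add 112). */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man != 0) {
        /* Subnormal: shift until the implicit bit appears, adjusting the exponent. */
        uint32_t e = 113u; /* 127 - 15 + 1 */
        while ((man & 0x400u) == 0) {
            man <<= 1;
            e--;
        }
        bits = sign | (e << 23) | ((man & 0x3FFu) << 13);
    } else {
        /* Signed zero. */
        bits = sign;
    }

    float f;
    memcpy(&f, &bits, sizeof f); /* reinterpret the bits without aliasing issues */
    return f;
}
```

Widening is always exact, which is why no rounding step appears; every binary16 value has an exact binary32 representation.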

tiehuis · 2018-06-26T05:19:29Z

std/special/compiler_rt/truncXfYf2.zig

+
+    // Various constants whose values follow from the type parameters.
+    // Any reasonable optimizer will fold and propagate all of these.
+    const srcBits: comptime_int = @sizeOf(src_t) * CHAR_BIT;


You can omit all these explicit comptime_int declarations. I think it is clear enough that all the values are implicitly comptime.

andrewrk

This looks great. Just a minor thing and then I think it's good to go.

andrewrk · 2018-06-26T17:39:45Z

std/special/compiler_rt/extendXfYf2.zig

+    return extendXfYf2(f32, f16, @bitCast(f16, a));
+}
+
+pub extern fn __gnu_h2f_ieee(a: u16) f32 {


I think this would be better as an actual symbol alias like this:

zig/std/special/compiler_rt/index.zig

Lines 23 to 24 in 8e71428

@export("__extenddftf2", @import("extendXfYf2.zig").__extenddftf2, linkage);

@export("__extendsftf2", @import("extendXfYf2.zig").__extendsftf2, linkage);

andrewrk · 2018-06-26T17:41:36Z

std/special/compiler_rt/index.zig

@@ -22,6 +22,11 @@ comptime {
    @export("__floatuntidf", @import("floatuntidf.zig").__floatuntidf, linkage);
    @export("__extenddftf2", @import("extendXfYf2.zig").__extenddftf2, linkage);
    @export("__extendsftf2", @import("extendXfYf2.zig").__extendsftf2, linkage);
+    @export("__extendhfsf2", @import("extendXfYf2.zig").__extendhfsf2, linkage);
+    @export("__gnu_h2f_ieee", @import("extendXfYf2.zig").__gnu_h2f_ieee, linkage);


Check this out, you can make this an actual symbol alias:

@export("__gnu_h2f_ieee", @import("extendXfYf2.zig").__extendhfsf2, linkage);

This avoids the overhead of an extra function call.
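As a rough illustration of the difference between a wrapper function and a true symbol alias, here is a sketch in C. It assumes GCC or Clang targeting an ELF platform such as Linux, where the alias function attribute is available (it is not supported on Mach-O). The names convert_impl and convert_alias are hypothetical stand-ins, not the compiler-rt symbols.

```c
#include <stdint.h>

/* Hypothetical stand-in for the real conversion routine. */
uint32_t convert_impl(uint16_t a) {
    return (uint32_t)a << 16;
}

/* A second symbol name bound to the same address as convert_impl.
 * No wrapper body is emitted, so calling it costs nothing extra.
 * Assumption: GCC/Clang on an ELF target. */
uint32_t convert_alias(uint16_t a) __attribute__((alias("convert_impl")));
```

Because both names resolve to one address, comparing the two function pointers yields equality. That shared address is also what makes the alias approach sensitive to how the object files are linked together.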

bnoordhuis

I actually had that in a prior version but it was giving me duplicate symbol definition errors in a couple of tests. It was complaining that __gnu_h2f_ieee conflicted with __extendhfsf2, presumably because they're one and the same address.

andrewrk

Hmm, that's curious. I'd like to look at the LLVM IR if you got that error message. It's supposed to work for every target: http://llvm.org/docs/LangRef.html#aliases

bnoordhuis

For posterity, this is the error it emits:

$ build/zig test std/special/compiler_rt/index.zig
lld: error: duplicate symbol: __gnu_h2f_ieee
>>> defined at extendXfYf2.zig:13 (/home/bnoordhuis/src/zig/std/special/compiler_rt/extendXfYf2.zig:13)
>>> ./zig-cache/test.o:(__extendhfsf2)
>>> defined at extendXfYf2.zig:13
>>> ./zig-cache/compiler_rt.o:(__extendhfsf2)
lld: error: duplicate symbol: __truncsfhf2
>>> defined at truncXfYf2.zig:3 (/home/bnoordhuis/src/zig/std/special/compiler_rt/truncXfYf2.zig:3)
>>> ./zig-cache/test.o:(__gnu_f2h_ieee)
>>> defined at truncXfYf2.zig:3
>>> ./zig-cache/compiler_rt.o:(__gnu_f2h_ieee)

That's with the functions removed from extendXfYf2.zig and truncXfYf2.zig.

Giving the symbols different linkage didn't help. Defining the GNU symbols only when !is_test does help, and it otherwise seems to have no ill effects, so that's what I'll do.

andrewrk · 2018-06-26T17:43:58Z

I'll look into the Windows failure.

bnoordhuis · 2018-06-26T18:13:07Z

I've managed to reproduce it locally under wine. I'll poke at it more tomorrow if you don't get to it first.

bnoordhuis · 2018-06-26T19:28:54Z

I figured out the Windows failures, kind of; it seems to be a stack alignment issue. This fixes it for me:

diff --git a/std/special/compiler_rt/extendXfYf2.zig b/std/special/compiler_rt/extendXfYf2.zig
index c2721be0..a419eefd 100644
--- a/std/special/compiler_rt/extendXfYf2.zig
+++ b/std/special/compiler_rt/extendXfYf2.zig
@@ -20,7 +20,7 @@ pub extern fn __gnu_h2f_ieee(a: u16) f32 {
 
 const CHAR_BIT = 8;
 
-pub fn extendXfYf2(comptime dst_t: type, comptime src_t: type, a: src_t) dst_t {
+inline fn extendXfYf2(comptime dst_t: type, comptime src_t: type, a: src_t) dst_t {
     const src_rep_t = @IntType(false, @typeInfo(src_t).Float.bits);
     const dst_rep_t = @IntType(false, @typeInfo(dst_t).Float.bits);
     const srcSigBits = std.math.floatMantissaBits(src_t);
diff --git a/std/special/compiler_rt/truncXfYf2.zig b/std/special/compiler_rt/truncXfYf2.zig
index 8491db9a..6e463aba 100644
--- a/std/special/compiler_rt/truncXfYf2.zig
+++ b/std/special/compiler_rt/truncXfYf2.zig
@@ -10,7 +10,7 @@ pub extern fn __truncsfhf2(a: f32) u16 {
 
 const CHAR_BIT = 8;
 
-pub fn truncXfYf2(comptime dst_t: type, comptime src_t: type, a: src_t) dst_t {
+inline fn truncXfYf2(comptime dst_t: type, comptime src_t: type, a: src_t) dst_t {
     const src_rep_t = @IntType(false, @typeInfo(src_t).Float.bits);
     const dst_rep_t = @IntType(false, @typeInfo(dst_t).Float.bits);
     const srcSigBits = std.math.floatMantissaBits(src_t);

Anyone have ideas on whether a better fix is possible?

andrewrk · 2018-06-26T19:59:00Z

Here are 2 clues:

zig/std/special/bootstrap.zig

Line 47 in 4de60dd

@setAlignStack(16);

zig/std/special/compiler_rt/extendXfYf2.zig

Lines 81 to 82 in 4de60dd

    const result: dst_rep_t align(@alignOf(dst_t)) = absResult | dst_rep_t(sign) << @intCast(DstShift, dstBits - srcBits);
    return @bitCast(dst_t, result);

Maybe try aligning result before bitcasting it in truncXfYf2, as the other function does:

+    const result: dst_rep_t = absResult | @truncate(dst_rep_t, sign >> @intCast(SrcShift, srcBits - dstBits));
+    return @bitCast(dst_t, result);
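To show where the final "absResult | sign" step sits in a truncation routine, here is a self-contained C sketch of a binary32 to binary16 conversion with round-to-nearest-even. It is not the truncXfYf2 code from this PR: the name float_to_half_bits and the branch layout are illustrative assumptions, and float is assumed to be IEEE 754 binary32.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: narrow a binary32 value to raw binary16 bits,
 * rounding to nearest, ties to even. */
uint16_t float_to_half_bits(float f) {
    uint32_t x;
    memcpy(&x, &f, sizeof x);

    /* Sign shifted down by srcBits - dstBits = 32 - 16 = 16. */
    const uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
    const uint32_t abs = x & 0x7FFFFFFFu;
    uint16_t abs_result;

    if (abs > 0x7F800000u) {
        /* NaN: force the quiet bit and keep the top payload bits. */
        abs_result = (uint16_t)(0x7E00u | ((abs >> 13) & 0x01FFu));
    } else if (abs >= 0x47800000u) {
        /* 65536 and above, including infinity: overflow to infinity. */
        abs_result = 0x7C00u;
    } else if (abs >= 0x38800000u) {
        /* Normal half range: rebias the exponent by 112, drop 13 bits. */
        uint32_t rebased = abs - 0x38000000u;
        uint32_t rem = rebased & 0x1FFFu;
        uint32_t r = rebased >> 13;
        if (rem > 0x1000u || (rem == 0x1000u && (r & 1u))) r++;
        abs_result = (uint16_t)r; /* a carry into the exponent is intended */
    } else {
        /* Result is subnormal or zero in half precision. */
        uint32_t shift = 126u - (abs >> 23);
        if (shift > 24u) {
            abs_result = 0; /* below half of the smallest subnormal */
        } else {
            uint32_t m = (abs & 0x007FFFFFu) | 0x00800000u; /* add implicit bit */
            uint32_t r = m >> shift;
            uint32_t rem = m & ((1u << shift) - 1u);
            uint32_t half = 1u << (shift - 1u);
            if (rem > half || (rem == half && (r & 1u))) r++;
            abs_result = (uint16_t)r;
        }
    }

    /* Merge magnitude and sign, mirroring the last step quoted above. */
    return (uint16_t)(abs_result | sign);
}
```

For example, 1.0f maps to 0x3C00 and 65520.0f, which lies exactly halfway between the largest finite half value and the next step, rounds up to infinity (0x7C00).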

andrewrk · 2018-06-26T20:00:39Z

If you can find the assembly instruction it's crashing on and it looks something like

=> 0x00000000006a53b4 <+676>:	movaps -0xa8(%rbp),%xmm0

then I think that is #1148
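For context on why an instruction like that can fault: movaps requires its memory operand to be 16-byte aligned, so the address rbp - 0xa8 must be a multiple of 16. The following C sketch only illustrates the arithmetic; is_aligned is a hypothetical helper and the addresses in the note below are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when p is a multiple of `alignment`,
 * which must be a power of two. */
int is_aligned(const void *p, size_t alignment) {
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

Since 0xa8 is 8 modulo 16, the operand is aligned only when rbp itself is 8 modulo 16. With a made-up frame pointer of 0x7ff8 the operand address is 0x7f50, which passes the check; with 0x7ff0 it is 0x7f48, which fails it and would trap in movaps.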

bnoordhuis · 2018-06-26T20:02:19Z

Aw, I'm terribly sorry; I forgot to mention that I adopted that change (edit: align(@alignOf(...))) locally but that alone didn't fix the crash. I'll look into @setAlignStack tomorrow.

andrewrk · 2018-06-26T20:13:00Z

I think the Windows ABI expects the stack to always be 16-byte aligned. If that fixes it, we might consider implicitly doing @setAlignStack(16) on all exported functions by default when targeting Windows.

Come to think of it, I wonder if this is related to our 32-bit Windows test failures.

Add support for half-precision floating point operations. Introduce `__extendhfsf2` and `__truncsfhf2` in std/special/compiler_rt. Add `__gnu_h2f_ieee` and `__gnu_f2h_ieee` as aliases that are used in Windows builds. The logic in std/special/compiler_rt/extendXfYf2.zig has been reworked and can now operate on 16-bit floating point types. `extendXfYf2()` and `truncXfYf2()` are marked `inline` to work around a not entirely understood stack alignment issue on Windows when calling the f16 versions of the builtins. closes #1122

bnoordhuis · 2018-06-27T14:22:59Z

I've tried various incarnations of @setAlignStack() but to no effect.

tiehuis reviewed Jun 26, 2018

andrewrk requested changes Jun 26, 2018

bnoordhuis added 4 commits June 27, 2018 16:20

scope variables in floating point cast tests

0ebc7b6

Fixes a bug where the result of a @floatCast wasn't actually checked; it was checking the result from the previous @floatCast.

dry floating-point type definitions

1f45075

simplify comptime floating-point @divTrunc

440c1d5

Replace a conditional ceil/floor call with an unconditional trunc call.

andrewrk merged commit 1b4bae6 into ziglang:master Jun 27, 2018

bnoordhuis mentioned this pull request Jun 28, 2018

Clarify reason implicit cast does not work for large RHS #1168

Merged
