New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add f16 type #1159
add f16 type #1159
Conversation
|
||
// Various constants whose values follow from the type parameters. | ||
// Any reasonable optimizer will fold and propagate all of these. | ||
const srcBits: comptime_int = @sizeOf(src_t) * CHAR_BIT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can omit all these explicit comptime_int
declarations. I think it is clear enough that all the values are comptime implicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. Just a minor thing and then I think it's good to go.
return extendXfYf2(f32, f16, @bitCast(f16, a)); | ||
} | ||
|
||
pub extern fn __gnu_h2f_ieee(a: u16) f32 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would be better as an actual symbol alias like this:
zig/std/special/compiler_rt/index.zig
Lines 23 to 24 in 8e71428
@export("__extenddftf2", @import("extendXfYf2.zig").__extenddftf2, linkage); | |
@export("__extendsftf2", @import("extendXfYf2.zig").__extendsftf2, linkage); |
std/special/compiler_rt/index.zig
Outdated
@@ -22,6 +22,11 @@ comptime { | |||
@export("__floatuntidf", @import("floatuntidf.zig").__floatuntidf, linkage); | |||
@export("__extenddftf2", @import("extendXfYf2.zig").__extenddftf2, linkage); | |||
@export("__extendsftf2", @import("extendXfYf2.zig").__extendsftf2, linkage); | |||
@export("__extendhfsf2", @import("extendXfYf2.zig").__extendhfsf2, linkage); | |||
@export("__gnu_h2f_ieee", @import("extendXfYf2.zig").__gnu_h2f_ieee, linkage); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check this out, you can make this an actual symbol alias:
@export("__gnu_h2f_ieee", @import("extendXfYf2.zig").__extendhfsf2, linkage);
This avoids an extra function call overhead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually had that in a prior version but it was giving me duplicate symbol definition errors in a couple of tests. It was complaining __gnu_h2f_ieee
conflicted with __extendhfsf2
, presumably because they're one and the same address.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that's curious. I'd like to look at the LLVM IR if you got that error message. It's supposed to work for every target: http://llvm.org/docs/LangRef.html#aliases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For posterity, this is the error it emits:
$ build/zig test std/special/compiler_rt/index.zig
lld: error: duplicate symbol: __gnu_h2f_ieee
>>> defined at extendXfYf2.zig:13 (/home/bnoordhuis/src/zig/std/special/compiler_rt/extendXfYf2.zig:13)
>>> ./zig-cache/test.o:(__extendhfsf2)
>>> defined at extendXfYf2.zig:13
>>> ./zig-cache/compiler_rt.o:(__extendhfsf2)
lld: error: duplicate symbol: __truncsfhf2
>>> defined at truncXfYf2.zig:3 (/home/bnoordhuis/src/zig/std/special/compiler_rt/truncXfYf2.zig:3)
>>> ./zig-cache/test.o:(__gnu_f2h_ieee)
>>> defined at truncXfYf2.zig:3
>>> ./zig-cache/compiler_rt.o:(__gnu_f2h_ieee)
That's with the functions removed from extendXfYf2.zig and truncXfYf2.zig.
Giving the symbols different linkage didn't help. Defining the gnu symbols only when !is_test
does and otherwise seems to have no ill effects so that's what I'll do.
I'll look into the windows failure. |
I've managed to reproduce it locally under wine. I'll poke at it more tomorrow if you don't get to it first. |
I figured out the Windows failures, kind of; it seems to be a stack alignment issue. This fixes it for me: diff --git a/std/special/compiler_rt/extendXfYf2.zig b/std/special/compiler_rt/extendXfYf2.zig
index c2721be0..a419eefd 100644
--- a/std/special/compiler_rt/extendXfYf2.zig
+++ b/std/special/compiler_rt/extendXfYf2.zig
@@ -20,7 +20,7 @@ pub extern fn __gnu_h2f_ieee(a: u16) f32 {
const CHAR_BIT = 8;
-pub fn extendXfYf2(comptime dst_t: type, comptime src_t: type, a: src_t) dst_t {
+inline fn extendXfYf2(comptime dst_t: type, comptime src_t: type, a: src_t) dst_t {
const src_rep_t = @IntType(false, @typeInfo(src_t).Float.bits);
const dst_rep_t = @IntType(false, @typeInfo(dst_t).Float.bits);
const srcSigBits = std.math.floatMantissaBits(src_t);
diff --git a/std/special/compiler_rt/truncXfYf2.zig b/std/special/compiler_rt/truncXfYf2.zig
index 8491db9a..6e463aba 100644
--- a/std/special/compiler_rt/truncXfYf2.zig
+++ b/std/special/compiler_rt/truncXfYf2.zig
@@ -10,7 +10,7 @@ pub extern fn __truncsfhf2(a: f32) u16 {
const CHAR_BIT = 8;
-pub fn truncXfYf2(comptime dst_t: type, comptime src_t: type, a: src_t) dst_t {
+inline fn truncXfYf2(comptime dst_t: type, comptime src_t: type, a: src_t) dst_t {
const src_rep_t = @IntType(false, @typeInfo(src_t).Float.bits);
const dst_rep_t = @IntType(false, @typeInfo(dst_t).Float.bits);
const srcSigBits = std.math.floatMantissaBits(src_t); Anyone have ideas on whether a better fix is possible? |
Here are 2 clues: Line 47 in 4de60dd
zig/std/special/compiler_rt/extendXfYf2.zig Lines 81 to 82 in 4de60dd
maybe try aligning + const result: dst_rep_t = absResult | @truncate(dst_rep_t, sign >> @intCast(SrcShift, srcBits - dstBits));
+ return @bitCast(dst_t, result); |
If you can find out the assembly line it's crashing on and it looks something like
then I think that is #1148 |
Aw, I'm terribly sorry; I forgot to mention that I adopted that change (edit: |
I think the Windows ABI expects the stack to be 16 bytes aligned always. If that fixes it we might consider implicitly doing Come to think of it I wonder if this is related to our 32-bit windows test failures. |
Fixes a bug where the result of a @floatCast wasn't actually checked; it was checking the result from the previous @floatCast.
Add support for half-precision floating point operations. Introduce `__extendhfsf2` and `__truncsfhf2` in std/special/compiler_rt. Add `__gnu_h2f_ieee` and `__gnu_f2h_ieee` as aliases that are used in Windows builds. The logic in std/special/compiler_rt/extendXfYf2.zig has been reworked and can now operate on 16 bits floating point types. `extendXfYf2()` and `truncXfYf2()` are marked `inline` to work around a not entirely understood stack alignment issue on Windows when calling the f16 versions of the builtins. closes #1122
Replace a conditional ceil/floor call with an unconditional trunc call.
I've tried various incarnations of |
Add support for half-precision floating point operations.
Introduce
__extendhfsf2
and__truncsfhf2
in std/special/compiler_rtand add
__gnu_h2f_ieee
and__gnu_f2h_ieee
as aliases; they are usedin Windows builds.
The logic in std/special/compiler_rt/extendXfYf2.zig has been reworked
and can now operate on 16 bits floating point types.
closes #1122