Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A correctly-rounded cube root. #3062

Merged
merged 14 commits into from
Jul 16, 2021
Merged

A correctly-rounded cube root. #3062

merged 14 commits into from
Jul 16, 2021

Conversation

eggrobin
Copy link
Member

Towards #1760; follow-up on #2973.

std::array<double, n> s{};
int k = 0;
s[0] = f[0];
for (int j = 1; j < n; ++j) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a new-style loop.

Sorry, something went wrong.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, following [Nie04] and thus using indices. I guess we could loop for j in some range object, but if we want to go there we need to change a lot of loops.

Sorry, something went wrong.

} else {
s[l + 1] = dʹ;
}
l = l - 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--l

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, following [Nie04].

s[k] = c;
if (d != 0) {
int l = k - 1;
k = k + 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++k

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, following [Nie04].

auto const [cʹ, dʹ] = TwoSum(s[l], s[l + 1]);
s[l] = cʹ;
if (dʹ == 0) {
k = k - 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--k

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, following [Nie04].

static_assert((C - 296) * 0x1p-52 == (2 * 1023 - Γᴋ) / 3);
std::uint64_t const Y = _mm_cvtsi128_si64(_mm_castpd_si128(Y_0));
std::uint64_t const Q = C + Y / 3;
double const q = _mm_cvtsd_f64(_mm_castsi128_pd(_mm_cvtsi64_si128(Q)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code up to this point is identical in both methods, except for the constants. Could we factor it out by passing the constants as template parameters to an auxiliary function? It should be straightforward to inline.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn’t that much code there ; a lot of it is comments on the constants, which differ.
Factoring things out would be very messy since we have a lot of local variables that we want to use throughout.

double const y² = y * y;
double const q² = q * q;
double const q³ = q² * q;
double const d =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea where these formulas come from.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appendix D, it says so in the comment above.

@@ -38,6 +43,77 @@ class CubeRootTest : public ::testing::Test {
: quiet_dead_beef_(FromBits(0x7FF8'0000'DEAD'BEEF)),
signaling_dead_beef_(FromBits(0x7FF0'0000'DEAD'BEEF)) {}

struct RoundedReal {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Real" is not a word in C++. I'd prefer RoundedDouble.

Copy link
Member Author

@eggrobin eggrobin Jul 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Real is a real word in the real world though. We round a real, that is, an element of ℝ (which in this case is the cube root of an element of binary64) to binary64 in any of three ways.
Rounding a double makes little sense, it is in binary64 already.

@eggrobin
Copy link
Member Author

Also corrected a typo in the documentation,

where 𝜀̄𝑟 is a bound for 𝜀̄𝑟.

@pleroy pleroy added the LGTM label Jul 16, 2021
@eggrobin
Copy link
Member Author

retest this please

@eggrobin eggrobin merged commit cee0532 into mockingbirdnest:master Jul 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants