
Bernstein polynomials with no branching. #591

Merged
merged 1 commit into solvespace:master on May 2, 2020

Conversation

phkahler
Member

@phkahler phkahler commented May 1, 2020

This is cleaner and smaller, but I was not able to measure a performance impact.

@@ -13,84 +13,31 @@
// and convergence should be fast by now.
#define RATPOLY_EPS (LENGTH_EPS/(1e2))

double SolveSpace::Bernstein(int k, int deg, double t)
// indexed by [degree][k][exponent]
const double bernstein_coeff[4][4][4] = {
Contributor

@whitequark whitequark May 1, 2020


These should probably be static too, for the same reason the functions are. In this case probably even static constexpr.

Member Author


Updated.

@whitequark
Contributor

(Thanks for keeping the git history clean!)

I'm not sure why the Windows build breaks. Does our MSVC version lack constexpr? I thought it should have C++11...

@phkahler
Member Author

phkahler commented May 2, 2020

Looks like MS didn't add constexpr until VS2015. See the bottom of the page:
https://docs.microsoft.com/en-us/cpp/cpp/constexpr-cpp?view=vs-2019
I'll make them static const.
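
The portability point above can be sketched in code. This is an illustrative fragment (the names are made up, not from the PR): `_MSC_VER` 1900 corresponds to VS2015, the first MSVC release that accepts C++11 `constexpr`, so `static const` is the spelling that builds everywhere.

```cpp
#include <cassert>

// Illustrative only: pick the strongest qualifier the compiler supports.
// Pre-VS2015 MSVC (_MSC_VER < 1900) rejects C++11 'constexpr', so fall
// back to 'static const'; the table ends up in read-only storage either way.
#if defined(_MSC_VER) && _MSC_VER < 1900
static const double kTable[2] = { 1.0, 2.0 };        // old MSVC: const only
#else
static constexpr double kTable[2] = { 1.0, 2.0 };    // C++11 and later
#endif
```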

}
break;
}
ssassert(false, "Unexpected degree of spline");
Member

@ruevs ruevs May 2, 2020


Hm, I was going to complain about the lack of range checking, but it seems that when the calls are evaluated at compile time nothing "bad" happens:
https://godbolt.org/z/oLA-gN
Maybe leave an assert for debug mode (-O0)?

Member Author


@ruevs I was a little concerned about the range checking too, but these functions are only called by other functions in ratpoly.cpp, and those lack range checking as well.

{ { -2.0,2.0,0.0 }, { 2.0,-4.0,0.0 },{ 0.0,2.0,0.0 }, { 0.0,0.0,0.0 } },
{ { -3.0,6.0,-3.0 },{ 3.0,-12.0,9.0 },{ 0.0,6.0,-9.0}, { 0.0,0.0,3.0 } } };

static double Bernstein(int k, int deg, double t)
Member


Maybe
static double Bernstein(const int k, const int deg, const double t)
But the optimizer figures it out anyway...

}

double SolveSpace::BernsteinDerivative(int k, int deg, double t)
static double BernsteinDerivative(int k, int deg, double t)
Member


Maybe
static double BernsteinDerivative(const int k, const int deg, const double t)
But the optimizer figures it out anyway...
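
For context, a companion sketch of the derivative side of the scheme, reconstructed from the coefficient rows quoted earlier in this review (the degree-2 and degree-3 rows match those lines; the rest and the function body are my reconstruction, not the exact merged code). Each row holds the power-basis coefficients of d/dt B(k, deg), indexed [degree][k][exponent]; differentiating drops the degree by one, so three coefficients suffice and evaluation is a short branch-free Horner pass.

```cpp
#include <cmath>

// Power-basis coefficients of the Bernstein basis derivatives,
// indexed [degree][k][exponent]. Rows for degrees 2 and 3 are the
// ones quoted in the diff; degrees 0 and 1 are filled in to match.
static const double bernstein_deriv_coeff[4][4][3] = {
    // degree 0: d/dt 1 = 0
    { {  0.0,  0.0,  0.0 }, {  0.0,   0.0, 0.0 }, { 0.0, 0.0,  0.0 }, { 0.0, 0.0, 0.0 } },
    // degree 1: d/dt (1-t) = -1, d/dt t = 1
    { { -1.0,  0.0,  0.0 }, {  1.0,   0.0, 0.0 }, { 0.0, 0.0,  0.0 }, { 0.0, 0.0, 0.0 } },
    // degree 2
    { { -2.0,  2.0,  0.0 }, {  2.0,  -4.0, 0.0 }, { 0.0, 2.0,  0.0 }, { 0.0, 0.0, 0.0 } },
    // degree 3
    { { -3.0,  6.0, -3.0 }, {  3.0, -12.0, 9.0 }, { 0.0, 6.0, -9.0 }, { 0.0, 0.0, 3.0 } },
};

// Branch-free evaluation: table lookup plus one Horner step.
static double BernsteinDerivative(int k, int deg, double t) {
    const double *c = bernstein_deriv_coeff[deg][k];
    return (c[2] * t + c[1]) * t + c[0];
}
```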

// indexed by [degree][k][exponent]
static const double bernstein_coeff[4][4][4] = {
Member


I would even move these inside the respective functions. (as static const of course).

Member Author


I moved them inside, thinking it would look worse, but now I like it.
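
For readers following along, a standalone sketch of the merged approach, written from the discussion above (an approximation of the final code, not an exact copy): each Bernstein basis polynomial up to degree 3 is expanded in powers of t, the coefficients live in a table indexed [degree][k][exponent], and evaluation is one branch-free Horner pass.

```cpp
#include <cmath>

// Power-basis expansions of the Bernstein basis B(k, deg),
// indexed [degree][k][exponent] as in the PR.
static const double bernstein_coeff[4][4][4] = {
    // degree 0: B(0,0) = 1
    { { 1,  0, 0,  0 }, { 0, 0,  0, 0 }, { 0, 0, 0,  0 }, { 0, 0, 0, 0 } },
    // degree 1: 1-t, t
    { { 1, -1, 0,  0 }, { 0, 1,  0, 0 }, { 0, 0, 0,  0 }, { 0, 0, 0, 0 } },
    // degree 2: (1-t)^2, 2t(1-t), t^2
    { { 1, -2, 1,  0 }, { 0, 2, -2, 0 }, { 0, 0, 1,  0 }, { 0, 0, 0, 0 } },
    // degree 3: (1-t)^3, 3t(1-t)^2, 3t^2(1-t), t^3
    { { 1, -3, 3, -1 }, { 0, 3, -6, 3 }, { 0, 0, 3, -3 }, { 0, 0, 0, 1 } },
};

// No branching on deg or k: just a table lookup and a cubic Horner step.
static double Bernstein(int k, int deg, double t) {
    const double *c = bernstein_coeff[deg][k];
    return ((c[3] * t + c[2]) * t + c[1]) * t + c[0];
}
```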

@ruevs
Member

ruevs commented May 2, 2020

By the way, you were probably not able to measure a performance difference because modern compilers are pretty good. Take a look at this:
https://godbolt.org/z/XxefSH
But I still like your new implementation :-)

@phkahler
Member Author

phkahler commented May 2, 2020

@ruevs with the new structure we could do even better. These two functions are always called from within a loop over the k-th polynomial. We could transpose the inner 2 dimensions of the array so we'd have vectors of coefficients for each power. Then write fixed-size loops 0-3 to do each term of the polynomials. SSE or AVX optimizations would compute all 4 polynomials in about the time this does one. It could return (or fill in) an array of 4 results and take the call out of the loops in higher level functions (which might then be more ready for the compiler to optimize).

But that all seems like premature optimisation. I think this PR reduces complexity ;-)
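
The transposed layout described above can be sketched as follows. This is a hypothetical illustration written from the comment, not the code on the author's branch: the inner two dimensions are swapped to [degree][exponent][k], so a fixed-size Horner loop produces all four basis values at once, and the inner per-k loop is a natural candidate for auto-vectorization.

```cpp
#include <cmath>

// Transposed coefficient table: [degree][exponent][k].
// Each inner row holds, for one power of t, the coefficients of all
// four basis polynomials of that degree (zeros pad unused entries).
static const double bernstein_coeff_t[4][4][4] = {
    // degree 0
    { { 1, 0, 0, 0 }, {  0, 0,  0, 0 }, { 0,  0,  0, 0 }, {  0, 0,  0, 0 } },
    // degree 1: B0 = 1 - t, B1 = t
    { { 1, 0, 0, 0 }, { -1, 1,  0, 0 }, { 0,  0,  0, 0 }, {  0, 0,  0, 0 } },
    // degree 2
    { { 1, 0, 0, 0 }, { -2, 2,  0, 0 }, { 1, -2,  1, 0 }, {  0, 0,  0, 0 } },
    // degree 3
    { { 1, 0, 0, 0 }, { -3, 3,  0, 0 }, { 3, -6,  3, 0 }, { -1, 3, -3, 1 } },
};

// Evaluate all four Bernstein basis polynomials of 'deg' at 't' in one pass.
// The fixed-size k loops are the vectorization-friendly part.
static void BernsteinAll(int deg, double t, double out[4]) {
    for (int k = 0; k < 4; k++)
        out[k] = bernstein_coeff_t[deg][3][k];
    for (int e = 2; e >= 0; e--)
        for (int k = 0; k < 4; k++)
            out[k] = out[k] * t + bernstein_coeff_t[deg][e][k];
}
```

Callers such as a hypothetical PointAt() loop would then make one call per parameter value instead of one per basis function.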

@phkahler phkahler merged commit 7366a6c into solvespace:master May 2, 2020
@phkahler
Member Author

phkahler commented May 3, 2020

@ruevs I did the parallel implementation on my bernstein branch if you want to see. I used plain for-loops to go over all 4 polynomials one term at a time. From what I've read, that's what modern vectorizing compilers like. It slowed things down dramatically (test went from 127 ms per frame to 137 ms) when I first changed it but only returned 1 of the 4 values. I didn't expect that, thinking AVX could do all 4 in the time it would normally do only 1. Then I realized I'm not sure how to confirm my compiler settings when building SolveSpace - it's probably using Fedora defaults? IDK. Then I modified it to the latest form and changed PointAt() to make only 1 call, and that got the performance back to where it was. I then modified the other 3 functions to use the vector form of the functions and ended up back at 127 ms, but it's not doing 4x the work any more. I still think it could run much faster and probably isn't being properly optimized (I'm on a Zen+ core).

Anyway, I have a new version that theoretically could be faster, but I'm considering it shelved until I figure out how to do better measurements and get a handle on what GCC is doing or not.

Lesson: SolveSpace does make a LOT of calls to PointAt() and will suffer measurably if that function is implemented poorly.

@whitequark
Contributor

> Then I realized I'm not sure how to confirm my compiler setting when building SolveSpace - it's probably using Fedora defaults? IDK.

I suspect you'll need to pass -march=native explicitly to see any use of AVX. Also, AVX isn't necessarily going to result in an overall throughput increase due to AVX offset reducing turbo frequency on Intel CPUs.

fbradasc added a commit to fbradasc/solvespace that referenced this pull request May 4, 2020
Bernstein polynomials with no branching. (solvespace#591)
@phkahler
Member Author

phkahler commented May 4, 2020

@whitequark where do I put that? This build system is new to me and I can't find where the compiler flags get set.

@rpavlik
Contributor

rpavlik commented May 4, 2020

CMAKE_CXX_FLAGS in the CMake GUI or on the command line, e.g.:

mkdir build
cd build
cmake .. -G Ninja -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CXX_FLAGS=-march=native

)

The lowest-hanging fruit performance-wise that I noticed some (semi-long) time ago, at least when testing on Windows, was the stuff I poked at in #432 (recently updated). I thought I had handled PointAt, but I was confusing it with DistanceToLine, another hotspot I saw when working with models that caused super-slow recompute/export.
