Small optimization to Vector::Equals to improve performance. #308
Conversation
This function showed up surprisingly high on a CPU time profile when the GUI was unresponsive "doing things". Removed a duplicated difference computation in the not-equal case, and switched to abs and a single compare instead of two compares with a negation. It seems to have moved the function further down in the profile.
- double dx = v.x - x; if(dx < -tol || dx > tol) return false;
- double dy = v.y - y; if(dy < -tol || dy > tol) return false;
- double dz = v.z - z; if(dz < -tol || dz > tol) return false;
+ const Vector dv = this->Minus(v);
Are you sure this is faster than it was?
I think the Minus() call can hurt performance, because it runs on every call. It's faster to run the bbox test first, without computing the full difference.
The minus can be vectorized, or sometimes (seen in practice) auto-vectorized.
- return (this->Minus(v)).MagSquared() < tol*tol;
+ return dv.MagSquared() < tol*tol;
This is a super rare case where we are inside the bbox but not inside the sphere, so we could even remove this and just return true. But I don't believe this really helps. Look at the SnapToMesh function.
Did you run the profiling on a release build?
RelWithDebInfo, yes.
I don't remember the exact context of this little change, but I do remember that my profiling tests (which included loading and exporting to STL, as well as manipulating a model that had become too complex to edit interactively) showed these really simple things, in some cases sanity checks or even ostensible "optimizations", as major hotspots. After making these changes, which mostly reduce duplicated work and favor vector operations over scalar ones (so auto-vectorization can take place, or eventually some internals can be replaced with a manually vectorized library like Eigen), these functions disappeared from the hotspot list, leaving primarily list and associated string operations/copies. It was also noticeably faster and more usable in the aforementioned pathological model.
So, what problem did you solve? Which case of performance drop? Can you describe your test case in detail? I will try to test it.
- double dy = v.y - y; if(dy < -tol || dy > tol) return false;
- double dz = v.z - z; if(dz < -tol || dz > tol) return false;
+ const Vector dv = this->Minus(v);
+ if (std::abs(dv.x) > tol) return false;
Is abs really faster? Is this because it compiles as an intrinsic?
I don't remember if I tested this single change independently. However, you're on the right track with the intrinsic: my hunch is that it gives better information to the compiler and thus can be optimized better. (It also reduces operations: one comparison plus the abs, which is probably just a bit manipulation, versus two comparisons; but that's down at the level where I'm not sure intuition applies when compiling with a sophisticated modern compiler.)
I have an opinion about this (of course it should be tested): in your version, dv is always calculated for all three dimensions, which probably results in a function call (if inlining is not performed). With the previous implementation, the function had the ability to terminate early without computing the whole vector difference.
(Force-pushed: compare 371f33b to fb1065d)
Thank you @rpavlik, and I apologize for taking so long to review.
np, glad you liked it enough to merge it! I think I recently pushed some WIP branches from my old machine that I had alluded to before, so if you're interested in some cleanup, you're welcome to look at them (or ignore them, your choice). I also recently, accidentally, made a bit of a promo/intro for SolveSpace by showing how to customize a little 3D-printable gizmo: https://www.thingiverse.com/thing:3625652
I have a lot of workload, so I don't think I have nearly enough time to merge someone else's cleanups without their assistance, explanations of rationale, etc.
Certainly, no trouble at all. I've still got it on my backlog to review those and push the results as proper PRs. I should load up one of the torture files I was working on when I found the need for these patches, and see if the problems have already gone away.
This, as well as some other unexpected locations (like solvespace/src/polygon.cpp line 23 and solvespace/src/util.cpp line 587, both at c6fc012), showed up high on the CPU time profile.
This is a fairly small change, but it seemed to improve the profiling results here. I imagine there are further or better ways to do this (do we have profiling data showing that the early-outs actually improve performance? Since this is essentially "inner-loop" code, they might just be extra pain for the branch predictor), so please do perform your own verification of whether this improves perf for you before merging.