Polygon Cleanup #387

foxox · 2019-03-16T22:17:42Z

I have been using SolveSpace a lot to design things for my 3d printer recently and I'm a professional software engineer so I thought I would take a look at the code.
I cleaned up various things in the Polygon component... const correctness, project code format adherence, int types, pass by const reference for nonprimitives, readability improvements, etc.
These changes should make the cost more robust to bugs in the future and may even have some performance impact.
If you like this I may do the same with other files later.
I can squash commits after review if you like.

CLAassistant · 2019-03-16T22:17:49Z

All committers have signed the CLA.

whitequark · 2019-03-16T22:37:02Z

Thank you for spending time to improve SolveSpace. However, I would ask you that you consult the maintainers before making far-reaching, sweeping changes.

There are a few serious issues with this PR that would prevent it from being merged, and which are general for any changes that can be applied:

Every commit should do one specific change with one specific purpose.
Every commit should have a commit message that clearly states why the change is done.
SolveSpace is a mature codebase that is written in a somewhat unorthodox dialect of C++. We do not change it for the sake of change or for the sake of making it slightly more elegant, since the amount of churn and bugs introduced due to typos and other simple mistakes outweighs the benefits.
In particular, as an extension of the above, SolveSpace does not have a large testsuite. We heavily rely on git annotate for determining what commit introduced some code, and use that for determining how it is supposed to work. Sweeping changes like modifying indentation, adding braces, etc break this completely, which would make further maintenance much harder.

Specific to this PR:

There is no good justification for making local variables (themselves) constant. In a better language (like Rust) this would be the default, but we already have our codebase in C++, and changing almost every single variable declaration does not justify the minimal (if non-zero at all) amount of bugs it would highlight.
There is absolutely no point in replacing i++ with ++i when i is POD. If there is a compiler where the latter is more efficient, then this compiler is explicitly not supported for building SolveSpace. Trivial optimizations can always be relied on happening. (I'm pretty sure this is also true for non-POD cases at this point, but I'd have to check.)

Cleaned up various things in the Polygon component... const correctness, code format, readability improvements, etc.

foxox · 2019-03-16T23:26:23Z

@whitequark Pull requests exist to facilitate this kind of conversation -- thank you. In my day job, a 300 line cleanup PR to one component is nothing... gets reviewed by 2 or 3 other people, they leave a few comments, I push commits in response, get approval, squash the commits into one, it gets merged, and everyone is glad for it. I don't often contribute to open source projects, so your response was a bit surprising. Since you maintain this project, I'll concede on most points since they don't matter.

However, I disagree about const correctness. Const correctness should be applied to all code, even local variables. If a variable is not meant to be mutated after it is initialized, it should be const. It makes the code easier to read and easier to change because it shows the author's intent and (as you mentioned) helps catch bugs. Rust is irrelevant... const is a core C++ feature and it's there to be used. C++ coding standards like HIC++ or the JSF standard or whatever typically include a requirement for const correctness in all situations.

Also, "churn" isn't a problem if code follows best practices and is documented properly. Would things be different if there were more unit tests in this project?

whitequark · 2019-03-16T23:46:32Z

In my day job, a 300 line cleanup PR to one component is nothing... gets reviewed by 2 or 3 other people, they leave a few comments, I push commits in response, get approval, squash the commits into one, it gets merged, and everyone is glad for it. I don't often contribute to open source projects, so your response was a bit surprising.

This is common in certain open source projects with a large team of developers typically paid by top tech companies maintaining them, for example, LLVM. However, SolveSpace has one maintainer--me--I am not paid (in fact I'm beyond broke at this point), and I have health issues that limit the amount of time I can spend on FOSS. I definitely do not have the bandwidth to review thousands of lines of cleanups that just check a box in some coding standard.

C++ coding standards like HIC++ or the JSF standard or whatever typically include a requirement for const correctness in all situations.

Coding standards are not a gospel but a tool in a large array of tools that improve code quality. SolveSpace does not and will not blindly adhere to any particular coding standard or rule unless there's demonstrable evidence that following it (and changing code to follow it) does more good than harm.

If you look through the git log, there is a number of commits that justify a few sweeping changes in great detail, e.g. the commit that changed the initialization style. I expect any cleanups to justify themselves at least that well, including any churn they cause.

Also, "churn" isn't a problem if code follows best practices and is documented properly. Would things be different if there were more unit tests in this project?

I've spent something like 1.5 years learning how to maintain this project without breaking it, and about half of my time was spent fixing bugs that were the direct result of various "cleanups" and "consistency improvements". This codebase is sparsely tested--almost untested, really--and has a very large amount of hidden invariants. Your approach is unfortunately not applicable to a legacy codebase like this.

In short, my direct experience doing what you are trying to do in this PR results in my understanding that it is undesirable and infeasible to do while removing more bugs than adding them. This, of course, could change if this codebase would become well-documented and well-tested to the point where any trivial errors introduced in bulk refactoring would be caught by the test suite. That is not the case today.

whitequark · 2019-03-16T23:49:48Z

However, I disagree about const correctness. Const correctness should be applied to all code, even local variables. If a variable is not meant to be mutated after it is initialized, it should be const. It makes the code easier to read and easier to change because it shows the author's intent and (as you mentioned) helps catch bugs.

To expand on this, I agree with you in principle, and, although I do not use this style in my other C++ projects, I would be open to using it in a greenfield codebase. But I don't think that const correctness for locals justifies a commit that would break git annotate for almost the entire codebase; git annotate is a tool that is critical to understanding the code, so it's not going to happen.

Evil-Spirit · 2019-03-18T05:19:51Z

@foxox const correctness is something you can apply for projects created from scratch. Even in this case, you can't link any third-party libraries without pain. Your programming skills and time will start working on const-correctness maintaining only. If sometime someone show you what really means to follow some code style exactly how it written, you will be scared about typing every space and newline. You will be disarmed completely. My way was a hard - I was stucked inside premature optimizations, const-correctness, template magic, and so on. Pure programming without a result. But I worked with some people who passed through this. Now I don't bother about optimization - there is profiler to show me what's wrong with, I don't bother because of codestyle - I can follow any codestyle. I don't bother about programming language - I can use any. I can blame any code and make it more optimal in some context - make it faster, make it shorter, make it more flexible, or more understandable. But I can't do it all simultaneously - only balance can be reached.

whitequark · 2019-03-18T15:40:12Z

@Evil-Spirit Thanks, I completely forgot about the issue with 3rd party libraries, which we've already hit.

bcmpinc

I see quite a few changes that hurt readability, possibly hurt performance or increase the risk of introducing bugs. For a code cleanup, improving readability should be your primary concern. Applying coding guidelines, can be useful, but does require you to understand their purpose and underlying considerations.

W.r.t. const, I hope you understand that C++ does not really have a concept of constant variables. The only thing that const does is tell the compiler it should create an error whenever you try to assign to that variable name. The variable might still be modified through other means. Hence for compilation and optimization purposes, the compiler completely ignores const. I suppose that those errors are still useful in commercial code, where code quality often varies wildly and can be extremely low, often due to time constraints, frequent deadlines and a lack of maintenance, but this is not commercial code.

bcmpinc · 2019-04-16T18:08:45Z

src/polygon.h

@@ -180,14 +184,14 @@ class STriangle {
        Vector normals[3];
    };

-    static STriangle From(STriMeta meta, Vector a, Vector b, Vector c);
+    static STriangle From(const STriMeta& meta, const Vector& a, const Vector& b, const Vector& c);


Why pass STriMeta and Vector as references? They are small POD-type classes, used in a similar fashion as an int. Just like writing foo(const int& bar) instead of foo(int bar) is less readable and probably slower, so is using const STriMeta& meta as argument.

What's the cutoff? How big does a type need to be before it's worthwhile to pass by reference/pointer instead of by value? I can find a number of references online indicating that it's good practice in C++ to pass any non-primitive type by reference. Has this stuff been profiled?

Has this stuff been profiled?

Yes, and we've converted most functions on the hot paths to pass POD classes that are larger than a pointer to pass-by-const-reference.

bcmpinc · 2019-04-16T18:14:28Z

src/polygon.cpp

-    for(i = 0; i < l.n; i++) {
+int SPointList::IndexForPoint(const Vector& pt) const {
+    size_t i;
+    for(i = 0; i < l.n; ++i) {


In a function returning a signed int value, why define the variable that is used as the return value as an unsigned int (size_t)?

Hmm, this function should probably initialize int i with -1 and return it at the end

bcmpinc · 2019-04-16T18:27:55Z

src/polygon.cpp

            if(se->EdgeCrosses(a, b, pi, spl)) {
-                inters++;
+                ++inters;


Optimizations like these are so easy that a compiler does them for you. A compiler is probably much better at that too, so don't try to outsmart it.

It's not a matter of outsmarting the compiler. Pre-increment is the preferred operation in this context because it is simpler, which makes it more readable/understandable and less prone to errors introduced in the future. Ex: imagine if something like this would need to be done in the future: const size_t num_inters_this_iteration = inters++; It's easy to make this kind of bug when post-increment is used habitually. It's a shame that the language wasn't called ++C

I don't agree that pre-increment is simpler or more readable in this context (indeed, since no other codebase I work on uses pre-increment pervasively, it trips me up each time), but more importantly, you are not considering the unintended consequences of making such a change globally. Namely:

it is not semantics-preserving, and so it will introduce bugs, and we don't have nearly enough test coverage to catch them;

it screws up git blame, and we don't have nearly enough documentation to not rely on that.

Therefore this change and changes similar to it are not useful, but are harmful.

bcmpinc · 2019-04-16T19:10:05Z

src/polygon.cpp

+bool STriangle::ContainsPointProjd(const Vector& n, const Vector& p) const {
+    const Vector ab = b.Minus(a);
+    const Vector bc = c.Minus(b);
+    const Vector ca = a.Minus(c);


Splitting these out over multiple lines is indeed useful. In this case it makes it easier to check that they all follow the same pattern xy = y.Minux(x) where x, y are replaced with a, b or c.

bcmpinc · 2019-04-16T19:47:14Z

src/polygon.h

    void FixContourDirections();
    void Clear();
    bool SelfIntersecting(Vector *intersectsAt) const;
    bool IsEmpty() const;
    Vector AnyPoint() const;
-    void OffsetInto(SPolygon *dest, double r) const;
+    void OffsetInto(SPolygon *dest, const double r) const;


Whether r is modified inside OffsetInto is an implementation detail and should not be exposed in the header file.

bcmpinc · 2019-04-16T19:51:35Z

src/polygon.h

+    bool IsClockwiseProjdToNormal(const Vector& n) const;
+    bool ContainsPointProjdToNormal(const Vector& n, const Vector& p) const;
+    void OffsetInto(SContour *dest, const double r) const;
+    void CopyInto(SContour * const dest) const;


This is confusing, you define dest as const, but in the implementation dest is clearly modified. Yes, I understand it is the pointer which is const, not the object being pointed to. However it takes time to realize that, so here, using const hurts readability.

Isn't this basic C++?

No, it is nothing but visual noise in the header (the constness of the argument inside the function has no bearing whatsoever on its public interface; yet another design mistake in C++).

I think it's like writing void foo(const int param). No one writing code like this.

bcmpinc · 2019-04-16T19:54:38Z

src/polygon.cpp

-    for(i = 0; i < (l.n - 1); i++) {
+void SContour::MakeEdgesInto(SEdgeList * const el) const {
+    size_t i;
+    for(i = 0; (i + 1) < l.n; ++i) {


The loop condition here is correct, but written in a non-usual way to avoid underflow errors (size_t is unsigned). You need to be careful about this when using size_t, which makes this type prone for introducing errors and thus should be avoided.

whitequark · 2019-04-17T20:16:11Z

@bcmpinc Thank you, this review reflects my thoughts as well.

whitequark · 2019-05-21T01:08:24Z

I believe this kind of cleanup is unproductive and so I am going to close this PR. The reasons are outlined above multiple times and I don't have anything to add to that.

I would, of course, be happy to collaborate with you on other changes that take into account the requirements imposed by the project's history and maintenance needs.

foxox force-pushed the jdrake-polygon-cleanup branch from cb42d82 to 2577086 Compare March 16, 2019 22:26

Polygon Cleanup

4ae706b

Cleaned up various things in the Polygon component... const correctness, code format, readability improvements, etc.

foxox force-pushed the jdrake-polygon-cleanup branch from 2577086 to 4ae706b Compare March 16, 2019 22:39

bcmpinc reviewed Apr 16, 2019

View reviewed changes

whitequark closed this May 21, 2019

ruevs mentioned this pull request Jun 1, 2022

Use nullptr instead of NULL #1250

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Polygon Cleanup #387

Polygon Cleanup #387

foxox commented Mar 16, 2019

CLAassistant commented Mar 16, 2019 •

edited

Loading

whitequark commented Mar 16, 2019

foxox commented Mar 16, 2019

whitequark commented Mar 16, 2019

whitequark commented Mar 16, 2019 •

edited

Loading

Evil-Spirit commented Mar 18, 2019 •

edited

Loading

whitequark commented Mar 18, 2019

bcmpinc left a comment

bcmpinc Apr 16, 2019

foxox Apr 18, 2019

whitequark Apr 18, 2019

bcmpinc Apr 16, 2019

foxox Apr 18, 2019

bcmpinc Apr 16, 2019

foxox Apr 18, 2019

whitequark Apr 18, 2019

bcmpinc Apr 16, 2019

bcmpinc Apr 16, 2019

bcmpinc Apr 16, 2019

foxox Apr 18, 2019

whitequark Apr 18, 2019

Evil-Spirit Apr 18, 2019 •

edited

Loading

bcmpinc Apr 16, 2019

whitequark commented Apr 17, 2019

whitequark commented May 21, 2019

Polygon Cleanup #387

Polygon Cleanup #387

Conversation

foxox commented Mar 16, 2019

CLAassistant commented Mar 16, 2019 • edited Loading

whitequark commented Mar 16, 2019

foxox commented Mar 16, 2019

whitequark commented Mar 16, 2019

whitequark commented Mar 16, 2019 • edited Loading

Evil-Spirit commented Mar 18, 2019 • edited Loading

whitequark commented Mar 18, 2019

bcmpinc left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Evil-Spirit Apr 18, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

whitequark commented Apr 17, 2019

whitequark commented May 21, 2019

CLAassistant commented Mar 16, 2019 •

edited

Loading

whitequark commented Mar 16, 2019 •

edited

Loading

Evil-Spirit commented Mar 18, 2019 •

edited

Loading

Evil-Spirit Apr 18, 2019 •

edited

Loading