Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mkmf improvements #4118

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

joankaradimov
Copy link
Contributor

@joankaradimov joankaradimov commented Aug 27, 2016

This builds on what is done in #4031. What it does is help some native extensions pass the configuration phase and fail in the build phase. Thus a better error message is reported.

P.S.
As far as I understand the change I did in 96cf5a8 must go to https://github.com/jruby/ruby. However, I am not sure where. Is it only in the jruby-ruby_2_3_1 branch?

@headius
Copy link
Member

headius commented Sep 6, 2016

@joankaradimov We keep our copy of stdlib in jruby/ruby but usually don't bother contributors with committing there. It's not a great process but I think it beats maintaining a patch and always pulling pristine stdlib for our builds.

I'll review your changes and get it merged in.

@headius
Copy link
Member

headius commented Sep 6, 2016

Ok, quick review and it seems mostly good.

Regarding removing ruby.h from default headers...

It seems to me that doing so may make C extensions produce a very confusing error about not being able to find identifiers that should be present from ruby.h, rather than the nice message indicating we don't have C ext support. If people did not have to add ruby.h to their extconf.rb before, then they're expecting it to be added for them. Am I correct in this?

Of course the first fix I can come up with is to make ruby.h stub out ALL identifiers and present a better error when they're accessed at build time. That's not great either, and it would mean we have to keep updating it.

@joankaradimov
Copy link
Contributor Author

If people did not have to add ruby.h to their extconf.rb before, then they're expecting it to be added for them. Am I correct in this?

That sounds correct, yes. It did not occur to me, because I never saw it in practice in the native extension gems I tested with.

Of course the first fix I can come up with is to make ruby.h stub out ALL identifiers and present a better error when they're accessed at build time. That's not great either, and it would mean we have to keep updating it.

I can't really come up with a really good alternative. I do have some ideas, though...

One thing that could be done is to change all #error statements to #warning and bring back ruby.h to the default headers. That would mean that the configuration phase will spam stderr with warnings, even though the gem might compile without issues. However, for the gems that do fail (actual native extensions), the warnings should be enough to indicate what's going on.

Another thing I'm thinking of is having two sets of headers. The original ones from MRI can be used during the configuration phase and the ones with #error in them can be used during the actual gem compilation. But that would mean that some errors in the configuration phase caused by missing identifiers will still happen. This time it will happen during linking, though. Maybe this can be handled somehow.

A third things is to hard-code some regexes that look at the src argument of create_tmpsrc. That way we can fail gracefully if anything that looks like a Ruby C API identifier is detected. Unfortunately, this would need to constantly be updated too (although maybe to a lesser degree, since the stuff in ruby.h tend to have some common prefixes).

@headius
Copy link
Member

headius commented Sep 21, 2016

I think our best option would be some way to automate the generation of a ruby.h that defines all symbols but errors if they're accessed. I don't know how to do that at the moment, though :-)

@headius
Copy link
Member

headius commented Sep 22, 2016

A short-term improvement on simply removing ruby.h from default headers would be to provide an EMPTY ruby.h. Users will get symbol not found errors if they try to build a C extension on JRuby, but non-extensions would build fine. The error messages would not be very helpful, though.

I have a simple script that can generate an error-stubbed header from a Ruby dynamic library. This may be sufficient.

Here's the generator for the ruby.h:

nm -gUfj /usr/lib/libruby.2.0.0.dylib | ruby -e 'while x = gets; x = x[1..-1].chomp; puts "\#define #{x}(...) \"JRuby does not support Ruby C extensions\"[-1]"; end' > ruby.h

Here's the sort of defines it creates for all exported Ruby symbols:

#define InitVM_Enumerator(...) "JRuby does not support Ruby C extensions"[-1]
#define Init_Array(...) "JRuby does not support Ruby C extensions"[-1]
#define Init_BareVM(...) "JRuby does not support Ruby C extensions"[-1]
#define Init_Bignum(...) "JRuby does not support Ruby C extensions"[-1]
...

This allows the following code to compile just fine:

#include "./ruby.h"

int main() {
}

But the following code produces an error:

#include "./ruby.h"

int main() {
  rb_str_new("foo");
}

Unfortunately, the error is still a bit cryptic...but it does at least have the cause in plain text. I've had little luck finding a better way:

$ cc blah.c
blah.c:4:3: warning: array index -1 is before the beginning of the array [-Warray-bounds]
  rb_str_new("foo");
  ^~~~~~~~~~~~~~~~~
././ruby.h:1383:25: note: expanded from macro 'rb_str_new'
#define rb_str_new(...) "JRuby does not support Ruby C extensions"[-1]
                        ^                                          ~~
1 warning generated.

It's a start, anyway! We can generate it periodically from the latest Ruby major release.

@drosehn
Copy link

drosehn commented Sep 22, 2016

At least for someone using 'cc' on Yosemite, the following might be slightly more helpful:

/* This is a dummy routine which is only used by deliberately "calling"
 * it with an unwanted parameter, thus generating a compile-time error.
 * But the routine itself needs to compile without any warnings.
 */
static int compile_time_ERROR(void) { return (99); }

#define rb_str_new(...) compile_time_ERROR("JRuby_does_not_support_Ruby_C_Extensions")

Which has the advantage that the error that this generates will underline the key information:

poison_main.c:21:5: error: too many arguments to function call, expected 0, have
      1
    rb_str_new("foo");
    ^~~~~~~~~~~~~~~~~
./ruby.h:8:44: note: expanded from macro 'rb_str_new'
  ...compile_time_ERROR("JRuby_does_not_support_Ruby_C_Extensions")
     ~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./ruby.h:6:1: note: 'compile_time_ERROR' declared here
static int compile_time_ERROR(void) { return (99); }
^
1 error generated.

Disclaimer: I have tested this on exactly one compiler on exactly one system. I have no idea how other C-compilers would format this error. You'd want the "compile_time_ERROR" routine to have a name which is very unlikely to be used by any legit program, to avoid errors in the declaration of that routine.

@headius
Copy link
Member

headius commented Sep 22, 2016

@drosehn Thanks for the suggestion! Yes, I need to test both our versions on gcc; it may display the error message and nothing else, which would make this approach largely moot.

I find it interesting that with all the compile-time trickery possible in C, there's still no good way to stub out a function such that it produces a useful error when referenced.

@headius
Copy link
Member

headius commented Sep 22, 2016

Oh, another missing bit from my patch: typedefs from MRI's ruby.h. Without those, many C exts won't even get to our clever error tricks; they'll error out before even reaching a function call. So all those types need to be handled in a similar way.

@joankaradimov
Copy link
Contributor Author

joankaradimov commented Oct 6, 2016

Something like this could be used for the compile time error:

#include <assert.h>

#if defined(static_assert) || __cplusplus >= 201103L
#define compile_time_ERROR(MSG) static_assert(0, MSG)
#else
static int compile_time_ERROR(void) { return (99); }
#endif

... it tries to detect if you are running C 11 or C++ 11 compiler. In these cases the compiler should either have a macro (C) or a keyword (C++) called static_assert. It should provide a bit friendlier error message [if possible].

Here's an example:

Joan ~ $ gcc --std=c89 test.c
test.c: In function ‘main’:
test.c:9:25: error: too many arguments to function ‘compile_time_ERROR’
 #define rb_str_new(...) compile_time_ERROR("JRuby_does_not_support_Ruby_C_Extensions")
                         ^
test.c:12:3: note: in expansion of macro ‘rb_str_new’
   rb_str_new("asd");
   ^
test.c:6:12: note: declared here
 static int compile_time_ERROR(void) { return (99); }
            ^
Joan ~ $ gcc --std=c11 test.c
In file included from test.c:1:0:
test.c: In function ‘main’:
test.c:4:33: error: static assertion failed: "JRuby_does_not_support_Ruby_C_Extensions"
 #define compile_time_ERROR(MSG) static_assert(0, MSG)
                                 ^
test.c:9:25: note: in expansion of macro ‘compile_time_ERROR’
 #define rb_str_new(...) compile_time_ERROR("JRuby_does_not_support_Ruby_C_Extensions")
                         ^
test.c:12:3: note: in expansion of macro ‘rb_str_new’
   rb_str_new("asd");
   ^
Joan ~ $ g++ --std=c++11 test.c
test.c: In function ‘int main()’:
test.c:4:33: error: static assertion failed: JRuby_does_not_support_Ruby_C_Extensions
 #define compile_time_ERROR(MSG) static_assert(0, MSG)
                                 ^
test.c:9:25: note: in expansion of macro ‘compile_time_ERROR’
 #define rb_str_new(...) compile_time_ERROR("JRuby_does_not_support_Ruby_C_Extensions")
                         ^
test.c:12:3: note: in expansion of macro ‘rb_str_new’
   rb_str_new("asd");

@headius
Copy link
Member

headius commented Oct 6, 2016

@joankaradimov That's not bad.

@joankaradimov
Copy link
Contributor Author

joankaradimov commented Sep 24, 2017

I have a horrible piece of bash. It works with the Ruby headers instead of the dynamic library file.

It downloads everything in https://github.com/jruby/ruby/tree/trunk/include. It uses github's aprils-fools-svn-feature so that only the contents of /include are downloaded (git cannot clone a subdirectory of a repo).

It then proceeds to grep for anything that looks like it might be a part of the C extension API (anything that starts with RB_ of rb_). There is also [an attempted] detection if the macro/symbol has parameters. Finally everything is dumped in ruby.h (which is in turn included by all other headers).

Here's the script. Running it will generate a bunch of files in the current working directory...

svn export --force https://github.com/jruby/ruby/trunk/include .

header_symbols() {
    grep \
        --recursive \
        --extended-regexp \
        --only-matching \
        --ignore-case \
        --no-filename \
        $1 **/*.h | sort --unique
}

all_symbols=$(
    header_symbols 'rb_\w+'
)

symbols_with_parameters=$(
    header_symbols 'rb_\w+\s*\(' | grep -Eoi 'rb_\w+'
)

symbols_without_parameters=$(
    comm -23 <(echo "$all_symbols") <(echo "$symbols_with_parameters")
)

symbol_with_parameter_errors=$(
    echo "$symbols_with_parameters" |
        awk '{print "#define " $0 "(...) compile_time_ERROR(\"JRuby does not support Ruby C Extensions\");"}'
)

symbol_without_parameter_errors=$(
    echo "$symbols_without_parameters" |
        awk '{print "#define " $0 " compile_time_ERROR(\"JRuby does not support Ruby C Extensions\");"}'
)

echo "#pragma once

#include <assert.h>

#if defined(static_assert) || __cplusplus >= 201103L
#define compile_time_ERROR(MSG) static_assert(0, MSG)
#else
 static int compile_time_ERROR(void) { return (99); }
#endif

$symbol_with_parameter_errors
$symbol_without_parameter_errors" > ruby.h

find ruby -type f -name '*.h' -print0 | while IFS= read -r -d $'\0' file
do
    # Replace every directory name in $file with ".."
    # and then replace the file name with "ruby.h"
    # e.g. "some/path/some_file.h" becomes "../../ruby.h"
    main_header_path=$(
        echo $file | sed -e 's/[^\/]\+\//..\//g' -e 's/\/[^\/]\+$/\/ruby.h/g'
    )

    echo "#include \"$main_header_path\"" > $file
done

@headius: If something like this makes sense I could rewrite it in Ruby. Or maybe leave it like that and hope that it is not completely unreadable.

@joankaradimov
Copy link
Contributor Author

joankaradimov commented Jan 4, 2018

I've rebased on top of 9.1.15.0. I've included headers with a bunch of macros/functions/typedefs and a script that regenerates them.

P.S.
I've no idea why the tests fail on the CI server. It doesn't look related to my changes.

@glebm
Copy link
Contributor

glebm commented May 29, 2019

This actually fixes #5749 which I discovered today (I've also sent a minimal standalone fix, #5750). Is anything blocking this PR from merge?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants