Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tempfile and datapath helpers to specs #5951

Merged

Conversation

straight-shoota
Copy link
Member

@straight-shoota straight-shoota commented Apr 15, 2018

This PR introduces with_tempfile and datapath helpers to the stdlib spec suite.

  • with_tempfile receives a path and expands it relative to a temporary directory unique to every spec run. All temporary paths are prefixed by the name of the spec file that requests them. The constructed path is yielded and cleaned up after the block has returned. It can be used as file or directory path. The unique directory is removed at the end of the spec run (unless env var SPEC_TEMPFILE_CLEANUP is set to 0 - which can be useful for debugging).
  • datapath is used to create platform-independent paths to spec/std/data (or spec/compiler/data in the compiler spec).

Closes #5811

This is work in progress. Feedback is still welcome.

@bew
Copy link
Contributor

bew commented Apr 15, 2018

Looks good! Could you add a little doc for datapath, with_tempfile and SPEC_TEMPFILE_CLEANUP to explain when and how to use them?

@@ -97,138 +97,102 @@ describe "FileUtils" do

describe "touch" do
it "creates file if it doesn't exists" do
filename = File.join(__DIR__, "data/test_touch.txt")
begin
File.exists?(filename).should be_false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you removed this check, is it intentional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with_tempfiles should guarantee that the path does not exist, don't need to check that again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep it just to make sure.

]
begin
it "creates multiple files if they don't exists" do
with_tempfile("file_utils-touch") do |path_prefix|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be a problem here: you'll have 3 files created that don't get cleaned after with_tempfile, maybe use sth like:

with_tempfile("touch_1.txt", "touch_2.txt", "touch_3.txt") do |*paths|
  # ...
end

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, that needs to be refactored.

end

it "tests rm with non existing path" do
expect_raises Errno do
FileUtils.rm("/tmp/crystal_rm_test_#{Process.pid}")
with_tempfile("rm-nonexstinent") do |path|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: non-existent

handler.call context
response.close
io.rewind
HTTP::Client::Response.from_io(io, ignore_body)
end

describe HTTP::StaticFileHandler do
file_text = File.read "#{__DIR__}/static/test.txt"
file_text = File.read datapath("static_file_handler", "test.txt")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe memorize datapath("static_file_handler", "test.txt") in a static_file_path variable for all the specs below that uses it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that's worth the indirection.

@straight-shoota
Copy link
Member Author

straight-shoota commented Apr 16, 2018

Damn, the temp paths are too long for socket_spec because unix socket paths are limited to 103 bytes on OSX. Linux works though, it seems to have 108 bytes.

@straight-shoota
Copy link
Member Author

@bew I added a comment to with_tempfile. It might even make sense to move this to the standard library as this feature would certainly be useful for other spec suites as well.

datapath is self-explanatory, I don't think it needs a comment.

@straight-shoota
Copy link
Member Author

straight-shoota commented Apr 16, 2018

On OSX it seems that $TEMPDIR it a randomized folder in /var/folders/ (like /var/folders/ms/xg67k5sn16xc7sdr_w3q45840000gn/T/ in one case). That's particularly bad for unix socket paths because together with the folder structure added by with_tempfile that's already 78 of maximum 103 bytes: /var/folders/ms/xg67k5sn16xc7sdr_w3q45840000gn/T/crystal-spec-12345678/socket/.

Any ideas on this? If we just set $TEMPDIR to /tmp in the ci script it should be fine, CI doesn't need user-specific temp dirs. Otherwise we could let the socket specs use custom temp paths. Or simply use shorter file names in the spec to stay below the limit?

@straight-shoota
Copy link
Member Author

straight-shoota commented Apr 16, 2018

Apart from the OSX issue this is mostly complete. The remaining instances of __DIR__ and __FILE__ in spec are used as compile time resources such as file location or parser tokens.

crystal_path_spec still needs to be reworked, I have to look into the specs and implementation to figure out how exactly it is supposed to work. I wouldn't mind if someone had some insight on that (or even take over that part).

ecr_spec still uses hard coded paths to #{__DIR__}/../data/. This isn't bad in itself, because they are used at compile-time and can't use datapath for this. But I guess the paths could at least be hard coded relative to project root instead of __DIR__.

@straight-shoota
Copy link
Member Author

straight-shoota commented Apr 18, 2018

It would be beneficial if this PR could get reviewed and hopefully merged soon. It touches a lot of code in the spec suite, but it's mostly insignificant changes. I'd like to avoid lots of upcoming rebasing in this and future PRs.

Any idea on the OSX issue @RX14?

@RX14
Copy link
Contributor

RX14 commented Apr 18, 2018

If TMPDIR is some weird path on osx, just ignore it and use /tmp.

@straight-shoota
Copy link
Member Author

straight-shoota commented Apr 18, 2018

@RX14 Do you mean generally, in with_tempfile (and if so, should it always switch to /tmp if it is /var/folders?) or just use custom temp file paths in Unix socket specs?

IMHO the cleanest solution would be to set TMPDIR to /tmp in the ci script because it keeps platform specifics out of the spec code.

@straight-shoota straight-shoota changed the title (WIP) Add tempfile and datapath helpers to specs Add tempfile and datapath helpers to specs Apr 18, 2018
@RX14
Copy link
Contributor

RX14 commented Apr 18, 2018

IMHO the cleanest solution would be to set TMPDIR to /tmp in the ci script because it keeps platform specifics out of the spec code.

Fixing things in the CI script is always wrong because you're admitting you're going to break the specs on everyone's osx laptops.

I meant change Tempfile.tempdir itself :)

@straight-shoota
Copy link
Member Author

I meant change Tempfile.tempdir itself :)

But why? If a user consciously sets TMPDIR to a specific path (or doesn't change the OS default but the difference is hard to tell), we should not just overwrite that. I don't think that makes any sense.
Consider the case that two programs on OSX are configured to connect to $TMPDR/mysocket.sock and TMPDIR is /var/folders/ms/xg67k5sn16xc7sdr_w3q45840000gn/T. If one is a Crystal program and ignores TMPDIR, they won't connect even if the path is actually valid. And that's hard to debug. If the path would be too long, it's even more confusing: The Crystal program would create a socket without issues but any other program complains about the path.

We could think about overwriting the temp dir used by with_tempfile. That can either be done in the spec code (which I don't like) or by configuring TMPDIR appropriately.

You could see it that way: The specs for Unix sockets puts them in a sub path of TMPDIR. In order for that to proceed, a short TMPDIR path is required. That's just an factual, external requirement for running the stdlib's spec suite. But certainly not for every Crystal program.

Let's consider the downsides: If we change the behaviour of Dir.tempdir to partially ignore TMPDIR, this will be highly unexpected (see example above) and for the great majority of use cases it is completely unnecessary. When TMPDIR path is too long, it can cause problems with Unix sockets. That's a fact, and we can't change it.

If we set TMPDIR in the ci script, the worst thing that can happen is that Unix socket specs fail on some people's OSX laptops. That's not a real problem and they can easy fix it by setting TMPDIR to a shorter path. We could even consider setting TMPDIR in the Makefile or bin wrapper to get it right for everyone's OSX laptop as well. Again: It's a specific requirement of the stdlib spec suite, not for every Crystal program.

However, this absolutely needs to be documented. Unix sockets will also fail in applications using a TMPDIR that is too long - and that's on any system, not just OSX.

Changing the spec suite to adapt to a special case is not a good solution for a generic problem. We should rather improve the error message for Unix sockets failing because of path length. Instead of Path size exceeds the maximum size of 103 bytes, it should at least show the exact path. And as it is pretty common to put Unix sockets in a temp dir, it could even check if the path is in TMPDIR and suggest adjusting the environment variable.

@RX14
Copy link
Contributor

RX14 commented Apr 18, 2018

Tempfile.dirname isn't documented or specified to be TMPDIR, just a tempoary directory. If your program relies on it being TMPDIR, you can use ENV["TMPDIR"].

Specs failing is very much a "real problem"...

@straight-shoota
Copy link
Member Author

straight-shoota commented Apr 18, 2018

It's not documented but I'd say it's just missing documentation. The specs for Tempfile.dirname explicitly test for ENV["TMPDIR"] so it's part of the public API definition and looks to be intended (and expected) behaviour: if I create a Tempfile I expect the path to be in TMPDIR.

And again, I see no sane reason to generally ignore the environment variable if we think it's to long for one specific use case.

@RX14
Copy link
Contributor

RX14 commented Apr 19, 2018

Let's just fix it by hard coding to /tmp on osx in with_tempfile

@straight-shoota
Copy link
Member Author

I really don't like that with_tempfile should override TMPDIR when it is specified. I would much rather prefer to set TMPDIR appropriately for running specs.

To get this moving forward, what do others think?

/cc @crystal-lang

@RX14
Copy link
Contributor

RX14 commented Apr 26, 2018

well, the fact is that $TMPDIR on macos can be set to an insane value. How do you possibly intend on returning to sanity while respecting $TMPDIR?

@straight-shoota
Copy link
Member Author

straight-shoota commented Apr 26, 2018

  1. Expect users to have a sane $TMPDIR when using Unix sockets (including but not limited to stdlib specs). This doesn't affect only MacOS: When a Linux user has a long $TMPDIR, Unix sockets will fail as well. There is nothing we can do about that unless we want to dictate everyone where their $TMPDIR should be.
  2. Add export TMPDIR=/tmp to bin/ci so that the spec suit is run with a sane environment setting.

@RX14
Copy link
Contributor

RX14 commented Apr 26, 2018

I use bin/crystal run spec/std/xyz_spec.cr all the time, I refuse to merge any implementation that breaks doing that on osx. Which means either: reset $TMPDIR in bin/crystal (:-1:), or ignore $TMPDIR entirely because macos is fucked.

Note i'm assuming that people using osx have a really long $TMPDIR outside of circleci.

@straight-shoota
Copy link
Member Author

straight-shoota commented Apr 26, 2018

It would only break if the Unix socket specs are run and $TMPDIR is too long. That's unfortunately the default on OSX but it can easily be changed.

There are countless ways to do that. A few suggestions: change global config, apply context-sensitive in crystal project folders, explicitly export in the current session, manually prepend TMPDIR=/tmp or just use an alias such as bin-crystal with TMPDIR=/tmp bin/crystal $@. I think there should be a feasible solution for everyone.

And I'd improve the error message for Unix sockets failing because of path length, including an advice to change TMPDIR if the socket path is in that folder (see last paragraph of #5951 (comment)).

@straight-shoota
Copy link
Member Author

@RX14 I'll try it with a shorter tempfile subpath. Let's see if it works.

@RX14
Copy link
Contributor

RX14 commented Jun 14, 2018

$ echo -n '/var/folders/ms/xg67k5sn16xc7sdr_w3q45840000gn/T/cr-spec-12345678/socket/unix_server-accept-closed.sock' | wc -c
103
if @path.bytesize + 1 > MAX_PATH_SIZE

we're one off.

Copy link
Member

@bcardiff bcardiff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found some issues and some doubts, but is almost right.

@@ -153,7 +154,7 @@ class Crystal::SpecRunOutput
end
end

def run(code, filename = nil, inject_primitives = true, debug = Crystal::Debug::None)
def run(code, filename = nil, inject_primitives = true, debug = Crystal::Debug::None, *, file = __FILE__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The specs doesn't care about the location of this files.
They are not even generated sometimes (JIT is used unless prelude is required).
I would avoid adding this named arg and just using a fixed tempfile prefix/sufix in this helper.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand why you'd want to remove this. Maybe it's not strictly necessary, but using with_tempfile here certainly doesn't hurt and it's easier than managing a tempfile location manually.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said, if specs are failing I would probably run just a few or one. Or even add a debug message for the spec/tempfile association while running specs in some verbose mode.

Having filename and file arguments in the same method seems misleading.

So, between not been able see an addition here, plus the misleading name, plus having more data to cary on runtime on the high populated spec suite of the compiler, I would vote for removing it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll revert it.

# write code to the temp file
File.write(code_file.path, code)
def build_and_run(code, *, file = __FILE__)
with_tempfile("build_and_run_code", "build_and_run_bin", file: file) do |code_path, binary_path|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

o_filename = "#{__DIR__}/temp_abi.o"
begin
File.write(c_filename, c_code)
def test_c(c_code, crystal_code, *, file = __FILE__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

File.join(".", "spec", "std", "data", "dir", "dots", ""),
File.join(".", "spec", "std", "data", "dir", "subdir", ""),
File.join(".", "spec", "std", "data", "dir", "subdir2", ""),
].sort
end

it "tests with relative path (starts with ..)" do
base_path = File.join("..", File.basename(File.dirname(File.dirname(__DIR__))), "spec", "std", "data", "dir")
Dir["../#{File.basename(File.dirname(File.dirname(__DIR__)))}/spec/std/data/dir/*/"].sort.should eq [
base_path = File.join("..", File.basename(Dir.current), "spec", "std", "data", "dir")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think that relaying on __DIR__ is better that Dir.current since the later involves more state.
Why this change?
I also notice that the datapath helper relies on the current directory and not an absolute path by using __DIR__

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__DIR__ is compile time, Dir.current runtime. In order for the specs to correctly execute when cross-compiled, we need to use Dir.current because __DIR__ might not even exist on the target system.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, understood.

end

it "gives false" do
File.file?("#{__DIR__}/data").should be_false
File.file?(datapath("nonexistent_dir")).should be_false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should point to an existing dir

@@ -594,7 +578,7 @@ describe "File" do
it "converts a pathname to an absolute pathname, using ~ (home) as base (trailing /)" do
prev_home = home
begin
ENV["HOME"] = __DIR__ + "/"
ENV["HOME"] = File.expand_path(datapath("", ""))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the double "", ""?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure... probably left over.

@straight-shoota
Copy link
Member Author

Merge? (before this PR needs yet another rebase...)

@sdogruyol sdogruyol merged commit 750258d into crystal-lang:master Jun 19, 2018
@straight-shoota straight-shoota deleted the jm/feature/specs-tempdir branch June 19, 2018 19:24
@RX14 RX14 added this to the 0.25.1 milestone Jun 19, 2018
@bcardiff
Copy link
Member

I think that after this PR the make spec is broken. The ci uses make compiler_spec and make std_spec independently so it went unnoticed.

$ make crystal spec
Using /usr/local/bin/llvm-config [version=5.0.1]
CRYSTAL_CONFIG_PATH="/path/to/crystal" ./bin/crystal build  -o .build/crystal src/compiler/crystal.cr -D without_openssl -D without_zlib
.build/all_spec 
Error: File spec/std/data/compiler_sample does not exist

@straight-shoota
Copy link
Member Author

True. The datapath definition from std spec helper overwrites the one from compiler spec.
We could easily rename it to datapath_compiler. Or just split make spec into compiler_spec and std_spec. I don't think it makes much sense to combine compiler and std specs into a single spec run/binary. They're actually independent spec suits.

@RX14
Copy link
Contributor

RX14 commented Jun 29, 2018

I disagree. It's faster to compile combined.

@bcardiff
Copy link
Member

bcardiff commented Jul 2, 2018

I think we should have 2 datapath helpers also.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use a global tmpdir for specs