Optimize bc+obj codegen, focus: devel rebuilds. #2591

ozra · 2016-05-13T14:20:42Z

Renders bc-file in RAM
Compares cached bc-file with ram-buffer
Writes out if changed
Generates obj if changed
File writes always go via a temp-file -> rename now
bc-flags changes incorporated in cached files names
Removes forking for each render
Only single threaded concurrency now (could be improved further using core-count threads with one part of the list each)
On my machine I consistently get about 2s bc-obj-phase in re-compile (when working on the parser in the compiler for instance) compared to about 40s before - (for the bc-obj-phase alone).
- Full compile in re-compile is about 20s vs ca 60s before (on my slow HDD machine).
Again: the focus is improving compile time for re-compile in devel mode, when there's obviously changes in just a few files each iteration.
Please try and bench it with similar test-case on your machine, comparing with main-compiler.
Bonus: I can actually do other things while compiling now - before my machine completely froze during compile.

asterite · 2016-05-13T14:33:52Z

I think it failed because Travis still uses llvm 3.5... but the compiler doesn't use that version anymore and should work with 3.6. This needs to be fixed somewhere...

asterite · 2016-05-13T14:34:38Z

It uses llvm 3.5 because we tell it so: https://github.com/crystal-lang/crystal/blob/master/bin/ci#L68

luislavena · 2016-05-13T14:36:37Z

@asterite the other failure is caused by format changes in compiler.cr.

ozra · 2016-05-13T14:37:15Z

Ah, sorry, didn't run format!

ozra · 2016-05-13T14:48:31Z

Now format should be there X-)

luislavena · 2016-05-13T16:58:49Z

32bits error might be a fluke of Travis worker memory while OSX error seems to be related to LLVM version Ary mentioned.

@asterite any particular reason we don't build against 3.6 now? What is the status of OSX and LLVM in this matter?

asterite · 2016-05-13T17:02:28Z

@luislavena Everything should work well with LLVM 3.6. We could even simplify omnibus-crystal to use it, I think right now it uses LLVM 3.5. The reason why it's using an old version is because it works and it's always easy to keep thing unchanged when they work ¯_(ツ)_/¯

ozra · 2016-05-15T17:18:22Z

Ok, so I'll let this "hang" meanwhile, since it won't work with <3.something? I'll see what I can do about it, but I'm kinda keen on getting the codegen working against 3.8+, I might look into making that.

asterite · 2016-05-15T18:05:00Z

@ozra I was planning on testing this PR next week, and updating travis and others if it works well

ozra · 2016-05-15T18:06:32Z

Alright, cool.

asterite · 2016-05-16T22:46:07Z

@ozra Did you try this on linux or mac?

ozra · 2016-05-17T16:15:12Z

Linux is all I run, are there weird problems? I'll be happy to do what I can to make it work, because it's quite nice with faster compiles :)

asterite · 2016-05-17T16:16:30Z

I believe the performance you get is similar to the ones here #745 . Basically, forking a lot in linux is slow, but in mac it seems to be fast enough.

asterite · 2016-05-17T16:16:55Z

I say because I tried your changes on Mac and it's always is slower than before.

ozra · 2016-05-17T16:21:05Z

Aha, interesting. What are the speeds comparatively in Mach?

asterite · 2016-05-19T14:58:02Z

@ozra I can't remember, but it was like two or three times slower than what we have now.

Probably the best thing to do is to use fork in mac and use spawn in linux, using your PR.

ozra · 2016-05-20T13:24:08Z

@asterite, I'll re-work this to use a ifdef, and also look in to the "just create n-threads workers handling a list each" and update. I googled around, but found no benches other than pagefaults being slower in Darwin then in Linux kernel (but they were dated so it might have changed a lot), so I haven't figured out what could be the main culprit (and on "what side"). Two different approaches will no doubt be simplest, but utilizing as many cores but with only using initial forking before looping should benefit both.

ozra · 2016-09-02T13:57:31Z

@asterite - What's the status on LLVM-versions and such now - I was thinking about re-implementing this with the platform-conditional - the speed up is badly needed for us Linux folks.

asterite · 2016-09-02T14:41:13Z

@ozra We are still using LLVM 3.5~3.6 for releasing, because LLVM 3.8 needs an entirely different way to be built.

ozra · 2016-09-02T14:44:49Z

Alright.

asterite · 2016-12-16T21:03:37Z

@ozra I finally combined ideas from #745 and your PR and put them all in c792139 . I used memory buffers from LLVM and compare that to the old .bc before writing them, which is slightly faster than writing to disk. Now also only a limited number of forks (8) is done, so this should be considerably faster on linux. And on mac it's already faster. So thank you for this PR! :-)

ozra · 2017-03-02T09:42:18Z

Awesome! :-)

ozra mentioned this pull request May 13, 2016

Spawning and Forking Troubles #2571

Closed

ozra force-pushed the codegen-bc-obj-phase-optimization branch from 7088880 to 3361ef0 Compare May 13, 2016 14:38

Optimize bc+obj codegen, focus: devel rebuilds.

b7a111b

ozra force-pushed the codegen-bc-obj-phase-optimization branch from 3361ef0 to b7a111b Compare May 13, 2016 14:47

jhass added the topic:compiler label Jun 2, 2016

asterite closed this Dec 16, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize bc+obj codegen, focus: devel rebuilds. #2591

Optimize bc+obj codegen, focus: devel rebuilds. #2591

ozra commented May 13, 2016 •

edited

Loading

asterite commented May 13, 2016

asterite commented May 13, 2016

luislavena commented May 13, 2016

ozra commented May 13, 2016

ozra commented May 13, 2016

luislavena commented May 13, 2016

asterite commented May 13, 2016

ozra commented May 15, 2016

asterite commented May 15, 2016

ozra commented May 15, 2016

asterite commented May 16, 2016

ozra commented May 17, 2016

asterite commented May 17, 2016

asterite commented May 17, 2016 •

edited

Loading

ozra commented May 17, 2016

asterite commented May 19, 2016

ozra commented May 20, 2016 •

edited

Loading

ozra commented Sep 2, 2016

asterite commented Sep 2, 2016

ozra commented Sep 2, 2016

asterite commented Dec 16, 2016

ozra commented Mar 2, 2017

Optimize bc+obj codegen, focus: devel rebuilds. #2591

Optimize bc+obj codegen, focus: devel rebuilds. #2591

Conversation

ozra commented May 13, 2016 • edited Loading

asterite commented May 13, 2016

asterite commented May 13, 2016

luislavena commented May 13, 2016

ozra commented May 13, 2016

ozra commented May 13, 2016

luislavena commented May 13, 2016

asterite commented May 13, 2016

ozra commented May 15, 2016

asterite commented May 15, 2016

ozra commented May 15, 2016

asterite commented May 16, 2016

ozra commented May 17, 2016

asterite commented May 17, 2016

asterite commented May 17, 2016 • edited Loading

ozra commented May 17, 2016

asterite commented May 19, 2016

ozra commented May 20, 2016 • edited Loading

ozra commented Sep 2, 2016

asterite commented Sep 2, 2016

ozra commented Sep 2, 2016

asterite commented Dec 16, 2016

ozra commented Mar 2, 2017

ozra commented May 13, 2016 •

edited

Loading

asterite commented May 17, 2016 •

edited

Loading

ozra commented May 20, 2016 •

edited

Loading