Commit

compiler: tune the LLVM optimizer pipeline (fixes #315).
whitequark committed Mar 26, 2016
1 parent 3ee9834 commit f5c720c
Showing 1 changed file with 19 additions and 12 deletions.
31 changes: 19 additions & 12 deletions artiq/compiler/targets.py
@@ -79,6 +79,23 @@ class Target:
     def __init__(self):
         self.llcontext = ll.Context()

+    def target_machine(self):
+        lltarget = llvm.Target.from_triple(self.triple)
+        return lltarget.create_target_machine(
+            features=",".join(["+{}".format(f) for f in self.features]),
+            reloc="pic", codemodel="default")
+
+    def optimize(self, llmodule):
+        llpassmgr = llvm.create_module_pass_manager()
+        self.target_machine().target_data.add_pass(llpassmgr)
+        llpassmgr.add_constant_merge_pass()
+        llpassmgr.add_cfg_simplification_pass()
+        llpassmgr.add_instruction_combining_pass()
+        llpassmgr.add_sroa_pass()
+        llpassmgr.add_dead_code_elimination_pass()
+        llpassmgr.add_gvn_pass()
+        llpassmgr.run(llmodule)
+
     def compile(self, module):
         """Compile the module to a relocatable object for this target."""

@@ -102,25 +119,15 @@ def compile(self, module):
         _dump(os.getenv("ARTIQ_DUMP_UNOPT_LLVM"), "LLVM IR (generated)", "_unopt.ll",
               lambda: str(llparsedmod))

-        llpassmgrbuilder = llvm.create_pass_manager_builder()
-        llpassmgrbuilder.opt_level = 2 # -O2
-        llpassmgrbuilder.size_level = 1 # -Os
-        llpassmgrbuilder.inlining_threshold = 75 # -Os threshold
-
-        llpassmgr = llvm.create_module_pass_manager()
-        llpassmgrbuilder.populate(llpassmgr)
-        llpassmgr.run(llparsedmod)
+        self.optimize(llparsedmod)

         _dump(os.getenv("ARTIQ_DUMP_LLVM"), "LLVM IR (optimized)", ".ll",
               lambda: str(llparsedmod))

         return llparsedmod

     def assemble(self, llmodule):
-        lltarget = llvm.Target.from_triple(self.triple)
-        llmachine = lltarget.create_target_machine(
-            features=",".join(["+{}".format(f) for f in self.features]),
-            reloc="pic", codemodel="default")
+        llmachine = self.target_machine()

         _dump(os.getenv("ARTIQ_DUMP_ASM"), "Assembly", ".s",
               lambda: llmachine.emit_assembly(llmodule))

6 comments on commit f5c720c

@dnadlinger
Collaborator

A comment from the sidelines: If I'm not mistaken (I'm not familiar with the Python API), this chooses a very small, fixed set of passes? In particular, there is no LLVM-level inlining? It is true that the defaults given by PassManagerBuilder are tuned for IR similar to that produced by the Clang frontend (and thus manually choosing a set of passes can yield quite some benefits in terms of both compile time and run time), but do expect to have to adapt the pipeline when switching between LLVM releases.

@whitequark
Contributor Author

@klickverbot Sure. Since compile times are a problem for us, I only enabled the passes that provide direct benefit. Thus, inlining is not enabled yet because method calls are emitted via a form of indirection that prevents LLVM from statically determining the callee. Once that's taken care of, I'll add inlining with an appropriate threshold, as well as LICM, IPSCCP and probably a few others, whose impact I currently cannot measure.
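
Editorial illustration, not code from the commit: one way to keep a hand-picked pipeline easy to extend (for example with inlining or LICM later) or to adjust per LLVM release is to hold the pass names as data. The sketch below assumes only that the pass manager exposes `add_<name>_pass()` methods like the ones called in the diff. `PIPELINE`, `populate` and `RecordingPassManager` are hypothetical names, and the recording class is a stand-in so the example runs without llvmlite installed.

```python
# Hypothetical sketch: the pass list from the diff, held as data.
PIPELINE = (
    "constant_merge",
    "cfg_simplification",
    "instruction_combining",
    "sroa",
    "dead_code_elimination",
    "gvn",
)


class RecordingPassManager:
    """Stand-in for a module pass manager; records the passes added to it."""

    def __init__(self, known):
        self.known = set(known)   # pass names this fake "LLVM build" supports
        self.added = []           # passes added so far, in order

    def __getattr__(self, name):
        # Answer add_<pass>_pass lookups for known passes only.
        if name.startswith("add_") and name.endswith("_pass"):
            pass_name = name[len("add_"):-len("_pass")]
            if pass_name in self.known:
                return lambda: self.added.append(pass_name)
        raise AttributeError(name)


def populate(passmgr, pipeline=PIPELINE):
    """Add each named pass in order; fail loudly if a pass is unavailable."""
    for name in pipeline:
        adder = getattr(passmgr, "add_{}_pass".format(name), None)
        if adder is None:
            raise ValueError("pass manager has no pass named {!r}".format(name))
        adder()
    return passmgr


passmgr = populate(RecordingPassManager(PIPELINE))
```

Failing on an unknown name makes a pass that was renamed or removed in another LLVM release show up as an immediate error instead of a silently shorter pipeline.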

@whitequark
Contributor Author

Anyway, we're stuck on LLVM 3.5, since I don't think anyone is porting the OR1K backend to more recent LLVM, so whatever problems may arise during migration to a more recent LLVM are immaterial for the foreseeable future.

@dnadlinger
Collaborator

Okay, thanks – I just happened to see the commit and wanted to make sure you are aware of the possible issues since I've been bitten by release-to-release changes in the past. I'm not sure how good the ARTIQ codegen is (just starting to play around with it), but in tight code (think C/C++/D/Rust/…), changes in the assumptions made by various passes on the "level of canonicalization" of their input IR due to changes in the "expected" pass pipeline are definitely noticeable.

@whitequark
Contributor Author

Yes, definitely. But the default pipeline produces very bad results, because ARTIQ's codegen is designed on the assumption that SROA runs as the very first step, to break up the aggregates containing local variables (which are necessary to faithfully support closures).
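
Editorial illustration, not ARTIQ or LLVM code: the toy model below shows the general idea behind scalar replacement of aggregates. An aggregate whose address never escapes is split into one independent slot per field, which later passes can then treat as ordinary local variables. The tuple-based instruction format and the name `scalarize` are invented for this sketch; real SROA works on LLVM IR and handles far more cases.

```python
def scalarize(instrs):
    """Split every non-escaping aggregate into one scalar slot per field.

    Toy instruction forms (all hypothetical):
      ("alloca", agg, fields)        allocate an aggregate with named fields
      ("store", agg, field, value)   write one field
      ("load", dst, agg, field)      read one field into dst
      ("call", fn, *args)            call fn; an aggregate passed here escapes
    """
    aggregates = {ins[1] for ins in instrs if ins[0] == "alloca"}
    escaped = {arg for ins in instrs if ins[0] == "call"
               for arg in ins[2:] if arg in aggregates}
    splittable = aggregates - escaped

    out = []
    for ins in instrs:
        op = ins[0]
        if op == "alloca" and ins[1] in splittable:
            # One scalar slot per field replaces the whole aggregate.
            out.extend(("alloca_scalar", "{}.{}".format(ins[1], field))
                       for field in ins[2])
        elif op == "store" and ins[1] in splittable:
            out.append(("store_scalar", "{}.{}".format(ins[1], ins[2]), ins[3]))
        elif op == "load" and ins[2] in splittable:
            out.append(("load_scalar", ins[1], "{}.{}".format(ins[2], ins[3])))
        else:
            # Escaping aggregates and unrelated instructions stay untouched.
            out.append(ins)
    return out


# An environment holding locals x and y that is never passed to a call.
program = [
    ("alloca", "env", ("x", "y")),
    ("store", "env", "x", 1),
    ("store", "env", "y", 2),
    ("load", "t0", "env", "x"),
]
result = scalarize(program)
```

In this model, appending `("call", "f", "env")` to `program` makes `env` escape, so `scalarize` leaves every instruction as it was.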

@whitequark
Contributor Author

@klickverbot I adjusted the pipeline to a more realistic state in 20ad762.
