auto merge of #4931 : thestinger/rust/glue, r=pcwalton

Using `noinline` causes a 3-10% hit in performance for most compiled Rust code. For the TreeMap it's ~15% and that's where I first noticed it.

Removing the `noinline` attribute doesn't slow down unoptimized builds, but it does significantly increase the time spent in LLVM passes for optimized builds. Because `rustc` itself gets faster, compile times actually improve when optimization is off.

However, the extra time in LLVM passes is simply the cost of more optimization being done. I'm sure it would speed up compiles to mark *everything* with `noinline`, but it wouldn't be a good idea.

LLVM is clever enough with the inlining heuristics that this doesn't cause a notable increase in code size - some code becomes a bit bigger, some becomes a bit smaller. There are some cases where it's able to strip out a ton of code thanks to inlining.

I tried `optsize` for glue code instead, but it caused the same compile-time hit in LLVM passes, and the compiled code was a bit slower than just trusting LLVM to make the decisions.
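For readers more used to source-level attributes than LLVM function attributes, here is a rough analogy in present-day Rust (not the 2013 dialect this change is written in). `#[inline(never)]` is the source-level hint closest to `noinline`, while an unannotated function leaves the choice to the optimizer's heuristics. The names `sum_pinned` and `sum_free` are hypothetical and are not glue code from this change.

```rust
// Illustrative sketch only: hypothetical helpers, modern Rust syntax.

/// Asks the compiler not to inline this function, so each call site
/// keeps a real call and the body cannot be specialized to the caller.
#[inline(never)]
pub fn sum_pinned(xs: &[u64]) -> u64 {
    xs.iter().sum()
}

/// No attribute: the optimizer's own heuristics decide whether
/// inlining is worthwhile at each call site.
pub fn sum_free(xs: &[u64]) -> u64 {
    xs.iter().sum()
}
```

Both functions compute the same result. Only the inlining hint differs, and that hint is the knob this change stops forcing for glue code.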

* [TIME_PASSES=1 benchmarks](http://ompldr.org/vaGdxaA) (showing the performance increase in `rustc` and also the extra time spent in LLVM passes for more optimization)
bors 2013-02-14 12:06:27 -08:00
commit 8ec6f43d6c

@@ -382,6 +382,15 @@ pub fn get_tydesc(ccx: @crate_ctxt, t: ty::t) -> @mut tydesc_info {
     }
 }
+pub fn set_optimize_for_size(f: ValueRef) {
+    unsafe {
+        llvm::LLVMAddFunctionAttr(f,
+                                  lib::llvm::OptimizeForSizeAttribute
+                                    as c_ulonglong,
+                                  0u as c_ulonglong);
+    }
+}
 pub fn set_no_inline(f: ValueRef) {
     unsafe {
         llvm::LLVMAddFunctionAttr(f,
@@ -440,7 +449,7 @@ pub fn set_custom_stack_growth_fn(f: ValueRef) {
 pub fn set_glue_inlining(f: ValueRef, t: ty::t) {
     if ty::type_is_structural(t) {
-        set_no_inline(f);
+        set_optimize_for_size(f);
     } else { set_always_inline(f); }
 }
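
The policy change in `set_glue_inlining` can be modelled in isolation. The sketch below is in modern Rust and assumes, based on the second hunk header (7 lines before, 7 after), that the `set_no_inline(f)` call is replaced by `set_optimize_for_size(f)`. The `GlueAttr` enum and the two functions are hypothetical stand-ins and do not call LLVM.

```rust
/// Hypothetical stand-ins for the LLVM function attributes named in the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GlueAttr {
    NoInline,
    OptimizeForSize,
    AlwaysInline,
}

/// Glue policy before the change: structural types get `noinline`.
pub fn glue_attr_before(is_structural: bool) -> GlueAttr {
    if is_structural {
        GlueAttr::NoInline
    } else {
        GlueAttr::AlwaysInline
    }
}

/// Glue policy after the change (assumed reading of the diff):
/// structural types are marked optimize-for-size and the inlining
/// decision is left to LLVM.
pub fn glue_attr_after(is_structural: bool) -> GlueAttr {
    if is_structural {
        GlueAttr::OptimizeForSize
    } else {
        GlueAttr::AlwaysInline
    }
}
```

Only the structural branch changes. Glue for non-structural types is still marked always-inline in both versions.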