Lifetime intrinsics help to reduce stack usage, because LLVM can apply
stack coloring to reuse the stack slots of dead allocas for new ones.
For example these functions now both use the same amount of stack, while
previous `bar()` used five times as much as `foo()`:
````rust
fn foo() {
println("{}", 5);
}
fn bar() {
println("{}", 5);
println("{}", 5);
println("{}", 5);
println("{}", 5);
println("{}", 5);
}
````
On top of that, LLVM can also optimize out certain operations when it
knows that memory is dead after a certain point. For example, it can
sometimes remove the zeroing used to cancel the drop glue. This is
possible when the glue drop itself was already removed because the
zeroing dominated the drop glue call. For example in:
````rust
pub fn bar(x: (Box<int>, int)) -> (Box<int>, int) {
x
}
````
With optimizations, this currently results in:
````llvm
define void @_ZN3bar20h330fa42547df8179niaE({ i64*, i64 }* noalias nocapture nonnull sret, { i64*, i64 }* noalias nocapture nonnull) unnamed_addr #0 {
"_ZN29_$LP$Box$LT$int$GT$$C$int$RP$39glue_drop.$x22glue_drop$x22$LP$1347$RP$17h88cf42702e5a322aE.exit":
%2 = bitcast { i64*, i64 }* %1 to i8*
%3 = bitcast { i64*, i64 }* %0 to i8*
tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %3, i8* %2, i64 16, i32 8, i1 false)
tail call void @llvm.memset.p0i8.i64(i8* %2, i8 0, i64 16, i32 8, i1 false)
ret void
}
````
But with lifetime intrinsics we get:
````llvm
define void @_ZN3bar20h330fa42547df8179niaE({ i64*, i64 }* noalias nocapture nonnull sret, { i64*, i64 }* noalias nocapture nonnull) unnamed_addr #0 {
"_ZN29_$LP$Box$LT$int$GT$$C$int$RP$39glue_drop.$x22glue_drop$x22$LP$1347$RP$17h88cf42702e5a322aE.exit":
%2 = bitcast { i64*, i64 }* %1 to i8*
%3 = bitcast { i64*, i64 }* %0 to i8*
tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %3, i8* %2, i64 16, i32 8, i1 false)
tail call void @llvm.lifetime.end(i64 16, i8* %2)
ret void
}
````
Fixes#15665
Adds a new runtime unwinding function that encapsulates the printing of the words "explicit failure" when `fail!()` is called w/o arguments.
The before/after optimized assembly:
```
leaq "str\"str\"(1412)"(%rip), %rax
movq %rax, 24(%rsp)
movq $16, 32(%rsp)
leaq "str\"str\"(1413)"(%rip), %rax
movq %rax, 8(%rsp)
movq $19, 16(%rsp)
leaq 24(%rsp), %rdi
leaq 8(%rsp), %rsi
movl $11, %edx
callq _ZN6unwind12begin_unwind21h15836560661922107792E
```
```
leaq "str\"str\"(1369)"(%rip), %rax
movq %rax, 8(%rsp)
movq $19, 16(%rsp)
leaq 8(%rsp), %rdi
movl $11, %esi
callq _ZN6unwind31begin_unwind_no_time_to_explain20hd1c720cdde6a116480dE@PLT
```
Before/after filesizes:
rwxrwxr-x 1 brian brian 21479503 Jul 20 22:09 stage2-old/lib/librustc-4e7c5e5c.so
rwxrwxr-x 1 brian brian 21475415 Jul 20 22:30 x86_64-unknown-linux-gnu/stage2/lib/librustc-4e7c5e5c.so
This is the lowest-hanging fruit in the fail-bloat wars. Further fixes are going to require harder tradeoffs.
r? @pcwalton
Lifetime intrinsics help to reduce stack usage, because LLVM can apply
stack coloring to reuse the stack slots of dead allocas for new ones.
For example these functions now both use the same amount of stack, while
previous `bar()` used five times as much as `foo()`:
````rust
fn foo() {
println("{}", 5);
}
fn bar() {
println("{}", 5);
println("{}", 5);
println("{}", 5);
println("{}", 5);
println("{}", 5);
}
````
On top of that, LLVM can also optimize out certain operations when it
knows that memory is dead after a certain point. For example, it can
sometimes remove the zeroing used to cancel the drop glue. This is
possible when the glue drop itself was already removed because the
zeroing dominated the drop glue call. For example in:
````rust
pub fn bar(x: (Box<int>, int)) -> (Box<int>, int) {
x
}
````
With optimizations, this currently results in:
````llvm
define void @_ZN3bar20h330fa42547df8179niaE({ i64*, i64 }* noalias nocapture nonnull sret, { i64*, i64 }* noalias nocapture nonnull) unnamed_addr #0 {
"_ZN29_$LP$Box$LT$int$GT$$C$int$RP$39glue_drop.$x22glue_drop$x22$LP$1347$RP$17h88cf42702e5a322aE.exit":
%2 = bitcast { i64*, i64 }* %1 to i8*
%3 = bitcast { i64*, i64 }* %0 to i8*
tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %3, i8* %2, i64 16, i32 8, i1 false)
tail call void @llvm.memset.p0i8.i64(i8* %2, i8 0, i64 16, i32 8, i1 false)
ret void
}
````
But with lifetime intrinsics we get:
````llvm
define void @_ZN3bar20h330fa42547df8179niaE({ i64*, i64 }* noalias nocapture nonnull sret, { i64*, i64 }* noalias nocapture nonnull) unnamed_addr #0 {
"_ZN29_$LP$Box$LT$int$GT$$C$int$RP$39glue_drop.$x22glue_drop$x22$LP$1347$RP$17h88cf42702e5a322aE.exit":
%2 = bitcast { i64*, i64 }* %1 to i8*
%3 = bitcast { i64*, i64 }* %0 to i8*
tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %3, i8* %2, i64 16, i32 8, i1 false)
tail call void @llvm.lifetime.end(i64 16, i8* %2)
ret void
}
````
Fixes#15665
- Made custom syntax extensions capable of expanding custom macros by moving `SyntaxEnv` into `ExtCtx`
- Added convenience method on `ExtCtx` for getting a macro expander.
- Made a few things private to force only a single way to use them (through `ExtCtx`)
- Removed some ancient commented-out code
Closes#14946
Closes#15690 (Guide: improve error handling)
Closes#15729 (Guide: guessing game)
Closes#15751 (repair macro docs)
Closes#15766 (rustc: Print a smaller hash on -v)
Closes#15815 (Add unit test for rlibc)
Closes#15820 (Minor refactoring and features in rustc driver for embedders)
Closes#15822 (rustdoc: Add an --extern flag analagous to rustc's)
Closes#15824 (Document Deque trait and bitv.)
Closes#15832 (syntax: Join consecutive string literals in format strings together)
Closes#15837 (Update LLVM to include NullCheckElimination pass)
Closes#15841 (Rename to_str to to_string)
Closes#15847 (Purge #[!resolve_unexported] from the compiler)
Closes#15848 (privacy: Add publically-reexported foreign item to exported item set)
Closes#15849 (fix string in from_utf8_lossy_100_multibyte benchmark)
Closes#15850 (Get rid of few warnings in tests)
Closes#15852 (Clarify the std::vec::Vec::with_capacity docs)
Emit a single rt::Piece per consecutive string literals. String literals
are split on {{ or }} escapes.
Saves a small amount of static storage and emitted code size.