Disable the PLT where possible to improve performance
for indirect calls into shared libraries.
This optimization is enabled by default where possible.
- Add the `NonLazyBind` attribute to `rustllvm`:
This attribute informs LLVM to skip PLT calls in codegen.
- Disable PLT unconditionally:
Apply the `NonLazyBind` attribute on every function.
- Only enable no-plt when full relro is enabled:
Ensures we only enable it when we have linker support.
- Add `-Z plt` as a compiler option
This is only applicable when neither of --emit=llvm-ir or --emit=llvm-bc are not
requested.
In case either of these outputs are wanted, but the benefits of such context are
desired as well, -Zfewer_names option provides the same functionality regardless
of the outputs requested.
As per the LLVM style guide, use CamelCase for all locals and classes,
and camelCase for all non-FFI functions.
Also, make names of variables of commonly used types more consistent.
Fixes#38688.
The librustc_llvm API remains mostly unchanged, except that llvm::Attribute is no longer a bitflag but represents only a *single* attribute.
The ability to store many attributes in a small number of bits and modify them without interacting with LLVM is only used in rustc_trans::abi and closely related modules, and only attributes for function arguments are considered there.
Thus rustc_trans::abi now has its own bit-packed representation of argument attributes, which are translated to rustc_llvm::Attribute when applying the attributes.
This commit updates the LLVM submodule in use to the current HEAD of the LLVM
repository. This is primarily being done to start picking up unwinding support
for MSVC, which is currently unimplemented in the revision of LLVM we are using.
Along the way a few changes had to be made:
* As usual, lots of C++ debuginfo bindings in LLVM changed, so there were some
significant changes to our RustWrapper.cpp
* As usual, some pass management changed in LLVM, so clang was re-scrutinized to
ensure that we're doing the same thing as clang.
* Some optimization options are now passed directly into the
`PassManagerBuilder` instead of through CLI switches to LLVM.
* The `NoFramePointerElim` option was removed from LLVM, favoring instead the
`no-frame-pointer-elim` function attribute instead.
Additionally, LLVM has picked up some new optimizations which required fixing an
existing soundness hole in the IR we generate. It appears that the current LLVM
we use does not expose this hole. When an enum is moved, the previous slot in
memory is overwritten with a bit pattern corresponding to "dropped". When the
drop glue for this slot is run, however, the switch on the discriminant can
often start executing the `unreachable` block of the switch due to the
discriminant now being outside the normal range. This was patched over locally
for now by having the `unreachable` block just change to a `ret void`.
* Removes `RustJITMemoryManager` from public API.
This was really sort of an implementation detail to begin with.
* `__morestack` is linked to C++ wrapper code and this pointer
is used when resolving the symbol for `ExecutionEngine` code.
* `__morestack_addr` is also resolved for `ExecutionEngine` code.
This function is sometimes referenced in LLVM-generated code,
but was not able to be resolved on Mac OS systems.
* Added Windows support to `ExecutionEngine` API.
* Added a test for basic `ExecutionEngine` functionality.
Many of the instances of setting a global error variable ended up leaving a
dangling pointer into free'd memory. This changes the method of error
transmission to strdup any error and "relinquish ownership" to rustc when it
gets an error. The corresponding Rust code will then free the error as
necessary.
Closes#12865
This comes with a number of fixes to be compatible with upstream LLVM:
* Previously all monomorphizations of "mem::size_of()" would receive the same
symbol. In the past LLVM would silently rename duplicated symbols, but it
appears to now be dropping the duplicate symbols and functions now. The symbol
names of monomorphized functions are now no longer solely based on the type of
the function, but rather the type and the unique hash for the
monomorphization.
* Split stacks are no longer a global feature controlled by a flag in LLVM.
Instead, they are opt-in on a per-function basis through a function attribute.
The rust #[no_split_stack] attribute will disable this, otherwise all
functions have #[split_stack] attached to them.
* The compare and swap instruction now takes two atomic orderings, one for the
successful case and one for the failure case. LLVM internally has an
implementation of calculating the appropriate failure ordering given a
particular success ordering (previously only a success ordering was
specified), and I copied that into the intrinsic translation so the failure
ordering isn't supplied on a source level for now.
* Minor tweaks to LLVM's API in terms of debuginfo, naming, c++11 conventions,
etc.
The travis builds have been breaking recently because LLVM 3.5 upstream is
changing. This looks like it's likely to continue, so it would be more useful
for us if we could lock ourselves to a system LLVM version that is not changing.
This commit has the support to bring our C++ glue to LLVM back in line with what
was possible back in LLVM 3.{3,4}. I don't think we're going to be able to
reasonably protect against regressions in the future, but this kind of code is a
good sign that we can continue to use the system LLVM for simple-ish things.
Codegen for ARM won't work and it won't have some of the perf improvements we
have, but using the system LLVM should work well enough for development.
This upgrade brings commit by @eddyb to help optimizations of virtual calls in
a few places (6d2bd95) as well as a
commit by @c-a to *greatly* improve the runtime of the optimization passes
(https://github.com/rust-lang/llvm/pull/3).
Nice work to these guys!
This commit implements LTO for rust leveraging LLVM's passes. What this means
is:
* When compiling an rlib, in addition to insdering foo.o into the archive, also
insert foo.bc (the LLVM bytecode) of the optimized module.
* When the compiler detects the -Z lto option, it will attempt to perform LTO on
a staticlib or binary output. The compiler will emit an error if a dylib or
rlib output is being generated.
* The actual act of performing LTO is as follows:
1. Force all upstream libraries to have an rlib version available.
2. Load the bytecode of each upstream library from the rlib.
3. Link all this bytecode into the current LLVM module (just using llvm
apis)
4. Run an internalization pass which internalizes all symbols except those
found reachable for the local crate of compilation.
5. Run the LLVM LTO pass manager over this entire module
6a. If assembling an archive, then add all upstream rlibs into the output
archive. This ignores all of the object/bitcode/metadata files rust
generated and placed inside the rlibs.
6b. If linking a binary, create copies of all upstream rlibs, remove the
rust-generated object-file, and then link everything as usual.
As I have explained in #10741, this process is excruciatingly slow, so this is
*not* turned on by default, and it is also why I have decided to hide it behind
a -Z flag for now. The good news is that the binary sizes are about as small as
they can be as a result of LTO, so it's definitely working.
Closes#10741Closes#10740
Beforehand, it was unclear whether rust was performing the "recommended set" of
optimizations provided by LLVM for code. This commit changes the way we run
passes to closely mirror that of clang, which in theory does it correctly. The
notable changes include:
* Passes are no longer explicitly added one by one. This would be difficult to
keep up with as LLVM changes and we don't guaranteed always know the best
order in which to run passes
* Passes are now managed by LLVM's PassManagerBuilder object. This is then used
to populate the various pass managers run.
* We now run both a FunctionPassManager and a module-wide PassManager. This is
what clang does, and I presume that we *may* see a speed boost from the
module-wide passes just having to do less work. I have no measured this.
* The codegen pass manager has been extracted to its own separate pass manager
to not get mixed up with the other passes
* All pass managers now include passes for target-specific data layout and
analysis passes
Some new features include:
* You can now print all passes being run with `-Z print-llvm-passes`
* When specifying passes via `--passes`, the passes are now appended to the
default list of passes instead of overwriting them.
* The output of `--passes list` is now generated by LLVM instead of maintaining
a list of passes ourselves
* Loop vectorization is turned on by default as an optimization pass and can be
disabled with `-Z no-vectorize-loops`
* LLVM now has a C interface to LLVMBuildAtomicRMW
* The exception handling support for the JIT seems to have been dropped
* Various interfaces have been added or headers have changed
This un-reverts the reverts of the rusti commits made awhile back. These were reverted for an LLVM failure in rustpkg. I believe that this is not a problem with these commits, but rather that rustc is being used in parallel for rustpkg tests (in-process). This is not working yet (almost! see #7011), so I serialized all the tests to run one after another.
@brson, I'm mainly just guessing as to the cause of the LLVM failures in rustpkg tests. I'm confident that running tests in parallel is more likely to be the problem than those commits I made.
Additionally, this fixes two recently reported issues with rusti.
Refactor the optimization passes to explicitly use the passes. This commit
just re-implements the same passes as were already being run.
It also adds an option (behind `-Z`) to run the LLVM lint pass on the
unoptimized IR.