1
Fork 0

Adjust -Ctarget-cpu=native handling in cg_llvm

When cg_llvm encounters the `-Ctarget-cpu=native` it computes an
explciit set of features that applies to the target in order to
correctly compile code for the host CPU (because e.g. `skylake` alone is
not sufficient to tell if some of the instructions are available or
not).

However there were a couple of issues with how we did this. Firstly, the
order in which features were overriden wasn't quite right – conceptually
you'd expect `-Ctarget-cpu=native` option to override the features that
are implicitly set by the target definition. However due to how other
`-Ctarget-cpu` values are handled we must adopt the following order
of priority:

* Features from -Ctarget-cpu=*; are overriden by
* Features implied by --target; are overriden by
* Features from -Ctarget-feature; are overriden by
* function specific features.

Another problem was in that the function level `target-features`
attribute would overwrite the entire set of the globally enabled
features, rather than just the features the
`#[target_feature(enable/disable)]` specified. With something like
`-Ctarget-cpu=native` we'd end up in a situation wherein a function
without `#[target_feature(enable)]` annotation would have a broader
set of features compared to a function with one such attribute. This
turned out to be a cause of heavy run-time regressions in some code
using these function-level attributes in conjunction with
`-Ctarget-cpu=native`, for example.

With this PR rustc is more careful about specifying the entire set of
features for functions that use `#[target_feature(enable/disable)]` or
`#[instruction_set]` attributes.

Sadly testing the original reproducer for this behaviour is quite
impossible – we cannot rely on `-Ctarget-cpu=native` to be anything in
particular on developer or CI machines.
This commit is contained in:
Simonas Kazlauskas 2021-03-13 15:29:39 +02:00
parent 0517acd543
commit 72fb4379d5
7 changed files with 155 additions and 52 deletions

View file

@ -152,18 +152,6 @@ fn set_probestack(cx: &CodegenCx<'ll, '_>, llfn: &'ll Value) {
}
}
pub fn llvm_target_features(sess: &Session) -> impl Iterator<Item = &str> {
const RUSTC_SPECIFIC_FEATURES: &[&str] = &["crt-static"];
let cmdline = sess
.opts
.cg
.target_feature
.split(',')
.filter(|f| !RUSTC_SPECIFIC_FEATURES.iter().any(|s| f.contains(s)));
sess.target.features.split(',').chain(cmdline).filter(|l| !l.is_empty())
}
pub fn apply_target_cpu_attr(cx: &CodegenCx<'ll, '_>, llfn: &'ll Value) {
let target_cpu = SmallCStr::new(llvm_util::target_cpu(cx.tcx.sess));
llvm::AddFunctionAttrStringValue(
@ -301,20 +289,22 @@ pub fn from_fn_attrs(cx: &CodegenCx<'ll, 'tcx>, llfn: &'ll Value, instance: ty::
// The target doesn't care; the subtarget reads our attribute.
apply_tune_cpu_attr(cx, llfn);
let features = llvm_target_features(cx.tcx.sess)
.map(|s| s.to_string())
.chain(codegen_fn_attrs.target_features.iter().map(|f| {
let function_features = codegen_fn_attrs
.target_features
.iter()
.map(|f| {
let feature = &f.as_str();
format!("+{}", llvm_util::to_llvm_feature(cx.tcx.sess, feature))
}))
})
.chain(codegen_fn_attrs.instruction_set.iter().map(|x| match x {
InstructionSetAttr::ArmA32 => "-thumb-mode".to_string(),
InstructionSetAttr::ArmT32 => "+thumb-mode".to_string(),
}))
.collect::<Vec<String>>()
.join(",");
if !features.is_empty() {
.collect::<Vec<String>>();
if !function_features.is_empty() {
let mut global_features = llvm_util::llvm_global_features(cx.tcx.sess);
global_features.extend(function_features.into_iter());
let features = global_features.join(",");
let val = CString::new(features).unwrap();
llvm::AddFunctionAttrStringValue(
llfn,