Rollup merge of #137880 - EnzymeAD:autodiff-batching, r=oli-obk

Autodiff batching

Enzyme supports batching, a technique best known from ML, where it is used when training neural networks.
A typical training loop passes in some data (e.g. an image) and a target vector in each iteration. Based on how close the prediction is to the target, we compute the loss, then use backpropagation to compute the gradients and update the weights.
Processing one sample per iteration is quite inefficient, so the usual approach is to pass in a batch of 8, 16, or more images and targets and compute the gradients for all of them at once, which allows better optimizations.

Enzyme supports batching in two ways. The first one (which I implemented here) just accepts a batch size N,
and then each Dual/Duplicated argument has not one, but N shadow arguments. So instead of
```rs
for i in 0..100 {
   df(x[i], y[i], 1234);
}
```
You can now do
```rs
for i in (0..100).step_by(4) {
   df(x[i+0],x[i+1],x[i+2],x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234);
}
```
which will give the same results, but allows better compiler optimizations. See the testcase for details.
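
To make the "N shadow arguments" idea concrete, here is a minimal hand-written sketch. It does not use Enzyme or the autodiff attribute; `df` and `df_batched` are hypothetical functions written by hand for `f(x, y) = x * x * y`, and the argument order is an assumption chosen for readability. It only illustrates that one batched call with N shadows per argument yields the same tangents as N separate calls.

```rust
/// Hand-written forward-mode derivative of f(x, y) = x * x * y.
/// Takes one shadow (tangent seed) per argument and returns (primal, tangent).
fn df(x: f64, dx: f64, y: f64, dy: f64) -> (f64, f64) {
    let primal = x * x * y;
    let tangent = 2.0 * x * y * dx + x * x * dy;
    (primal, tangent)
}

/// Batched variant: still one primal per argument, but N shadows each.
/// The partial derivatives are computed once and reused for every shadow.
fn df_batched<const N: usize>(
    x: f64,
    dx: [f64; N],
    y: f64,
    dy: [f64; N],
) -> (f64, [f64; N]) {
    let primal = x * x * y;
    let dfdx = 2.0 * x * y;
    let dfdy = x * x;
    let mut tangents = [0.0; N];
    for i in 0..N {
        tangents[i] = dfdx * dx[i] + dfdy * dy[i];
    }
    (primal, tangents)
}

fn main() {
    let (x, y) = (3.0, 2.0);
    let dx = [1.0, 0.0, 1.0, 2.0];
    let dy = [0.0, 1.0, 1.0, -1.0];

    // One batched call with width 4 ...
    let (primal, tangents) = df_batched(x, dx, y, dy);

    // ... agrees with four separate single-shadow calls.
    for i in 0..4 {
        let (p, t) = df(x, dx[i], y, dy[i]);
        assert_eq!(p, primal);
        assert_eq!(t, tangents[i]);
    }
    println!("primal = {primal}, tangents = {tangents:?}");
}
```

In this sketch the saving comes from computing the primal and the partial derivatives once per batch; what Enzyme actually optimizes in the generated code may differ.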

There is a second variant, where we can mark certain arguments, and instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next few days.
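
As a rough illustration of the difference between the two calling conventions, the sketch below views one long shadow slice as `width` equally sized shadows. `split_shadows` is a hypothetical helper, and the contiguous layout (shadow 0 first, then shadow 1, and so on) is an assumption; the description above does not say how Enzyme arranges the N shadows inside the longer argument.

```rust
/// Hypothetical helper: view one long shadow slice as `width` shadows of
/// equal length. Assumes the shadows are stored back to back.
/// Returns `None` if the slice is empty or cannot be split evenly.
fn split_shadows(shadow: &[f32], width: usize) -> Option<Vec<&[f32]>> {
    if width == 0 || shadow.is_empty() || shadow.len() % width != 0 {
        return None;
    }
    let len = shadow.len() / width;
    Some(shadow.chunks_exact(len).collect())
}

fn main() {
    // One slice with 48 floats instead of 4 slices with 12 floats each.
    let long: Vec<f32> = (0..48).map(|i| i as f32).collect();

    let parts = split_shadows(&long, 4).expect("48 is divisible by 4");
    assert_eq!(parts.len(), 4);
    assert!(parts.iter().all(|p| p.len() == 12));

    // Under the assumed layout, the second shadow starts at index 12.
    assert_eq!(parts[1][0], 12.0);

    // A width that does not divide the length is rejected.
    assert!(split_shadows(&long, 5).is_none());
}
```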

I will also add more tests for both modes.

For anyone who prefers a more interactive explanation, here's a video of Tim's LLVM dev talk, where he presents his work: https://www.youtube.com/watch?v=edvaLAL5RqU
I'll also add some other docs to the dev guide and user docs in another PR.

r? ghost

Tracking:

- https://github.com/rust-lang/rust/issues/124509
- https://github.com/rust-lang/rust/issues/135283
Stuart Cook 2025-04-05 13:18:13 +11:00 committed by GitHub
commit c6bf3a01ef
21 changed files with 728 additions and 234 deletions


@ -2,7 +2,7 @@ use std::str::FromStr;
use rustc_abi::ExternAbi;
use rustc_ast::expand::autodiff_attrs::{AutoDiffAttrs, DiffActivity, DiffMode};
-use rustc_ast::{MetaItem, MetaItemInner, attr};
+use rustc_ast::{LitKind, MetaItem, MetaItemInner, attr};
use rustc_attr_parsing::ReprAttr::ReprAlign;
use rustc_attr_parsing::{AttributeKind, InlineAttr, InstructionSetAttr, OptimizeAttr};
use rustc_data_structures::fx::FxHashMap;
@ -805,8 +805,8 @@ fn autodiff_attrs(tcx: TyCtxt<'_>, id: DefId) -> Option<AutoDiffAttrs> {
return Some(AutoDiffAttrs::source());
}
-let [mode, input_activities @ .., ret_activity] = &list[..] else {
-span_bug!(attr.span(), "rustc_autodiff attribute must contain mode and activities");
+let [mode, width_meta, input_activities @ .., ret_activity] = &list[..] else {
+span_bug!(attr.span(), "rustc_autodiff attribute must contain mode, width and activities");
};
let mode = if let MetaItemInner::MetaItem(MetaItem { path: p1, .. }) = mode {
p1.segments.first().unwrap().ident
@ -823,6 +823,30 @@ fn autodiff_attrs(tcx: TyCtxt<'_>, id: DefId) -> Option<AutoDiffAttrs> {
}
};
+let width: u32 = match width_meta {
+MetaItemInner::MetaItem(MetaItem { path: p1, .. }) => {
+let w = p1.segments.first().unwrap().ident;
+match w.as_str().parse() {
+Ok(val) => val,
+Err(_) => {
+span_bug!(w.span, "rustc_autodiff width should fit u32");
+}
+}
+}
+MetaItemInner::Lit(lit) => {
+if let LitKind::Int(val, _) = lit.kind {
+match val.get().try_into() {
+Ok(val) => val,
+Err(_) => {
+span_bug!(lit.span, "rustc_autodiff width should fit u32");
+}
+}
+} else {
+span_bug!(lit.span, "rustc_autodiff width should be an integer");
+}
+}
+};
// First read the ret symbol from the attribute
let ret_symbol = if let MetaItemInner::MetaItem(MetaItem { path: p1, .. }) = ret_activity {
p1.segments.first().unwrap().ident
@ -860,7 +884,7 @@ fn autodiff_attrs(tcx: TyCtxt<'_>, id: DefId) -> Option<AutoDiffAttrs> {
}
}
-Some(AutoDiffAttrs { mode, ret_activity, input_activity: arg_activities })
+Some(AutoDiffAttrs { mode, width, ret_activity, input_activity: arg_activities })
}
pub(crate) fn provide(providers: &mut Providers) {
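
The attribute handling in this file can be sketched outside the compiler as follows. This is a simplified stand-in, not rustc code: `WidthMeta`, `parse_width`, and `split_attr` are hypothetical names, the enum only mimics the two `MetaItemInner` shapes matched above, and errors are returned as `Result`/`Option` where the real code calls `span_bug!`.

```rust
/// Simplified stand-in for the two shapes the width entry can take:
/// a path-like item whose text is parsed, or an integer literal.
enum WidthMeta<'a> {
    Ident(&'a str),
    IntLit(u128),
}

/// Convert the width entry to `u32`, rejecting values that do not fit.
fn parse_width(meta: &WidthMeta<'_>) -> Result<u32, String> {
    match meta {
        WidthMeta::Ident(s) => s
            .parse::<u32>()
            .map_err(|_| format!("width `{s}` should fit u32")),
        WidthMeta::IntLit(v) => {
            u32::try_from(*v).map_err(|_| format!("width `{v}` should fit u32"))
        }
    }
}

/// Split an attribute list into (mode, width, input activities, return
/// activity) with a slice pattern, mirroring the shape of the `let ... else`
/// in the diff. Returns `None` if fewer than three entries are present.
fn split_attr<T>(list: &[T]) -> Option<(&T, &T, &[T], &T)> {
    let [mode, width, inputs @ .., ret] = list else {
        return None;
    };
    Some((mode, width, inputs, ret))
}

fn main() {
    let items = ["Forward", "4", "Dual", "Const", "Dual"];
    let (mode, width, inputs, ret) = split_attr(&items[..]).expect("enough entries");

    assert_eq!(*mode, "Forward");
    assert_eq!(parse_width(&WidthMeta::Ident(*width)), Ok(4));
    assert_eq!(inputs.len(), 2);
    assert_eq!(*ret, "Dual");

    // An integer literal that exceeds u32::MAX is rejected.
    assert!(parse_width(&WidthMeta::IntLit(1u128 << 40)).is_err());
}
```

Because the width sits at a fixed position right after the mode, the `inputs @ ..` rest pattern still captures any number of input activities, including none.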