Fix quadratic behavior of repeated vectored writes
Some implementations of `Write::write_vectored` in the standard library (`BufWriter`, `LineWriter`, `Stdout`, `Stderr`) check all buffers to calculate the total length. This is O(n) over the number of buffers.
It's common that only a limited number of buffers is written at a time (e.g. 1024 for `writev(2)`). `write_vectored_all` will then call `write_vectored` repeatedly, leading to a runtime of O(n²) over the number of buffers.
This fix is to only calculate as much as needed if it's needed.
Here's a test program:
```rust
#![feature(write_all_vectored)]
use std::fs::File;
use std::io::{BufWriter, IoSlice, Write};
use std::time::Instant;
fn main() {
let buf = vec![b'\0'; 100_000_000];
let mut slices: Vec<IoSlice<'_>> = buf.chunks(100).map(IoSlice::new).collect();
let mut writer = BufWriter::new(File::create("/dev/null").unwrap());
let start = Instant::now();
write_smart(&slices, &mut writer);
println!("write_smart(): {:?}", start.elapsed());
let start = Instant::now();
writer.write_all_vectored(&mut slices).unwrap();
println!("write_all_vectored(): {:?}", start.elapsed());
}
fn write_smart(mut slices: &[IoSlice<'_>], writer: &mut impl Write) {
while !slices.is_empty() {
// Only try to write as many slices as can be written
let res = writer
.write_vectored(slices.get(..1024).unwrap_or(slices))
.unwrap();
slices = &slices[(res / 100)..];
}
}
```
Before this change:
```
write_smart(): 6.666952ms
write_all_vectored(): 498.437092ms
```
After this change:
```
write_smart(): 6.377158ms
write_all_vectored(): 6.923412ms
```
`LineWriter` (and by extension `Stdout`) isn't fully repaired by this because it looks for newlines. I could open an issue for that after this is merged, I think it's fixable but not trivially.