- Rust High Performance
- Iban Eguia Moraza
- 2021-08-27 19:59:13
Using iterators
There is a way around this, though, that gives the same effect as the C/C++ code: using iterators. The previous code can be converted into the following:
let arr = ['a', 'b', 'c', 'd', 'e', 'f'];
for c in &arr {
    println!("{}", c);
}
This will compile to roughly the same machine code as the C/C++ variant, since it won't check the bounds of the slice more than once and will then use the same pointer arithmetic. This is great when iterating through a slice, but for a direct lookup it can be an issue. Suppose we receive thousands of 100-element slices, and we are supposed to get the last element of each and print it. In this case, iterating through all 100 elements of each slice just to reach the last one is a bad idea; it is more efficient to bounds check only the last element. There are a couple of ways of doing this.
The first one is straightforward:
for arr in array_of_arrays {
    let last_index = arr.len() - 1;
    println!("{}", arr[last_index]);
}
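One caveat with the snippet above: arr.len() - 1 underflows when a slice is empty. That panics in debug builds, and in release builds the subtraction wraps around by default, so the indexing operation panics instead. A minimal sketch of a guard, assuming the inner slices may be empty; the helper name last_by_index is hypothetical and only used for illustration:

```rust
/// Returns the last element by computing its index explicitly.
/// `last_by_index` is a hypothetical helper, not part of the original example.
fn last_by_index<T>(arr: &[T]) -> Option<&T> {
    // `checked_sub` yields `None` instead of underflowing on an empty slice.
    let last_index = arr.len().checked_sub(1)?;
    // The index is known to be in range here, so this lookup cannot panic.
    Some(&arr[last_index])
}
```

Calling last_by_index(&[1, 2, 3]) gives Some(&3), while an empty slice gives None rather than a panic.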
In this concrete case, where we want to get the last element, we can do something like this:
for arr in array_of_arrays {
    if let Some(elt) = arr.iter().rev().next() {
        println!("{}", elt);
    }
}
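This will reverse the iterator with the call to rev() and then get the next element (the last one). Slice iterators are double-ended, so this does not walk through the whole slice. If the element exists, it will print it. But if we have to get an element that is close to neither the end nor the beginning of the slice, the best way is to use the get() method (the index 125 used here assumes slices longer than the 100-element ones mentioned earlier; on a shorter slice, get() simply returns None):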
for arr in array_of_arrays {
    if let Some(elt) = arr.get(125) {
        println!("{}", elt);
    }
}
This last one performs two checks, though. get() first checks whether the index is in bounds to decide between returning Some(elt) and None, and then the if let checks which of the two was returned. If we know for sure, and I mean 100% sure, that the index is always inside the slice, we can use get_unchecked() to get the element. This is the exact equivalent of the C/C++ indexing operation: it does no bounds checking, which allows for better performance, but it is unsafe to use, because an out-of-bounds index is undefined behavior. So in the HTTP example before, an attacker would be able to read whatever was stored at that index even if it was a memory address outside the slice. You will need to use an unsafe scope, of course:
for arr in array_of_arrays {
    println!("{}", unsafe { arr.get_unchecked(125) });
}
The get_unchecked() function returns a reference directly rather than an Option, so there is no Some or None to check. With an out-of-bounds index its behavior is undefined: it may return garbage or segfault. Remember also that a segfault is not a panic, so no destructors will be called. It should only be used if a safe alternative would not meet the performance requirements and if the bounds of the slice are known beforehand.
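The difference between the checked and unchecked lookups can be sketched side by side. This is only an illustration under stated assumptions: both function names are hypothetical, the element type is fixed to u32 for brevity, and the unchecked variant is declared as an unsafe fn so that the obligation to know the bounds is pushed onto the caller.

```rust
/// Safe lookup: `get` performs one bounds check per slice and reports
/// a miss as `None`. Hypothetical helper for illustration.
fn nth_of_each(slices: &[&[u32]], index: usize) -> Vec<Option<u32>> {
    slices.iter().map(|arr| arr.get(index).copied()).collect()
}

/// Same lookup without bounds checks. Hypothetical helper for illustration.
///
/// # Safety
/// The caller must guarantee that every slice is longer than `index`;
/// otherwise the behavior is undefined.
unsafe fn nth_of_each_unchecked(slices: &[&[u32]], index: usize) -> Vec<u32> {
    slices
        .iter()
        .map(|arr| {
            // Catches contract violations in debug builds only.
            debug_assert!(index < arr.len());
            // SAFETY: the caller guarantees `index < arr.len()`.
            unsafe { *arr.get_unchecked(index) }
        })
        .collect()
}
```

The safe version can be called on any input, while the unchecked one has to be wrapped in an unsafe block by a caller that has already validated the lengths.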
In most cases, you will want to use an iterator. Iterators allow for precise iteration over elements: filtering them, skipping some, taking a maximum number of them and, finally, collecting them into a collection. They can even be extended or joined with other iterators to build almost any kind of solution. All of this is provided by the std::iter::Iterator trait. You now understand the most-used methods of the trait, and I leave the rest to you to research in the standard library documentation.
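As a brief illustration of the adaptors mentioned above, the following sketch joins two slices, skips an element, filters, caps the number of results, and collects them. The function name first_evens and its parameters are made up for the example:

```rust
/// Chains two slices, skips the first element, keeps the even numbers,
/// takes at most `limit` of them, and collects the result into a `Vec`.
/// `first_evens` is a hypothetical helper for illustration only.
fn first_evens(first: &[u32], second: &[u32], limit: usize) -> Vec<u32> {
    first
        .iter()
        .chain(second.iter())     // join with another iterator
        .skip(1)                  // skip the first element
        .filter(|n| **n % 2 == 0) // keep only even numbers
        .take(limit)              // cap the number of results
        .copied()                 // turn `&u32` items into `u32`
        .collect()                // gather everything into a `Vec`
}
```

Nothing is computed until collect() drives the chain, so elements beyond the limit are never visited.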
It's important to understand and properly use iterators, since they are very useful for writing really fast loops. Iterators are zero-cost abstractions that produce the same elements as indexing would, but without requiring a bounds check on each access, making them ideal for efficiency improvements.
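To make the comparison concrete, here is a small sketch of the same loop written both ways. The function names are hypothetical, and whether the optimizer removes the bounds check in the indexed version depends on the compiler and the surrounding code, so treat this as an illustration rather than a benchmark:

```rust
/// Index-based loop: every `arr[i]` is a bounds-checked access, which the
/// optimizer may or may not be able to remove.
fn sum_indexed(arr: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..arr.len() {
        total += arr[i];
    }
    total
}

/// Iterator-based loop: there is no index, so there is nothing to bounds check.
fn sum_iter(arr: &[u64]) -> u64 {
    arr.iter().sum()
}
```

Both functions return the same result for any input; only the way the elements are reached differs.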