Add inherent methods in libcore for [T], [u8], str, f32, and f64
# Background
Primitive types are defined by the language, they don’t have a type definition like `pub struct Foo { … }` in any crate. So they don’t “belong” to any crate as far as `impl` coherence is concerned, and on principle no crate would be able to define inherent methods for them, without a trait. Since we want these types to have inherent methods anyway, the standard library (with cooperation from the compiler) bends this rule with code like [`#[lang = "u8"] impl u8 { /*…*/ }`](https://github.com/rust-lang/rust/blob/1.25.0/src/libcore/num/mod.rs#L2244-L2245). The `#[lang]` attribute is permanently-unstable and never intended to be used outside of the standard library.
Each lang item can only be defined once. Before this PR there is one impl-coherence-rule-bending lang item per primitive type (plus one for `[u8]`, which overlaps with `[T]`). And so one `impl` block each. These blocks for `str`, `[T]` and `[u8]` are in liballoc rather than libcore because *some* of the methods (like `<[T]>::to_vec(&self) -> Vec<T> where T: Clone`) need a global memory allocator which we don’t want to make a requirement in libcore. Similarly, `impl f32` and `impl f64` are in libstd because some of the methods are based on FFI calls to C’s `libm` and we want, as much as possible, libcore not to require “runtime support”.
In libcore, the methods of `str` and `[T]` that don’t allocate are made available through two **unstable traits** `StrExt` and `SliceExt` (so the traits can’t be *named* by programs on the Stable release channel) that have **stable methods** and are re-exported in the libcore prelude (so that programs on Stable can *call* these methods anyway). Non-allocating `[u8]` methods are not available in libcore: https://github.com/rust-lang/rust/issues/45803. Some `f32` and `f64` methods are in an unstable `core::num::Float` trait with stable methods, but that one is **not in the libcore prelude**. (So as far as Stable programs are concerns it doesn’t exist, and I don’t know what the point was to mark these methods `#[stable]`.)
https://github.com/rust-lang/rust/issues/32110 is the tracking issue for these unstable traits.
# High-level proposal
Since the standard library is already bending the rules, why not bend them *a little more*? By defining a few additional lang items, the compiler can allow the standard library to have *two* `impl` blocks (in different crates) for some primitive types.
The `StrExt` and `SliceExt` traits still exist for now so that we can bootstrap from a previous-version compiler that doesn’t have these lang items yet, but they can be removed in next release cycle. (`Float` is used internally and needs to be public for libcore unit tests, but was already `#[doc(hidden)]`.) I don’t know if https://github.com/rust-lang/rust/issues/32110 should be closed by this PR, or only when the traits are entirely removed after we make a new bootstrap compiler.
# Float methods
Among the methods of the `core::num::Float` trait, three are based on LLVM intrinsics: `abs`, `signum`, and `powi`. PR https://github.com/rust-lang/rust/pull/27823 “Remove dependencies on libm functions from libcore” moved a bunch of `core::num::Float` methods back to libstd, but left these three behind. However they aren’t specifically discussed in the PR thread. The `compiler_builtins` crate defines `__powisf2` and `__powidf2` functions that look like implementations of `powi`, but I couldn’t find a connection with the `llvm.powi.f32` and `llvm.powi.f32` intrinsics by grepping through LLVM’s code.
In discussion starting at https://github.com/rust-lang/rust/issues/32110#issuecomment-370647922 Alex says that we do not want methods in libcore that require “runtime support”, but it’s not clear whether that applies to these `abs`, `signum`, or `powi`. In doubt, I’ve **removed** them for the trait and moved them to inherent methods in libstd for now. We can move them back later (or in this PR) if we decide that’s appropriate.
# Change details
For users on the Stable release channel:
* I believe this PR does not make any breaking change
* Some methods for `[u8]`, `f32`, and `f64` are newly available to `#![no_std]` users (fixes https://github.com/rust-lang/rust/issues/45803)
* There should be no visible change for `std` users in terms of what programs compile or what their behavior is. (Only in compiler error messages, possibly.)
For Nightly users, additionally:
* The unstable `StrExt` and `SliceExt` traits are gone
* Their methods are now inherent methods of `str` and `[T]` (so only code that explicitly named the traits should be affected, not "normal" method calls)
* The `abs`, `signum` and `powi` methods of the `Float` trait are gone
* The `Float` trait’s unstable feature name changed to `float_internals` with no associated tracking issue, to reflect it being a permanently unstable implementation detail rather than a public API on a path to stabilization.
* Its remaining methods are now inherent methods of `f32` and `f64`.
-----
CC @rust-lang/libs for the API changes, @rust-lang/compiler for the new lang items
Add Iterator::find_map
I'd like to propose to add `find_map` method to the `Iterator`: an occasionally useful utility, which relates to `filter_map` in the same way that `find` relates to `filter`.
`find_map` takes an `Option`-returning function, applies it to the elements of the iterator, and returns the first non-`None` result. In other words, `find_map(f) == filter_map(f).next()`.
Why do we want to add a function to the `Iterator`, which can be trivially expressed as a combination of existing ones? Observe that `find(f) == filter(f).next()`, so, by the same logic, `find` itself is unnecessary!
The more positive argument is that desugaring of `find[_map]` in terms of `filter[_map]().next()` is not super obvious, because the `filter` operation reads as if it is applies to the whole collection, although in reality we are interested only in the first element. That is, the jump from "I need a **single** result" to "let's use a function which maps **many** values to **many** values" is a non-trivial speed-bump, and causes friction when reading and writing code.
Does the need for `find_map` arise in practice? Yes!
* Anecdotally, I've more than once searched the docs for the function with `[T] -> (T -> Option<U>) -> Option<U>` signature.
* The direct cause for this PR was [this](1291c50e86 (r174934173)) discussion in Cargo, which boils down to "there's some pattern that we try to express here, but current approaches looks non-pretty" (and the pattern is `filter_map`
* There are several `filter_map().next` combos in Cargo: [[1]](545a4a2c93/src/cargo/ops/cargo_new.rs (L585)), [[2]](545a4a2c93/src/cargo/core/resolver/mod.rs (L1130)), [[3]](545a4a2c93/src/cargo/ops/cargo_rustc/mod.rs (L1086)).
* I've also needed similar functionality in `Kotlin` several times. There, it is expressed as `mapNotNull {}.firstOrNull`, as can be seen [here](ee8bdb4e07/src/main/kotlin/org/rust/cargo/project/model/impl/CargoProjectImpl.kt (L154)), [here](ee8bdb4e07/src/main/kotlin/org/rust/lang/core/resolve/ImplLookup.kt (L444)) [here](ee8bdb4e07/src/main/kotlin/org/rust/ide/inspections/RsLint.kt (L38)) and [here](ee8bdb4e07/src/main/kotlin/org/rust/cargo/toolchain/RustToolchain.kt (L74)) (and maybe in some other cases as well)
Note that it is definitely not among the most popular functions (it definitely is less popular than `find`), but, for example it (in case of Cargo) seems to be more popular than `rposition` (1 occurrence), `step_by` (zero occurrences) and `nth` (three occurrences as `nth(0)` which probably should be replaced with `next`).
Do we necessary need this function in `std`? Could we move it to itertools? That is possible, but observe that `filter`, `filter_map`, `find` and `find_map` together really form a complete table:
|||
|-------|---------|
| filter| find|
|filter_map|find_map|
It would be somewhat unsatisfying to have one quarter of this table live elsewhere :) Also, if `Itertools` adds an `find_map` method, it would be more difficult to move it to std due to name collision.
Hm, at this point I've searched for `filter_map` the umpteenth time, and, strangely, this time I do find this RFC: https://github.com/rust-lang/rfcs/issues/1801. I guess this could be an implementation though? :)
To sum up:
Pro:
- complete the symmetry with existing method
- codify a somewhat common non-obvious pattern
Contra:
- niche use case
- we can, and do, live without it
Move ascii::escape_default to libcore
As requested in #46409, the `ascii::escape_default` method has been added to the core library. All I did was copy over the `std::ascii` module file, remove the (redundant) `AsciiExt` trait, and change some of the documentation to match. None of the tests were changed.
I wasn't sure how to handle the annotations. For `EscapeDefault` and `escape_default()`, I changed them to `#[unstable(feature = "core_ascii", issue = "46409")]`. Is that alright? Or should I leave them as they were?
Required moving all fulldeps tests depending on `rand` to different locations as
now there's multiple `rand` crates that can't be implicitly linked against.
Add std/core::iter::repeat_with
Adds an iterator primitive `repeat_with` which is the "lazy" version of `repeat` but also more flexible since you can build up state with the `FnMut`. The design is mostly taken from `repeat`.
r? @rust-lang/libs
cc @withoutboats, @scottmcm
Add Range[Inclusive]::is_empty
During https://github.com/rust-lang/rfcs/pull/1980, it was discussed that figuring out whether a range is empty was subtle, and thus there should be a clear and obvious way to do it. It can't just be ExactSizeIterator::is_empty (also unstable) because not all ranges are ExactSize -- such as `Range<i64>` and `RangeInclusive<usize>`.
Things to ponder:
- Unless this is stabilized first, this makes stabilizing ExactSizeIterator::is_empty more icky, since this hides that.
- This is only on `Range` and `RangeInclusive`, as those are the only ones where it's interesting. But one could argue that it should be on more for consistency, or on RangeArgument instead.
- The bound on this is PartialOrd, since that works ok (see tests for float examples) and is consistent with `contains`. But ranges like `NAN..=NAN`_are_ kinda weird.
- [x] ~~There's not a real issue number on this yet~~
During the RFC, it was discussed that figuring out whether a range is empty was subtle, and thus there should be a clear and obvious way to do it. It can't just be ExactSizeIterator::is_empty (also unstable) because not all ranges are ExactSize -- not even Range<i32> or RangeInclusive<usize>.
std: Add a new wasm32-unknown-unknown target
This commit adds a new target to the compiler: wasm32-unknown-unknown. This target is a reimagining of what it looks like to generate WebAssembly code from Rust. Instead of using Emscripten which can bring with it a weighty runtime this instead is a target which uses only the LLVM backend for WebAssembly and a "custom linker" for now which will hopefully one day be direct calls to lld.
Notable features of this target include:
* There is zero runtime footprint. The target assumes nothing exists other than the wasm32 instruction set.
* There is zero toolchain footprint beyond adding the target. No custom linker is needed, rustc contains everything.
* Very small wasm modules can be generated directly from Rust code using this target.
* Most of the standard library is stubbed out to return an error, but anything related to allocation works (aka `HashMap`, `Vec`, etc).
* Naturally, any `#[no_std]` crate should be 100% compatible with this new target.
This target is currently somewhat janky due to how linking works. The "linking" is currently unconditional whole program LTO (aka LLVM is being used as a linker). Naturally that means compiling programs is pretty slow! Eventually though this target should have a linker.
This target is also intended to be quite experimental. I'm hoping that this can act as a catalyst for further experimentation in Rust with WebAssembly. Breaking changes are very likely to land to this target, so it's not recommended to rely on it in any critical capacity yet. We'll let you know when it's "production ready".
### Building yourself
First you'll need to configure the build of LLVM and enable this target
```
$ ./configure --target=wasm32-unknown-unknown --set llvm.experimental-targets=WebAssembly
```
Next you'll want to remove any previously compiled LLVM as it needs to be rebuilt with WebAssembly support. You can do that with:
```
$ rm -rf build
```
And then you're good to go! A `./x.py build` should give you a rustc with the appropriate libstd target.
### Test support
Currently testing-wise this target is looking pretty good but isn't complete. I've got almost the entire `run-pass` test suite working with this target (lots of tests ignored, but many passing as well). The `core` test suite is [still getting LLVM bugs fixed](https://reviews.llvm.org/D39866) to get that working and will take some time. Relatively simple programs all seem to work though!
In general I've only tested this with a local fork that makes use of LLVM 5 rather than our current LLVM 4 on master. The LLVM 4 WebAssembly backend AFAIK isn't broken per se but is likely missing bug fixes available on LLVM 5. I'm hoping though that we can decouple the LLVM 5 upgrade and adding this wasm target!
### But the modules generated are huge!
It's worth nothing that you may not immediately see the "smallest possible wasm module" for the input you feed to rustc. For various reasons it's very difficult to get rid of the final "bloat" in vanilla rustc (again, a real linker should fix all this). For now what you'll have to do is:
cargo install --git https://github.com/alexcrichton/wasm-gc
wasm-gc foo.wasm bar.wasm
And then `bar.wasm` should be the smallest we can get it!
---
In any case for now I'd love feedback on this, particularly on the various integration points if you've got better ideas of how to approach them!
This commit adds a new target to the compiler: wasm32-unknown-unknown. This
target is a reimagining of what it looks like to generate WebAssembly code from
Rust. Instead of using Emscripten which can bring with it a weighty runtime this
instead is a target which uses only the LLVM backend for WebAssembly and a
"custom linker" for now which will hopefully one day be direct calls to lld.
Notable features of this target include:
* There is zero runtime footprint. The target assumes nothing exists other than
the wasm32 instruction set.
* There is zero toolchain footprint beyond adding the target. No custom linker
is needed, rustc contains everything.
* Very small wasm modules can be generated directly from Rust code using this
target.
* Most of the standard library is stubbed out to return an error, but anything
related to allocation works (aka `HashMap`, `Vec`, etc).
* Naturally, any `#[no_std]` crate should be 100% compatible with this new
target.
This target is currently somewhat janky due to how linking works. The "linking"
is currently unconditional whole program LTO (aka LLVM is being used as a
linker). Naturally that means compiling programs is pretty slow! Eventually
though this target should have a linker.
This target is also intended to be quite experimental. I'm hoping that this can
act as a catalyst for further experimentation in Rust with WebAssembly. Breaking
changes are very likely to land to this target, so it's not recommended to rely
on it in any critical capacity yet. We'll let you know when it's "production
ready".
---
Currently testing-wise this target is looking pretty good but isn't complete.
I've got almost the entire `run-pass` test suite working with this target (lots
of tests ignored, but many passing as well). The `core` test suite is still
getting LLVM bugs fixed to get that working and will take some time. Relatively
simple programs all seem to work though!
---
It's worth nothing that you may not immediately see the "smallest possible wasm
module" for the input you feed to rustc. For various reasons it's very difficult
to get rid of the final "bloat" in vanilla rustc (again, a real linker should
fix all this). For now what you'll have to do is:
cargo install --git https://github.com/alexcrichton/wasm-gc
wasm-gc foo.wasm bar.wasm
And then `bar.wasm` should be the smallest we can get it!
---
In any case for now I'd love feedback on this, particularly on the various
integration points if you've got better ideas of how to approach them!
This is the core method in terms of which the other methods (fold, all, any, find, position, nth, ...) can be implemented, allowing Iterator implementors to get the full goodness of internal iteration by only overriding one method (per direction).
Many of the iterator adaptors will perform faster folds if they forward
to their inner iterator's folds, especially for inner types like `Chain`
which are optimized too. The following types are newly specialized:
| Type | `fold` | `rfold` |
| ----------- | ------ | ------- |
| `Enumerate` | ✓ | ✓ |
| `Filter` | ✓ | ✓ |
| `FilterMap` | ✓ | ✓ |
| `FlatMap` | exists | ✓ |
| `Fuse` | ✓ | ✓ |
| `Inspect` | ✓ | ✓ |
| `Peekable` | ✓ | N/A¹ |
| `Skip` | ✓ | N/A² |
| `SkipWhile` | ✓ | N/A¹ |
¹ not a `DoubleEndedIterator`
² `Skip::next_back` doesn't pull skipped items at all, but this couldn't
be avoided if `Skip::rfold` were to call its inner iterator's `rfold`.
Benchmarks
----------
In the following results, plain `_sum` computes the sum of a million
integers -- note that `sum()` is implemented with `fold()`. The
`_ref_sum` variants do the same on a `by_ref()` iterator, which is
limited to calling `next()` one by one, without specialized `fold`.
The `chain` variants perform the same tests on two iterators chained
together, to show a greater benefit of forwarding `fold` internally.
test iter::bench_enumerate_chain_ref_sum ... bench: 2,216,264 ns/iter (+/- 29,228)
test iter::bench_enumerate_chain_sum ... bench: 922,380 ns/iter (+/- 2,676)
test iter::bench_enumerate_ref_sum ... bench: 476,094 ns/iter (+/- 7,110)
test iter::bench_enumerate_sum ... bench: 476,438 ns/iter (+/- 3,334)
test iter::bench_filter_chain_ref_sum ... bench: 2,266,095 ns/iter (+/- 6,051)
test iter::bench_filter_chain_sum ... bench: 745,594 ns/iter (+/- 2,013)
test iter::bench_filter_ref_sum ... bench: 889,696 ns/iter (+/- 1,188)
test iter::bench_filter_sum ... bench: 667,325 ns/iter (+/- 1,894)
test iter::bench_filter_map_chain_ref_sum ... bench: 2,259,195 ns/iter (+/- 353,440)
test iter::bench_filter_map_chain_sum ... bench: 1,223,280 ns/iter (+/- 1,972)
test iter::bench_filter_map_ref_sum ... bench: 611,607 ns/iter (+/- 2,507)
test iter::bench_filter_map_sum ... bench: 611,610 ns/iter (+/- 472)
test iter::bench_fuse_chain_ref_sum ... bench: 2,246,106 ns/iter (+/- 22,395)
test iter::bench_fuse_chain_sum ... bench: 634,887 ns/iter (+/- 1,341)
test iter::bench_fuse_ref_sum ... bench: 444,816 ns/iter (+/- 1,748)
test iter::bench_fuse_sum ... bench: 316,954 ns/iter (+/- 2,616)
test iter::bench_inspect_chain_ref_sum ... bench: 2,245,431 ns/iter (+/- 21,371)
test iter::bench_inspect_chain_sum ... bench: 631,645 ns/iter (+/- 4,928)
test iter::bench_inspect_ref_sum ... bench: 317,437 ns/iter (+/- 702)
test iter::bench_inspect_sum ... bench: 315,942 ns/iter (+/- 4,320)
test iter::bench_peekable_chain_ref_sum ... bench: 2,243,585 ns/iter (+/- 12,186)
test iter::bench_peekable_chain_sum ... bench: 634,848 ns/iter (+/- 1,712)
test iter::bench_peekable_ref_sum ... bench: 444,808 ns/iter (+/- 480)
test iter::bench_peekable_sum ... bench: 317,133 ns/iter (+/- 3,309)
test iter::bench_skip_chain_ref_sum ... bench: 1,778,734 ns/iter (+/- 2,198)
test iter::bench_skip_chain_sum ... bench: 761,850 ns/iter (+/- 1,645)
test iter::bench_skip_ref_sum ... bench: 478,207 ns/iter (+/- 119,252)
test iter::bench_skip_sum ... bench: 315,614 ns/iter (+/- 3,054)
test iter::bench_skip_while_chain_ref_sum ... bench: 2,486,370 ns/iter (+/- 4,845)
test iter::bench_skip_while_chain_sum ... bench: 633,915 ns/iter (+/- 5,892)
test iter::bench_skip_while_ref_sum ... bench: 666,926 ns/iter (+/- 804)
test iter::bench_skip_while_sum ... bench: 444,405 ns/iter (+/- 571)
Delete deprecated & unstable range-specific `step_by`
Using the new one is annoying while this one exists, since the inherent method hides the one on iterator.
Tracking issue: #27741
Replacement: #41439
Deprecation: #42310 for 1.19
Fixes#41477
Pursuant to issue #25663, this commit adds the max and min functions to the Ord trait, enabling items that implement Ord to use UFCS (ex. 1.max(2)) instead of the longer std::cmp::max(1,2) format.
Add an in-place rotate method for slices to libcore
A helpful primitive for moving chunks of data around inside a slice.
For example, if you have a range selected and are drag-and-dropping it somewhere else (Example from [Sean Parent's talk](https://youtu.be/qH6sSOr-yk8?t=560)).
(If this should be an RFC instead of a PR, please let me know.)
Edit: changed example
Override size_hint and propagate ExactSizeIterator for iter::StepBy
Generally useful, but also a prerequisite for moving a bunch of unit tests off `Range*::step_by`.
A small non-breaking subset of https://github.com/rust-lang/rust/pull/42110 (which I closed).
Includes two small documentation changes @ivandardi requested on that PR.
r? @alexcrichton
A helpful primitive for moving chunks of data around inside a slice.
In particular, adding elements to the end of a Vec then moving them
somewhere else, as a way to do efficient multiple-insert. (There's
drain for efficient block-remove, but no easy way to block-insert.)
Talk with another example: <https://youtu.be/qH6sSOr-yk8?t=560>
Not being an enum improves ergonomics, especially since NonEmpty could be Empty. It can still be iterable without an extra "done" bit by making the range have !(start <= end), which is even possible without changing the Step trait.
Implements RFC 1980