Lower intrinsics calls: forget, size_of, unreachable, wrapping_*
This allows constant propagation to evaluate `size_of` and `wrapping_*`,
and unreachable propagation to propagate a call to `unreachable`.
The lowering is performed as a MIR optimization, rather than during MIR
building, to preserve the special status of intrinsics with respect to
unsafety checks and promotion.
Currently enabled by default to determine the performance impact (no
significant impact expected). In practice only useful when combined with
inlining, since intrinsics are rarely used directly (with the exception of
`unreachable` and `discriminant_value` used by built-in derive macros).
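As a rough illustration of the kind of code that benefits, the functions below use only stable wrappers (`std::mem::size_of`, `wrapping_add`, `std::hint::unreachable_unchecked`). At the time of this PR these wrappers forward to the intrinsics named in the title. The function names are made up for this sketch, and whether a given call is actually folded depends on inlining and the optimization level.

```rust
use std::mem::size_of;

/// Once the wrapper is inlined, the `size_of` intrinsic call can be lowered
/// to a constant operand, which constant propagation may then fold.
pub fn bytes_per_u32() -> usize {
    size_of::<u32>()
}

/// `wrapping_add` bottoms out in a `wrapping_*` intrinsic; lowered to a plain
/// binary operation, it becomes visible to constant propagation.
pub fn wrap_around() -> u8 {
    250u8.wrapping_add(10)
}

/// The fallback arm ends in the `unreachable` intrinsic, which unreachable
/// propagation can use to drop the arm entirely.
pub fn low_bit(x: u8) -> bool {
    match x & 1 {
        0 => false,
        1 => true,
        // SAFETY: `x & 1` can only be 0 or 1, so this arm is never taken.
        _ => unsafe { std::hint::unreachable_unchecked() },
    }
}
```

The observable behavior is the same with or without the pass; only the generated MIR differs.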
Closes #32716.
add error_occured field to ConstQualifs
fix #76064
I wasn't sure what `in_return_place` actually does, or why it returns `ConstQualifs` while its sibling functions return `bool`, so I tried to change the structure as little as possible. Please point out whether I have to refactor it or not.
r? `@oli-obk`
cc `@RalfJung`
specialize io::copy to use copy_file_range, splice or sendfile
Fixes #74426.
Also covers #60689 but only as an optimization instead of an official API.
The specialization only covers std-owned structs so it should avoid the problems with #71091
Currently Linux-only, but it should be generalizable to other Unix systems that have sendfile/sosplice and similar.
There is a bit of optimization potential around the syscall count. Right now it may end up doing more syscalls than the naive copy loop when doing short (<8KiB) copies between file descriptors.
The test case executes the following:
```
[pid 103776] statx(3, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=17, ...}) = 0
[pid 103776] write(4, "wxyz", 4) = 4
[pid 103776] write(4, "iklmn", 5) = 5
[pid 103776] copy_file_range(3, NULL, 4, NULL, 5, 0) = 5
```
- 0-1 `stat` calls to identify the source file type; 0 if the type can be inferred from the struct from which the FD was extracted.
- 𝖬 `write` calls to drain the `BufReader`/`BufWriter` wrappers. These only happen when buffers are present, and 𝖬 ≾ number of wrappers present. If there is a write buffer, it may absorb the read buffer contents first, resulting in only a single write. Vectored writes would also be an option, but that would require more invasive changes to `BufWriter`.
- 𝖭 `copy_file_range`/`splice`/`sendfile` calls until the file size, EOF or the byte limit from `Take` is reached. This should generally be *much* more efficient than the read-write loop and also have other benefits such as DMA offload or extent sharing.
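From the caller's side nothing changes: the specialization sits behind the ordinary `io::copy` call. The sketch below shows the kind of call it targets, a buffered file reader limited by `Take` and a buffered file writer. The helper name `copy_prefix` is hypothetical, and whether the fast path is actually taken is an implementation detail that depends on the platform and the concrete types.

```rust
use std::fs::File;
use std::io::{self, BufReader, BufWriter, Read, Write};
use std::path::Path;

/// Copies at most `limit` bytes from `src` to `dst` through the public
/// `io::copy` API and returns the number of bytes copied.
pub fn copy_prefix(src: &Path, dst: &Path, limit: u64) -> io::Result<u64> {
    // `take` caps the number of bytes that `io::copy` will read.
    let mut reader = BufReader::new(File::open(src)?).take(limit);
    let mut writer = BufWriter::new(File::create(dst)?);
    let copied = io::copy(&mut reader, &mut writer)?;
    // Flush explicitly so a buffered write error is reported instead of
    // being swallowed when the writer is dropped.
    writer.flush()?;
    Ok(copied)
}
```

Because all types involved are std-owned, this is the shape of code the PR describes as eligible; user-defined `Read`/`Write` implementations keep using the generic loop.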
## Benchmarks
```
OLD
test io::tests::bench_file_to_file_copy ... bench: 21,002 ns/iter (+/- 750) = 6240 MB/s [ext4]
test io::tests::bench_file_to_file_copy ... bench: 35,704 ns/iter (+/- 1,108) = 3671 MB/s [btrfs]
test io::tests::bench_file_to_socket_copy ... bench: 57,002 ns/iter (+/- 4,205) = 2299 MB/s
test io::tests::bench_socket_pipe_socket_copy ... bench: 142,640 ns/iter (+/- 77,851) = 918 MB/s
NEW
test io::tests::bench_file_to_file_copy ... bench: 14,745 ns/iter (+/- 519) = 8889 MB/s [ext4]
test io::tests::bench_file_to_file_copy ... bench: 6,128 ns/iter (+/- 227) = 21389 MB/s [btrfs]
test io::tests::bench_file_to_socket_copy ... bench: 13,767 ns/iter (+/- 3,767) = 9520 MB/s
test io::tests::bench_socket_pipe_socket_copy ... bench: 26,471 ns/iter (+/- 6,412) = 4951 MB/s
```
rustc_target: Mark UEFI targets as `is_like_windows`/`is_like_msvc`
And document what `is_like_windows` and `is_like_msvc` actually mean in more detail.
Addresses FIXMEs left from https://github.com/rust-lang/rust/pull/71030.
r? `@nagisa`
rustc_parse: Remove optimization for 0-length streams in `collect_tokens`
The optimization conflates empty token streams with unknown token streams, which is at least suspicious, and doesn't affect performance because 0-length token streams are very rare.
r? `@Aaron1011`
Previously EOVERFLOW handling was only applied for io::copy specialization
but not for fs::copy sharing the same code.
Additionally we lower the chunk size to 1GB since we have a user report
that older kernels may return EINVAL when passing 0x8000_0000
but smaller values succeed.
Android builds use feature level 14, but the libc wrapper for splice is gated
on feature level 21+, so we have to invoke the syscall directly.
Additionally the emulator doesn't seem to support it so we also have to
add ENOSYS checks.
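The chunking and error handling described in the two commit messages above can be sketched roughly as follows. This is not the std implementation: the closure stands in for the real `copy_file_range`/`splice` call, the errno values are the generic Linux ones and are hardcoded only to keep the sketch dependency-free, `0x4000_0000` is an assumed reading of "1GB", and treating all three errors as "fall back to the read/write loop" is an assumption of this sketch.

```rust
use std::io;

// Generic Linux errno values (assumption; some architectures differ).
const EINVAL: i32 = 22;
const ENOSYS: i32 = 38;
const EOVERFLOW: i32 = 75;

/// Assumed chunk cap of 1 GiB, below the 0x8000_0000 that older kernels reject.
const MAX_CHUNK: u64 = 0x4000_0000;

#[derive(Debug, PartialEq)]
pub enum CopyOutcome {
    /// Everything was copied by the kernel path.
    Done(u64),
    /// The kernel path is unusable; the caller should continue with a
    /// plain read/write loop after the given number of bytes.
    Fallback(u64),
}

/// Drives a hypothetical kernel copy primitive in chunks of at most `MAX_CHUNK`.
pub fn chunked_copy<F>(total: u64, mut kernel_copy: F) -> io::Result<CopyOutcome>
where
    F: FnMut(u64) -> io::Result<u64>,
{
    let mut written = 0u64;
    while written < total {
        let chunk = (total - written).min(MAX_CHUNK);
        match kernel_copy(chunk) {
            Ok(0) => break, // EOF before `total` was reached
            Ok(n) => written += n,
            Err(e) => match e.raw_os_error() {
                Some(EINVAL) | Some(ENOSYS) | Some(EOVERFLOW) => {
                    return Ok(CopyOutcome::Fallback(written));
                }
                _ => return Err(e),
            },
        }
    }
    Ok(CopyOutcome::Done(written))
}
```

Returning the byte count alongside `Fallback` lets the caller resume the generic loop without re-copying data that already went through.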
Fix and re-enable two coverage tests on MacOS
Note, in the coverage-reports test, the comment about MacOS was wrong.
The setting is based on config.toml llvm `optimize` setting. There
doesn't appear to be any environment variable I can check, and I
don't think we should add one. Testing the binary itself is a more
reliable way to check anyway.
For the coverage-spanview test, I removed the dependency on sed
altogether, which is much less ugly than trying to work around the
MacOS sed differences.
I tested these changes on Linux, Windows, and Mac.
r? `@tmandry`
FYI `@wesleywiser`
Update cargo
Fixing an important publish bug.
2 commits in 8662ab427a8d6ad8047811cc4d78dbd20dd07699..2af662e22177a839763ac8fb70d245a680b15214
2020-11-12 03:47:53 +0000 to 2020-11-12 19:04:56 +0000
- Fix publishing with optional dependencies. (rust-lang/cargo#8853)
- Minor typo in features.md (rust-lang/cargo#8851)
Rustdoc check option
The ultimate goal behind this option is to have `rustdoc --check` run as a second step when you use `cargo check`.
r? `@jyn514`
Add type to `ConstKind::Placeholder`
I simply threaded `<'tcx>` through everything that required it. I'm not sure whether this is the correct thing to do, but it seems to work.
r? `@nikomatsakis`
Eliminate some temporary vectors
This PR changes `get_item_attrs` and `get_item_variances` to return iterator impls instead of vectors. On top of that, it replaces some seemingly unnecessary vectors with iterators or SmallVec, and reserves space where we know the (minimum) number of elements that will be inserted. The change aims to remove a few heap allocations and unnecessary copies.
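The general pattern can be sketched outside the compiler as follows. `Table`, `item_attrs` and `collect_upper` are hypothetical stand-ins, not rustc types or functions; the point is only the shape of the change, returning `impl Iterator` instead of a `Vec` and reserving from a known lower bound.

```rust
/// Hypothetical stand-in for a metadata table; not a rustc type.
pub struct Table {
    pub attrs: Vec<(u32, &'static str)>,
}

impl Table {
    /// Before: every call allocates a temporary vector.
    pub fn item_attrs_vec(&self, id: u32) -> Vec<&'static str> {
        self.attrs
            .iter()
            .filter(|(i, _)| *i == id)
            .map(|(_, a)| *a)
            .collect()
    }

    /// After: the caller receives a lazy iterator and decides whether to
    /// collect it at all.
    pub fn item_attrs(&self, id: u32) -> impl Iterator<Item = &'static str> + '_ {
        self.attrs
            .iter()
            .filter(move |(i, _)| *i == id)
            .map(|(_, a)| *a)
    }
}

/// Reserves space up front using the iterator's lower size bound, so that
/// pushing at least that many elements does not reallocate.
pub fn collect_upper(names: impl Iterator<Item = &'static str>) -> Vec<String> {
    let (min, _) = names.size_hint();
    let mut out = Vec::with_capacity(min);
    for name in names {
        out.push(name.to_uppercase());
    }
    out
}
```

Callers that only loop over the result, such as `for attr in table.item_attrs(1)`, never allocate with the iterator version.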
Bump version number to 1.50.0
First PR of the release process of Rust 1.48.0. All PRs landed after this one will be included in Rust 1.50.0.
r? `@ghost`
cc `@rust-lang/release`
extend min_const_generics param ty tests
Apparently we never tested for `u128` and `i128` before this, so I added a test for all types which are allowed.
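For context, a const parameter of each kind mentioned here looks roughly like this. The function names are made up for this sketch and are not taken from the added test. The code compiles on stable Rust 1.51 and later; at the time of this PR it would have needed `#![feature(min_const_generics)]` on nightly.

```rust
/// A const parameter of the widest unsigned integer type.
pub fn wide<const N: u128>() -> u128 {
    N
}

/// A const parameter of the widest signed integer type.
pub fn signed<const N: i128>() -> i128 {
    N
}

/// `bool` and `char` are the other non-integer types allowed as
/// const parameter types under min_const_generics.
pub fn flag<const B: bool>() -> bool {
    B
}

pub fn letter<const C: char>() -> char {
    C
}
```

Non-literal arguments go in braces at the call site, for example `wide::<{ u128::MAX }>()`.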
r? `@varkor`
Update cargo
5 commits in d5556aeb8405b1fe696adb6e297ad7a1f2989b62..8662ab427a8d6ad8047811cc4d78dbd20dd07699
2020-11-04 22:20:36 +0000 to 2020-11-12 03:47:53 +0000
- Check if rust-src contains a vendor dir, and patch it in (rust-lang/cargo#8834)
- Improve performance of almost fresh builds (rust-lang/cargo#8837)
- Use u32/64::to/from_le_bytes instead of bit fiddling (rust-lang/cargo#8847)
- Avoid constructing an anyhow::Error when not necessary (rust-lang/cargo#8844)
- Skip extracting .cargo-ok files from packages (rust-lang/cargo#8835)
Add asm register information for SPIR-V
As discussed in [zulip](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/Defining.20asm!.20for.20new.20architecture), we at [rust-gpu](https://github.com/EmbarkStudios/rust-gpu) would like to support `asm!` for our SPIR-V backend. However, we cannot do so purely without frontend support: [this match](d4ea0b3e46/compiler/rustc_target/src/asm/mod.rs (L185)) fails and so `asm!` is not supported ([error reported here](d4ea0b3e46/compiler/rustc_ast_lowering/src/expr.rs (L1095))). To resolve this, we need to stub out register information for SPIR-V to support getting the `asm!` content all the way to [`AsmBuilderMethods::codegen_inline_asm`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/traits/trait.AsmBuilderMethods.html#tymethod.codegen_inline_asm), at which point the rust-gpu backend can do all the parsing and codegen that is needed.
This is a pretty weird PR - adding support for a backend that isn't in-tree feels pretty gross to me, but I don't see an easy way around this. `@Amanieu` said I should submit it anyway, so, here we are! Let me know if this needs to go through a more formal process (MCP?) and what I should do to help this along.
I based this off the [wasm asm PR](https://github.com/rust-lang/rust/pull/78684), which this PR unfortunately conflicts with quite a bit; sorry for any merge conflict pain :(
---
Some open questions:
- What do we call the register class? Some context: SPIR-V is an SSA-based IR in which "instructions" create IDs (referred to as `<id>` in the spec) that can be referenced by other instructions. So `reg` isn't exactly accurate; they're SSA IDs, not re-assignable registers.
- What happens when a SPIR-V register gets to the LLVM backend? Right now it's a `bug!`, but should that be a `sess.fatal()`? I'm not sure if it's even possible to reach that point, maybe there's a check that prevents the `spirv` target from even reaching that codepath.