Commit graph

292 commits

Author SHA1 Message Date
Nicholas Nethercote 36b495f3cf Introduce ChunkedBitSet and use it for some dataflow analyses.
This reduces peak memory usage significantly for some programs with very
large functions, such as:
- `keccak`, `unicode_normalization`, and `match-stress-enum`, from
  the `rustc-perf` benchmark suite;
- `http-0.2.6` from crates.io.

The new type is used in the analyses where the bitsets can get huge
(e.g. 10s of thousands of bits): `MaybeInitializedPlaces`,
`MaybeUninitializedPlaces`, and `EverInitializedPlaces`.

Some refactoring was required in `rustc_mir_dataflow`. All existing
analysis domains are either `BitSet` or a trivial wrapper around
`BitSet`, and access in a few places is done via `Borrow<BitSet>` or
`BorrowMut<BitSet>`. Now that some of these domains are `ClusterBitSet`,
that no longer works. So this commit replaces the `Borrow`/`BorrowMut`
usage with a new trait `BitSetExt` containing the needed bitset
operations. The impls just forward these to the underlying bitset type.
This required fiddling with trait bounds in a few places.

The commit also:
- Moves `static_assert_size` from `rustc_data_structures` to
  `rustc_index` so it can be used in the latter; the former now
  re-exports it so existing users are unaffected.
- Factors out some common "clear excess bits in the final word"
  functionality in `bit_set.rs`.
- Uses `fill` in a few places instead of loops.
2022-02-23 10:18:49 +11:00
bors c1aa85475c Auto merge of #93934 - rusticstuff:inline_ensure_sufficient_stack, r=estebank
Allow inlining of `ensure_sufficient_stack()`

This functions is monomorphized a lot and allowing the compiler to inline it improves instructions count and max RSS significantly in my local tests.
2022-02-20 15:10:19 +00:00
est31 2ef8af6619 Adopt let else in more places 2022-02-19 17:27:43 +01:00
Nicholas Nethercote 80632de4a1 Address review comments. 2022-02-15 16:20:01 +11:00
Nicholas Nethercote 0c2ebbd412 Rename PtrKey as Interned and improve it.
In particular, there's now more protection against incorrect usage,
because you can only create one via `Interned::new_unchecked`, which
makes it more obvious that you must be careful.

There are also some tests.
2022-02-15 15:50:29 +11:00
Santiago Pastorino f4bb4500dd
Call the method fork instead of clone and add proper comments 2022-02-14 12:57:20 -03:00
Hans Kratz 59536fb023 Allow inlining of ensure_sufficient_stack() 2022-02-12 11:30:04 +01:00
Jakub Beránek 5fc2e5623b
Use const generics in SipHasher128's short_write 2022-02-05 19:55:44 +01:00
Jakub Beránek c21b8e12a4
Fix isize optimization in StableHasher for big-endian architectures 2022-02-03 11:47:41 +01:00
bors 1be5c8f909 Auto merge of #93432 - Kobzol:stable-hash-isize-hash-compression, r=the8472
Compress amount of hashed bytes for `isize` values in StableHasher

This is another attempt to land https://github.com/rust-lang/rust/pull/92103, this time hopefully with a correct implementation w.r.t. stable hashing guarantees. The previous PR was [reverted](https://github.com/rust-lang/rust/pull/93014) because it could produce the [same hash](https://github.com/rust-lang/rust/pull/92103#issuecomment-1014625442) for different values even in quite simple situations. I have since added a basic [test](https://github.com/rust-lang/rust/pull/93193) that should guard against that situation, I also added a new test in this PR, specialised for this optimization.

## Why this optimization helps
Since the original PR, I have tried to analyze why this optimization even helps (and why it especially helps for `clap`). I found that the vast majority of stable-hashing `i64` actually comes from hashing `isize` (which is converted to `i64` in the stable hasher). I only found a single place where is this datatype used directly in the compiler, and this place has also been showing up in traces that I used to find out when is `isize` being hashed. This place is `rustc_span::FileName::DocTest`, however, I suppose that isizes also come from other places, but they might not be so easy to find (there were some other entries in the trace). `clap` hashes about 8.5 million `isize`s, and all of them fit into a single byte, which is why this optimization has helped it [quite a lot](https://github.com/rust-lang/rust/pull/92103#issuecomment-1005711861).

Now, I'm not sure if special casing `isize` is the correct solution here, maybe something could be done with that `isize` inside `DocTest` or in other places, but that's for another discussion I suppose. In this PR, instead of hardcoding a special case inside `SipHasher128`, I instead put it into `StableHasher`, and only used it for `isize` (I tested that for `i64` it doesn't help, or at least not for `clap` and other few benchmarks that I was testing).

## New approach
Since the most common case is a single byte, I added a fast path for hashing `isize` values which positive value fits within a single byte, and a cold path for the rest of the values.

To avoid the previous correctness problem, we need to make sure that each unique `isize` value will produce a unique hash stream to the hasher. By hash stream I mean a sequence of bytes that will be hashed (a different sequence should produce a different hash, but that is of course not guaranteed).

We have to distinguish different values that produce the same bit pattern when we combine them. For example, if we just simply skipped the leading zero bytes for values that fit within a single byte, `(0xFF, 0xFFFFFFFFFFFFFFFF)` and `(0xFFFFFFFFFFFFFFFF, 0xFF)` would send the same hash stream to the hasher, which must not happen.

To avoid this situation, values `[0, 0xFE]` are hashed as a single byte. When we hash a larger (treating `isize` as `u64`) value, we first hash an additional byte `0xFF`. Since `0xFF` cannot occur when we apply the single byte optimization, we guarantee that the hash streams will be unique when hashing two values `(a, b)` and `(b, a)` if `a != b`:
1) When both `a` and `b` are within `[0, 0xFE]`, their hash streams will be different.
2) When neither `a` and `b` are within `[0, 0xFE]`, their hash streams will be different.
3) When `a` is within `[0, 0xFE]` and `b` isn't, when we hash `(a, b)`, the hash stream will definitely not begin with `0xFF`. When we hash `(b, a)`, the hash stream will definitely begin with `0xFF`. Therefore the hash streams will be different.

r? `@the8472`
2022-02-03 01:08:45 +00:00
Matthias Krüger 21ffe45631
Rollup merge of #92528 - tmiasko:combine-commutative, r=michaelwoerister
Make `Fingerprint::combine_commutative` associative

The previous implementation swapped lower and upper 64-bits of a result
of modular addition, so the function was non-associative.

r? `@Aaron1011`
2022-02-02 19:34:01 +01:00
lcnr a1a30f7548 add a rustc::query_stability lint 2022-02-01 10:15:59 +01:00
Jakub Beránek 8de59be933
Compress amount of hashed bytes for isize values in StableHasher 2022-01-30 09:52:44 +01:00
Matthias Krüger bc26f97394
Rollup merge of #93193 - Kobzol:stable-hash-permutation-test, r=the8472
Add test for stable hash uniqueness of adjacent field values

This PR adds a simple test to check that stable hash will produce a different hash if the order of two values that have the same combined bit pattern changes.

r? `@the8472`
2022-01-27 22:32:24 +01:00
bors e7825f2b69 Auto merge of #90842 - pierwill:localdefid-indexmap, r=wesleywiser
Use `indexmap` to avoid sorting `LocalDefId`s

See discussion in https://github.com/rust-lang/rust/pull/90408#discussion_r745935459.

Related to work on https://github.com/rust-lang/rust/issues/90317.
2022-01-24 22:04:55 +00:00
Jakub Beránek 1ffd043caf
Add test stable hash uniqueness of adjacent field values 2022-01-24 15:35:52 +01:00
Jakub Beránek 50f8062316
Revert "Do not hash leading zero bytes of i64 numbers in Sip128 hasher" 2022-01-24 09:07:47 +01:00
pierwill 4f89224f7f Use an indexmap to avoid sorting LocalDefIds
Update `indexmap` to 1.8.0.

Bless test
2022-01-22 22:34:16 -06:00
Nicholas Nethercote 416399dc10 Make Decodable and Decoder infallible.
`Decoder` has two impls:
- opaque: this impl is already partly infallible, i.e. in some places it
  currently panics on failure (e.g. if the input is too short, or on a
  bad `Result` discriminant), and in some places it returns an error
  (e.g. on a bad `Option` discriminant). The number of places where
  either happens is surprisingly small, just because the binary
  representation has very little redundancy and a lot of input reading
  can occur even on malformed data.
- json: this impl is fully fallible, but it's only used (a) for the
  `.rlink` file production, and there's a `FIXME` comment suggesting it
  should change to a binary format, and (b) in a few tests in
  non-fundamental ways. Indeed #85993 is open to remove it entirely.

And the top-level places in the compiler that call into decoding just
abort on error anyway. So the fallibility is providing little value, and
getting rid of it leads to some non-trivial performance improvements.

Much of this commit is pretty boring and mechanical. Some notes about
a few interesting parts:
- The commit removes `Decoder::{Error,error}`.
- `InternIteratorElement::intern_with`: the impl for `T` now has the same
  optimization for small counts that the impl for `Result<T, E>` has,
  because it's now much hotter.
- Decodable impls for SmallVec, LinkedList, VecDeque now all use
  `collect`, which is nice; the one for `Vec` uses unsafe code, because
  that gave better perf on some benchmarks.
2022-01-22 10:38:31 +11:00
bors 42852d7857 Auto merge of #92740 - cuviper:update-rayons, r=Mark-Simulacrum
Update rayon and rustc-rayon

This updates rayon for various tools and rustc-rayon for the compiler's parallel mode.

- rayon v1.3.1 -> v1.5.1
- rayon-core v1.7.1 -> v1.9.1
- rustc-rayon v0.3.1 -> v0.3.2
- rustc-rayon-core v0.3.1 -> v0.3.2

... and indirectly, this updates all of crossbeam-* to their latest versions.

Fixes #92677 by removing crossbeam-queue, but there's still a lingering question about how tidy discovers "runtime" dependencies. None of this is truly in the standard library's dependency tree at all.
2022-01-16 08:12:23 +00:00
bors 2e2c86eba2 Auto merge of #92070 - rukai:replace_vec_into_iter_with_array_into_iter, r=Mark-Simulacrum
Replace usages of vec![].into_iter with [].into_iter

`[].into_iter` is idiomatic over `vec![].into_iter` because its simpler and faster (unless the vec is optimized away in which case it would be the same)

So we should change all the implementation, documentation and tests to use it.

I skipped:
* `src/tools` - Those are copied in from upstream
* `src/test/ui` - Hard to tell if `vec![].into_iter` was used intentionally or not here and not much benefit to changing it.
*  any case where `vec![].into_iter` was used because we specifically needed a `Vec::IntoIter<T>`
*  any case where it looked like we were intentionally using `vec![].into_iter` to test it.
2022-01-11 14:23:24 +00:00
Josh Stone f3b8812f24 Update rayon and rustc-rayon 2022-01-10 11:34:07 -08:00
Lucas Kent 08829853d3 eplace usages of vec![].into_iter with [].into_iter 2022-01-09 14:09:25 +11:00
Aaron Hill 5580e5e1dd
Ensure that Fingerprint caching respects hashing configuration
Fixes #92266

In some `HashStable` impls, we use a cache to avoid re-computing
the same `Fingerprint` from the same structure (e.g. an `AdtDef`).
However, the `StableHashingContext` used can be configured to
perform hashing in different ways (e.g. skipping `Span`s). This
configuration information is not included in the cache key,
which will cause an incorrect `Fingerprint` to be used if
we hash the same structure with different `StableHashingContext`
settings.

To fix this, the configuration settings of `StableHashingContext`
are split out into a separate `HashingControls` struct. This
struct is used as part of the cache key, ensuring that our caches
always produce the correct result for the given settings.

With this in place, we now turn off `Span` hashing during the
entire process of computing the hash included in legacy symbols.
This current has no effect, but will matter when a future PR
starts hashing more `Span`s that we currently skip.
2022-01-05 10:13:28 -05:00
Jakub Beránek 65a3279f4a
Do not hash zero bytes of i64 and u32 in Sip128 hasher 2022-01-04 19:12:10 +01:00
Tomasz Miąsko 1d64b59664 Make Fingerprint::combine_commutative associative
The previous implementation swapped lower and upper 64-bits of a result
of modular addition, so the function was non-associative.
2022-01-03 19:07:29 +01:00
Jakub Beránek 3d8d3f1435
Rustdoc: use ThinVec for GenericArgs bindings 2022-01-01 11:29:14 +01:00
bors 41ce641a40 Auto merge of #92130 - Kobzol:stable-hash-str, r=cjgillot
Use hash_stable for hashing str

This seemed like an oversight. With this change the hash can go through the `HashStable` machinery directly.
2021-12-28 01:04:33 +00:00
pierwill e6ff0bac1e rustc VecGraph: require the index type to implement Ord 2021-12-22 10:50:57 -06:00
pierwill 8df9248591 Remove PartialOrd and Ord from LocalDefId
Implement `Ord`, `PartialOrd` for SpanData
2021-12-22 10:50:57 -06:00
bors 87e8639d8d Auto merge of #91903 - tmiasko:bit-set-hash, r=jackh726
Implement StableHash for BitSet and BitMatrix via Hash

This fixes an issue where bit sets / bit matrices the same word
content but a different domain size would receive the same hash.
2021-12-21 05:42:10 +00:00
Jakub Beránek c695b6026c
Use hash_stable for hashing str 2021-12-20 18:46:34 +01:00
bors daf2204aa4 Auto merge of #91837 - Kobzol:stable-hash-map-avoid-sort, r=the8472
Avoid sorting in hash map stable hashing

Suggested by `@the8472` [here](https://github.com/rust-lang/rust/pull/89404#issuecomment-991813333). I hope that I understood it right, I replaced the sort with modular multiplication, which should be commutative.

Can I ask for a perf. run? However, locally it didn't help at all. Creating the `StableHasher` all over again is probably slowing it down quite a lot. And using `FxHasher` is not straightforward, because the keys and values only implement `HashStable` (and probably they shouldn't be just hashed via `Hash` anyway for it to actually be stable).

Maybe the `StableHash` interface could be changed somehow to better suppor these scenarios where the hasher is short-lived. Or the `StableHasher` implementation could have variants with e.g. a shorter buffer for these scenarios.
2021-12-18 21:23:37 +00:00
Tomasz Miąsko d0281bcb25 Implement StableHash for BitSet and BitMatrix via Hash
This fixes an issue where bit sets / bit matrices the same word
content but a different domain size would receive the same hash.
2021-12-18 00:00:00 +00:00
Jakub Beránek 1f284b07ed
Add special case for length 1 2021-12-13 21:34:54 +01:00
Jakub Beránek ac08f13948
Remove sort from hashing hashset, treeset and treemap 2021-12-13 16:11:28 +01:00
Jakub Beránek 6e33d3ecc2
Use modular arithmetic 2021-12-12 23:48:11 +01:00
bors 22f8bde876 Auto merge of #91549 - fee1-dead:const_env, r=spastorino
Eliminate ConstnessAnd again

Closes #91489.
Closes #89432.

Reverts #91491.
Reverts #89450.

r? `@spastorino`
2021-12-12 22:15:32 +00:00
Jakub Beránek a5f5f6b689
Avoid sorting in hash map stable hashing 2021-12-12 20:57:24 +01:00
Deadbeef 83587e8d30
Small performance tweaks 2021-12-12 12:35:01 +08:00
bors 58457bbfd3 Auto merge of #89404 - Kobzol:hash-stable-sort, r=Mark-Simulacrum
Slightly optimize hash map stable hashing

I was profiling some of the `rustc-perf` benchmarks locally and noticed that quite some time is spent inside the stable hash of hashmaps. I tried to use a `SmallVec` instead of a `Vec` there, which helped very slightly.

Then I tried to remove the sorting, which was a bottleneck, and replaced it with insertion into a binary heap. Locally, it yielded nice improvements in instruction counts and RSS in several benchmarks for incremental builds. The implementation could probably be much nicer and possibly extended to other stable hashes, but first I wanted to test the perf impact properly.

Can I ask someone to do a perf run? Thank you!
2021-12-12 03:50:30 +00:00
Matthias Krüger 7e18b79c77
Rollup merge of #91426 - eggyal:idfunctor-panic-safety, r=lcnr
Make IdFunctor::try_map_id panic-safe

Addresses FIXME comment created in #78313

r? ```@lcnr```
2021-12-11 08:22:32 +01:00
est31 15de4cbc4b Remove redundant [..]s 2021-12-09 00:01:29 +01:00
Alan Egerton acd39ff0fe
Make IdFunctor::try_map_id panic-safe 2021-12-07 11:11:23 +00:00
Mark Rousskov 15483ccf9d Annotate comments onto the LT algorithm 2021-12-06 20:30:15 -05:00
Mark Rousskov 3187480070 Avoid using Option where values are always Some 2021-12-06 15:05:22 -05:00
Mark Rousskov 2b63059772 Create newtype around the pre order index 2021-12-06 15:05:22 -05:00
Mark Rousskov cc63ec32fb Use variables rather than lengths directly 2021-12-06 15:05:22 -05:00
Mark Rousskov 345ada0e1b Optimize: reuse the real-to-preorder mapping as the visited set 2021-12-06 15:05:22 -05:00
Mark Rousskov 8991002644 Remove separate RPO traversal
This integrates the preorder and postorder traversals into one.
2021-12-06 15:05:22 -05:00