rust/library/core
bors 4bb4b96ee7 Auto merge of #74562 - pickfire:is_ascii_branchless, r=nagisa
Remove branch in optimized is_ascii

Performs slightly better in short or medium bytes by eliminating
the last branch check on `byte_pos == len` and always check the
last byte as it is always at most one `usize`.

Benchmark, before `libcore`, after `libcore_new`. It improves
medium and short by 1ns but regresses unaligned_tail by 2ns,
either way we can get unaligned_tail have a tiny chance of 1/8
on a 64 bit machine. I don't think we should bet on that, the
probability is worse than dice.

```
test long::case00_libcore                     ... bench:          38 ns/iter (+/- 1) = 183947 MB/s
test long::case00_libcore_new                 ... bench:          38 ns/iter (+/- 1) = 183947 MB/s
test long::case01_iter_all                    ... bench:         227 ns/iter (+/- 6) = 30792 MB/s
test long::case02_align_to                    ... bench:          40 ns/iter (+/- 1) = 174750 MB/s
test long::case03_align_to_unrolled           ... bench:          19 ns/iter (+/- 1) = 367894 MB/s
test medium::case00_libcore                   ... bench:           5 ns/iter (+/- 0) = 6400 MB/s
test medium::case00_libcore_new               ... bench:           4 ns/iter (+/- 0) = 8000 MB/s
test medium::case01_iter_all                  ... bench:          20 ns/iter (+/- 1) = 1600 MB/s
test medium::case02_align_to                  ... bench:           6 ns/iter (+/- 0) = 5333 MB/s
test medium::case03_align_to_unrolled         ... bench:           5 ns/iter (+/- 0) = 6400 MB/s
test short::case00_libcore                    ... bench:           7 ns/iter (+/- 0) = 1000 MB/s
test short::case00_libcore_new                ... bench:           6 ns/iter (+/- 0) = 1166 MB/s
test short::case01_iter_all                   ... bench:           5 ns/iter (+/- 0) = 1400 MB/s
test short::case02_align_to                   ... bench:           5 ns/iter (+/- 0) = 1400 MB/s
test short::case03_align_to_unrolled          ... bench:           5 ns/iter (+/- 1) = 1400 MB/s
test unaligned_both::case00_libcore           ... bench:           4 ns/iter (+/- 0) = 7500 MB/s
test unaligned_both::case00_libcore_new       ... bench:           4 ns/iter (+/- 0) = 7500 MB/s
test unaligned_both::case01_iter_all          ... bench:          26 ns/iter (+/- 0) = 1153 MB/s
test unaligned_both::case02_align_to          ... bench:          13 ns/iter (+/- 2) = 2307 MB/s
test unaligned_both::case03_align_to_unrolled ... bench:          11 ns/iter (+/- 0) = 2727 MB/s
test unaligned_head::case00_libcore           ... bench:           5 ns/iter (+/- 0) = 6200 MB/s
test unaligned_head::case00_libcore_new       ... bench:           5 ns/iter (+/- 0) = 6200 MB/s
test unaligned_head::case01_iter_all          ... bench:          19 ns/iter (+/- 1) = 1631 MB/s
test unaligned_head::case02_align_to          ... bench:          10 ns/iter (+/- 0) = 3100 MB/s
test unaligned_head::case03_align_to_unrolled ... bench:          14 ns/iter (+/- 0) = 2214 MB/s
test unaligned_tail::case00_libcore           ... bench:           3 ns/iter (+/- 0) = 10333 MB/s
test unaligned_tail::case00_libcore_new       ... bench:           5 ns/iter (+/- 0) = 6200 MB/s
test unaligned_tail::case01_iter_all          ... bench:          19 ns/iter (+/- 0) = 1631 MB/s
test unaligned_tail::case02_align_to          ... bench:          10 ns/iter (+/- 0) = 3100 MB/s
test unaligned_tail::case03_align_to_unrolled ... bench:          13 ns/iter (+/- 0) = 2384 MB/s
```

Rough (unfair) maths on improvements for fun: 1ns * 7/8 - 2ns * 1/8 = 0.625ns

Inspired by fish and zsh clever trick to highlight missing linefeeds (⏎)
and branchless implementation of binary_search in rust.

cc @thomcc https://github.com/rust-lang/rust/pull/74066
r? @nagisa
2020-08-16 23:52:32 +00:00
..
benches
src Auto merge of #74562 - pickfire:is_ascii_branchless, r=nagisa 2020-08-16 23:52:32 +00:00
tests Add drop check test & MaybeUninit::first_ptr_mut 2020-08-13 03:51:08 +00:00
Cargo.toml