Auto merge of #86761 - Alexhuszagh:master, r=estebank

Update Rust Float-Parsing Algorithms to use the Eisel-Lemire algorithm.

# Summary

Rust, although it implements a correct float parser, has major performance issues in float parsing. Even for common floats, the performance can be 3-10x [slower](https://arxiv.org/pdf/2101.11408.pdf) than external libraries such as [lexical](https://github.com/Alexhuszagh/rust-lexical) and [fast-float-rust](https://github.com/aldanor/fast-float-rust).

Recently, major advances in float-parsing algorithms have been developed by Daniel Lemire, along with others, and implement a fast, performant, and correct float parser, with speeds up to 1200 MiB/s on Apple's M1 architecture for the [canada](0e2b5d163d/data/canada.txt) dataset, 10x faster than Rust's 130 MiB/s.

In addition, [edge-cases](https://github.com/rust-lang/rust/issues/85234) in Rust's [dec2flt](868c702d0c/library/core/src/num/dec2flt) algorithm can lead to over a 1600x slowdown relative to efficient algorithms. This is due to the use of Clinger's correct, but slow [AlgorithmM and Bellepheron](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.45.4152&rep=rep1&type=pdf), which have been improved by faster big-integer algorithms and the Eisel-Lemire algorithm, respectively.

Finally, this algorithm provides substantial improvements in the number of floats the Rust core library can parse. Denormal floats with a large number of digits cannot be parsed, due to use of the `Big32x40`, which simply does not have enough digits to round a float correctly. Using a custom decimal class, with much simpler logic, we can parse all valid decimal strings of any digit count.

```rust
// Issue in Rust's dec2fly.
"2.47032822920623272088284396434110686182e-324".parse::<f64>();   // Err(ParseFloatError { kind: Invalid })
```

# Solution

This pull request implements the Eisel-Lemire algorithm, modified from [fast-float-rust](https://github.com/aldanor/fast-float-rust) (which is licensed under Apache 2.0/MIT), along with numerous modifications to make it more amenable to inclusion in the Rust core library. The following describes both features in fast-float-rust and improvements in fast-float-rust for inclusion in core.

**Documentation**

Extensive documentation has been added to ensure the code base may be maintained by others, which explains the algorithms as well as various associated constants and routines. For example, two seemingly magical constants include documentation to describe how they were derived as follows:

```rust
    // Round-to-even only happens for negative values of q
    // when q ≥ −4 in the 64-bit case and when q ≥ −17 in
    // the 32-bitcase.
    //
    // When q ≥ 0,we have that 5^q ≤ 2m+1. In the 64-bit case,we
    // have 5^q ≤ 2m+1 ≤ 2^54 or q ≤ 23. In the 32-bit case,we have
    // 5^q ≤ 2m+1 ≤ 2^25 or q ≤ 10.
    //
    // When q < 0, we have w ≥ (2m+1)×5^−q. We must have that w < 2^64
    // so (2m+1)×5^−q < 2^64. We have that 2m+1 > 2^53 (64-bit case)
    // or 2m+1 > 2^24 (32-bit case). Hence,we must have 2^53×5^−q < 2^64
    // (64-bit) and 2^24×5^−q < 2^64 (32-bit). Hence we have 5^−q < 2^11
    // or q ≥ −4 (64-bit case) and 5^−q < 2^40 or q ≥ −17 (32-bitcase).
    //
    // Thus we have that we only need to round ties to even when
    // we have that q ∈ [−4,23](in the 64-bit case) or q∈[−17,10]
    // (in the 32-bit case). In both cases,the power of five(5^|q|)
    // fits in a 64-bit word.
    const MIN_EXPONENT_ROUND_TO_EVEN: i32;
    const MAX_EXPONENT_ROUND_TO_EVEN: i32;
```

This ensures maintainability of the code base.

**Improvements for Disguised Fast-Path Cases**

The fast path in float parsing algorithms attempts to use native, machine floats to represent both the significant digits and the exponent, which is only possible if both can be exactly represented without rounding. In practice, this means that the significant digits must be 53-bits or less and the then exponent must be in the range `[-22, 22]` (for an f64). This is similar to the existing dec2flt implementation.

However, disguised fast-path cases exist, where there are few significant digits and an exponent above the valid range, such as `1.23e25`. In this case, powers-of-10 may be shifted from the exponent to the significant digits, discussed at length in https://github.com/rust-lang/rust/issues/85198.

**Digit Parsing Improvements**

Typically, integers are parsed from string 1-at-a-time, requiring unnecessary multiplications which can slow down parsing. An approach to parse 8 digits at a time using only 3 multiplications is described in length [here](https://johnnylee-sde.github.io/Fast-numeric-string-to-int/). This leads to significant performance improvements, and is implemented for both big and little-endian systems.

**Unsafe Changes**

Relative to fast-float-rust, this library makes less use of unsafe functionality and clearly documents it. This includes the refactoring and documentation of numerous unsafe methods undesirably marked as safe. The original code would look something like this, which is deceptively marked as safe for unsafe functionality.

```rust
impl AsciiStr {
    #[inline]
    pub fn step_by(&mut self, n: usize) -> &mut Self {
        unsafe { self.ptr = self.ptr.add(n) };
        self
    }
}

...

#[inline]
fn parse_scientific(s: &mut AsciiStr<'_>) -> i64 {
    // the first character is 'e'/'E' and scientific mode is enabled
    let start = *s;
    s.step();
    ...
}
```

The new code clearly documents safety concerns, and does not mark unsafe functionality as safe, leading to better safety guarantees.

```rust
impl AsciiStr {
    /// Advance the view by n, advancing it in-place to (n..).
    pub unsafe fn step_by(&mut self, n: usize) -> &mut Self {
        // SAFETY: same as step_by, safe as long n is less than the buffer length
        self.ptr = unsafe { self.ptr.add(n) };
        self
    }
}

...

/// Parse the scientific notation component of a float.
fn parse_scientific(s: &mut AsciiStr<'_>) -> i64 {
    let start = *s;
    // SAFETY: the first character is 'e'/'E' and scientific mode is enabled
    unsafe {
        s.step();
    }
    ...
}
```

This allows us to trivially demonstrate the new implementation of dec2flt is safe.

**Inline Annotations Have Been Removed**

In the previous implementation of dec2flt, inline annotations exist practically nowhere in the entire module. Therefore, these annotations have been removed, which mostly does not impact [performance](https://github.com/aldanor/fast-float-rust/issues/15#issuecomment-864485157).

**Fixed Correctness Tests**

Numerous compile errors in `src/etc/test-float-parse` were present, due to deprecation of `time.clock()`, as well as the crate dependencies with `rand`. The tests have therefore been reworked as a [crate](https://github.com/Alexhuszagh/rust/tree/master/src/etc/test-float-parse), and any errors in `runtests.py` have been patched.

**Undefined Behavior**

An implementation of `check_len` which relied on undefined behavior (in fast-float-rust) has been refactored, to ensure that the behavior is well-defined. The original code is as follows:

```rust
    #[inline]
    pub fn check_len(&self, n: usize) -> bool {
        unsafe { self.ptr.add(n) <= self.end }
    }
```

And the new implementation is as follows:

```rust
    /// Check if the slice at least `n` length.
    fn check_len(&self, n: usize) -> bool {
        n <= self.as_ref().len()
    }
```

Note that this has since been fixed in [fast-float-rust](https://github.com/aldanor/fast-float-rust/pull/29).

**Inferring Binary Exponents**

Rather than explicitly store binary exponents, this new implementation infers them from the decimal exponent, reducing the amount of static storage required. This removes the requirement to store [611 i16s](868c702d0c/library/core/src/num/dec2flt/table.rs (L8)).

# Code Size

The code size, for all optimizations, does not considerably change relative to before for stripped builds, however it is **significantly** smaller prior to stripping the resulting binaries. These binary sizes were calculated on x86_64-unknown-linux-gnu.

**new**

Using rustc version 1.55.0-dev.

opt-level|size|size(stripped)
|:-:|:-:|:-:|
0|400k|300K
1|396k|292K
2|392k|292K
3|392k|296K
s|396k|292K
z|396k|292K

**old**

Using rustc version 1.53.0-nightly.

opt-level|size|size(stripped)
|:-:|:-:|:-:|
0|3.2M|304K
1|3.2M|292K
2|3.1M|284K
3|3.1M|284K
s|3.1M|284K
z|3.1M|284K

# Correctness

The dec2flt implementation passes all of Rust's unittests and comprehensive float parsing tests, along with numerous other tests such as Nigel Toa's comprehensive float [tests](https://github.com/nigeltao/parse-number-fxx-test-data) and Hrvoje Abraham  [strtod_tests](https://github.com/ahrvoje/numerics/blob/master/strtod/strtod_tests.toml). Therefore, it is unlikely that this algorithm will incorrectly round parsed floats.

# Issues Addressed

This will fix and close the following issues:

- resolves #85198
- resolves #85214
- resolves #85234
- fixes #31407
- fixes #31109
- fixes #53015
- resolves #68396
- closes https://github.com/aldanor/fast-float-rust/issues/15
This commit is contained in:
bors 2021-07-17 12:56:22 +00:00
commit f502bd3abd
43 changed files with 2538 additions and 2831 deletions

View file

@ -172,7 +172,6 @@ pub enum LitToConstError {
/// This is used for graceful error handling (`delay_span_bug`) in
/// type checking (`Const::from_anon_const`).
TypeError,
UnparseableFloat,
Reported,
}

View file

@ -46,9 +46,7 @@ crate fn lit_to_const<'tcx>(
(ast::LitKind::Int(n, _), ty::Uint(_)) | (ast::LitKind::Int(n, _), ty::Int(_)) => {
trunc(if neg { (*n as i128).overflowing_neg().0 as u128 } else { *n })?
}
(ast::LitKind::Float(n, _), ty::Float(fty)) => {
parse_float(*n, *fty, neg).map_err(|_| LitToConstError::UnparseableFloat)?
}
(ast::LitKind::Float(n, _), ty::Float(fty)) => parse_float(*n, *fty, neg),
(ast::LitKind::Bool(b), ty::Bool) => ConstValue::Scalar(Scalar::from_bool(*b)),
(ast::LitKind::Char(c), ty::Char) => ConstValue::Scalar(Scalar::from_char(*c)),
(ast::LitKind::Err(_), _) => return Err(LitToConstError::Reported),
@ -57,12 +55,14 @@ crate fn lit_to_const<'tcx>(
Ok(ty::Const::from_value(tcx, lit, ty))
}
fn parse_float<'tcx>(num: Symbol, fty: ty::FloatTy, neg: bool) -> Result<ConstValue<'tcx>, ()> {
fn parse_float<'tcx>(num: Symbol, fty: ty::FloatTy, neg: bool) -> ConstValue<'tcx> {
let num = num.as_str();
use rustc_apfloat::ieee::{Double, Single};
let scalar = match fty {
ty::FloatTy::F32 => {
let rust_f = num.parse::<f32>().map_err(|_| ())?;
let rust_f = num
.parse::<f32>()
.unwrap_or_else(|e| panic!("f32 failed to parse `{}`: {:?}", num, e));
let mut f = num.parse::<Single>().unwrap_or_else(|e| {
panic!("apfloat::ieee::Single failed to parse `{}`: {:?}", num, e)
});
@ -82,7 +82,9 @@ fn parse_float<'tcx>(num: Symbol, fty: ty::FloatTy, neg: bool) -> Result<ConstVa
Scalar::from_f32(f)
}
ty::FloatTy::F64 => {
let rust_f = num.parse::<f64>().map_err(|_| ())?;
let rust_f = num
.parse::<f64>()
.unwrap_or_else(|e| panic!("f64 failed to parse `{}`: {:?}", num, e));
let mut f = num.parse::<Double>().unwrap_or_else(|e| {
panic!("apfloat::ieee::Double failed to parse `{}`: {:?}", num, e)
});
@ -103,5 +105,5 @@ fn parse_float<'tcx>(num: Symbol, fty: ty::FloatTy, neg: bool) -> Result<ConstVa
}
};
Ok(ConstValue::Scalar(scalar))
ConstValue::Scalar(scalar)
}

View file

@ -67,12 +67,6 @@ impl<'tcx> Cx<'tcx> {
match self.tcx.at(sp).lit_to_const(LitToConstInput { lit, ty, neg }) {
Ok(c) => c,
Err(LitToConstError::UnparseableFloat) => {
// FIXME(#31407) this is only necessary because float parsing is buggy
self.tcx.sess.span_err(sp, "could not evaluate float literal (see issue #31407)");
// create a dummy value and continue compiling
self.tcx.const_error(ty)
}
Err(LitToConstError::Reported) => {
// create a dummy value and continue compiling
self.tcx.const_error(ty)

View file

@ -84,7 +84,7 @@ impl<'tcx> Visitor<'tcx> for MatchVisitor<'_, 'tcx> {
}
impl PatCtxt<'_, '_> {
fn report_inlining_errors(&self, pat_span: Span) {
fn report_inlining_errors(&self) {
for error in &self.errors {
match *error {
PatternError::StaticInPattern(span) => {
@ -96,14 +96,6 @@ impl PatCtxt<'_, '_> {
PatternError::ConstParamInPattern(span) => {
self.span_e0158(span, "const parameters cannot be referenced in patterns")
}
PatternError::FloatBug => {
// FIXME(#31407) this is only necessary because float parsing is buggy
rustc_middle::mir::interpret::struct_error(
self.tcx.at(pat_span),
"could not evaluate float literal (see issue #31407)",
)
.emit();
}
PatternError::NonConstPath(span) => {
rustc_middle::mir::interpret::struct_error(
self.tcx.at(span),
@ -142,7 +134,7 @@ impl<'tcx> MatchVisitor<'_, 'tcx> {
let pattern: &_ = cx.pattern_arena.alloc(expand_pattern(pattern));
if !patcx.errors.is_empty() {
*have_errors = true;
patcx.report_inlining_errors(pat.span);
patcx.report_inlining_errors();
}
(pattern, pattern_ty)
}

View file

@ -31,7 +31,6 @@ crate enum PatternError {
AssocConstInPattern(Span),
ConstParamInPattern(Span),
StaticInPattern(Span),
FloatBug,
NonConstPath(Span),
}
@ -563,10 +562,6 @@ impl<'a, 'tcx> PatCtxt<'a, 'tcx> {
LitToConstInput { lit: &lit.node, ty: self.typeck_results.expr_ty(expr), neg };
match self.tcx.at(expr.span).lit_to_const(lit_input) {
Ok(val) => *self.const_to_pat(val, expr.hir_id, lit.span, false).kind,
Err(LitToConstError::UnparseableFloat) => {
self.errors.push(PatternError::FloatBug);
PatKind::Wild
}
Err(LitToConstError::Reported) => PatKind::Wild,
Err(LitToConstError::TypeError) => bug!("lower_lit: had type error"),
}

View file

@ -1,429 +0,0 @@
//! The various algorithms from the paper.
use crate::cmp::min;
use crate::cmp::Ordering::{Equal, Greater, Less};
use crate::num::dec2flt::num::{self, Big};
use crate::num::dec2flt::rawfp::{self, fp_to_float, next_float, prev_float, RawFloat, Unpacked};
use crate::num::dec2flt::table;
use crate::num::diy_float::Fp;
/// Number of significand bits in Fp
const P: u32 = 64;
// We simply store the best approximation for *all* exponents, so the variable "h" and the
// associated conditions can be omitted. This trades performance for a couple kilobytes of space.
fn power_of_ten(e: i16) -> Fp {
assert!(e >= table::MIN_E);
let i = e - table::MIN_E;
let sig = table::POWERS.0[i as usize];
let exp = table::POWERS.1[i as usize];
Fp { f: sig, e: exp }
}
// In most architectures, floating point operations have an explicit bit size, therefore the
// precision of the computation is determined on a per-operation basis.
#[cfg(any(not(target_arch = "x86"), target_feature = "sse2"))]
mod fpu_precision {
pub fn set_precision<T>() {}
}
// On x86, the x87 FPU is used for float operations if the SSE/SSE2 extensions are not available.
// The x87 FPU operates with 80 bits of precision by default, which means that operations will
// round to 80 bits causing double rounding to happen when values are eventually represented as
// 32/64 bit float values. To overcome this, the FPU control word can be set so that the
// computations are performed in the desired precision.
#[cfg(all(target_arch = "x86", not(target_feature = "sse2")))]
mod fpu_precision {
use crate::mem::size_of;
/// A structure used to preserve the original value of the FPU control word, so that it can be
/// restored when the structure is dropped.
///
/// The x87 FPU is a 16-bits register whose fields are as follows:
///
/// | 12-15 | 10-11 | 8-9 | 6-7 | 5 | 4 | 3 | 2 | 1 | 0 |
/// |------:|------:|----:|----:|---:|---:|---:|---:|---:|---:|
/// | | RC | PC | | PM | UM | OM | ZM | DM | IM |
///
/// The documentation for all of the fields is available in the IA-32 Architectures Software
/// Developer's Manual (Volume 1).
///
/// The only field which is relevant for the following code is PC, Precision Control. This
/// field determines the precision of the operations performed by the FPU. It can be set to:
/// - 0b00, single precision i.e., 32-bits
/// - 0b10, double precision i.e., 64-bits
/// - 0b11, double extended precision i.e., 80-bits (default state)
/// The 0b01 value is reserved and should not be used.
pub struct FPUControlWord(u16);
fn set_cw(cw: u16) {
// SAFETY: the `fldcw` instruction has been audited to be able to work correctly with
// any `u16`
unsafe {
asm!(
"fldcw word ptr [{}]",
in(reg) &cw,
options(nostack),
)
}
}
/// Sets the precision field of the FPU to `T` and returns a `FPUControlWord`.
pub fn set_precision<T>() -> FPUControlWord {
let mut cw = 0_u16;
// Compute the value for the Precision Control field that is appropriate for `T`.
let cw_precision = match size_of::<T>() {
4 => 0x0000, // 32 bits
8 => 0x0200, // 64 bits
_ => 0x0300, // default, 80 bits
};
// Get the original value of the control word to restore it later, when the
// `FPUControlWord` structure is dropped
// SAFETY: the `fnstcw` instruction has been audited to be able to work correctly with
// any `u16`
unsafe {
asm!(
"fnstcw word ptr [{}]",
in(reg) &mut cw,
options(nostack),
)
}
// Set the control word to the desired precision. This is achieved by masking away the old
// precision (bits 8 and 9, 0x300) and replacing it with the precision flag computed above.
set_cw((cw & 0xFCFF) | cw_precision);
FPUControlWord(cw)
}
impl Drop for FPUControlWord {
fn drop(&mut self) {
set_cw(self.0)
}
}
}
/// The fast path of Bellerophon using machine-sized integers and floats.
///
/// This is extracted into a separate function so that it can be attempted before constructing
/// a bignum.
pub fn fast_path<T: RawFloat>(integral: &[u8], fractional: &[u8], e: i64) -> Option<T> {
let num_digits = integral.len() + fractional.len();
// log_10(f64::MAX_SIG) ~ 15.95. We compare the exact value to MAX_SIG near the end,
// this is just a quick, cheap rejection (and also frees the rest of the code from
// worrying about underflow).
if num_digits > 16 {
return None;
}
if e.abs() >= T::CEIL_LOG5_OF_MAX_SIG as i64 {
return None;
}
let f = num::from_str_unchecked(integral.iter().chain(fractional.iter()));
if f > T::MAX_SIG {
return None;
}
// The fast path crucially depends on arithmetic being rounded to the correct number of bits
// without any intermediate rounding. On x86 (without SSE or SSE2) this requires the precision
// of the x87 FPU stack to be changed so that it directly rounds to 64/32 bit.
// The `set_precision` function takes care of setting the precision on architectures which
// require setting it by changing the global state (like the control word of the x87 FPU).
let _cw = fpu_precision::set_precision::<T>();
// The case e < 0 cannot be folded into the other branch. Negative powers result in
// a repeating fractional part in binary, which are rounded, which causes real
// (and occasionally quite significant!) errors in the final result.
if e >= 0 {
Some(T::from_int(f) * T::short_fast_pow10(e as usize))
} else {
Some(T::from_int(f) / T::short_fast_pow10(e.abs() as usize))
}
}
/// Algorithm Bellerophon is trivial code justified by non-trivial numeric analysis.
///
/// It rounds ``f`` to a float with 64 bit significand and multiplies it by the best approximation
/// of `10^e` (in the same floating point format). This is often enough to get the correct result.
/// However, when the result is close to halfway between two adjacent (ordinary) floats, the
/// compound rounding error from multiplying two approximation means the result may be off by a
/// few bits. When this happens, the iterative Algorithm R fixes things up.
///
/// The hand-wavy "close to halfway" is made precise by the numeric analysis in the paper.
/// In the words of Clinger:
///
/// > Slop, expressed in units of the least significant bit, is an inclusive bound for the error
/// > accumulated during the floating point calculation of the approximation to f * 10^e. (Slop is
/// > not a bound for the true error, but bounds the difference between the approximation z and
/// > the best possible approximation that uses p bits of significand.)
pub fn bellerophon<T: RawFloat>(f: &Big, e: i16) -> T {
let slop = if f <= &Big::from_u64(T::MAX_SIG) {
// The cases abs(e) < log5(2^N) are in fast_path()
if e >= 0 { 0 } else { 3 }
} else {
if e >= 0 { 1 } else { 4 }
};
let z = rawfp::big_to_fp(f).mul(&power_of_ten(e)).normalize();
let exp_p_n = 1 << (P - T::SIG_BITS as u32);
let lowbits: i64 = (z.f % exp_p_n) as i64;
// Is the slop large enough to make a difference when
// rounding to n bits?
if (lowbits - exp_p_n as i64 / 2).abs() <= slop {
algorithm_r(f, e, fp_to_float(z))
} else {
fp_to_float(z)
}
}
/// An iterative algorithm that improves a floating point approximation of `f * 10^e`.
///
/// Each iteration gets one unit in the last place closer, which of course takes terribly long to
/// converge if `z0` is even mildly off. Luckily, when used as fallback for Bellerophon, the
/// starting approximation is off by at most one ULP.
fn algorithm_r<T: RawFloat>(f: &Big, e: i16, z0: T) -> T {
let mut z = z0;
loop {
let raw = z.unpack();
let (m, k) = (raw.sig, raw.k);
let mut x = f.clone();
let mut y = Big::from_u64(m);
// Find positive integers `x`, `y` such that `x / y` is exactly `(f * 10^e) / (m * 2^k)`.
// This not only avoids dealing with the signs of `e` and `k`, we also eliminate the
// power of two common to `10^e` and `2^k` to make the numbers smaller.
make_ratio(&mut x, &mut y, e, k);
let m_digits = [(m & 0xFF_FF_FF_FF) as u32, (m >> 32) as u32];
// This is written a bit awkwardly because our bignums don't support
// negative numbers, so we use the absolute value + sign information.
// The multiplication with m_digits can't overflow. If `x` or `y` are large enough that
// we need to worry about overflow, then they are also large enough that `make_ratio` has
// reduced the fraction by a factor of 2^64 or more.
let (d2, d_negative) = if x >= y {
// Don't need x any more, save a clone().
x.sub(&y).mul_pow2(1).mul_digits(&m_digits);
(x, false)
} else {
// Still need y - make a copy.
let mut y = y.clone();
y.sub(&x).mul_pow2(1).mul_digits(&m_digits);
(y, true)
};
if d2 < y {
let mut d2_double = d2;
d2_double.mul_pow2(1);
if m == T::MIN_SIG && d_negative && d2_double > y {
z = prev_float(z);
} else {
return z;
}
} else if d2 == y {
if m % 2 == 0 {
if m == T::MIN_SIG && d_negative {
z = prev_float(z);
} else {
return z;
}
} else if d_negative {
z = prev_float(z);
} else {
z = next_float(z);
}
} else if d_negative {
z = prev_float(z);
} else {
z = next_float(z);
}
}
}
/// Given `x = f` and `y = m` where `f` represent input decimal digits as usual and `m` is the
/// significand of a floating point approximation, make the ratio `x / y` equal to
/// `(f * 10^e) / (m * 2^k)`, possibly reduced by a power of two both have in common.
fn make_ratio(x: &mut Big, y: &mut Big, e: i16, k: i16) {
let (e_abs, k_abs) = (e.abs() as usize, k.abs() as usize);
if e >= 0 {
if k >= 0 {
// x = f * 10^e, y = m * 2^k, except that we reduce the fraction by some power of two.
let common = min(e_abs, k_abs);
x.mul_pow5(e_abs).mul_pow2(e_abs - common);
y.mul_pow2(k_abs - common);
} else {
// x = f * 10^e * 2^abs(k), y = m
// This can't overflow because it requires positive `e` and negative `k`, which can
// only happen for values extremely close to 1, which means that `e` and `k` will be
// comparatively tiny.
x.mul_pow5(e_abs).mul_pow2(e_abs + k_abs);
}
} else {
if k >= 0 {
// x = f, y = m * 10^abs(e) * 2^k
// This can't overflow either, see above.
y.mul_pow5(e_abs).mul_pow2(k_abs + e_abs);
} else {
// x = f * 2^abs(k), y = m * 10^abs(e), again reducing by a common power of two.
let common = min(e_abs, k_abs);
x.mul_pow2(k_abs - common);
y.mul_pow5(e_abs).mul_pow2(e_abs - common);
}
}
}
/// Conceptually, Algorithm M is the simplest way to convert a decimal to a float.
///
/// We form a ratio that is equal to `f * 10^e`, then throwing in powers of two until it gives
/// a valid float significand. The binary exponent `k` is the number of times we multiplied
/// numerator or denominator by two, i.e., at all times `f * 10^e` equals `(u / v) * 2^k`.
/// When we have found out significand, we only need to round by inspecting the remainder of the
/// division, which is done in helper functions further below.
///
/// This algorithm is super slow, even with the optimization described in `quick_start()`.
/// However, it's the simplest of the algorithms to adapt for overflow, underflow, and subnormal
/// results. This implementation takes over when Bellerophon and Algorithm R are overwhelmed.
/// Detecting underflow and overflow is easy: The ratio still isn't an in-range significand,
/// yet the minimum/maximum exponent has been reached. In the case of overflow, we simply return
/// infinity.
///
/// Handling underflow and subnormals is trickier. One big problem is that, with the minimum
/// exponent, the ratio might still be too large for a significand. See underflow() for details.
pub fn algorithm_m<T: RawFloat>(f: &Big, e: i16) -> T {
let mut u;
let mut v;
let e_abs = e.abs() as usize;
let mut k = 0;
if e < 0 {
u = f.clone();
v = Big::from_small(1);
v.mul_pow5(e_abs).mul_pow2(e_abs);
} else {
// FIXME possible optimization: generalize big_to_fp so that we can do the equivalent of
// fp_to_float(big_to_fp(u)) here, only without the double rounding.
u = f.clone();
u.mul_pow5(e_abs).mul_pow2(e_abs);
v = Big::from_small(1);
}
quick_start::<T>(&mut u, &mut v, &mut k);
let mut rem = Big::from_small(0);
let mut x = Big::from_small(0);
let min_sig = Big::from_u64(T::MIN_SIG);
let max_sig = Big::from_u64(T::MAX_SIG);
loop {
u.div_rem(&v, &mut x, &mut rem);
if k == T::MIN_EXP_INT {
// We have to stop at the minimum exponent, if we wait until `k < T::MIN_EXP_INT`,
// then we'd be off by a factor of two. Unfortunately this means we have to special-
// case normal numbers with the minimum exponent.
// FIXME find a more elegant formulation, but run the `tiny-pow10` test to make sure
// that it's actually correct!
if x >= min_sig && x <= max_sig {
break;
}
return underflow(x, v, rem);
}
if k > T::MAX_EXP_INT {
return T::INFINITY;
}
if x < min_sig {
u.mul_pow2(1);
k -= 1;
} else if x > max_sig {
v.mul_pow2(1);
k += 1;
} else {
break;
}
}
let q = num::to_u64(&x);
let z = rawfp::encode_normal(Unpacked::new(q, k));
round_by_remainder(v, rem, q, z)
}
/// Skips over most Algorithm M iterations by checking the bit length.
fn quick_start<T: RawFloat>(u: &mut Big, v: &mut Big, k: &mut i16) {
// The bit length is an estimate of the base two logarithm, and log(u / v) = log(u) - log(v).
// The estimate is off by at most 1, but always an under-estimate, so the error on log(u)
// and log(v) are of the same sign and cancel out (if both are large). Therefore the error
// for log(u / v) is at most one as well.
// The target ratio is one where u/v is in an in-range significand. Thus our termination
// condition is log2(u / v) being the significand bits, plus/minus one.
// FIXME Looking at the second bit could improve the estimate and avoid some more divisions.
let target_ratio = T::SIG_BITS as i16;
let log2_u = u.bit_length() as i16;
let log2_v = v.bit_length() as i16;
let mut u_shift: i16 = 0;
let mut v_shift: i16 = 0;
assert!(*k == 0);
loop {
if *k == T::MIN_EXP_INT {
// Underflow or subnormal. Leave it to the main function.
break;
}
if *k == T::MAX_EXP_INT {
// Overflow. Leave it to the main function.
break;
}
let log2_ratio = (log2_u + u_shift) - (log2_v + v_shift);
if log2_ratio < target_ratio - 1 {
u_shift += 1;
*k -= 1;
} else if log2_ratio > target_ratio + 1 {
v_shift += 1;
*k += 1;
} else {
break;
}
}
u.mul_pow2(u_shift as usize);
v.mul_pow2(v_shift as usize);
}
fn underflow<T: RawFloat>(x: Big, v: Big, rem: Big) -> T {
if x < Big::from_u64(T::MIN_SIG) {
let q = num::to_u64(&x);
let z = rawfp::encode_subnormal(q);
return round_by_remainder(v, rem, q, z);
}
// Ratio isn't an in-range significand with the minimum exponent, so we need to round off
// excess bits and adjust the exponent accordingly. The real value now looks like this:
//
// x lsb
// /--------------\/
// 1010101010101010.10101010101010 * 2^k
// \-----/\-------/ \------------/
// q trunc. (represented by rem)
//
// Therefore, when the rounded-off bits are != 0.5 ULP, they decide the rounding
// on their own. When they are equal and the remainder is non-zero, the value still
// needs to be rounded up. Only when the rounded off bits are 1/2 and the remainder
// is zero, we have a half-to-even situation.
let bits = x.bit_length();
let lsb = bits - T::SIG_BITS as usize;
let q = num::get_bits(&x, lsb, bits);
let k = T::MIN_EXP_INT + lsb as i16;
let z = rawfp::encode_normal(Unpacked::new(q, k));
let q_even = q % 2 == 0;
match num::compare_with_half_ulp(&x, lsb) {
Greater => next_float(z),
Less => z,
Equal if rem.is_zero() && q_even => z,
Equal => next_float(z),
}
}
/// Ordinary round-to-even, obfuscated by having to round based on the remainder of a division.
fn round_by_remainder<T: RawFloat>(v: Big, r: Big, q: u64, z: T) -> T {
let mut v_minus_r = v;
v_minus_r.sub(&r);
if r < v_minus_r {
z
} else if r > v_minus_r {
next_float(z)
} else if q % 2 == 0 {
z
} else {
next_float(z)
}
}

View file

@ -0,0 +1,198 @@
//! Common utilities, for internal use only.
use crate::ptr;
/// Helper methods to process immutable bytes.
pub(crate) trait ByteSlice: AsRef<[u8]> {
unsafe fn first_unchecked(&self) -> u8 {
debug_assert!(!self.is_empty());
// SAFETY: safe as long as self is not empty
unsafe { *self.as_ref().get_unchecked(0) }
}
/// Get if the slice contains no elements.
fn is_empty(&self) -> bool {
self.as_ref().is_empty()
}
/// Check if the slice at least `n` length.
fn check_len(&self, n: usize) -> bool {
n <= self.as_ref().len()
}
/// Check if the first character in the slice is equal to c.
fn first_is(&self, c: u8) -> bool {
self.as_ref().first() == Some(&c)
}
/// Check if the first character in the slice is equal to c1 or c2.
fn first_is2(&self, c1: u8, c2: u8) -> bool {
if let Some(&c) = self.as_ref().first() { c == c1 || c == c2 } else { false }
}
/// Bounds-checked test if the first character in the slice is a digit.
fn first_isdigit(&self) -> bool {
if let Some(&c) = self.as_ref().first() { c.is_ascii_digit() } else { false }
}
/// Check if self starts with u with a case-insensitive comparison.
fn eq_ignore_case(&self, u: &[u8]) -> bool {
debug_assert!(self.as_ref().len() >= u.len());
let iter = self.as_ref().iter().zip(u.iter());
let d = iter.fold(0, |i, (&x, &y)| i | (x ^ y));
d == 0 || d == 32
}
/// Get the remaining slice after the first N elements.
fn advance(&self, n: usize) -> &[u8] {
&self.as_ref()[n..]
}
/// Get the slice after skipping all leading characters equal c.
fn skip_chars(&self, c: u8) -> &[u8] {
let mut s = self.as_ref();
while s.first_is(c) {
s = s.advance(1);
}
s
}
/// Get the slice after skipping all leading characters equal c1 or c2.
fn skip_chars2(&self, c1: u8, c2: u8) -> &[u8] {
let mut s = self.as_ref();
while s.first_is2(c1, c2) {
s = s.advance(1);
}
s
}
/// Read 8 bytes as a 64-bit integer in little-endian order.
unsafe fn read_u64_unchecked(&self) -> u64 {
debug_assert!(self.check_len(8));
let src = self.as_ref().as_ptr() as *const u64;
// SAFETY: safe as long as self is at least 8 bytes
u64::from_le(unsafe { ptr::read_unaligned(src) })
}
/// Try to read the next 8 bytes from the slice.
fn read_u64(&self) -> Option<u64> {
if self.check_len(8) {
// SAFETY: self must be at least 8 bytes.
Some(unsafe { self.read_u64_unchecked() })
} else {
None
}
}
/// Calculate the offset of slice from another.
fn offset_from(&self, other: &Self) -> isize {
other.as_ref().len() as isize - self.as_ref().len() as isize
}
}
impl ByteSlice for [u8] {}
/// Helper methods to process mutable bytes.
pub(crate) trait ByteSliceMut: AsMut<[u8]> {
/// Write a 64-bit integer as 8 bytes in little-endian order.
unsafe fn write_u64_unchecked(&mut self, value: u64) {
debug_assert!(self.as_mut().len() >= 8);
let dst = self.as_mut().as_mut_ptr() as *mut u64;
// NOTE: we must use `write_unaligned`, since dst is not
// guaranteed to be properly aligned. Miri will warn us
// if we use `write` instead of `write_unaligned`, as expected.
// SAFETY: safe as long as self is at least 8 bytes
unsafe {
ptr::write_unaligned(dst, u64::to_le(value));
}
}
}
impl ByteSliceMut for [u8] {}
/// Bytes wrapper with specialized methods for ASCII characters.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub(crate) struct AsciiStr<'a> {
slc: &'a [u8],
}
impl<'a> AsciiStr<'a> {
pub fn new(slc: &'a [u8]) -> Self {
Self { slc }
}
/// Advance the view by n, advancing it in-place to (n..).
pub unsafe fn step_by(&mut self, n: usize) -> &mut Self {
// SAFETY: safe as long n is less than the buffer length
self.slc = unsafe { self.slc.get_unchecked(n..) };
self
}
/// Advance the view by n, advancing it in-place to (1..).
pub unsafe fn step(&mut self) -> &mut Self {
// SAFETY: safe as long as self is not empty
unsafe { self.step_by(1) }
}
/// Iteratively parse and consume digits from bytes.
pub fn parse_digits(&mut self, mut func: impl FnMut(u8)) {
while let Some(&c) = self.as_ref().first() {
let c = c.wrapping_sub(b'0');
if c < 10 {
func(c);
// SAFETY: self cannot be empty
unsafe {
self.step();
}
} else {
break;
}
}
}
}
impl<'a> AsRef<[u8]> for AsciiStr<'a> {
#[inline]
fn as_ref(&self) -> &[u8] {
self.slc
}
}
impl<'a> ByteSlice for AsciiStr<'a> {}
/// Determine if 8 bytes are all decimal digits.
/// This does not care about the order in which the bytes were loaded.
pub(crate) fn is_8digits(v: u64) -> bool {
let a = v.wrapping_add(0x4646_4646_4646_4646);
let b = v.wrapping_sub(0x3030_3030_3030_3030);
(a | b) & 0x8080_8080_8080_8080 == 0
}
/// Iteratively parse and consume digits from bytes.
pub(crate) fn parse_digits(s: &mut &[u8], mut f: impl FnMut(u8)) {
while let Some(&c) = s.get(0) {
let c = c.wrapping_sub(b'0');
if c < 10 {
f(c);
*s = s.advance(1);
} else {
break;
}
}
}
/// A custom 64-bit floating point type, representing `f * 2^e`.
/// e is biased, so it be directly shifted into the exponent bits.
#[derive(Debug, Copy, Clone, PartialEq, Eq, Default)]
pub struct BiasedFp {
/// The significant digits.
pub f: u64,
/// The biased, binary exponent.
pub e: i32,
}
impl BiasedFp {
pub const fn zero_pow2(e: i32) -> Self {
Self { f: 0, e }
}
}

View file

@ -0,0 +1,351 @@
//! Arbitrary-precision decimal class for fallback algorithms.
//!
//! This is only used if the fast-path (native floats) and
//! the Eisel-Lemire algorithm are unable to unambiguously
//! determine the float.
//!
//! The technique used is "Simple Decimal Conversion", developed
//! by Nigel Tao and Ken Thompson. A detailed description of the
//! algorithm can be found in "ParseNumberF64 by Simple Decimal Conversion",
//! available online: <https://nigeltao.github.io/blog/2020/parse-number-f64-simple.html>.
use crate::num::dec2flt::common::{is_8digits, parse_digits, ByteSlice, ByteSliceMut};
#[derive(Clone)]
pub struct Decimal {
/// The number of significant digits in the decimal.
pub num_digits: usize,
/// The offset of the decimal point in the significant digits.
pub decimal_point: i32,
/// If the number of significant digits stored in the decimal is truncated.
pub truncated: bool,
/// Buffer of the raw digits, in the range [0, 9].
pub digits: [u8; Self::MAX_DIGITS],
}
impl Default for Decimal {
fn default() -> Self {
Self { num_digits: 0, decimal_point: 0, truncated: false, digits: [0; Self::MAX_DIGITS] }
}
}
impl Decimal {
/// The maximum number of digits required to unambiguously round a float.
///
/// For a double-precision IEEE-754 float, this required 767 digits,
/// so we store the max digits + 1.
///
/// We can exactly represent a float in radix `b` from radix 2 if
/// `b` is divisible by 2. This function calculates the exact number of
/// digits required to exactly represent that float.
///
/// According to the "Handbook of Floating Point Arithmetic",
/// for IEEE754, with emin being the min exponent, p2 being the
/// precision, and b being the radix, the number of digits follows as:
///
/// `emin + p2 + ⌊(emin + 1) log(2, b) log(1 2^(p2), b)⌋`
///
/// For f32, this follows as:
/// emin = -126
/// p2 = 24
///
/// For f64, this follows as:
/// emin = -1022
/// p2 = 53
///
/// In Python:
/// `-emin + p2 + math.floor((emin+ 1)*math.log(2, b)-math.log(1-2**(-p2), b))`
pub const MAX_DIGITS: usize = 768;
/// The max digits that can be exactly represented in a 64-bit integer.
pub const MAX_DIGITS_WITHOUT_OVERFLOW: usize = 19;
pub const DECIMAL_POINT_RANGE: i32 = 2047;
/// Append a digit to the buffer.
pub fn try_add_digit(&mut self, digit: u8) {
if self.num_digits < Self::MAX_DIGITS {
self.digits[self.num_digits] = digit;
}
self.num_digits += 1;
}
/// Trim trailing zeros from the buffer.
pub fn trim(&mut self) {
// All of the following calls to `Decimal::trim` can't panic because:
//
// 1. `parse_decimal` sets `num_digits` to a max of `Decimal::MAX_DIGITS`.
// 2. `right_shift` sets `num_digits` to `write_index`, which is bounded by `num_digits`.
// 3. `left_shift` `num_digits` to a max of `Decimal::MAX_DIGITS`.
//
// Trim is only called in `right_shift` and `left_shift`.
debug_assert!(self.num_digits <= Self::MAX_DIGITS);
while self.num_digits != 0 && self.digits[self.num_digits - 1] == 0 {
self.num_digits -= 1;
}
}
pub fn round(&self) -> u64 {
if self.num_digits == 0 || self.decimal_point < 0 {
return 0;
} else if self.decimal_point > 18 {
return 0xFFFF_FFFF_FFFF_FFFF_u64;
}
let dp = self.decimal_point as usize;
let mut n = 0_u64;
for i in 0..dp {
n *= 10;
if i < self.num_digits {
n += self.digits[i] as u64;
}
}
let mut round_up = false;
if dp < self.num_digits {
round_up = self.digits[dp] >= 5;
if self.digits[dp] == 5 && dp + 1 == self.num_digits {
round_up = self.truncated || ((dp != 0) && (1 & self.digits[dp - 1] != 0))
}
}
if round_up {
n += 1;
}
n
}
/// Computes decimal * 2^shift.
pub fn left_shift(&mut self, shift: usize) {
if self.num_digits == 0 {
return;
}
let num_new_digits = number_of_digits_decimal_left_shift(self, shift);
let mut read_index = self.num_digits;
let mut write_index = self.num_digits + num_new_digits;
let mut n = 0_u64;
while read_index != 0 {
read_index -= 1;
write_index -= 1;
n += (self.digits[read_index] as u64) << shift;
let quotient = n / 10;
let remainder = n - (10 * quotient);
if write_index < Self::MAX_DIGITS {
self.digits[write_index] = remainder as u8;
} else if remainder > 0 {
self.truncated = true;
}
n = quotient;
}
while n > 0 {
write_index -= 1;
let quotient = n / 10;
let remainder = n - (10 * quotient);
if write_index < Self::MAX_DIGITS {
self.digits[write_index] = remainder as u8;
} else if remainder > 0 {
self.truncated = true;
}
n = quotient;
}
self.num_digits += num_new_digits;
if self.num_digits > Self::MAX_DIGITS {
self.num_digits = Self::MAX_DIGITS;
}
self.decimal_point += num_new_digits as i32;
self.trim();
}
/// Computes decimal * 2^-shift.
pub fn right_shift(&mut self, shift: usize) {
let mut read_index = 0;
let mut write_index = 0;
let mut n = 0_u64;
while (n >> shift) == 0 {
if read_index < self.num_digits {
n = (10 * n) + self.digits[read_index] as u64;
read_index += 1;
} else if n == 0 {
return;
} else {
while (n >> shift) == 0 {
n *= 10;
read_index += 1;
}
break;
}
}
self.decimal_point -= read_index as i32 - 1;
if self.decimal_point < -Self::DECIMAL_POINT_RANGE {
// `self = Self::Default()`, but without the overhead of clearing `digits`.
self.num_digits = 0;
self.decimal_point = 0;
self.truncated = false;
return;
}
let mask = (1_u64 << shift) - 1;
while read_index < self.num_digits {
let new_digit = (n >> shift) as u8;
n = (10 * (n & mask)) + self.digits[read_index] as u64;
read_index += 1;
self.digits[write_index] = new_digit;
write_index += 1;
}
while n > 0 {
let new_digit = (n >> shift) as u8;
n = 10 * (n & mask);
if write_index < Self::MAX_DIGITS {
self.digits[write_index] = new_digit;
write_index += 1;
} else if new_digit > 0 {
self.truncated = true;
}
}
self.num_digits = write_index;
self.trim();
}
}
/// Parse a big integer representation of the float as a decimal.
pub fn parse_decimal(mut s: &[u8]) -> Decimal {
let mut d = Decimal::default();
let start = s;
s = s.skip_chars(b'0');
parse_digits(&mut s, |digit| d.try_add_digit(digit));
if s.first_is(b'.') {
s = s.advance(1);
let first = s;
// Skip leading zeros.
if d.num_digits == 0 {
s = s.skip_chars(b'0');
}
while s.len() >= 8 && d.num_digits + 8 < Decimal::MAX_DIGITS {
// SAFETY: s is at least 8 bytes.
let v = unsafe { s.read_u64_unchecked() };
if !is_8digits(v) {
break;
}
// SAFETY: d.num_digits + 8 is less than d.digits.len()
unsafe {
d.digits[d.num_digits..].write_u64_unchecked(v - 0x3030_3030_3030_3030);
}
d.num_digits += 8;
s = s.advance(8);
}
parse_digits(&mut s, |digit| d.try_add_digit(digit));
d.decimal_point = s.len() as i32 - first.len() as i32;
}
if d.num_digits != 0 {
// Ignore the trailing zeros if there are any
let mut n_trailing_zeros = 0;
for &c in start[..(start.len() - s.len())].iter().rev() {
if c == b'0' {
n_trailing_zeros += 1;
} else if c != b'.' {
break;
}
}
d.decimal_point += n_trailing_zeros as i32;
d.num_digits -= n_trailing_zeros;
d.decimal_point += d.num_digits as i32;
if d.num_digits > Decimal::MAX_DIGITS {
d.truncated = true;
d.num_digits = Decimal::MAX_DIGITS;
}
}
if s.first_is2(b'e', b'E') {
s = s.advance(1);
let mut neg_exp = false;
if s.first_is(b'-') {
neg_exp = true;
s = s.advance(1);
} else if s.first_is(b'+') {
s = s.advance(1);
}
let mut exp_num = 0_i32;
parse_digits(&mut s, |digit| {
if exp_num < 0x10000 {
exp_num = 10 * exp_num + digit as i32;
}
});
d.decimal_point += if neg_exp { -exp_num } else { exp_num };
}
for i in d.num_digits..Decimal::MAX_DIGITS_WITHOUT_OVERFLOW {
d.digits[i] = 0;
}
d
}
fn number_of_digits_decimal_left_shift(d: &Decimal, mut shift: usize) -> usize {
#[rustfmt::skip]
const TABLE: [u16; 65] = [
0x0000, 0x0800, 0x0801, 0x0803, 0x1006, 0x1009, 0x100D, 0x1812, 0x1817, 0x181D, 0x2024,
0x202B, 0x2033, 0x203C, 0x2846, 0x2850, 0x285B, 0x3067, 0x3073, 0x3080, 0x388E, 0x389C,
0x38AB, 0x38BB, 0x40CC, 0x40DD, 0x40EF, 0x4902, 0x4915, 0x4929, 0x513E, 0x5153, 0x5169,
0x5180, 0x5998, 0x59B0, 0x59C9, 0x61E3, 0x61FD, 0x6218, 0x6A34, 0x6A50, 0x6A6D, 0x6A8B,
0x72AA, 0x72C9, 0x72E9, 0x7B0A, 0x7B2B, 0x7B4D, 0x8370, 0x8393, 0x83B7, 0x83DC, 0x8C02,
0x8C28, 0x8C4F, 0x9477, 0x949F, 0x94C8, 0x9CF2, 0x051C, 0x051C, 0x051C, 0x051C,
];
#[rustfmt::skip]
const TABLE_POW5: [u8; 0x051C] = [
5, 2, 5, 1, 2, 5, 6, 2, 5, 3, 1, 2, 5, 1, 5, 6, 2, 5, 7, 8, 1, 2, 5, 3, 9, 0, 6, 2, 5, 1,
9, 5, 3, 1, 2, 5, 9, 7, 6, 5, 6, 2, 5, 4, 8, 8, 2, 8, 1, 2, 5, 2, 4, 4, 1, 4, 0, 6, 2, 5,
1, 2, 2, 0, 7, 0, 3, 1, 2, 5, 6, 1, 0, 3, 5, 1, 5, 6, 2, 5, 3, 0, 5, 1, 7, 5, 7, 8, 1, 2,
5, 1, 5, 2, 5, 8, 7, 8, 9, 0, 6, 2, 5, 7, 6, 2, 9, 3, 9, 4, 5, 3, 1, 2, 5, 3, 8, 1, 4, 6,
9, 7, 2, 6, 5, 6, 2, 5, 1, 9, 0, 7, 3, 4, 8, 6, 3, 2, 8, 1, 2, 5, 9, 5, 3, 6, 7, 4, 3, 1,
6, 4, 0, 6, 2, 5, 4, 7, 6, 8, 3, 7, 1, 5, 8, 2, 0, 3, 1, 2, 5, 2, 3, 8, 4, 1, 8, 5, 7, 9,
1, 0, 1, 5, 6, 2, 5, 1, 1, 9, 2, 0, 9, 2, 8, 9, 5, 5, 0, 7, 8, 1, 2, 5, 5, 9, 6, 0, 4, 6,
4, 4, 7, 7, 5, 3, 9, 0, 6, 2, 5, 2, 9, 8, 0, 2, 3, 2, 2, 3, 8, 7, 6, 9, 5, 3, 1, 2, 5, 1,
4, 9, 0, 1, 1, 6, 1, 1, 9, 3, 8, 4, 7, 6, 5, 6, 2, 5, 7, 4, 5, 0, 5, 8, 0, 5, 9, 6, 9, 2,
3, 8, 2, 8, 1, 2, 5, 3, 7, 2, 5, 2, 9, 0, 2, 9, 8, 4, 6, 1, 9, 1, 4, 0, 6, 2, 5, 1, 8, 6,
2, 6, 4, 5, 1, 4, 9, 2, 3, 0, 9, 5, 7, 0, 3, 1, 2, 5, 9, 3, 1, 3, 2, 2, 5, 7, 4, 6, 1, 5,
4, 7, 8, 5, 1, 5, 6, 2, 5, 4, 6, 5, 6, 6, 1, 2, 8, 7, 3, 0, 7, 7, 3, 9, 2, 5, 7, 8, 1, 2,
5, 2, 3, 2, 8, 3, 0, 6, 4, 3, 6, 5, 3, 8, 6, 9, 6, 2, 8, 9, 0, 6, 2, 5, 1, 1, 6, 4, 1, 5,
3, 2, 1, 8, 2, 6, 9, 3, 4, 8, 1, 4, 4, 5, 3, 1, 2, 5, 5, 8, 2, 0, 7, 6, 6, 0, 9, 1, 3, 4,
6, 7, 4, 0, 7, 2, 2, 6, 5, 6, 2, 5, 2, 9, 1, 0, 3, 8, 3, 0, 4, 5, 6, 7, 3, 3, 7, 0, 3, 6,
1, 3, 2, 8, 1, 2, 5, 1, 4, 5, 5, 1, 9, 1, 5, 2, 2, 8, 3, 6, 6, 8, 5, 1, 8, 0, 6, 6, 4, 0,
6, 2, 5, 7, 2, 7, 5, 9, 5, 7, 6, 1, 4, 1, 8, 3, 4, 2, 5, 9, 0, 3, 3, 2, 0, 3, 1, 2, 5, 3,
6, 3, 7, 9, 7, 8, 8, 0, 7, 0, 9, 1, 7, 1, 2, 9, 5, 1, 6, 6, 0, 1, 5, 6, 2, 5, 1, 8, 1, 8,
9, 8, 9, 4, 0, 3, 5, 4, 5, 8, 5, 6, 4, 7, 5, 8, 3, 0, 0, 7, 8, 1, 2, 5, 9, 0, 9, 4, 9, 4,
7, 0, 1, 7, 7, 2, 9, 2, 8, 2, 3, 7, 9, 1, 5, 0, 3, 9, 0, 6, 2, 5, 4, 5, 4, 7, 4, 7, 3, 5,
0, 8, 8, 6, 4, 6, 4, 1, 1, 8, 9, 5, 7, 5, 1, 9, 5, 3, 1, 2, 5, 2, 2, 7, 3, 7, 3, 6, 7, 5,
4, 4, 3, 2, 3, 2, 0, 5, 9, 4, 7, 8, 7, 5, 9, 7, 6, 5, 6, 2, 5, 1, 1, 3, 6, 8, 6, 8, 3, 7,
7, 2, 1, 6, 1, 6, 0, 2, 9, 7, 3, 9, 3, 7, 9, 8, 8, 2, 8, 1, 2, 5, 5, 6, 8, 4, 3, 4, 1, 8,
8, 6, 0, 8, 0, 8, 0, 1, 4, 8, 6, 9, 6, 8, 9, 9, 4, 1, 4, 0, 6, 2, 5, 2, 8, 4, 2, 1, 7, 0,
9, 4, 3, 0, 4, 0, 4, 0, 0, 7, 4, 3, 4, 8, 4, 4, 9, 7, 0, 7, 0, 3, 1, 2, 5, 1, 4, 2, 1, 0,
8, 5, 4, 7, 1, 5, 2, 0, 2, 0, 0, 3, 7, 1, 7, 4, 2, 2, 4, 8, 5, 3, 5, 1, 5, 6, 2, 5, 7, 1,
0, 5, 4, 2, 7, 3, 5, 7, 6, 0, 1, 0, 0, 1, 8, 5, 8, 7, 1, 1, 2, 4, 2, 6, 7, 5, 7, 8, 1, 2,
5, 3, 5, 5, 2, 7, 1, 3, 6, 7, 8, 8, 0, 0, 5, 0, 0, 9, 2, 9, 3, 5, 5, 6, 2, 1, 3, 3, 7, 8,
9, 0, 6, 2, 5, 1, 7, 7, 6, 3, 5, 6, 8, 3, 9, 4, 0, 0, 2, 5, 0, 4, 6, 4, 6, 7, 7, 8, 1, 0,
6, 6, 8, 9, 4, 5, 3, 1, 2, 5, 8, 8, 8, 1, 7, 8, 4, 1, 9, 7, 0, 0, 1, 2, 5, 2, 3, 2, 3, 3,
8, 9, 0, 5, 3, 3, 4, 4, 7, 2, 6, 5, 6, 2, 5, 4, 4, 4, 0, 8, 9, 2, 0, 9, 8, 5, 0, 0, 6, 2,
6, 1, 6, 1, 6, 9, 4, 5, 2, 6, 6, 7, 2, 3, 6, 3, 2, 8, 1, 2, 5, 2, 2, 2, 0, 4, 4, 6, 0, 4,
9, 2, 5, 0, 3, 1, 3, 0, 8, 0, 8, 4, 7, 2, 6, 3, 3, 3, 6, 1, 8, 1, 6, 4, 0, 6, 2, 5, 1, 1,
1, 0, 2, 2, 3, 0, 2, 4, 6, 2, 5, 1, 5, 6, 5, 4, 0, 4, 2, 3, 6, 3, 1, 6, 6, 8, 0, 9, 0, 8,
2, 0, 3, 1, 2, 5, 5, 5, 5, 1, 1, 1, 5, 1, 2, 3, 1, 2, 5, 7, 8, 2, 7, 0, 2, 1, 1, 8, 1, 5,
8, 3, 4, 0, 4, 5, 4, 1, 0, 1, 5, 6, 2, 5, 2, 7, 7, 5, 5, 5, 7, 5, 6, 1, 5, 6, 2, 8, 9, 1,
3, 5, 1, 0, 5, 9, 0, 7, 9, 1, 7, 0, 2, 2, 7, 0, 5, 0, 7, 8, 1, 2, 5, 1, 3, 8, 7, 7, 7, 8,
7, 8, 0, 7, 8, 1, 4, 4, 5, 6, 7, 5, 5, 2, 9, 5, 3, 9, 5, 8, 5, 1, 1, 3, 5, 2, 5, 3, 9, 0,
6, 2, 5, 6, 9, 3, 8, 8, 9, 3, 9, 0, 3, 9, 0, 7, 2, 2, 8, 3, 7, 7, 6, 4, 7, 6, 9, 7, 9, 2,
5, 5, 6, 7, 6, 2, 6, 9, 5, 3, 1, 2, 5, 3, 4, 6, 9, 4, 4, 6, 9, 5, 1, 9, 5, 3, 6, 1, 4, 1,
8, 8, 8, 2, 3, 8, 4, 8, 9, 6, 2, 7, 8, 3, 8, 1, 3, 4, 7, 6, 5, 6, 2, 5, 1, 7, 3, 4, 7, 2,
3, 4, 7, 5, 9, 7, 6, 8, 0, 7, 0, 9, 4, 4, 1, 1, 9, 2, 4, 4, 8, 1, 3, 9, 1, 9, 0, 6, 7, 3,
8, 2, 8, 1, 2, 5, 8, 6, 7, 3, 6, 1, 7, 3, 7, 9, 8, 8, 4, 0, 3, 5, 4, 7, 2, 0, 5, 9, 6, 2,
2, 4, 0, 6, 9, 5, 9, 5, 3, 3, 6, 9, 1, 4, 0, 6, 2, 5,
];
shift &= 63;
let x_a = TABLE[shift];
let x_b = TABLE[shift + 1];
let num_new_digits = (x_a >> 11) as _;
let pow5_a = (0x7FF & x_a) as usize;
let pow5_b = (0x7FF & x_b) as usize;
let pow5 = &TABLE_POW5[pow5_a..];
for (i, &p5) in pow5.iter().enumerate().take(pow5_b - pow5_a) {
if i >= d.num_digits {
return num_new_digits - 1;
} else if d.digits[i] == p5 {
continue;
} else if d.digits[i] < p5 {
return num_new_digits - 1;
} else {
return num_new_digits;
}
}
num_new_digits
}

View file

@ -0,0 +1,207 @@
//! Helper trait for generic float types.
use crate::fmt::{Debug, LowerExp};
use crate::num::FpCategory;
use crate::ops::{Add, Div, Mul, Neg};
/// A helper trait to avoid duplicating basically all the conversion code for `f32` and `f64`.
///
/// See the parent module's doc comment for why this is necessary.
///
/// Should **never ever** be implemented for other types or be used outside the dec2flt module.
#[doc(hidden)]
pub trait RawFloat:
Sized
+ Div<Output = Self>
+ Neg<Output = Self>
+ Mul<Output = Self>
+ Add<Output = Self>
+ LowerExp
+ PartialEq
+ PartialOrd
+ Default
+ Clone
+ Copy
+ Debug
{
const INFINITY: Self;
const NEG_INFINITY: Self;
const NAN: Self;
const NEG_NAN: Self;
/// The number of bits in the significand, *excluding* the hidden bit.
const MANTISSA_EXPLICIT_BITS: usize;
// Round-to-even only happens for negative values of q
// when q ≥ 4 in the 64-bit case and when q ≥ 17 in
// the 32-bitcase.
//
// When q ≥ 0,we have that 5^q ≤ 2m+1. In the 64-bit case,we
// have 5^q ≤ 2m+1 ≤ 2^54 or q ≤ 23. In the 32-bit case,we have
// 5^q ≤ 2m+1 ≤ 2^25 or q ≤ 10.
//
// When q < 0, we have w ≥ (2m+1)×5^q. We must have that w < 2^64
// so (2m+1)×5^q < 2^64. We have that 2m+1 > 2^53 (64-bit case)
// or 2m+1 > 2^24 (32-bit case). Hence,we must have 2^53×5^q < 2^64
// (64-bit) and 2^24×5^q < 2^64 (32-bit). Hence we have 5^q < 2^11
// or q ≥ 4 (64-bit case) and 5^q < 2^40 or q ≥ 17 (32-bitcase).
//
// Thus we have that we only need to round ties to even when
// we have that q ∈ [4,23](in the 64-bit case) or q∈[17,10]
// (in the 32-bit case). In both cases,the power of five(5^|q|)
// fits in a 64-bit word.
const MIN_EXPONENT_ROUND_TO_EVEN: i32;
const MAX_EXPONENT_ROUND_TO_EVEN: i32;
// Minimum exponent that for a fast path case, or `-⌊(MANTISSA_EXPLICIT_BITS+1)/log2(5)⌋`
const MIN_EXPONENT_FAST_PATH: i64;
// Maximum exponent that for a fast path case, or `⌊(MANTISSA_EXPLICIT_BITS+1)/log2(5)⌋`
const MAX_EXPONENT_FAST_PATH: i64;
// Maximum exponent that can be represented for a disguised-fast path case.
// This is `MAX_EXPONENT_FAST_PATH + ⌊(MANTISSA_EXPLICIT_BITS+1)/log2(10)⌋`
const MAX_EXPONENT_DISGUISED_FAST_PATH: i64;
// Minimum exponent value `-(1 << (EXP_BITS - 1)) + 1`.
const MINIMUM_EXPONENT: i32;
// Largest exponent value `(1 << EXP_BITS) - 1`.
const INFINITE_POWER: i32;
// Index (in bits) of the sign.
const SIGN_INDEX: usize;
// Smallest decimal exponent for a non-zero value.
const SMALLEST_POWER_OF_TEN: i32;
// Largest decimal exponent for a non-infinite value.
const LARGEST_POWER_OF_TEN: i32;
// Maximum mantissa for the fast-path (`1 << 53` for f64).
const MAX_MANTISSA_FAST_PATH: u64 = 2_u64 << Self::MANTISSA_EXPLICIT_BITS;
/// Convert integer into float through an as cast.
/// This is only called in the fast-path algorithm, and therefore
/// will not lose precision, since the value will always have
/// only if the value is <= Self::MAX_MANTISSA_FAST_PATH.
fn from_u64(v: u64) -> Self;
/// Performs a raw transmutation from an integer.
fn from_u64_bits(v: u64) -> Self;
/// Get a small power-of-ten for fast-path multiplication.
fn pow10_fast_path(exponent: usize) -> Self;
/// Returns the category that this number falls into.
fn classify(self) -> FpCategory;
/// Returns the mantissa, exponent and sign as integers.
fn integer_decode(self) -> (u64, i16, i8);
}
impl RawFloat for f32 {
const INFINITY: Self = f32::INFINITY;
const NEG_INFINITY: Self = f32::NEG_INFINITY;
const NAN: Self = f32::NAN;
const NEG_NAN: Self = -f32::NAN;
const MANTISSA_EXPLICIT_BITS: usize = 23;
const MIN_EXPONENT_ROUND_TO_EVEN: i32 = -17;
const MAX_EXPONENT_ROUND_TO_EVEN: i32 = 10;
const MIN_EXPONENT_FAST_PATH: i64 = -10; // assuming FLT_EVAL_METHOD = 0
const MAX_EXPONENT_FAST_PATH: i64 = 10;
const MAX_EXPONENT_DISGUISED_FAST_PATH: i64 = 17;
const MINIMUM_EXPONENT: i32 = -127;
const INFINITE_POWER: i32 = 0xFF;
const SIGN_INDEX: usize = 31;
const SMALLEST_POWER_OF_TEN: i32 = -65;
const LARGEST_POWER_OF_TEN: i32 = 38;
fn from_u64(v: u64) -> Self {
debug_assert!(v <= Self::MAX_MANTISSA_FAST_PATH);
v as _
}
fn from_u64_bits(v: u64) -> Self {
f32::from_bits((v & 0xFFFFFFFF) as u32)
}
fn pow10_fast_path(exponent: usize) -> Self {
#[allow(clippy::use_self)]
const TABLE: [f32; 16] =
[1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9, 1e10, 0., 0., 0., 0., 0.];
TABLE[exponent & 15]
}
/// Returns the mantissa, exponent and sign as integers.
fn integer_decode(self) -> (u64, i16, i8) {
let bits = self.to_bits();
let sign: i8 = if bits >> 31 == 0 { 1 } else { -1 };
let mut exponent: i16 = ((bits >> 23) & 0xff) as i16;
let mantissa =
if exponent == 0 { (bits & 0x7fffff) << 1 } else { (bits & 0x7fffff) | 0x800000 };
// Exponent bias + mantissa shift
exponent -= 127 + 23;
(mantissa as u64, exponent, sign)
}
fn classify(self) -> FpCategory {
self.classify()
}
}
impl RawFloat for f64 {
const INFINITY: Self = f64::INFINITY;
const NEG_INFINITY: Self = f64::NEG_INFINITY;
const NAN: Self = f64::NAN;
const NEG_NAN: Self = -f64::NAN;
const MANTISSA_EXPLICIT_BITS: usize = 52;
const MIN_EXPONENT_ROUND_TO_EVEN: i32 = -4;
const MAX_EXPONENT_ROUND_TO_EVEN: i32 = 23;
const MIN_EXPONENT_FAST_PATH: i64 = -22; // assuming FLT_EVAL_METHOD = 0
const MAX_EXPONENT_FAST_PATH: i64 = 22;
const MAX_EXPONENT_DISGUISED_FAST_PATH: i64 = 37;
const MINIMUM_EXPONENT: i32 = -1023;
const INFINITE_POWER: i32 = 0x7FF;
const SIGN_INDEX: usize = 63;
const SMALLEST_POWER_OF_TEN: i32 = -342;
const LARGEST_POWER_OF_TEN: i32 = 308;
fn from_u64(v: u64) -> Self {
debug_assert!(v <= Self::MAX_MANTISSA_FAST_PATH);
v as _
}
fn from_u64_bits(v: u64) -> Self {
f64::from_bits(v)
}
fn pow10_fast_path(exponent: usize) -> Self {
const TABLE: [f64; 32] = [
1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9, 1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22, 0., 0., 0., 0., 0., 0., 0., 0., 0.,
];
TABLE[exponent & 31]
}
/// Returns the mantissa, exponent and sign as integers.
fn integer_decode(self) -> (u64, i16, i8) {
let bits = self.to_bits();
let sign: i8 = if bits >> 63 == 0 { 1 } else { -1 };
let mut exponent: i16 = ((bits >> 52) & 0x7ff) as i16;
let mantissa = if exponent == 0 {
(bits & 0xfffffffffffff) << 1
} else {
(bits & 0xfffffffffffff) | 0x10000000000000
};
// Exponent bias + mantissa shift
exponent -= 1023 + 52;
(mantissa, exponent, sign)
}
fn classify(self) -> FpCategory {
self.classify()
}
}

View file

@ -0,0 +1,89 @@
//! Platform-specific, assembly instructions to avoid
//! intermediate rounding on architectures with FPUs.
pub use fpu_precision::set_precision;
// On x86, the x87 FPU is used for float operations if the SSE/SSE2 extensions are not available.
// The x87 FPU operates with 80 bits of precision by default, which means that operations will
// round to 80 bits causing double rounding to happen when values are eventually represented as
// 32/64 bit float values. To overcome this, the FPU control word can be set so that the
// computations are performed in the desired precision.
#[cfg(all(target_arch = "x86", not(target_feature = "sse2")))]
mod fpu_precision {
use core::mem::size_of;
/// A structure used to preserve the original value of the FPU control word, so that it can be
/// restored when the structure is dropped.
///
/// The x87 FPU is a 16-bits register whose fields are as follows:
///
/// | 12-15 | 10-11 | 8-9 | 6-7 | 5 | 4 | 3 | 2 | 1 | 0 |
/// |------:|------:|----:|----:|---:|---:|---:|---:|---:|---:|
/// | | RC | PC | | PM | UM | OM | ZM | DM | IM |
///
/// The documentation for all of the fields is available in the IA-32 Architectures Software
/// Developer's Manual (Volume 1).
///
/// The only field which is relevant for the following code is PC, Precision Control. This
/// field determines the precision of the operations performed by the FPU. It can be set to:
/// - 0b00, single precision i.e., 32-bits
/// - 0b10, double precision i.e., 64-bits
/// - 0b11, double extended precision i.e., 80-bits (default state)
/// The 0b01 value is reserved and should not be used.
pub struct FPUControlWord(u16);
fn set_cw(cw: u16) {
// SAFETY: the `fldcw` instruction has been audited to be able to work correctly with
// any `u16`
unsafe {
asm!(
"fldcw word ptr [{}]",
in(reg) &cw,
options(nostack),
)
}
}
/// Sets the precision field of the FPU to `T` and returns a `FPUControlWord`.
pub fn set_precision<T>() -> FPUControlWord {
let mut cw = 0_u16;
// Compute the value for the Precision Control field that is appropriate for `T`.
let cw_precision = match size_of::<T>() {
4 => 0x0000, // 32 bits
8 => 0x0200, // 64 bits
_ => 0x0300, // default, 80 bits
};
// Get the original value of the control word to restore it later, when the
// `FPUControlWord` structure is dropped
// SAFETY: the `fnstcw` instruction has been audited to be able to work correctly with
// any `u16`
unsafe {
asm!(
"fnstcw word ptr [{}]",
in(reg) &mut cw,
options(nostack),
)
}
// Set the control word to the desired precision. This is achieved by masking away the old
// precision (bits 8 and 9, 0x300) and replacing it with the precision flag computed above.
set_cw((cw & 0xFCFF) | cw_precision);
FPUControlWord(cw)
}
impl Drop for FPUControlWord {
fn drop(&mut self) {
set_cw(self.0)
}
}
}
// In most architectures, floating point operations have an explicit bit size, therefore the
// precision of the computation is determined on a per-operation basis.
#[cfg(any(not(target_arch = "x86"), target_feature = "sse2"))]
mod fpu_precision {
pub fn set_precision<T>() {}
}

View file

@ -0,0 +1,166 @@
//! Implementation of the Eisel-Lemire algorithm.
use crate::num::dec2flt::common::BiasedFp;
use crate::num::dec2flt::float::RawFloat;
use crate::num::dec2flt::table::{
LARGEST_POWER_OF_FIVE, POWER_OF_FIVE_128, SMALLEST_POWER_OF_FIVE,
};
/// Compute a float using an extended-precision representation.
///
/// Fast conversion of a the significant digits and decimal exponent
/// a float to a extended representation with a binary float. This
/// algorithm will accurately parse the vast majority of cases,
/// and uses a 128-bit representation (with a fallback 192-bit
/// representation).
///
/// This algorithm scales the exponent by the decimal exponent
/// using pre-computed powers-of-5, and calculates if the
/// representation can be unambiguously rounded to the nearest
/// machine float. Near-halfway cases are not handled here,
/// and are represented by a negative, biased binary exponent.
///
/// The algorithm is described in detail in "Daniel Lemire, Number Parsing
/// at a Gigabyte per Second" in section 5, "Fast Algorithm", and
/// section 6, "Exact Numbers And Ties", available online:
/// <https://arxiv.org/abs/2101.11408.pdf>.
pub fn compute_float<F: RawFloat>(q: i64, mut w: u64) -> BiasedFp {
let fp_zero = BiasedFp::zero_pow2(0);
let fp_inf = BiasedFp::zero_pow2(F::INFINITE_POWER);
let fp_error = BiasedFp::zero_pow2(-1);
// Short-circuit if the value can only be a literal 0 or infinity.
if w == 0 || q < F::SMALLEST_POWER_OF_TEN as i64 {
return fp_zero;
} else if q > F::LARGEST_POWER_OF_TEN as i64 {
return fp_inf;
}
// Normalize our significant digits, so the most-significant bit is set.
let lz = w.leading_zeros();
w <<= lz;
let (lo, hi) = compute_product_approx(q, w, F::MANTISSA_EXPLICIT_BITS + 3);
if lo == 0xFFFF_FFFF_FFFF_FFFF {
// If we have failed to approximate w x 5^-q with our 128-bit value.
// Since the addition of 1 could lead to an overflow which could then
// round up over the half-way point, this can lead to improper rounding
// of a float.
//
// However, this can only occur if q ∈ [-27, 55]. The upper bound of q
// is 55 because 5^55 < 2^128, however, this can only happen if 5^q > 2^64,
// since otherwise the product can be represented in 64-bits, producing
// an exact result. For negative exponents, rounding-to-even can
// only occur if 5^-q < 2^64.
//
// For detailed explanations of rounding for negative exponents, see
// <https://arxiv.org/pdf/2101.11408.pdf#section.9.1>. For detailed
// explanations of rounding for positive exponents, see
// <https://arxiv.org/pdf/2101.11408.pdf#section.8>.
let inside_safe_exponent = (q >= -27) && (q <= 55);
if !inside_safe_exponent {
return fp_error;
}
}
let upperbit = (hi >> 63) as i32;
let mut mantissa = hi >> (upperbit + 64 - F::MANTISSA_EXPLICIT_BITS as i32 - 3);
let mut power2 = power(q as i32) + upperbit - lz as i32 - F::MINIMUM_EXPONENT;
if power2 <= 0 {
if -power2 + 1 >= 64 {
// Have more than 64 bits below the minimum exponent, must be 0.
return fp_zero;
}
// Have a subnormal value.
mantissa >>= -power2 + 1;
mantissa += mantissa & 1;
mantissa >>= 1;
power2 = (mantissa >= (1_u64 << F::MANTISSA_EXPLICIT_BITS)) as i32;
return BiasedFp { f: mantissa, e: power2 };
}
// Need to handle rounding ties. Normally, we need to round up,
// but if we fall right in between and and we have an even basis, we
// need to round down.
//
// This will only occur if:
// 1. The lower 64 bits of the 128-bit representation is 0.
// IE, 5^q fits in single 64-bit word.
// 2. The least-significant bit prior to truncated mantissa is odd.
// 3. All the bits truncated when shifting to mantissa bits + 1 are 0.
//
// Or, we may fall between two floats: we are exactly halfway.
if lo <= 1
&& q >= F::MIN_EXPONENT_ROUND_TO_EVEN as i64
&& q <= F::MAX_EXPONENT_ROUND_TO_EVEN as i64
&& mantissa & 3 == 1
&& (mantissa << (upperbit + 64 - F::MANTISSA_EXPLICIT_BITS as i32 - 3)) == hi
{
// Zero the lowest bit, so we don't round up.
mantissa &= !1_u64;
}
// Round-to-even, then shift the significant digits into place.
mantissa += mantissa & 1;
mantissa >>= 1;
if mantissa >= (2_u64 << F::MANTISSA_EXPLICIT_BITS) {
// Rounding up overflowed, so the carry bit is set. Set the
// mantissa to 1 (only the implicit, hidden bit is set) and
// increase the exponent.
mantissa = 1_u64 << F::MANTISSA_EXPLICIT_BITS;
power2 += 1;
}
// Zero out the hidden bit.
mantissa &= !(1_u64 << F::MANTISSA_EXPLICIT_BITS);
if power2 >= F::INFINITE_POWER {
// Exponent is above largest normal value, must be infinite.
return fp_inf;
}
BiasedFp { f: mantissa, e: power2 }
}
/// Calculate a base 2 exponent from a decimal exponent.
/// This uses a pre-computed integer approximation for
/// log2(10), where 217706 / 2^16 is accurate for the
/// entire range of non-finite decimal exponents.
fn power(q: i32) -> i32 {
(q.wrapping_mul(152_170 + 65536) >> 16) + 63
}
fn full_multiplication(a: u64, b: u64) -> (u64, u64) {
let r = (a as u128) * (b as u128);
(r as u64, (r >> 64) as u64)
}
// This will compute or rather approximate w * 5**q and return a pair of 64-bit words
// approximating the result, with the "high" part corresponding to the most significant
// bits and the low part corresponding to the least significant bits.
fn compute_product_approx(q: i64, w: u64, precision: usize) -> (u64, u64) {
debug_assert!(q >= SMALLEST_POWER_OF_FIVE as i64);
debug_assert!(q <= LARGEST_POWER_OF_FIVE as i64);
debug_assert!(precision <= 64);
let mask = if precision < 64 {
0xFFFF_FFFF_FFFF_FFFF_u64 >> precision
} else {
0xFFFF_FFFF_FFFF_FFFF_u64
};
// 5^q < 2^64, then the multiplication always provides an exact value.
// That means whenever we need to round ties to even, we always have
// an exact value.
let index = (q - SMALLEST_POWER_OF_FIVE as i64) as usize;
let (lo5, hi5) = POWER_OF_FIVE_128[index];
// Only need one multiplication as long as there is 1 zero but
// in the explicit mantissa bits, +1 for the hidden bit, +1 to
// determine the rounding direction, +1 for if the computed
// product has a leading zero.
let (mut first_lo, mut first_hi) = full_multiplication(w, lo5);
if first_hi & mask == mask {
// Need to do a second multiplication to get better precision
// for the lower product. This will always be exact
// where q is < 55, since 5^55 < 2^128. If this wraps,
// then we need to need to round up the hi product.
let (_, second_hi) = full_multiplication(w, hi5);
first_lo = first_lo.wrapping_add(second_hi);
if second_hi > first_lo {
first_hi += 1;
}
}
(first_lo, first_hi)
}

View file

@ -27,20 +27,12 @@
//!
//! We then try a long chain of progressively more general and expensive special cases using
//! machine-sized integers and small, fixed-sized floating point numbers (first `f32`/`f64`, then
//! a type with 64 bit significand, `Fp`). When all these fail, we bite the bullet and resort to a
//! simple but very slow algorithm that involved computing `f * 10^e` fully and doing an iterative
//! search for the best approximation.
//!
//! Primarily, this module and its children implement the algorithms described in:
//! "How to Read Floating Point Numbers Accurately" by William D. Clinger,
//! available online: <https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.4152>
//!
//! In addition, there are numerous helper functions that are used in the paper but not available
//! in Rust (or at least in core). Our version is additionally complicated by the need to handle
//! overflow and underflow and the desire to handle subnormal numbers. Bellerophon and
//! Algorithm R have trouble with overflow, subnormals, and underflow. We conservatively switch to
//! Algorithm M (with the modifications described in section 8 of the paper) well before the
//! inputs get into the critical region.
//! a type with 64 bit significand). The extended-precision algorithm
//! uses the Eisel-Lemire algorithm, which uses a 128-bit (or 192-bit)
//! representation that can accurately and quickly compute the vast majority
//! of floats. When all these fail, we bite the bullet and resort to using
//! a large-decimal representation, shifting the digits into range, calculating
//! the upper significant bits and exactly round to the nearest representation.
//!
//! Another aspect that needs attention is the ``RawFloat`` trait by which almost all functions
//! are parametrized. One might think that it's enough to parse to `f64` and cast the result to
@ -54,10 +46,9 @@
//! operations as well, if you want 0.5 ULP accuracy you need to do *everything* in full precision
//! and round *exactly once, at the end*, by considering all truncated bits at once.
//!
//! FIXME: Although some code duplication is necessary, perhaps parts of the code could be shuffled
//! around such that less code is duplicated. Large parts of the algorithms are independent of the
//! float type to output, or only needs access to a few constants, which could be passed in as
//! parameters.
//! Primarily, this module and its children implement the algorithms described in:
//! "Number Parsing at a Gigabyte per Second", available online:
//! <https://arxiv.org/abs/2101.11408>.
//!
//! # Other
//!
@ -87,16 +78,22 @@
use crate::fmt;
use crate::str::FromStr;
use self::num::digits_to_big;
use self::parse::{parse_decimal, Decimal, ParseResult, Sign};
use self::rawfp::RawFloat;
use self::common::{BiasedFp, ByteSlice};
use self::float::RawFloat;
use self::lemire::compute_float;
use self::parse::{parse_inf_nan, parse_number};
use self::slow::parse_long_mantissa;
mod algorithm;
mod num;
mod common;
mod decimal;
mod fpu;
mod slow;
mod table;
// These two have their own tests.
// float is used in flt2dec, and all are used in unit tests.
pub mod float;
pub mod lemire;
pub mod number;
pub mod parse;
pub mod rawfp;
macro_rules! from_str_float_impl {
($t:ty) => {
@ -136,13 +133,6 @@ macro_rules! from_str_float_impl {
///
/// [EBNF]: https://www.w3.org/TR/REC-xml/#sec-notation
///
/// # Known bugs
///
/// In some situations, some strings that should create a valid float
/// instead return an error. See [issue #31407] for details.
///
/// [issue #31407]: https://github.com/rust-lang/rust/issues/31407
///
/// # Arguments
///
/// * src - A string
@ -211,148 +201,70 @@ impl fmt::Display for ParseFloatError {
}
}
fn pfe_empty() -> ParseFloatError {
pub(super) fn pfe_empty() -> ParseFloatError {
ParseFloatError { kind: FloatErrorKind::Empty }
}
fn pfe_invalid() -> ParseFloatError {
// Used in unit tests, keep public.
// This is much better than making FloatErrorKind and ParseFloatError::kind public.
pub fn pfe_invalid() -> ParseFloatError {
ParseFloatError { kind: FloatErrorKind::Invalid }
}
/// Splits a decimal string into sign and the rest, without inspecting or validating the rest.
fn extract_sign(s: &str) -> (Sign, &str) {
match s.as_bytes()[0] {
b'+' => (Sign::Positive, &s[1..]),
b'-' => (Sign::Negative, &s[1..]),
// If the string is invalid, we never use the sign, so we don't need to validate here.
_ => (Sign::Positive, s),
}
/// Converts a `BiasedFp` to the closest machine float type.
fn biased_fp_to_float<T: RawFloat>(x: BiasedFp) -> T {
let mut word = x.f;
word |= (x.e as u64) << T::MANTISSA_EXPLICIT_BITS;
T::from_u64_bits(word)
}
/// Converts a decimal string into a floating point number.
fn dec2flt<T: RawFloat>(s: &str) -> Result<T, ParseFloatError> {
if s.is_empty() {
pub fn dec2flt<F: RawFloat>(s: &str) -> Result<F, ParseFloatError> {
let mut s = s.as_bytes();
let c = if let Some(&c) = s.first() {
c
} else {
return Err(pfe_empty());
};
let negative = c == b'-';
if c == b'-' || c == b'+' {
s = s.advance(1);
}
let (sign, s) = extract_sign(s);
let flt = match parse_decimal(s) {
ParseResult::Valid(decimal) => convert(decimal)?,
ParseResult::ShortcutToInf => T::INFINITY,
ParseResult::ShortcutToZero => T::ZERO,
ParseResult::Invalid => {
if s.eq_ignore_ascii_case("nan") {
T::NAN
} else if s.eq_ignore_ascii_case("inf") || s.eq_ignore_ascii_case("infinity") {
T::INFINITY
if s.is_empty() {
return Err(pfe_invalid());
}
let num = match parse_number(s, negative) {
Some(r) => r,
None => {
if let Some(value) = parse_inf_nan(s, negative) {
return Ok(value);
} else {
return Err(pfe_invalid());
}
}
};
match sign {
Sign::Positive => Ok(flt),
Sign::Negative => Ok(-flt),
if let Some(value) = num.try_fast_path::<F>() {
return Ok(value);
}
}
/// The main workhorse for the decimal-to-float conversion: Orchestrate all the preprocessing
/// and figure out which algorithm should do the actual conversion.
fn convert<T: RawFloat>(mut decimal: Decimal<'_>) -> Result<T, ParseFloatError> {
simplify(&mut decimal);
if let Some(x) = trivial_cases(&decimal) {
return Ok(x);
}
// Remove/shift out the decimal point.
let e = decimal.exp - decimal.fractional.len() as i64;
if let Some(x) = algorithm::fast_path(decimal.integral, decimal.fractional, e) {
return Ok(x);
}
// Big32x40 is limited to 1280 bits, which translates to about 385 decimal digits.
// If we exceed this, we'll crash, so we error out before getting too close (within 10^10).
let upper_bound = bound_intermediate_digits(&decimal, e);
if upper_bound > 375 {
return Err(pfe_invalid());
}
let f = digits_to_big(decimal.integral, decimal.fractional);
// Now the exponent certainly fits in 16 bit, which is used throughout the main algorithms.
let e = e as i16;
// FIXME These bounds are rather conservative. A more careful analysis of the failure modes
// of Bellerophon could allow using it in more cases for a massive speed up.
let exponent_in_range = table::MIN_E <= e && e <= table::MAX_E;
let value_in_range = upper_bound <= T::MAX_NORMAL_DIGITS as u64;
if exponent_in_range && value_in_range {
Ok(algorithm::bellerophon(&f, e))
} else {
Ok(algorithm::algorithm_m(&f, e))
}
}
// As written, this optimizes badly (see #27130, though it refers to an old version of the code).
// `inline(always)` is a workaround for that. There are only two call sites overall and it doesn't
// make code size worse.
/// Strip zeros where possible, even when this requires changing the exponent
#[inline(always)]
fn simplify(decimal: &mut Decimal<'_>) {
let is_zero = &|&&d: &&u8| -> bool { d == b'0' };
// Trimming these zeros does not change anything but may enable the fast path (< 15 digits).
let leading_zeros = decimal.integral.iter().take_while(is_zero).count();
decimal.integral = &decimal.integral[leading_zeros..];
let trailing_zeros = decimal.fractional.iter().rev().take_while(is_zero).count();
let end = decimal.fractional.len() - trailing_zeros;
decimal.fractional = &decimal.fractional[..end];
// Simplify numbers of the form 0.0...x and x...0.0, adjusting the exponent accordingly.
// This may not always be a win (possibly pushes some numbers out of the fast path), but it
// simplifies other parts significantly (notably, approximating the magnitude of the value).
if decimal.integral.is_empty() {
let leading_zeros = decimal.fractional.iter().take_while(is_zero).count();
decimal.fractional = &decimal.fractional[leading_zeros..];
decimal.exp -= leading_zeros as i64;
} else if decimal.fractional.is_empty() {
let trailing_zeros = decimal.integral.iter().rev().take_while(is_zero).count();
let end = decimal.integral.len() - trailing_zeros;
decimal.integral = &decimal.integral[..end];
decimal.exp += trailing_zeros as i64;
}
}
/// Returns a quick-an-dirty upper bound on the size (log10) of the largest value that Algorithm R
/// and Algorithm M will compute while working on the given decimal.
fn bound_intermediate_digits(decimal: &Decimal<'_>, e: i64) -> u64 {
// We don't need to worry too much about overflow here thanks to trivial_cases() and the
// parser, which filter out the most extreme inputs for us.
let f_len: u64 = decimal.integral.len() as u64 + decimal.fractional.len() as u64;
if e >= 0 {
// In the case e >= 0, both algorithms compute about `f * 10^e`. Algorithm R proceeds to
// do some complicated calculations with this but we can ignore that for the upper bound
// because it also reduces the fraction beforehand, so we have plenty of buffer there.
f_len + (e as u64)
} else {
// If e < 0, Algorithm R does roughly the same thing, but Algorithm M differs:
// It tries to find a positive number k such that `f << k / 10^e` is an in-range
// significand. This will result in about `2^53 * f * 10^e` < `10^17 * f * 10^e`.
// One input that triggers this is 0.33...33 (375 x 3).
f_len + e.unsigned_abs() + 17
}
}
/// Detects obvious overflows and underflows without even looking at the decimal digits.
fn trivial_cases<T: RawFloat>(decimal: &Decimal<'_>) -> Option<T> {
// There were zeros but they were stripped by simplify()
if decimal.integral.is_empty() && decimal.fractional.is_empty() {
return Some(T::ZERO);
}
// This is a crude approximation of ceil(log10(the real value)). We don't need to worry too
// much about overflow here because the input length is tiny (at least compared to 2^64) and
// the parser already handles exponents whose absolute value is greater than 10^18
// (which is still 10^19 short of 2^64).
let max_place = decimal.exp + decimal.integral.len() as i64;
if max_place > T::INF_CUTOFF {
return Some(T::INFINITY);
} else if max_place < T::ZERO_CUTOFF {
return Some(T::ZERO);
}
None
// If significant digits were truncated, then we can have rounding error
// only if `mantissa + 1` produces a different result. We also avoid
// redundantly using the Eisel-Lemire algorithm if it was unable to
// correctly round on the first pass.
let mut fp = compute_float::<F>(num.exponent, num.mantissa);
if num.many_digits && fp.e >= 0 && fp != compute_float::<F>(num.exponent, num.mantissa + 1) {
fp.e = -1;
}
// Unable to correctly round the float using the Eisel-Lemire algorithm.
// Fallback to a slower, but always correct algorithm.
if fp.e < 0 {
fp = parse_long_mantissa::<F>(s);
}
let mut float = biased_fp_to_float::<F>(fp);
if num.negative {
float = -float;
}
Ok(float)
}

View file

@ -1,81 +0,0 @@
//! Utility functions for bignums that don't make too much sense to turn into methods.
// FIXME This module's name is a bit unfortunate, since other modules also import `core::num`.
use crate::cmp::Ordering::{self, Equal, Greater, Less};
pub use crate::num::bignum::Big32x40 as Big;
/// Test whether truncating all bits less significant than `ones_place` introduces
/// a relative error less, equal, or greater than 0.5 ULP.
pub fn compare_with_half_ulp(f: &Big, ones_place: usize) -> Ordering {
if ones_place == 0 {
return Less;
}
let half_bit = ones_place - 1;
if f.get_bit(half_bit) == 0 {
// < 0.5 ULP
return Less;
}
// If all remaining bits are zero, it's = 0.5 ULP, otherwise > 0.5
// If there are no more bits (half_bit == 0), the below also correctly returns Equal.
for i in 0..half_bit {
if f.get_bit(i) == 1 {
return Greater;
}
}
Equal
}
/// Converts an ASCII string containing only decimal digits to a `u64`.
///
/// Does not perform checks for overflow or invalid characters, so if the caller is not careful,
/// the result is bogus and can panic (though it won't be `unsafe`). Additionally, empty strings
/// are treated as zero. This function exists because
///
/// 1. using `FromStr` on `&[u8]` requires `from_utf8_unchecked`, which is bad, and
/// 2. piecing together the results of `integral.parse()` and `fractional.parse()` is
/// more complicated than this entire function.
pub fn from_str_unchecked<'a, T>(bytes: T) -> u64
where
T: IntoIterator<Item = &'a u8>,
{
let mut result = 0;
for &c in bytes {
result = result * 10 + (c - b'0') as u64;
}
result
}
/// Converts a string of ASCII digits into a bignum.
///
/// Like `from_str_unchecked`, this function relies on the parser to weed out non-digits.
pub fn digits_to_big(integral: &[u8], fractional: &[u8]) -> Big {
let mut f = Big::from_small(0);
for &c in integral.iter().chain(fractional) {
let n = (c - b'0') as u32;
f.mul_small(10);
f.add_small(n);
}
f
}
/// Unwraps a bignum into a 64 bit integer. Panics if the number is too large.
pub fn to_u64(x: &Big) -> u64 {
assert!(x.bit_length() < 64);
let d = x.digits();
if d.len() < 2 { d[0] as u64 } else { (d[1] as u64) << 32 | d[0] as u64 }
}
/// Extracts a range of bits.
/// Index 0 is the least significant bit and the range is half-open as usual.
/// Panics if asked to extract more bits than fit into the return type.
pub fn get_bits(x: &Big, start: usize, end: usize) -> u64 {
assert!(end - start <= 64);
let mut result: u64 = 0;
for i in (start..end).rev() {
result = result << 1 | x.get_bit(i) as u64;
}
result
}

View file

@ -0,0 +1,86 @@
//! Representation of a float as the significant digits and exponent.
use crate::num::dec2flt::float::RawFloat;
use crate::num::dec2flt::fpu::set_precision;
#[rustfmt::skip]
const INT_POW10: [u64; 16] = [
1,
10,
100,
1000,
10000,
100000,
1000000,
10000000,
100000000,
1000000000,
10000000000,
100000000000,
1000000000000,
10000000000000,
100000000000000,
1000000000000000,
];
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct Number {
pub exponent: i64,
pub mantissa: u64,
pub negative: bool,
pub many_digits: bool,
}
impl Number {
/// Detect if the float can be accurately reconstructed from native floats.
fn is_fast_path<F: RawFloat>(&self) -> bool {
F::MIN_EXPONENT_FAST_PATH <= self.exponent
&& self.exponent <= F::MAX_EXPONENT_DISGUISED_FAST_PATH
&& self.mantissa <= F::MAX_MANTISSA_FAST_PATH
&& !self.many_digits
}
/// The fast path algorithmn using machine-sized integers and floats.
///
/// This is extracted into a separate function so that it can be attempted before constructing
/// a Decimal. This only works if both the mantissa and the exponent
/// can be exactly represented as a machine float, since IEE-754 guarantees
/// no rounding will occur.
///
/// There is an exception: disguised fast-path cases, where we can shift
/// powers-of-10 from the exponent to the significant digits.
pub fn try_fast_path<F: RawFloat>(&self) -> Option<F> {
// The fast path crucially depends on arithmetic being rounded to the correct number of bits
// without any intermediate rounding. On x86 (without SSE or SSE2) this requires the precision
// of the x87 FPU stack to be changed so that it directly rounds to 64/32 bit.
// The `set_precision` function takes care of setting the precision on architectures which
// require setting it by changing the global state (like the control word of the x87 FPU).
let _cw = set_precision::<F>();
if self.is_fast_path::<F>() {
let mut value = if self.exponent <= F::MAX_EXPONENT_FAST_PATH {
// normal fast path
let value = F::from_u64(self.mantissa);
if self.exponent < 0 {
value / F::pow10_fast_path((-self.exponent) as _)
} else {
value * F::pow10_fast_path(self.exponent as _)
}
} else {
// disguised fast path
let shift = self.exponent - F::MAX_EXPONENT_FAST_PATH;
let mantissa = self.mantissa.checked_mul(INT_POW10[shift as usize])?;
if mantissa > F::MAX_MANTISSA_FAST_PATH {
return None;
}
F::from_u64(mantissa) * F::pow10_fast_path(F::MAX_EXPONENT_FAST_PATH as _)
};
if self.negative {
value = -value;
}
Some(value)
} else {
None
}
}
}

View file

@ -1,121 +1,233 @@
//! Validating and decomposing a decimal string of the form:
//!
//! `(digits | digits? '.'? digits?) (('e' | 'E') ('+' | '-')? digits)?`
//!
//! In other words, standard floating-point syntax, with two exceptions: No sign, and no
//! handling of "inf" and "NaN". These are handled by the driver function (super::dec2flt).
//!
//! Although recognizing valid inputs is relatively easy, this module also has to reject the
//! countless invalid variations, never panic, and perform numerous checks that the other
//! modules rely on to not panic (or overflow) in turn.
//! To make matters worse, all that happens in a single pass over the input.
//! So, be careful when modifying anything, and double-check with the other modules.
use self::ParseResult::{Invalid, ShortcutToInf, ShortcutToZero, Valid};
use super::num;
//! Functions to parse floating-point numbers.
#[derive(Debug)]
pub enum Sign {
Positive,
Negative,
use crate::num::dec2flt::common::{is_8digits, AsciiStr, ByteSlice};
use crate::num::dec2flt::float::RawFloat;
use crate::num::dec2flt::number::Number;
const MIN_19DIGIT_INT: u64 = 100_0000_0000_0000_0000;
/// Parse 8 digits, loaded as bytes in little-endian order.
///
/// This uses the trick where every digit is in [0x030, 0x39],
/// and therefore can be parsed in 3 multiplications, much
/// faster than the normal 8.
///
/// This is based off the algorithm described in "Fast numeric string to
/// int", available here: <https://johnnylee-sde.github.io/Fast-numeric-string-to-int/>.
fn parse_8digits(mut v: u64) -> u64 {
const MASK: u64 = 0x0000_00FF_0000_00FF;
const MUL1: u64 = 0x000F_4240_0000_0064;
const MUL2: u64 = 0x0000_2710_0000_0001;
v -= 0x3030_3030_3030_3030;
v = (v * 10) + (v >> 8); // will not overflow, fits in 63 bits
let v1 = (v & MASK).wrapping_mul(MUL1);
let v2 = ((v >> 16) & MASK).wrapping_mul(MUL2);
((v1.wrapping_add(v2) >> 32) as u32) as u64
}
#[derive(Debug, PartialEq, Eq)]
/// The interesting parts of a decimal string.
pub struct Decimal<'a> {
pub integral: &'a [u8],
pub fractional: &'a [u8],
/// The decimal exponent, guaranteed to have fewer than 18 decimal digits.
pub exp: i64,
/// Parse digits until a non-digit character is found.
fn try_parse_digits(s: &mut AsciiStr<'_>, x: &mut u64) {
// may cause overflows, to be handled later
s.parse_digits(|digit| {
*x = x.wrapping_mul(10).wrapping_add(digit as _);
});
}
impl<'a> Decimal<'a> {
pub fn new(integral: &'a [u8], fractional: &'a [u8], exp: i64) -> Decimal<'a> {
Decimal { integral, fractional, exp }
}
}
#[derive(Debug, PartialEq, Eq)]
pub enum ParseResult<'a> {
Valid(Decimal<'a>),
ShortcutToInf,
ShortcutToZero,
Invalid,
}
/// Checks if the input string is a valid floating point number and if so, locate the integral
/// part, the fractional part, and the exponent in it. Does not handle signs.
pub fn parse_decimal(s: &str) -> ParseResult<'_> {
if s.is_empty() {
return Invalid;
}
let s = s.as_bytes();
let (integral, s) = eat_digits(s);
match s.first() {
None => Valid(Decimal::new(integral, b"", 0)),
Some(&b'e' | &b'E') => {
if integral.is_empty() {
return Invalid; // No digits before 'e'
/// Parse up to 19 digits (the max that can be stored in a 64-bit integer).
fn try_parse_19digits(s: &mut AsciiStr<'_>, x: &mut u64) {
while *x < MIN_19DIGIT_INT {
if let Some(&c) = s.as_ref().first() {
let digit = c.wrapping_sub(b'0');
if digit < 10 {
*x = (*x * 10) + digit as u64; // no overflows here
// SAFETY: cannot be empty
unsafe {
s.step();
}
} else {
break;
}
parse_exp(integral, b"", &s[1..])
} else {
break;
}
Some(&b'.') => {
let (fractional, s) = eat_digits(&s[1..]);
if integral.is_empty() && fractional.is_empty() {
// We require at least a single digit before or after the point.
return Invalid;
}
}
}
match s.first() {
None => Valid(Decimal::new(integral, fractional, 0)),
Some(&b'e' | &b'E') => parse_exp(integral, fractional, &s[1..]),
_ => Invalid, // Trailing junk after fractional part
/// Try to parse 8 digits at a time, using an optimized algorithm.
fn try_parse_8digits(s: &mut AsciiStr<'_>, x: &mut u64) {
// may cause overflows, to be handled later
if let Some(v) = s.read_u64() {
if is_8digits(v) {
*x = x.wrapping_mul(1_0000_0000).wrapping_add(parse_8digits(v));
// SAFETY: already ensured the buffer was >= 8 bytes in read_u64.
unsafe {
s.step_by(8);
}
if let Some(v) = s.read_u64() {
if is_8digits(v) {
*x = x.wrapping_mul(1_0000_0000).wrapping_add(parse_8digits(v));
// SAFETY: already ensured the buffer was >= 8 bytes in try_read_u64.
unsafe {
s.step_by(8);
}
}
}
}
_ => Invalid, // Trailing junk after first digit string
}
}
/// Carves off decimal digits up to the first non-digit character.
fn eat_digits(s: &[u8]) -> (&[u8], &[u8]) {
let pos = s.iter().position(|c| !c.is_ascii_digit()).unwrap_or(s.len());
s.split_at(pos)
/// Parse the scientific notation component of a float.
fn parse_scientific(s: &mut AsciiStr<'_>) -> Option<i64> {
let mut exponent = 0_i64;
let mut negative = false;
if let Some(&c) = s.as_ref().get(0) {
negative = c == b'-';
if c == b'-' || c == b'+' {
// SAFETY: s cannot be empty
unsafe {
s.step();
}
}
}
if s.first_isdigit() {
s.parse_digits(|digit| {
// no overflows here, saturate well before overflow
if exponent < 0x10000 {
exponent = 10 * exponent + digit as i64;
}
});
if negative { Some(-exponent) } else { Some(exponent) }
} else {
None
}
}
/// Exponent extraction and error checking.
fn parse_exp<'a>(integral: &'a [u8], fractional: &'a [u8], rest: &'a [u8]) -> ParseResult<'a> {
let (sign, rest) = match rest.first() {
Some(&b'-') => (Sign::Negative, &rest[1..]),
Some(&b'+') => (Sign::Positive, &rest[1..]),
_ => (Sign::Positive, rest),
};
let (mut number, trailing) = eat_digits(rest);
if !trailing.is_empty() {
return Invalid; // Trailing junk after exponent
/// Parse a partial, non-special floating point number.
///
/// This creates a representation of the float as the
/// significant digits and the decimal exponent.
fn parse_partial_number(s: &[u8], negative: bool) -> Option<(Number, usize)> {
let mut s = AsciiStr::new(s);
let start = s;
debug_assert!(!s.is_empty());
// parse initial digits before dot
let mut mantissa = 0_u64;
let digits_start = s;
try_parse_digits(&mut s, &mut mantissa);
let mut n_digits = s.offset_from(&digits_start);
// handle dot with the following digits
let mut n_after_dot = 0;
let mut exponent = 0_i64;
let int_end = s;
if s.first_is(b'.') {
// SAFETY: s cannot be empty due to first_is
unsafe { s.step() };
let before = s;
try_parse_8digits(&mut s, &mut mantissa);
try_parse_digits(&mut s, &mut mantissa);
n_after_dot = s.offset_from(&before);
exponent = -n_after_dot as i64;
}
if number.is_empty() {
return Invalid; // Empty exponent
n_digits += n_after_dot;
if n_digits == 0 {
return None;
}
// At this point, we certainly have a valid string of digits. It may be too long to put into
// an `i64`, but if it's that huge, the input is certainly zero or infinity. Since each zero
// in the decimal digits only adjusts the exponent by +/- 1, at exp = 10^18 the input would
// have to be 17 exabyte (!) of zeros to get even remotely close to being finite.
// This is not exactly a use case we need to cater to.
while number.first() == Some(&b'0') {
number = &number[1..];
// handle scientific format
let mut exp_number = 0_i64;
if s.first_is2(b'e', b'E') {
// SAFETY: s cannot be empty
unsafe {
s.step();
}
// If None, we have no trailing digits after exponent, or an invalid float.
exp_number = parse_scientific(&mut s)?;
exponent += exp_number;
}
if number.len() >= 18 {
return match sign {
Sign::Positive => ShortcutToInf,
Sign::Negative => ShortcutToZero,
};
let len = s.offset_from(&start) as _;
// handle uncommon case with many digits
if n_digits <= 19 {
return Some((Number { exponent, mantissa, negative, many_digits: false }, len));
}
let abs_exp = num::from_str_unchecked(number);
let e = match sign {
Sign::Positive => abs_exp as i64,
Sign::Negative => -(abs_exp as i64),
};
Valid(Decimal::new(integral, fractional, e))
n_digits -= 19;
let mut many_digits = false;
let mut p = digits_start;
while p.first_is2(b'0', b'.') {
// SAFETY: p cannot be empty due to first_is2
unsafe {
// '0' = b'.' + 2
n_digits -= p.first_unchecked().saturating_sub(b'0' - 1) as isize;
p.step();
}
}
if n_digits > 0 {
// at this point we have more than 19 significant digits, let's try again
many_digits = true;
mantissa = 0;
let mut s = digits_start;
try_parse_19digits(&mut s, &mut mantissa);
exponent = if mantissa >= MIN_19DIGIT_INT {
// big int
int_end.offset_from(&s)
} else {
// SAFETY: the next byte must be present and be '.'
// We know this is true because we had more than 19
// digits previously, so we overflowed a 64-bit integer,
// but parsing only the integral digits produced less
// than 19 digits. That means we must have a decimal
// point, and at least 1 fractional digit.
unsafe { s.step() };
let before = s;
try_parse_19digits(&mut s, &mut mantissa);
-s.offset_from(&before)
} as i64;
// add back the explicit part
exponent += exp_number;
}
Some((Number { exponent, mantissa, negative, many_digits }, len))
}
/// Try to parse a non-special floating point number.
pub fn parse_number(s: &[u8], negative: bool) -> Option<Number> {
if let Some((float, rest)) = parse_partial_number(s, negative) {
if rest == s.len() {
return Some(float);
}
}
None
}
/// Parse a partial representation of a special, non-finite float.
fn parse_partial_inf_nan<F: RawFloat>(s: &[u8]) -> Option<(F, usize)> {
fn parse_inf_rest(s: &[u8]) -> usize {
if s.len() >= 8 && s[3..].as_ref().eq_ignore_case(b"inity") { 8 } else { 3 }
}
if s.len() >= 3 {
if s.eq_ignore_case(b"nan") {
return Some((F::NAN, 3));
} else if s.eq_ignore_case(b"inf") {
return Some((F::INFINITY, parse_inf_rest(s)));
}
}
None
}
/// Try to parse a special, non-finite float.
pub fn parse_inf_nan<F: RawFloat>(s: &[u8], negative: bool) -> Option<F> {
if let Some((mut float, rest)) = parse_partial_inf_nan::<F>(s) {
if rest == s.len() {
if negative {
float = -float;
}
return Some(float);
}
}
None
}

View file

@ -1,363 +0,0 @@
//! Bit fiddling on positive IEEE 754 floats. Negative numbers aren't and needn't be handled.
//! Normal floating point numbers have a canonical representation as (frac, exp) such that the
//! value is 2<sup>exp</sup> * (1 + sum(frac[N-i] / 2<sup>i</sup>)) where N is the number of bits.
//! Subnormals are slightly different and weird, but the same principle applies.
//!
//! Here, however, we represent them as (sig, k) with f positive, such that the value is f *
//! 2<sup>e</sup>. Besides making the "hidden bit" explicit, this changes the exponent by the
//! so-called mantissa shift.
//!
//! Put another way, normally floats are written as (1) but here they are written as (2):
//!
//! 1. `1.101100...11 * 2^m`
//! 2. `1101100...11 * 2^n`
//!
//! We call (1) the **fractional representation** and (2) the **integral representation**.
//!
//! Many functions in this module only handle normal numbers. The dec2flt routines conservatively
//! take the universally-correct slow path (Algorithm M) for very small and very large numbers.
//! That algorithm needs only next_float() which does handle subnormals and zeros.
use crate::cmp::Ordering::{Equal, Greater, Less};
use crate::convert::{TryFrom, TryInto};
use crate::fmt::{Debug, LowerExp};
use crate::num::dec2flt::num::{self, Big};
use crate::num::dec2flt::table;
use crate::num::diy_float::Fp;
use crate::num::FpCategory;
use crate::num::FpCategory::{Infinite, Nan, Normal, Subnormal, Zero};
use crate::ops::{Add, Div, Mul, Neg};
#[derive(Copy, Clone, Debug)]
pub struct Unpacked {
pub sig: u64,
pub k: i16,
}
impl Unpacked {
pub fn new(sig: u64, k: i16) -> Self {
Unpacked { sig, k }
}
}
/// A helper trait to avoid duplicating basically all the conversion code for `f32` and `f64`.
///
/// See the parent module's doc comment for why this is necessary.
///
/// Should **never ever** be implemented for other types or be used outside the dec2flt module.
pub trait RawFloat:
Copy + Debug + LowerExp + Mul<Output = Self> + Div<Output = Self> + Neg<Output = Self>
{
const INFINITY: Self;
const NAN: Self;
const ZERO: Self;
/// Type used by `to_bits` and `from_bits`.
type Bits: Add<Output = Self::Bits> + From<u8> + TryFrom<u64>;
/// Performs a raw transmutation to an integer.
fn to_bits(self) -> Self::Bits;
/// Performs a raw transmutation from an integer.
fn from_bits(v: Self::Bits) -> Self;
/// Returns the category that this number falls into.
fn classify(self) -> FpCategory;
/// Returns the mantissa, exponent and sign as integers.
fn integer_decode(self) -> (u64, i16, i8);
/// Decodes the float.
fn unpack(self) -> Unpacked;
/// Casts from a small integer that can be represented exactly. Panic if the integer can't be
/// represented, the other code in this module makes sure to never let that happen.
fn from_int(x: u64) -> Self;
/// Gets the value 10<sup>e</sup> from a pre-computed table.
/// Panics for `e >= CEIL_LOG5_OF_MAX_SIG`.
fn short_fast_pow10(e: usize) -> Self;
/// What the name says. It's easier to hard code than juggling intrinsics and
/// hoping LLVM constant folds it.
const CEIL_LOG5_OF_MAX_SIG: i16;
// A conservative bound on the decimal digits of inputs that can't produce overflow or zero or
/// subnormals. Probably the decimal exponent of the maximum normal value, hence the name.
const MAX_NORMAL_DIGITS: usize;
/// When the most significant decimal digit has a place value greater than this, the number
/// is certainly rounded to infinity.
const INF_CUTOFF: i64;
/// When the most significant decimal digit has a place value less than this, the number
/// is certainly rounded to zero.
const ZERO_CUTOFF: i64;
/// The number of bits in the exponent.
const EXP_BITS: u8;
/// The number of bits in the significand, *including* the hidden bit.
const SIG_BITS: u8;
/// The number of bits in the significand, *excluding* the hidden bit.
const EXPLICIT_SIG_BITS: u8;
/// The maximum legal exponent in fractional representation.
const MAX_EXP: i16;
/// The minimum legal exponent in fractional representation, excluding subnormals.
const MIN_EXP: i16;
/// `MAX_EXP` for integral representation, i.e., with the shift applied.
const MAX_EXP_INT: i16;
/// `MAX_EXP` encoded (i.e., with offset bias)
const MAX_ENCODED_EXP: i16;
/// `MIN_EXP` for integral representation, i.e., with the shift applied.
const MIN_EXP_INT: i16;
/// The maximum normalized significand in integral representation.
const MAX_SIG: u64;
/// The minimal normalized significand in integral representation.
const MIN_SIG: u64;
}
// Mostly a workaround for #34344.
macro_rules! other_constants {
($type: ident) => {
const EXPLICIT_SIG_BITS: u8 = Self::SIG_BITS - 1;
const MAX_EXP: i16 = (1 << (Self::EXP_BITS - 1)) - 1;
const MIN_EXP: i16 = -<Self as RawFloat>::MAX_EXP + 1;
const MAX_EXP_INT: i16 = <Self as RawFloat>::MAX_EXP - (Self::SIG_BITS as i16 - 1);
const MAX_ENCODED_EXP: i16 = (1 << Self::EXP_BITS) - 1;
const MIN_EXP_INT: i16 = <Self as RawFloat>::MIN_EXP - (Self::SIG_BITS as i16 - 1);
const MAX_SIG: u64 = (1 << Self::SIG_BITS) - 1;
const MIN_SIG: u64 = 1 << (Self::SIG_BITS - 1);
const INFINITY: Self = $type::INFINITY;
const NAN: Self = $type::NAN;
const ZERO: Self = 0.0;
};
}
impl RawFloat for f32 {
type Bits = u32;
const SIG_BITS: u8 = 24;
const EXP_BITS: u8 = 8;
const CEIL_LOG5_OF_MAX_SIG: i16 = 11;
const MAX_NORMAL_DIGITS: usize = 35;
const INF_CUTOFF: i64 = 40;
const ZERO_CUTOFF: i64 = -48;
other_constants!(f32);
/// Returns the mantissa, exponent and sign as integers.
fn integer_decode(self) -> (u64, i16, i8) {
let bits = self.to_bits();
let sign: i8 = if bits >> 31 == 0 { 1 } else { -1 };
let mut exponent: i16 = ((bits >> 23) & 0xff) as i16;
let mantissa =
if exponent == 0 { (bits & 0x7fffff) << 1 } else { (bits & 0x7fffff) | 0x800000 };
// Exponent bias + mantissa shift
exponent -= 127 + 23;
(mantissa as u64, exponent, sign)
}
fn unpack(self) -> Unpacked {
let (sig, exp, _sig) = self.integer_decode();
Unpacked::new(sig, exp)
}
fn from_int(x: u64) -> f32 {
// rkruppe is uncertain whether `as` rounds correctly on all platforms.
debug_assert!(x as f32 == fp_to_float(Fp { f: x, e: 0 }));
x as f32
}
fn short_fast_pow10(e: usize) -> Self {
table::F32_SHORT_POWERS[e]
}
fn classify(self) -> FpCategory {
self.classify()
}
fn to_bits(self) -> Self::Bits {
self.to_bits()
}
fn from_bits(v: Self::Bits) -> Self {
Self::from_bits(v)
}
}
impl RawFloat for f64 {
type Bits = u64;
const SIG_BITS: u8 = 53;
const EXP_BITS: u8 = 11;
const CEIL_LOG5_OF_MAX_SIG: i16 = 23;
const MAX_NORMAL_DIGITS: usize = 305;
const INF_CUTOFF: i64 = 310;
const ZERO_CUTOFF: i64 = -326;
other_constants!(f64);
/// Returns the mantissa, exponent and sign as integers.
fn integer_decode(self) -> (u64, i16, i8) {
let bits = self.to_bits();
let sign: i8 = if bits >> 63 == 0 { 1 } else { -1 };
let mut exponent: i16 = ((bits >> 52) & 0x7ff) as i16;
let mantissa = if exponent == 0 {
(bits & 0xfffffffffffff) << 1
} else {
(bits & 0xfffffffffffff) | 0x10000000000000
};
// Exponent bias + mantissa shift
exponent -= 1023 + 52;
(mantissa, exponent, sign)
}
fn unpack(self) -> Unpacked {
let (sig, exp, _sig) = self.integer_decode();
Unpacked::new(sig, exp)
}
fn from_int(x: u64) -> f64 {
// rkruppe is uncertain whether `as` rounds correctly on all platforms.
debug_assert!(x as f64 == fp_to_float(Fp { f: x, e: 0 }));
x as f64
}
fn short_fast_pow10(e: usize) -> Self {
table::F64_SHORT_POWERS[e]
}
fn classify(self) -> FpCategory {
self.classify()
}
fn to_bits(self) -> Self::Bits {
self.to_bits()
}
fn from_bits(v: Self::Bits) -> Self {
Self::from_bits(v)
}
}
/// Converts an `Fp` to the closest machine float type.
/// Does not handle subnormal results.
pub fn fp_to_float<T: RawFloat>(x: Fp) -> T {
let x = x.normalize();
// x.f is 64 bit, so x.e has a mantissa shift of 63
let e = x.e + 63;
if e > T::MAX_EXP {
panic!("fp_to_float: exponent {} too large", e)
} else if e > T::MIN_EXP {
encode_normal(round_normal::<T>(x))
} else {
panic!("fp_to_float: exponent {} too small", e)
}
}
/// Round the 64-bit significand to T::SIG_BITS bits with half-to-even.
/// Does not handle exponent overflow.
pub fn round_normal<T: RawFloat>(x: Fp) -> Unpacked {
let excess = 64 - T::SIG_BITS as i16;
let half: u64 = 1 << (excess - 1);
let (q, rem) = (x.f >> excess, x.f & ((1 << excess) - 1));
assert_eq!(q << excess | rem, x.f);
// Adjust mantissa shift
let k = x.e + excess;
if rem < half {
Unpacked::new(q, k)
} else if rem == half && (q % 2) == 0 {
Unpacked::new(q, k)
} else if q == T::MAX_SIG {
Unpacked::new(T::MIN_SIG, k + 1)
} else {
Unpacked::new(q + 1, k)
}
}
/// Inverse of `RawFloat::unpack()` for normalized numbers.
/// Panics if the significand or exponent are not valid for normalized numbers.
pub fn encode_normal<T: RawFloat>(x: Unpacked) -> T {
debug_assert!(
T::MIN_SIG <= x.sig && x.sig <= T::MAX_SIG,
"encode_normal: significand not normalized"
);
// Remove the hidden bit
let sig_enc = x.sig & !(1 << T::EXPLICIT_SIG_BITS);
// Adjust the exponent for exponent bias and mantissa shift
let k_enc = x.k + T::MAX_EXP + T::EXPLICIT_SIG_BITS as i16;
debug_assert!(k_enc != 0 && k_enc < T::MAX_ENCODED_EXP, "encode_normal: exponent out of range");
// Leave sign bit at 0 ("+"), our numbers are all positive
let bits = (k_enc as u64) << T::EXPLICIT_SIG_BITS | sig_enc;
T::from_bits(bits.try_into().unwrap_or_else(|_| unreachable!()))
}
/// Construct a subnormal. A mantissa of 0 is allowed and constructs zero.
pub fn encode_subnormal<T: RawFloat>(significand: u64) -> T {
assert!(significand < T::MIN_SIG, "encode_subnormal: not actually subnormal");
// Encoded exponent is 0, the sign bit is 0, so we just have to reinterpret the bits.
T::from_bits(significand.try_into().unwrap_or_else(|_| unreachable!()))
}
/// Approximate a bignum with an Fp. Rounds within 0.5 ULP with half-to-even.
pub fn big_to_fp(f: &Big) -> Fp {
let end = f.bit_length();
assert!(end != 0, "big_to_fp: unexpectedly, input is zero");
let start = end.saturating_sub(64);
let leading = num::get_bits(f, start, end);
// We cut off all bits prior to the index `start`, i.e., we effectively right-shift by
// an amount of `start`, so this is also the exponent we need.
let e = start as i16;
let rounded_down = Fp { f: leading, e }.normalize();
// Round (half-to-even) depending on the truncated bits.
match num::compare_with_half_ulp(f, start) {
Less => rounded_down,
Equal if leading % 2 == 0 => rounded_down,
Equal | Greater => match leading.checked_add(1) {
Some(f) => Fp { f, e }.normalize(),
None => Fp { f: 1 << 63, e: e + 1 },
},
}
}
/// Finds the largest floating point number strictly smaller than the argument.
/// Does not handle subnormals, zero, or exponent underflow.
pub fn prev_float<T: RawFloat>(x: T) -> T {
match x.classify() {
Infinite => panic!("prev_float: argument is infinite"),
Nan => panic!("prev_float: argument is NaN"),
Subnormal => panic!("prev_float: argument is subnormal"),
Zero => panic!("prev_float: argument is zero"),
Normal => {
let Unpacked { sig, k } = x.unpack();
if sig == T::MIN_SIG {
encode_normal(Unpacked::new(T::MAX_SIG, k - 1))
} else {
encode_normal(Unpacked::new(sig - 1, k))
}
}
}
}
// Find the smallest floating point number strictly larger than the argument.
// This operation is saturating, i.e., next_float(inf) == inf.
// Unlike most code in this module, this function does handle zero, subnormals, and infinities.
// However, like all other code here, it does not deal with NaN and negative numbers.
pub fn next_float<T: RawFloat>(x: T) -> T {
match x.classify() {
Nan => panic!("next_float: argument is NaN"),
Infinite => T::INFINITY,
// This seems too good to be true, but it works.
// 0.0 is encoded as the all-zero word. Subnormals are 0x000m...m where m is the mantissa.
// In particular, the smallest subnormal is 0x0...01 and the largest is 0x000F...F.
// The smallest normal number is 0x0010...0, so this corner case works as well.
// If the increment overflows the mantissa, the carry bit increments the exponent as we
// want, and the mantissa bits become zero. Because of the hidden bit convention, this
// too is exactly what we want!
// Finally, f64::MAX + 1 = 7eff...f + 1 = 7ff0...0 = f64::INFINITY.
Zero | Subnormal | Normal => T::from_bits(x.to_bits() + T::Bits::from(1u8)),
}
}

View file

@ -0,0 +1,109 @@
//! Slow, fallback algorithm for cases the Eisel-Lemire algorithm cannot round.
use crate::num::dec2flt::common::BiasedFp;
use crate::num::dec2flt::decimal::{parse_decimal, Decimal};
use crate::num::dec2flt::float::RawFloat;
/// Parse the significant digits and biased, binary exponent of a float.
///
/// This is a fallback algorithm that uses a big-integer representation
/// of the float, and therefore is considerably slower than faster
/// approximations. However, it will always determine how to round
/// the significant digits to the nearest machine float, allowing
/// use to handle near half-way cases.
///
/// Near half-way cases are halfway between two consecutive machine floats.
/// For example, the float `16777217.0` has a bitwise representation of
/// `100000000000000000000000 1`. Rounding to a single-precision float,
/// the trailing `1` is truncated. Using round-nearest, tie-even, any
/// value above `16777217.0` must be rounded up to `16777218.0`, while
/// any value before or equal to `16777217.0` must be rounded down
/// to `16777216.0`. These near-halfway conversions therefore may require
/// a large number of digits to unambiguously determine how to round.
///
/// The algorithms described here are based on "Processing Long Numbers Quickly",
/// available here: <https://arxiv.org/pdf/2101.11408.pdf#section.11>.
pub(crate) fn parse_long_mantissa<F: RawFloat>(s: &[u8]) -> BiasedFp {
const MAX_SHIFT: usize = 60;
const NUM_POWERS: usize = 19;
const POWERS: [u8; 19] =
[0, 3, 6, 9, 13, 16, 19, 23, 26, 29, 33, 36, 39, 43, 46, 49, 53, 56, 59];
let get_shift = |n| {
if n < NUM_POWERS { POWERS[n] as usize } else { MAX_SHIFT }
};
let fp_zero = BiasedFp::zero_pow2(0);
let fp_inf = BiasedFp::zero_pow2(F::INFINITE_POWER);
let mut d = parse_decimal(s);
// Short-circuit if the value can only be a literal 0 or infinity.
if d.num_digits == 0 || d.decimal_point < -324 {
return fp_zero;
} else if d.decimal_point >= 310 {
return fp_inf;
}
let mut exp2 = 0_i32;
// Shift right toward (1/2 ... 1].
while d.decimal_point > 0 {
let n = d.decimal_point as usize;
let shift = get_shift(n);
d.right_shift(shift);
if d.decimal_point < -Decimal::DECIMAL_POINT_RANGE {
return fp_zero;
}
exp2 += shift as i32;
}
// Shift left toward (1/2 ... 1].
while d.decimal_point <= 0 {
let shift = if d.decimal_point == 0 {
match d.digits[0] {
digit if digit >= 5 => break,
0 | 1 => 2,
_ => 1,
}
} else {
get_shift((-d.decimal_point) as _)
};
d.left_shift(shift);
if d.decimal_point > Decimal::DECIMAL_POINT_RANGE {
return fp_inf;
}
exp2 -= shift as i32;
}
// We are now in the range [1/2 ... 1] but the binary format uses [1 ... 2].
exp2 -= 1;
while (F::MINIMUM_EXPONENT + 1) > exp2 {
let mut n = ((F::MINIMUM_EXPONENT + 1) - exp2) as usize;
if n > MAX_SHIFT {
n = MAX_SHIFT;
}
d.right_shift(n);
exp2 += n as i32;
}
if (exp2 - F::MINIMUM_EXPONENT) >= F::INFINITE_POWER {
return fp_inf;
}
// Shift the decimal to the hidden bit, and then round the value
// to get the high mantissa+1 bits.
d.left_shift(F::MANTISSA_EXPLICIT_BITS + 1);
let mut mantissa = d.round();
if mantissa >= (1_u64 << (F::MANTISSA_EXPLICIT_BITS + 1)) {
// Rounding up overflowed to the carry bit, need to
// shift back to the hidden bit.
d.right_shift(1);
exp2 += 1;
mantissa = d.round();
if (exp2 - F::MINIMUM_EXPONENT) >= F::INFINITE_POWER {
return fp_inf;
}
}
let mut power2 = exp2 - F::MINIMUM_EXPONENT;
if mantissa < (1_u64 << F::MANTISSA_EXPLICIT_BITS) {
power2 -= 1;
}
// Zero out all the bits above the explicit mantissa bits.
mantissa &= (1_u64 << F::MANTISSA_EXPLICIT_BITS) - 1;
BiasedFp { f: mantissa, e: power2 }
}

File diff suppressed because it is too large Load diff

View file

@ -1,6 +1,6 @@
//! Decodes a floating-point value into individual parts and error ranges.
use crate::num::dec2flt::rawfp::RawFloat;
use crate::num::dec2flt::float::RawFloat;
use crate::num::FpCategory;
/// Decoded unsigned finite value, such that:

View file

@ -0,0 +1,33 @@
use core::num::dec2flt::float::RawFloat;
#[test]
fn test_f32_integer_decode() {
assert_eq!(3.14159265359f32.integer_decode(), (13176795, -22, 1));
assert_eq!((-8573.5918555f32).integer_decode(), (8779358, -10, -1));
assert_eq!(2f32.powf(100.0).integer_decode(), (8388608, 77, 1));
assert_eq!(0f32.integer_decode(), (0, -150, 1));
assert_eq!((-0f32).integer_decode(), (0, -150, -1));
assert_eq!(f32::INFINITY.integer_decode(), (8388608, 105, 1));
assert_eq!(f32::NEG_INFINITY.integer_decode(), (8388608, 105, -1));
// Ignore the "sign" (quiet / signalling flag) of NAN.
// It can vary between runtime operations and LLVM folding.
let (nan_m, nan_e, _nan_s) = f32::NAN.integer_decode();
assert_eq!((nan_m, nan_e), (12582912, 105));
}
#[test]
fn test_f64_integer_decode() {
assert_eq!(3.14159265359f64.integer_decode(), (7074237752028906, -51, 1));
assert_eq!((-8573.5918555f64).integer_decode(), (4713381968463931, -39, -1));
assert_eq!(2f64.powf(100.0).integer_decode(), (4503599627370496, 48, 1));
assert_eq!(0f64.integer_decode(), (0, -1075, 1));
assert_eq!((-0f64).integer_decode(), (0, -1075, -1));
assert_eq!(f64::INFINITY.integer_decode(), (4503599627370496, 972, 1));
assert_eq!(f64::NEG_INFINITY.integer_decode(), (4503599627370496, 972, -1));
// Ignore the "sign" (quiet / signalling flag) of NAN.
// It can vary between runtime operations and LLVM folding.
let (nan_m, nan_e, _nan_s) = f64::NAN.integer_decode();
assert_eq!((nan_m, nan_e), (6755399441055744, 972));
}

View file

@ -0,0 +1,53 @@
use core::num::dec2flt::lemire::compute_float;
fn compute_float32(q: i64, w: u64) -> (i32, u64) {
let fp = compute_float::<f32>(q, w);
(fp.e, fp.f)
}
fn compute_float64(q: i64, w: u64) -> (i32, u64) {
let fp = compute_float::<f64>(q, w);
(fp.e, fp.f)
}
#[test]
fn compute_float_f32_rounding() {
// These test near-halfway cases for single-precision floats.
assert_eq!(compute_float32(0, 16777216), (151, 0));
assert_eq!(compute_float32(0, 16777217), (151, 0));
assert_eq!(compute_float32(0, 16777218), (151, 1));
assert_eq!(compute_float32(0, 16777219), (151, 2));
assert_eq!(compute_float32(0, 16777220), (151, 2));
// These are examples of the above tests, with
// digits from the exponent shifted to the mantissa.
assert_eq!(compute_float32(-10, 167772160000000000), (151, 0));
assert_eq!(compute_float32(-10, 167772170000000000), (151, 0));
assert_eq!(compute_float32(-10, 167772180000000000), (151, 1));
// Let's check the lines to see if anything is different in table...
assert_eq!(compute_float32(-10, 167772190000000000), (151, 2));
assert_eq!(compute_float32(-10, 167772200000000000), (151, 2));
}
#[test]
fn compute_float_f64_rounding() {
// These test near-halfway cases for double-precision floats.
assert_eq!(compute_float64(0, 9007199254740992), (1076, 0));
assert_eq!(compute_float64(0, 9007199254740993), (1076, 0));
assert_eq!(compute_float64(0, 9007199254740994), (1076, 1));
assert_eq!(compute_float64(0, 9007199254740995), (1076, 2));
assert_eq!(compute_float64(0, 9007199254740996), (1076, 2));
assert_eq!(compute_float64(0, 18014398509481984), (1077, 0));
assert_eq!(compute_float64(0, 18014398509481986), (1077, 0));
assert_eq!(compute_float64(0, 18014398509481988), (1077, 1));
assert_eq!(compute_float64(0, 18014398509481990), (1077, 2));
assert_eq!(compute_float64(0, 18014398509481992), (1077, 2));
// These are examples of the above tests, with
// digits from the exponent shifted to the mantissa.
assert_eq!(compute_float64(-3, 9007199254740992000), (1076, 0));
assert_eq!(compute_float64(-3, 9007199254740993000), (1076, 0));
assert_eq!(compute_float64(-3, 9007199254740994000), (1076, 1));
assert_eq!(compute_float64(-3, 9007199254740995000), (1076, 2));
assert_eq!(compute_float64(-3, 9007199254740996000), (1076, 2));
}

View file

@ -1,7 +1,8 @@
#![allow(overflowing_literals)]
mod float;
mod lemire;
mod parse;
mod rawfp;
// Take a float literal, turn it into a string in various ways (that are all trusted
// to be correct) and see if those strings are parsed back to the value of the literal.
@ -28,12 +29,6 @@ fn ordinary() {
test_literal!(0.1);
test_literal!(12345.);
test_literal!(0.9999999);
if cfg!(miri) {
// Miri is too slow
return;
}
test_literal!(2.2250738585072014e-308);
}
@ -54,7 +49,6 @@ fn large() {
}
#[test]
#[cfg_attr(miri, ignore)] // Miri is too slow
fn subnormals() {
test_literal!(5e-324);
test_literal!(91e-324);
@ -66,7 +60,6 @@ fn subnormals() {
}
#[test]
#[cfg_attr(miri, ignore)] // Miri is too slow
fn infinity() {
test_literal!(1e400);
test_literal!(1e309);
@ -78,12 +71,6 @@ fn infinity() {
fn zero() {
test_literal!(0.0);
test_literal!(1e-325);
if cfg!(miri) {
// Miri is too slow
return;
}
test_literal!(1e-326);
test_literal!(1e-500);
}

View file

@ -1,17 +1,23 @@
use core::num::dec2flt::parse::ParseResult::{Invalid, Valid};
use core::num::dec2flt::parse::{parse_decimal, Decimal};
use core::num::dec2flt::number::Number;
use core::num::dec2flt::parse::parse_number;
use core::num::dec2flt::{dec2flt, pfe_invalid};
fn new_number(e: i64, m: u64) -> Number {
Number { exponent: e, mantissa: m, negative: false, many_digits: false }
}
#[test]
fn missing_pieces() {
let permutations = &[".e", "1e", "e4", "e", ".12e", "321.e", "32.12e+", "12.32e-"];
for &s in permutations {
assert_eq!(parse_decimal(s), Invalid);
assert_eq!(dec2flt::<f64>(s), Err(pfe_invalid()));
}
}
#[test]
fn invalid_chars() {
let invalid = "r,?<j";
let error = Err(pfe_invalid());
let valid_strings = &["123", "666.", ".1", "5e1", "7e-3", "0.0e+1"];
for c in invalid.chars() {
for s in valid_strings {
@ -19,23 +25,153 @@ fn invalid_chars() {
let mut input = String::new();
input.push_str(s);
input.insert(i, c);
assert!(parse_decimal(&input) == Invalid, "did not reject invalid {:?}", input);
assert!(dec2flt::<f64>(&input) == error, "did not reject invalid {:?}", input);
}
}
}
}
fn parse_positive(s: &[u8]) -> Option<Number> {
parse_number(s, false)
}
#[test]
fn valid() {
assert_eq!(parse_decimal("123.456e789"), Valid(Decimal::new(b"123", b"456", 789)));
assert_eq!(parse_decimal("123.456e+789"), Valid(Decimal::new(b"123", b"456", 789)));
assert_eq!(parse_decimal("123.456e-789"), Valid(Decimal::new(b"123", b"456", -789)));
assert_eq!(parse_decimal(".050"), Valid(Decimal::new(b"", b"050", 0)));
assert_eq!(parse_decimal("999"), Valid(Decimal::new(b"999", b"", 0)));
assert_eq!(parse_decimal("1.e300"), Valid(Decimal::new(b"1", b"", 300)));
assert_eq!(parse_decimal(".1e300"), Valid(Decimal::new(b"", b"1", 300)));
assert_eq!(parse_decimal("101e-33"), Valid(Decimal::new(b"101", b"", -33)));
assert_eq!(parse_positive(b"123.456e789"), Some(new_number(786, 123456)));
assert_eq!(parse_positive(b"123.456e+789"), Some(new_number(786, 123456)));
assert_eq!(parse_positive(b"123.456e-789"), Some(new_number(-792, 123456)));
assert_eq!(parse_positive(b".050"), Some(new_number(-3, 50)));
assert_eq!(parse_positive(b"999"), Some(new_number(0, 999)));
assert_eq!(parse_positive(b"1.e300"), Some(new_number(300, 1)));
assert_eq!(parse_positive(b".1e300"), Some(new_number(299, 1)));
assert_eq!(parse_positive(b"101e-33"), Some(new_number(-33, 101)));
let zeros = "0".repeat(25);
let s = format!("1.5e{}", zeros);
assert_eq!(parse_decimal(&s), Valid(Decimal::new(b"1", b"5", 0)));
assert_eq!(parse_positive(s.as_bytes()), Some(new_number(-1, 15)));
}
macro_rules! assert_float_result_bits_eq {
($bits:literal, $ty:ty, $str:literal) => {{
let p = dec2flt::<$ty>($str);
assert_eq!(p.map(|x| x.to_bits()), Ok($bits));
}};
}
#[test]
fn issue31109() {
// Regression test for #31109.
// Ensure the test produces a valid float with the expected bit pattern.
assert_float_result_bits_eq!(
0x3fd5555555555555,
f64,
"0.3333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333"
);
}
#[test]
fn issue31407() {
// Regression test for #31407.
// Ensure the test produces a valid float with the expected bit pattern.
assert_float_result_bits_eq!(
0x1752a64e34ba0d3,
f64,
"1234567890123456789012345678901234567890e-340"
);
assert_float_result_bits_eq!(
0xfffffffffffff,
f64,
"2.225073858507201136057409796709131975934819546351645648023426109724822222021076945516529523908135087914149158913039621106870086438694594645527657207407820621743379988141063267329253552286881372149012981122451451889849057222307285255133155755015914397476397983411801999323962548289017107081850690630666655994938275772572015763062690663332647565300009245888316433037779791869612049497390377829704905051080609940730262937128958950003583799967207254304360284078895771796150945516748243471030702609144621572289880258182545180325707018860872113128079512233426288368622321503775666622503982534335974568884423900265498198385487948292206894721689831099698365846814022854243330660339850886445804001034933970427567186443383770486037861622771738545623065874679014086723327636718749999999999999999999999999999999999999e-308"
);
assert_float_result_bits_eq!(
0x10000000000000,
f64,
"2.22507385850720113605740979670913197593481954635164564802342610972482222202107694551652952390813508791414915891303962110687008643869459464552765720740782062174337998814106326732925355228688137214901298112245145188984905722230728525513315575501591439747639798341180199932396254828901710708185069063066665599493827577257201576306269066333264756530000924588831643303777979186961204949739037782970490505108060994073026293712895895000358379996720725430436028407889577179615094551674824347103070260914462157228988025818254518032570701886087211312807951223342628836862232150377566662250398253433597456888442390026549819838548794829220689472168983109969836584681402285424333066033985088644580400103493397042756718644338377048603786162277173854562306587467901408672332763671875e-308"
);
assert_float_result_bits_eq!(
0x10000000000000,
f64,
"0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000222507385850720138309023271733240406421921598046233183055332741688720443481391819585428315901251102056406733973103581100515243416155346010885601238537771882113077799353200233047961014744258363607192156504694250373420837525080665061665815894872049117996859163964850063590877011830487479978088775374994945158045160505091539985658247081864511353793580499211598108576605199243335211435239014879569960959128889160299264151106346631339366347758651302937176204732563178148566435087212282863764204484681140761391147706280168985324411002416144742161856716615054015428508471675290190316132277889672970737312333408698898317506783884692609277397797285865965494109136909540613646756870239867831529068098461721092462539672851562500000000000000001"
);
assert_float_result_bits_eq!(
0x7fefffffffffffff,
f64,
"179769313486231580793728971405303415079934132710037826936173778980444968292764750946649017977587207096330286416692887910946555547851940402630657488671505820681908902000708383676273854845817711531764475730270069855571366959622842914819860834936475292719074168444365510704342711559699508093042880177904174497791.9999999999999999999999999999999999999999999999999999999999999999999999"
);
assert_float_result_bits_eq!(0x0, f64, "2.47032822920623272e-324");
assert_float_result_bits_eq!(
0x8000000,
f64,
"6.631236871469758276785396630275967243399099947355303144249971758736286630139265439618068200788048744105960420552601852889715006376325666595539603330361800519107591783233358492337208057849499360899425128640718856616503093444922854759159988160304439909868291973931426625698663157749836252274523485312442358651207051292453083278116143932569727918709786004497872322193856150225415211997283078496319412124640111777216148110752815101775295719811974338451936095907419622417538473679495148632480391435931767981122396703443803335529756003353209830071832230689201383015598792184172909927924176339315507402234836120730914783168400715462440053817592702766213559042115986763819482654128770595766806872783349146967171293949598850675682115696218943412532098591327667236328125E-316"
);
assert_float_result_bits_eq!(
0x10000,
f64,
"3.237883913302901289588352412501532174863037669423108059901297049552301970670676565786835742587799557860615776559838283435514391084153169252689190564396459577394618038928365305143463955100356696665629202017331344031730044369360205258345803431471660032699580731300954848363975548690010751530018881758184174569652173110473696022749934638425380623369774736560008997404060967498028389191878963968575439222206416981462690113342524002724385941651051293552601421155333430225237291523843322331326138431477823591142408800030775170625915670728657003151953664260769822494937951845801530895238439819708403389937873241463484205608000027270531106827387907791444918534771598750162812548862768493201518991668028251730299953143924168545708663913273994694463908672332763671875E-319"
);
assert_float_result_bits_eq!(
0x800000000100,
f64,
"6.953355807847677105972805215521891690222119817145950754416205607980030131549636688806115726399441880065386399864028691275539539414652831584795668560082999889551357784961446896042113198284213107935110217162654939802416034676213829409720583759540476786936413816541621287843248433202369209916612249676005573022703244799714622116542188837770376022371172079559125853382801396219552418839469770514904192657627060319372847562301074140442660237844114174497210955449896389180395827191602886654488182452409583981389442783377001505462015745017848754574668342161759496661766020028752888783387074850773192997102997936619876226688096314989645766000479009083731736585750335262099860150896718774401964796827166283225641992040747894382698751809812609536720628966577351093292236328125E-310"
);
assert_float_result_bits_eq!(
0x10800,
f64,
"3.339068557571188581835713701280943911923401916998521771655656997328440314559615318168849149074662609099998113009465566426808170378434065722991659642619467706034884424989741080790766778456332168200464651593995817371782125010668346652995912233993254584461125868481633343674905074271064409763090708017856584019776878812425312008812326260363035474811532236853359905334625575404216060622858633280744301892470300555678734689978476870369853549413277156622170245846166991655321535529623870646888786637528995592800436177901746286272273374471701452991433047257863864601424252024791567368195056077320885329384322332391564645264143400798619665040608077549162173963649264049738362290606875883456826586710961041737908872035803481241600376705491726170293986797332763671875E-319"
);
assert_float_result_bits_eq!(
0x0,
f64,
"2.4703282292062327208828439643411068618252990130716238221279284125033775363510437593264991818081799618989828234772285886546332835517796989819938739800539093906315035659515570226392290858392449105184435931802849936536152500319370457678249219365623669863658480757001585769269903706311928279558551332927834338409351978015531246597263579574622766465272827220056374006485499977096599470454020828166226237857393450736339007967761930577506740176324673600968951340535537458516661134223766678604162159680461914467291840300530057530849048765391711386591646239524912623653881879636239373280423891018672348497668235089863388587925628302755995657524455507255189313690836254779186948667994968324049705821028513185451396213837722826145437693412532098591327667236328124999e-324"
);
assert_float_result_bits_eq!(
0x0,
f64,
"2.4703282292062327208828439643411068618252990130716238221279284125033775363510437593264991818081799618989828234772285886546332835517796989819938739800539093906315035659515570226392290858392449105184435931802849936536152500319370457678249219365623669863658480757001585769269903706311928279558551332927834338409351978015531246597263579574622766465272827220056374006485499977096599470454020828166226237857393450736339007967761930577506740176324673600968951340535537458516661134223766678604162159680461914467291840300530057530849048765391711386591646239524912623653881879636239373280423891018672348497668235089863388587925628302755995657524455507255189313690836254779186948667994968324049705821028513185451396213837722826145437693412532098591327667236328125e-324"
);
assert_float_result_bits_eq!(
0x1,
f64,
"2.4703282292062327208828439643411068618252990130716238221279284125033775363510437593264991818081799618989828234772285886546332835517796989819938739800539093906315035659515570226392290858392449105184435931802849936536152500319370457678249219365623669863658480757001585769269903706311928279558551332927834338409351978015531246597263579574622766465272827220056374006485499977096599470454020828166226237857393450736339007967761930577506740176324673600968951340535537458516661134223766678604162159680461914467291840300530057530849048765391711386591646239524912623653881879636239373280423891018672348497668235089863388587925628302755995657524455507255189313690836254779186948667994968324049705821028513185451396213837722826145437693412532098591327667236328125001e-324"
);
assert_float_result_bits_eq!(
0x1,
f64,
"7.4109846876186981626485318930233205854758970392148714663837852375101326090531312779794975454245398856969484704316857659638998506553390969459816219401617281718945106978546710679176872575177347315553307795408549809608457500958111373034747658096871009590975442271004757307809711118935784838675653998783503015228055934046593739791790738723868299395818481660169122019456499931289798411362062484498678713572180352209017023903285791732520220528974020802906854021606612375549983402671300035812486479041385743401875520901590172592547146296175134159774938718574737870961645638908718119841271673056017045493004705269590165763776884908267986972573366521765567941072508764337560846003984904972149117463085539556354188641513168478436313080237596295773983001708984374999e-324"
);
assert_float_result_bits_eq!(
0x2,
f64,
"7.4109846876186981626485318930233205854758970392148714663837852375101326090531312779794975454245398856969484704316857659638998506553390969459816219401617281718945106978546710679176872575177347315553307795408549809608457500958111373034747658096871009590975442271004757307809711118935784838675653998783503015228055934046593739791790738723868299395818481660169122019456499931289798411362062484498678713572180352209017023903285791732520220528974020802906854021606612375549983402671300035812486479041385743401875520901590172592547146296175134159774938718574737870961645638908718119841271673056017045493004705269590165763776884908267986972573366521765567941072508764337560846003984904972149117463085539556354188641513168478436313080237596295773983001708984375e-324"
);
assert_float_result_bits_eq!(
0x2,
f64,
"7.4109846876186981626485318930233205854758970392148714663837852375101326090531312779794975454245398856969484704316857659638998506553390969459816219401617281718945106978546710679176872575177347315553307795408549809608457500958111373034747658096871009590975442271004757307809711118935784838675653998783503015228055934046593739791790738723868299395818481660169122019456499931289798411362062484498678713572180352209017023903285791732520220528974020802906854021606612375549983402671300035812486479041385743401875520901590172592547146296175134159774938718574737870961645638908718119841271673056017045493004705269590165763776884908267986972573366521765567941072508764337560846003984904972149117463085539556354188641513168478436313080237596295773983001708984375001e-324"
);
assert_float_result_bits_eq!(
0x6c9a143590c14,
f64,
"94393431193180696942841837085033647913224148539854e-358"
);
assert_float_result_bits_eq!(
0x7802665fd9600,
f64,
"104308485241983990666713401708072175773165034278685682646111762292409330928739751702404658197872319129036519947435319418387839758990478549477777586673075945844895981012024387992135617064532141489278815239849108105951619997829153633535314849999674266169258928940692239684771590065027025835804863585454872499320500023126142553932654370362024104462255244034053203998964360882487378334860197725139151265590832887433736189468858614521708567646743455601905935595381852723723645799866672558576993978025033590728687206296379801363024094048327273913079612469982585674824156000783167963081616214710691759864332339239688734656548790656486646106983450809073750535624894296242072010195710276073042036425579852459556183541199012652571123898996574563824424330960027873516082763671875e-1075"
);
}
#[test]
fn many_digits() {
// Check large numbers of digits to ensure we have cases where significant
// digits (above Decimal::MAX_DIGITS) occurs.
assert_float_result_bits_eq!(
0x7ffffe,
f32,
"1.175494140627517859246175898662808184331245864732796240031385942718174675986064769972472277004271745681762695312500000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000e-38"
);
assert_float_result_bits_eq!(
0x7ffffe,
f32,
"1.175494140627517859246175898662808184331245864732796240031385942718174675986064769972472277004271745681762695312500000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000e-38"
);
}

View file

@ -1,172 +0,0 @@
use core::num::dec2flt::rawfp::RawFloat;
use core::num::dec2flt::rawfp::{fp_to_float, next_float, prev_float, round_normal};
use core::num::diy_float::Fp;
fn integer_decode(f: f64) -> (u64, i16, i8) {
RawFloat::integer_decode(f)
}
#[test]
fn fp_to_float_half_to_even() {
fn is_normalized(sig: u64) -> bool {
// intentionally written without {min,max}_sig() as a sanity check
sig >> 52 == 1 && sig >> 53 == 0
}
fn conv(sig: u64) -> u64 {
// The significands are perfectly in range, so the exponent should not matter
let (m1, e1, _) = integer_decode(fp_to_float::<f64>(Fp { f: sig, e: 0 }));
assert_eq!(e1, 0 + 64 - 53);
let (m2, e2, _) = integer_decode(fp_to_float::<f64>(Fp { f: sig, e: 55 }));
assert_eq!(e2, 55 + 64 - 53);
assert_eq!(m2, m1);
let (m3, e3, _) = integer_decode(fp_to_float::<f64>(Fp { f: sig, e: -78 }));
assert_eq!(e3, -78 + 64 - 53);
assert_eq!(m3, m2);
m3
}
let odd = 0x1F_EDCB_A012_345F;
let even = odd - 1;
assert!(is_normalized(odd));
assert!(is_normalized(even));
assert_eq!(conv(odd << 11), odd);
assert_eq!(conv(even << 11), even);
assert_eq!(conv(odd << 11 | 1 << 10), odd + 1);
assert_eq!(conv(even << 11 | 1 << 10), even);
assert_eq!(conv(even << 11 | 1 << 10 | 1), even + 1);
assert_eq!(conv(odd << 11 | 1 << 9), odd);
assert_eq!(conv(even << 11 | 1 << 9), even);
assert_eq!(conv(odd << 11 | 0x7FF), odd + 1);
assert_eq!(conv(even << 11 | 0x7FF), even + 1);
assert_eq!(conv(odd << 11 | 0x3FF), odd);
assert_eq!(conv(even << 11 | 0x3FF), even);
}
#[test]
fn integers_to_f64() {
assert_eq!(fp_to_float::<f64>(Fp { f: 1, e: 0 }), 1.0);
assert_eq!(fp_to_float::<f64>(Fp { f: 42, e: 7 }), (42 << 7) as f64);
assert_eq!(fp_to_float::<f64>(Fp { f: 1 << 20, e: 30 }), (1u64 << 50) as f64);
assert_eq!(fp_to_float::<f64>(Fp { f: 4, e: -3 }), 0.5);
}
const SOME_FLOATS: [f64; 9] = [
0.1f64,
33.568,
42.1e-5,
777.0e9,
1.1111,
0.347997,
9843579834.35892,
12456.0e-150,
54389573.0e-150,
];
#[test]
fn human_f64_roundtrip() {
for &x in &SOME_FLOATS {
let (f, e, _) = integer_decode(x);
let fp = Fp { f: f, e: e };
assert_eq!(fp_to_float::<f64>(fp), x);
}
}
#[test]
fn rounding_overflow() {
let x = Fp { f: 0xFF_FF_FF_FF_FF_FF_FF_00u64, e: 42 };
let rounded = round_normal::<f64>(x);
let adjusted_k = x.e + 64 - 53;
assert_eq!(rounded.sig, 1 << 52);
assert_eq!(rounded.k, adjusted_k + 1);
}
#[test]
fn prev_float_monotonic() {
let mut x = 1.0;
for _ in 0..100 {
let x1 = prev_float(x);
assert!(x1 < x);
assert!(x - x1 < 1e-15);
x = x1;
}
}
const MIN_SUBNORMAL: f64 = 5e-324;
#[test]
fn next_float_zero() {
let tiny = next_float(0.0);
assert_eq!(tiny, MIN_SUBNORMAL);
assert!(tiny != 0.0);
}
#[test]
fn next_float_subnormal() {
let second = next_float(MIN_SUBNORMAL);
// For subnormals, MIN_SUBNORMAL is the ULP
assert!(second != MIN_SUBNORMAL);
assert!(second > 0.0);
assert_eq!(second - MIN_SUBNORMAL, MIN_SUBNORMAL);
}
#[test]
fn next_float_inf() {
assert_eq!(next_float(f64::MAX), f64::INFINITY);
assert_eq!(next_float(f64::INFINITY), f64::INFINITY);
}
#[test]
fn next_prev_identity() {
for &x in &SOME_FLOATS {
assert_eq!(prev_float(next_float(x)), x);
assert_eq!(prev_float(prev_float(next_float(next_float(x)))), x);
assert_eq!(next_float(prev_float(x)), x);
assert_eq!(next_float(next_float(prev_float(prev_float(x)))), x);
}
}
#[test]
fn next_float_monotonic() {
let mut x = 0.49999999999999;
assert!(x < 0.5);
for _ in 0..200 {
let x1 = next_float(x);
assert!(x1 > x);
assert!(x1 - x < 1e-15, "next_float_monotonic: delta = {:?}", x1 - x);
x = x1;
}
assert!(x > 0.5);
}
#[test]
fn test_f32_integer_decode() {
assert_eq!(3.14159265359f32.integer_decode(), (13176795, -22, 1));
assert_eq!((-8573.5918555f32).integer_decode(), (8779358, -10, -1));
assert_eq!(2f32.powf(100.0).integer_decode(), (8388608, 77, 1));
assert_eq!(0f32.integer_decode(), (0, -150, 1));
assert_eq!((-0f32).integer_decode(), (0, -150, -1));
assert_eq!(f32::INFINITY.integer_decode(), (8388608, 105, 1));
assert_eq!(f32::NEG_INFINITY.integer_decode(), (8388608, 105, -1));
// Ignore the "sign" (quiet / signalling flag) of NAN.
// It can vary between runtime operations and LLVM folding.
let (nan_m, nan_e, _nan_s) = f32::NAN.integer_decode();
assert_eq!((nan_m, nan_e), (12582912, 105));
}
#[test]
fn test_f64_integer_decode() {
assert_eq!(3.14159265359f64.integer_decode(), (7074237752028906, -51, 1));
assert_eq!((-8573.5918555f64).integer_decode(), (4713381968463931, -39, -1));
assert_eq!(2f64.powf(100.0).integer_decode(), (4503599627370496, 48, 1));
assert_eq!(0f64.integer_decode(), (0, -1075, 1));
assert_eq!((-0f64).integer_decode(), (0, -1075, -1));
assert_eq!(f64::INFINITY.integer_decode(), (4503599627370496, 972, 1));
assert_eq!(f64::NEG_INFINITY.integer_decode(), (4503599627370496, 972, -1));
// Ignore the "sign" (quiet / signalling flag) of NAN.
// It can vary between runtime operations and LLVM folding.
let (nan_m, nan_e, _nan_s) = f64::NAN.integer_decode();
assert_eq!((nan_m, nan_e), (6755399441055744, 972));
}

195
src/etc/dec2flt_table.py Executable file → Normal file
View file

@ -1,141 +1,110 @@
#!/usr/bin/env python3
"""
Generate powers of ten using William Clinger's ``AlgorithmM`` for use in
Generate powers of five using Daniel Lemire's ``Eisel-Lemire algorithm`` for use in
decimal to floating point conversions.
Specifically, computes and outputs (as Rust code) a table of 10^e for some
range of exponents e. The output is one array of 64 bit significands and
another array of corresponding base two exponents. The approximations are
normalized and rounded perfectly, i.e., within 0.5 ULP of the true value.
range of exponents e. The output is one array of 128 bit significands.
The base two exponents can be inferred using a logarithmic slope
of the decimal exponent. The approximations are normalized and rounded perfectly,
i.e., within 0.5 ULP of the true value.
The representation ([u64], [i16]) instead of the more natural [(u64, i16)]
is used because (u64, i16) has a ton of padding which would make the table
even larger, and it's already uncomfortably large (6 KiB).
Adapted from Daniel Lemire's fast_float ``table_generation.py``,
available here: <https://github.com/fastfloat/fast_float/blob/main/script/table_generation.py>.
"""
from __future__ import print_function
from math import ceil, log
from math import ceil, floor, log, log2
from fractions import Fraction
from collections import namedtuple
N = 64 # Size of the significand field in bits
MIN_SIG = 2 ** (N - 1)
MAX_SIG = (2 ** N) - 1
# Hand-rolled fp representation without arithmetic or any other operations.
# The significand is normalized and always N bit, but the exponent is
# unrestricted in range.
Fp = namedtuple('Fp', 'sig exp')
def algorithm_m(f, e):
assert f > 0
if e < 0:
u = f
v = 10 ** abs(e)
else:
u = f * 10 ** e
v = 1
k = 0
x = u // v
while True:
if x < MIN_SIG:
u <<= 1
k -= 1
elif x >= MAX_SIG:
v <<= 1
k += 1
else:
break
x = u // v
return ratio_to_float(u, v, k)
def ratio_to_float(u, v, k):
q, r = divmod(u, v)
v_r = v - r
z = Fp(q, k)
if r < v_r:
return z
elif r > v_r:
return next_float(z)
elif q % 2 == 0:
return z
else:
return next_float(z)
def next_float(z):
if z.sig == MAX_SIG:
return Fp(MIN_SIG, z.exp + 1)
else:
return Fp(z.sig + 1, z.exp)
def error(f, e, z):
decimal = f * Fraction(10) ** e
binary = z.sig * Fraction(2) ** z.exp
abs_err = abs(decimal - binary)
# The unit in the last place has value z.exp
ulp_err = abs_err / Fraction(2) ** z.exp
return float(ulp_err)
from collections import deque
HEADER = """
//! Tables of approximations of powers of ten.
//! Pre-computed tables powers-of-5 for extended-precision representations.
//!
//! These tables enable fast scaling of the significant digits
//! of a float to the decimal exponent, with minimal rounding
//! errors, in a 128 or 192-bit representation.
//!
//! DO NOT MODIFY: Generated by `src/etc/dec2flt_table.py`
"""
STATIC_WARNING = """
// Use static to avoid long compile times: Rust compiler errors
// can have the entire table compiled multiple times, and then
// emit code multiple times, even if it's stripped out in
// the final binary.
"""
def main():
min_exp = minimum_exponent(10)
max_exp = maximum_exponent(10)
bias = -minimum_exponent(5)
print(HEADER.strip())
print()
print_proper_powers()
print('pub const SMALLEST_POWER_OF_FIVE: i32 = {};'.format(min_exp))
print('pub const LARGEST_POWER_OF_FIVE: i32 = {};'.format(max_exp))
print('pub const N_POWERS_OF_FIVE: usize = ', end='')
print('(LARGEST_POWER_OF_FIVE - SMALLEST_POWER_OF_FIVE + 1) as usize;')
print()
print_short_powers(32, 24)
print()
print_short_powers(64, 53)
print_proper_powers(min_exp, max_exp, bias)
def print_proper_powers():
MIN_E = -305
MAX_E = 305
e_range = range(MIN_E, MAX_E+1)
def minimum_exponent(base):
return ceil(log(5e-324, base) - log(0xFFFFFFFFFFFFFFFF, base))
def maximum_exponent(base):
return floor(log(1.7976931348623157e+308, base))
def print_proper_powers(min_exp, max_exp, bias):
powers = deque()
# Add negative exponents.
# 2^(2b)/(5^q) with b=64 + int(math.ceil(log2(5^q)))
powers = []
for e in e_range:
z = algorithm_m(1, e)
err = error(1, e, z)
assert err < 0.5
powers.append(z)
print("pub const MIN_E: i16 = {};".format(MIN_E))
print("pub const MAX_E: i16 = {};".format(MAX_E))
print()
print("#[rustfmt::skip]")
typ = "([u64; {0}], [i16; {0}])".format(len(powers))
print("pub static POWERS: ", typ, " = (", sep='')
print(" [")
for z in powers:
print(" 0x{:x},".format(z.sig))
print(" ],")
print(" [")
for z in powers:
print(" {},".format(z.exp))
print(" ],")
print(");")
for q in range(min_exp, 0):
power5 = 5 ** -q
z = 0
while (1 << z) < power5:
z += 1
if q >= -27:
b = z + 127
c = 2 ** b // power5 + 1
powers.append((c, q))
else:
b = 2 * z + 2 * 64
c = 2 ** b // power5 + 1
# truncate
while c >= (1<<128):
c //= 2
powers.append((c, q))
# Add positive exponents
for q in range(0, max_exp + 1):
power5 = 5 ** q
# move the most significant bit in position
while power5 < (1<<127):
power5 *= 2
# *truncate*
while power5 >= (1<<128):
power5 //= 2
powers.append((power5, q))
def print_short_powers(num_bits, significand_size):
max_sig = 2**significand_size - 1
# The fast path bails out for exponents >= ceil(log5(max_sig))
max_e = int(ceil(log(max_sig, 5)))
e_range = range(max_e)
typ = "[f{}; {}]".format(num_bits, len(e_range))
print("#[rustfmt::skip]")
print("pub const F", num_bits, "_SHORT_POWERS: ", typ, " = [", sep='')
for e in e_range:
print(" 1e{},".format(e))
print("];")
# Print the powers.
print(STATIC_WARNING.strip())
print('#[rustfmt::skip]')
typ = '[(u64, u64); N_POWERS_OF_FIVE]'
print('pub static POWER_OF_FIVE_128: {} = ['.format(typ))
lo_mask = (1 << 64) - 1
for c, exp in powers:
hi = '0x{:x}'.format(c // (1 << 64))
lo = '0x{:x}'.format(c % (1 << 64))
value = ' ({}, {}), '.format(hi, lo)
comment = '// {}^{}'.format(5, exp)
print(value.ljust(46, ' ') + comment)
print('];')
if __name__ == '__main__':

View file

@ -0,0 +1,13 @@
[package]
name = "test-float-parse"
version = "0.1.0"
edition = "2018"
publish = false
[workspace]
[dependencies]
rand = "0.4"
[lib]
name = "test_float_parse"

View file

@ -131,22 +131,20 @@ def write_errors():
exit_status = 101
def rustc(test):
rs = test + '.rs'
exe = test + '.exe' # hopefully this makes it work on *nix
print("compiling", test)
def cargo():
print("compiling tests")
sys.stdout.flush()
check_call(['rustc', rs, '-o', exe])
check_call(['cargo', 'build', '--release'])
def run(test):
global test_name
test_name = test
t0 = time.clock()
t0 = time.perf_counter()
msg("setting up supervisor")
exe = test + '.exe'
proc = Popen(exe, bufsize=1<<20 , stdin=PIPE, stdout=PIPE, stderr=PIPE)
command = ['cargo', 'run', '--bin', test, '--release']
proc = Popen(command, bufsize=1<<20 , stdin=PIPE, stdout=PIPE, stderr=PIPE)
done = multiprocessing.Value(ctypes.c_bool)
queue = multiprocessing.Queue(maxsize=5)#(maxsize=1024)
workers = []
@ -166,7 +164,7 @@ def run(test):
worker.join()
msg("python is done")
assert queue.empty(), "did not validate everything"
dt = time.clock() - t0
dt = time.perf_counter() - t0
msg("took", round(dt, 3), "seconds")
@ -176,7 +174,7 @@ def interact(proc, queue):
line = proc.stdout.readline()
if not line:
continue
assert line.endswith('\n'), "incomplete line: " + repr(line)
assert line.endswith(b'\n'), "incomplete line: " + repr(line)
queue.put(line)
n += 1
if n % UPDATE_EVERY_N == 0:
@ -185,7 +183,7 @@ def interact(proc, queue):
rest, stderr = proc.communicate()
if stderr:
msg("rust stderr output:", stderr)
for line in rest.split('\n'):
for line in rest.split(b'\n'):
if not line:
continue
queue.put(line)
@ -193,18 +191,19 @@ def interact(proc, queue):
def main():
global MAILBOX
all_tests = [os.path.splitext(f)[0] for f in glob('*.rs') if not f.startswith('_')]
files = glob('src/bin/*.rs')
basenames = [os.path.basename(i) for i in files]
all_tests = [os.path.splitext(f)[0] for f in basenames if not f.startswith('_')]
args = sys.argv[1:]
if args:
tests = [test for test in all_tests if test in args]
else
else:
tests = all_tests
if not tests:
print("Error: No tests to run")
sys.exit(1)
# Compile first for quicker feedback
for test in tests:
rustc(test)
cargo()
# Set up mailbox once for all tests
MAILBOX = multiprocessing.Queue()
mailman = threading.Thread(target=write_errors)
@ -251,7 +250,7 @@ def do_work(queue):
else:
continue
bin64, bin32, text = line.rstrip().split()
validate(bin64, bin32, text)
validate(bin64, bin32, text.decode('utf-8'))
def decode_binary64(x):
@ -331,7 +330,11 @@ SINGLE_ZERO_CUTOFF = MIN_SUBNORMAL_SINGLE / 2
SINGLE_INF_CUTOFF = MAX_SINGLE + 2 ** (MAX_ULP_SINGLE - 1)
def validate(bin64, bin32, text):
double = decode_binary64(bin64)
try:
double = decode_binary64(bin64)
except AssertionError:
print(bin64, bin32, text)
raise
single = decode_binary32(bin32)
real = Fraction(text)

View file

@ -1,6 +1,4 @@
mod _common;
use _common::validate;
use test_float_parse::validate;
fn main() {
let mut pow = vec![];

View file

@ -1,6 +1,4 @@
mod _common;
use _common::validate;
use test_float_parse::validate;
fn main() {
for e in 300..310 {

View file

@ -1,7 +1,5 @@
mod _common;
use _common::validate;
use std::char;
use test_float_parse::validate;
fn main() {
for n in 0..10 {

View file

@ -1,11 +1,9 @@
extern crate rand;
mod _common;
use _common::{validate, SEED};
use rand::distributions::{Range, Sample};
use rand::{IsaacRng, Rng, SeedableRng};
use std::char;
use test_float_parse::{validate, SEED};
fn main() {
let mut rnd = IsaacRng::from_seed(&SEED);

View file

@ -1,10 +1,8 @@
extern crate rand;
mod _common;
use _common::{validate, SEED};
use rand::{IsaacRng, Rng, SeedableRng};
use std::mem::transmute;
use test_float_parse::{validate, SEED};
fn main() {
let mut rnd = IsaacRng::from_seed(&SEED);

View file

@ -1,6 +1,4 @@
mod _common;
use _common::validate;
use test_float_parse::validate;
fn main() {
// Skip e = 0 because small-u32 already does those.

View file

@ -1,7 +1,5 @@
mod _common;
use _common::validate;
use std::mem::transmute;
use test_float_parse::validate;
fn main() {
for bits in 0u32..(1 << 21) {

View file

@ -1,6 +1,4 @@
mod _common;
use _common::validate;
use test_float_parse::validate;
fn main() {
for e in 301..327 {

View file

@ -1,6 +1,4 @@
mod _common;
use _common::validate;
use test_float_parse::validate;
fn main() {
for i in 0..(1 << 19) {

View file

@ -1,6 +1,4 @@
mod _common;
use _common::validate;
use test_float_parse::validate;
fn main() {
for exp in 19..64 {

View file

@ -1,6 +0,0 @@
fn main() {
// FIXME(#31407) this error should go away, but in the meantime we test that it
// is accompanied by a somewhat useful error message.
let _: f64 = 1234567890123456789012345678901234567890e-340;
//~^ ERROR could not evaluate float literal (see issue #31407)
}

View file

@ -1,8 +0,0 @@
error: could not evaluate float literal (see issue #31407)
--> $DIR/issue-31109.rs:4:18
|
LL | let _: f64 = 1234567890123456789012345678901234567890e-340;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
error: aborting due to previous error

View file

@ -0,0 +1,9 @@
// build-pass
// ignore-tidy-linelength
// Regression test for #31109 and #31407.
pub fn main() {
let _: f64 = 0.3333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333;
let _: f64 = 1234567890123456789012345678901234567890e-340;
}

View file

@ -1,7 +0,0 @@
fn main() {
let 1234567890123456789012345678901234567890e-340: f64 = 0.0;
//~^ ERROR could not evaluate float literal (see issue #31407)
fn param(1234567890123456789012345678901234567890e-340: f64) {}
//~^ ERROR could not evaluate float literal (see issue #31407)
}

View file

@ -1,15 +0,0 @@
error[E0080]: could not evaluate float literal (see issue #31407)
--> $DIR/issue-68396-let-float-bug.rs:2:9
|
LL | let 1234567890123456789012345678901234567890e-340: f64 = 0.0;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
error[E0080]: could not evaluate float literal (see issue #31407)
--> $DIR/issue-68396-let-float-bug.rs:5:14
|
LL | fn param(1234567890123456789012345678901234567890e-340: f64) {}
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
error: aborting due to 2 previous errors
For more information about this error, try `rustc --explain E0080`.