[libc++] Unspecified behavior randomization in libc++

This effort aims to deflake user tests that depend on the unspecified
behavior of algorithms and containers. It may also help with replacing
the sorting algorithm in libcxx, which has a quadratic worst case, in
the future, or at least with introducing a new one under a flag.

For the detailed design, please see the design doc included in this patch.

Differential Revision: https://reviews.llvm.org/D96946
Danila Kutenin 2021-11-16 15:48:59 -05:00 committed by Louis Dionne
parent aeb3c772d3
commit a45d2287ad
13 changed files with 504 additions and 20 deletions


@ -58,6 +58,15 @@ The following containers and classes support iterator debugging:
The remaining containers do not currently support iterator debugging.
Patches welcome.
Randomizing Unspecified Behavior (``_LIBCPP_DEBUG == 1``)
---------------------------------------------------------
This level also enables the randomization of unspecified behavior, for
example, the order of equal elements after ``std::sort`` or the order of both
parts of the partition after a ``std::nth_element`` call. This helps you migrate
to potential future faster versions of these algorithms and deflake tests
that depend on such behavior. To fix the seed, define
``_LIBCPP_DEBUG_RANDOMIZE_UNSPECIFIED_STABILITY_SEED=seed``.
Handling Assertion Failures
===========================
When a debug assertion fails the assertion handler is called via the


@ -0,0 +1,86 @@
==================================
Unspecified Behavior Randomization
==================================
Background
==========
Consider the following snippet, which commonly appears in tests:

.. code-block:: cpp

  std::vector<std::pair<int, int>> v(SomeData());
  std::sort(v.begin(), v.end(), [](const auto& lhs, const auto& rhs) {
    return lhs.first < rhs.first;
  });

With this comparator, the relative order of elements whose first members are
equal is unspecified. Unfortunately, this prevents libcxx from introducing
other implementations, because tests might silently start failing and users might
heavily depend on the ordering produced by the current implementation.
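
A test that needs a reproducible order for such data has two standard options, sketched below under the assumption that the input is a vector of integer pairs: use ``std::stable_sort``, or compare on both members so that no two distinct elements are equivalent. The helper names (``FirstLess``, ``SortByFirst``, ``StableSortByFirst``, ``SortByBoth``) are made up for this illustration and are not part of libcxx.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

using Pairs = std::vector<std::pair<int, int>>;

// Compares on .first only, so pairs with equal .first are equivalent.
inline bool FirstLess(const std::pair<int, int>& lhs,
                      const std::pair<int, int>& rhs) {
  return lhs.first < rhs.first;
}

// The relative order of pairs with equal .first is unspecified here.
inline void SortByFirst(Pairs& v) {
  std::sort(v.begin(), v.end(), FirstLess);
}

// Option 1: equivalent pairs keep their input order.
inline void StableSortByFirst(Pairs& v) {
  std::stable_sort(v.begin(), v.end(), FirstLess);
}

// Option 2: std::pair's operator< is lexicographic, which gives a total
// order, so the result does not depend on how the algorithm treats ties.
inline void SortByBoth(Pairs& v) {
  std::sort(v.begin(), v.end());
}
```

With either option the result is the same for every conforming ``std::sort`` implementation, so the test keeps passing when randomization is enabled.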
Goal
====
Provide functionality for randomizing the unspecified behavior so that the users
can test and migrate their components and libcxx can introduce new sorting
algorithms and optimizations to the containers.
For example, as of LLVM version 13, the libcxx sorting algorithm has an
`O(n^2) worst case <https://llvm.org/PR20837>`_, but according
to the standard its worst case should be `O(n log n)`. This effort helps users
gradually fix their tests while updating to new, faster algorithms.
Design
======
* Introduce a new macro, `_LIBCPP_DEBUG_RANDOMIZE_UNSPECIFIED_STABILITY`, which should
  be a part of the libcxx config.
* This macro randomizes the unspecified behavior of algorithms and containers.
  For example, for the sorting algorithm the input range is shuffled and then
  sorted.
* This macro is off by default because users should enable it only for testing
  purposes and/or for migrations, should libcxx change its implementations.
* This feature is only available in C++11 and later because it relies on
  `std::shuffle`.
* We may use `ASLR <https://en.wikipedia.org/wiki/Address_space_layout_randomization>`_ or a
  static `std::random_device` for seeding the random number generator. This
  keeps the ordering consistent within a run but not across different
  runs, so tests that depend on it become flaky and are eventually seen as broken.
  For platforms which do not support ASLR, the seed is fixed during build.
* Users can fix the seed of the random number generator by providing the
  `_LIBCPP_DEBUG_RANDOMIZE_UNSPECIFIED_STABILITY_SEED=seed` definition.

This comes with some side effects if any of the flags is on:

* Computation penalty; we think users are OK with that if they use this feature.
* Non-reproducible results if they don't use a fixed seed.
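
The shuffle-then-sort step and the fixed-seed option from the design can be sketched in user code as follows. This is only an illustration under stated assumptions: ``ShuffledSort`` is a hypothetical helper, and ``std::mt19937_64`` stands in for whatever generator the library uses internally.

```cpp
#include <algorithm>
#include <cstdint>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper, not libc++ API: permutes [first, last) with an engine
// seeded from `seed`, then sorts. Equivalent elements therefore end up in an
// order that depends on the seed instead of on the input order.
template <class RandomIt, class Compare>
void ShuffledSort(RandomIt first, RandomIt last, Compare comp,
                  std::uint64_t seed) {
  std::mt19937_64 rng(seed);  // stand-in engine; an assumption of this sketch
  std::shuffle(first, last, rng);
  std::sort(first, last, comp);
}
```

Calling the helper twice with the same seed on the same input gives the same result, which is the reproducibility that a fixed seed is meant to provide; a different seed may reorder equivalent elements.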
Impact
------
Google has measured a couple of thousand tests to be dependent on the
stability of sorting and selection algorithms. As we also plan on updating
(or at least, providing more under a flag) sorting algorithms, this effort helps
doing it gradually and sustainably. It is also bad for users to depend on
unspecified behavior in their tests; this effort helps by turning this flag on in
debug mode.
Potential breakages
-------------------
None if the flag is off. If the flag is on, it may lead to some non-reproducible
results, for example, for caching.
Currently supported randomization
---------------------------------
* `std::sort`, there is no guarantee on the order of equal elements
* `std::partial_sort`, there is no guarantee on the order of equal elements and
on the order of the remaining part
* `std::nth_element`, there is no guarantee on the order from both sides of the
partition
Patches welcome.
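
Tests for code that calls these algorithms can assert only what the standard guarantees instead of comparing against one exact output. The following is a minimal sketch for ``std::vector<int>`` with the default ``operator<``; the function names are hypothetical and exist only for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// After std::nth_element(v.begin(), v.begin() + n, v.end()): nothing before
// index n is greater than v[n], and nothing after it is less than v[n].
inline bool HoldsNthElementGuarantee(const std::vector<int>& v, std::size_t n) {
  if (n >= v.size()) return true;  // nth == end: nothing to check
  const auto nth = v.begin() + static_cast<std::ptrdiff_t>(n);
  return std::all_of(v.begin(), nth, [&](int x) { return !(*nth < x); }) &&
         std::all_of(nth + 1, v.end(), [&](int x) { return !(x < *nth); });
}

// After std::partial_sort(v.begin(), v.begin() + middle, v.end()): the first
// `middle` elements are sorted and no remaining element is less than them.
inline bool HoldsPartialSortGuarantee(const std::vector<int>& v,
                                      std::size_t middle) {
  if (middle > v.size()) return false;
  const auto mid = v.begin() + static_cast<std::ptrdiff_t>(middle);
  if (!std::is_sorted(v.begin(), mid)) return false;
  if (middle == 0) return true;
  const int largest = *(mid - 1);
  return std::all_of(mid, v.end(), [&](int x) { return !(x < largest); });
}
```

Checks of this kind keep passing regardless of how the elements outside the guaranteed positions are ordered.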


@ -51,6 +51,10 @@ New Features
added. This is useful for building libc++ in an embedded setting, and it adds itself to the various
freestanding-friendly options provided by libc++.
- Defining ``_LIBCPP_DEBUG`` to ``1`` enables the randomization of unspecified
  behavior of standard algorithms (e.g. the order of equal elements in ``std::sort`` or
  randomization of both sides of the partition for ``std::nth_element``).
API Changes
-----------


@ -177,6 +177,7 @@ Design Documents
DesignDocs/NoexceptPolicy
DesignDocs/ThreadingSupportAPI
DesignDocs/UniquePtrTrivialAbi
DesignDocs/UnspecifiedBehaviorRandomization
DesignDocs/VisibilityMacros


@ -11,6 +11,11 @@
#include <__config>
#ifdef _LIBCPP_DEBUG
# include <__debug>
# include <__utility/declval.h>
#endif
#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
#pragma GCC system_header
#endif


@ -16,6 +16,10 @@
#include <__iterator/iterator_traits.h>
#include <__utility/swap.h>
#if defined(_LIBCPP_DEBUG_RANDOMIZE_UNSPECIFIED_STABILITY)
# include <__algorithm/shuffle.h>
#endif
#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
#pragma GCC system_header
#endif
@ -222,8 +226,13 @@ inline _LIBCPP_INLINE_VISIBILITY _LIBCPP_CONSTEXPR_AFTER_CXX17
 void
 nth_element(_RandomAccessIterator __first, _RandomAccessIterator __nth, _RandomAccessIterator __last, _Compare __comp)
 {
-    typedef typename __comp_ref_type<_Compare>::type _Comp_ref;
-    _VSTD::__nth_element<_Comp_ref>(__first, __nth, __last, __comp);
+  _LIBCPP_DEBUG_RANDOMIZE_RANGE(__first, __last);
+  typedef typename __comp_ref_type<_Compare>::type _Comp_ref;
+  _VSTD::__nth_element<_Comp_ref>(__first, __nth, __last, __comp);
+  _LIBCPP_DEBUG_RANDOMIZE_RANGE(__first, __nth);
+  if (__nth != __last) {
+    _LIBCPP_DEBUG_RANDOMIZE_RANGE(++__nth, __last);
+  }
 }
 template <class _RandomAccessIterator>


@ -18,6 +18,10 @@
#include <__iterator/iterator_traits.h>
#include <__utility/swap.h>
#if defined(_LIBCPP_DEBUG_RANDOMIZE_UNSPECIFIED_STABILITY)
# include <__algorithm/shuffle.h>
#endif
#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
#pragma GCC system_header
#endif
@ -48,8 +52,10 @@ void
 partial_sort(_RandomAccessIterator __first, _RandomAccessIterator __middle, _RandomAccessIterator __last,
              _Compare __comp)
 {
-    typedef typename __comp_ref_type<_Compare>::type _Comp_ref;
-    _VSTD::__partial_sort<_Comp_ref>(__first, __middle, __last, __comp);
+  _LIBCPP_DEBUG_RANDOMIZE_RANGE(__first, __last);
+  typedef typename __comp_ref_type<_Compare>::type _Comp_ref;
+  _VSTD::__partial_sort<_Comp_ref>(__first, __middle, __last, __comp);
+  _LIBCPP_DEBUG_RANDOMIZE_RANGE(__middle, __last);
 }
 template <class _RandomAccessIterator>


@ -25,6 +25,40 @@ _LIBCPP_PUSH_MACROS
_LIBCPP_BEGIN_NAMESPACE_STD
class _LIBCPP_TYPE_VIS __libcpp_debug_randomizer {
public:
__libcpp_debug_randomizer() {
__state = __seed();
__inc = __state + 0xda3e39cb94b95bdbULL;
__inc = (__inc << 1) | 1;
}
typedef uint_fast32_t result_type;
static const result_type _Min = 0;
static const result_type _Max = 0xFFFFFFFF;
_LIBCPP_HIDE_FROM_ABI result_type operator()() {
uint_fast64_t __oldstate = __state;
__state = __oldstate * 6364136223846793005ULL + __inc;
return __oldstate >> 32;
}
static _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR result_type min() { return _Min; }
static _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR result_type max() { return _Max; }
private:
uint_fast64_t __state;
uint_fast64_t __inc;
_LIBCPP_HIDE_FROM_ABI static uint_fast64_t __seed() {
#ifdef _LIBCPP_DEBUG_RANDOMIZE_UNSPECIFIED_STABILITY_SEED
return _LIBCPP_DEBUG_RANDOMIZE_UNSPECIFIED_STABILITY_SEED;
#else
static char __x;
return reinterpret_cast<uintptr_t>(&__x);
#endif
}
};
#if _LIBCPP_STD_VER <= 14 || defined(_LIBCPP_ENABLE_CXX17_REMOVED_RANDOM_SHUFFLE) \
|| defined(_LIBCPP_BUILDING_LIBRARY)
class _LIBCPP_TYPE_VIS __rs_default;
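
The generator added in this hunk is a small linear congruential step whose output is the high 32 bits of the previous state, and it exposes the `result_type`, `min()`, `max()` and `operator()` members that `std::shuffle` needs. The standalone sketch below mirrors that arithmetic so it can be tried outside the library. `DemoRandomizer` and `ShuffledCopy` are illustrative names, and the explicit seed parameter is an assumption of this sketch (the library derives the seed from the macro or from an address).

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Illustrative stand-in for the generator above; not the libc++ class itself.
class DemoRandomizer {
public:
  typedef std::uint32_t result_type;

  explicit DemoRandomizer(std::uint64_t seed)
      : state_(seed), inc_(((seed + 0xda3e39cb94b95bdbULL) << 1) | 1) {}

  // One linear congruential step; returns the high 32 bits of the old state.
  result_type operator()() {
    const std::uint64_t old_state = state_;
    state_ = old_state * 6364136223846793005ULL + inc_;
    return static_cast<result_type>(old_state >> 32);
  }

  static constexpr result_type min() { return 0; }
  static constexpr result_type max() { return 0xFFFFFFFFu; }

private:
  std::uint64_t state_;
  std::uint64_t inc_;  // always odd because of the trailing "| 1"
};

// Shuffles a copy of `v` with a generator seeded from `seed`.
inline std::vector<int> ShuffledCopy(std::vector<int> v, std::uint64_t seed) {
  std::shuffle(v.begin(), v.end(), DemoRandomizer(seed));
  return v;
}
```

Two generators built from the same seed produce the same sequence, so `ShuffledCopy` returns the same permutation for a given seed and standard library.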


@ -18,6 +18,10 @@
#include <__utility/swap.h>
#include <memory>
#if defined(_LIBCPP_DEBUG_RANDOMIZE_UNSPECIFIED_STABILITY)
# include <__algorithm/shuffle.h>
#endif
#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
#pragma GCC system_header
#endif
@ -529,12 +533,13 @@ inline _LIBCPP_INLINE_VISIBILITY _LIBCPP_CONSTEXPR_AFTER_CXX17
 void
 sort(_RandomAccessIterator __first, _RandomAccessIterator __last, _Compare __comp)
 {
-    typedef typename __comp_ref_type<_Compare>::type _Comp_ref;
-    if (__libcpp_is_constant_evaluated()) {
-        _VSTD::__partial_sort<_Comp_ref>(__first, __last, __last, _Comp_ref(__comp));
-    } else {
-        _VSTD::__sort<_Comp_ref>(_VSTD::__unwrap_iter(__first), _VSTD::__unwrap_iter(__last), _Comp_ref(__comp));
-    }
+  _LIBCPP_DEBUG_RANDOMIZE_RANGE(__first, __last);
+  typedef typename __comp_ref_type<_Compare>::type _Comp_ref;
+  if (__libcpp_is_constant_evaluated()) {
+    _VSTD::__partial_sort<_Comp_ref>(__first, __last, __last, _Comp_ref(__comp));
+  } else {
+    _VSTD::__sort<_Comp_ref>(_VSTD::__unwrap_iter(__first), _VSTD::__unwrap_iter(__last), _Comp_ref(__comp));
+  }
 }
 template <class _RandomAccessIterator>


@ -858,16 +858,35 @@ typedef unsigned int char32_t;
 // _LIBCPP_DEBUG potential values:
 // - undefined: No assertions. This is the default.
 // - 0: Basic assertions
-// - 1: Basic assertions + iterator validity checks.
-#if !defined(_LIBCPP_DEBUG)
-# define _LIBCPP_DEBUG_LEVEL 0
-#elif _LIBCPP_DEBUG == 0
-# define _LIBCPP_DEBUG_LEVEL 1
-#elif _LIBCPP_DEBUG == 1
-# define _LIBCPP_DEBUG_LEVEL 2
-#else
-# error Supported values for _LIBCPP_DEBUG are 0 and 1
-#endif
+// - 1: Basic assertions + iterator validity checks + unspecified behavior randomization.
+# if !defined(_LIBCPP_DEBUG)
+# define _LIBCPP_DEBUG_LEVEL 0
+# elif _LIBCPP_DEBUG == 0
+# define _LIBCPP_DEBUG_LEVEL 1
+# elif _LIBCPP_DEBUG == 1
+# define _LIBCPP_DEBUG_LEVEL 2
+# else
+# error Supported values for _LIBCPP_DEBUG are 0 and 1
+# endif
+# if _LIBCPP_DEBUG_LEVEL >= 2 && !defined(_LIBCPP_CXX03_LANG)
+# define _LIBCPP_DEBUG_RANDOMIZE_UNSPECIFIED_STABILITY
+# endif
+# if defined(_LIBCPP_DEBUG_RANDOMIZE_UNSPECIFIED_STABILITY)
+# if defined(_LIBCPP_CXX03_LANG)
+# error Support for unspecified stability is only for C++11 and higher
+# endif
+# define _LIBCPP_DEBUG_RANDOMIZE_RANGE(__first, __last) \
+do { \
+if (!__builtin_is_constant_evaluated()) \
+_VSTD::shuffle(__first, __last, __libcpp_debug_randomizer()); \
+} while (false)
+# else
+# define _LIBCPP_DEBUG_RANDOMIZE_RANGE(__first, __last) \
+do { \
+} while (false)
+# endif
 // Libc++ allows disabling extern template instantiation declarations by
 // means of users defining _LIBCPP_DISABLE_EXTERN_TEMPLATE.


@ -0,0 +1,103 @@
//===----------------------------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
// <algorithm>
// Test std::nth_element stability randomization
// UNSUPPORTED: libcxx-no-debug-mode
// UNSUPPORTED: c++03
// ADDITIONAL_COMPILE_FLAGS: -D_LIBCPP_DEBUG=1
#include <algorithm>
#include <array>
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>
#include "test_macros.h"
struct MyType {
int value = 0;
constexpr bool operator<(const MyType& other) const { return value < other.value; }
};
std::vector<MyType> deterministic() {
static constexpr int kSize = 100;
std::vector<MyType> v;
v.resize(kSize);
for (int i = 0; i < kSize; ++i) {
v[i].value = (i % 2 ? i : kSize / 2 + i);
}
std::__nth_element(v.begin(), v.begin() + kSize / 2, v.end(), std::less<MyType>());
return v;
}
void test_randomization() {
static constexpr int kSize = 100;
std::vector<MyType> v;
v.resize(kSize);
for (int i = 0; i < kSize; ++i) {
v[i].value = (i % 2 ? i : kSize / 2 + i);
}
auto deterministic_v = deterministic();
std::nth_element(v.begin(), v.begin() + kSize / 2, v.end());
bool all_equal = true;
for (int i = 0; i < kSize; ++i) {
if (v[i].value != deterministic_v[i].value) {
all_equal = false;
}
}
assert(!all_equal);
}
void test_same() {
static constexpr int kSize = 100;
std::vector<MyType> v;
v.resize(kSize);
for (int i = 0; i < kSize; ++i) {
v[i].value = (i % 2 ? i : kSize / 2 + i);
}
auto snapshot_v = v;
auto snapshot_custom_v = v;
std::nth_element(v.begin(), v.begin() + kSize / 2, v.end());
std::nth_element(snapshot_v.begin(), snapshot_v.begin() + kSize / 2, snapshot_v.end());
std::nth_element(snapshot_custom_v.begin(), snapshot_custom_v.begin() + kSize / 2, snapshot_custom_v.end(), std::less<MyType>());
bool all_equal = true;
for (int i = 0; i < kSize; ++i) {
if (v[i].value != snapshot_v[i].value || v[i].value != snapshot_custom_v[i].value) {
all_equal = false;
}
if (i < kSize / 2) {
assert(v[i].value <= v[kSize / 2].value);
}
}
assert(all_equal);
}
#if TEST_STD_VER > 17
constexpr bool test_constexpr() {
std::array<MyType, 10> v;
for (int i = 9; i >= 0; --i) {
v[9 - i].value = i;
}
std::nth_element(v.begin(), v.begin() + 5, v.end());
return std::is_partitioned(v.begin(), v.end(), [&](const MyType& m) { return m.value <= v[5].value; });
}
#endif
int main(int, char**) {
test_randomization();
test_same();
#if TEST_STD_VER > 17
static_assert(test_constexpr(), "");
#endif
return 0;
}


@ -0,0 +1,103 @@
//===----------------------------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
// <algorithm>
// Test std::partial_sort stability randomization
// UNSUPPORTED: libcxx-no-debug-mode
// UNSUPPORTED: c++03
// ADDITIONAL_COMPILE_FLAGS: -D_LIBCPP_DEBUG=1
#include <algorithm>
#include <array>
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>
#include "test_macros.h"
struct MyType {
int value = 0;
constexpr bool operator<(const MyType& other) const { return value < other.value; }
};
std::vector<MyType> deterministic() {
static constexpr int kSize = 100;
std::vector<MyType> v;
v.resize(kSize);
for (int i = 0; i < kSize; ++i) {
v[i].value = (i % 2 ? 1 : kSize / 2 + i);
}
std::__partial_sort(v.begin(), v.begin() + kSize / 2, v.end(), std::less<MyType>());
return v;
}
void test_randomization() {
static constexpr int kSize = 100;
std::vector<MyType> v;
v.resize(kSize);
for (int i = 0; i < kSize; ++i) {
v[i].value = (i % 2 ? 1 : kSize / 2 + i);
}
auto deterministic_v = deterministic();
std::partial_sort(v.begin(), v.begin() + kSize / 2, v.end());
bool all_equal = true;
for (int i = 0; i < kSize; ++i) {
if (v[i].value != deterministic_v[i].value) {
all_equal = false;
}
}
assert(!all_equal);
}
void test_same() {
static constexpr int kSize = 100;
std::vector<MyType> v;
v.resize(kSize);
for (int i = 0; i < kSize; ++i) {
v[i].value = (i % 2 ? 1 : kSize / 2 + i);
}
auto snapshot_v = v;
auto snapshot_custom_v = v;
std::partial_sort(v.begin(), v.begin() + kSize / 2, v.end());
std::partial_sort(snapshot_v.begin(), snapshot_v.begin() + kSize / 2, snapshot_v.end());
std::partial_sort(snapshot_custom_v.begin(), snapshot_custom_v.begin() + kSize / 2, snapshot_custom_v.end(), std::less<MyType>());
bool all_equal = true;
for (int i = 0; i < kSize; ++i) {
if (v[i].value != snapshot_v[i].value || v[i].value != snapshot_custom_v[i].value) {
all_equal = false;
}
if (i < kSize / 2) {
assert(v[i].value == 1);
}
}
assert(all_equal);
}
#if TEST_STD_VER > 17
constexpr bool test_constexpr() {
std::array<MyType, 10> v;
for (int i = 9; i >= 0; --i) {
v[9 - i].value = i;
}
std::partial_sort(v.begin(), v.begin() + 5, v.end());
return std::is_sorted(v.begin(), v.begin() + 5);
}
#endif
int main(int, char**) {
test_randomization();
test_same();
#if TEST_STD_VER > 17
static_assert(test_constexpr(), "");
#endif
return 0;
}


@ -0,0 +1,100 @@
//===----------------------------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
// <algorithm>
// Test std::sort stability randomization
// UNSUPPORTED: libcxx-no-debug-mode
// UNSUPPORTED: c++03
// ADDITIONAL_COMPILE_FLAGS: -D_LIBCPP_DEBUG=1
#include <algorithm>
#include <array>
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>
#include "test_macros.h"
struct EqualType {
int value = 0;
constexpr bool operator<(const EqualType&) const { return false; }
};
std::vector<EqualType> deterministic() {
static constexpr int kSize = 100;
std::vector<EqualType> v;
v.resize(kSize);
for (int i = 0; i < kSize; ++i) {
v[i].value = kSize / 2 - i * (i % 2 ? -1 : 1);
}
std::__sort(v.begin(), v.end(), std::less<EqualType>());
return v;
}
void test_randomization() {
static constexpr int kSize = 100;
std::vector<EqualType> v;
v.resize(kSize);
for (int i = 0; i < kSize; ++i) {
v[i].value = kSize / 2 - i * (i % 2 ? -1 : 1);
}
auto deterministic_v = deterministic();
std::sort(v.begin(), v.end());
bool all_equal = true;
for (int i = 0; i < kSize; ++i) {
if (v[i].value != deterministic_v[i].value) {
all_equal = false;
}
}
assert(!all_equal);
}
void test_same() {
static constexpr int kSize = 100;
std::vector<EqualType> v;
v.resize(kSize);
for (int i = 0; i < kSize; ++i) {
v[i].value = kSize / 2 - i * (i % 2 ? -1 : 1);
}
auto snapshot_v = v;
auto snapshot_custom_v = v;
std::sort(v.begin(), v.end());
std::sort(snapshot_v.begin(), snapshot_v.end());
std::sort(snapshot_custom_v.begin(), snapshot_custom_v.end(),
[](const EqualType&, const EqualType&) { return false; });
bool all_equal = true;
for (int i = 0; i < kSize; ++i) {
if (v[i].value != snapshot_v[i].value || v[i].value != snapshot_custom_v[i].value) {
all_equal = false;
}
}
assert(all_equal);
}
#if TEST_STD_VER > 17
constexpr bool test_constexpr() {
std::array<EqualType, 10> v;
for (int i = 9; i >= 0; --i) {
v[9 - i].value = i;
}
std::sort(v.begin(), v.end());
return std::is_sorted(v.begin(), v.end());
}
#endif
int main(int, char**) {
test_randomization();
test_same();
#if TEST_STD_VER > 17
static_assert(test_constexpr(), "");
#endif
return 0;
}