LoopSink with the legacy pass manager still uses AST, because we
can't compute MemorySSA conditionally. I think now that the legacy
pass manager will be removed soon(TM) we don't need to care about
compile-time impact here anymore. Additionally, since MemorySSA is
no longer eagerly optimized, the impact is actually not that high
anymore (~0.2% geomean regression on CTMark).
This just makes legacy PM and new PM behavior line up -- as a
followup I'll drop these options entirely and make MemorySSA use
mandatory.
Differential Revision: https://reviews.llvm.org/D123216
Another change of the code design.
Code simplified again, now there is a single place to check
a handler function and less functions for bug report emitting.
More details are added to the bug report messages.
Reviewed By: whisperity
Differential Revision: https://reviews.llvm.org/D118370
Follow-up from 98bc304e9f - while that
commit fixed when you had two PDBs colliding on the same Guid it didn't
fix the case where you had more than two PDBs using the same Guid.
This commit fixes that and also tests much more carefully that all
the types are correct no matter the order.
Reviewed By: aganea, saudi
Differential Revision: https://reviews.llvm.org/D123185
debugserver does not call thread_set_state when changing xmm/ymm/zmm
register values, so the register contents are never updated. Fix
that. Mark the shell tests which xfail'ed these tests on darwin systems
to xfail them when the system debugserver, they will pass when using
the in-tree debugserver. When this makes it into the installed
system debugservers, we'll remove the xfails.
Differential Revision: https://reviews.llvm.org/D123269
rdar://91258333
rdar://31294382
(The upgrade of the ppc64le bot and D121257 have fixed compiler-rt failures. Tested by nemanjai.)
Default the option introduced in D113372 to ON to match all(?) major Linux
distros. This matches GCC and improves consistency with Android and linux-musl
which always default to PIE.
Note: CLANG_DEFAULT_PIE_ON_LINUX may be removed in the future.
Differential Revision: https://reviews.llvm.org/D120305
Use new NotAtomic expansion to turn these into the equivalent
non-atomic operations. Independent lanes cannot access the private
memory of other lanes, so there's no possibility for synchronization.
These don't really appear directly in user code, but
InferAddressSpaces can make these appear after optimizations.
Fixes issues 54693 and 54274.
Currently LowerAtomics exists as a separate pass which blindly
replaces all atomics. Add a new lowering strategy option to eliminate
the atomics which the target can control on a per-instruction level.
Use the same enum as the other atomic instructions for consistency, in
preparation for addition of another strategy.
Introduce a new "Expand" option, since the store expansion does not
use cmpxchg. Alternatively, the existing CmpXChg strategy could be
renamed to Expand.
In a clean build directory, `check-openmp` or `check-libomptarget` will fail because of missing device RTL .bc files. Ensure that the new targets new custom targets `omptarget.devicertl.nvptx` and `omptarget.devicertl.amdgpu` (corresponding to the plugin rtl targets `omptarget.rtl.cuda`, respectively `omptarget.rlt.amdgpu` ) are dependencies of the regression tests.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D123177
This commit refactors the expected form of native constraint and rewrite
functions, and greatly reduces the necessary user complexity required when
defining a native function. Namely, this commit adds in automatic processing
of the necessary PDLValue glue code, and allows for users to define
constraint/rewrite functions using the C++ types that they actually want to
use.
As an example, lets see a simple example rewrite defined today:
```
static void rewriteFn(PatternRewriter &rewriter, PDLResultList &results,
ArrayRef<PDLValue> args) {
ValueRange operandValues = args[0].cast<ValueRange>();
TypeRange typeValues = args[1].cast<TypeRange>();
...
// Create an operation at some point and pass it back to PDL.
Operation *op = rewriter.create<SomeOp>(...);
results.push_back(op);
}
```
After this commit, that same rewrite could be defined as:
```
static Operation *rewriteFn(PatternRewriter &rewriter ValueRange operandValues,
TypeRange typeValues) {
...
// Create an operation at some point and pass it back to PDL.
return rewriter.create<SomeOp>(...);
}
```
Differential Revision: https://reviews.llvm.org/D122086
Rationale:
Allocating the temporary buffers for access pattern expansion on the stack
(using alloca) is a bit too agressive, since it easily runs out of stack space
for large enveloping tensor dimensions. This revision changes the dynamic
allocation of these buffers with explicit alloc/dealloc pairs.
Reviewed By: bixia, wrengr
Differential Revision: https://reviews.llvm.org/D123253
LLVM so far has only supported the MIPS-II and above architectures. MIPS-II is pretty close to MIPS-I, the major difference
being that "load" instructions always take one extra instruction slot to propogate to registers. This patch adds support for
MIPS-I by adding hazard handling for load delay slots, alongside MIPSR6 forbidden slots and FPU slots, inserting a NOP
instruction between a load and any instruction immediately following that reads the load's destination register. I also
included a simple regression test. Since no existing tests target MIPS-I, those all still pass.
Issue ref: https://github.com/simias/psx-sdk-rs/issues/1
I also tested by building a simple demo app with Clang and running it in an emulator.
Patch by: @impiaaa
Differential Revision: https://reviews.llvm.org/D122427
The underlying TLS destruction order bug has been fixed in the OS. This
would technically still fail when running on top of macOS < 12, however
we don't have a good way of encoding that using Lit features. Indeed,
the existing target=<FOO> Lit feature encodes the deployment target,
not the actual runtime system that the tests are being run on.
If this test starts failing on your machine after this patch, upgrading
to macOS 12 should solve the problem.
The previous check for interceptors used `pthread_create()` which is not
available on DriverKit. We need an intercepted symbol that satisfies
the following constraints:
- Symbol is available in DriverKit
- Symbol is provided by simulator runtime dylibs (`dlsym()` fails to
look up host-provided symbols)
`puts()` satisfies all of the above constraints.
rdar://87895539
Reviewed By: yln
Differential Revision: https://reviews.llvm.org/D123245
static_cast is a little safer here since the compiler will
ensure we're casting to a class derived from
yaml::MachineFunctionInfo.
I believe this first appeared on AMDGPU and was copied to the
other two targets.
Spotted when it was being copied to RISCV in D123178.
Differential Revision: https://reviews.llvm.org/D123260
Our support for building for baremetal was conditional on a default
off arg and would have failed to build if you had somehow arranged
to pass the correct --target flag; presumably nobody noticed because
nobody was turning it on. A better approach is to model baremetal
as a separate "OS" called "baremetal" and build it in the same way
as we cross-compile for other targets. That's what this patch does.
I only hooked up the arm64 target but others can be added.
Relanding after fixing Mac build breakage in D123244.
Differential Revision: https://reviews.llvm.org/D122862
When cross-compiling from Mac to non-Mac, we need to use the just-built
llvm-ar instead of libtool. We're currently doing the right thing
when determining which archiver command to use, but the path to ar
and the toolchain dependencies were being set based on the host OS
(current_os evaluated in host OS toolchain), instead of the target
OS. Fix the problem by looking up current_os inside toolchain_args.
Differential Revision: https://reviews.llvm.org/D123244
Adds `mlirBlockDetach` to the CAPI to remove a block from its parent
region. Use it in the Python bindings to implement
`Block.append_to(region)`.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D123165
The demangler has a utility class 'SwapAndRestore'. That name is
confusing. It's not swapping anything, and the restore part happens at
the object's destruction. What it's actually doing is allowing a
override of some value that is dynamically accessible within the
lifetime of a lexical scope. Thus rename it to ScopedOverride, and
tweak it's member variable names.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D122606
There are no such markings left - all of them have been fixed or
analyzed.
This closes llvm.org/PR32730 (github issue #32077).
Differential Revision: https://reviews.llvm.org/D123145
In COFF, the immediates in IMAGE_REL_ARM64_PAGEBASE_REL21 relocations
are limited to 21 bit signed, i.e. the offset has to be less than
(1 << 20). The previous limit did intend to cover for this case, but
had missed that the 21 bit field was signed.
This fixes issue https://github.com/llvm/llvm-project/issues/54753.
Differential Revision: https://reviews.llvm.org/D123160
This matches how another similar warning is silenced in
Host/posix/ProcessLauncherPosixFork.cpp.
Differential Revision: https://reviews.llvm.org/D123205
This silences warnings like this:
lldb/source/Core/DebuggerEvents.cpp: In member function ‘llvm::StringRef lldb_private::DiagnosticEventData::GetPrefix() const’:
lldb/source/Core/DebuggerEvents.cpp:55:1: warning: control reaches end of non-void function [-Wreturn-type]
55 | }
Differential Revision: https://reviews.llvm.org/D123203