This patch adds codegen for the following BFloat
operations to the ARM backend:
* concatenation of bf16 vectors
* bf16 vector element extraction
* bf16 vector element insertion
* duplication of a bf16 value into each lane of a vector
* duplication of a bf16 vector lane into each lane
Differential Revision: https://reviews.llvm.org/D81411
The outputs between the direct ast-dump test and the ast-dump test after
deserialization should match modulo a few differences.
For hand-written tests, strip the "<undeserialized declarations>"s and
the "imported"s with sed.
For tests generated with "make-ast-dump-check.sh", regenerate the
output.
Part 1/n.
On AIX, we use __atexit to register dtor functions rather than __cxa_atexit.
So a driver change is needed to default AIX to using -fno-use-cxa-atexit.
Windows platform does not uses __cxa_atexit either. Following its precedent,
we remove the assertion for when -fuse-cxa-atexit is specified by the user,
do not produce a message and silently default to -fno-use-cxa-atexit behavior.
Differential Revision: https://reviews.llvm.org/D82136
1. Provides no piroirity supoort && disables three priority related
attributes: init_priority, ctor attr, dtor attr;
2. '-qunique' in XL compiler equivalent behavior of emitting sinit
and sterm functions name using getUniqueModuleId() util function
in LLVM (currently no support for InternalLinkage and WeakODRLinkage
symbols);
3. Add testcases to emit IR sample with __sinit80000000, __dtor, and
__sterm80000000;
4. Temporarily side-steps the need to implement the functionality of
llvm.global_ctors and llvm.global_dtors arrays. The uses of that
functionality in this patch (with respect to the name of the functions
involved) are not representative of how the functionality will be used
once implemented.
Differential Revision: https://reviews.llvm.org/D74166
Summary:
We can't resolve this (if it's a symlink) without further refactoring, but the
current behaviour is just incorrect.
Reviewers: kadircet
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D82011
Similar to D81937, we might crash when printing a histogram for a GNU hash table
with a 'symndx' index that is larger than the number of dynamic symbols.
This patch adopts and reuses the `getGnuHashTableChains()` helper which performs
a validation of the table. As a side effect the warning reported for
the --gnu-hash-table was improved.
Also with this change we start to report a warning when the histogram is requested for
the GNU hash table, but the dynamic symbols table is empty (size == 0).
Differential revision: https://reviews.llvm.org/D82010
All cases in the test are supported now, it only still failed because an
over-eager regex match not accounting for `, align ` being added to each
load/store now.
Extend the `InheritParentConfig` support introduced in D75184 for the command line option `--config`.
The current behaviour of `--config` is to when set, disable looking for `.clang-tidy` configuration files.
This new behaviour lets you set `InheritParentConfig` to true in the command line to then look for `.clang-tidy` configuration files to be merged with what's been specified on the command line.
Reviewed By: DmitryPolukhin
Differential Revision: https://reviews.llvm.org/D81949
Without SSE41 we don't have the PCMPEQQ instruction, making cmp-with-zero reductions more complicated than necessary. We can compare as vXi32 (PCMPEQD) and tweak the MOVMSK comparison to test upper/lower DWORD comparisons.
This pre-fixes something that occurs with null tests for vectors of (64-bit) pointers such as in PR35129.
Summary: It was used inside buildCompilerInvocation to speed up stats. But
preambleStatCache doesn't contain stat information performed while
building compiler invocation. So it was an unnecessary optimization.
Furthermore, buildCompilerInvocation in scanPreamble doesn't need to
find gcc installation, include paths and such, as it is only trying to
lex directives. Hence we are passing an empty FS to get rid of any
redundant IO.
Reviewers: sammccall
Subscribers: ilya-biryukov, javed.absar, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81719
Summary:
Clangd uses FSProvider to get threadsafe views into file systems. This
patch changes naming to make that more explicit.
Depends on D81920
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81998
Summary:
We've faced a couple of problems when the returned FS didn't have the
proper working directory. New signature makes the API safer against such
problems.
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81920
At the moment we use Global ISel by default at -O0, however it is
currently not capable of dealing with scalable vectors for two
reasons:
1. The register banks know nothing about SVE registers.
2. The LLT (Low Level Type) class knows nothing about scalable
vectors.
For now, the easiest way to avoid users hitting issues when using
the SVE ACLE is to fall back on normal DAG ISel when encountering
instructions that operate on scalable vector types.
I've added a couple of RUN lines to existing SVE tests to ensure
we can compile at -O0. I've also added some new tests to
CodeGen/AArch64/GlobalISel/arm64-fallback.ll
that demonstrate we correctly fallback to DAG ISel at -O0 when
lowering formal arguments or translating instructions that involve
scalable vector types.
Differential Revision: https://reviews.llvm.org/D81557
The struct store intrinsics in LLVM IR take the individual parts
as arguments, so this patch uses the intrinsics used for `svget`
to break the tuples into individual parts.
Reviewers: c-rhodes, efriedma, ctetreau, david-arm
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81466
Code does not track terminators and do not expose them through interface.
State there is just a state of the last instruction or entry.
So this information is just redundant and doesn't need to be tested.
This patch updates SCCP/IPSCCP to use the computed range info to turn
sexts into zexts, if the value is known to be non-negative. We already
to a similar transform in CorrelatedValuePropagation, but it seems like
we can catch a lot of additional cases by doing it in SCCP/IPSCCP as
well.
The transform is limited to ranges that are known to not include undef.
Currently constant ranges from conditions are treated as potentially
containing undef, due to PR46144. Once we flip this, the transform will
be more effective in practice.
Reviewers: efriedma, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D81756
Without this fix, handleMoveUp can create an invalid live range like
this:
[98904e,98908r:0)[98908e,227504r:1)
where the two segments overlap, but only because we have lost the "e"
(early-clobber) on the end point of the first segment.
Differential Revision: https://reviews.llvm.org/D82110
Summary:
this reduces significantly the number of assumes generated without aftecting too much
the information that is preserved. this improves the compile-time cost
of enable-knowledge-retention significantly.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, asbirlea, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79650
All class derived from `edsc::NestedBuilder` in core MLIR have been replaced
with alternatives based on OpBuilder+callbacks. The *Builder EDSC
infrastructure has been deprecated. Remove edsc::NestedBuilder.
This completes the "structured builders" refactoring.
Differential Revision: https://reviews.llvm.org/D82128
Callback-based constructions of blocks where the body is populated in the same
function as the block creation is a natural extension of callback-based loop
construction. They provide more concise and simple APIs than EDSC BlockBuilder
at less than 20% infrastructural code cost, and are compatible with
ScopedContext. BlockBuilder, Blockhandle and related functionality has been
deprecated, remove them.
Differential Revision: https://reviews.llvm.org/D82015
Callback-based loop construction, with loop bodies being constructed during the
construction of the parent op using a function, is now fully supported by the
core infrastructure. This provides almost the same level of brevity as EDSC
LoopBuilder at less than 30% infrastructural code cost. Functional equivalents
compatible with EDSC ScopedContext are implemented on top of the main builders.
LoopBuilder and related functionality has been deprecated, remove it.
Differential Revision: https://reviews.llvm.org/D81874
For now I have changed SimplifyDemandedBits and it's various callers
to assume we know nothing for scalable vectors and to ignore the
demanded bits completely. I have also done something similar for
SimplifyDemandedVectorElts. These changes fix up lots of warnings
due to calls to EVT::getVectorNumElements() for types with scalable
vectors. These functions are all used for optimisations, rather than
functional requirements. In future we can revisit this code if
there is a need to improve code quality for SVE.
Differential Revision: https://reviews.llvm.org/D80537
When trying to calculate the number of sign bits for scalable vectors
we should just bail out for now and pretend we know nothing.
Differential Revision: https://reviews.llvm.org/D81093
This patch adds the `default_triple` feature to MLIR test suite.
This feature was added to LLVM in d178f4fc8 in order to be able to
run the LLVM tests without having the host targets configured in.
With this change, `ninja check-mlir` passes without the host
target, i.e. this config:
cmake ../llvm -DLLVM_TARGETS_TO_BUILD="" -DLLVM_DEFAULT_TARGET_TRIPLE="" -DLLVM_ENABLE_PROJECTS=mlir -GNinja
Differential Revision: https://reviews.llvm.org/D82142
These tests involve a JIT, and like other tests should have the
REQUIRE: default_triple present.
This allow to run `ninja check` without the host target configured
in.
The accepted options to -mharden-sls= are:
* all: enable all mitigations against Straight Line Speculation that are
implemented.
* none: disable all mitigations against Straight Line Speculation.
* retbr: enable the mitigation against Straight Line Speculation for RET
and BR instructions.
* blr: enable the mitigation against Straight Line Speculation for BLR
instructions.
Differential Revision: https://reviews.llvm.org/D81404
Reviewers: dylanmckay
Reviewed By: dylanmckay
Subscribers: Jim, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77334
This was originally committed in
03b0831144 but I missed the commit
attribution.
Patch by Dennis van der Schagt.
A "BTI c" instruction only allows jumping/calling to using a BLR* instruction.
However, the SLSBLR mitigation changes a BLR to a BR to implement the
function call. Therefore, a "BTI c" check that passed before could
trigger after the BLR->BL change done by the SLSBLR mitigation.
However, if the register used in BR is X16 or X17, this trigger will not
fire (see ArmARM for further details).
Therefore, this patch simply changes the function stubs for the SLSBLR
mitigation from
__llvm_slsblr_thunk_x<N>:
br x<N>
SpeculationBarrier
to
__llvm_slsblr_thunk_x<N>:
mov x16, x<N>
br x16
SpeculationBarrier
Differential Revision: https://reviews.llvm.org/D81405