Commit graph

416700 commits

Author SHA1 Message Date
Jakub Chlanda a895182302 [NVPTX] Add more FMA intrinsics/builtins
NOTE: follow-up commit with the missing clang-side changes.

This patch adds builtins/intrinsics for the following variants of FMA:

- f16, f16x2
  - rn
  - rn_ftz
  - rn_sat
  - rn_ftz_sat
  - rn_relu
  - rn_ftz_relu
- bf16, bf16x2
  - rn
  - rn_relu

ptxas (Cuda compilation tools, release 11.0, V11.0.194) is happy with the generated assembly.

Differential Revision: https://reviews.llvm.org/D118977
2022-03-01 11:07:11 -08:00
Jakub Chlanda 7a6d692b3b [NVPTX] Expose float tys min, max, abs, neg as builtins
Adds support for the following builtins:

abs, neg:
- .bf16
- .bf16x2

min, max:
- {.ftz}{.NaN}{.xorsign.abs}.f16
- {.ftz}{.NaN}{.xorsign.abs}.f16x2
- {.NaN}{.xorsign.abs}.bf16
- {.NaN}{.xorsign.abs}.bf16x2
- {.ftz}{.NaN}{.xorsign.abs}.f32

Differential Revision: https://reviews.llvm.org/D117887
2022-03-01 11:07:11 -08:00
Zequan Wu e527986a9c [llvm-pdbutil] Fix crashes when TypeIndex is simple or doesn't exist in type stream
- Print simple TypeIndex
- Print error message when type doesn't exist.

Differential Revision: https://reviews.llvm.org/D120692
2022-03-01 11:04:56 -08:00
Bixia Zheng 20eaa88fff [mlir][sparse] Extend convertToMLIRSparseTensor to support permutation and more general sparsity values.
Previously, convertToMLIRSparseTensor assumes identity storage ordering and all
compressed dimensions. This change extends the function with two parameters for
users to specify the storage ordering and the sparsity of each dimension.

Modify PyTACO to reflect this change.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D120643
2022-03-01 10:51:39 -08:00
Michał Górny ba4f1e44e4 [libcxx] Add an explicit option to build against system-libcxxabi
Add an explicit LIBCXX_CXX_ABI=system-libcxxabi option for linking to
system-installed libc++abi. This fixes the ability to link against one
when building libcxx via the runtimes build, as otherwise the build
system insists on linking into in-tree targets.

Differential Revision: https://reviews.llvm.org/D119539
2022-03-01 13:44:56 -05:00
Jorge Gorbe Moya 3de4e6b400 [bazel] add missing dependency 2022-03-01 10:43:27 -08:00
William S. Moses 78fb4f9d5d [SCF][MemRef] Enable SCF.Parallel Lowering to use Scope Op
As discussed in https://reviews.llvm.org/D119743, scf.parallel would continuously stack allocate since the alloca op was placed in the wsloop rather than the omp.parallel. This PR is the second stage of the fix for that problem. Specifically, we now introduce an alloca scope around the inlined body of the scf.parallel and enable a canonicalization to hoist the allocations to the surrounding allocation scope (e.g. omp.parallel).

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D120423
2022-03-01 13:25:09 -05:00
Craig Topper b9d6e8c441 [RISCV] Lower VECTOR_SPLICE to RVV instructions.
This lowers VECTOR_SPLICE of scalable vectors to a slidedown followed by a slideup.
Fixed vectors are encouraged to use shufflevector instruction. The equivalent patch
for fixed vectors is D119039.

I've used a tail agnostic slidedown and limited the VL to only the
elements that will not be overwritten by the slideup. The slideup
uses VLMax for its VL. It unfortunately uses tail undisturbed policy
but it isn't required as there is no tail. We just need the merge
operand to carry the bits for the lower portion of the result.

Care was taken to ensure that either the slideup or slidedown will
be able to use a .vi instruction when the immediate is small. Which
one uses the immediate depends on the sign of the immediate.

Reviewed By: frasercrmck, ABataev

Differential Revision: https://reviews.llvm.org/D119303
2022-03-01 10:10:13 -08:00
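
The slidedown-then-slideup lowering described in the VECTOR_SPLICE commit above can be modelled with plain Python lists. This is only a sketch of the data movement, not the RISC-V backend code: the helper names `slidedown`, `slideup`, and `vector_splice` are hypothetical, a scalable vector is modelled as a fixed-length list, and tail-agnostic elements are modelled as `None`.

```python
def slidedown(src, offset, vl):
    """Element i reads src[i + offset] for i < vl.

    Elements at or past vl are tail elements; under a tail-agnostic
    policy their value is unspecified, modelled here as None.
    """
    n = len(src)
    assert offset + vl <= n, "this sketch never reads past the end"
    return [src[i + offset] if i < vl else None for i in range(n)]


def slideup(dest, src, offset, vl):
    """Elements below offset keep dest; element i >= offset reads src[i - offset]."""
    body = [dest[i] if i < offset else src[i - offset] for i in range(vl)]
    return body + dest[vl:]


def vector_splice(v1, v2, imm):
    """Splice modelled as a slidedown of v1 followed by a slideup of v2."""
    n = len(v1)
    # A negative immediate selects the trailing -imm elements of v1.
    down = imm if imm >= 0 else n + imm
    up = n - down
    # VL of the slidedown is limited to the elements the slideup keeps.
    low = slidedown(v1, down, vl=up)
    # The slideup uses the full vector length (VLMax in the commit text).
    return slideup(low, v2, up, vl=n)


def reference_splice(v1, v2, imm):
    """Direct definition: a window of len(v1) elements over v1 ++ v2."""
    n = len(v1)
    start = imm if imm >= 0 else n + imm
    return (v1 + v2)[start:start + n]
```

For example, `vector_splice([0, 1, 2, 3], [4, 5, 6, 7], 1)` first produces `[1, 2, 3, None]` from the slidedown, and the slideup then fills the last slot from the second vector. No `None` survives in the result, which mirrors the commit's remark that there is no tail after the slideup.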
Craig Topper 7bc6667845 [Analysis] Simplify the interface to llvm::getICmpCode. NFC
Instead of passing an ICmpInst * and a bool just pass the predicate
from the caller.

I'm considering moving the similar FCmp functions from InstCombine
over here and this makes the interface consistent with what is used
for FCmp.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120609
2022-03-01 09:53:27 -08:00
Tong Zhang 17ce89fa80 [SanitizerBounds] Add support for NoSanitizeBounds function
Currently adding attribute no_sanitize("bounds") isn't disabling
-fsanitize=local-bounds (also enabled in -fsanitize=bounds). The Clang
frontend handles fsanitize=array-bounds which can already be disabled by
no_sanitize("bounds"). However, instrumentation added by the
BoundsChecking pass in the middle-end cannot be disabled by the
attribute.

The fix is very similar to D102772 that added the ability to selectively
disable sanitizer pass on certain functions.

In this patch, if no_sanitize("bounds") is provided, an additional
function attribute (NoSanitizeBounds) is attached to IR to let the
BoundsChecking pass know we want to disable local-bounds checking. In
order to support this feature, the IR is extended (similar to D102772)
so that Clang can preserve the information and let the BoundsChecking
pass know that bounds checking is disabled for certain functions.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D119816
2022-03-01 18:47:02 +01:00
Malhar Jajoo 6d658f37a4 [Openmp]: Missing import statement in openmp interface for Fortran
Essentially removed the "use omp_lib_kinds" statement and replaced it
with import to maintain consistency (and avoid compilation error
in case the omp_lib_kinds.mod file is not accessible) in header file.

The import is required to access entities in host scoping unit.

Differential Revision: https://reviews.llvm.org/D120707
2022-03-01 17:33:06 +00:00
Louis Dionne 97e013dd6b [libc++] Re-generate header tests
This must have been missed in 368faacac7.
2022-03-01 12:19:21 -05:00
Siva Chandra Reddy 75747c7394 [libc] Remove the remaining uses of stdatomic.h.
New methods to the Atomic class have been added as required. Futex
related types have been consolidated at a common place.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D120705
2022-03-01 17:12:39 +00:00
Lei Zhang c809c9bd3b [mlir][spirv] Convert gpu.barrier to spv.ControlBarrier
Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D120722
2022-03-01 12:04:00 -05:00
serge-sans-paille 71c3a5519d Cleanup includes: LLVMAnalysis
Number of lines output by preprocessor:
before: 1065940348
after:  1065307662

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120659
2022-03-01 18:01:54 +01:00
Jay Foad 289339140e [AMDGPU] Handle legacy multiply-accumulate opcodes in convertToThreeAddress
Handle V_MAC_LEGACY_F32 and V_FMAC_LEGACY_F32 in
convertToThreeAddress, to avoid the need for an extra mov
instruction in some cases.

Differential Revision: https://reviews.llvm.org/D120704
2022-03-01 16:58:00 +00:00
Jay Foad 9ac3a85047 [AMDGPU] Disentangle MFMA handling in convertToThreeAddress. NFC.
Move MFMA handling to the top of convertToThreeAddress and pull
IsF16 calculation out of the switch. I think this makes it clearer
exactly which mac/fmac opcodes are handled, since they are now
listed in the switch with minimal extra clutter.

Differential Revision: https://reviews.llvm.org/D120703
2022-03-01 16:56:56 +00:00
Martin Storsjö 9ffeaaa0ea [LLD] [COFF] Use StringTableBuilder to optimize the string table
This does tail merging (and deduplication) of the strings.

On a statically linked clang.exe, this shrinks the ~17 MB string
table by around 0.5 MB. This adds ~160 ms to the linking time
which originally was around 950 ms.

For cases where `-debug:symtab` or `-debug:dwarf` isn't set, the
string table is only used for long section names, where this
shouldn't make any difference at all.

Differential Revision: https://reviews.llvm.org/D120677
2022-03-01 18:44:03 +02:00
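
To illustrate what tail merging and deduplication mean for a string table, here is a minimal Python sketch. It is not LLVM's `StringTableBuilder` implementation: the function `build_string_table` is hypothetical, strings are assumed to be ASCII, and the offsets ignore any size header that a real COFF string table carries.

```python
def build_string_table(strings):
    """Return (table_bytes, offsets) with tail merging and deduplication.

    A string that is a suffix of another string shares its bytes: its
    offset points into the longer string, and both end at the same NUL.
    """
    offsets = {}
    table = bytearray()
    prev = None
    # set() removes exact duplicates. Sorting by reversed text in
    # descending order places each string directly after a string that
    # it is a suffix of, whenever such a string exists.
    for s in sorted(set(strings), key=lambda x: x[::-1], reverse=True):
        if prev is not None and prev.endswith(s):
            # Reuse the tail of the previous string.
            offsets[s] = offsets[prev] + len(prev) - len(s)
        else:
            offsets[s] = len(table)
            table += s.encode("ascii") + b"\0"
        prev = s
    return bytes(table), offsets


def read_string(table, offset):
    """Read the NUL-terminated string that starts at offset."""
    end = table.index(b"\0", offset)
    return table[offset:end].decode("ascii")
```

With the input `["foobar", "bar", "foobar", "baz"]`, only `baz` and `foobar` are stored; `bar` resolves to an offset inside `foobar`. The saving grows with the number of symbol names that share suffixes.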
Jay Foad f9c545e1e2 [AMDGPU] Fix test_fmaak_otherimm_src0_f64 test
Judging by the name, and comparing with the f32 version, this was
supposed to be testing that FMAC with a non-inlinable constant
operand did not get converted to FMA.
2022-03-01 16:35:19 +00:00
Erich Keane c601377b23 [NFC]Promote addInstantiatedParametersToScope to a private Sema function
This is used in a few places in SemaTemplateInstantiateDecl, but is going
to be useful in SemaConcept.cpp as well. This patch switches it to be
a private function in Sema.

Differential Revision: https://reviews.llvm.org/D120729
2022-03-01 08:31:51 -08:00
Kristóf Umann 32ac21d049 [NFC][analyzer] Allow CallDescriptions to be matched with CallExprs
Since CallDescriptions can only be matched against CallEvents that are created
during symbolic execution, it was not possible to use it in syntactic-only
contexts. For example, even though InnerPointerChecker can check with its set of
CallDescriptions whether a function call is of interest during analysis, it's
unable to check without hassle whether a non-analyzer piece of code also calls
such a function.

The patch adds the ability to use CallDescriptions in syntactic contexts as
well. While we already have that in Signature, we still want to leverage the
ability to use dynamic information when we have it (function pointers, for
example). This could be done with Signature as well (StdLibraryFunctionsChecker
does it), but it makes it even less of a drop-in replacement.

Differential Revision: https://reviews.llvm.org/D119004
2022-03-01 17:13:04 +01:00
Joe Nash fa55ac6c27 [UpdateTestChecks][AMDGPU] Run test update script
NFC. Run the mir test auto-update script. These tests haven't been updated
since the script changed from inserting CHECK to CHECK-NEXT.
2022-03-01 10:45:03 -05:00
Tue Ly 4816bfa838 [libc] Add LLVM_LIBC_CLANG_TIDY option and allow LLVM_LIBC_ENABLE_LINTING without full build.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D119180
2022-03-01 11:04:29 -05:00
Jun Zhang ac616fbb05 [Clang-tidy] Check the existence of ElaboratedType's qualifiers
The ElaboratedType can have no qualifiers, so we should check it before
use.

Fixes #53874 (https://github.com/llvm/llvm-project/issues/53874)

Differential Revision: https://reviews.llvm.org/D119949
2022-03-01 23:52:44 +08:00
Simon Pilgrim 70ab0a9b62 [X86] Add vector shift by scalar test with bitcasted scalar shift amount
As noted on D120553, we didn't have any tests that explicitly showed the bitcast - we were relying on i64 -> i32 legalization on 32-bit targets
2022-03-01 15:40:40 +00:00
Craig Topper bf8054644d [DAGCombiner] Don't expand (neg (abs x)) if the abs has an additional user.
If the types aren't legal, the expansions may get type legalized in a
different way preventing code sharing. If the type is legal, we will
share some instructions between the two expansions, but we will need an
extra register.

Since we don't appear to fold (neg (sub A, B)) if the sub has an
additional user, I think it makes sense not to expand NABS.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120513
2022-03-01 07:32:07 -08:00
Craig Topper c752eb4ae1 [RISCV] Add test cases for miscompile of (rotl (grevi X, 24), 16) on RV64. NFC
This pattern was moved from isel to DAG combine in D119527, but
it lost the RV32 qualification in the process.
2022-03-01 07:32:07 -08:00
Nikita Popov a1f442b278 [InstCombine] Support phi to cond fold with more than two preds
This transform can still be applied if there are more than two
phi inputs, as long as phi inputs with the same value are dominated
by the same idom edge.
2022-03-01 16:31:49 +01:00
Stefan Pintilie a84a8c937b [PowerPC] Remove redundant MMA patterns.
There are two MMA patterns that have been added twice. This patch just removes
one set of patterns. Should not change the way MMA behaves.

Reviewed By: lei, #powerpc

Differential Revision: https://reviews.llvm.org/D120680
2022-03-01 09:13:21 -06:00
Nikita Popov 0bb698a2fb [InstCombine] Add additional test for phi to condition fold (NFC)
This one does not have an intermediate block for the true branch,
and demonstrates the importance of using edge dominance.
2022-03-01 16:08:47 +01:00
Adam Czachorowski 8f4ea36bfe [clang] Improve laziness of resolving module map headers.
clang has support for lazy headers in module maps - if size and/or
modtime are provided in the cppmap file, headers are only resolved when
an include directive for a file with that size/modtime is encountered.

Before this change, the lazy resolution was all-or-nothing per module.
That means as soon as even one file in that module potentially matched
an include, all lazy files in that module were resolved. With this
change, only files with matching size/modtime will be resolved.

The goal is to avoid unnecessary stat() calls on non-included files,
which is especially valuable on networked file systems, with higher
latency.

Differential Revision: https://reviews.llvm.org/D120569
2022-03-01 15:56:23 +01:00
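
The difference between all-or-nothing and per-file lazy resolution, as described in the module map commit above, can be sketched as follows. This is an illustrative model rather than clang's code: `LazyHeader`, `Module`, and both `on_include_*` methods are hypothetical names, every lazy header is assumed to declare both size and modtime, and a counter stands in for the `stat()` calls.

```python
from dataclasses import dataclass, field


@dataclass
class LazyHeader:
    name: str
    size: int
    mtime: int


@dataclass
class Module:
    lazy: list
    resolved: list = field(default_factory=list)
    stat_calls: int = 0

    def _resolve(self, header):
        # Stands in for a stat() on the header's path.
        self.stat_calls += 1
        self.resolved.append(header.name)

    def on_include_all_or_nothing(self, size, mtime):
        """Old behaviour: one potential match resolves every lazy header."""
        if any(h.size == size and h.mtime == mtime for h in self.lazy):
            for h in self.lazy:
                self._resolve(h)
            self.lazy = []

    def on_include_selective(self, size, mtime):
        """New behaviour: resolve only headers whose size/modtime match."""
        still_lazy = []
        for h in self.lazy:
            if h.size == size and h.mtime == mtime:
                self._resolve(h)
            else:
                still_lazy.append(h)
        self.lazy = still_lazy


def make_module():
    return Module(lazy=[
        LazyHeader("a.h", 100, 1),
        LazyHeader("b.h", 200, 2),
        LazyHeader("c.h", 300, 3),
    ])
```

Calling `on_include_selective(200, 2)` on a fresh module resolves only `b.h` and leaves the other two lazy, whereas `on_include_all_or_nothing(200, 2)` resolves all three. On a high-latency file system each avoided resolution is one avoided round trip.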
Jay Foad 68895098d1 [AMDGPU] Preserve src2_modifiers in convertToThreeAddress
Found by code inspection. I don't think it makes a difference with
current codegen, because if any source modifiers were present we
would have selected mad/fma instead of mac/fmac in the first place.

Differential Revision: https://reviews.llvm.org/D120709
2022-03-01 14:48:25 +00:00
Nikita Popov a968bee093 [InstCombine] Add more tests for phi to cond fold (NFC)
These have more than two predecessors.
2022-03-01 15:47:55 +01:00
Sebastian Neubauer c74f54f2f4 [UpdateTestChecks] Add requires asserts to tests
The tests use debug-only, which is only available in debug mode.
This should fix the failing tests.
2022-03-01 15:28:42 +01:00
Florian Hahn 470b5c7f0d [LV] Add test with multiple use of a FOR chained together.
Additional test coverage for D118642.
2022-03-01 14:18:23 +00:00
Nikita Popov 26748bb15a [InstCombine] Slightly relax one-use check in abs canonicalization
Treat the icmp and sub symmetrically, and require that one of them
has one use, not the icmp in particular. This could be further
relaxed in the abs (but not nabs) case to not check one-use at
all.
2022-03-01 15:06:41 +01:00
Florian Hahn d2c8aa0bf4 [AArch64] Pass Reg instead of MI to tryToFindRenameRegister (NFC).
FirstMI is only used to get the load/store operand and the machine
function. Pass the MF and register explicitly, so the helper can be used
to find rename registers for other instructions in the future.
2022-03-01 14:02:02 +00:00
Nikita Popov 7c080e4649 [LoopVectorize] Regenerate test checks (NFC) 2022-03-01 15:01:14 +01:00
Sanjay Patel 84812b9b07 [InstCombine] drop FMF in select->copysign transform
It is not correct to propagate flags from the select
to the new instructions:
https://alive2.llvm.org/ce/z/tNATrd
https://alive2.llvm.org/ce/z/VwcVzn

Fixes #54077
2022-03-01 08:51:41 -05:00
Sanjay Patel 53dbedcd18 [InstCombine] add test for copysign with FMF propagation; NFC
This is a miscompile as noted in #54077.
2022-03-01 08:51:40 -05:00
Nikita Popov c2428a4fad [InstCombine] Remove SPF min/max check from select demanded bits (NFCI)
This should no longer be necessary now that we canonicalize to
intrinsics. This may not be entirely NFC in practice if worklist
order gets inverted and we perform demanded bits simplification
of a select user before the select is canonicalized.
2022-03-01 14:50:37 +01:00
Louis Dionne 3ee0cec88e [runtimes] Remove FOO_TARGET_TRIPLE, FOO_SYSROOT and FOO_GCC_TOOLCHAIN
Instead, folks can use the equivalent variables provided by CMake
to set those. This removal aims to reduce complexity and potential
for confusion when setting the target triple for building the runtimes,
and make it correct when `CMAKE_OSX_ARCHITECTURES` is used (right now
both `-arch` and `--target=` will end up being passed, which is downright
incorrect).

Differential Revision: https://reviews.llvm.org/D112155
2022-03-01 08:39:42 -05:00
Louis Dionne 368faacac7 [libc++] Revert "Protect users from relying on detail headers" & related changes
This commit reverts 5aaefa51 (and also partly 7f285f48e7 and b6d75682f9,
which were related to the original commit). As landed, 5aaefa51 had
unintended consequences on some downstream bots and didn't have proper
coverage upstream due to a few subtle things. Implementing this is
something we should do in libc++, however we'll first need to address
a few issues listed in https://reviews.llvm.org/D106124#3349710.

Differential Revision: https://reviews.llvm.org/D120683
2022-03-01 08:20:24 -05:00
Alex Zinenko 5c73db24df [mlir] disallow side-effecting ops in llvm.mlir.global
The llvm.mlir.global operation accepts a region as initializer. This region
corresponds to an LLVM IR constant expression and therefore should not accept
operations with side effects. Add a corresponding verifier.

Reviewed By: wsmoses, bondhugula

Differential Revision: https://reviews.llvm.org/D120632
2022-03-01 14:16:09 +01:00
gbtozers b3f1480204 [Dexter] Optimize breakpoint deletion in Visual Studio
Breakpoint deletion in visual studio is currently implemented by
iterating over the breakpoints we want to delete, for each of which we
iterate over the complete set of breakpoints in the debugger instance
until we find the one we wish to delete. Ideally we would resolve this
by directly deleting each breakpoint by some ID rather than searching
through the full breakpoint list for them, but in the absence of such a
feature in VS we can instead invert the loop to improve performance.

This patch changes breakpoint deletion to iterate over the complete list
of breakpoints, deleting breakpoints that match the breakpoints we
expect to delete by checking set membership. This represents a
worst-case improvement from O(nm) to O(n), for 'm' breakpoints being
deleted out of 'n' total. In practise this is almost exactly 'm'-times
faster, as when we delete multiple breakpoints they are typically
adjacent in the full breakpoint list.

Differential Revision: https://reviews.llvm.org/D120658
2022-03-01 13:13:38 +00:00
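
The loop inversion described in the Dexter commit above can be sketched in plain Python. This is a minimal illustration and not Dexter's code: `Breakpoint`, `delete_nested`, and `delete_inverted` are hypothetical names, and a list stands in for the debugger's breakpoint collection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Breakpoint:
    # Hypothetical stand-in for a debugger breakpoint; frozen so that
    # instances are hashable and can be stored in a set.
    file: str
    line: int


def delete_nested(all_bps, to_delete):
    """Original shape: for each breakpoint to delete, scan the full list."""
    remaining = list(all_bps)
    for target in to_delete:
        for i, bp in enumerate(remaining):
            if bp == target:
                del remaining[i]
                break
    return remaining


def delete_inverted(all_bps, to_delete):
    """Inverted shape: one pass over the full list with set-membership checks."""
    doomed = set(to_delete)
    return [bp for bp in all_bps if bp not in doomed]
```

The nested version performs up to n comparisons for each of the m deletions, while the inverted version performs one constant-time set lookup per existing breakpoint. Both return the same result when the breakpoints in the list are unique.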
Sockke ba54ebeb5e [clang-tidy] Fix readability-const-return-type for pure virtual function.
It cannot match a `pure virtual function`. This patch fixes this behavior.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D116439
2022-03-01 20:55:28 +08:00
Jeremy Morse ab49dce01f [DebugInfo][InstrRef][NFC] Use unique_ptr instead of raw pointers
InstrRefBasedLDV allocates some big tables of ValueIDNum, to store live-in
and live-out block values in, that then get passed around as pointers
everywhere. This patch wraps the allocation in a std::unique_ptr, names
some types based on unique_ptr, and passes references to those around
instead. There's no functional change, but it makes it clearer to the
reader that references to these tables are borrowed rather than owned, and
we get some extra validity assertions too.

Differential Revision: https://reviews.llvm.org/D118774
2022-03-01 12:49:50 +00:00
Nathan Sidwell 75db1795e4 [demangler] Add co_await demangling
The demangler doesn't understand 'aw' as an operator name. This adds
the necessary smarts -- you may use this as an operator function name,
but not as an expression operator.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D120143
2022-03-01 04:49:19 -08:00
Florian Hahn 45c969defa [AArch64] Remove unused argument from tryToFindRegisterToRename (NFC).
The MI argument is not used by the function. Remove it.
2022-03-01 12:47:37 +00:00
Nathan Sidwell 7f89fa32e8 [demangler][NFC] Tabularize operator name parsing
We need to parse operator names in 3 places -- expressions, names &
fold expressions.  Currently we have 3 separate pieces to do this, and a FIXME.

The operator name and expression parsing are implemented as
handwritten two-character nested switches, the fold expression is a
sequence of string comparisons.

This adds a new OperatorInfo class to encode the operator info
(encoding, kind, name), and has a table that it can binary search.
From that each of the above 3 uses are altered to use the new scheme.

Existing tests cover parsing operator encodings.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D119467
2022-03-01 04:44:56 -08:00
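
The table-driven scheme described in the demangler commit above replaces nested switches with a sorted table and a binary search. The Python sketch below shows the idea only; it is not the demangler's C++ code. The `OperatorInfo` fields follow the commit text (encoding, kind, name), but the `kind` labels, the handful of table entries, and the function `lookup_operator` are assumptions made for illustration. The two-character encodings themselves are the standard Itanium C++ ABI ones.

```python
from bisect import bisect_left
from typing import NamedTuple, Optional


class OperatorInfo(NamedTuple):
    enc: str   # two-character mangled encoding
    kind: str  # illustrative category label (assumption)
    name: str  # spelled-out operator name


# The table must stay sorted by encoding for the binary search to work.
OPS = [
    OperatorInfo("aw", "name-only", "operator co_await"),
    OperatorInfo("cl", "call", "operator()"),
    OperatorInfo("eq", "binary", "operator=="),
    OperatorInfo("mi", "binary", "operator-"),
    OperatorInfo("ml", "binary", "operator*"),
    OperatorInfo("nw", "new", "operator new"),
    OperatorInfo("pl", "binary", "operator+"),
]
ENCODINGS = [op.enc for op in OPS]
assert ENCODINGS == sorted(ENCODINGS)


def lookup_operator(mangled: str, pos: int = 0) -> Optional[OperatorInfo]:
    """Binary-search the table for the two characters at mangled[pos:]."""
    enc = mangled[pos:pos + 2]
    if len(enc) < 2:
        return None
    i = bisect_left(ENCODINGS, enc)
    if i < len(OPS) and OPS[i].enc == enc:
        return OPS[i]
    return None
```

A single lookup function can then serve all three call sites (expressions, names, fold expressions), with each caller deciding from `kind` whether the operator is acceptable in its context.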