Commit graph

421432 commits

Author SHA1 Message Date
Nikita Popov dbe6d85b8b [PPCGCodeGeneration] Look for function instead of function pointer type
What this code is actually interested in are references to functions.
Use of a function pointer type is being used as an imprecise proxy
for that.
2022-04-19 17:59:34 +02:00
Nikita Popov 880014b593 [PPCGCodeGeneration] Avoid another pointer element type access
Use an API that returns both the address and the element type,
and use that for the load type.
2022-04-19 17:26:33 +02:00
David Green cc03414125 [PerfectShuffle] Remove unused variables from D123386. NFC 2022-04-19 16:22:04 +01:00
Florian Hahn 4026b718b8 [VPlan] Remove unused SCEV forward declaration (NFC). 2022-04-19 17:16:17 +02:00
Nikita Popov ee6bd28f23 [PPCGCodeGeneration] Avoid pointer element type access
Pass through the ArrayTy instead.
2022-04-19 17:09:34 +02:00
Kirill Stoimenov 64c929ec09 [ASan] Fixed a reporting bug in (load|store)N functions which would print unknown-crash instead of the proper error message when the data access is unaligned.
Reviewed By: kda, eugenis

Differential Revision: https://reviews.llvm.org/D123643
2022-04-19 15:07:17 +00:00
Jonas Paulsson 4aa5dc15f0 [SystemZ] Handle SystemZ specific inline assembly address operands.
Handle ZQ, ZR, ZS and ZT inline assembly operand constraints.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D110267
2022-04-19 16:55:45 +02:00
Tom Ritter 82f3ed9904 [analyzer] Expose Taint.h to plugins
Reviewed By: NoQ, xazax.hun, steakhal

Differential Revision: https://reviews.llvm.org/D123155
2022-04-19 16:55:01 +02:00
gbreynoo 42865819b2 [llvm-ar][test] Rename two tests and use correct thin command
Two tests used the term "full archive" rather than "regular", these have
been updated including the test names. They now also use --thin rather
than the deprecated T. This change was made in preparation of D123142.

Differential Revision: https://reviews.llvm.org/D123778
2022-04-19 15:13:37 +01:00
Qiongsi Wu 2512a875cc [clang] Adding Platform/Architecture Specific Resource Header Installation Targets
The goal of this patch is to improve distribution build's flexibility to include only applicable header files.

Currently, the clang-resource-headers target contains nearly all the files in clang/lib/Headers. Most of these files are platform specific (e.g. immintrin.h is x86 specific). A distribution build will have to either include all the headers for all the platforms, or not include any headers. For example, if a distribution build for powerpc includes the clang-resource-headers target, it will include all the x86 specific headers, even though the x86 specific headers cannot be used.

This patch breaks up the clang-resource-headers list to a core list and platform specific lists. With the patch, a distribution build can now include the ppc-resource-headers to include the headers applicable to the powerpc platform.

Specifically, one can now have

cmake ... LLVM_DISTRIBUTION_COMPONENTS="clang;ppc-resource-headers" ... ../llvm
ninja install-distribution then installs the powerpc headers.

Similarly, one can do

cmake ... LLVM_DISTRIBUTION_COMPONENTS="clang;x86-resource-headers" ... ../llvm
to include headers applicable to the x86 platform in a distribution installation.

To implement this behaviour, the patch does two things:
* It breaks up the long header file list into a core list and platform specific lists.
* It adds numerous platform specific installation targets.

Differential Revision: https://reviews.llvm.org/D123498
2022-04-19 10:10:07 -04:00
David Spickett 218b5c8394 [clang][AArch64] Remove BTI after setjmp from release notes
This is now going into 14.0.2 as
571c7d8f6d so will not be
new in clang-15.
2022-04-19 13:49:55 +00:00
David Green 73dc996428 [AArch64] Add lane moves to PerfectShuffle tables
This teaches the perfect shuffle tables about lane inserts, that can
help reduce the cost of many entries. Many of the shuffle masks are
one-away from being correct, and a simple lane move can be a lot simpler
than trying to use ext/zip/etc. Because they are not exactly like the
other masks handled in the perfect shuffle tables, they require special
casing to generate them, with a special InsOp Operator.

The lane to insert into is encoded as the RHSID, and the move from is
grabbed from the original mask. This helps reduce the maximum perfect
shuffle entry cost to 3, with many more shuffles being generatable in a
single instruction.

Differential Revision: https://reviews.llvm.org/D123386
2022-04-19 14:49:50 +01:00
Alexey Bataev 7adfa31bc6 [SLP][NFC]Add a test for reducing same values, NFC. 2022-04-19 06:48:21 -07:00
Alexey Bataev 883571928c Revert "[SLP]Improve reductions analysis and emission, part 1."
This reverts commit 0e1f4d4d3c to fix
a crash reported in PR54976
2022-04-19 06:17:03 -07:00
Kirill Bobyrev bdf0b757d5 [clangd] IncludeCleaner: Add filtering mechanism
This introduces filtering out inclusions based on the resolved path. This
mechanism will be important for disabling warnings for headers that we can not
diagnose correctly yet.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D123488
2022-04-19 14:56:27 +02:00
Joseph Huber 0f8b8d79af [OpenMP][Docs] Remove old 14.0 release information
Summary:
This patch removes the OpenMP sections in the release notes. These will
be filled once the release is close and implementations are finalized.
2022-04-19 08:45:51 -04:00
Joseph Huber 944b25aee3 [OpenMP] Make Xopenmp-target args compile-only to silence warnings
Summary:
Previously we needed the `Xopenmp-target=` option during the linking
phase so the old offloading driver knew which items to extract and link
for the device. Now that the new driver has become the default this is
no longer necessary and will cause a warning to be emitted for the
unused argument. This should be silenced to avoid noise.
2022-04-19 08:42:43 -04:00
Arnab Dutta 12f55cac69 [MLIR][GPU] Add canonicalizer for gpu.memcpy
Fold away gpu.memcpy op when only uses of dest are
the memcpy op in question, its allocation and deallocation
ops.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D121279
2022-04-19 17:54:00 +05:30
David Green cc9495f679 [AArch64] Only mark cost 1 perfect shuffles as legal
The perfect shuffle tables encode a cost of either 0 (a nop-copy) or 1
(a single instruction) with a cost encoding of 0 in the upper 2 bits.
All perfect shuffles with any cost are then marked as legal shuffles
though (the maximum encoded cost is 3), which can confuse the DAG
combiner into thinking the shuffles are cheaper than they should be.

Limiting legal shuffles to single instructions seems to do better in
most cases, producing fewer instructions for complex shuffles. There are
some cases that now become tbl, which may be better or worse depending
on whether the instruction is in a loop and the tbl load can be hoisted
out.

Differential Revision: https://reviews.llvm.org/D123377
2022-04-19 12:58:55 +01:00
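The legality check described in cc9495f679 can be sketched roughly as follows. This is an illustration only, not the AArch64 backend code: the 32-bit entry width and the names `ENTRY_BITS`, `encoded_cost` and `is_legal_shuffle` are assumptions made for the example. The only facts taken from the commit message are that the cost lives in the upper 2 bits of a table entry and that an encoded value of 0 covers costs 0 and 1.

```python
# Assumption: table entries are 32 bits wide; the commit message only says
# that the cost is stored in the upper 2 bits.
ENTRY_BITS = 32


def encoded_cost(entry):
    """Extract the 2-bit cost field from the top of a table entry."""
    return (entry >> (ENTRY_BITS - 2)) & 0b11


def is_legal_shuffle(entry):
    """Treat a shuffle as legal only when its encoded cost is 0.

    Per the commit message, an encoded cost of 0 stands for a nop-copy or a
    single instruction; encoded values 1 to 3 are more expensive sequences.
    """
    return encoded_cost(entry) == 0
```

With this check, an entry whose top two bits are set (encoded cost 3) is no longer reported as legal, which is the behaviour change the commit describes.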
Roy Jacobson 76410040b9 Revert "[Concepts] Fix overload resolution bug with constrained candidates"
This reverts commit 454d1df942.
2022-04-19 07:51:21 -04:00
Florian Hahn a65f2730d2 [VPlan] Expand induction step in VPlan pre-header.
This patch moves SCEV expansion of steps used by
VPWidenIntOrFpInductionRecipes to the pre-header using
VPExpandSCEVRecipe. This ensures that those steps are expanded while the
CFG is in a valid state. Previously, SCEV expansion may happen during
vector body code-generation, during which the CFG may be invalid,
causing issues with SCEV expansion.

Depends on D122095.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D122096
2022-04-19 13:06:39 +02:00
David Green 50af82701c [AArch64] Cost all perfect shuffles entries as cost 1
A brief introduction to perfect shuffles - AArch64 NEON has a number of
shuffle operations - dups, zips, exts, movs etc that can in some way
shuffle around the lanes of a vector. Given a shuffle of size 4 with 2
inputs, some shuffle masks can be easily codegen'd to a single
instruction. A <0,0,1,1> mask for example is a zip LHS, LHS. This is
great, but some masks are not so simple, like a <0,0,1,2>. It turns out
we can generate that from zip LHS, <0,2,0,2>, having generated
<0,2,0,2> from uzp LHS, LHS, producing the result in 2 instructions.

It is not obvious from a given mask how to get there though. So we have
a simple program (PerfectShuffle.cpp in the util folder) that can scan
through all combinations of 4-element vectors and generate the perfect
combination of results needed for each shuffle mask (for some definition
of perfect). This is run offline to generate a table that is queried for
generating shuffle instructions. (Because the table could get quite big,
it is limited to 4 element vectors).

In the perfect shuffle tables zip, uzp and trn shuffles were being cost
as 2, which is higher than needed and skews the perfect shuffle tables
to create inefficient combinations. This sets them to 1 and regenerates
the tables. The codegen will usually be better and the costs should be
more precise (but it can get less second-order re-use of values from
multiple shuffles; these cases should be fixed up in subsequent patches).

Differential Revision: https://reviews.llvm.org/D123379
2022-04-19 12:05:05 +01:00
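The two-instruction example from 50af82701c can be checked with a small model of the shuffles. This is a sketch under assumptions: `zip1` and `uzp1` are hypothetical Python helpers that model the NEON ZIP1 and UZP1 lane semantics on 4-element lists, and the lane values 0..3 simply label the lanes of LHS.

```python
def zip1(a, b):
    """Interleave the low halves of a and b: a0, b0, a1, b1, ..."""
    half = len(a) // 2
    out = []
    for i in range(half):
        out += [a[i], b[i]]
    return out


def uzp1(a, b):
    """Concatenate the even-numbered lanes of a, then those of b."""
    return a[0::2] + b[0::2]


LHS = [0, 1, 2, 3]           # each lane holds its own index

single = zip1(LHS, LHS)      # the <0,0,1,1> mask, one instruction
helper = uzp1(LHS, LHS)      # the intermediate <0,2,0,2> value
double = zip1(LHS, helper)   # the <0,0,1,2> mask, two instructions
```

Here `single` is `[0, 0, 1, 1]`, `helper` is `[0, 2, 0, 2]` and `double` is `[0, 0, 1, 2]`, matching the masks named in the commit message.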
Alban Bridonneau 8daffd1dfb Fix SLP score for out of order contiguous loads
SLP uses the distance between pointers to optimize
the getShallowScore. However the current code misses
the case where we are trying to vectorize for VF=4, and the distance
between pointers is 2. In that case the returned score
reflects the case of contiguous loads, when it's not actually
contiguous.

The attached unit tests have 5 loads, where the program order
is not the same as the offset order in the GEPs. So, the choice
of which 4 loads to bundle together matters. If we pick the
first 4, then we can vectorize with VF=4. If we pick the
last 4, then we can only vectorize with VF=2.

This patch makes a more conservative choice, to consider
all distances>1 to not be a case of contiguous load, and
give those cases a lower score.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D123516
2022-04-19 11:58:01 +01:00
Dmitry Preobrazhensky e01dbabdd1 [AMDGPU][MC] Corrected error message "image data size does not match dmask and tfe"
Differential Revision: https://reviews.llvm.org/D123929
2022-04-19 13:52:58 +03:00
Balazs Benics 7984189826 [analyzer] Remove HasAlphaDocumentation tablegen enum value
D121387 simplified the doc url generation process, so we no longer need
the HasAlphaDocumentation enum entry. This patch removes that.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D121459
2022-04-19 12:14:27 +02:00
Balazs Benics 744e2a3e22 [analyzer] ClangSA should tablegen doc urls referring to the main doc page
AFAIK we should prefer
https://clang.llvm.org/docs/analyzer/checkers.html to
https://clang-analyzer.llvm.org/{available_checks,alpha_checks}.html

This patch will ensure that the doc urls produced by tablegen for the
ClangSA, will use the new url. Nothing else will be changed.

Reviewed By: martong, Szelethus, ASDenysPetrov

Differential Revision: https://reviews.llvm.org/D121387
2022-04-19 12:14:27 +02:00
Balazs Benics 63c4ca9d14 [analyzer] Turn missing tablegen doc entry of a checker into fatal error
It turns out all checkers explicitly mention the `Documentation<>`.
It makes sense to demand this, so emit a fatal tablegen error if the
entry is missing.

Reviewed By: martong, Szelethus

Differential Revision: https://reviews.llvm.org/D122244
2022-04-19 12:14:27 +02:00
Balazs Benics b7c988811d [analyzer][NFC] Introduce the checker package separator character
Reviewed By: martong, ASDenysPetrov

Differential Revision: https://reviews.llvm.org/D122243
2022-04-19 12:14:27 +02:00
David Spickett 68e73eaee6 [lldb] Handle empty search string in "memory find"
Given that you'd never find an empty string, just error.

Also add a test that an invalid expr generates an error.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D123793
2022-04-19 09:19:38 +00:00
Sven van Haastregt f3ee0afc67 [OpenCL] opencl-c.h: Add const to get_image_num_samples
Align with the `-fdeclare-opencl-builtins` option and other
get_image_* builtins which have the const attribute.

Differential Revision: https://reviews.llvm.org/D122728
2022-04-19 10:16:44 +01:00
Marius Brehler 2ba865903d [mlir][emitc] Add test for invalid type
Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D123503
2022-04-19 11:03:56 +02:00
Roy Jacobson 454d1df942 [Concepts] Fix overload resolution bug with constrained candidates
When doing overload resolution, we have to check that candidates' parameter types are equal before trying to find a better candidate through checking which candidate is more constrained.
This revision adds this missing check and makes us diagnose those cases as ambiguous calls when the types are not equal.

Fixes GitHub issue https://github.com/llvm/llvm-project/issues/53640

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D123182
2022-04-19 04:45:28 -04:00
Jay Foad f707e1255e [AMDGPU] Select d16 stores even when sramecc is enabled
The sramecc feature changes the behaviour of d16 loads so they do not
preserve the unused 16 bits of the result register, but it has no impact
on d16 stores, so we should make use of them even when the feature is
enabled.

Differential Revision: https://reviews.llvm.org/D104912
2022-04-19 09:34:32 +01:00
Timm Bäder 33ec653055 [clang][lexer] Allow u8 character literal prefixes in C2x
Implement N2418 for C2x.

Differential Revision: https://reviews.llvm.org/D119221
2022-04-19 09:57:51 +02:00
Nikita Popov 653de14f17 [Support] Optimize (.*) regex matches
If capturing groups are used, the regex matcher handles something
like `(.*)suffix` by first doing a maximal match of `.*`, trying to
match `suffix` afterward, and then reducing the maximal stop
position one by one until this finally succeeds. This makes the
match quadratic in the length of the line (with large constant factors).

This is particularly problematic because regexes of this form are
ubiquitous in FileCheck (something like `[[VAR:%.*]] = ...` falls
in this category), making FileCheck executions much slower than
they have any right to be.

This implements a very crude optimization that checks if suffix
starts with a fixed character, and steps back to the last occurrence
of that character, instead of stepping back by one character at a
time. This drops FileCheck time on
clang/test/CodeGen/RISCV/rvv-intrinsics/vloxseg_mask.c from
7.3 seconds to 2.7 seconds.

An obvious further improvement would be to check more than one
character (once again, this is particularly relevant for FileCheck,
because the next character is usually a space, which happens to
have many occurrences).

This should help with https://github.com/llvm/llvm-project/issues/54821.
2022-04-19 09:55:21 +02:00
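The idea behind 653de14f17 can be illustrated with a toy matcher for patterns of the form `(.*)suffix`. This is a simplified sketch, not the regex engine in LLVM's Support library: `match_naive` and `match_skip` are hypothetical names, the suffix is assumed to be a non-empty literal string, and the attempt counter only counts candidate stop positions rather than modelling the engine's real cost.

```python
def match_naive(line, suffix):
    """Greedy (.*)suffix: start at the maximal stop and step back by one."""
    attempts = 0
    for stop in range(len(line), -1, -1):
        attempts += 1
        if line.startswith(suffix, stop):
            return line[:stop], attempts
    return None, attempts


def match_skip(line, suffix):
    """Same match, but jump back to the last occurrence of suffix[0]."""
    first = suffix[0]  # assumes a non-empty literal suffix
    attempts = 0
    end = len(line)
    while True:
        stop = line.rfind(first, 0, end)  # last index < end holding `first`
        if stop == -1:
            return None, attempts
        attempts += 1
        if line.startswith(suffix, stop):
            return line[:stop], attempts
        end = stop  # continue searching strictly before this position


line = "a=b" + "c" * 20
naive_result = match_naive(line, "=b")
skip_result = match_skip(line, "=b")
```

Both functions capture `"a"` here, but the naive version tries 23 stop positions while the skipping version tries 1. The gain shrinks when the first suffix character is common in the line, which is the limitation the commit message notes for spaces.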
Matthias Springer a3005a406e [mlir][interfaces] Fix infinite loop in insideMutuallyExclusiveRegions
This function was missing a termination condition.
2022-04-19 16:28:52 +09:00
Mehdi Amini 4e01184ad5 Apply clang-tidy fixes for performance-unnecessary-value-param in JitRunner.cpp (NFC) 2022-04-19 07:23:12 +00:00
Mehdi Amini 722a3a58e2 Apply clang-tidy fixes for performance-for-range-copy in MemRefOps.cpp (NFC) 2022-04-19 07:23:12 +00:00
Chuanqi Xu cd149dbf8e [NFC] Remove unused variable 2022-04-19 15:19:40 +08:00
Matthias Springer 0f4ba02db3 [mlir][interfaces] Add helpers for detecting recursive regions
Add helper functions to check if an op may be executed multiple times based on RegionBranchOpInterface.

Differential Revision: https://reviews.llvm.org/D123789
2022-04-19 16:13:32 +09:00
Fraser Cormack c5cac48549 [RISCV] Fix lowering of BUILD_VECTORs as VID sequences
This patch fixes a bug when lowering BUILD_VECTOR via VID sequences.
After adding support for fractional steps in D106533, elements with zero
steps may be skipped if no step has yet been computed. This allowed
certain sequences to slip through the cracks, being identified as VID
sequences when in fact they are not.

The fix for this is to perform a second loop over the BUILD_VECTOR to
validate the entire sequence once the step has been computed. This isn't
the most efficient, but on balance the code is more readable and
maintainable than doing back-validation during the first loop.

Fixes the tests introduced in D123785.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123786
2022-04-19 07:43:38 +01:00
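The bug and fix in c5cac48549 can be sketched with a simplified detector. This is not the RISC-V lowering code: `find_step` and `is_vid_sequence` are hypothetical names, and the sketch assumes integer steps and fully defined elements, whereas the real code also handles fractional steps.

```python
def find_step(elts):
    """First loop: derive a step, skipping zero differences seen before
    any step is known. On its own this accepts some invalid sequences."""
    step = None
    for prev, cur in zip(elts, elts[1:]):
        diff = cur - prev
        if step is None:
            if diff == 0:
                continue  # skipped and never re-checked: the source of the bug
            step = diff
        elif diff != step:
            return None
    return step


def is_vid_sequence(elts):
    """Second loop: validate every element against start + i * step."""
    step = find_step(elts)
    if step is None:
        return False
    start = elts[0]
    return all(e == start + i * step for i, e in enumerate(elts))
```

For `[0, 0, 1, 2]` the first loop alone reports a step of 1, because the leading zero difference is skipped. The second loop rejects the vector, since index 1 would have to hold 1 rather than 0.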
Fraser Cormack 00537946aa [RISCV] Add tests showing incorrect BUILD_VECTOR lowering
These tests both use vector constants misidentified as VID sequences.
Because the initial run of elements has a zero step, the elements are
skipped until such a step can be identified. The bug is that the skipped
elements are never validated, even though the computed step is
incompatible across the entire sequence.

A fix will follow in a subsequent patch.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123785
2022-04-19 07:00:48 +01:00
Austin Kerbow 7f97ac94f7 Revert "[AMDGPU] Omit unnecessary waitcnt before barriers"
This reverts commit 8d0c34fd4f.
2022-04-18 21:24:08 -07:00
Konstantin Varlamov bcdb11e741 [libc++][NFC] Reindent take_view in accordance with the style guide. 2022-04-18 20:54:50 -07:00
Yaxun (Sam) Liu cac4e2fe25 [CUDA][HIP] Fix gpu.used.external
Rename gpu.used.external as __clang_gpu_used_external as ptxas does not
allow '.' in global variable names.

Fixes: https://github.com/llvm/llvm-project/issues/54934

Reviewed by: Joseph Huber, Artem Belevich

Differential Revision: https://reviews.llvm.org/D123946
2022-04-18 23:10:31 -04:00
Joseph Huber 80787213ea [Libomptarget] Fix test using old unsupported lit string
Summary:
One test had an old "unsupported" string that used the old `newDriver`
string which was removed. This test should be updated to use the
`oldDriver` one instead.
2022-04-18 23:08:12 -04:00
Chuanqi Xu f9bee35689 [Pipelines] Hoist CoroEarly as a module pass
This change could reduce the number of times we call `declaresCoroEarlyIntrinsics`.
And it is helpful for future changes.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D123925
2022-04-19 11:04:24 +08:00
Michael Kruse 2d92ee97f1 Reapply "[OpenMP] Refactor OMPScheduleType enum."
This reverts commit af0285122f.

The test "libomp::loop_dispatch.c" on builder
openmp-gcc-x86_64-linux-debian fails from time-to-time.
See #54969. This patch is unrelated.
2022-04-18 21:56:47 -05:00
jacquesguan 25445b94db [RISCV] Add rvv codegen support for vp.fptrunc.
This patch adds rvv codegen support for vp.fptrunc. The lowering of fp_round and vp.fptrunc share most code so use a common lowering function to handle those two, similar to vp.trunc.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123841
2022-04-19 01:56:18 +00:00
Mehdi Amini 1881d6fc80 Apply clang-tidy fixes for performance-unnecessary-copy-initialization in MathOps.cpp (NFC) 2022-04-19 00:47:58 +00:00