Commit graph

397856 commits

Author SHA1 Message Date
Sanjay Patel c2162e4d89 [InstCombine] add tests for min/max intrinsics with not ops; NFC 2021-08-31 20:08:38 -04:00
Alexandre Ganea 7f0664f193 [LLD][COFF] Clean paths in PDB even when /pdbsourcepath is omitted
Differential Revision: https://reviews.llvm.org/D109030
2021-08-31 19:05:10 -04:00
Joseph Huber 29a74a3915 [OpenMP] Add an option to always inline OpenMP device functions.
Performance on GPU targets can be highly variable, sometimes inlining
everything hurts performance and sometimes it greatly improves it. Add
an option to toggle this behaviour to better investigate it.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D109014
2021-08-31 18:48:30 -04:00
Alex Langford 862a311301 [lldb] Tighten lock in Language::ForEach
It is easy to accidentally introduce a deadlock by having the callback
passed to Language::ForEach also attempt to acquire the same lock. It
is easy enough to disallow the callback from calling anything in
Language directly, but it may happen through a series of other
function/method calls.

The solution I am proposing is to tighten the lock in Language::ForEach
so that it is only held as we gather the currently loaded language
plugins. We store them in a vector and then iterate through them with
the callback so that the callback can't introduce a deadlock.

Differential Revision: https://reviews.llvm.org/D109013
2021-08-31 15:45:38 -07:00
wren romano b04b757a8e [mlir][sparse] Rename the public SparseTensorStorage::asCOO to toCOO
Trying to reduce confusion by having the name of the public method match that of the private method for handling the recursion.  Also adding some comments to SparseTensorStorage::fromCOO to help clarify what the recursive calls are doing in the dense case.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D108954
2021-08-31 15:44:34 -07:00
Andrew Browne befb384484 [DFSan][NFC] Fix comment formatting. 2021-08-31 15:35:08 -07:00
Vy Nguyen 3afa2151f8 [llvm-ar][nfc] Reword help message to be less ambiguous on what p and t do.
The current help msg isn't super clear on whether t prints the content of the files or just the list of files.
(I'd certainly thought it'd print the list of files, and accidentally had a bunch of "gargabe" printed to my terminal).
Similarly, t sounded like it'd do what p actually did.

Differential Revision: https://reviews.llvm.org/D109018
2021-08-31 17:48:04 -04:00
Kevin Athey 3e2bd82f02 Revert "[OptTable] Improve error message output for grouped short options"
This reverts commit 71d7fed3bc.

Reason: broke sanitizer bots
more info: https://reviews.llvm.org/D108770
2021-08-31 14:06:11 -07:00
Louis Dionne 1c9b7d0ecc [libc++][NFC] Remove redundant friend declaration for operator==
This must have been meant to be friend-declaring operator!=, but it
turns out that it's not even necessary to make it a friend since it
does not access any private state.

rdar://82568613
2021-08-31 17:02:58 -04:00
Stanislav Mekhanoshin d170945bb2 [RegAlloc] Immediately delete dead instructions with live uses
When RA eliminated a dead def it can either immediately delete
the instruction itself or replace it with KILL to defer the
actual removal. If this instruction has a virtual register use
killing the register it will shrink the LI of the use. However,
if the LI covers the instruction and extends beyond it the
shrink will not happen. In fact that is impossible to shrink
such use because of the KILL still using it.

If later the LI of the use will be split at the KILL and the
KILL itself is eliminated after that point the new live segment
ends up at an invalid slot index.

This extremely rare condition was hit after D106408 which has
enabled rematerialization of such instructions. The replacement
with KILL is only done for rematerialized defs which became dead
and such rematerialization did not generally happen before.

The patch deletes an instruction immediately if it is a result
of rematerialization and has such use. An alternative would be
to prohibit a split at a KILL instruction, but it looks like it
is better to split a live range rather then keeping a killed
instruction just in case it can be rematerialized further.

Fixes PR51655.

Differential Revision: https://reviews.llvm.org/D108951
2021-08-31 13:46:00 -07:00
Nikita Popov 48ebe427c9 [SLPVectorizer] Make aliasing check more precise
SLPVectorizer currently uses AA::isNoAlias() to determine whether
two locations alias. This does not work if one of the instructions
is a call. Instead, we should check getModRefInfo(), which
determines whether an arbitrary instruction modifies or references
a given location.

Among other things, this prevents @llvm.experimental.noalias.scope.decl()
and other inaccessiblmemonly intrinsics from interfering with SLP
vectorization.

Differential Revision: https://reviews.llvm.org/D109012
2021-08-31 22:35:30 +02:00
wlei 964053d56f [llvm-profgen] Support LBR only perf script
This change aims at supporting LBR only sample perf script which is used for regular(Non-CS) profile generation.  A LBR perf script includes a batch of LBR sample which starts with a frame pointer and a group of 32 LBR entries is followed. The FROM/TO LBR pair and the range between two consecutive entries (the former entry's TO and the latter entry's FROM) will be used to infer function profile info.

An example of LBR perf script(created by `perf script -F ip,brstack -i perf.data`)
```
           40062f 0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1 ...
           4005d7 0x4005d7/0x4005e5/P/-/-/8  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1 ...
           ...
```

For implementation:
 - Extended a new child class `LBRPerfReader` for the sample parsing, reused all the functionalities in `extractLBRStack` except for an extension to parsing leading instruction pointer.
 - `HybridSample` is reused(just leave the call stack empty) and the parsed samples is still aggregated in `AggregatedSamples`. After that, range samples, branch sample, address samples are computed and recorded.
 - Reused `ContextSampleCounterMap` to store the raw profile, since it's no need to aggregation by context, here it just registered one sample counter with a fake context key.
 - Unified to use `show-raw-profile` instead of `show-unwinder-output` to dump the intermediate raw profile, see the comments of the format of the raw profile. For CS profile, it remains to output the unwinder output.

Profile generation part will come soon.

Differential Revision: https://reviews.llvm.org/D108153
2021-08-31 13:28:17 -07:00
peter klausler 9ab1efc77a [flang] Fold UNPACK and TRANSPOSE
Implement constant folding for the transformational intrinsic
functions UNPACK and TRANSPOSE.

Differential Revision: https://reviews.llvm.org/D109010
2021-08-31 13:25:35 -07:00
Joel E. Denny ec1ebcd302 [OpenMP][OpenACC] Implement ompx_hold map type modifier extension in runtime (2/2)
This patch implements OpenMP runtime support for an original OpenMP
extension we have developed to support OpenACC: the `ompx_hold` map
type modifier.  The previous patch in this series, D106509, implements
Clang support and documents the new functionality in detail.

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D106510
2021-08-31 16:13:49 -04:00
Joel E. Denny 83ddfa0d22 [OpenMP][OpenACC] Implement ompx_hold map type modifier extension in Clang (1/2)
This patch implements Clang support for an original OpenMP extension
we have developed to support OpenACC: the `ompx_hold` map type
modifier.  The next patch in this series, D106510, implements OpenMP
runtime support.

Consider the following example:

```
 #pragma omp target data map(ompx_hold, tofrom: x) // holds onto mapping of x
 {
   foo(); // might have map(delete: x)
   #pragma omp target map(present, alloc: x) // x is guaranteed to be present
   printf("%d\n", x);
 }
```

The `ompx_hold` map type modifier above specifies that the `target
data` directive holds onto the mapping for `x` throughout the
associated region regardless of any `target exit data` directives
executed during the call to `foo`.  Thus, the presence assertion for
`x` at the enclosed `target` construct cannot fail.  (As usual, the
standard OpenMP reference count for `x` must also reach zero before
the data is unmapped.)

Justification for inclusion in Clang and LLVM's OpenMP runtime:

* The `ompx_hold` modifier supports OpenACC functionality (structured
  reference count) that cannot be achieved in standard OpenMP, as of
  5.1.
* The runtime implementation for `ompx_hold` (next patch) will thus be
  used by Flang's OpenACC support.
* The Clang implementation for `ompx_hold` (this patch) as well as the
  runtime implementation are required for the Clang OpenACC support
  being developed as part of the ECP Clacc project, which translates
  OpenACC to OpenMP at the directive AST level.  These patches are the
  first step in upstreaming OpenACC functionality from Clacc.
* The Clang implementation for `ompx_hold` is also used by the tests
  in the runtime implementation.  That syntactic support makes the
  tests more readable than low-level runtime calls can.  Moreover,
  upstream Flang and Clang do not yet support OpenACC syntax
  sufficiently for writing the tests.
* More generally, the Clang implementation enables a clean separation
  of concerns between OpenACC and OpenMP development in LLVM.  That
  is, LLVM's OpenMP developers can discuss, modify, and debug LLVM's
  extended OpenMP implementation and test suite without directly
  considering OpenACC's language and execution model, which can be
  handled by LLVM's OpenACC developers.
* OpenMP users might find the `ompx_hold` modifier useful, as in the
  above example.

See new documentation introduced by this patch in `openmp/docs` for
more detail on the functionality of this extension and its
relationship with OpenACC.  For example, it explains how the runtime
must support two reference counts, as specified by OpenACC.

Clang recognizes `ompx_hold` unless `-fno-openmp-extensions`, a new
command-line option introduced by this patch, is specified.

Reviewed By: ABataev, jdoerfert, protze.joachim, grokos

Differential Revision: https://reviews.llvm.org/D106509
2021-08-31 16:13:49 -04:00
Nikita Popov dc37f5374c [LoadStoreVectorizer] Add test for inaccessiblememonly call (NFC) 2021-08-31 22:12:45 +02:00
Fangrui Song f9277caffc [ELF][test] Fix R_AARCH64_ADR_PREL_PG_HI21 typo
Found by redfast00
2021-08-31 13:09:55 -07:00
Louis Dionne 928cad59c7 [libc++][NFC] Rename _LIBCPP_NODISCARD_ATTRIBUTE to _LIBCPP_NODISCARD
Differential Revision: https://reviews.llvm.org/D108940
2021-08-31 16:06:31 -04:00
Louis Dionne e781e03e40 [libc++] Remove workaround for broken __is_trivially_copyable on old GCC
All supported versions of GCC now do the right thing.

Differential Revision: https://reviews.llvm.org/D108997
2021-08-31 16:05:29 -04:00
Sylvestre Ledru c28473fe4a Fix some typos in the llvm docs 2021-08-31 21:31:20 +02:00
Michał Górny 84f99ef2b1 [lldb] [test] Mark fork-follow-parent-softbp.test as darwin-unsupported 2021-08-31 21:19:27 +02:00
Mehdi Amini c7515a49b1 Fix MLIR python binding test after changes in ASM printer 2021-08-31 18:50:22 +00:00
Chih-Ping Chen 1d36988394 Moved the test to X86 as it's x86 specific. 2021-08-31 14:48:29 -04:00
Arthur O'Dwyer c5e7981aec [libc++] Add missing space in (__map_value_compare&__y) etc. NFCI. 2021-08-31 14:30:20 -04:00
peter klausler b4c86525fd [flang] Downgrade inappropriate error message to a warning
It may not be great practice to pass a procedure (or procedure pointer)
with an implicit interface as an actual argument to correspond with
a dummy procedure (pointer), but it's not an error.  Change to a
warning, and modify tests accordingly.

Differential Revision: https://reviews.llvm.org/D108932
2021-08-31 11:28:42 -07:00
Nick Desaulniers e9b3f25730 [RISCVISelLowering] avoid emitting libcalls to __mulodi4() and __multi3()
Similar to D108842, D108844, D108926, D108928, and D108936.

__has_builtin(builtin_mul_overflow) returns true for 32b RISCV targets,
but Clang is deferring to compiler RT when encountering long long types.

If the semantics of __has_builtin mean "the compiler resolves these,
always" then we shouldn't conditionally emit a libcall.

Link: https://bugs.llvm.org/show_bug.cgi?id=28629

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D108939
2021-08-31 11:23:56 -07:00
Nikita Popov bf8b69bb3a [SLPVectorizer] Add test for inaccessiblememonly call (NFC) 2021-08-31 20:23:26 +02:00
Fangrui Song a26b09cb98 [CMake] Remove unneeded -Wdelete-non-virtual-dtor availability check
Available and good in Clang 3.5/GCC 5.
2021-08-31 11:19:04 -07:00
MaheshRavishankar b686fdbf92 [mlir][Linalg] Drop output tensor from linalg.pad_tensor op.
The output tensor was added for tiling purposes. With use of
`TilingInterface` for tiling pad operations, there is no need for an
explicit operand for the shape of result of `linalg.pad_tensor`
op. The interface allows the tiling pattern to query the value that
can be used for the "init" needed for tiling dynamically.

Differential Revision: https://reviews.llvm.org/D108613
2021-08-31 11:12:24 -07:00
Nick Desaulniers d8b6ae072d [PPCISelLowering] avoid emitting libcalls to __mulodi4()
Similar to D108842, D108844, and D108926.

__has_builtin(builtin_mul_overflow) returns true for 32b PPC targets,
but Clang is deferring to compiler RT when encountering long long types.
This breaks ppc44x_defconfig + CONFIG_BLK_DEV_NBD=y builds of the Linux
kernel that are using builtin_mul_overflow with these types for these
targets.

If the semantics of __has_builtin mean "the compiler resolves these,
always" then we shouldn't conditionally emit a libcall.

This will still need to be worked around in the Linux kernel in order to
continue to support these builds of the Linux kernel for this
target with older releases of clang.

Link: https://bugs.llvm.org/show_bug.cgi?id=28629
Link: https://github.com/ClangBuiltLinux/linux/issues/1438

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D108936
2021-08-31 11:09:58 -07:00
Philip Reames c49503a76d [SCEV] Add a testcase for zero max btc with non-constant exact btc
Reduced from the ArchiveCommandLine.ll case seen in D108848.
2021-08-31 11:00:41 -07:00
Fangrui Song 4bb5f44c70 [CMake] Remove unneeded -Wnon-virtual-dtor availability check
For Clang, 3.5 is the minimum requirement which has fixed the bug.
GCC 5 is good as well.
2021-08-31 11:00:16 -07:00
Joe Loser 167b2dbde4
[libcxx][docs] Mark LWG3153 as complete
Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D108967
2021-08-31 13:54:10 -04:00
Mehdi Amini 387f95541b Add a new interface allowing to set a default dialect to be used for printing/parsing regions
Currently the builtin dialect is the default namespace used for parsing
and printing. As such module and func don't need to be prefixed.
In the case of some dialects that defines new regions for their own
purpose (like SpirV modules for example), it can be beneficial to
change the default dialect in order to improve readability.

Differential Revision: https://reviews.llvm.org/D107236
2021-08-31 17:52:40 +00:00
Mehdi Amini c41b16c26b Change ASM Op printer to print the operation name in the framework instead of leaving it up to each individual operation
This aligns the printer with the parser contract: the operation isn't part of the user-controllable part of the syntax.

Differential Revision: https://reviews.llvm.org/D108804
2021-08-31 17:52:40 +00:00
Mehdi Amini fd87963eee Change dialect printOperation() hook to getOperationPrinter()
This makes the hook return a printer if available, instead of using LogicalResult  to
indicate if a printer was available (and invoked). This allows the caller to detect that
the dialect has a printer for a given operation without actually invoking the printer.
It'll be leveraged in a future revision to move printing the op name itself under control
of the ASMPrinter.

Differential Revision: https://reviews.llvm.org/D108803
2021-08-31 17:52:39 +00:00
peter klausler 6726a3d858 [flang] Fold PACK()
Implement compile-time constant folding for the transformational
intrinsic function PACK.

Differential Revision: https://reviews.llvm.org/D108956
2021-08-31 10:51:56 -07:00
Vedant Kumar 6c439a3817 [profile] Specify "-V" to otool to get expected test output
Newer Xcode toolchains ship a new otool implementation that prints out
section contents in a slightly different way than otool-classic. Specify
"-V" to otool to get the expected test output.

Differential Revision: https://reviews.llvm.org/D108929
2021-08-31 10:49:51 -07:00
Joe Nash c96839265a [AMDGPU] Enable ds_min/ds_max on more subtargets
Adds patterns for f64 ds_min/ds_max. Shrinks HasLDSFPAtomics
scope to enable f32.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D108994

Change-Id: Id890b677841ee588b20d42b1bb3f4cdbf6e9ba1a
2021-08-31 13:22:31 -04:00
Jessica Paquette 94d3ff09cf [GlobalISel] Don't use G_FPTOSI in G_ISNAN legalization
As noted in the comments in D108227, using G_FPTOSI produces wrong results for
G_ISNAN. Drop the G_FPTOSI and perform the operation on integer types.

Elsewhere in LLVM, a bitcast would be the appropriate choice (as it is in SDAG).
GlobalISel does not distinguish between integer and FP types, so a bitcast would
be meaningless here.
2021-08-31 10:26:42 -07:00
David Green 22c384129e [ARM] Add missing validForTailPredication for VMINNM/VMAXNM
Apparently this was missing, preventing the generation of tail
predication loops containing VMINNM, VMAXNM, VMINNMA and VMAXNMA.
2021-08-31 18:19:03 +01:00
David Green 198259becb [ARM] Test for VMINNM/VMAXNM in tail predicated loops. 2021-08-31 18:19:03 +01:00
Raphael Isemann 4f7fb13f87 [lldb] Don't save empty expressions in the multiline editor history
Right now running `expr` to start the multiline expression editor and then
pressing enter causes an empty history empty to be created for the multiline
editor. That doesn't seem very useful for users as pressing the 'up' key will
now also bring up these empty expressions.

I don't think there is ever a use case for recalling a completely empty
expression from the history, so instead don't save those entries to the history
file and make sure we never recall them when navigating over the expression
history.

Note: This is actually a Swift downstream patch that got shipped with Apple's
LLDB for many years. However, this recently started conflicting with upstream
LLDB as D100048 added a test that made sure that empty expression entries don't
crash LLDB. Apple's LLDB was never affected by this crash as it never saved
empty expressions in the first place.

Reviewed By: augusto2112

Differential Revision: https://reviews.llvm.org/D108983
2021-08-31 18:51:18 +02:00
LLVM GN Syncbot 9c37eda6e4 [gn build] Port e983a659e5 2021-08-31 16:45:24 +00:00
Mark de Wever e983a659e5 [libc++][NFC] split <charconv>.
This move the helper types `chars_format`, `to_chars_result` and
`from_chars_result` to a separate header. The first two are needed for
D70631 the third for consistency.

The header `__charconv/ryu.h` uses these types and it can't depend on the
types in `<charconv>` in a modular build. Moving them to the ryu header
would be an odd place and doesn't work since the header is included in the
middle of `<charconv>`.

Reviewed By: #libc, ldionne, Quuxplusone

Differential Revision: https://reviews.llvm.org/D108927
2021-08-31 18:45:19 +02:00
Philip Reames b604fcb7bc [runtime] Move prolog/epilog block to a post-simplify strategy
The runtime unroller will try to produce a non-loop if the unroll count is 2 and thus the prolog/epilog loop would only run at most one iteration. The old implementation did this by avoiding loop construction entirely. This patches instead constructs the trivial loop and then explicitly breaks the backedge and simplifies. This does result in some additional code churn when triggered, but a) results in better quality code and b) removes a codepath which didn't work properly for multiple exit epilogs.

One oddity that I want to draw to reviewer attention is that this somehow changes revisit order. The new order looks equivalent to me, but I don't understand how creating and erasing an extra loop here creates this effect.

Differential Revision: https://reviews.llvm.org/D108521
2021-08-31 09:29:36 -07:00
Philip Reames 9b45fd909f [AlignFromAssume] Bailout w/non-constant alignments (pr51680)
This is a bailout for pr51680.  This pass appears to assume that the alignment operand to an align tag on an assume bundle is constant.  This doesn't appear to be required anywhere, and clang happily generates non-constant alignments for cases such as this case taken from the bug report:

// clang -cc1 -triple powerpc64-- -S -O1 opal_pci-min.c
extern int a[];
long *b;
long c;
void *d(long, int *, int, long, long, long) __attribute__((__alloc_align__(6)));
void e() {
  b = d(c, a, 0, 0, 5, c);
  b[0] = 0;
}

This was exposed by a SCEV change which allowed a non-constant alignment to reach further into the pass' code.  We could generalize the pass, but for now, let's fix the crash.
2021-08-31 09:20:52 -07:00
Shilei Tian 8442967fe3 [OpenMP] Fix task wait doesn't work as expected in serialized team
As discussed in D107121, task wait doesn't work when a regular task T depends on
a detached task or a hidden helper task T' in a serialized team. The root cause is,
since the team is serialized, the last task will not be tracked by
`td_incomplete_child_tasks`. When T' is finished, it first releases its
dependences, and then decrements its parent counter. So far so good. For the thread
that is running task wait, if at the moment it is still spinning and trying to
execute tasks, it is fine because it can detect the new task and execute it.
However, if it happends to finish the function `flag.execute_tasks(...)`, it will
be broken because `td_incomplete_child_tasks` is 0 now.

In this patch, we update the rule to track children tasks a little bit. If the
task team encounters a proxy task or a hidden helper task, all following tasks
will be tracked.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D107496
2021-08-31 12:15:46 -04:00
Sanjay Patel 6c0181c00f [InstCombine] fix typos in comments; NFC 2021-08-31 12:08:36 -04:00
Yaron Keren 10d78a06ba [llvm-lit] unbreak clang-only builds by not assuming llvm-lit in build dir
Reviewed By: tstellar

Differential Revision: https://reviews.llvm.org/D109000
2021-08-31 18:57:47 +03:00