Commit graph

398144 commits

Author SHA1 Message Date
Michael Kruse 707ce34b06 [OpenMP][OpenMPIRBuilder] Implement loop unrolling.
Add methods for loop unrolling to the OpenMPIRBuilder class and use them in Clang if `-fopenmp-enable-irbuilder` is enabled. The unrolling methods are:

 * `unrollLoopFull`
 * `unrollLoopPartial`
 * `unrollLoopHeuristic`

`unrollLoopPartial` and `unrollLoopHeuristic` can use compiler heuristics to automatically determine the unroll factor. If possible, that is if no CanonicalLoopInfo is required to pass to another method, metadata for LLVM's LoopUnrollPass is added. Otherwise the unroll factor is determined using the same heurstics as user by LoopUnrollPass. Not requiring a CanonicalLoopInfo, especially with `unrollLoopHeuristic` allows greater flexibility.

With full unrolling and partial unrolling with known unroll factor, instead of duplicating instructions by the OpenMPIRBuilder, the full unroll is still delegated to the LoopUnrollPass. In case of partial unrolling the loop is first tiled using the existing `tileLoops` methods, then the inner loop fully unrolled using the same mechanism.

Reviewed By: jdoerfert, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D107764
2021-09-02 02:37:25 -05:00
Florian Hahn 3e60d216a4
[LoopDistribute] Add tests inspired by PR50296, PR50288. 2021-09-02 09:17:32 +02:00
Wenlei He c000b8bd5c [CSSPGO] Use preinliner decision by default when available
For CSSPGO, turn on `sample-profile-use-preinliner` by default. This simplifies the use of llvm-profgen preinliner as it's now simply driven by ContextShouldBeInlined flag for each context profile without needing extra compiler switch.

Note that llvm-profgen's preinliner is still off by default, under switch `csspgo-preinliner`.

Differential Revision: https://reviews.llvm.org/D109111
2021-09-01 23:45:38 -07:00
Arthur Eubanks 2413d6063b [docs] Mention that the legacy PM is deprecated and will be removed after 14
Per https://lists.llvm.org/pipermail/llvm-dev/2021-August/152305.html.

Reviewed By: MaskRay, fhahn

Differential Revision: https://reviews.llvm.org/D109080
2021-09-01 23:31:48 -07:00
Markus Lavin 304f2bd21d [NPM] Added opt option -print-pipeline-passes.
Added opt option -print-pipeline-passes to print a -passes compatible
string describing the built pass pipeline.

As an example:
$ opt -enable-new-pm=1 -adce -licm -simplifycfg -o /dev/null /dev/null -print-pipeline-passes
verify,function(adce),function(loop-mssa(licm)),function(simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-sink-common-insts>),verify,BitcodeWriterPass

At the moment this is best-effort only and there are some known
limitations:
- Not all passes accepting parameters will print their parameters
  (currently only implemented for simplifycfg).
- Some ClassName to pass-name mappings are not unique.
- Some ClassName to pass-name mappings are missing (e.g.
  BitcodeWriterPass).

Differential Revision: https://reviews.llvm.org/D108298
2021-09-02 08:23:33 +02:00
Markus Lavin 645af79e8e Revert "[NPM] Added opt option -print-pipeline-passes."
This reverts commit c71869ed4c.
2021-09-02 08:22:17 +02:00
Markus Lavin c71869ed4c [NPM] Added opt option -print-pipeline-passes.
Added opt option -print-pipeline-passes to print a -passes compatible
string describing the built pass pipeline.

As an example:
$ opt -enable-new-pm=1 -adce -licm -simplifycfg -o /dev/null /dev/null -print-pipeline-passes
verify,function(adce),function(loop-mssa(licm)),function(simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-sink-common-insts>),verify,BitcodeWriterPass

At the moment this is best-effort only and there are some known
limitations:
- Not all passes accepting parameters will print their parameters
  (currently only implemented for simplifycfg).
- Some ClassName to pass-name mappings are not unique.
- Some ClassName to pass-name mappings are missing (e.g.
  BitcodeWriterPass).
2021-09-02 08:16:51 +02:00
Mark de Wever 67794e784e [libc++][nfc] Fixes ppc64le-sanitizer build issue.
After landing D103357 the worker ppc64le-sanitizer fails on `__bool`.
This replaces all occurrences with `__boolean`.
2021-09-02 08:13:59 +02:00
Abinav Puthan Purayil 0baace5379 [DAGCombine] Add node level checks for fp-contract and fp-ninf in visitFMULForFMADistributiveCombine().
Differential Revision: https://reviews.llvm.org/D107551
2021-09-02 11:33:14 +05:30
Arthur Eubanks bc0d16969a Fix missing argument introduced by D108788
Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D109132
2021-09-01 23:00:58 -07:00
Arthur Eubanks 1c503e923a [test] Precommit/fix up existing test for MemorySSA/invariant.group 2021-09-01 22:58:17 -07:00
Ye Luo 289a1089cd [libomptarget] Move HostDataToTargetTy states into StatesTy
Use unique_ptr to achieve the effect of mutable.

Remove mutable keyword of DynRefCount and HoldRefCount
Remove std::shared_ptr from UpdateMtx

Reviewed By: tianshilei1992, grokos

Differential Revision: https://reviews.llvm.org/D109007
2021-09-01 23:36:05 -05:00
Jinsong Ji 8671191d26 [NFC][PowerPC] Small code refactor in LoopInstrFormPrep
Avoid some duplicate code.

Reviewed By: #powerpc, shchenz

Differential Revision: https://reviews.llvm.org/D109083
2021-09-02 03:16:01 +00:00
Wenlei He f10004e7dd [CSSPGO] Add stats for pre-inliner
Add some stats to help tuning pre-inliner.

Differential Revision: https://reviews.llvm.org/D109098
2021-09-01 20:03:50 -07:00
Jinsong Ji f5505a2ca6 [InstrProfiling] Add AIX triple to more tests
Following patch of https://reviews.llvm.org/D108490.
Add AIX triples to tests in /Instrumentation/InstrProfiling/ to make
sure we can catch regressions earlier.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D109089
2021-09-02 03:03:00 +00:00
Ben Shi 14500628b6 [AArch64][test] Add new tests for (mul (add x, c0), c1)
Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D108870
2021-09-02 10:39:49 +08:00
Arthur Eubanks 7b08d9da55 Reland [MemorySSA] Add pass to print results of MemorySSA walker
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D109028
2021-09-01 18:58:57 -07:00
Arthur Eubanks 0f63496ea4 Revert "[MemorySSA] Add pass to print results of MemorySSA walker"
This reverts commit 8f98477c2d.

Breaks bots
2021-09-01 18:45:19 -07:00
Chen Zheng 2596120199 [PowerPC] small code format refactor ; NFC
address the code review comments in patch https://reviews.llvm.org/D105872
2021-09-02 01:39:32 +00:00
Arthur Eubanks 8f98477c2d [MemorySSA] Add pass to print results of MemorySSA walker
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D109028
2021-09-01 18:29:15 -07:00
Richard Smith 6eda66b0a9 PR50294: Fix a performance regression from 2c9dbcd.
Per the contract of ReadLateParsedTemplates, we should not be returning
the same results multiple times. No functionality change intended, other
than to runtime.

Thanks to Luboš Luňák for identifying the cause of the regression!
2021-09-01 18:00:07 -07:00
Fangrui Song 4d5220faf9 [OpenMP] Fix -Wunused-but-set-parameter in -DLLVM_ENABLE_ASSERTIONS=off builds. NFC 2021-09-01 17:55:13 -07:00
Volodymyr Sapsai 64ebf313a7 [HeaderSearch] Use isImport only for imported headers and not for #pragma once.
There is a separate field `isPragmaOnce` and when `isImport` combines
both, it complicates HeaderFileInfo serialization as `#pragma once` is
the inherent property of the header while `isImport` reflects how other
headers use it. The usage of the header can be different in different
contexts, that's why `isImport` requires tracking separate from `#pragma once`.

Differential Revision: https://reviews.llvm.org/D104351
2021-09-01 17:07:35 -07:00
Philip Reames bb0fa3ea02 Revert "snapshot - do not push"
This reverts commit 91f4655d92.

This wasn't intented to be pushed, sorry.
2021-09-01 16:59:23 -07:00
Philip Reames c3b3aa277a Fix a missing MemorySSA update in breakLoopBackedge
This is a case I'd missed in 6a8237. The odd bit here is that missing the edge removal update seems to produce MemorySSA which verifies, but is still corrupt in a way which bothers following passes. I wasn't able to reduce a single pass test case, which is why the reported test case is taken as is.

Differential Revision: https://reviews.llvm.org/D109068
2021-09-01 16:59:01 -07:00
Philip Reames 91f4655d92 snapshot - do not push 2021-09-01 16:59:01 -07:00
Aart Bik 2754604e54 [mlir][sparse] sparse runtime support library improvements
(1) renamed SparseTensor to SparseTensorCOO, the other one remains SparseTensorStorage to focus on contrast

(2) documents difference between public API exclusively for compiler-generated code and methods that could be used by other runtimes (TBD) that want to interact with MLIR

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D109039
2021-09-01 16:51:14 -07:00
Jon Roelofs 9237eda304 Revert "[AArch64][GlobalISel] Legalize bswap <2 x i16>"
This reverts commit 5cd63e9ec2.

https://bugs.llvm.org/show_bug.cgi?id=51707

The sequence feeding in/out of the rev32/ushr isn't quite right:

 _swap:
         ldr     h0, [x0]
         ldr     h1, [x0, #2]
-        mov     v0.h[1], v1.h[0]
+        mov     v0.s[1], v1.s[0]
         rev32   v0.8b, v0.8b
         ushr    v0.2s, v0.2s, #16
-        mov     h1, v0.h[1]
+        mov     s1, v0.s[1]
         str     h0, [x0]
         str     h1, [x0, #2]
         ret
2021-09-01 16:49:20 -07:00
Jacques Pienaar f7bf8a8658 [mlir][capi] Add NameLoc
Add method to get NameLoc. Treat null child location as unknown to avoid
needing to create UnknownLoc in C API where child loc is not needed.

Differential Revision: https://reviews.llvm.org/D108678
2021-09-01 16:16:35 -07:00
Stanislav Mekhanoshin f3645c792a [AMDGPU] Use S_BITCMP1_* to replace AND in optimizeCompareInstr
Differential Revision: https://reviews.llvm.org/D109082
2021-09-01 15:59:12 -07:00
Stanislav Mekhanoshin bf77b11277 [AMDGPU] Introduce optimizeCompareInstr
The following patterns are currently handled:

s_cmp_eq_u32 (s_and_b32 $src, 1), 1 => s_and_b32 $src, 1
s_cmp_eq_i32 (s_and_b32 $src, 1), 1 => s_and_b32 $src, 1
s_cmp_eq_u64 (s_and_b64 $src, 1), 1 => s_and_b64 $src, 1
s_cmp_ge_u32 (s_and_b32 $src, 1), 1 => s_and_b32 $src, 1
s_cmp_ge_i32 (s_and_b32 $src, 1), 1 => s_and_b32 $src, 1
s_cmp_lg_u32 (s_and_b32 $src, 1), 0 => s_and_b32 $src, 1
s_cmp_lg_i32 (s_and_b32 $src, 1), 0 => s_and_b32 $src, 1
s_cmp_lg_u64 (s_and_b64 $src, 1), 0 => s_and_b64 $src, 1
s_cmp_gt_u32 (s_and_b32 $src, 1), 0 => s_and_b32 $src, 1
s_cmp_gt_i32 (s_and_b32 $src, 1), 0 => s_and_b32 $src, 1

Differential Revision: https://reviews.llvm.org/D109031
2021-09-01 15:57:05 -07:00
Alina Sbirlea a10409fe23 [MemorySSAUpdater] Simplify updates when only deleting edges.
When performing only edge deletion, we don't need to do the DT updates
back and forth. Check for the existance of insert updates to simplify
this.
2021-09-01 15:48:20 -07:00
Joel E. Denny 1f9e437065 [OpenMP][AMDGPU] Remove unneeded XFAILs 2021-09-01 18:00:25 -04:00
Roman Lebedev f5753125f0
[Codegen][TLI][X86] SimplifyMultipleUseDemandedBits(): 0'th vec subreg widening is free, try to perform it earlier
I believe, the profitability reasoning here is correct
"sub"reg is already located within the 0'th subreg of wider reg,
so if we have suvector insertion at index 0 into undef,
then it's always free do to.

After this, D109065 finally avoids the regression in D108382.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D109074
2021-09-02 00:54:05 +03:00
Fangrui Song 68745a557e [InstrProfiling] Use llvm.compiler.used if applicable for Mach-O
Similar to D97585.

D25456 used `S_ATTR_LIVE_SUPPORT` to ensure the data variable will be retained
or discarded as a unit with the counter variable, so llvm.compiler.used is
sufficient. It allows ld to dead strip unneeded profc and profd variables.

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D105445
2021-09-01 14:46:51 -07:00
Wenlei He 4ef88031f5 [llvm-profdata] Fix assertion from invalid iterator
Differential Revision: https://reviews.llvm.org/D109096
2021-09-01 14:42:00 -07:00
Alexander Pivovarov 4b04d54206 [RISCV] Fix typo in RISCVSchedSiFive7.td
Fix typo in "microarchitecure".

Differential Revision: https://reviews.llvm.org/D109006
2021-09-01 16:39:48 -05:00
David Green 49476a4d66 [ARM] Add MVE lowering for fptosi.sat
This adds lowering of the llvm.fptosi.sat and llvm.fptoui.sat intinsics,
selecting a VCVT instruction which under MVE will inherently perform the
saturate.

Differential Revision: https://reviews.llvm.org/D107865
2021-09-01 22:38:47 +01:00
Joel E. Denny 786a140650 [OpenMP] Use IsHostPtr where needed in rest of omptarget.cpp
As started in D107925, this patch replaces the remaining occurrences
of `UNIFIED_SHARED_MEMORY && TgtPtrBegin == HstPtrBegin` in
`omptarget.cpp` with `IsHostPtr`.  The former condition is broken in
the rare case that the device and host happen to use the same address
for their mapped allocations.  I don't know how to write a test that's
likely to reveal this case.

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D107928
2021-09-01 17:31:42 -04:00
Joel E. Denny d11bab0b73 [OpenMP] Use IsHostPtr where needed for targetDataBegin
As discussed in D105990, without this patch, `targetDataBegin`
determines whether to transfer data (as opposed to assuming it's in
shared memory) using the condition `!UseUSM || HasCloseModifier`.
However, this condition is broken if use of discrete memory was forced
by `omp_target_associate_ptr`.  This patch extends
`unified_shared_memory/associate_ptr.c` to reveal this case, and it
fixes it using `!IsHostPtr` in `DeviceTy::getTargetPointer` to replace
this condition.

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D107927
2021-09-01 17:31:42 -04:00
Joel E. Denny fa6c275505 [OpenMP][NFC] Eliminate CopyMember from targetDataEnd
This patch is based on comments in D105990.  It is NFC according to
the following observations:

1. `CopyMember` is computed as `!IsHostPtr && IsLast`.
2. `DelEntry` is true only if `IsLast` is true.

We apply those observations in order:

```
if ((DelEntry || Always || CopyMember) && !IsHostPtr)

if ((DelEntry || Always || IsLast) && !IsHostPtr)

if ((Always || IsLast) && !IsHostPtr)
```

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D107926
2021-09-01 17:31:42 -04:00
Joel E. Denny 8e4836b2a2 [OpenMP] Use IsHostPtr where needed for targetDataEnd
As discussed in D105990, without this patch, `targetDataEnd`
determines whether to transfer data or delete a device mapping (as
opposed to assuming it's in shared memory) using two different
conditions, each of which is broken for some cases:

1. `!(UNIFIED_SHARED_MEMORY && TgtPtrBegin == HstPtrBegin)`: The
   broken case is rare: the device and host might happen to use the
   same address for their mapped allocations.  I don't know how to
   write a test that's likely to reveal this case, but this patch does
   fix it, as discussed below.
2. `!UNIFIED_SHARED_MEMORY || HasCloseModifier`: There are at least
   two broken cases:
    1. The `close` modifier might have been specified on an `omp
      target enter data` but not the corresponding `omp target exit
      data`, which thus might falsely assume a mapping is in shared
      memory.  The test `unified_shared_memory/close_enter_exit.c`
      already has a missing deletion as a result, and this patch adds
      a check for that.  This patch also adds the new test
      `close_member.c` to reveal a missing transfer and deletion.
    2. Use of discrete memory might have been forced by
      `omp_target_associate_ptr`, as in the test
      `unified_shared_memory/api.c`.  In the current `targetDataEnd`
      implementation, this condition turns out not be used for this
      case: because the reference count is infinite, a transfer is
      possible only with an `always` modifier, and this condition is
      never used in that case.  To ensure it's never used for that
      case in the future, this patch adds the test
      `unified_shared_memory/associate_ptr.c`.

Fortunately, `DeviceTy::getTgtPtrBegin` already has a solution: it
reports whether the allocation was found in shared memory via the
variable `IsHostPtr`.

After this patch, `HasCloseModifier` is no longer used in
`targetDataEnd`, and I wonder if the `close` modifier is ever useful
on an `omp target data end`.

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D107925
2021-09-01 17:31:42 -04:00
Arthur Eubanks 39f780b51d [OpaquePtr] Cleanup some uses of getPointerElementType() in TailRecursionElimination 2021-09-01 14:24:47 -07:00
Fangrui Song 623bf6c34b [InstrProfiling][test] Combine profiling.ll and linkage.ll
The latter mostly covers the former.
2021-09-01 14:21:42 -07:00
Philip Reames 73b951a7f7 [SCEV] Clarify requirements for zero-stride to be UB
There's a silent bug in our reasoning about zero strides. We assume that having a single static exit implies that if that exit is not taken, then the loop must be infinite. This ignores the potential for abnormal exits via exceptions. Consider the following example:

for (uint_8 i = 0; i < 1; i += 0) {
  throw_on_thousandth_call();
}

Our reasoning is such that we'd conclude this loop can't take the backedge as that would lead to a (presumed) infinite loop.

In practice, this is a silent bug because the loopIsFiniteByAssumption returns false strictly more often than the loopHaNoAbnormalExits property. We could reasonable want to change that in the future, so fixing the codeflow now is worthwhile.

Differential Revision: https://reviews.llvm.org/D109029
2021-09-01 14:01:13 -07:00
Jon Chesterfield 06cdf48a0d [openmp] Drop test from D109057, disproportionately difficult to run on windows 2021-09-01 21:51:51 +01:00
alex-t e3cbf1d437 [AMDGPU] enable scalar compare in truncate selection
Currently, the truncate selection dag node is expanded as a bitwise AND plus compare to 1.  This change enables scalar comparison in the pattern if the truncate node is uniform.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D108925
2021-09-01 23:35:11 +03:00
Philip Reames 3af8a11bc6 [LoopDeletion] Separate logic in breakBackedgeIfNotTaken using symboic max trip count [nfc]
As mentioned in D108833, the logic for figuring out if a backedge is dead was somewhat interwoven with the SCEV based logic and the symbolic eval logic. This is my attempt at making the code easier to follow.

Note that this is only NFC after the work done in 29fa37ec.  Thanks to Nikita for catching that case.

Differential Revision: https://reviews.llvm.org/D108848
2021-09-01 13:30:46 -07:00
Jon Chesterfield c7cbf1a03e [openmp] Accept directory for libomptarget-bc-path
The commandline flag to specify a particular openmp devicertl library
currently errors like:
```
fatal error: cannot open file
      './runtimes/runtimes-bins/openmp/libomptarget':
      Is a directory
```
CommonArgs successfully appends the directory to the commandline args then
mlink-builtin-bitcode rejects it.

This patch is a point fix to that. If --libomptarget-amdgcn-bc-path=directory
then append the expected name for the current architecture and go on as before.
This is useful for test runners that don't hardcode the architecture.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D109057
2021-09-01 21:22:35 +01:00
Nikita Popov 7f058ce8c2 [WebAssembly] Support opaque pointers in FixFunctionBitcasts
With opaque pointers, no actual bitcasts will be present. Instead,
there will be a mismatch between the call FunctionType and the
function ValueType. Change the code to collect CallBases
specifically (rather than general Uses) and compare these types.

RAUW is no longer performed, as there would no longer be any
bitcasts that can be RAUWd.

Differential Revision: https://reviews.llvm.org/D108880
2021-09-01 22:17:24 +02:00