Commit graph

421204 commits

Author SHA1 Message Date
Andrew Litteken a919d3d888 [IROutliner] Ensure that incoming blocks of PHINodes are included in the unique numbering generation for phi nodes for each exit path
Issue: https://github.com/llvm/llvm-project/issues/54431

PHINodes that need to be generated to accommodate a PHINode outside the region due to different output paths need to have their own numbering to determine the number of output schemes required to properly handle all the outlined regions. This numbering was previously only determined by the order and values of the incoming values, as well as the parent block of the PHINode. This adds the incoming blocks to the calculation of a hash value for these PHINodes as well, and the supporting infrastructure to give each block in a region a corresponding canonical numbering.

Reviewer: paquette

Differential Revision: https://reviews.llvm.org/D122207
2022-04-14 12:13:17 -05:00
Thomas Raoux 4c564940a1 [mlir][nvgpu] Add NVGPU dialect (architectural specific gpu dialect)
This introduces a new dialect for vendor-specific PTX operations. This
also adds the first operation, ldmatrix, as an example. More operations
will be added in follow-up patches.
This new dialect is meant to be a bridge between the GPU and Vector
dialects and the NVVM dialect.

This is based on the RFC proposed here:
https://discourse.llvm.org/t/rfc-add-nv-gpu-dialect-hw-specific-extension-of-gpu-dialect-for-nvidia-gpus/61466/8

Differential Revision: https://reviews.llvm.org/D123266
2022-04-14 16:33:46 +00:00
Arthur Eubanks f14ebe91c5 [clang-format] Skip preprocessor lines when finding the record lbrace
With D117142, we would now format

```
struct A {
#ifdef A
  void f() { a(); }
#endif
};
```

into

```
struct A {
#ifdef A
  void f() {
    a();
  }
#endif
};
```

because we were looking for the record lbrace without skipping preprocessor lines.

Fixes https://github.com/llvm/llvm-project/issues/54901.

Reviewed By: curdeius, owenpan

Differential Revision: https://reviews.llvm.org/D123737
2022-04-14 09:31:15 -07:00
Paul Walker a5a258e208 [SVE] Refactor MGATHER lowering for unsupported passthru values.
Handle unsupported passthru values before lowering the gather to
target specific nodes. This is a simplification that's on the road
to moving more of MGATHER lowering into td based isel.

Differential Revision: https://reviews.llvm.org/D123683
2022-04-14 17:26:43 +01:00
chenglin.bi 00871e2f4f [SimplifyCFG] Try to fold switch with single result value and power-of-2 cases to mask+select
When a switch with 2^n cases goes to one result, check whether the 2^n cases can be covered by an n-bit mask.
If so, we can use "and condition, ~mask" to simplify the switch.

case 0 2 4 6 -> and condition, -7
https://alive2.llvm.org/ce/z/jjH_0N

case 0 2 8 10 -> and condition, -11
https://alive2.llvm.org/ce/z/K7E-2V

case 2 4 10 12 -> and (sub condition, 2), -11
https://alive2.llvm.org/ce/z/CrxbYg
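
The mask check described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the SimplifyCFG code: the helper names `fold_switch_to_mask` and `matches` are made up here, and Python's unbounded integers stand in for fixed-width IR values.

```python
def fold_switch_to_mask(cases):
    """Return (offset, mask) such that ((cond - offset) & ~mask) == 0
    exactly when cond is one of `cases`, or None if no such mask exists."""
    n = len(cases)
    # The fold only applies to a power-of-two number of distinct cases.
    if n == 0 or n & (n - 1) or len(set(cases)) != n:
        return None
    offset = min(cases)
    # OR together the rebased cases: the set bits are the "don't care" bits.
    mask = 0
    for c in cases:
        mask |= c - offset
    # 2^k distinct cases need exactly k don't-care bits to be covered
    # with no extra values slipping through.
    if bin(mask).count("1") != n.bit_length() - 1:
        return None
    return offset, mask


def matches(cond, offset, mask):
    # Equivalent of: icmp eq (and (sub cond, offset), ~mask), 0
    return ((cond - offset) & ~mask) == 0
```

For cases 0, 2, 4, 6 the sketch returns offset 0 and mask 6, and `~6` is -7, the constant in the first example. A set such as 2, 4, 10, 12 is first rebased by its minimum and then yields mask 10, that is `~mask == -11`.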

Fix one case of https://github.com/llvm/llvm-project/issues/39957

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D122485
2022-04-15 00:10:00 +08:00
Peter Klausler 9b200074b1 [flang] Fix combining cases of USE association & generic interfaces
Fortran admits a few ways to have multiple symbols with the
same name in the same scope.  Two of them involve generic
interfaces (from INTERFACE or GENERIC, the syntax doesn't matter);
these are allowed to inhabit a scope with either a derived type or
a subprogram that is also a specific procedure of the generic.
(But not both a derived type and a subprogram; they could not
cohabit a scope anyway, generic or not.)

In cases of USE association, f18 needs to be capable of combining
use-associated generic interfaces with other use-associated entities.
Two generics get merged (this case was nearly correct); a generic
and a derived type can merge into a GenericDetails with a shadowed
derivedType(); and a generic can replace or ignore a use-associated
procedure of the same name so long as that procedure is already
one of its specifics.

Further, these modifications to the use-associated generic
interface must be made to a local copy of the symbol.  The previous
code was messing directly with the symbol in the module's scope.

The fix is basically a reimplementation of the member function
DoAddUse() in name resolution.

Differential Revision: https://reviews.llvm.org/D123704
2022-04-14 08:59:52 -07:00
John Brawn 27a8735a44 [AArch64] Add mayRaiseFPException to appropriate instructions
This is mostly handled by adding "let mayRaiseFPException = 1" before
the definition of the relevant instruction classes, but there are a
couple of complications:
 * When we have a multiclass where currently some instantiations are
   of instructions that can raise an exception and others aren't we
   need to split that into two multiclasses, one inheriting from the
   other using a multiclass parameter to enable exceptions.
 * In a couple of places in the globalisel instruction selector we
   need to manually set the NoFPExcept flag. There's also another
   place that looks like it should need it, but that code is never hit
   for those opcodes due to them being handled by the generic
   instruction selector, so I've instead just removed them from the
   switch.

Differential Revision: https://reviews.llvm.org/D115352
2022-04-14 16:51:22 +01:00
John Brawn 1b1466c346 [AArch64] Adjust aarch64 constrained intrinsics tests and un-XFAIL
Remove the checking of the generated asm, as that's already tested
elsewhere, and adjust some tests that were expecting the wrong
intrinsic to be generated.

Differential Revision: https://reviews.llvm.org/D118259
2022-04-14 16:51:22 +01:00
John Brawn 12c1022679 [AArch64] Lowering and legalization of strict FP16
Strict FP16 needs some changes in lowering and legalization to work
correctly:
 * SelectionDAGLegalize::PromoteNode was missing handling for some
   strict fp opcodes.
 * Some of the custom lowering of strict fp operations needed to be
   adjusted to work with FP16.
 * Custom lowering needed to be added for round-to-int operations.

With this, and the previous patches for the rest of the strict fp
isel, we can set IsStrictFPEnabled = true.

Differential Revision: https://reviews.llvm.org/D115620
2022-04-14 16:51:22 +01:00
Kevin P. Neal d43d9e1d5c [FPEnv][InstSimplify] Fold fsub -0.0, -X ==> X
Currently the fsub optimizations in InstSimplify don't know how to fold
-0.0 - (-X) to X when the constrained intrinsics are used. This adds partial
support. The rest of the support will come later with work on the IR
matchers.
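
A small Python check, assuming the default round-to-nearest mode that Python floats use, illustrates why the fold needs -0.0 rather than +0.0 on the left-hand side. The helper `same_float` and the sample list are made up for this sketch; NaN inputs and other rounding modes are deliberately left out.

```python
import math


def same_float(a, b):
    """Equality that also distinguishes +0.0 from -0.0."""
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)


samples = [0.0, -0.0, 1.5, -2.25, math.inf, -math.inf]

# -0.0 - (-X) reproduces X for every sample, including both zeros.
neg_zero_ok = all(same_float(-0.0 - (-x), x) for x in samples)

# +0.0 - (-X) does not: for X = -0.0 it yields +0.0 instead of -0.0.
pos_zero_ok = all(same_float(0.0 - (-x), x) for x in samples)
```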

This review is split out from D107285.

Differential Revision: https://reviews.llvm.org/D123396
2022-04-14 11:48:54 -04:00
LLVM GN Syncbot f80e47884c [gn build] Port 1fdf952dee 2022-04-14 15:22:09 +00:00
Chris Bieneman 1fdf952dee [HLSL] Add Semantic syntax, and SV_GroupIndex
HLSL has a language feature called Semantics which get attached to
declarations like attributes and are used in a variety of ways.

One example of semantic use is here with the `SV_GroupIndex` semantic
which, when applied to an input for a compute shader is pre-populated
by the driver with a flattened thread index.

Differential Revision: https://reviews.llvm.org/D122699

# Conflicts:
#	clang/include/clang/Basic/Attr.td
#	clang/include/clang/Basic/AttrDocs.td
2022-04-14 10:21:58 -05:00
LLVM GN Syncbot 4f1065156b [gn build] Port e471ba3d01 2022-04-14 14:51:17 +00:00
Joseph Huber e471ba3d01 [Object] Add binary format for bundling offloading metadata
We need to embed certain metadata along with a binary image when we wish
to perform a device-linking job on it. Previously this metadata was
embedded in the section name of the data itself. This worked, but made
adding new metadata very difficult and didn't work if the user did any
sort of section linking.

This patch introduces a custom binary format for bundling offloading
metadata with a device object file. This binary format is fundamentally
a simple string map table with some additional data and an embedded
image. I decided to use a custom format rather than using an existing
format (ELF, JSON, etc) because of the specialty use-case of this. We
need a simple binary format that can be concatenated without requiring
other external dependencies.
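
To make the idea of a concatenable "string map plus image" record concrete, here is a rough Python sketch. The layout, the `DEMO` magic, and the function names are all hypothetical and are not the format this patch defines; the only point is that a size field in the header lets records be appended back to back and walked again.

```python
import struct

MAGIC = b"DEMO"  # hypothetical magic, not the value used by the patch
HEADER_SIZE = 12  # 4-byte magic + 8-byte total record size


def pack_bundle(metadata, image):
    """Serialize a str -> str map plus an image into one self-sized record."""
    body = struct.pack("<I", len(metadata))
    for key, value in metadata.items():
        for s in (key, value):
            raw = s.encode("utf-8")
            body += struct.pack("<I", len(raw)) + raw
    body += struct.pack("<Q", len(image)) + image
    # The header stores the total record size so records can be concatenated.
    return MAGIC + struct.pack("<Q", HEADER_SIZE + len(body)) + body


def unpack_bundles(blob):
    """Yield (metadata, image) for each record in a concatenated blob."""
    pos = 0
    while pos < len(blob):
        if blob[pos:pos + 4] != MAGIC:
            raise ValueError("bad magic")
        (size,) = struct.unpack_from("<Q", blob, pos + 4)
        cur = pos + HEADER_SIZE
        (count,) = struct.unpack_from("<I", blob, cur)
        cur += 4
        metadata = {}
        for _ in range(count):
            strings = []
            for _ in range(2):  # key, then value
                (n,) = struct.unpack_from("<I", blob, cur)
                cur += 4
                strings.append(blob[cur:cur + n].decode("utf-8"))
                cur += n
            metadata[strings[0]] = strings[1]
        (n,) = struct.unpack_from("<Q", blob, cur)
        cur += 8
        yield metadata, blob[cur:cur + n]
        pos += size  # jump to the next concatenated record
```

Two records produced by `pack_bundle` can simply be joined with `+` and read back with `list(unpack_bundles(blob))`; adding a new metadata key does not change the layout.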

This extension will make it easier to extend the linker wrapper's
capabilities with whatever data is necessary. Eventually this will allow
us to remove all the external arguments passed to the linker wrapper and
embed it directly in the host's linker so device linking behaves exactly
like host linking.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D122069
2022-04-14 10:50:52 -04:00
Joseph Huber 11f47b791f [OpenMP] Make offloading sections have the SHF_EXCLUDE flag
Offloading sections can be embedded in the host during codegen via a
section. This section was originally marked as metadata to prevent it
from being loaded, but these sections are completely unused at runtime
so the linker should automatically drop them from the final executable
or shared library. This flag adds support for the SHF_EXCLUDE flag in
target lowering and uses it.

Reviewed By: JonChesterfield, MaskRay

Differential Revision: https://reviews.llvm.org/D122987
2022-04-14 10:50:49 -04:00
Peter Klausler ef141aec3c [flang] Improve appearance of message attachments
Error messages can have a list of attachments; these are used to point
to related source locations, supply additional information, and to
encapsulate error messages that were *not* emitted in a given context
to explain why a warning was justified.

This patch adds a message severity ("Because") for that last case,
and extends the AttachTo() API to provide a means for overriding
the severity of an attached message.

Some existing message attachments had their severities adjusted,
now that we're printing them.  And operator==() for Message was
cleaned up while debugging after I noticed that it was recursively
O(N**2) and subject to returning a false positive.

Differential Revision: https://reviews.llvm.org/D123710
2022-04-14 07:34:50 -07:00
Fabian Wolff 4ff70dba38 [libc++] Fix undefined behavior in std::filebuf
Fixes https://github.com/llvm/llvm-project/issues/49267.
Fixes https://github.com/llvm/llvm-project/issues/49282.
Fixes https://github.com/llvm/llvm-project/issues/49789.

Reviewed By: ldionne

Differential Revision: https://reviews.llvm.org/D122257
2022-04-14 16:20:51 +02:00
David Green 1ba8f4f67d [AArch64] Move v4i8 concat load lowering to a combine.
The existing code was not updating the uses of loads that it recreated,
leading to incorrect chains which could break the ordering between
nodes. This moves the code to a combine instead, and makes sure we
update the chain references. This does mean it happens earlier -
potentially before the concats are simplified. This can lead to
inefficiencies in the codegen, which will be fixed in followups.
2022-04-14 15:19:33 +01:00
Andrew Savonichev b6183a57a1 [NVPTX] Fix barrier.ll LIT test
The second parameter should be a multiple of the warp size (32).

PTX ISA spec, s9.7.12.1. Parallel Synchronization and Communication
Instructions: bar, barrier

barrier.sync{.aligned}      a{, b};

Operand b specifies the number of threads participating in the
barrier. If no thread count is specified, all threads in the CTA
participate in the barrier. When specifying a thread count, the value
must be a multiple of the warp size.

Differential Revision: https://reviews.llvm.org/D123470
2022-04-14 17:07:53 +03:00
Andrew Savonichev 32949401a8 [NVPTX] Avoid dots in global names
It seems that ptxas cannot parse them:
ptxas fatal: Parsing error near '.2': syntax error

Differential Revision: https://reviews.llvm.org/D123041
2022-04-14 17:07:52 +03:00
Andrew Savonichev 4cef5c397d [NVPTX] .attribute(.managed) is only supported for sm_30 and PTX 4.0
PTX ISA spec, s5.4.8. Variable Attribute Directive: .attribute

PTX ISA Notes
Introduced in PTX ISA version 4.0.

Target ISA Notes
.managed attribute requires sm_30 or higher.

Differential Revision: https://reviews.llvm.org/D123040
2022-04-14 17:07:52 +03:00
Andrew Savonichev 230f326964 [NVPTX] shfl.sync is introduced in PTX 6.0
PTX ISA spec, s9.7.8.6. Data Movement and Conversion Instructions:
shfl.sync

PTX ISA Notes
Introduced in PTX ISA version 6.0.

Target ISA Notes
Requires sm_30 or higher.

Differential Revision: https://reviews.llvm.org/D123039
2022-04-14 17:07:51 +03:00
Andrew Savonichev 369adba043 [NVPTX] 64-bit atom.{and,or,xor,min,max} require sm_32 or higher
PTX ISA spec, s9.7.12.4. Parallel Synchronization and Communication
Instructions: atom

Target ISA Notes
64-bit atom.{and,or,xor,min,max} require sm_32 or higher.

Differential Revision: https://reviews.llvm.org/D123038
2022-04-14 17:07:51 +03:00
LLVM GN Syncbot 5bf9aa38ab [gn build] Port 58d9ab70ae 2022-04-14 13:37:28 +00:00
Nikolas Klauser 58d9ab70ae [libc++][ranges] Implement ranges::minmax and ranges::minmax_element
Reviewed By: var-const, #libc, ldionne

Spies: sstefan1, ldionne, BRevzin, libcxx-commits, mgorny

Differential Revision: https://reviews.llvm.org/D120637
2022-04-14 15:37:22 +02:00
PeixinQiao 0b55a8dc6f [flang] Fix intrinsic interface for DIMAG/DCONJG
The intrinsics DREAL, DIMAG, and DCONJG are from Fortran 77 extensions.
For DREAL, the type of argument is extended to any complex. For DIMAG
and DCONJG, the type of argument for them should be complex(8). For DIMAG,
the result type should be real(8). For DCONJG, the result type should be
complex(8). Fix the intrinsic interface for them and add test cases for
the semantic checks and the lowering.

Reviewed By: Jean Perier

Differential Revision: https://reviews.llvm.org/D123459
2022-04-14 21:34:00 +08:00
PeixinQiao 0ff322246b [flang] Fix float-number representation bug
A normal float number is represented as (-1)^s * 1.f * 2^(e-127) for 32-bit,
where s is the sign bit, e is the biased exponent, and f is the mantissa. When
the exponent bits are all zeros, the float number is represented as
(-1)^s * 0.f * 2^(-126) for 32-bit, in which case the intPart is '0'.
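
The two cases can be checked with a short Python sketch that splits a 32-bit float into its fields and rebuilds the value. It follows IEEE 754 binary32, writing the normal case as (-1)^s * 1.f * 2^(e-127) with e the biased exponent field; the function name `decode_float32` and the returned tuple are made up for illustration and have nothing to do with the flang sources.

```python
import struct


def decode_float32(x):
    """Split a value into IEEE 754 binary32 fields and rebuild it."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    s = bits >> 31               # sign bit
    e = (bits >> 23) & 0xFF      # biased exponent
    f = bits & 0x7FFFFF          # 23 fraction bits
    if e == 0xFF:
        raise ValueError("infinity or NaN has no finite value")
    if e == 0:
        # Zero or subnormal: integer part 0, fixed exponent -126.
        int_part, exp = 0, -126
    else:
        # Normal number: implicit integer part 1, exponent e - 127.
        int_part, exp = 1, e - 127
    value = (-1) ** s * (int_part + f / 2**23) * 2.0**exp
    return s, e, f, int_part, value
```

For example, `decode_float32(2.0**-149)` takes the all-zero-exponent branch with an integer part of 0, while `decode_float32(1.0)` takes the normal branch with an integer part of 1.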

Reviewed By: Jean Perier

https://reviews.llvm.org/D123673
2022-04-14 21:28:30 +08:00
Alex Zinenko e5a5e00825 [mlir] fix compiler warnings
-Wsign-compare and -Wunused-value in the recently introduced code.
2022-04-14 15:26:50 +02:00
LLVM GN Syncbot 65fcd31ef1 [gn build] Port dd47ab750b 2022-04-14 13:21:19 +00:00
Nico Weber dd47ab750b Revert "[clang-tidy] Add portability-std-allocator-const check"
This reverts commit 73da7eed8f.
Breaks check-clang-tools on Windows, see comment on
https://reviews.llvm.org/D123655
2022-04-14 09:20:51 -04:00
LLVM GN Syncbot 33ab2e3c84 [gn build] Port 73da7eed8f 2022-04-14 13:06:13 +00:00
Nico Weber a29d9ba1f5 Revert "[gn build] Port 73da7eed8fac"
This reverts commit bd4463bebf.
Breaks check-clang-tools on Windows, see comment on
https://reviews.llvm.org/D123655
2022-04-14 09:05:11 -04:00
Lei Zhang e54236dfb5 [mlir][vector] Cast away leading one dims for insert ops
Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D123621
2022-04-14 08:57:32 -04:00
Lei Zhang bc408afbfe [mlir][vector] Fold splat constant transpose
Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D123595
2022-04-14 08:51:25 -04:00
Paul Robinson 52d346e715 [PS4] NFC refactor of PS4 toolchain class, prep for PS5 2022-04-14 05:37:39 -07:00
Simon Pilgrim a305d8f44e [X86] Adjust fsetcc/fmin/fmax costs to match SoG (Issue #54889)
znver1/2 models were incorrectly modelling these as 3 cycle latency instructions on the wrong pipe and znver1 ymm variants also require double pumping.

Now matches AMD SoG, Agner and instlatx64 numbers.

Thanks to @fabian-r for the report
2022-04-14 13:27:33 +01:00
David Truby 66c44b20b4 [Clang][AArch64][SVE] Add shift operators for SVE vector types
This patch enables shift operators on SVE vector types, as well as
supporting vector-scalar shift operations.
Shifts by a scalar that is wider than the contained type in the
vector are permitted but, as in the C standard, if the value is larger
than the width of the type, the behavior is undefined.

Differential Revision: https://reviews.llvm.org/D123303
2022-04-14 13:20:50 +01:00
David Truby 53fd8db791 [Clang][AArch64][SVE] Allow subscript operator for SVE types
Undefined behaviour is just passed on to extract_element when the
index is out of bounds. Subscript on svbool_t is not allowed as
this doesn't really have meaningful semantics.

Differential Revision: https://reviews.llvm.org/D122732
2022-04-14 13:20:50 +01:00
Alex Zinenko d064c4801c [mlir] Introduce Transform dialect
This dialect provides operations that can be used to control transformation of
the IR using a different portion of the IR. It refers to the IR being
transformed as payload IR, and to the IR guiding the transformation as
transform IR.

The main use case for this dialect is orchestrating fine-grain transformations
on individual operations or sets thereof. For example, it may involve finding
loop-like operations with specific properties (e.g., large size) in the payload
IR, applying loop tiling to those and only those operations, and then applying
loop unrolling to the inner loops produced by the previous transformations. As
such, it is not intended as a replacement for the pass infrastructure, nor for
the pattern rewriting infrastructure. In the most common case, the transform IR
will be processed and applied to payload IR by a pass. Transformations
expressed by the transform dialect may be implemented using the pattern
infrastructure or any other relevant MLIR component.

This dialect is designed to be extensible, that is, clients of this dialect are
allowed to inject additional operations into this dialect using the
`TransformDialectExtension` mechanism newly introduced in this patch. This allows the
dialect to avoid a dependency on the implementation of the transformation as
well as to avoid introducing dialect-specific transform dialects.

See https://discourse.llvm.org/t/rfc-interfaces-and-dialects-for-precise-ir-transformation-control/60927.

Reviewed By: nicolasvasilache, Mogball, rriddle

Differential Revision: https://reviews.llvm.org/D123135
2022-04-14 13:48:45 +02:00
Alex Zinenko 09141f1adf [mlir] Split intrinsics out of LLVMOps.td
Move the operations that correspond to LLVM IR intrinsics in a separate .td
file. This makes it easier to maintain the intrinsics and decreases the compile
time of LLVMDialect.cpp by ~25%.

Depends On D123310

Reviewed By: wsmoses, jacquesguan

Differential Revision: https://reviews.llvm.org/D123315
2022-04-14 13:23:31 +02:00
Alex Zinenko 2366a43b3c [mlir] initial support for opaque pointers in the LLVM dialect
LLVM IR has introduced and is moving forward with the concept of opaque
pointers, i.e. pointer types that are not carrying around the pointee type.
Instead, memory-related operations indicate the type of the data being accessed
through the opaque pointer. Introduce the initial support for opaque pointers
in the LLVM dialect:

  - `LLVMPointerType` to support omitting the element type;
  - alloca/load/store/gep to support opaque pointers in their operands and
    results; this requires alloca and gep to store the element type as an
    attribute;
  - memory-related intrinsics to support opaque pointers in their operands;
  - translation to LLVM IR for the ops above is no longer using methods
    deprecated in LLVM API due to the introduction of opaque pointers.

Unlike LLVM IR, MLIR can afford to support both opaque and non-opaque pointers
at the same time and simplify the transition. Translation to LLVM IR of MLIR
that involves opaque pointers requires the LLVMContext to be configured to
always use opaque pointers.

Reviewed By: wsmoses

Differential Revision: https://reviews.llvm.org/D123310
2022-04-14 13:23:29 +02:00
Paul Walker 0c44115e51 [SVE] Add support for non-element-type sized scaling when lowering MGATHER/MSCATTER.
The lowering code did not use the scale operand of MGATHER/MSCATTER
nodes, but instead assumed scaled indices were always scaled based
on the element type of the memory type. This patch adds the missing
support by rewriting the nodes as unscaled variants.
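
The effect of rewriting a scaled node as an unscaled one can be shown with plain address arithmetic. The Python below is a toy model, not SelectionDAG code: the function names and the sample base, indices, and sizes are invented for the example.

```python
def gather_addresses(base, indices, scale):
    """Byte addresses touched by a gather: base + index * scale per lane."""
    return [base + i * scale for i in indices]


def rewrite_as_unscaled(indices, scale):
    """Fold the scale into the index vector so the node can use scale 1."""
    return [i * scale for i in indices], 1


base, indices = 0x1000, [0, 1, 5, 7]
elt_size = 4  # e.g. a gather of 32-bit elements
scale = 2     # a scale that differs from the element size

# Treating the scale as the element size picks the wrong addresses.
assumed = gather_addresses(base, indices, elt_size)
wanted = gather_addresses(base, indices, scale)

# After the rewrite, the unscaled form reaches the intended addresses.
new_indices, new_scale = rewrite_as_unscaled(indices, scale)
rewritten = gather_addresses(base, new_indices, new_scale)
```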

Differential Revision: https://reviews.llvm.org/D123670
2022-04-14 11:54:46 +01:00
Florian Hahn 2c14cdf831 [VPlan] Turn external defs in Value -> VPValue mapping.
This addresses an existing TODO by keeping a mapping of external IR
Value * definitions wrapped in VPValues for use in a VPlan.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D123700
2022-04-14 12:03:09 +02:00
Mike Kashkarov 68efe63565 [flang] Fix ICE for sqrt(0.0) evaluation
During real range reduction to [0.5, 4) with

    SQRT(2**(2a) * x) = SQRT(2**(2a)) * SQRT(x) = 2**a * SQRT(x)

we fall into infinite recursion if IsZero() == true.

Explicitly handle SQRT(0.0) instead of additional checks during folding. Also
add helpers for +0.0/-0.0 generation to clean up a bit.
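
The failure mode is easy to reproduce in a toy version of the reduction. The Python sketch below is not flang's folding code (which works on its own real-number type); it uses the same identity with loops instead of recursion and Newton's method for the reduced value, and the name `sqrt_reduced` is made up. Without the explicit zero check, the first loop would never terminate, since 0 * 4 is still 0.

```python
import math


def sqrt_reduced(x):
    """Square root via range reduction to [0.5, 4) plus Newton's method."""
    if x < 0:
        raise ValueError("negative argument")
    if x == 0 or x == math.inf:
        # Scaling by 4 never moves zero (or infinity) into [0.5, 4),
        # so these values are handled up front. Returning x keeps the
        # sign of -0.0.
        return x
    a = 0
    while x < 0.5:      # x = 4**a * x_reduced, with a going negative
        x *= 4.0
        a -= 1
    while x >= 4.0:
        x /= 4.0
        a += 1
    y = x               # initial guess inside the reduced range
    for _ in range(8):  # Newton iteration: y <- (y + x / y) / 2
        y = 0.5 * (y + x / y)
    # SQRT(2**(2a) * x) = 2**a * SQRT(x)
    return y * 2.0 ** a
```

Scaling by powers of 4 is exact in binary floating point, so only the Newton steps contribute rounding error.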

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D123131
2022-04-14 12:53:46 +03:00
Mike Kashkarov ba038a3080 [flang] Do not ICE on out-of-range data statement designator
Print error message instead of assert trigger.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D123132
2022-04-14 12:51:55 +03:00
Mike Kashkarov d2bcb0a129 [flang] Allow IMPLICIT NONE(EXTERNAL) with GenericDetails
The restrictions of IMPLICIT NONE(EXTERNAL) prohibit usage of c_associated from
iso_c_binding (with explicit interface) without an external definition; relax
the associated check.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D120971
2022-04-14 12:45:21 +03:00
Mehdi Amini 35f48edb91 Apply clang-tidy fixes for llvm-qualified-auto in VectorTransforms.cpp (NFC) 2022-04-14 09:42:37 +00:00
Mehdi Amini 2d06420879 Apply clang-tidy fixes for performance-for-range-copy in SCF.cpp (NFC) 2022-04-14 09:42:36 +00:00
Andrzej Warzynski 6c93e1d329 [flang][driver] Add support for -mmlir
The semantics of `-mmlir` are identical to `-mllvm`. The only notable
difference is that `-mmlir` options should be forwarded to MLIR rather
than LLVM.

Note that MLIR llvm::cl options are lazily constructed on demand (see
the definition of options in PassManagerOptions.cpp). This means that:
  * MLIR global options are only visible when explicitly initialised and
    displayed only when using `-mmlir --help`,
  * Flang and LLVM global options are always visible and displayed when
    using either `-mllvm -help` or `-mmlir --help`.

In other words, `-mmlir --help` is a superset of `-mllvm --help`. This is not
ideal, but we'd need to refactor all option definitions in Flang and
LLVM to improve this. I suggest leaving this for later.

Differential Revision: https://reviews.llvm.org/D123297
2022-04-14 09:40:31 +00:00
LLVM GN Syncbot 04a3f3f167 [gn build] Port 6ba1b9075d 2022-04-14 09:08:24 +00:00