[Libomptarget] Increase stack size for bug49779 test

The 'bug49779.cpp' test has been failing recently. This is because the
runtime is sufficiently complex when using nested parallelism without
optimizations that the CUDA tools cannot statically determine the stack
size. Because of this the kernel can exceed the thread stack size and
crash. Work around this using the 'LIBOMPTARGET_STACK_SIZE' environment
variable and add an FAQ entry for this situation.

Fixes #53670

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D119357
This commit is contained in:
Joseph Huber 2022-02-09 13:43:56 -05:00
parent c45c1b130b
commit 9582f09690
2 changed files with 18 additions and 5 deletions

View file

@ -313,3 +313,16 @@ Using this module requires at least CMake version 3.13.4. Supported languages
are C and C++ with Fortran support planned in the future. Compiler support is
best for Clang but this module should work for other compiler vendors such as
IBM, GNU.
Q: What does 'Stack size for entry function cannot be statically determined' mean?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is a warning that the Nvidia tools will sometimes emit if the offloading
region is too complex. Normally, the CUDA tools attempt to statically determine
how much stack memory each thread. This way when the kernel is launched each
thread will have as much memory as it needs. If the control flow of the kernel
is too complex, containing recursive calls or nested parallelism, this analysis
can fail. If this warning is triggered it means that the kernel may run out of
stack memory during execution and crash. The environment variable
``LIBOMPTARGET_STACK_SIZE`` can be used to increase the stack size if this
occurs.

View file

@ -1,8 +1,8 @@
// RUN: %libomptarget-compilexx-run-and-check-aarch64-unknown-linux-gnu
// RUN: %libomptarget-compilexx-run-and-check-powerpc64-ibm-linux-gnu
// RUN: %libomptarget-compilexx-run-and-check-powerpc64le-ibm-linux-gnu
// RUN: %libomptarget-compilexx-run-and-check-x86_64-pc-linux-gnu
// RUN: %libomptarget-compilexx-run-and-check-nvptx64-nvidia-cuda
// RUN: %libomptarget-compilexx-generic && \
// RUN: env LIBOMPTARGET_STACK_SIZE=2048 %libomptarget-run-generic
// UNSUPPORTED: amdgcn-amd-amdhsa
// UNSUPPORTED: amdgcn-amd-amdhsa-newDriver
#include <cassert>
#include <iostream>