From 9582f0969041d79a386171dbf95ef83e4566c530 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 9 Feb 2022 13:43:56 -0500
Subject: [PATCH] [Libomptarget] Increase stack size for bug49779 test

The 'bug49779.cpp' test has been failing recently. This is because the
runtime is sufficiently complex when using nested parallelism without
optimizations that the CUDA tools cannot statically determine the stack
size. Because of this the kernel can exceed the thread stack size and
crash. Work around this using the 'LIBOMPTARGET_STACK_SIZE' environment
variable and add an FAQ entry for this situation.

Fixes #53670

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D119357
---
 openmp/docs/SupportAndFAQ.rst                    | 13 +++++++++++++
 openmp/libomptarget/test/offloading/bug49779.cpp | 10 +++++-----
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/openmp/docs/SupportAndFAQ.rst b/openmp/docs/SupportAndFAQ.rst
index a3b776d11c7e..602d08f577d4 100644
--- a/openmp/docs/SupportAndFAQ.rst
+++ b/openmp/docs/SupportAndFAQ.rst
@@ -313,3 +313,16 @@ Using this module requires at least CMake version 3.13.4. Supported languages
 are C and C++ with Fortran support planned in the future. Compiler support is
 best for Clang but this module should work for other compiler vendors such as
 IBM, GNU.
+
+Q: What does 'Stack size for entry function cannot be statically determined' mean?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This is a warning that the Nvidia tools will sometimes emit if the offloading
+region is too complex. Normally, the CUDA tools attempt to statically determine
+how much stack memory each thread needs. This way when the kernel is launched each
+thread will have as much memory as it needs. If the control flow of the kernel
+is too complex, containing recursive calls or nested parallelism, this analysis
+can fail. If this warning is triggered it means that the kernel may run out of
+stack memory during execution and crash. The environment variable
+``LIBOMPTARGET_STACK_SIZE`` can be used to increase the stack size if this
+occurs.
diff --git a/openmp/libomptarget/test/offloading/bug49779.cpp b/openmp/libomptarget/test/offloading/bug49779.cpp
index 41fbc06595fb..c1fd68f2c709 100644
--- a/openmp/libomptarget/test/offloading/bug49779.cpp
+++ b/openmp/libomptarget/test/offloading/bug49779.cpp
@@ -1,8 +1,8 @@
-// RUN: %libomptarget-compilexx-run-and-check-aarch64-unknown-linux-gnu
-// RUN: %libomptarget-compilexx-run-and-check-powerpc64-ibm-linux-gnu
-// RUN: %libomptarget-compilexx-run-and-check-powerpc64le-ibm-linux-gnu
-// RUN: %libomptarget-compilexx-run-and-check-x86_64-pc-linux-gnu
-// RUN: %libomptarget-compilexx-run-and-check-nvptx64-nvidia-cuda
+// RUN: %libomptarget-compilexx-generic && \
+// RUN: env LIBOMPTARGET_STACK_SIZE=2048 %libomptarget-run-generic
+
+// UNSUPPORTED: amdgcn-amd-amdhsa
+// UNSUPPORTED: amdgcn-amd-amdhsa-newDriver
 
 #include
 #include
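
For illustration only, and not part of the patch: a minimal sketch of the kind of
offload region the FAQ entry describes, where a recursive call inside the target
region may prevent the CUDA tools from computing a static stack size. The names
`depth_sum` and `run_offload` are hypothetical and do not come from
'bug49779.cpp'. Whether the warning actually appears depends on the toolchain
and optimization level.

```cpp
// Hypothetical example; the function names are assumptions for illustration.

#pragma omp declare target
// Recursion means the per-thread stack depth depends on a runtime value,
// which is the situation the FAQ entry says static analysis cannot handle.
int depth_sum(int n) { return n == 0 ? 0 : n + depth_sum(n - 1); }
#pragma omp end declare target

// Runs the recursive function inside an OpenMP target region. When built
// without offloading support the region simply executes on the host.
int run_offload(int n) {
  int result = 0;
#pragma omp target map(tofrom : result)
  { result = depth_sum(n); }
  return result;
}
```

Assuming a caller that invokes `run_offload`, the program could be built with
something like `clang++ -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda` and, if
the kernel crashes at run time, launched as
`env LIBOMPTARGET_STACK_SIZE=2048 ./a.out`, mirroring the RUN line added to the
test above. The value 2048 is the one used in the test, not a general
recommendation.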