Index: vendor/clang/dist/docs/ReleaseNotes.rst =================================================================== --- vendor/clang/dist/docs/ReleaseNotes.rst (revision 314259) +++ vendor/clang/dist/docs/ReleaseNotes.rst (revision 314260) @@ -1,288 +1,193 @@ -======================================= -Clang 4.0.0 (In-Progress) Release Notes -======================================= +========================= +Clang 4.0.0 Release Notes +========================= .. contents:: :local: :depth: 2 Written by the `LLVM Team `_ -.. warning:: - - These are in-progress notes for the upcoming Clang 4.0.0 release. You may - prefer the `Clang 3.9 Release Notes - `_. - Introduction ============ This document contains the release notes for the Clang C/C++/Objective-C/OpenCL frontend, part of the LLVM Compiler Infrastructure, release 4.0.0. Here we describe the status of Clang in some detail, including major improvements from the previous release and new feature work. For the general LLVM release notes, see `the LLVM documentation `_. All LLVM releases may be downloaded from the `LLVM releases web site `_. For more information about Clang or LLVM, including information about the latest release, please check out the main please see the `Clang Web Site `_ or the `LLVM Web Site `_. What's New in Clang 4.0.0? ========================== Some of the major new features and improvements to Clang are listed here. Generic improvements to Clang as a whole or to its underlying infrastructure are described first, followed by language-specific sections with improvements to Clang's support for those languages. Major New Features ------------------ -- The ``diagnose_if`` attribute has been added to clang. This attribute allows +- The `diagnose_if `_ attribute has been + added to clang. This attribute allows clang to emit a warning or error if a function call meets one or more user-specified conditions. - Enhanced devirtualization with `-fstrict-vtable-pointers `_. Clang devirtualizes across different basic blocks, like loops: .. code-block:: c++ struct A { virtual void foo(); }; void indirect(A &a, int n) { for (int i = 0 ; i < n; i++) a.foo(); } void test(int n) { A a; indirect(a, n); } -- ... - Improvements to ThinLTO (-flto=thin) ------------------------------------ - Integration with profile data (PGO). When available, profile data enables more accurate function importing decisions, as well as cross-module indirect call promotion. - Significant build-time and binary-size improvements when compiling with debug - info (-g). + info (``-g``). -Improvements to Clang's diagnostics -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -- ... - New Compiler Flags ------------------ -The option -Og has been added to optimize the debugging experience. -For now, this option is exactly the same as -O1. However, in the future, -some other optimizations might be enabled or disabled. +- The option ``-Og`` has been added to optimize the debugging experience. + For now, this option is exactly the same as ``-O1``. However, in the future, + some other optimizations might be enabled or disabled. -The option -MJ has been added to simplify adding JSON compilation -database output into existing build systems. +- The option ``-MJ`` has been added to simplify adding JSON compilation + database output into existing build systems. -The option .... -New Pragmas in Clang ------------------------ - -Clang now supports the ... - - -Attribute Changes in Clang --------------------------- - -- ... - -Windows Support ---------------- - -Clang's support for building native Windows programs ... - - -C Language Changes in Clang ---------------------------- - -- ... - -... - -C11 Feature Support -^^^^^^^^^^^^^^^^^^^ - -... - -C++ Language Changes in Clang ------------------------------ - -... - -C++1z Feature Support -^^^^^^^^^^^^^^^^^^^^^ - -... - -Objective-C Language Changes in Clang -------------------------------------- - -... - OpenCL C Language Changes in Clang ---------------------------------- **The following bugs in the OpenCL header have been fixed:** * Added missing ``overloadable`` and ``convergent`` attributes. * Removed some erroneous extra ``native_*`` functions. **The following bugs in the generation of metadata have been fixed:** * Corrected the SPIR version depending on the OpenCL version. * Source level address spaces are taken from the SPIR specification. * Image types now contain no access qualifier. **The following bugs in the AMD target have been fixed:** * Corrected the bitwidth of ``size_t`` and NULL pointer value with respect to address spaces. * Added ``cl_khr_subgroups``, ``cl_amd_media_ops`` and ``cl_amd_media_ops2`` extensions. * Added ``cl-denorms-are-zero`` support. * Changed address spaces for image objects to be ``constant``. * Added little-endian. **The following bugs in OpenCL 2.0 have been fixed:** * Fixed pipe builtin function return type, added extra argument to generated IR intrinsics to propagate size and alignment information of the pipe packed type. * Improved pipe type to accommodate access qualifiers. * Added correct address space to the ObjC block generation and ``enqueue_kernel`` prototype. * Improved handling of integer parameters of ``enqueue_kernel`` prototype. We now allow ``size_t`` instead of ``int`` for specifying block parameter sizes. * Allow using NULL (aka ``CLK_NULL_QUEUE``) with ``queue_t``. **Improved the following diagnostics:** * Disallow address spaces other than ``global`` for kernel pointer parameters. * Correct the use of half type argument and pointer assignment with dereferencing. * Disallow variadic arguments in functions and blocks. * Allow partial initializer for array and struct. **Some changes to OpenCL extensions have been made:** * Added ``cl_khr_mipmap_image``. * Added ``-cl-ext`` flag to allow overwriting supported extensions otherwise set by the target compiled for (Example: ``-cl-ext=-all,+cl_khr_fp16``). * New types and functions can now be flexibly added to extensions using the following pragmas instead of modifying the Clang source code: .. code-block:: c #pragma OPENCL EXTENSION the_new_extension_name : begin // declare types and functions associated with the extension here #pragma OPENCL EXTENSION the_new_extension_name : end **Miscellaneous changes:** * Fix ``__builtin_astype`` to cast between different address space objects. * Allow using ``opencl_unroll_hint`` with earlier OpenCL versions than 2.0. * Improved handling of floating point literal to default to single precision if fp64 extension is not enabled. * Refactor ``sampler_t`` implementation to simplify initializer representation which is now handled as a compiler builtin function with an integer value passed into it. * Change fake address space map to use the SPIR convention. -* Added `the OpenCL manual - `_ to Clang +* Added `the OpenCL manual `_ to Clang documentation. -OpenMP Support in Clang ----------------------------------- -... - -Internal API Changes --------------------- - -These are major API changes that have happened since the 3.9 release of -Clang. If upgrading an external codebase that uses Clang as a library, -this section should help get you past the largest hurdles of upgrading. - -- ... - -AST Matchers ------------- - -... - -libclang --------- - -... - -With the option --show-description, scan-build's list of defects will also -show the description of the defects. - - Static Analyzer --------------- -With the option --show-description, scan-build's list of defects will also +With the option ``--show-description``, scan-build's list of defects will also show the description of the defects. The analyzer now provides better support of code that uses gtest. Several new checks were added: - The analyzer warns when virtual calls are made from constructors or destructors. This check is off by default but can be enabled by passing the - following command to scan-build: -enable-checker optin.cplusplus.VirtualCall. + following command to scan-build: ``-enable-checker optin.cplusplus.VirtualCall``. - The analyzer checks for synthesized copy properties of mutable types in - Objective C, such as NSMutableArray. Calling the setter for these properties + Objective C, such as ``NSMutableArray``. Calling the setter for these properties will store an immutable copy of the value. -- The analyzer checks for calls to dispatch_once() that use an Objective-C +- The analyzer checks for calls to ``dispatch_once()`` that use an Objective-C instance variable as the predicate. Using an instance variable as a predicate may result in the passed-in block being executed multiple times or not at all. These calls should be rewritten either to use a lock or to store the predicate in a global or static variable. -- The analyzer checks for unintended comparisons of NSNumber, CFNumberRef, and +- The analyzer checks for unintended comparisons of ``NSNumber``, ``CFNumberRef``, and other Cocoa number objects to scalar values. - -Python Binding Changes ----------------------- - -The following methods have been added: - -- ... - -Significant Known Problems -========================== Additional Information ====================== A wide variety of additional information is available on the `Clang web page `_. The web page contains versions of the API documentation which are up-to-date with the Subversion version of the source code. You can access versions of these documents specific to this release by going into the "``clang/docs/``" directory in the Clang tree. If you have any questions or comments about Clang, please feel free to contact us via the `mailing list `_. Index: vendor/clang/dist/lib/CodeGen/CGOpenMPRuntime.cpp =================================================================== --- vendor/clang/dist/lib/CodeGen/CGOpenMPRuntime.cpp (revision 314259) +++ vendor/clang/dist/lib/CodeGen/CGOpenMPRuntime.cpp (revision 314260) @@ -1,6795 +1,6797 @@ //===----- CGOpenMPRuntime.cpp - Interface to OpenMP Runtimes -------------===// // // The LLVM Compiler Infrastructure // // This file is distributed under the University of Illinois Open Source // License. See LICENSE.TXT for details. // //===----------------------------------------------------------------------===// // // This provides a class for OpenMP runtime code generation. // //===----------------------------------------------------------------------===// #include "CGCXXABI.h" #include "CGCleanup.h" #include "CGOpenMPRuntime.h" #include "CodeGenFunction.h" #include "ConstantBuilder.h" #include "clang/AST/Decl.h" #include "clang/AST/StmtOpenMP.h" #include "llvm/ADT/ArrayRef.h" #include "llvm/Bitcode/BitcodeReader.h" #include "llvm/IR/CallSite.h" #include "llvm/IR/DerivedTypes.h" #include "llvm/IR/GlobalValue.h" #include "llvm/IR/Value.h" #include "llvm/Support/Format.h" #include "llvm/Support/raw_ostream.h" #include using namespace clang; using namespace CodeGen; namespace { /// \brief Base class for handling code generation inside OpenMP regions. class CGOpenMPRegionInfo : public CodeGenFunction::CGCapturedStmtInfo { public: /// \brief Kinds of OpenMP regions used in codegen. enum CGOpenMPRegionKind { /// \brief Region with outlined function for standalone 'parallel' /// directive. ParallelOutlinedRegion, /// \brief Region with outlined function for standalone 'task' directive. TaskOutlinedRegion, /// \brief Region for constructs that do not require function outlining, /// like 'for', 'sections', 'atomic' etc. directives. InlinedRegion, /// \brief Region with outlined function for standalone 'target' directive. TargetRegion, }; CGOpenMPRegionInfo(const CapturedStmt &CS, const CGOpenMPRegionKind RegionKind, const RegionCodeGenTy &CodeGen, OpenMPDirectiveKind Kind, bool HasCancel) : CGCapturedStmtInfo(CS, CR_OpenMP), RegionKind(RegionKind), CodeGen(CodeGen), Kind(Kind), HasCancel(HasCancel) {} CGOpenMPRegionInfo(const CGOpenMPRegionKind RegionKind, const RegionCodeGenTy &CodeGen, OpenMPDirectiveKind Kind, bool HasCancel) : CGCapturedStmtInfo(CR_OpenMP), RegionKind(RegionKind), CodeGen(CodeGen), Kind(Kind), HasCancel(HasCancel) {} /// \brief Get a variable or parameter for storing global thread id /// inside OpenMP construct. virtual const VarDecl *getThreadIDVariable() const = 0; /// \brief Emit the captured statement body. void EmitBody(CodeGenFunction &CGF, const Stmt *S) override; /// \brief Get an LValue for the current ThreadID variable. /// \return LValue for thread id variable. This LValue always has type int32*. virtual LValue getThreadIDVariableLValue(CodeGenFunction &CGF); virtual void emitUntiedSwitch(CodeGenFunction & /*CGF*/) {} CGOpenMPRegionKind getRegionKind() const { return RegionKind; } OpenMPDirectiveKind getDirectiveKind() const { return Kind; } bool hasCancel() const { return HasCancel; } static bool classof(const CGCapturedStmtInfo *Info) { return Info->getKind() == CR_OpenMP; } ~CGOpenMPRegionInfo() override = default; protected: CGOpenMPRegionKind RegionKind; RegionCodeGenTy CodeGen; OpenMPDirectiveKind Kind; bool HasCancel; }; /// \brief API for captured statement code generation in OpenMP constructs. class CGOpenMPOutlinedRegionInfo final : public CGOpenMPRegionInfo { public: CGOpenMPOutlinedRegionInfo(const CapturedStmt &CS, const VarDecl *ThreadIDVar, const RegionCodeGenTy &CodeGen, OpenMPDirectiveKind Kind, bool HasCancel, StringRef HelperName) : CGOpenMPRegionInfo(CS, ParallelOutlinedRegion, CodeGen, Kind, HasCancel), ThreadIDVar(ThreadIDVar), HelperName(HelperName) { assert(ThreadIDVar != nullptr && "No ThreadID in OpenMP region."); } /// \brief Get a variable or parameter for storing global thread id /// inside OpenMP construct. const VarDecl *getThreadIDVariable() const override { return ThreadIDVar; } /// \brief Get the name of the capture helper. StringRef getHelperName() const override { return HelperName; } static bool classof(const CGCapturedStmtInfo *Info) { return CGOpenMPRegionInfo::classof(Info) && cast(Info)->getRegionKind() == ParallelOutlinedRegion; } private: /// \brief A variable or parameter storing global thread id for OpenMP /// constructs. const VarDecl *ThreadIDVar; StringRef HelperName; }; /// \brief API for captured statement code generation in OpenMP constructs. class CGOpenMPTaskOutlinedRegionInfo final : public CGOpenMPRegionInfo { public: class UntiedTaskActionTy final : public PrePostActionTy { bool Untied; const VarDecl *PartIDVar; const RegionCodeGenTy UntiedCodeGen; llvm::SwitchInst *UntiedSwitch = nullptr; public: UntiedTaskActionTy(bool Tied, const VarDecl *PartIDVar, const RegionCodeGenTy &UntiedCodeGen) : Untied(!Tied), PartIDVar(PartIDVar), UntiedCodeGen(UntiedCodeGen) {} void Enter(CodeGenFunction &CGF) override { if (Untied) { // Emit task switching point. auto PartIdLVal = CGF.EmitLoadOfPointerLValue( CGF.GetAddrOfLocalVar(PartIDVar), PartIDVar->getType()->castAs()); auto *Res = CGF.EmitLoadOfScalar(PartIdLVal, SourceLocation()); auto *DoneBB = CGF.createBasicBlock(".untied.done."); UntiedSwitch = CGF.Builder.CreateSwitch(Res, DoneBB); CGF.EmitBlock(DoneBB); CGF.EmitBranchThroughCleanup(CGF.ReturnBlock); CGF.EmitBlock(CGF.createBasicBlock(".untied.jmp.")); UntiedSwitch->addCase(CGF.Builder.getInt32(0), CGF.Builder.GetInsertBlock()); emitUntiedSwitch(CGF); } } void emitUntiedSwitch(CodeGenFunction &CGF) const { if (Untied) { auto PartIdLVal = CGF.EmitLoadOfPointerLValue( CGF.GetAddrOfLocalVar(PartIDVar), PartIDVar->getType()->castAs()); CGF.EmitStoreOfScalar(CGF.Builder.getInt32(UntiedSwitch->getNumCases()), PartIdLVal); UntiedCodeGen(CGF); CodeGenFunction::JumpDest CurPoint = CGF.getJumpDestInCurrentScope(".untied.next."); CGF.EmitBranchThroughCleanup(CGF.ReturnBlock); CGF.EmitBlock(CGF.createBasicBlock(".untied.jmp.")); UntiedSwitch->addCase(CGF.Builder.getInt32(UntiedSwitch->getNumCases()), CGF.Builder.GetInsertBlock()); CGF.EmitBranchThroughCleanup(CurPoint); CGF.EmitBlock(CurPoint.getBlock()); } } unsigned getNumberOfParts() const { return UntiedSwitch->getNumCases(); } }; CGOpenMPTaskOutlinedRegionInfo(const CapturedStmt &CS, const VarDecl *ThreadIDVar, const RegionCodeGenTy &CodeGen, OpenMPDirectiveKind Kind, bool HasCancel, const UntiedTaskActionTy &Action) : CGOpenMPRegionInfo(CS, TaskOutlinedRegion, CodeGen, Kind, HasCancel), ThreadIDVar(ThreadIDVar), Action(Action) { assert(ThreadIDVar != nullptr && "No ThreadID in OpenMP region."); } /// \brief Get a variable or parameter for storing global thread id /// inside OpenMP construct. const VarDecl *getThreadIDVariable() const override { return ThreadIDVar; } /// \brief Get an LValue for the current ThreadID variable. LValue getThreadIDVariableLValue(CodeGenFunction &CGF) override; /// \brief Get the name of the capture helper. StringRef getHelperName() const override { return ".omp_outlined."; } void emitUntiedSwitch(CodeGenFunction &CGF) override { Action.emitUntiedSwitch(CGF); } static bool classof(const CGCapturedStmtInfo *Info) { return CGOpenMPRegionInfo::classof(Info) && cast(Info)->getRegionKind() == TaskOutlinedRegion; } private: /// \brief A variable or parameter storing global thread id for OpenMP /// constructs. const VarDecl *ThreadIDVar; /// Action for emitting code for untied tasks. const UntiedTaskActionTy &Action; }; /// \brief API for inlined captured statement code generation in OpenMP /// constructs. class CGOpenMPInlinedRegionInfo : public CGOpenMPRegionInfo { public: CGOpenMPInlinedRegionInfo(CodeGenFunction::CGCapturedStmtInfo *OldCSI, const RegionCodeGenTy &CodeGen, OpenMPDirectiveKind Kind, bool HasCancel) : CGOpenMPRegionInfo(InlinedRegion, CodeGen, Kind, HasCancel), OldCSI(OldCSI), OuterRegionInfo(dyn_cast_or_null(OldCSI)) {} // \brief Retrieve the value of the context parameter. llvm::Value *getContextValue() const override { if (OuterRegionInfo) return OuterRegionInfo->getContextValue(); llvm_unreachable("No context value for inlined OpenMP region"); } void setContextValue(llvm::Value *V) override { if (OuterRegionInfo) { OuterRegionInfo->setContextValue(V); return; } llvm_unreachable("No context value for inlined OpenMP region"); } /// \brief Lookup the captured field decl for a variable. const FieldDecl *lookup(const VarDecl *VD) const override { if (OuterRegionInfo) return OuterRegionInfo->lookup(VD); // If there is no outer outlined region,no need to lookup in a list of // captured variables, we can use the original one. return nullptr; } FieldDecl *getThisFieldDecl() const override { if (OuterRegionInfo) return OuterRegionInfo->getThisFieldDecl(); return nullptr; } /// \brief Get a variable or parameter for storing global thread id /// inside OpenMP construct. const VarDecl *getThreadIDVariable() const override { if (OuterRegionInfo) return OuterRegionInfo->getThreadIDVariable(); return nullptr; } /// \brief Get the name of the capture helper. StringRef getHelperName() const override { if (auto *OuterRegionInfo = getOldCSI()) return OuterRegionInfo->getHelperName(); llvm_unreachable("No helper name for inlined OpenMP construct"); } void emitUntiedSwitch(CodeGenFunction &CGF) override { if (OuterRegionInfo) OuterRegionInfo->emitUntiedSwitch(CGF); } CodeGenFunction::CGCapturedStmtInfo *getOldCSI() const { return OldCSI; } static bool classof(const CGCapturedStmtInfo *Info) { return CGOpenMPRegionInfo::classof(Info) && cast(Info)->getRegionKind() == InlinedRegion; } ~CGOpenMPInlinedRegionInfo() override = default; private: /// \brief CodeGen info about outer OpenMP region. CodeGenFunction::CGCapturedStmtInfo *OldCSI; CGOpenMPRegionInfo *OuterRegionInfo; }; /// \brief API for captured statement code generation in OpenMP target /// constructs. For this captures, implicit parameters are used instead of the /// captured fields. The name of the target region has to be unique in a given /// application so it is provided by the client, because only the client has /// the information to generate that. class CGOpenMPTargetRegionInfo final : public CGOpenMPRegionInfo { public: CGOpenMPTargetRegionInfo(const CapturedStmt &CS, const RegionCodeGenTy &CodeGen, StringRef HelperName) : CGOpenMPRegionInfo(CS, TargetRegion, CodeGen, OMPD_target, /*HasCancel=*/false), HelperName(HelperName) {} /// \brief This is unused for target regions because each starts executing /// with a single thread. const VarDecl *getThreadIDVariable() const override { return nullptr; } /// \brief Get the name of the capture helper. StringRef getHelperName() const override { return HelperName; } static bool classof(const CGCapturedStmtInfo *Info) { return CGOpenMPRegionInfo::classof(Info) && cast(Info)->getRegionKind() == TargetRegion; } private: StringRef HelperName; }; static void EmptyCodeGen(CodeGenFunction &, PrePostActionTy &) { llvm_unreachable("No codegen for expressions"); } /// \brief API for generation of expressions captured in a innermost OpenMP /// region. class CGOpenMPInnerExprInfo final : public CGOpenMPInlinedRegionInfo { public: CGOpenMPInnerExprInfo(CodeGenFunction &CGF, const CapturedStmt &CS) : CGOpenMPInlinedRegionInfo(CGF.CapturedStmtInfo, EmptyCodeGen, OMPD_unknown, /*HasCancel=*/false), PrivScope(CGF) { // Make sure the globals captured in the provided statement are local by // using the privatization logic. We assume the same variable is not // captured more than once. for (auto &C : CS.captures()) { if (!C.capturesVariable() && !C.capturesVariableByCopy()) continue; const VarDecl *VD = C.getCapturedVar(); if (VD->isLocalVarDeclOrParm()) continue; DeclRefExpr DRE(const_cast(VD), /*RefersToEnclosingVariableOrCapture=*/false, VD->getType().getNonReferenceType(), VK_LValue, SourceLocation()); PrivScope.addPrivate(VD, [&CGF, &DRE]() -> Address { return CGF.EmitLValue(&DRE).getAddress(); }); } (void)PrivScope.Privatize(); } /// \brief Lookup the captured field decl for a variable. const FieldDecl *lookup(const VarDecl *VD) const override { if (auto *FD = CGOpenMPInlinedRegionInfo::lookup(VD)) return FD; return nullptr; } /// \brief Emit the captured statement body. void EmitBody(CodeGenFunction &CGF, const Stmt *S) override { llvm_unreachable("No body for expressions"); } /// \brief Get a variable or parameter for storing global thread id /// inside OpenMP construct. const VarDecl *getThreadIDVariable() const override { llvm_unreachable("No thread id for expressions"); } /// \brief Get the name of the capture helper. StringRef getHelperName() const override { llvm_unreachable("No helper name for expressions"); } static bool classof(const CGCapturedStmtInfo *Info) { return false; } private: /// Private scope to capture global variables. CodeGenFunction::OMPPrivateScope PrivScope; }; /// \brief RAII for emitting code of OpenMP constructs. class InlinedOpenMPRegionRAII { CodeGenFunction &CGF; llvm::DenseMap LambdaCaptureFields; FieldDecl *LambdaThisCaptureField = nullptr; public: /// \brief Constructs region for combined constructs. /// \param CodeGen Code generation sequence for combined directives. Includes /// a list of functions used for code generation of implicitly inlined /// regions. InlinedOpenMPRegionRAII(CodeGenFunction &CGF, const RegionCodeGenTy &CodeGen, OpenMPDirectiveKind Kind, bool HasCancel) : CGF(CGF) { // Start emission for the construct. CGF.CapturedStmtInfo = new CGOpenMPInlinedRegionInfo( CGF.CapturedStmtInfo, CodeGen, Kind, HasCancel); std::swap(CGF.LambdaCaptureFields, LambdaCaptureFields); LambdaThisCaptureField = CGF.LambdaThisCaptureField; CGF.LambdaThisCaptureField = nullptr; } ~InlinedOpenMPRegionRAII() { // Restore original CapturedStmtInfo only if we're done with code emission. auto *OldCSI = cast(CGF.CapturedStmtInfo)->getOldCSI(); delete CGF.CapturedStmtInfo; CGF.CapturedStmtInfo = OldCSI; std::swap(CGF.LambdaCaptureFields, LambdaCaptureFields); CGF.LambdaThisCaptureField = LambdaThisCaptureField; } }; /// \brief Values for bit flags used in the ident_t to describe the fields. /// All enumeric elements are named and described in accordance with the code /// from http://llvm.org/svn/llvm-project/openmp/trunk/runtime/src/kmp.h enum OpenMPLocationFlags { /// \brief Use trampoline for internal microtask. OMP_IDENT_IMD = 0x01, /// \brief Use c-style ident structure. OMP_IDENT_KMPC = 0x02, /// \brief Atomic reduction option for kmpc_reduce. OMP_ATOMIC_REDUCE = 0x10, /// \brief Explicit 'barrier' directive. OMP_IDENT_BARRIER_EXPL = 0x20, /// \brief Implicit barrier in code. OMP_IDENT_BARRIER_IMPL = 0x40, /// \brief Implicit barrier in 'for' directive. OMP_IDENT_BARRIER_IMPL_FOR = 0x40, /// \brief Implicit barrier in 'sections' directive. OMP_IDENT_BARRIER_IMPL_SECTIONS = 0xC0, /// \brief Implicit barrier in 'single' directive. OMP_IDENT_BARRIER_IMPL_SINGLE = 0x140 }; /// \brief Describes ident structure that describes a source location. /// All descriptions are taken from /// http://llvm.org/svn/llvm-project/openmp/trunk/runtime/src/kmp.h /// Original structure: /// typedef struct ident { /// kmp_int32 reserved_1; /**< might be used in Fortran; /// see above */ /// kmp_int32 flags; /**< also f.flags; KMP_IDENT_xxx flags; /// KMP_IDENT_KMPC identifies this union /// member */ /// kmp_int32 reserved_2; /**< not really used in Fortran any more; /// see above */ ///#if USE_ITT_BUILD /// /* but currently used for storing /// region-specific ITT */ /// /* contextual information. */ ///#endif /* USE_ITT_BUILD */ /// kmp_int32 reserved_3; /**< source[4] in Fortran, do not use for /// C++ */ /// char const *psource; /**< String describing the source location. /// The string is composed of semi-colon separated // fields which describe the source file, /// the function and a pair of line numbers that /// delimit the construct. /// */ /// } ident_t; enum IdentFieldIndex { /// \brief might be used in Fortran IdentField_Reserved_1, /// \brief OMP_IDENT_xxx flags; OMP_IDENT_KMPC identifies this union member. IdentField_Flags, /// \brief Not really used in Fortran any more IdentField_Reserved_2, /// \brief Source[4] in Fortran, do not use for C++ IdentField_Reserved_3, /// \brief String describing the source location. The string is composed of /// semi-colon separated fields which describe the source file, the function /// and a pair of line numbers that delimit the construct. IdentField_PSource }; /// \brief Schedule types for 'omp for' loops (these enumerators are taken from /// the enum sched_type in kmp.h). enum OpenMPSchedType { /// \brief Lower bound for default (unordered) versions. OMP_sch_lower = 32, OMP_sch_static_chunked = 33, OMP_sch_static = 34, OMP_sch_dynamic_chunked = 35, OMP_sch_guided_chunked = 36, OMP_sch_runtime = 37, OMP_sch_auto = 38, /// static with chunk adjustment (e.g., simd) OMP_sch_static_balanced_chunked = 45, /// \brief Lower bound for 'ordered' versions. OMP_ord_lower = 64, OMP_ord_static_chunked = 65, OMP_ord_static = 66, OMP_ord_dynamic_chunked = 67, OMP_ord_guided_chunked = 68, OMP_ord_runtime = 69, OMP_ord_auto = 70, OMP_sch_default = OMP_sch_static, /// \brief dist_schedule types OMP_dist_sch_static_chunked = 91, OMP_dist_sch_static = 92, /// Support for OpenMP 4.5 monotonic and nonmonotonic schedule modifiers. /// Set if the monotonic schedule modifier was present. OMP_sch_modifier_monotonic = (1 << 29), /// Set if the nonmonotonic schedule modifier was present. OMP_sch_modifier_nonmonotonic = (1 << 30), }; enum OpenMPRTLFunction { /// \brief Call to void __kmpc_fork_call(ident_t *loc, kmp_int32 argc, /// kmpc_micro microtask, ...); OMPRTL__kmpc_fork_call, /// \brief Call to void *__kmpc_threadprivate_cached(ident_t *loc, /// kmp_int32 global_tid, void *data, size_t size, void ***cache); OMPRTL__kmpc_threadprivate_cached, /// \brief Call to void __kmpc_threadprivate_register( ident_t *, /// void *data, kmpc_ctor ctor, kmpc_cctor cctor, kmpc_dtor dtor); OMPRTL__kmpc_threadprivate_register, // Call to __kmpc_int32 kmpc_global_thread_num(ident_t *loc); OMPRTL__kmpc_global_thread_num, // Call to void __kmpc_critical(ident_t *loc, kmp_int32 global_tid, // kmp_critical_name *crit); OMPRTL__kmpc_critical, // Call to void __kmpc_critical_with_hint(ident_t *loc, kmp_int32 // global_tid, kmp_critical_name *crit, uintptr_t hint); OMPRTL__kmpc_critical_with_hint, // Call to void __kmpc_end_critical(ident_t *loc, kmp_int32 global_tid, // kmp_critical_name *crit); OMPRTL__kmpc_end_critical, // Call to kmp_int32 __kmpc_cancel_barrier(ident_t *loc, kmp_int32 // global_tid); OMPRTL__kmpc_cancel_barrier, // Call to void __kmpc_barrier(ident_t *loc, kmp_int32 global_tid); OMPRTL__kmpc_barrier, // Call to void __kmpc_for_static_fini(ident_t *loc, kmp_int32 global_tid); OMPRTL__kmpc_for_static_fini, // Call to void __kmpc_serialized_parallel(ident_t *loc, kmp_int32 // global_tid); OMPRTL__kmpc_serialized_parallel, // Call to void __kmpc_end_serialized_parallel(ident_t *loc, kmp_int32 // global_tid); OMPRTL__kmpc_end_serialized_parallel, // Call to void __kmpc_push_num_threads(ident_t *loc, kmp_int32 global_tid, // kmp_int32 num_threads); OMPRTL__kmpc_push_num_threads, // Call to void __kmpc_flush(ident_t *loc); OMPRTL__kmpc_flush, // Call to kmp_int32 __kmpc_master(ident_t *, kmp_int32 global_tid); OMPRTL__kmpc_master, // Call to void __kmpc_end_master(ident_t *, kmp_int32 global_tid); OMPRTL__kmpc_end_master, // Call to kmp_int32 __kmpc_omp_taskyield(ident_t *, kmp_int32 global_tid, // int end_part); OMPRTL__kmpc_omp_taskyield, // Call to kmp_int32 __kmpc_single(ident_t *, kmp_int32 global_tid); OMPRTL__kmpc_single, // Call to void __kmpc_end_single(ident_t *, kmp_int32 global_tid); OMPRTL__kmpc_end_single, // Call to kmp_task_t * __kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, // kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, // kmp_routine_entry_t *task_entry); OMPRTL__kmpc_omp_task_alloc, // Call to kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t * // new_task); OMPRTL__kmpc_omp_task, // Call to void __kmpc_copyprivate(ident_t *loc, kmp_int32 global_tid, // size_t cpy_size, void *cpy_data, void(*cpy_func)(void *, void *), // kmp_int32 didit); OMPRTL__kmpc_copyprivate, // Call to kmp_int32 __kmpc_reduce(ident_t *loc, kmp_int32 global_tid, // kmp_int32 num_vars, size_t reduce_size, void *reduce_data, void // (*reduce_func)(void *lhs_data, void *rhs_data), kmp_critical_name *lck); OMPRTL__kmpc_reduce, // Call to kmp_int32 __kmpc_reduce_nowait(ident_t *loc, kmp_int32 // global_tid, kmp_int32 num_vars, size_t reduce_size, void *reduce_data, // void (*reduce_func)(void *lhs_data, void *rhs_data), kmp_critical_name // *lck); OMPRTL__kmpc_reduce_nowait, // Call to void __kmpc_end_reduce(ident_t *loc, kmp_int32 global_tid, // kmp_critical_name *lck); OMPRTL__kmpc_end_reduce, // Call to void __kmpc_end_reduce_nowait(ident_t *loc, kmp_int32 global_tid, // kmp_critical_name *lck); OMPRTL__kmpc_end_reduce_nowait, // Call to void __kmpc_omp_task_begin_if0(ident_t *, kmp_int32 gtid, // kmp_task_t * new_task); OMPRTL__kmpc_omp_task_begin_if0, // Call to void __kmpc_omp_task_complete_if0(ident_t *, kmp_int32 gtid, // kmp_task_t * new_task); OMPRTL__kmpc_omp_task_complete_if0, // Call to void __kmpc_ordered(ident_t *loc, kmp_int32 global_tid); OMPRTL__kmpc_ordered, // Call to void __kmpc_end_ordered(ident_t *loc, kmp_int32 global_tid); OMPRTL__kmpc_end_ordered, // Call to kmp_int32 __kmpc_omp_taskwait(ident_t *loc, kmp_int32 // global_tid); OMPRTL__kmpc_omp_taskwait, // Call to void __kmpc_taskgroup(ident_t *loc, kmp_int32 global_tid); OMPRTL__kmpc_taskgroup, // Call to void __kmpc_end_taskgroup(ident_t *loc, kmp_int32 global_tid); OMPRTL__kmpc_end_taskgroup, // Call to void __kmpc_push_proc_bind(ident_t *loc, kmp_int32 global_tid, // int proc_bind); OMPRTL__kmpc_push_proc_bind, // Call to kmp_int32 __kmpc_omp_task_with_deps(ident_t *loc_ref, kmp_int32 // gtid, kmp_task_t * new_task, kmp_int32 ndeps, kmp_depend_info_t // *dep_list, kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list); OMPRTL__kmpc_omp_task_with_deps, // Call to void __kmpc_omp_wait_deps(ident_t *loc_ref, kmp_int32 // gtid, kmp_int32 ndeps, kmp_depend_info_t *dep_list, kmp_int32 // ndeps_noalias, kmp_depend_info_t *noalias_dep_list); OMPRTL__kmpc_omp_wait_deps, // Call to kmp_int32 __kmpc_cancellationpoint(ident_t *loc, kmp_int32 // global_tid, kmp_int32 cncl_kind); OMPRTL__kmpc_cancellationpoint, // Call to kmp_int32 __kmpc_cancel(ident_t *loc, kmp_int32 global_tid, // kmp_int32 cncl_kind); OMPRTL__kmpc_cancel, // Call to void __kmpc_push_num_teams(ident_t *loc, kmp_int32 global_tid, // kmp_int32 num_teams, kmp_int32 thread_limit); OMPRTL__kmpc_push_num_teams, // Call to void __kmpc_fork_teams(ident_t *loc, kmp_int32 argc, kmpc_micro // microtask, ...); OMPRTL__kmpc_fork_teams, // Call to void __kmpc_taskloop(ident_t *loc, int gtid, kmp_task_t *task, int // if_val, kmp_uint64 *lb, kmp_uint64 *ub, kmp_int64 st, int nogroup, int // sched, kmp_uint64 grainsize, void *task_dup); OMPRTL__kmpc_taskloop, // Call to void __kmpc_doacross_init(ident_t *loc, kmp_int32 gtid, kmp_int32 // num_dims, struct kmp_dim *dims); OMPRTL__kmpc_doacross_init, // Call to void __kmpc_doacross_fini(ident_t *loc, kmp_int32 gtid); OMPRTL__kmpc_doacross_fini, // Call to void __kmpc_doacross_post(ident_t *loc, kmp_int32 gtid, kmp_int64 // *vec); OMPRTL__kmpc_doacross_post, // Call to void __kmpc_doacross_wait(ident_t *loc, kmp_int32 gtid, kmp_int64 // *vec); OMPRTL__kmpc_doacross_wait, // // Offloading related calls // // Call to int32_t __tgt_target(int32_t device_id, void *host_ptr, int32_t // arg_num, void** args_base, void **args, size_t *arg_sizes, int32_t // *arg_types); OMPRTL__tgt_target, // Call to int32_t __tgt_target_teams(int32_t device_id, void *host_ptr, // int32_t arg_num, void** args_base, void **args, size_t *arg_sizes, // int32_t *arg_types, int32_t num_teams, int32_t thread_limit); OMPRTL__tgt_target_teams, // Call to void __tgt_register_lib(__tgt_bin_desc *desc); OMPRTL__tgt_register_lib, // Call to void __tgt_unregister_lib(__tgt_bin_desc *desc); OMPRTL__tgt_unregister_lib, // Call to void __tgt_target_data_begin(int32_t device_id, int32_t arg_num, // void** args_base, void **args, size_t *arg_sizes, int32_t *arg_types); OMPRTL__tgt_target_data_begin, // Call to void __tgt_target_data_end(int32_t device_id, int32_t arg_num, // void** args_base, void **args, size_t *arg_sizes, int32_t *arg_types); OMPRTL__tgt_target_data_end, // Call to void __tgt_target_data_update(int32_t device_id, int32_t arg_num, // void** args_base, void **args, size_t *arg_sizes, int32_t *arg_types); OMPRTL__tgt_target_data_update, }; /// A basic class for pre|post-action for advanced codegen sequence for OpenMP /// region. class CleanupTy final : public EHScopeStack::Cleanup { PrePostActionTy *Action; public: explicit CleanupTy(PrePostActionTy *Action) : Action(Action) {} void Emit(CodeGenFunction &CGF, Flags /*flags*/) override { if (!CGF.HaveInsertPoint()) return; Action->Exit(CGF); } }; } // anonymous namespace void RegionCodeGenTy::operator()(CodeGenFunction &CGF) const { CodeGenFunction::RunCleanupsScope Scope(CGF); if (PrePostAction) { CGF.EHStack.pushCleanup(NormalAndEHCleanup, PrePostAction); Callback(CodeGen, CGF, *PrePostAction); } else { PrePostActionTy Action; Callback(CodeGen, CGF, Action); } } LValue CGOpenMPRegionInfo::getThreadIDVariableLValue(CodeGenFunction &CGF) { return CGF.EmitLoadOfPointerLValue( CGF.GetAddrOfLocalVar(getThreadIDVariable()), getThreadIDVariable()->getType()->castAs()); } void CGOpenMPRegionInfo::EmitBody(CodeGenFunction &CGF, const Stmt * /*S*/) { if (!CGF.HaveInsertPoint()) return; // 1.2.2 OpenMP Language Terminology // Structured block - An executable statement with a single entry at the // top and a single exit at the bottom. // The point of exit cannot be a branch out of the structured block. // longjmp() and throw() must not violate the entry/exit criteria. CGF.EHStack.pushTerminate(); CodeGen(CGF); CGF.EHStack.popTerminate(); } LValue CGOpenMPTaskOutlinedRegionInfo::getThreadIDVariableLValue( CodeGenFunction &CGF) { return CGF.MakeAddrLValue(CGF.GetAddrOfLocalVar(getThreadIDVariable()), getThreadIDVariable()->getType(), AlignmentSource::Decl); } CGOpenMPRuntime::CGOpenMPRuntime(CodeGenModule &CGM) : CGM(CGM), OffloadEntriesInfoManager(CGM) { IdentTy = llvm::StructType::create( "ident_t", CGM.Int32Ty /* reserved_1 */, CGM.Int32Ty /* flags */, CGM.Int32Ty /* reserved_2 */, CGM.Int32Ty /* reserved_3 */, CGM.Int8PtrTy /* psource */, nullptr); KmpCriticalNameTy = llvm::ArrayType::get(CGM.Int32Ty, /*NumElements*/ 8); loadOffloadInfoMetadata(); } void CGOpenMPRuntime::clear() { InternalVars.clear(); } static llvm::Function * emitCombinerOrInitializer(CodeGenModule &CGM, QualType Ty, const Expr *CombinerInitializer, const VarDecl *In, const VarDecl *Out, bool IsCombiner) { // void .omp_combiner.(Ty *in, Ty *out); auto &C = CGM.getContext(); QualType PtrTy = C.getPointerType(Ty).withRestrict(); FunctionArgList Args; ImplicitParamDecl OmpOutParm(C, /*DC=*/nullptr, Out->getLocation(), /*Id=*/nullptr, PtrTy); ImplicitParamDecl OmpInParm(C, /*DC=*/nullptr, In->getLocation(), /*Id=*/nullptr, PtrTy); Args.push_back(&OmpOutParm); Args.push_back(&OmpInParm); auto &FnInfo = CGM.getTypes().arrangeBuiltinFunctionDeclaration(C.VoidTy, Args); auto *FnTy = CGM.getTypes().GetFunctionType(FnInfo); auto *Fn = llvm::Function::Create( FnTy, llvm::GlobalValue::InternalLinkage, IsCombiner ? ".omp_combiner." : ".omp_initializer.", &CGM.getModule()); CGM.SetInternalFunctionAttributes(/*D=*/nullptr, Fn, FnInfo); Fn->removeFnAttr(llvm::Attribute::NoInline); Fn->addFnAttr(llvm::Attribute::AlwaysInline); CodeGenFunction CGF(CGM); // Map "T omp_in;" variable to "*omp_in_parm" value in all expressions. // Map "T omp_out;" variable to "*omp_out_parm" value in all expressions. CGF.StartFunction(GlobalDecl(), C.VoidTy, Fn, FnInfo, Args); CodeGenFunction::OMPPrivateScope Scope(CGF); Address AddrIn = CGF.GetAddrOfLocalVar(&OmpInParm); Scope.addPrivate(In, [&CGF, AddrIn, PtrTy]() -> Address { return CGF.EmitLoadOfPointerLValue(AddrIn, PtrTy->castAs()) .getAddress(); }); Address AddrOut = CGF.GetAddrOfLocalVar(&OmpOutParm); Scope.addPrivate(Out, [&CGF, AddrOut, PtrTy]() -> Address { return CGF.EmitLoadOfPointerLValue(AddrOut, PtrTy->castAs()) .getAddress(); }); (void)Scope.Privatize(); CGF.EmitIgnoredExpr(CombinerInitializer); Scope.ForceCleanup(); CGF.FinishFunction(); return Fn; } void CGOpenMPRuntime::emitUserDefinedReduction( CodeGenFunction *CGF, const OMPDeclareReductionDecl *D) { if (UDRMap.count(D) > 0) return; auto &C = CGM.getContext(); if (!In || !Out) { In = &C.Idents.get("omp_in"); Out = &C.Idents.get("omp_out"); } llvm::Function *Combiner = emitCombinerOrInitializer( CGM, D->getType(), D->getCombiner(), cast(D->lookup(In).front()), cast(D->lookup(Out).front()), /*IsCombiner=*/true); llvm::Function *Initializer = nullptr; if (auto *Init = D->getInitializer()) { if (!Priv || !Orig) { Priv = &C.Idents.get("omp_priv"); Orig = &C.Idents.get("omp_orig"); } Initializer = emitCombinerOrInitializer( CGM, D->getType(), Init, cast(D->lookup(Orig).front()), cast(D->lookup(Priv).front()), /*IsCombiner=*/false); } UDRMap.insert(std::make_pair(D, std::make_pair(Combiner, Initializer))); if (CGF) { auto &Decls = FunctionUDRMap.FindAndConstruct(CGF->CurFn); Decls.second.push_back(D); } } std::pair CGOpenMPRuntime::getUserDefinedReduction(const OMPDeclareReductionDecl *D) { auto I = UDRMap.find(D); if (I != UDRMap.end()) return I->second; emitUserDefinedReduction(/*CGF=*/nullptr, D); return UDRMap.lookup(D); } // Layout information for ident_t. static CharUnits getIdentAlign(CodeGenModule &CGM) { return CGM.getPointerAlign(); } static CharUnits getIdentSize(CodeGenModule &CGM) { assert((4 * CGM.getPointerSize()).isMultipleOf(CGM.getPointerAlign())); return CharUnits::fromQuantity(16) + CGM.getPointerSize(); } static CharUnits getOffsetOfIdentField(IdentFieldIndex Field) { // All the fields except the last are i32, so this works beautifully. return unsigned(Field) * CharUnits::fromQuantity(4); } static Address createIdentFieldGEP(CodeGenFunction &CGF, Address Addr, IdentFieldIndex Field, const llvm::Twine &Name = "") { auto Offset = getOffsetOfIdentField(Field); return CGF.Builder.CreateStructGEP(Addr, Field, Offset, Name); } llvm::Value *CGOpenMPRuntime::emitParallelOrTeamsOutlinedFunction( const OMPExecutableDirective &D, const VarDecl *ThreadIDVar, OpenMPDirectiveKind InnermostKind, const RegionCodeGenTy &CodeGen) { assert(ThreadIDVar->getType()->isPointerType() && "thread id variable must be of type kmp_int32 *"); const CapturedStmt *CS = cast(D.getAssociatedStmt()); CodeGenFunction CGF(CGM, true); bool HasCancel = false; if (auto *OPD = dyn_cast(&D)) HasCancel = OPD->hasCancel(); else if (auto *OPSD = dyn_cast(&D)) HasCancel = OPSD->hasCancel(); else if (auto *OPFD = dyn_cast(&D)) HasCancel = OPFD->hasCancel(); CGOpenMPOutlinedRegionInfo CGInfo(*CS, ThreadIDVar, CodeGen, InnermostKind, HasCancel, getOutlinedHelperName()); CodeGenFunction::CGCapturedStmtRAII CapInfoRAII(CGF, &CGInfo); return CGF.GenerateOpenMPCapturedStmtFunction(*CS); } llvm::Value *CGOpenMPRuntime::emitTaskOutlinedFunction( const OMPExecutableDirective &D, const VarDecl *ThreadIDVar, const VarDecl *PartIDVar, const VarDecl *TaskTVar, OpenMPDirectiveKind InnermostKind, const RegionCodeGenTy &CodeGen, bool Tied, unsigned &NumberOfParts) { auto &&UntiedCodeGen = [this, &D, TaskTVar](CodeGenFunction &CGF, PrePostActionTy &) { auto *ThreadID = getThreadID(CGF, D.getLocStart()); auto *UpLoc = emitUpdateLocation(CGF, D.getLocStart()); llvm::Value *TaskArgs[] = { UpLoc, ThreadID, CGF.EmitLoadOfPointerLValue(CGF.GetAddrOfLocalVar(TaskTVar), TaskTVar->getType()->castAs()) .getPointer()}; CGF.EmitRuntimeCall(createRuntimeFunction(OMPRTL__kmpc_omp_task), TaskArgs); }; CGOpenMPTaskOutlinedRegionInfo::UntiedTaskActionTy Action(Tied, PartIDVar, UntiedCodeGen); CodeGen.setAction(Action); assert(!ThreadIDVar->getType()->isPointerType() && "thread id variable must be of type kmp_int32 for tasks"); auto *CS = cast(D.getAssociatedStmt()); auto *TD = dyn_cast(&D); CodeGenFunction CGF(CGM, true); CGOpenMPTaskOutlinedRegionInfo CGInfo(*CS, ThreadIDVar, CodeGen, InnermostKind, TD ? TD->hasCancel() : false, Action); CodeGenFunction::CGCapturedStmtRAII CapInfoRAII(CGF, &CGInfo); auto *Res = CGF.GenerateCapturedStmtFunction(*CS); if (!Tied) NumberOfParts = Action.getNumberOfParts(); return Res; } Address CGOpenMPRuntime::getOrCreateDefaultLocation(unsigned Flags) { CharUnits Align = getIdentAlign(CGM); llvm::Value *Entry = OpenMPDefaultLocMap.lookup(Flags); if (!Entry) { if (!DefaultOpenMPPSource) { // Initialize default location for psource field of ident_t structure of // all ident_t objects. Format is ";file;function;line;column;;". // Taken from // http://llvm.org/svn/llvm-project/openmp/trunk/runtime/src/kmp_str.c DefaultOpenMPPSource = CGM.GetAddrOfConstantCString(";unknown;unknown;0;0;;").getPointer(); DefaultOpenMPPSource = llvm::ConstantExpr::getBitCast(DefaultOpenMPPSource, CGM.Int8PtrTy); } ConstantInitBuilder builder(CGM); auto fields = builder.beginStruct(IdentTy); fields.addInt(CGM.Int32Ty, 0); fields.addInt(CGM.Int32Ty, Flags); fields.addInt(CGM.Int32Ty, 0); fields.addInt(CGM.Int32Ty, 0); fields.add(DefaultOpenMPPSource); auto DefaultOpenMPLocation = fields.finishAndCreateGlobal("", Align, /*isConstant*/ true, llvm::GlobalValue::PrivateLinkage); DefaultOpenMPLocation->setUnnamedAddr(llvm::GlobalValue::UnnamedAddr::Global); OpenMPDefaultLocMap[Flags] = Entry = DefaultOpenMPLocation; } return Address(Entry, Align); } llvm::Value *CGOpenMPRuntime::emitUpdateLocation(CodeGenFunction &CGF, SourceLocation Loc, unsigned Flags) { Flags |= OMP_IDENT_KMPC; // If no debug info is generated - return global default location. if (CGM.getCodeGenOpts().getDebugInfo() == codegenoptions::NoDebugInfo || Loc.isInvalid()) return getOrCreateDefaultLocation(Flags).getPointer(); assert(CGF.CurFn && "No function in current CodeGenFunction."); Address LocValue = Address::invalid(); auto I = OpenMPLocThreadIDMap.find(CGF.CurFn); if (I != OpenMPLocThreadIDMap.end()) LocValue = Address(I->second.DebugLoc, getIdentAlign(CGF.CGM)); // OpenMPLocThreadIDMap may have null DebugLoc and non-null ThreadID, if // GetOpenMPThreadID was called before this routine. if (!LocValue.isValid()) { // Generate "ident_t .kmpc_loc.addr;" Address AI = CGF.CreateTempAlloca(IdentTy, getIdentAlign(CGF.CGM), ".kmpc_loc.addr"); auto &Elem = OpenMPLocThreadIDMap.FindAndConstruct(CGF.CurFn); Elem.second.DebugLoc = AI.getPointer(); LocValue = AI; CGBuilderTy::InsertPointGuard IPG(CGF.Builder); CGF.Builder.SetInsertPoint(CGF.AllocaInsertPt); CGF.Builder.CreateMemCpy(LocValue, getOrCreateDefaultLocation(Flags), CGM.getSize(getIdentSize(CGF.CGM))); } // char **psource = &.kmpc_loc_.addr.psource; Address PSource = createIdentFieldGEP(CGF, LocValue, IdentField_PSource); auto OMPDebugLoc = OpenMPDebugLocMap.lookup(Loc.getRawEncoding()); if (OMPDebugLoc == nullptr) { SmallString<128> Buffer2; llvm::raw_svector_ostream OS2(Buffer2); // Build debug location PresumedLoc PLoc = CGF.getContext().getSourceManager().getPresumedLoc(Loc); OS2 << ";" << PLoc.getFilename() << ";"; if (const FunctionDecl *FD = dyn_cast_or_null(CGF.CurFuncDecl)) { OS2 << FD->getQualifiedNameAsString(); } OS2 << ";" << PLoc.getLine() << ";" << PLoc.getColumn() << ";;"; OMPDebugLoc = CGF.Builder.CreateGlobalStringPtr(OS2.str()); OpenMPDebugLocMap[Loc.getRawEncoding()] = OMPDebugLoc; } // *psource = ";;;;;;"; CGF.Builder.CreateStore(OMPDebugLoc, PSource); // Our callers always pass this to a runtime function, so for // convenience, go ahead and return a naked pointer. return LocValue.getPointer(); } llvm::Value *CGOpenMPRuntime::getThreadID(CodeGenFunction &CGF, SourceLocation Loc) { assert(CGF.CurFn && "No function in current CodeGenFunction."); llvm::Value *ThreadID = nullptr; // Check whether we've already cached a load of the thread id in this // function. auto I = OpenMPLocThreadIDMap.find(CGF.CurFn); if (I != OpenMPLocThreadIDMap.end()) { ThreadID = I->second.ThreadID; if (ThreadID != nullptr) return ThreadID; } if (auto *OMPRegionInfo = dyn_cast_or_null(CGF.CapturedStmtInfo)) { if (OMPRegionInfo->getThreadIDVariable()) { // Check if this an outlined function with thread id passed as argument. auto LVal = OMPRegionInfo->getThreadIDVariableLValue(CGF); ThreadID = CGF.EmitLoadOfLValue(LVal, Loc).getScalarVal(); // If value loaded in entry block, cache it and use it everywhere in // function. if (CGF.Builder.GetInsertBlock() == CGF.AllocaInsertPt->getParent()) { auto &Elem = OpenMPLocThreadIDMap.FindAndConstruct(CGF.CurFn); Elem.second.ThreadID = ThreadID; } return ThreadID; } } // This is not an outlined function region - need to call __kmpc_int32 // kmpc_global_thread_num(ident_t *loc). // Generate thread id value and cache this value for use across the // function. CGBuilderTy::InsertPointGuard IPG(CGF.Builder); CGF.Builder.SetInsertPoint(CGF.AllocaInsertPt); ThreadID = CGF.EmitRuntimeCall(createRuntimeFunction(OMPRTL__kmpc_global_thread_num), emitUpdateLocation(CGF, Loc)); auto &Elem = OpenMPLocThreadIDMap.FindAndConstruct(CGF.CurFn); Elem.second.ThreadID = ThreadID; return ThreadID; } void CGOpenMPRuntime::functionFinished(CodeGenFunction &CGF) { assert(CGF.CurFn && "No function in current CodeGenFunction."); if (OpenMPLocThreadIDMap.count(CGF.CurFn)) OpenMPLocThreadIDMap.erase(CGF.CurFn); if (FunctionUDRMap.count(CGF.CurFn) > 0) { for(auto *D : FunctionUDRMap[CGF.CurFn]) { UDRMap.erase(D); } FunctionUDRMap.erase(CGF.CurFn); } } llvm::Type *CGOpenMPRuntime::getIdentTyPointerTy() { if (!IdentTy) { } return llvm::PointerType::getUnqual(IdentTy); } llvm::Type *CGOpenMPRuntime::getKmpc_MicroPointerTy() { if (!Kmpc_MicroTy) { // Build void (*kmpc_micro)(kmp_int32 *global_tid, kmp_int32 *bound_tid,...) llvm::Type *MicroParams[] = {llvm::PointerType::getUnqual(CGM.Int32Ty), llvm::PointerType::getUnqual(CGM.Int32Ty)}; Kmpc_MicroTy = llvm::FunctionType::get(CGM.VoidTy, MicroParams, true); } return llvm::PointerType::getUnqual(Kmpc_MicroTy); } llvm::Constant * CGOpenMPRuntime::createRuntimeFunction(unsigned Function) { llvm::Constant *RTLFn = nullptr; switch (static_cast(Function)) { case OMPRTL__kmpc_fork_call: { // Build void __kmpc_fork_call(ident_t *loc, kmp_int32 argc, kmpc_micro // microtask, ...); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, getKmpc_MicroPointerTy()}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ true); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_fork_call"); break; } case OMPRTL__kmpc_global_thread_num: { // Build kmp_int32 __kmpc_global_thread_num(ident_t *loc); llvm::Type *TypeParams[] = {getIdentTyPointerTy()}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_global_thread_num"); break; } case OMPRTL__kmpc_threadprivate_cached: { // Build void *__kmpc_threadprivate_cached(ident_t *loc, // kmp_int32 global_tid, void *data, size_t size, void ***cache); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.VoidPtrTy, CGM.SizeTy, CGM.VoidPtrTy->getPointerTo()->getPointerTo()}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidPtrTy, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_threadprivate_cached"); break; } case OMPRTL__kmpc_critical: { // Build void __kmpc_critical(ident_t *loc, kmp_int32 global_tid, // kmp_critical_name *crit); llvm::Type *TypeParams[] = { getIdentTyPointerTy(), CGM.Int32Ty, llvm::PointerType::getUnqual(KmpCriticalNameTy)}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_critical"); break; } case OMPRTL__kmpc_critical_with_hint: { // Build void __kmpc_critical_with_hint(ident_t *loc, kmp_int32 global_tid, // kmp_critical_name *crit, uintptr_t hint); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, llvm::PointerType::getUnqual(KmpCriticalNameTy), CGM.IntPtrTy}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_critical_with_hint"); break; } case OMPRTL__kmpc_threadprivate_register: { // Build void __kmpc_threadprivate_register(ident_t *, void *data, // kmpc_ctor ctor, kmpc_cctor cctor, kmpc_dtor dtor); // typedef void *(*kmpc_ctor)(void *); auto KmpcCtorTy = llvm::FunctionType::get(CGM.VoidPtrTy, CGM.VoidPtrTy, /*isVarArg*/ false)->getPointerTo(); // typedef void *(*kmpc_cctor)(void *, void *); llvm::Type *KmpcCopyCtorTyArgs[] = {CGM.VoidPtrTy, CGM.VoidPtrTy}; auto KmpcCopyCtorTy = llvm::FunctionType::get(CGM.VoidPtrTy, KmpcCopyCtorTyArgs, /*isVarArg*/ false)->getPointerTo(); // typedef void (*kmpc_dtor)(void *); auto KmpcDtorTy = llvm::FunctionType::get(CGM.VoidTy, CGM.VoidPtrTy, /*isVarArg*/ false) ->getPointerTo(); llvm::Type *FnTyArgs[] = {getIdentTyPointerTy(), CGM.VoidPtrTy, KmpcCtorTy, KmpcCopyCtorTy, KmpcDtorTy}; auto FnTy = llvm::FunctionType::get(CGM.VoidTy, FnTyArgs, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_threadprivate_register"); break; } case OMPRTL__kmpc_end_critical: { // Build void __kmpc_end_critical(ident_t *loc, kmp_int32 global_tid, // kmp_critical_name *crit); llvm::Type *TypeParams[] = { getIdentTyPointerTy(), CGM.Int32Ty, llvm::PointerType::getUnqual(KmpCriticalNameTy)}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_end_critical"); break; } case OMPRTL__kmpc_cancel_barrier: { // Build kmp_int32 __kmpc_cancel_barrier(ident_t *loc, kmp_int32 // global_tid); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name*/ "__kmpc_cancel_barrier"); break; } case OMPRTL__kmpc_barrier: { // Build void __kmpc_barrier(ident_t *loc, kmp_int32 global_tid); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name*/ "__kmpc_barrier"); break; } case OMPRTL__kmpc_for_static_fini: { // Build void __kmpc_for_static_fini(ident_t *loc, kmp_int32 global_tid); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_for_static_fini"); break; } case OMPRTL__kmpc_push_num_threads: { // Build void __kmpc_push_num_threads(ident_t *loc, kmp_int32 global_tid, // kmp_int32 num_threads) llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_push_num_threads"); break; } case OMPRTL__kmpc_serialized_parallel: { // Build void __kmpc_serialized_parallel(ident_t *loc, kmp_int32 // global_tid); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_serialized_parallel"); break; } case OMPRTL__kmpc_end_serialized_parallel: { // Build void __kmpc_end_serialized_parallel(ident_t *loc, kmp_int32 // global_tid); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_end_serialized_parallel"); break; } case OMPRTL__kmpc_flush: { // Build void __kmpc_flush(ident_t *loc); llvm::Type *TypeParams[] = {getIdentTyPointerTy()}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_flush"); break; } case OMPRTL__kmpc_master: { // Build kmp_int32 __kmpc_master(ident_t *loc, kmp_int32 global_tid); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_master"); break; } case OMPRTL__kmpc_end_master: { // Build void __kmpc_end_master(ident_t *loc, kmp_int32 global_tid); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_end_master"); break; } case OMPRTL__kmpc_omp_taskyield: { // Build kmp_int32 __kmpc_omp_taskyield(ident_t *, kmp_int32 global_tid, // int end_part); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.IntTy}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_omp_taskyield"); break; } case OMPRTL__kmpc_single: { // Build kmp_int32 __kmpc_single(ident_t *loc, kmp_int32 global_tid); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_single"); break; } case OMPRTL__kmpc_end_single: { // Build void __kmpc_end_single(ident_t *loc, kmp_int32 global_tid); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_end_single"); break; } case OMPRTL__kmpc_omp_task_alloc: { // Build kmp_task_t *__kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, // kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, // kmp_routine_entry_t *task_entry); assert(KmpRoutineEntryPtrTy != nullptr && "Type kmp_routine_entry_t must be created."); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.Int32Ty, CGM.SizeTy, CGM.SizeTy, KmpRoutineEntryPtrTy}; // Return void * and then cast to particular kmp_task_t type. llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidPtrTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_omp_task_alloc"); break; } case OMPRTL__kmpc_omp_task: { // Build kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t // *new_task); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.VoidPtrTy}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_omp_task"); break; } case OMPRTL__kmpc_copyprivate: { // Build void __kmpc_copyprivate(ident_t *loc, kmp_int32 global_tid, // size_t cpy_size, void *cpy_data, void(*cpy_func)(void *, void *), // kmp_int32 didit); llvm::Type *CpyTypeParams[] = {CGM.VoidPtrTy, CGM.VoidPtrTy}; auto *CpyFnTy = llvm::FunctionType::get(CGM.VoidTy, CpyTypeParams, /*isVarArg=*/false); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.SizeTy, CGM.VoidPtrTy, CpyFnTy->getPointerTo(), CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_copyprivate"); break; } case OMPRTL__kmpc_reduce: { // Build kmp_int32 __kmpc_reduce(ident_t *loc, kmp_int32 global_tid, // kmp_int32 num_vars, size_t reduce_size, void *reduce_data, void // (*reduce_func)(void *lhs_data, void *rhs_data), kmp_critical_name *lck); llvm::Type *ReduceTypeParams[] = {CGM.VoidPtrTy, CGM.VoidPtrTy}; auto *ReduceFnTy = llvm::FunctionType::get(CGM.VoidTy, ReduceTypeParams, /*isVarArg=*/false); llvm::Type *TypeParams[] = { getIdentTyPointerTy(), CGM.Int32Ty, CGM.Int32Ty, CGM.SizeTy, CGM.VoidPtrTy, ReduceFnTy->getPointerTo(), llvm::PointerType::getUnqual(KmpCriticalNameTy)}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_reduce"); break; } case OMPRTL__kmpc_reduce_nowait: { // Build kmp_int32 __kmpc_reduce_nowait(ident_t *loc, kmp_int32 // global_tid, kmp_int32 num_vars, size_t reduce_size, void *reduce_data, // void (*reduce_func)(void *lhs_data, void *rhs_data), kmp_critical_name // *lck); llvm::Type *ReduceTypeParams[] = {CGM.VoidPtrTy, CGM.VoidPtrTy}; auto *ReduceFnTy = llvm::FunctionType::get(CGM.VoidTy, ReduceTypeParams, /*isVarArg=*/false); llvm::Type *TypeParams[] = { getIdentTyPointerTy(), CGM.Int32Ty, CGM.Int32Ty, CGM.SizeTy, CGM.VoidPtrTy, ReduceFnTy->getPointerTo(), llvm::PointerType::getUnqual(KmpCriticalNameTy)}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_reduce_nowait"); break; } case OMPRTL__kmpc_end_reduce: { // Build void __kmpc_end_reduce(ident_t *loc, kmp_int32 global_tid, // kmp_critical_name *lck); llvm::Type *TypeParams[] = { getIdentTyPointerTy(), CGM.Int32Ty, llvm::PointerType::getUnqual(KmpCriticalNameTy)}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_end_reduce"); break; } case OMPRTL__kmpc_end_reduce_nowait: { // Build __kmpc_end_reduce_nowait(ident_t *loc, kmp_int32 global_tid, // kmp_critical_name *lck); llvm::Type *TypeParams[] = { getIdentTyPointerTy(), CGM.Int32Ty, llvm::PointerType::getUnqual(KmpCriticalNameTy)}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_end_reduce_nowait"); break; } case OMPRTL__kmpc_omp_task_begin_if0: { // Build void __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t // *new_task); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.VoidPtrTy}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_omp_task_begin_if0"); break; } case OMPRTL__kmpc_omp_task_complete_if0: { // Build void __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t // *new_task); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.VoidPtrTy}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_omp_task_complete_if0"); break; } case OMPRTL__kmpc_ordered: { // Build void __kmpc_ordered(ident_t *loc, kmp_int32 global_tid); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_ordered"); break; } case OMPRTL__kmpc_end_ordered: { // Build void __kmpc_end_ordered(ident_t *loc, kmp_int32 global_tid); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_end_ordered"); break; } case OMPRTL__kmpc_omp_taskwait: { // Build kmp_int32 __kmpc_omp_taskwait(ident_t *loc, kmp_int32 global_tid); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_omp_taskwait"); break; } case OMPRTL__kmpc_taskgroup: { // Build void __kmpc_taskgroup(ident_t *loc, kmp_int32 global_tid); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_taskgroup"); break; } case OMPRTL__kmpc_end_taskgroup: { // Build void __kmpc_end_taskgroup(ident_t *loc, kmp_int32 global_tid); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_end_taskgroup"); break; } case OMPRTL__kmpc_push_proc_bind: { // Build void __kmpc_push_proc_bind(ident_t *loc, kmp_int32 global_tid, // int proc_bind) llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.IntTy}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_push_proc_bind"); break; } case OMPRTL__kmpc_omp_task_with_deps: { // Build kmp_int32 __kmpc_omp_task_with_deps(ident_t *, kmp_int32 gtid, // kmp_task_t *new_task, kmp_int32 ndeps, kmp_depend_info_t *dep_list, // kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list); llvm::Type *TypeParams[] = { getIdentTyPointerTy(), CGM.Int32Ty, CGM.VoidPtrTy, CGM.Int32Ty, CGM.VoidPtrTy, CGM.Int32Ty, CGM.VoidPtrTy}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_omp_task_with_deps"); break; } case OMPRTL__kmpc_omp_wait_deps: { // Build void __kmpc_omp_wait_deps(ident_t *, kmp_int32 gtid, // kmp_int32 ndeps, kmp_depend_info_t *dep_list, kmp_int32 ndeps_noalias, // kmp_depend_info_t *noalias_dep_list); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.Int32Ty, CGM.VoidPtrTy, CGM.Int32Ty, CGM.VoidPtrTy}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_omp_wait_deps"); break; } case OMPRTL__kmpc_cancellationpoint: { // Build kmp_int32 __kmpc_cancellationpoint(ident_t *loc, kmp_int32 // global_tid, kmp_int32 cncl_kind) llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.IntTy}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_cancellationpoint"); break; } case OMPRTL__kmpc_cancel: { // Build kmp_int32 __kmpc_cancel(ident_t *loc, kmp_int32 global_tid, // kmp_int32 cncl_kind) llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.IntTy}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_cancel"); break; } case OMPRTL__kmpc_push_num_teams: { // Build void kmpc_push_num_teams (ident_t loc, kmp_int32 global_tid, // kmp_int32 num_teams, kmp_int32 num_threads) llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.Int32Ty, CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_push_num_teams"); break; } case OMPRTL__kmpc_fork_teams: { // Build void __kmpc_fork_teams(ident_t *loc, kmp_int32 argc, kmpc_micro // microtask, ...); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, getKmpc_MicroPointerTy()}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ true); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_fork_teams"); break; } case OMPRTL__kmpc_taskloop: { // Build void __kmpc_taskloop(ident_t *loc, int gtid, kmp_task_t *task, int // if_val, kmp_uint64 *lb, kmp_uint64 *ub, kmp_int64 st, int nogroup, int // sched, kmp_uint64 grainsize, void *task_dup); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.IntTy, CGM.VoidPtrTy, CGM.IntTy, CGM.Int64Ty->getPointerTo(), CGM.Int64Ty->getPointerTo(), CGM.Int64Ty, CGM.IntTy, CGM.IntTy, CGM.Int64Ty, CGM.VoidPtrTy}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_taskloop"); break; } case OMPRTL__kmpc_doacross_init: { // Build void __kmpc_doacross_init(ident_t *loc, kmp_int32 gtid, kmp_int32 // num_dims, struct kmp_dim *dims); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.Int32Ty, CGM.VoidPtrTy}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_doacross_init"); break; } case OMPRTL__kmpc_doacross_fini: { // Build void __kmpc_doacross_fini(ident_t *loc, kmp_int32 gtid); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_doacross_fini"); break; } case OMPRTL__kmpc_doacross_post: { // Build void __kmpc_doacross_post(ident_t *loc, kmp_int32 gtid, kmp_int64 // *vec); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.Int64Ty->getPointerTo()}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_doacross_post"); break; } case OMPRTL__kmpc_doacross_wait: { // Build void __kmpc_doacross_wait(ident_t *loc, kmp_int32 gtid, kmp_int64 // *vec); llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.Int64Ty->getPointerTo()}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_doacross_wait"); break; } case OMPRTL__tgt_target: { // Build int32_t __tgt_target(int32_t device_id, void *host_ptr, int32_t // arg_num, void** args_base, void **args, size_t *arg_sizes, int32_t // *arg_types); llvm::Type *TypeParams[] = {CGM.Int32Ty, CGM.VoidPtrTy, CGM.Int32Ty, CGM.VoidPtrPtrTy, CGM.VoidPtrPtrTy, CGM.SizeTy->getPointerTo(), CGM.Int32Ty->getPointerTo()}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__tgt_target"); break; } case OMPRTL__tgt_target_teams: { // Build int32_t __tgt_target_teams(int32_t device_id, void *host_ptr, // int32_t arg_num, void** args_base, void **args, size_t *arg_sizes, // int32_t *arg_types, int32_t num_teams, int32_t thread_limit); llvm::Type *TypeParams[] = {CGM.Int32Ty, CGM.VoidPtrTy, CGM.Int32Ty, CGM.VoidPtrPtrTy, CGM.VoidPtrPtrTy, CGM.SizeTy->getPointerTo(), CGM.Int32Ty->getPointerTo(), CGM.Int32Ty, CGM.Int32Ty}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__tgt_target_teams"); break; } case OMPRTL__tgt_register_lib: { // Build void __tgt_register_lib(__tgt_bin_desc *desc); QualType ParamTy = CGM.getContext().getPointerType(getTgtBinaryDescriptorQTy()); llvm::Type *TypeParams[] = {CGM.getTypes().ConvertTypeForMem(ParamTy)}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__tgt_register_lib"); break; } case OMPRTL__tgt_unregister_lib: { // Build void __tgt_unregister_lib(__tgt_bin_desc *desc); QualType ParamTy = CGM.getContext().getPointerType(getTgtBinaryDescriptorQTy()); llvm::Type *TypeParams[] = {CGM.getTypes().ConvertTypeForMem(ParamTy)}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__tgt_unregister_lib"); break; } case OMPRTL__tgt_target_data_begin: { // Build void __tgt_target_data_begin(int32_t device_id, int32_t arg_num, // void** args_base, void **args, size_t *arg_sizes, int32_t *arg_types); llvm::Type *TypeParams[] = {CGM.Int32Ty, CGM.Int32Ty, CGM.VoidPtrPtrTy, CGM.VoidPtrPtrTy, CGM.SizeTy->getPointerTo(), CGM.Int32Ty->getPointerTo()}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__tgt_target_data_begin"); break; } case OMPRTL__tgt_target_data_end: { // Build void __tgt_target_data_end(int32_t device_id, int32_t arg_num, // void** args_base, void **args, size_t *arg_sizes, int32_t *arg_types); llvm::Type *TypeParams[] = {CGM.Int32Ty, CGM.Int32Ty, CGM.VoidPtrPtrTy, CGM.VoidPtrPtrTy, CGM.SizeTy->getPointerTo(), CGM.Int32Ty->getPointerTo()}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__tgt_target_data_end"); break; } case OMPRTL__tgt_target_data_update: { // Build void __tgt_target_data_update(int32_t device_id, int32_t arg_num, // void** args_base, void **args, size_t *arg_sizes, int32_t *arg_types); llvm::Type *TypeParams[] = {CGM.Int32Ty, CGM.Int32Ty, CGM.VoidPtrPtrTy, CGM.VoidPtrPtrTy, CGM.SizeTy->getPointerTo(), CGM.Int32Ty->getPointerTo()}; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false); RTLFn = CGM.CreateRuntimeFunction(FnTy, "__tgt_target_data_update"); break; } } assert(RTLFn && "Unable to find OpenMP runtime function"); return RTLFn; } llvm::Constant *CGOpenMPRuntime::createForStaticInitFunction(unsigned IVSize, bool IVSigned) { assert((IVSize == 32 || IVSize == 64) && "IV size is not compatible with the omp runtime"); auto Name = IVSize == 32 ? (IVSigned ? "__kmpc_for_static_init_4" : "__kmpc_for_static_init_4u") : (IVSigned ? "__kmpc_for_static_init_8" : "__kmpc_for_static_init_8u"); auto ITy = IVSize == 32 ? CGM.Int32Ty : CGM.Int64Ty; auto PtrTy = llvm::PointerType::getUnqual(ITy); llvm::Type *TypeParams[] = { getIdentTyPointerTy(), // loc CGM.Int32Ty, // tid CGM.Int32Ty, // schedtype llvm::PointerType::getUnqual(CGM.Int32Ty), // p_lastiter PtrTy, // p_lower PtrTy, // p_upper PtrTy, // p_stride ITy, // incr ITy // chunk }; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false); return CGM.CreateRuntimeFunction(FnTy, Name); } llvm::Constant *CGOpenMPRuntime::createDispatchInitFunction(unsigned IVSize, bool IVSigned) { assert((IVSize == 32 || IVSize == 64) && "IV size is not compatible with the omp runtime"); auto Name = IVSize == 32 ? (IVSigned ? "__kmpc_dispatch_init_4" : "__kmpc_dispatch_init_4u") : (IVSigned ? "__kmpc_dispatch_init_8" : "__kmpc_dispatch_init_8u"); auto ITy = IVSize == 32 ? CGM.Int32Ty : CGM.Int64Ty; llvm::Type *TypeParams[] = { getIdentTyPointerTy(), // loc CGM.Int32Ty, // tid CGM.Int32Ty, // schedtype ITy, // lower ITy, // upper ITy, // stride ITy // chunk }; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false); return CGM.CreateRuntimeFunction(FnTy, Name); } llvm::Constant *CGOpenMPRuntime::createDispatchFiniFunction(unsigned IVSize, bool IVSigned) { assert((IVSize == 32 || IVSize == 64) && "IV size is not compatible with the omp runtime"); auto Name = IVSize == 32 ? (IVSigned ? "__kmpc_dispatch_fini_4" : "__kmpc_dispatch_fini_4u") : (IVSigned ? "__kmpc_dispatch_fini_8" : "__kmpc_dispatch_fini_8u"); llvm::Type *TypeParams[] = { getIdentTyPointerTy(), // loc CGM.Int32Ty, // tid }; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg=*/false); return CGM.CreateRuntimeFunction(FnTy, Name); } llvm::Constant *CGOpenMPRuntime::createDispatchNextFunction(unsigned IVSize, bool IVSigned) { assert((IVSize == 32 || IVSize == 64) && "IV size is not compatible with the omp runtime"); auto Name = IVSize == 32 ? (IVSigned ? "__kmpc_dispatch_next_4" : "__kmpc_dispatch_next_4u") : (IVSigned ? "__kmpc_dispatch_next_8" : "__kmpc_dispatch_next_8u"); auto ITy = IVSize == 32 ? CGM.Int32Ty : CGM.Int64Ty; auto PtrTy = llvm::PointerType::getUnqual(ITy); llvm::Type *TypeParams[] = { getIdentTyPointerTy(), // loc CGM.Int32Ty, // tid llvm::PointerType::getUnqual(CGM.Int32Ty), // p_lastiter PtrTy, // p_lower PtrTy, // p_upper PtrTy // p_stride }; llvm::FunctionType *FnTy = llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg*/ false); return CGM.CreateRuntimeFunction(FnTy, Name); } llvm::Constant * CGOpenMPRuntime::getOrCreateThreadPrivateCache(const VarDecl *VD) { assert(!CGM.getLangOpts().OpenMPUseTLS || !CGM.getContext().getTargetInfo().isTLSSupported()); // Lookup the entry, lazily creating it if necessary. return getOrCreateInternalVariable(CGM.Int8PtrPtrTy, Twine(CGM.getMangledName(VD)) + ".cache."); } Address CGOpenMPRuntime::getAddrOfThreadPrivate(CodeGenFunction &CGF, const VarDecl *VD, Address VDAddr, SourceLocation Loc) { if (CGM.getLangOpts().OpenMPUseTLS && CGM.getContext().getTargetInfo().isTLSSupported()) return VDAddr; auto VarTy = VDAddr.getElementType(); llvm::Value *Args[] = {emitUpdateLocation(CGF, Loc), getThreadID(CGF, Loc), CGF.Builder.CreatePointerCast(VDAddr.getPointer(), CGM.Int8PtrTy), CGM.getSize(CGM.GetTargetTypeStoreSize(VarTy)), getOrCreateThreadPrivateCache(VD)}; return Address(CGF.EmitRuntimeCall( createRuntimeFunction(OMPRTL__kmpc_threadprivate_cached), Args), VDAddr.getAlignment()); } void CGOpenMPRuntime::emitThreadPrivateVarInit( CodeGenFunction &CGF, Address VDAddr, llvm::Value *Ctor, llvm::Value *CopyCtor, llvm::Value *Dtor, SourceLocation Loc) { // Call kmp_int32 __kmpc_global_thread_num(&loc) to init OpenMP runtime // library. auto OMPLoc = emitUpdateLocation(CGF, Loc); CGF.EmitRuntimeCall(createRuntimeFunction(OMPRTL__kmpc_global_thread_num), OMPLoc); // Call __kmpc_threadprivate_register(&loc, &var, ctor, cctor/*NULL*/, dtor) // to register constructor/destructor for variable. llvm::Value *Args[] = {OMPLoc, CGF.Builder.CreatePointerCast(VDAddr.getPointer(), CGM.VoidPtrTy), Ctor, CopyCtor, Dtor}; CGF.EmitRuntimeCall( createRuntimeFunction(OMPRTL__kmpc_threadprivate_register), Args); } llvm::Function *CGOpenMPRuntime::emitThreadPrivateVarDefinition( const VarDecl *VD, Address VDAddr, SourceLocation Loc, bool PerformInit, CodeGenFunction *CGF) { if (CGM.getLangOpts().OpenMPUseTLS && CGM.getContext().getTargetInfo().isTLSSupported()) return nullptr; VD = VD->getDefinition(CGM.getContext()); if (VD && ThreadPrivateWithDefinition.count(VD) == 0) { ThreadPrivateWithDefinition.insert(VD); QualType ASTTy = VD->getType(); llvm::Value *Ctor = nullptr, *CopyCtor = nullptr, *Dtor = nullptr; auto Init = VD->getAnyInitializer(); if (CGM.getLangOpts().CPlusPlus && PerformInit) { // Generate function that re-emits the declaration's initializer into the // threadprivate copy of the variable VD CodeGenFunction CtorCGF(CGM); FunctionArgList Args; ImplicitParamDecl Dst(CGM.getContext(), /*DC=*/nullptr, SourceLocation(), /*Id=*/nullptr, CGM.getContext().VoidPtrTy); Args.push_back(&Dst); auto &FI = CGM.getTypes().arrangeBuiltinFunctionDeclaration( CGM.getContext().VoidPtrTy, Args); auto FTy = CGM.getTypes().GetFunctionType(FI); auto Fn = CGM.CreateGlobalInitOrDestructFunction( FTy, ".__kmpc_global_ctor_.", FI, Loc); CtorCGF.StartFunction(GlobalDecl(), CGM.getContext().VoidPtrTy, Fn, FI, Args, SourceLocation()); auto ArgVal = CtorCGF.EmitLoadOfScalar( CtorCGF.GetAddrOfLocalVar(&Dst), /*Volatile=*/false, CGM.getContext().VoidPtrTy, Dst.getLocation()); Address Arg = Address(ArgVal, VDAddr.getAlignment()); Arg = CtorCGF.Builder.CreateElementBitCast(Arg, CtorCGF.ConvertTypeForMem(ASTTy)); CtorCGF.EmitAnyExprToMem(Init, Arg, Init->getType().getQualifiers(), /*IsInitializer=*/true); ArgVal = CtorCGF.EmitLoadOfScalar( CtorCGF.GetAddrOfLocalVar(&Dst), /*Volatile=*/false, CGM.getContext().VoidPtrTy, Dst.getLocation()); CtorCGF.Builder.CreateStore(ArgVal, CtorCGF.ReturnValue); CtorCGF.FinishFunction(); Ctor = Fn; } if (VD->getType().isDestructedType() != QualType::DK_none) { // Generate function that emits destructor call for the threadprivate copy // of the variable VD CodeGenFunction DtorCGF(CGM); FunctionArgList Args; ImplicitParamDecl Dst(CGM.getContext(), /*DC=*/nullptr, SourceLocation(), /*Id=*/nullptr, CGM.getContext().VoidPtrTy); Args.push_back(&Dst); auto &FI = CGM.getTypes().arrangeBuiltinFunctionDeclaration( CGM.getContext().VoidTy, Args); auto FTy = CGM.getTypes().GetFunctionType(FI); auto Fn = CGM.CreateGlobalInitOrDestructFunction( FTy, ".__kmpc_global_dtor_.", FI, Loc); auto NL = ApplyDebugLocation::CreateEmpty(DtorCGF); DtorCGF.StartFunction(GlobalDecl(), CGM.getContext().VoidTy, Fn, FI, Args, SourceLocation()); // Create a scope with an artificial location for the body of this function. auto AL = ApplyDebugLocation::CreateArtificial(DtorCGF); auto ArgVal = DtorCGF.EmitLoadOfScalar( DtorCGF.GetAddrOfLocalVar(&Dst), /*Volatile=*/false, CGM.getContext().VoidPtrTy, Dst.getLocation()); DtorCGF.emitDestroy(Address(ArgVal, VDAddr.getAlignment()), ASTTy, DtorCGF.getDestroyer(ASTTy.isDestructedType()), DtorCGF.needsEHCleanup(ASTTy.isDestructedType())); DtorCGF.FinishFunction(); Dtor = Fn; } // Do not emit init function if it is not required. if (!Ctor && !Dtor) return nullptr; llvm::Type *CopyCtorTyArgs[] = {CGM.VoidPtrTy, CGM.VoidPtrTy}; auto CopyCtorTy = llvm::FunctionType::get(CGM.VoidPtrTy, CopyCtorTyArgs, /*isVarArg=*/false)->getPointerTo(); // Copying constructor for the threadprivate variable. // Must be NULL - reserved by runtime, but currently it requires that this // parameter is always NULL. Otherwise it fires assertion. CopyCtor = llvm::Constant::getNullValue(CopyCtorTy); if (Ctor == nullptr) { auto CtorTy = llvm::FunctionType::get(CGM.VoidPtrTy, CGM.VoidPtrTy, /*isVarArg=*/false)->getPointerTo(); Ctor = llvm::Constant::getNullValue(CtorTy); } if (Dtor == nullptr) { auto DtorTy = llvm::FunctionType::get(CGM.VoidTy, CGM.VoidPtrTy, /*isVarArg=*/false)->getPointerTo(); Dtor = llvm::Constant::getNullValue(DtorTy); } if (!CGF) { auto InitFunctionTy = llvm::FunctionType::get(CGM.VoidTy, /*isVarArg*/ false); auto InitFunction = CGM.CreateGlobalInitOrDestructFunction( InitFunctionTy, ".__omp_threadprivate_init_.", CGM.getTypes().arrangeNullaryFunction()); CodeGenFunction InitCGF(CGM); FunctionArgList ArgList; InitCGF.StartFunction(GlobalDecl(), CGM.getContext().VoidTy, InitFunction, CGM.getTypes().arrangeNullaryFunction(), ArgList, Loc); emitThreadPrivateVarInit(InitCGF, VDAddr, Ctor, CopyCtor, Dtor, Loc); InitCGF.FinishFunction(); return InitFunction; } emitThreadPrivateVarInit(*CGF, VDAddr, Ctor, CopyCtor, Dtor, Loc); } return nullptr; } /// \brief Emits code for OpenMP 'if' clause using specified \a CodeGen /// function. Here is the logic: /// if (Cond) { /// ThenGen(); /// } else { /// ElseGen(); /// } void CGOpenMPRuntime::emitOMPIfClause(CodeGenFunction &CGF, const Expr *Cond, const RegionCodeGenTy &ThenGen, const RegionCodeGenTy &ElseGen) { CodeGenFunction::LexicalScope ConditionScope(CGF, Cond->getSourceRange()); // If the condition constant folds and can be elided, try to avoid emitting // the condition and the dead arm of the if/else. bool CondConstant; if (CGF.ConstantFoldsToSimpleInteger(Cond, CondConstant)) { if (CondConstant) ThenGen(CGF); else ElseGen(CGF); return; } // Otherwise, the condition did not fold, or we couldn't elide it. Just // emit the conditional branch. auto ThenBlock = CGF.createBasicBlock("omp_if.then"); auto ElseBlock = CGF.createBasicBlock("omp_if.else"); auto ContBlock = CGF.createBasicBlock("omp_if.end"); CGF.EmitBranchOnBoolExpr(Cond, ThenBlock, ElseBlock, /*TrueCount=*/0); // Emit the 'then' code. CGF.EmitBlock(ThenBlock); ThenGen(CGF); CGF.EmitBranch(ContBlock); // Emit the 'else' code if present. // There is no need to emit line number for unconditional branch. (void)ApplyDebugLocation::CreateEmpty(CGF); CGF.EmitBlock(ElseBlock); ElseGen(CGF); // There is no need to emit line number for unconditional branch. (void)ApplyDebugLocation::CreateEmpty(CGF); CGF.EmitBranch(ContBlock); // Emit the continuation block for code after the if. CGF.EmitBlock(ContBlock, /*IsFinished=*/true); } void CGOpenMPRuntime::emitParallelCall(CodeGenFunction &CGF, SourceLocation Loc, llvm::Value *OutlinedFn, ArrayRef CapturedVars, const Expr *IfCond) { if (!CGF.HaveInsertPoint()) return; auto *RTLoc = emitUpdateLocation(CGF, Loc); auto &&ThenGen = [OutlinedFn, CapturedVars, RTLoc](CodeGenFunction &CGF, PrePostActionTy &) { // Build call __kmpc_fork_call(loc, n, microtask, var1, .., varn); auto &RT = CGF.CGM.getOpenMPRuntime(); llvm::Value *Args[] = { RTLoc, CGF.Builder.getInt32(CapturedVars.size()), // Number of captured vars CGF.Builder.CreateBitCast(OutlinedFn, RT.getKmpc_MicroPointerTy())}; llvm::SmallVector RealArgs; RealArgs.append(std::begin(Args), std::end(Args)); RealArgs.append(CapturedVars.begin(), CapturedVars.end()); auto RTLFn = RT.createRuntimeFunction(OMPRTL__kmpc_fork_call); CGF.EmitRuntimeCall(RTLFn, RealArgs); }; auto &&ElseGen = [OutlinedFn, CapturedVars, RTLoc, Loc](CodeGenFunction &CGF, PrePostActionTy &) { auto &RT = CGF.CGM.getOpenMPRuntime(); auto ThreadID = RT.getThreadID(CGF, Loc); // Build calls: // __kmpc_serialized_parallel(&Loc, GTid); llvm::Value *Args[] = {RTLoc, ThreadID}; CGF.EmitRuntimeCall( RT.createRuntimeFunction(OMPRTL__kmpc_serialized_parallel), Args); // OutlinedFn(>id, &zero, CapturedStruct); auto ThreadIDAddr = RT.emitThreadIDAddress(CGF, Loc); Address ZeroAddr = CGF.CreateTempAlloca(CGF.Int32Ty, CharUnits::fromQuantity(4), /*Name*/ ".zero.addr"); CGF.InitTempAlloca(ZeroAddr, CGF.Builder.getInt32(/*C*/ 0)); llvm::SmallVector OutlinedFnArgs; OutlinedFnArgs.push_back(ThreadIDAddr.getPointer()); OutlinedFnArgs.push_back(ZeroAddr.getPointer()); OutlinedFnArgs.append(CapturedVars.begin(), CapturedVars.end()); CGF.EmitCallOrInvoke(OutlinedFn, OutlinedFnArgs); // __kmpc_end_serialized_parallel(&Loc, GTid); llvm::Value *EndArgs[] = {RT.emitUpdateLocation(CGF, Loc), ThreadID}; CGF.EmitRuntimeCall( RT.createRuntimeFunction(OMPRTL__kmpc_end_serialized_parallel), EndArgs); }; if (IfCond) emitOMPIfClause(CGF, IfCond, ThenGen, ElseGen); else { RegionCodeGenTy ThenRCG(ThenGen); ThenRCG(CGF); } } // If we're inside an (outlined) parallel region, use the region info's // thread-ID variable (it is passed in a first argument of the outlined function // as "kmp_int32 *gtid"). Otherwise, if we're not inside parallel region, but in // regular serial code region, get thread ID by calling kmp_int32 // kmpc_global_thread_num(ident_t *loc), stash this thread ID in a temporary and // return the address of that temp. Address CGOpenMPRuntime::emitThreadIDAddress(CodeGenFunction &CGF, SourceLocation Loc) { if (auto *OMPRegionInfo = dyn_cast_or_null(CGF.CapturedStmtInfo)) if (OMPRegionInfo->getThreadIDVariable()) return OMPRegionInfo->getThreadIDVariableLValue(CGF).getAddress(); auto ThreadID = getThreadID(CGF, Loc); auto Int32Ty = CGF.getContext().getIntTypeForBitwidth(/*DestWidth*/ 32, /*Signed*/ true); auto ThreadIDTemp = CGF.CreateMemTemp(Int32Ty, /*Name*/ ".threadid_temp."); CGF.EmitStoreOfScalar(ThreadID, CGF.MakeAddrLValue(ThreadIDTemp, Int32Ty)); return ThreadIDTemp; } llvm::Constant * CGOpenMPRuntime::getOrCreateInternalVariable(llvm::Type *Ty, const llvm::Twine &Name) { SmallString<256> Buffer; llvm::raw_svector_ostream Out(Buffer); Out << Name; auto RuntimeName = Out.str(); auto &Elem = *InternalVars.insert(std::make_pair(RuntimeName, nullptr)).first; if (Elem.second) { assert(Elem.second->getType()->getPointerElementType() == Ty && "OMP internal variable has different type than requested"); return &*Elem.second; } return Elem.second = new llvm::GlobalVariable( CGM.getModule(), Ty, /*IsConstant*/ false, llvm::GlobalValue::CommonLinkage, llvm::Constant::getNullValue(Ty), Elem.first()); } llvm::Value *CGOpenMPRuntime::getCriticalRegionLock(StringRef CriticalName) { llvm::Twine Name(".gomp_critical_user_", CriticalName); return getOrCreateInternalVariable(KmpCriticalNameTy, Name.concat(".var")); } namespace { /// Common pre(post)-action for different OpenMP constructs. class CommonActionTy final : public PrePostActionTy { llvm::Value *EnterCallee; ArrayRef EnterArgs; llvm::Value *ExitCallee; ArrayRef ExitArgs; bool Conditional; llvm::BasicBlock *ContBlock = nullptr; public: CommonActionTy(llvm::Value *EnterCallee, ArrayRef EnterArgs, llvm::Value *ExitCallee, ArrayRef ExitArgs, bool Conditional = false) : EnterCallee(EnterCallee), EnterArgs(EnterArgs), ExitCallee(ExitCallee), ExitArgs(ExitArgs), Conditional(Conditional) {} void Enter(CodeGenFunction &CGF) override { llvm::Value *EnterRes = CGF.EmitRuntimeCall(EnterCallee, EnterArgs); if (Conditional) { llvm::Value *CallBool = CGF.Builder.CreateIsNotNull(EnterRes); auto *ThenBlock = CGF.createBasicBlock("omp_if.then"); ContBlock = CGF.createBasicBlock("omp_if.end"); // Generate the branch (If-stmt) CGF.Builder.CreateCondBr(CallBool, ThenBlock, ContBlock); CGF.EmitBlock(ThenBlock); } } void Done(CodeGenFunction &CGF) { // Emit the rest of blocks/branches CGF.EmitBranch(ContBlock); CGF.EmitBlock(ContBlock, true); } void Exit(CodeGenFunction &CGF) override { CGF.EmitRuntimeCall(ExitCallee, ExitArgs); } }; } // anonymous namespace void CGOpenMPRuntime::emitCriticalRegion(CodeGenFunction &CGF, StringRef CriticalName, const RegionCodeGenTy &CriticalOpGen, SourceLocation Loc, const Expr *Hint) { // __kmpc_critical[_with_hint](ident_t *, gtid, Lock[, hint]); // CriticalOpGen(); // __kmpc_end_critical(ident_t *, gtid, Lock); // Prepare arguments and build a call to __kmpc_critical if (!CGF.HaveInsertPoint()) return; llvm::Value *Args[] = {emitUpdateLocation(CGF, Loc), getThreadID(CGF, Loc), getCriticalRegionLock(CriticalName)}; llvm::SmallVector EnterArgs(std::begin(Args), std::end(Args)); if (Hint) { EnterArgs.push_back(CGF.Builder.CreateIntCast( CGF.EmitScalarExpr(Hint), CGM.IntPtrTy, /*isSigned=*/false)); } CommonActionTy Action( createRuntimeFunction(Hint ? OMPRTL__kmpc_critical_with_hint : OMPRTL__kmpc_critical), EnterArgs, createRuntimeFunction(OMPRTL__kmpc_end_critical), Args); CriticalOpGen.setAction(Action); emitInlinedDirective(CGF, OMPD_critical, CriticalOpGen); } void CGOpenMPRuntime::emitMasterRegion(CodeGenFunction &CGF, const RegionCodeGenTy &MasterOpGen, SourceLocation Loc) { if (!CGF.HaveInsertPoint()) return; // if(__kmpc_master(ident_t *, gtid)) { // MasterOpGen(); // __kmpc_end_master(ident_t *, gtid); // } // Prepare arguments and build a call to __kmpc_master llvm::Value *Args[] = {emitUpdateLocation(CGF, Loc), getThreadID(CGF, Loc)}; CommonActionTy Action(createRuntimeFunction(OMPRTL__kmpc_master), Args, createRuntimeFunction(OMPRTL__kmpc_end_master), Args, /*Conditional=*/true); MasterOpGen.setAction(Action); emitInlinedDirective(CGF, OMPD_master, MasterOpGen); Action.Done(CGF); } void CGOpenMPRuntime::emitTaskyieldCall(CodeGenFunction &CGF, SourceLocation Loc) { if (!CGF.HaveInsertPoint()) return; // Build call __kmpc_omp_taskyield(loc, thread_id, 0); llvm::Value *Args[] = { emitUpdateLocation(CGF, Loc), getThreadID(CGF, Loc), llvm::ConstantInt::get(CGM.IntTy, /*V=*/0, /*isSigned=*/true)}; CGF.EmitRuntimeCall(createRuntimeFunction(OMPRTL__kmpc_omp_taskyield), Args); if (auto *Region = dyn_cast_or_null(CGF.CapturedStmtInfo)) Region->emitUntiedSwitch(CGF); } void CGOpenMPRuntime::emitTaskgroupRegion(CodeGenFunction &CGF, const RegionCodeGenTy &TaskgroupOpGen, SourceLocation Loc) { if (!CGF.HaveInsertPoint()) return; // __kmpc_taskgroup(ident_t *, gtid); // TaskgroupOpGen(); // __kmpc_end_taskgroup(ident_t *, gtid); // Prepare arguments and build a call to __kmpc_taskgroup llvm::Value *Args[] = {emitUpdateLocation(CGF, Loc), getThreadID(CGF, Loc)}; CommonActionTy Action(createRuntimeFunction(OMPRTL__kmpc_taskgroup), Args, createRuntimeFunction(OMPRTL__kmpc_end_taskgroup), Args); TaskgroupOpGen.setAction(Action); emitInlinedDirective(CGF, OMPD_taskgroup, TaskgroupOpGen); } /// Given an array of pointers to variables, project the address of a /// given variable. static Address emitAddrOfVarFromArray(CodeGenFunction &CGF, Address Array, unsigned Index, const VarDecl *Var) { // Pull out the pointer to the variable. Address PtrAddr = CGF.Builder.CreateConstArrayGEP(Array, Index, CGF.getPointerSize()); llvm::Value *Ptr = CGF.Builder.CreateLoad(PtrAddr); Address Addr = Address(Ptr, CGF.getContext().getDeclAlign(Var)); Addr = CGF.Builder.CreateElementBitCast( Addr, CGF.ConvertTypeForMem(Var->getType())); return Addr; } static llvm::Value *emitCopyprivateCopyFunction( CodeGenModule &CGM, llvm::Type *ArgsType, ArrayRef CopyprivateVars, ArrayRef DestExprs, ArrayRef SrcExprs, ArrayRef AssignmentOps) { auto &C = CGM.getContext(); // void copy_func(void *LHSArg, void *RHSArg); FunctionArgList Args; ImplicitParamDecl LHSArg(C, /*DC=*/nullptr, SourceLocation(), /*Id=*/nullptr, C.VoidPtrTy); ImplicitParamDecl RHSArg(C, /*DC=*/nullptr, SourceLocation(), /*Id=*/nullptr, C.VoidPtrTy); Args.push_back(&LHSArg); Args.push_back(&RHSArg); auto &CGFI = CGM.getTypes().arrangeBuiltinFunctionDeclaration(C.VoidTy, Args); auto *Fn = llvm::Function::Create( CGM.getTypes().GetFunctionType(CGFI), llvm::GlobalValue::InternalLinkage, ".omp.copyprivate.copy_func", &CGM.getModule()); CGM.SetInternalFunctionAttributes(/*D=*/nullptr, Fn, CGFI); CodeGenFunction CGF(CGM); CGF.StartFunction(GlobalDecl(), C.VoidTy, Fn, CGFI, Args); // Dest = (void*[n])(LHSArg); // Src = (void*[n])(RHSArg); Address LHS(CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( CGF.Builder.CreateLoad(CGF.GetAddrOfLocalVar(&LHSArg)), ArgsType), CGF.getPointerAlign()); Address RHS(CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( CGF.Builder.CreateLoad(CGF.GetAddrOfLocalVar(&RHSArg)), ArgsType), CGF.getPointerAlign()); // *(Type0*)Dst[0] = *(Type0*)Src[0]; // *(Type1*)Dst[1] = *(Type1*)Src[1]; // ... // *(Typen*)Dst[n] = *(Typen*)Src[n]; for (unsigned I = 0, E = AssignmentOps.size(); I < E; ++I) { auto DestVar = cast(cast(DestExprs[I])->getDecl()); Address DestAddr = emitAddrOfVarFromArray(CGF, LHS, I, DestVar); auto SrcVar = cast(cast(SrcExprs[I])->getDecl()); Address SrcAddr = emitAddrOfVarFromArray(CGF, RHS, I, SrcVar); auto *VD = cast(CopyprivateVars[I])->getDecl(); QualType Type = VD->getType(); CGF.EmitOMPCopy(Type, DestAddr, SrcAddr, DestVar, SrcVar, AssignmentOps[I]); } CGF.FinishFunction(); return Fn; } void CGOpenMPRuntime::emitSingleRegion(CodeGenFunction &CGF, const RegionCodeGenTy &SingleOpGen, SourceLocation Loc, ArrayRef CopyprivateVars, ArrayRef SrcExprs, ArrayRef DstExprs, ArrayRef AssignmentOps) { if (!CGF.HaveInsertPoint()) return; assert(CopyprivateVars.size() == SrcExprs.size() && CopyprivateVars.size() == DstExprs.size() && CopyprivateVars.size() == AssignmentOps.size()); auto &C = CGM.getContext(); // int32 did_it = 0; // if(__kmpc_single(ident_t *, gtid)) { // SingleOpGen(); // __kmpc_end_single(ident_t *, gtid); // did_it = 1; // } // call __kmpc_copyprivate(ident_t *, gtid, , , // , did_it); Address DidIt = Address::invalid(); if (!CopyprivateVars.empty()) { // int32 did_it = 0; auto KmpInt32Ty = C.getIntTypeForBitwidth(/*DestWidth=*/32, /*Signed=*/1); DidIt = CGF.CreateMemTemp(KmpInt32Ty, ".omp.copyprivate.did_it"); CGF.Builder.CreateStore(CGF.Builder.getInt32(0), DidIt); } // Prepare arguments and build a call to __kmpc_single llvm::Value *Args[] = {emitUpdateLocation(CGF, Loc), getThreadID(CGF, Loc)}; CommonActionTy Action(createRuntimeFunction(OMPRTL__kmpc_single), Args, createRuntimeFunction(OMPRTL__kmpc_end_single), Args, /*Conditional=*/true); SingleOpGen.setAction(Action); emitInlinedDirective(CGF, OMPD_single, SingleOpGen); if (DidIt.isValid()) { // did_it = 1; CGF.Builder.CreateStore(CGF.Builder.getInt32(1), DidIt); } Action.Done(CGF); // call __kmpc_copyprivate(ident_t *, gtid, , , // , did_it); if (DidIt.isValid()) { llvm::APInt ArraySize(/*unsigned int numBits=*/32, CopyprivateVars.size()); auto CopyprivateArrayTy = C.getConstantArrayType(C.VoidPtrTy, ArraySize, ArrayType::Normal, /*IndexTypeQuals=*/0); // Create a list of all private variables for copyprivate. Address CopyprivateList = CGF.CreateMemTemp(CopyprivateArrayTy, ".omp.copyprivate.cpr_list"); for (unsigned I = 0, E = CopyprivateVars.size(); I < E; ++I) { Address Elem = CGF.Builder.CreateConstArrayGEP( CopyprivateList, I, CGF.getPointerSize()); CGF.Builder.CreateStore( CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( CGF.EmitLValue(CopyprivateVars[I]).getPointer(), CGF.VoidPtrTy), Elem); } // Build function that copies private values from single region to all other // threads in the corresponding parallel region. auto *CpyFn = emitCopyprivateCopyFunction( CGM, CGF.ConvertTypeForMem(CopyprivateArrayTy)->getPointerTo(), CopyprivateVars, SrcExprs, DstExprs, AssignmentOps); auto *BufSize = CGF.getTypeSize(CopyprivateArrayTy); Address CL = CGF.Builder.CreatePointerBitCastOrAddrSpaceCast(CopyprivateList, CGF.VoidPtrTy); auto *DidItVal = CGF.Builder.CreateLoad(DidIt); llvm::Value *Args[] = { emitUpdateLocation(CGF, Loc), // ident_t * getThreadID(CGF, Loc), // i32 BufSize, // size_t CL.getPointer(), // void * CpyFn, // void (*) (void *, void *) DidItVal // i32 did_it }; CGF.EmitRuntimeCall(createRuntimeFunction(OMPRTL__kmpc_copyprivate), Args); } } void CGOpenMPRuntime::emitOrderedRegion(CodeGenFunction &CGF, const RegionCodeGenTy &OrderedOpGen, SourceLocation Loc, bool IsThreads) { if (!CGF.HaveInsertPoint()) return; // __kmpc_ordered(ident_t *, gtid); // OrderedOpGen(); // __kmpc_end_ordered(ident_t *, gtid); // Prepare arguments and build a call to __kmpc_ordered if (IsThreads) { llvm::Value *Args[] = {emitUpdateLocation(CGF, Loc), getThreadID(CGF, Loc)}; CommonActionTy Action(createRuntimeFunction(OMPRTL__kmpc_ordered), Args, createRuntimeFunction(OMPRTL__kmpc_end_ordered), Args); OrderedOpGen.setAction(Action); emitInlinedDirective(CGF, OMPD_ordered, OrderedOpGen); return; } emitInlinedDirective(CGF, OMPD_ordered, OrderedOpGen); } void CGOpenMPRuntime::emitBarrierCall(CodeGenFunction &CGF, SourceLocation Loc, OpenMPDirectiveKind Kind, bool EmitChecks, bool ForceSimpleCall) { if (!CGF.HaveInsertPoint()) return; // Build call __kmpc_cancel_barrier(loc, thread_id); // Build call __kmpc_barrier(loc, thread_id); unsigned Flags; if (Kind == OMPD_for) Flags = OMP_IDENT_BARRIER_IMPL_FOR; else if (Kind == OMPD_sections) Flags = OMP_IDENT_BARRIER_IMPL_SECTIONS; else if (Kind == OMPD_single) Flags = OMP_IDENT_BARRIER_IMPL_SINGLE; else if (Kind == OMPD_barrier) Flags = OMP_IDENT_BARRIER_EXPL; else Flags = OMP_IDENT_BARRIER_IMPL; // Build call __kmpc_cancel_barrier(loc, thread_id) or __kmpc_barrier(loc, // thread_id); llvm::Value *Args[] = {emitUpdateLocation(CGF, Loc, Flags), getThreadID(CGF, Loc)}; if (auto *OMPRegionInfo = dyn_cast_or_null(CGF.CapturedStmtInfo)) { if (!ForceSimpleCall && OMPRegionInfo->hasCancel()) { auto *Result = CGF.EmitRuntimeCall( createRuntimeFunction(OMPRTL__kmpc_cancel_barrier), Args); if (EmitChecks) { // if (__kmpc_cancel_barrier()) { // exit from construct; // } auto *ExitBB = CGF.createBasicBlock(".cancel.exit"); auto *ContBB = CGF.createBasicBlock(".cancel.continue"); auto *Cmp = CGF.Builder.CreateIsNotNull(Result); CGF.Builder.CreateCondBr(Cmp, ExitBB, ContBB); CGF.EmitBlock(ExitBB); // exit from construct; auto CancelDestination = CGF.getOMPCancelDestination(OMPRegionInfo->getDirectiveKind()); CGF.EmitBranchThroughCleanup(CancelDestination); CGF.EmitBlock(ContBB, /*IsFinished=*/true); } return; } } CGF.EmitRuntimeCall(createRuntimeFunction(OMPRTL__kmpc_barrier), Args); } /// \brief Map the OpenMP loop schedule to the runtime enumeration. static OpenMPSchedType getRuntimeSchedule(OpenMPScheduleClauseKind ScheduleKind, bool Chunked, bool Ordered) { switch (ScheduleKind) { case OMPC_SCHEDULE_static: return Chunked ? (Ordered ? OMP_ord_static_chunked : OMP_sch_static_chunked) : (Ordered ? OMP_ord_static : OMP_sch_static); case OMPC_SCHEDULE_dynamic: return Ordered ? OMP_ord_dynamic_chunked : OMP_sch_dynamic_chunked; case OMPC_SCHEDULE_guided: return Ordered ? OMP_ord_guided_chunked : OMP_sch_guided_chunked; case OMPC_SCHEDULE_runtime: return Ordered ? OMP_ord_runtime : OMP_sch_runtime; case OMPC_SCHEDULE_auto: return Ordered ? OMP_ord_auto : OMP_sch_auto; case OMPC_SCHEDULE_unknown: assert(!Chunked && "chunk was specified but schedule kind not known"); return Ordered ? OMP_ord_static : OMP_sch_static; } llvm_unreachable("Unexpected runtime schedule"); } /// \brief Map the OpenMP distribute schedule to the runtime enumeration. static OpenMPSchedType getRuntimeSchedule(OpenMPDistScheduleClauseKind ScheduleKind, bool Chunked) { // only static is allowed for dist_schedule return Chunked ? OMP_dist_sch_static_chunked : OMP_dist_sch_static; } bool CGOpenMPRuntime::isStaticNonchunked(OpenMPScheduleClauseKind ScheduleKind, bool Chunked) const { auto Schedule = getRuntimeSchedule(ScheduleKind, Chunked, /*Ordered=*/false); return Schedule == OMP_sch_static; } bool CGOpenMPRuntime::isStaticNonchunked( OpenMPDistScheduleClauseKind ScheduleKind, bool Chunked) const { auto Schedule = getRuntimeSchedule(ScheduleKind, Chunked); return Schedule == OMP_dist_sch_static; } bool CGOpenMPRuntime::isDynamic(OpenMPScheduleClauseKind ScheduleKind) const { auto Schedule = getRuntimeSchedule(ScheduleKind, /*Chunked=*/false, /*Ordered=*/false); assert(Schedule != OMP_sch_static_chunked && "cannot be chunked here"); return Schedule != OMP_sch_static; } static int addMonoNonMonoModifier(OpenMPSchedType Schedule, OpenMPScheduleClauseModifier M1, OpenMPScheduleClauseModifier M2) { int Modifier = 0; switch (M1) { case OMPC_SCHEDULE_MODIFIER_monotonic: Modifier = OMP_sch_modifier_monotonic; break; case OMPC_SCHEDULE_MODIFIER_nonmonotonic: Modifier = OMP_sch_modifier_nonmonotonic; break; case OMPC_SCHEDULE_MODIFIER_simd: if (Schedule == OMP_sch_static_chunked) Schedule = OMP_sch_static_balanced_chunked; break; case OMPC_SCHEDULE_MODIFIER_last: case OMPC_SCHEDULE_MODIFIER_unknown: break; } switch (M2) { case OMPC_SCHEDULE_MODIFIER_monotonic: Modifier = OMP_sch_modifier_monotonic; break; case OMPC_SCHEDULE_MODIFIER_nonmonotonic: Modifier = OMP_sch_modifier_nonmonotonic; break; case OMPC_SCHEDULE_MODIFIER_simd: if (Schedule == OMP_sch_static_chunked) Schedule = OMP_sch_static_balanced_chunked; break; case OMPC_SCHEDULE_MODIFIER_last: case OMPC_SCHEDULE_MODIFIER_unknown: break; } return Schedule | Modifier; } void CGOpenMPRuntime::emitForDispatchInit(CodeGenFunction &CGF, SourceLocation Loc, const OpenMPScheduleTy &ScheduleKind, unsigned IVSize, bool IVSigned, bool Ordered, llvm::Value *UB, llvm::Value *Chunk) { if (!CGF.HaveInsertPoint()) return; OpenMPSchedType Schedule = getRuntimeSchedule(ScheduleKind.Schedule, Chunk != nullptr, Ordered); assert(Ordered || (Schedule != OMP_sch_static && Schedule != OMP_sch_static_chunked && Schedule != OMP_ord_static && Schedule != OMP_ord_static_chunked && Schedule != OMP_sch_static_balanced_chunked)); // Call __kmpc_dispatch_init( // ident_t *loc, kmp_int32 tid, kmp_int32 schedule, // kmp_int[32|64] lower, kmp_int[32|64] upper, // kmp_int[32|64] stride, kmp_int[32|64] chunk); // If the Chunk was not specified in the clause - use default value 1. if (Chunk == nullptr) Chunk = CGF.Builder.getIntN(IVSize, 1); llvm::Value *Args[] = { emitUpdateLocation(CGF, Loc), getThreadID(CGF, Loc), CGF.Builder.getInt32(addMonoNonMonoModifier( Schedule, ScheduleKind.M1, ScheduleKind.M2)), // Schedule type CGF.Builder.getIntN(IVSize, 0), // Lower UB, // Upper CGF.Builder.getIntN(IVSize, 1), // Stride Chunk // Chunk }; CGF.EmitRuntimeCall(createDispatchInitFunction(IVSize, IVSigned), Args); } static void emitForStaticInitCall( CodeGenFunction &CGF, llvm::Value *UpdateLocation, llvm::Value *ThreadId, llvm::Constant *ForStaticInitFunction, OpenMPSchedType Schedule, OpenMPScheduleClauseModifier M1, OpenMPScheduleClauseModifier M2, unsigned IVSize, bool Ordered, Address IL, Address LB, Address UB, Address ST, llvm::Value *Chunk) { if (!CGF.HaveInsertPoint()) return; assert(!Ordered); assert(Schedule == OMP_sch_static || Schedule == OMP_sch_static_chunked || Schedule == OMP_sch_static_balanced_chunked || Schedule == OMP_ord_static || Schedule == OMP_ord_static_chunked || Schedule == OMP_dist_sch_static || Schedule == OMP_dist_sch_static_chunked); // Call __kmpc_for_static_init( // ident_t *loc, kmp_int32 tid, kmp_int32 schedtype, // kmp_int32 *p_lastiter, kmp_int[32|64] *p_lower, // kmp_int[32|64] *p_upper, kmp_int[32|64] *p_stride, // kmp_int[32|64] incr, kmp_int[32|64] chunk); if (Chunk == nullptr) { assert((Schedule == OMP_sch_static || Schedule == OMP_ord_static || Schedule == OMP_dist_sch_static) && "expected static non-chunked schedule"); // If the Chunk was not specified in the clause - use default value 1. Chunk = CGF.Builder.getIntN(IVSize, 1); } else { assert((Schedule == OMP_sch_static_chunked || Schedule == OMP_sch_static_balanced_chunked || Schedule == OMP_ord_static_chunked || Schedule == OMP_dist_sch_static_chunked) && "expected static chunked schedule"); } llvm::Value *Args[] = { UpdateLocation, ThreadId, CGF.Builder.getInt32(addMonoNonMonoModifier( Schedule, M1, M2)), // Schedule type IL.getPointer(), // &isLastIter LB.getPointer(), // &LB UB.getPointer(), // &UB ST.getPointer(), // &Stride CGF.Builder.getIntN(IVSize, 1), // Incr Chunk // Chunk }; CGF.EmitRuntimeCall(ForStaticInitFunction, Args); } void CGOpenMPRuntime::emitForStaticInit(CodeGenFunction &CGF, SourceLocation Loc, const OpenMPScheduleTy &ScheduleKind, unsigned IVSize, bool IVSigned, bool Ordered, Address IL, Address LB, Address UB, Address ST, llvm::Value *Chunk) { OpenMPSchedType ScheduleNum = getRuntimeSchedule(ScheduleKind.Schedule, Chunk != nullptr, Ordered); auto *UpdatedLocation = emitUpdateLocation(CGF, Loc); auto *ThreadId = getThreadID(CGF, Loc); auto *StaticInitFunction = createForStaticInitFunction(IVSize, IVSigned); emitForStaticInitCall(CGF, UpdatedLocation, ThreadId, StaticInitFunction, ScheduleNum, ScheduleKind.M1, ScheduleKind.M2, IVSize, Ordered, IL, LB, UB, ST, Chunk); } void CGOpenMPRuntime::emitDistributeStaticInit( CodeGenFunction &CGF, SourceLocation Loc, OpenMPDistScheduleClauseKind SchedKind, unsigned IVSize, bool IVSigned, bool Ordered, Address IL, Address LB, Address UB, Address ST, llvm::Value *Chunk) { OpenMPSchedType ScheduleNum = getRuntimeSchedule(SchedKind, Chunk != nullptr); auto *UpdatedLocation = emitUpdateLocation(CGF, Loc); auto *ThreadId = getThreadID(CGF, Loc); auto *StaticInitFunction = createForStaticInitFunction(IVSize, IVSigned); emitForStaticInitCall(CGF, UpdatedLocation, ThreadId, StaticInitFunction, ScheduleNum, OMPC_SCHEDULE_MODIFIER_unknown, OMPC_SCHEDULE_MODIFIER_unknown, IVSize, Ordered, IL, LB, UB, ST, Chunk); } void CGOpenMPRuntime::emitForStaticFinish(CodeGenFunction &CGF, SourceLocation Loc) { if (!CGF.HaveInsertPoint()) return; // Call __kmpc_for_static_fini(ident_t *loc, kmp_int32 tid); llvm::Value *Args[] = {emitUpdateLocation(CGF, Loc), getThreadID(CGF, Loc)}; CGF.EmitRuntimeCall(createRuntimeFunction(OMPRTL__kmpc_for_static_fini), Args); } void CGOpenMPRuntime::emitForOrderedIterationEnd(CodeGenFunction &CGF, SourceLocation Loc, unsigned IVSize, bool IVSigned) { if (!CGF.HaveInsertPoint()) return; // Call __kmpc_for_dynamic_fini_(4|8)[u](ident_t *loc, kmp_int32 tid); llvm::Value *Args[] = {emitUpdateLocation(CGF, Loc), getThreadID(CGF, Loc)}; CGF.EmitRuntimeCall(createDispatchFiniFunction(IVSize, IVSigned), Args); } llvm::Value *CGOpenMPRuntime::emitForNext(CodeGenFunction &CGF, SourceLocation Loc, unsigned IVSize, bool IVSigned, Address IL, Address LB, Address UB, Address ST) { // Call __kmpc_dispatch_next( // ident_t *loc, kmp_int32 tid, kmp_int32 *p_lastiter, // kmp_int[32|64] *p_lower, kmp_int[32|64] *p_upper, // kmp_int[32|64] *p_stride); llvm::Value *Args[] = { emitUpdateLocation(CGF, Loc), getThreadID(CGF, Loc), IL.getPointer(), // &isLastIter LB.getPointer(), // &Lower UB.getPointer(), // &Upper ST.getPointer() // &Stride }; llvm::Value *Call = CGF.EmitRuntimeCall(createDispatchNextFunction(IVSize, IVSigned), Args); return CGF.EmitScalarConversion( Call, CGF.getContext().getIntTypeForBitwidth(32, /* Signed */ true), CGF.getContext().BoolTy, Loc); } void CGOpenMPRuntime::emitNumThreadsClause(CodeGenFunction &CGF, llvm::Value *NumThreads, SourceLocation Loc) { if (!CGF.HaveInsertPoint()) return; // Build call __kmpc_push_num_threads(&loc, global_tid, num_threads) llvm::Value *Args[] = { emitUpdateLocation(CGF, Loc), getThreadID(CGF, Loc), CGF.Builder.CreateIntCast(NumThreads, CGF.Int32Ty, /*isSigned*/ true)}; CGF.EmitRuntimeCall(createRuntimeFunction(OMPRTL__kmpc_push_num_threads), Args); } void CGOpenMPRuntime::emitProcBindClause(CodeGenFunction &CGF, OpenMPProcBindClauseKind ProcBind, SourceLocation Loc) { if (!CGF.HaveInsertPoint()) return; // Constants for proc bind value accepted by the runtime. enum ProcBindTy { ProcBindFalse = 0, ProcBindTrue, ProcBindMaster, ProcBindClose, ProcBindSpread, ProcBindIntel, ProcBindDefault } RuntimeProcBind; switch (ProcBind) { case OMPC_PROC_BIND_master: RuntimeProcBind = ProcBindMaster; break; case OMPC_PROC_BIND_close: RuntimeProcBind = ProcBindClose; break; case OMPC_PROC_BIND_spread: RuntimeProcBind = ProcBindSpread; break; case OMPC_PROC_BIND_unknown: llvm_unreachable("Unsupported proc_bind value."); } // Build call __kmpc_push_proc_bind(&loc, global_tid, proc_bind) llvm::Value *Args[] = { emitUpdateLocation(CGF, Loc), getThreadID(CGF, Loc), llvm::ConstantInt::get(CGM.IntTy, RuntimeProcBind, /*isSigned=*/true)}; CGF.EmitRuntimeCall(createRuntimeFunction(OMPRTL__kmpc_push_proc_bind), Args); } void CGOpenMPRuntime::emitFlush(CodeGenFunction &CGF, ArrayRef, SourceLocation Loc) { if (!CGF.HaveInsertPoint()) return; // Build call void __kmpc_flush(ident_t *loc) CGF.EmitRuntimeCall(createRuntimeFunction(OMPRTL__kmpc_flush), emitUpdateLocation(CGF, Loc)); } namespace { /// \brief Indexes of fields for type kmp_task_t. enum KmpTaskTFields { /// \brief List of shared variables. KmpTaskTShareds, /// \brief Task routine. KmpTaskTRoutine, /// \brief Partition id for the untied tasks. KmpTaskTPartId, /// Function with call of destructors for private variables. Data1, /// Task priority. Data2, /// (Taskloops only) Lower bound. KmpTaskTLowerBound, /// (Taskloops only) Upper bound. KmpTaskTUpperBound, /// (Taskloops only) Stride. KmpTaskTStride, /// (Taskloops only) Is last iteration flag. KmpTaskTLastIter, }; } // anonymous namespace bool CGOpenMPRuntime::OffloadEntriesInfoManagerTy::empty() const { // FIXME: Add other entries type when they become supported. return OffloadEntriesTargetRegion.empty(); } /// \brief Initialize target region entry. void CGOpenMPRuntime::OffloadEntriesInfoManagerTy:: initializeTargetRegionEntryInfo(unsigned DeviceID, unsigned FileID, StringRef ParentName, unsigned LineNum, unsigned Order) { assert(CGM.getLangOpts().OpenMPIsDevice && "Initialization of entries is " "only required for the device " "code generation."); OffloadEntriesTargetRegion[DeviceID][FileID][ParentName][LineNum] = OffloadEntryInfoTargetRegion(Order, /*Addr=*/nullptr, /*ID=*/nullptr, /*Flags=*/0); ++OffloadingEntriesNum; } void CGOpenMPRuntime::OffloadEntriesInfoManagerTy:: registerTargetRegionEntryInfo(unsigned DeviceID, unsigned FileID, StringRef ParentName, unsigned LineNum, llvm::Constant *Addr, llvm::Constant *ID, int32_t Flags) { // If we are emitting code for a target, the entry is already initialized, // only has to be registered. if (CGM.getLangOpts().OpenMPIsDevice) { assert(hasTargetRegionEntryInfo(DeviceID, FileID, ParentName, LineNum) && "Entry must exist."); auto &Entry = OffloadEntriesTargetRegion[DeviceID][FileID][ParentName][LineNum]; assert(Entry.isValid() && "Entry not initialized!"); Entry.setAddress(Addr); Entry.setID(ID); Entry.setFlags(Flags); return; } else { OffloadEntryInfoTargetRegion Entry(OffloadingEntriesNum++, Addr, ID, Flags); OffloadEntriesTargetRegion[DeviceID][FileID][ParentName][LineNum] = Entry; } } bool CGOpenMPRuntime::OffloadEntriesInfoManagerTy::hasTargetRegionEntryInfo( unsigned DeviceID, unsigned FileID, StringRef ParentName, unsigned LineNum) const { auto PerDevice = OffloadEntriesTargetRegion.find(DeviceID); if (PerDevice == OffloadEntriesTargetRegion.end()) return false; auto PerFile = PerDevice->second.find(FileID); if (PerFile == PerDevice->second.end()) return false; auto PerParentName = PerFile->second.find(ParentName); if (PerParentName == PerFile->second.end()) return false; auto PerLine = PerParentName->second.find(LineNum); if (PerLine == PerParentName->second.end()) return false; // Fail if this entry is already registered. if (PerLine->second.getAddress() || PerLine->second.getID()) return false; return true; } void CGOpenMPRuntime::OffloadEntriesInfoManagerTy::actOnTargetRegionEntriesInfo( const OffloadTargetRegionEntryInfoActTy &Action) { // Scan all target region entries and perform the provided action. for (auto &D : OffloadEntriesTargetRegion) for (auto &F : D.second) for (auto &P : F.second) for (auto &L : P.second) Action(D.first, F.first, P.first(), L.first, L.second); } /// \brief Create a Ctor/Dtor-like function whose body is emitted through /// \a Codegen. This is used to emit the two functions that register and /// unregister the descriptor of the current compilation unit. static llvm::Function * createOffloadingBinaryDescriptorFunction(CodeGenModule &CGM, StringRef Name, const RegionCodeGenTy &Codegen) { auto &C = CGM.getContext(); FunctionArgList Args; ImplicitParamDecl DummyPtr(C, /*DC=*/nullptr, SourceLocation(), /*Id=*/nullptr, C.VoidPtrTy); Args.push_back(&DummyPtr); CodeGenFunction CGF(CGM); auto &FI = CGM.getTypes().arrangeBuiltinFunctionDeclaration(C.VoidTy, Args); auto FTy = CGM.getTypes().GetFunctionType(FI); auto *Fn = CGM.CreateGlobalInitOrDestructFunction(FTy, Name, FI, SourceLocation()); CGF.StartFunction(GlobalDecl(), C.VoidTy, Fn, FI, Args, SourceLocation()); Codegen(CGF); CGF.FinishFunction(); return Fn; } llvm::Function * CGOpenMPRuntime::createOffloadingBinaryDescriptorRegistration() { // If we don't have entries or if we are emitting code for the device, we // don't need to do anything. if (CGM.getLangOpts().OpenMPIsDevice || OffloadEntriesInfoManager.empty()) return nullptr; auto &M = CGM.getModule(); auto &C = CGM.getContext(); // Get list of devices we care about auto &Devices = CGM.getLangOpts().OMPTargetTriples; // We should be creating an offloading descriptor only if there are devices // specified. assert(!Devices.empty() && "No OpenMP offloading devices??"); // Create the external variables that will point to the begin and end of the // host entries section. These will be defined by the linker. auto *OffloadEntryTy = CGM.getTypes().ConvertTypeForMem(getTgtOffloadEntryQTy()); llvm::GlobalVariable *HostEntriesBegin = new llvm::GlobalVariable( M, OffloadEntryTy, /*isConstant=*/true, llvm::GlobalValue::ExternalLinkage, /*Initializer=*/nullptr, ".omp_offloading.entries_begin"); llvm::GlobalVariable *HostEntriesEnd = new llvm::GlobalVariable( M, OffloadEntryTy, /*isConstant=*/true, llvm::GlobalValue::ExternalLinkage, /*Initializer=*/nullptr, ".omp_offloading.entries_end"); // Create all device images auto *DeviceImageTy = cast( CGM.getTypes().ConvertTypeForMem(getTgtDeviceImageQTy())); ConstantInitBuilder DeviceImagesBuilder(CGM); auto DeviceImagesEntries = DeviceImagesBuilder.beginArray(DeviceImageTy); for (unsigned i = 0; i < Devices.size(); ++i) { StringRef T = Devices[i].getTriple(); auto *ImgBegin = new llvm::GlobalVariable( M, CGM.Int8Ty, /*isConstant=*/true, llvm::GlobalValue::ExternalLinkage, /*Initializer=*/nullptr, Twine(".omp_offloading.img_start.") + Twine(T)); auto *ImgEnd = new llvm::GlobalVariable( M, CGM.Int8Ty, /*isConstant=*/true, llvm::GlobalValue::ExternalLinkage, /*Initializer=*/nullptr, Twine(".omp_offloading.img_end.") + Twine(T)); auto Dev = DeviceImagesEntries.beginStruct(DeviceImageTy); Dev.add(ImgBegin); Dev.add(ImgEnd); Dev.add(HostEntriesBegin); Dev.add(HostEntriesEnd); Dev.finishAndAddTo(DeviceImagesEntries); } // Create device images global array. llvm::GlobalVariable *DeviceImages = DeviceImagesEntries.finishAndCreateGlobal(".omp_offloading.device_images", CGM.getPointerAlign(), /*isConstant=*/true); DeviceImages->setUnnamedAddr(llvm::GlobalValue::UnnamedAddr::Global); // This is a Zero array to be used in the creation of the constant expressions llvm::Constant *Index[] = {llvm::Constant::getNullValue(CGM.Int32Ty), llvm::Constant::getNullValue(CGM.Int32Ty)}; // Create the target region descriptor. auto *BinaryDescriptorTy = cast( CGM.getTypes().ConvertTypeForMem(getTgtBinaryDescriptorQTy())); ConstantInitBuilder DescBuilder(CGM); auto DescInit = DescBuilder.beginStruct(BinaryDescriptorTy); DescInit.addInt(CGM.Int32Ty, Devices.size()); DescInit.add(llvm::ConstantExpr::getGetElementPtr(DeviceImages->getValueType(), DeviceImages, Index)); DescInit.add(HostEntriesBegin); DescInit.add(HostEntriesEnd); auto *Desc = DescInit.finishAndCreateGlobal(".omp_offloading.descriptor", CGM.getPointerAlign(), /*isConstant=*/true); // Emit code to register or unregister the descriptor at execution // startup or closing, respectively. // Create a variable to drive the registration and unregistration of the // descriptor, so we can reuse the logic that emits Ctors and Dtors. auto *IdentInfo = &C.Idents.get(".omp_offloading.reg_unreg_var"); ImplicitParamDecl RegUnregVar(C, C.getTranslationUnitDecl(), SourceLocation(), IdentInfo, C.CharTy); auto *UnRegFn = createOffloadingBinaryDescriptorFunction( CGM, ".omp_offloading.descriptor_unreg", [&](CodeGenFunction &CGF, PrePostActionTy &) { CGF.EmitCallOrInvoke(createRuntimeFunction(OMPRTL__tgt_unregister_lib), Desc); }); auto *RegFn = createOffloadingBinaryDescriptorFunction( CGM, ".omp_offloading.descriptor_reg", [&](CodeGenFunction &CGF, PrePostActionTy &) { CGF.EmitCallOrInvoke(createRuntimeFunction(OMPRTL__tgt_register_lib), Desc); CGM.getCXXABI().registerGlobalDtor(CGF, RegUnregVar, UnRegFn, Desc); }); return RegFn; } void CGOpenMPRuntime::createOffloadEntry(llvm::Constant *ID, llvm::Constant *Addr, uint64_t Size, int32_t Flags) { StringRef Name = Addr->getName(); auto *TgtOffloadEntryType = cast( CGM.getTypes().ConvertTypeForMem(getTgtOffloadEntryQTy())); llvm::LLVMContext &C = CGM.getModule().getContext(); llvm::Module &M = CGM.getModule(); // Make sure the address has the right type. llvm::Constant *AddrPtr = llvm::ConstantExpr::getBitCast(ID, CGM.VoidPtrTy); // Create constant string with the name. llvm::Constant *StrPtrInit = llvm::ConstantDataArray::getString(C, Name); llvm::GlobalVariable *Str = new llvm::GlobalVariable(M, StrPtrInit->getType(), /*isConstant=*/true, llvm::GlobalValue::InternalLinkage, StrPtrInit, ".omp_offloading.entry_name"); Str->setUnnamedAddr(llvm::GlobalValue::UnnamedAddr::Global); llvm::Constant *StrPtr = llvm::ConstantExpr::getBitCast(Str, CGM.Int8PtrTy); // We can't have any padding between symbols, so we need to have 1-byte // alignment. auto Align = CharUnits::fromQuantity(1); // Create the entry struct. ConstantInitBuilder EntryBuilder(CGM); auto EntryInit = EntryBuilder.beginStruct(TgtOffloadEntryType); EntryInit.add(AddrPtr); EntryInit.add(StrPtr); EntryInit.addInt(CGM.SizeTy, Size); EntryInit.addInt(CGM.Int32Ty, Flags); EntryInit.addInt(CGM.Int32Ty, 0); llvm::GlobalVariable *Entry = EntryInit.finishAndCreateGlobal(".omp_offloading.entry", Align, /*constant*/ true, llvm::GlobalValue::ExternalLinkage); // The entry has to be created in the section the linker expects it to be. Entry->setSection(".omp_offloading.entries"); } void CGOpenMPRuntime::createOffloadEntriesAndInfoMetadata() { // Emit the offloading entries and metadata so that the device codegen side // can easily figure out what to emit. The produced metadata looks like // this: // // !omp_offload.info = !{!1, ...} // // Right now we only generate metadata for function that contain target // regions. // If we do not have entries, we dont need to do anything. if (OffloadEntriesInfoManager.empty()) return; llvm::Module &M = CGM.getModule(); llvm::LLVMContext &C = M.getContext(); SmallVector OrderedEntries(OffloadEntriesInfoManager.size()); // Create the offloading info metadata node. llvm::NamedMDNode *MD = M.getOrInsertNamedMetadata("omp_offload.info"); // Auxiliar methods to create metadata values and strings. auto getMDInt = [&](unsigned v) { return llvm::ConstantAsMetadata::get( llvm::ConstantInt::get(llvm::Type::getInt32Ty(C), v)); }; auto getMDString = [&](StringRef v) { return llvm::MDString::get(C, v); }; // Create function that emits metadata for each target region entry; auto &&TargetRegionMetadataEmitter = [&]( unsigned DeviceID, unsigned FileID, StringRef ParentName, unsigned Line, OffloadEntriesInfoManagerTy::OffloadEntryInfoTargetRegion &E) { llvm::SmallVector Ops; // Generate metadata for target regions. Each entry of this metadata // contains: // - Entry 0 -> Kind of this type of metadata (0). // - Entry 1 -> Device ID of the file where the entry was identified. // - Entry 2 -> File ID of the file where the entry was identified. // - Entry 3 -> Mangled name of the function where the entry was identified. // - Entry 4 -> Line in the file where the entry was identified. // - Entry 5 -> Order the entry was created. // The first element of the metadata node is the kind. Ops.push_back(getMDInt(E.getKind())); Ops.push_back(getMDInt(DeviceID)); Ops.push_back(getMDInt(FileID)); Ops.push_back(getMDString(ParentName)); Ops.push_back(getMDInt(Line)); Ops.push_back(getMDInt(E.getOrder())); // Save this entry in the right position of the ordered entries array. OrderedEntries[E.getOrder()] = &E; // Add metadata to the named metadata node. MD->addOperand(llvm::MDNode::get(C, Ops)); }; OffloadEntriesInfoManager.actOnTargetRegionEntriesInfo( TargetRegionMetadataEmitter); for (auto *E : OrderedEntries) { assert(E && "All ordered entries must exist!"); if (auto *CE = dyn_cast( E)) { assert(CE->getID() && CE->getAddress() && "Entry ID and Addr are invalid!"); createOffloadEntry(CE->getID(), CE->getAddress(), /*Size=*/0); } else llvm_unreachable("Unsupported entry kind."); } } /// \brief Loads all the offload entries information from the host IR /// metadata. void CGOpenMPRuntime::loadOffloadInfoMetadata() { // If we are in target mode, load the metadata from the host IR. This code has // to match the metadaata creation in createOffloadEntriesAndInfoMetadata(). if (!CGM.getLangOpts().OpenMPIsDevice) return; if (CGM.getLangOpts().OMPHostIRFile.empty()) return; auto Buf = llvm::MemoryBuffer::getFile(CGM.getLangOpts().OMPHostIRFile); if (Buf.getError()) return; llvm::LLVMContext C; auto ME = expectedToErrorOrAndEmitErrors( C, llvm::parseBitcodeFile(Buf.get()->getMemBufferRef(), C)); if (ME.getError()) return; llvm::NamedMDNode *MD = ME.get()->getNamedMetadata("omp_offload.info"); if (!MD) return; for (auto I : MD->operands()) { llvm::MDNode *MN = cast(I); auto getMDInt = [&](unsigned Idx) { llvm::ConstantAsMetadata *V = cast(MN->getOperand(Idx)); return cast(V->getValue())->getZExtValue(); }; auto getMDString = [&](unsigned Idx) { llvm::MDString *V = cast(MN->getOperand(Idx)); return V->getString(); }; switch (getMDInt(0)) { default: llvm_unreachable("Unexpected metadata!"); break; case OffloadEntriesInfoManagerTy::OffloadEntryInfo:: OFFLOAD_ENTRY_INFO_TARGET_REGION: OffloadEntriesInfoManager.initializeTargetRegionEntryInfo( /*DeviceID=*/getMDInt(1), /*FileID=*/getMDInt(2), /*ParentName=*/getMDString(3), /*Line=*/getMDInt(4), /*Order=*/getMDInt(5)); break; } } } void CGOpenMPRuntime::emitKmpRoutineEntryT(QualType KmpInt32Ty) { if (!KmpRoutineEntryPtrTy) { // Build typedef kmp_int32 (* kmp_routine_entry_t)(kmp_int32, void *); type. auto &C = CGM.getContext(); QualType KmpRoutineEntryTyArgs[] = {KmpInt32Ty, C.VoidPtrTy}; FunctionProtoType::ExtProtoInfo EPI; KmpRoutineEntryPtrQTy = C.getPointerType( C.getFunctionType(KmpInt32Ty, KmpRoutineEntryTyArgs, EPI)); KmpRoutineEntryPtrTy = CGM.getTypes().ConvertType(KmpRoutineEntryPtrQTy); } } static FieldDecl *addFieldToRecordDecl(ASTContext &C, DeclContext *DC, QualType FieldTy) { auto *Field = FieldDecl::Create( C, DC, SourceLocation(), SourceLocation(), /*Id=*/nullptr, FieldTy, C.getTrivialTypeSourceInfo(FieldTy, SourceLocation()), /*BW=*/nullptr, /*Mutable=*/false, /*InitStyle=*/ICIS_NoInit); Field->setAccess(AS_public); DC->addDecl(Field); return Field; } QualType CGOpenMPRuntime::getTgtOffloadEntryQTy() { // Make sure the type of the entry is already created. This is the type we // have to create: // struct __tgt_offload_entry{ // void *addr; // Pointer to the offload entry info. // // (function or global) // char *name; // Name of the function or global. // size_t size; // Size of the entry info (0 if it a function). // int32_t flags; // Flags associated with the entry, e.g. 'link'. // int32_t reserved; // Reserved, to use by the runtime library. // }; if (TgtOffloadEntryQTy.isNull()) { ASTContext &C = CGM.getContext(); auto *RD = C.buildImplicitRecord("__tgt_offload_entry"); RD->startDefinition(); addFieldToRecordDecl(C, RD, C.VoidPtrTy); addFieldToRecordDecl(C, RD, C.getPointerType(C.CharTy)); addFieldToRecordDecl(C, RD, C.getSizeType()); addFieldToRecordDecl( C, RD, C.getIntTypeForBitwidth(/*DestWidth=*/32, /*Signed=*/true)); addFieldToRecordDecl( C, RD, C.getIntTypeForBitwidth(/*DestWidth=*/32, /*Signed=*/true)); RD->completeDefinition(); TgtOffloadEntryQTy = C.getRecordType(RD); } return TgtOffloadEntryQTy; } QualType CGOpenMPRuntime::getTgtDeviceImageQTy() { // These are the types we need to build: // struct __tgt_device_image{ // void *ImageStart; // Pointer to the target code start. // void *ImageEnd; // Pointer to the target code end. // // We also add the host entries to the device image, as it may be useful // // for the target runtime to have access to that information. // __tgt_offload_entry *EntriesBegin; // Begin of the table with all // // the entries. // __tgt_offload_entry *EntriesEnd; // End of the table with all the // // entries (non inclusive). // }; if (TgtDeviceImageQTy.isNull()) { ASTContext &C = CGM.getContext(); auto *RD = C.buildImplicitRecord("__tgt_device_image"); RD->startDefinition(); addFieldToRecordDecl(C, RD, C.VoidPtrTy); addFieldToRecordDecl(C, RD, C.VoidPtrTy); addFieldToRecordDecl(C, RD, C.getPointerType(getTgtOffloadEntryQTy())); addFieldToRecordDecl(C, RD, C.getPointerType(getTgtOffloadEntryQTy())); RD->completeDefinition(); TgtDeviceImageQTy = C.getRecordType(RD); } return TgtDeviceImageQTy; } QualType CGOpenMPRuntime::getTgtBinaryDescriptorQTy() { // struct __tgt_bin_desc{ // int32_t NumDevices; // Number of devices supported. // __tgt_device_image *DeviceImages; // Arrays of device images // // (one per device). // __tgt_offload_entry *EntriesBegin; // Begin of the table with all the // // entries. // __tgt_offload_entry *EntriesEnd; // End of the table with all the // // entries (non inclusive). // }; if (TgtBinaryDescriptorQTy.isNull()) { ASTContext &C = CGM.getContext(); auto *RD = C.buildImplicitRecord("__tgt_bin_desc"); RD->startDefinition(); addFieldToRecordDecl( C, RD, C.getIntTypeForBitwidth(/*DestWidth=*/32, /*Signed=*/true)); addFieldToRecordDecl(C, RD, C.getPointerType(getTgtDeviceImageQTy())); addFieldToRecordDecl(C, RD, C.getPointerType(getTgtOffloadEntryQTy())); addFieldToRecordDecl(C, RD, C.getPointerType(getTgtOffloadEntryQTy())); RD->completeDefinition(); TgtBinaryDescriptorQTy = C.getRecordType(RD); } return TgtBinaryDescriptorQTy; } namespace { struct PrivateHelpersTy { PrivateHelpersTy(const VarDecl *Original, const VarDecl *PrivateCopy, const VarDecl *PrivateElemInit) : Original(Original), PrivateCopy(PrivateCopy), PrivateElemInit(PrivateElemInit) {} const VarDecl *Original; const VarDecl *PrivateCopy; const VarDecl *PrivateElemInit; }; typedef std::pair PrivateDataTy; } // anonymous namespace static RecordDecl * createPrivatesRecordDecl(CodeGenModule &CGM, ArrayRef Privates) { if (!Privates.empty()) { auto &C = CGM.getContext(); // Build struct .kmp_privates_t. { // /* private vars */ // }; auto *RD = C.buildImplicitRecord(".kmp_privates.t"); RD->startDefinition(); for (auto &&Pair : Privates) { auto *VD = Pair.second.Original; auto Type = VD->getType(); Type = Type.getNonReferenceType(); auto *FD = addFieldToRecordDecl(C, RD, Type); if (VD->hasAttrs()) { for (specific_attr_iterator I(VD->getAttrs().begin()), E(VD->getAttrs().end()); I != E; ++I) FD->addAttr(*I); } } RD->completeDefinition(); return RD; } return nullptr; } static RecordDecl * createKmpTaskTRecordDecl(CodeGenModule &CGM, OpenMPDirectiveKind Kind, QualType KmpInt32Ty, QualType KmpRoutineEntryPointerQTy) { auto &C = CGM.getContext(); // Build struct kmp_task_t { // void * shareds; // kmp_routine_entry_t routine; // kmp_int32 part_id; // kmp_cmplrdata_t data1; // kmp_cmplrdata_t data2; // For taskloops additional fields: // kmp_uint64 lb; // kmp_uint64 ub; // kmp_int64 st; // kmp_int32 liter; // }; auto *UD = C.buildImplicitRecord("kmp_cmplrdata_t", TTK_Union); UD->startDefinition(); addFieldToRecordDecl(C, UD, KmpInt32Ty); addFieldToRecordDecl(C, UD, KmpRoutineEntryPointerQTy); UD->completeDefinition(); QualType KmpCmplrdataTy = C.getRecordType(UD); auto *RD = C.buildImplicitRecord("kmp_task_t"); RD->startDefinition(); addFieldToRecordDecl(C, RD, C.VoidPtrTy); addFieldToRecordDecl(C, RD, KmpRoutineEntryPointerQTy); addFieldToRecordDecl(C, RD, KmpInt32Ty); addFieldToRecordDecl(C, RD, KmpCmplrdataTy); addFieldToRecordDecl(C, RD, KmpCmplrdataTy); if (isOpenMPTaskLoopDirective(Kind)) { QualType KmpUInt64Ty = CGM.getContext().getIntTypeForBitwidth(/*DestWidth=*/64, /*Signed=*/0); QualType KmpInt64Ty = CGM.getContext().getIntTypeForBitwidth(/*DestWidth=*/64, /*Signed=*/1); addFieldToRecordDecl(C, RD, KmpUInt64Ty); addFieldToRecordDecl(C, RD, KmpUInt64Ty); addFieldToRecordDecl(C, RD, KmpInt64Ty); addFieldToRecordDecl(C, RD, KmpInt32Ty); } RD->completeDefinition(); return RD; } static RecordDecl * createKmpTaskTWithPrivatesRecordDecl(CodeGenModule &CGM, QualType KmpTaskTQTy, ArrayRef Privates) { auto &C = CGM.getContext(); // Build struct kmp_task_t_with_privates { // kmp_task_t task_data; // .kmp_privates_t. privates; // }; auto *RD = C.buildImplicitRecord("kmp_task_t_with_privates"); RD->startDefinition(); addFieldToRecordDecl(C, RD, KmpTaskTQTy); if (auto *PrivateRD = createPrivatesRecordDecl(CGM, Privates)) { addFieldToRecordDecl(C, RD, C.getRecordType(PrivateRD)); } RD->completeDefinition(); return RD; } /// \brief Emit a proxy function which accepts kmp_task_t as the second /// argument. /// \code /// kmp_int32 .omp_task_entry.(kmp_int32 gtid, kmp_task_t *tt) { /// TaskFunction(gtid, tt->part_id, &tt->privates, task_privates_map, tt, /// For taskloops: /// tt->task_data.lb, tt->task_data.ub, tt->task_data.st, tt->task_data.liter, /// tt->shareds); /// return 0; /// } /// \endcode static llvm::Value * emitProxyTaskFunction(CodeGenModule &CGM, SourceLocation Loc, OpenMPDirectiveKind Kind, QualType KmpInt32Ty, QualType KmpTaskTWithPrivatesPtrQTy, QualType KmpTaskTWithPrivatesQTy, QualType KmpTaskTQTy, QualType SharedsPtrTy, llvm::Value *TaskFunction, llvm::Value *TaskPrivatesMap) { auto &C = CGM.getContext(); FunctionArgList Args; ImplicitParamDecl GtidArg(C, /*DC=*/nullptr, Loc, /*Id=*/nullptr, KmpInt32Ty); ImplicitParamDecl TaskTypeArg(C, /*DC=*/nullptr, Loc, /*Id=*/nullptr, KmpTaskTWithPrivatesPtrQTy.withRestrict()); Args.push_back(&GtidArg); Args.push_back(&TaskTypeArg); auto &TaskEntryFnInfo = CGM.getTypes().arrangeBuiltinFunctionDeclaration(KmpInt32Ty, Args); auto *TaskEntryTy = CGM.getTypes().GetFunctionType(TaskEntryFnInfo); auto *TaskEntry = llvm::Function::Create(TaskEntryTy, llvm::GlobalValue::InternalLinkage, ".omp_task_entry.", &CGM.getModule()); CGM.SetInternalFunctionAttributes(/*D=*/nullptr, TaskEntry, TaskEntryFnInfo); CodeGenFunction CGF(CGM); CGF.disableDebugInfo(); CGF.StartFunction(GlobalDecl(), KmpInt32Ty, TaskEntry, TaskEntryFnInfo, Args); // TaskFunction(gtid, tt->task_data.part_id, &tt->privates, task_privates_map, // tt, // For taskloops: // tt->task_data.lb, tt->task_data.ub, tt->task_data.st, tt->task_data.liter, // tt->task_data.shareds); auto *GtidParam = CGF.EmitLoadOfScalar( CGF.GetAddrOfLocalVar(&GtidArg), /*Volatile=*/false, KmpInt32Ty, Loc); LValue TDBase = CGF.EmitLoadOfPointerLValue( CGF.GetAddrOfLocalVar(&TaskTypeArg), KmpTaskTWithPrivatesPtrQTy->castAs()); auto *KmpTaskTWithPrivatesQTyRD = cast(KmpTaskTWithPrivatesQTy->getAsTagDecl()); LValue Base = CGF.EmitLValueForField(TDBase, *KmpTaskTWithPrivatesQTyRD->field_begin()); auto *KmpTaskTQTyRD = cast(KmpTaskTQTy->getAsTagDecl()); auto PartIdFI = std::next(KmpTaskTQTyRD->field_begin(), KmpTaskTPartId); auto PartIdLVal = CGF.EmitLValueForField(Base, *PartIdFI); auto *PartidParam = PartIdLVal.getPointer(); auto SharedsFI = std::next(KmpTaskTQTyRD->field_begin(), KmpTaskTShareds); auto SharedsLVal = CGF.EmitLValueForField(Base, *SharedsFI); auto *SharedsParam = CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( CGF.EmitLoadOfLValue(SharedsLVal, Loc).getScalarVal(), CGF.ConvertTypeForMem(SharedsPtrTy)); auto PrivatesFI = std::next(KmpTaskTWithPrivatesQTyRD->field_begin(), 1); llvm::Value *PrivatesParam; if (PrivatesFI != KmpTaskTWithPrivatesQTyRD->field_end()) { auto PrivatesLVal = CGF.EmitLValueForField(TDBase, *PrivatesFI); PrivatesParam = CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( PrivatesLVal.getPointer(), CGF.VoidPtrTy); } else PrivatesParam = llvm::ConstantPointerNull::get(CGF.VoidPtrTy); llvm::Value *CommonArgs[] = {GtidParam, PartidParam, PrivatesParam, TaskPrivatesMap, CGF.Builder .CreatePointerBitCastOrAddrSpaceCast( TDBase.getAddress(), CGF.VoidPtrTy) .getPointer()}; SmallVector CallArgs(std::begin(CommonArgs), std::end(CommonArgs)); if (isOpenMPTaskLoopDirective(Kind)) { auto LBFI = std::next(KmpTaskTQTyRD->field_begin(), KmpTaskTLowerBound); auto LBLVal = CGF.EmitLValueForField(Base, *LBFI); auto *LBParam = CGF.EmitLoadOfLValue(LBLVal, Loc).getScalarVal(); auto UBFI = std::next(KmpTaskTQTyRD->field_begin(), KmpTaskTUpperBound); auto UBLVal = CGF.EmitLValueForField(Base, *UBFI); auto *UBParam = CGF.EmitLoadOfLValue(UBLVal, Loc).getScalarVal(); auto StFI = std::next(KmpTaskTQTyRD->field_begin(), KmpTaskTStride); auto StLVal = CGF.EmitLValueForField(Base, *StFI); auto *StParam = CGF.EmitLoadOfLValue(StLVal, Loc).getScalarVal(); auto LIFI = std::next(KmpTaskTQTyRD->field_begin(), KmpTaskTLastIter); auto LILVal = CGF.EmitLValueForField(Base, *LIFI); auto *LIParam = CGF.EmitLoadOfLValue(LILVal, Loc).getScalarVal(); CallArgs.push_back(LBParam); CallArgs.push_back(UBParam); CallArgs.push_back(StParam); CallArgs.push_back(LIParam); } CallArgs.push_back(SharedsParam); CGF.EmitCallOrInvoke(TaskFunction, CallArgs); CGF.EmitStoreThroughLValue( RValue::get(CGF.Builder.getInt32(/*C=*/0)), CGF.MakeAddrLValue(CGF.ReturnValue, KmpInt32Ty)); CGF.FinishFunction(); return TaskEntry; } static llvm::Value *emitDestructorsFunction(CodeGenModule &CGM, SourceLocation Loc, QualType KmpInt32Ty, QualType KmpTaskTWithPrivatesPtrQTy, QualType KmpTaskTWithPrivatesQTy) { auto &C = CGM.getContext(); FunctionArgList Args; ImplicitParamDecl GtidArg(C, /*DC=*/nullptr, Loc, /*Id=*/nullptr, KmpInt32Ty); ImplicitParamDecl TaskTypeArg(C, /*DC=*/nullptr, Loc, /*Id=*/nullptr, KmpTaskTWithPrivatesPtrQTy.withRestrict()); Args.push_back(&GtidArg); Args.push_back(&TaskTypeArg); FunctionType::ExtInfo Info; auto &DestructorFnInfo = CGM.getTypes().arrangeBuiltinFunctionDeclaration(KmpInt32Ty, Args); auto *DestructorFnTy = CGM.getTypes().GetFunctionType(DestructorFnInfo); auto *DestructorFn = llvm::Function::Create(DestructorFnTy, llvm::GlobalValue::InternalLinkage, ".omp_task_destructor.", &CGM.getModule()); CGM.SetInternalFunctionAttributes(/*D=*/nullptr, DestructorFn, DestructorFnInfo); CodeGenFunction CGF(CGM); CGF.disableDebugInfo(); CGF.StartFunction(GlobalDecl(), KmpInt32Ty, DestructorFn, DestructorFnInfo, Args); LValue Base = CGF.EmitLoadOfPointerLValue( CGF.GetAddrOfLocalVar(&TaskTypeArg), KmpTaskTWithPrivatesPtrQTy->castAs()); auto *KmpTaskTWithPrivatesQTyRD = cast(KmpTaskTWithPrivatesQTy->getAsTagDecl()); auto FI = std::next(KmpTaskTWithPrivatesQTyRD->field_begin()); Base = CGF.EmitLValueForField(Base, *FI); for (auto *Field : cast(FI->getType()->getAsTagDecl())->fields()) { if (auto DtorKind = Field->getType().isDestructedType()) { auto FieldLValue = CGF.EmitLValueForField(Base, Field); CGF.pushDestroy(DtorKind, FieldLValue.getAddress(), Field->getType()); } } CGF.FinishFunction(); return DestructorFn; } /// \brief Emit a privates mapping function for correct handling of private and /// firstprivate variables. /// \code /// void .omp_task_privates_map.(const .privates. *noalias privs, /// **noalias priv1,..., **noalias privn) { /// *priv1 = &.privates.priv1; /// ...; /// *privn = &.privates.privn; /// } /// \endcode static llvm::Value * emitTaskPrivateMappingFunction(CodeGenModule &CGM, SourceLocation Loc, ArrayRef PrivateVars, ArrayRef FirstprivateVars, ArrayRef LastprivateVars, QualType PrivatesQTy, ArrayRef Privates) { auto &C = CGM.getContext(); FunctionArgList Args; ImplicitParamDecl TaskPrivatesArg( C, /*DC=*/nullptr, Loc, /*Id=*/nullptr, C.getPointerType(PrivatesQTy).withConst().withRestrict()); Args.push_back(&TaskPrivatesArg); llvm::DenseMap PrivateVarsPos; unsigned Counter = 1; for (auto *E: PrivateVars) { Args.push_back(ImplicitParamDecl::Create( C, /*DC=*/nullptr, Loc, /*Id=*/nullptr, C.getPointerType(C.getPointerType(E->getType())) .withConst() .withRestrict())); auto *VD = cast(cast(E)->getDecl()); PrivateVarsPos[VD] = Counter; ++Counter; } for (auto *E : FirstprivateVars) { Args.push_back(ImplicitParamDecl::Create( C, /*DC=*/nullptr, Loc, /*Id=*/nullptr, C.getPointerType(C.getPointerType(E->getType())) .withConst() .withRestrict())); auto *VD = cast(cast(E)->getDecl()); PrivateVarsPos[VD] = Counter; ++Counter; } for (auto *E: LastprivateVars) { Args.push_back(ImplicitParamDecl::Create( C, /*DC=*/nullptr, Loc, /*Id=*/nullptr, C.getPointerType(C.getPointerType(E->getType())) .withConst() .withRestrict())); auto *VD = cast(cast(E)->getDecl()); PrivateVarsPos[VD] = Counter; ++Counter; } auto &TaskPrivatesMapFnInfo = CGM.getTypes().arrangeBuiltinFunctionDeclaration(C.VoidTy, Args); auto *TaskPrivatesMapTy = CGM.getTypes().GetFunctionType(TaskPrivatesMapFnInfo); auto *TaskPrivatesMap = llvm::Function::Create( TaskPrivatesMapTy, llvm::GlobalValue::InternalLinkage, ".omp_task_privates_map.", &CGM.getModule()); CGM.SetInternalFunctionAttributes(/*D=*/nullptr, TaskPrivatesMap, TaskPrivatesMapFnInfo); TaskPrivatesMap->removeFnAttr(llvm::Attribute::NoInline); TaskPrivatesMap->addFnAttr(llvm::Attribute::AlwaysInline); CodeGenFunction CGF(CGM); CGF.disableDebugInfo(); CGF.StartFunction(GlobalDecl(), C.VoidTy, TaskPrivatesMap, TaskPrivatesMapFnInfo, Args); // *privi = &.privates.privi; LValue Base = CGF.EmitLoadOfPointerLValue( CGF.GetAddrOfLocalVar(&TaskPrivatesArg), TaskPrivatesArg.getType()->castAs()); auto *PrivatesQTyRD = cast(PrivatesQTy->getAsTagDecl()); Counter = 0; for (auto *Field : PrivatesQTyRD->fields()) { auto FieldLVal = CGF.EmitLValueForField(Base, Field); auto *VD = Args[PrivateVarsPos[Privates[Counter].second.Original]]; auto RefLVal = CGF.MakeAddrLValue(CGF.GetAddrOfLocalVar(VD), VD->getType()); auto RefLoadLVal = CGF.EmitLoadOfPointerLValue( RefLVal.getAddress(), RefLVal.getType()->castAs()); CGF.EmitStoreOfScalar(FieldLVal.getPointer(), RefLoadLVal); ++Counter; } CGF.FinishFunction(); return TaskPrivatesMap; } static int array_pod_sort_comparator(const PrivateDataTy *P1, const PrivateDataTy *P2) { return P1->first < P2->first ? 1 : (P2->first < P1->first ? -1 : 0); } /// Emit initialization for private variables in task-based directives. static void emitPrivatesInit(CodeGenFunction &CGF, const OMPExecutableDirective &D, Address KmpTaskSharedsPtr, LValue TDBase, const RecordDecl *KmpTaskTWithPrivatesQTyRD, QualType SharedsTy, QualType SharedsPtrTy, const OMPTaskDataTy &Data, ArrayRef Privates, bool ForDup) { auto &C = CGF.getContext(); auto FI = std::next(KmpTaskTWithPrivatesQTyRD->field_begin()); LValue PrivatesBase = CGF.EmitLValueForField(TDBase, *FI); LValue SrcBase; if (!Data.FirstprivateVars.empty()) { SrcBase = CGF.MakeAddrLValue( CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( KmpTaskSharedsPtr, CGF.ConvertTypeForMem(SharedsPtrTy)), SharedsTy); } CodeGenFunction::CGCapturedStmtInfo CapturesInfo( cast(*D.getAssociatedStmt())); FI = cast(FI->getType()->getAsTagDecl())->field_begin(); for (auto &&Pair : Privates) { auto *VD = Pair.second.PrivateCopy; auto *Init = VD->getAnyInitializer(); if (Init && (!ForDup || (isa(Init) && !CGF.isTrivialInitializer(Init)))) { LValue PrivateLValue = CGF.EmitLValueForField(PrivatesBase, *FI); if (auto *Elem = Pair.second.PrivateElemInit) { auto *OriginalVD = Pair.second.Original; auto *SharedField = CapturesInfo.lookup(OriginalVD); auto SharedRefLValue = CGF.EmitLValueForField(SrcBase, SharedField); SharedRefLValue = CGF.MakeAddrLValue( Address(SharedRefLValue.getPointer(), C.getDeclAlign(OriginalVD)), SharedRefLValue.getType(), AlignmentSource::Decl); QualType Type = OriginalVD->getType(); if (Type->isArrayType()) { // Initialize firstprivate array. if (!isa(Init) || CGF.isTrivialInitializer(Init)) { // Perform simple memcpy. CGF.EmitAggregateAssign(PrivateLValue.getAddress(), SharedRefLValue.getAddress(), Type); } else { // Initialize firstprivate array using element-by-element // intialization. CGF.EmitOMPAggregateAssign( PrivateLValue.getAddress(), SharedRefLValue.getAddress(), Type, [&CGF, Elem, Init, &CapturesInfo](Address DestElement, Address SrcElement) { // Clean up any temporaries needed by the initialization. CodeGenFunction::OMPPrivateScope InitScope(CGF); InitScope.addPrivate( Elem, [SrcElement]() -> Address { return SrcElement; }); (void)InitScope.Privatize(); // Emit initialization for single element. CodeGenFunction::CGCapturedStmtRAII CapInfoRAII( CGF, &CapturesInfo); CGF.EmitAnyExprToMem(Init, DestElement, Init->getType().getQualifiers(), /*IsInitializer=*/false); }); } } else { CodeGenFunction::OMPPrivateScope InitScope(CGF); InitScope.addPrivate(Elem, [SharedRefLValue]() -> Address { return SharedRefLValue.getAddress(); }); (void)InitScope.Privatize(); CodeGenFunction::CGCapturedStmtRAII CapInfoRAII(CGF, &CapturesInfo); CGF.EmitExprAsInit(Init, VD, PrivateLValue, /*capturedByInit=*/false); } } else CGF.EmitExprAsInit(Init, VD, PrivateLValue, /*capturedByInit=*/false); } ++FI; } } /// Check if duplication function is required for taskloops. static bool checkInitIsRequired(CodeGenFunction &CGF, ArrayRef Privates) { bool InitRequired = false; for (auto &&Pair : Privates) { auto *VD = Pair.second.PrivateCopy; auto *Init = VD->getAnyInitializer(); InitRequired = InitRequired || (Init && isa(Init) && !CGF.isTrivialInitializer(Init)); } return InitRequired; } /// Emit task_dup function (for initialization of /// private/firstprivate/lastprivate vars and last_iter flag) /// \code /// void __task_dup_entry(kmp_task_t *task_dst, const kmp_task_t *task_src, int /// lastpriv) { /// // setup lastprivate flag /// task_dst->last = lastpriv; /// // could be constructor calls here... /// } /// \endcode static llvm::Value * emitTaskDupFunction(CodeGenModule &CGM, SourceLocation Loc, const OMPExecutableDirective &D, QualType KmpTaskTWithPrivatesPtrQTy, const RecordDecl *KmpTaskTWithPrivatesQTyRD, const RecordDecl *KmpTaskTQTyRD, QualType SharedsTy, QualType SharedsPtrTy, const OMPTaskDataTy &Data, ArrayRef Privates, bool WithLastIter) { auto &C = CGM.getContext(); FunctionArgList Args; ImplicitParamDecl DstArg(C, /*DC=*/nullptr, Loc, /*Id=*/nullptr, KmpTaskTWithPrivatesPtrQTy); ImplicitParamDecl SrcArg(C, /*DC=*/nullptr, Loc, /*Id=*/nullptr, KmpTaskTWithPrivatesPtrQTy); ImplicitParamDecl LastprivArg(C, /*DC=*/nullptr, Loc, /*Id=*/nullptr, C.IntTy); Args.push_back(&DstArg); Args.push_back(&SrcArg); Args.push_back(&LastprivArg); auto &TaskDupFnInfo = CGM.getTypes().arrangeBuiltinFunctionDeclaration(C.VoidTy, Args); auto *TaskDupTy = CGM.getTypes().GetFunctionType(TaskDupFnInfo); auto *TaskDup = llvm::Function::Create(TaskDupTy, llvm::GlobalValue::InternalLinkage, ".omp_task_dup.", &CGM.getModule()); CGM.SetInternalFunctionAttributes(/*D=*/nullptr, TaskDup, TaskDupFnInfo); CodeGenFunction CGF(CGM); CGF.disableDebugInfo(); CGF.StartFunction(GlobalDecl(), C.VoidTy, TaskDup, TaskDupFnInfo, Args); LValue TDBase = CGF.EmitLoadOfPointerLValue( CGF.GetAddrOfLocalVar(&DstArg), KmpTaskTWithPrivatesPtrQTy->castAs()); // task_dst->liter = lastpriv; if (WithLastIter) { auto LIFI = std::next(KmpTaskTQTyRD->field_begin(), KmpTaskTLastIter); LValue Base = CGF.EmitLValueForField( TDBase, *KmpTaskTWithPrivatesQTyRD->field_begin()); LValue LILVal = CGF.EmitLValueForField(Base, *LIFI); llvm::Value *Lastpriv = CGF.EmitLoadOfScalar( CGF.GetAddrOfLocalVar(&LastprivArg), /*Volatile=*/false, C.IntTy, Loc); CGF.EmitStoreOfScalar(Lastpriv, LILVal); } // Emit initial values for private copies (if any). assert(!Privates.empty()); Address KmpTaskSharedsPtr = Address::invalid(); if (!Data.FirstprivateVars.empty()) { LValue TDBase = CGF.EmitLoadOfPointerLValue( CGF.GetAddrOfLocalVar(&SrcArg), KmpTaskTWithPrivatesPtrQTy->castAs()); LValue Base = CGF.EmitLValueForField( TDBase, *KmpTaskTWithPrivatesQTyRD->field_begin()); KmpTaskSharedsPtr = Address( CGF.EmitLoadOfScalar(CGF.EmitLValueForField( Base, *std::next(KmpTaskTQTyRD->field_begin(), KmpTaskTShareds)), Loc), CGF.getNaturalTypeAlignment(SharedsTy)); } emitPrivatesInit(CGF, D, KmpTaskSharedsPtr, TDBase, KmpTaskTWithPrivatesQTyRD, SharedsTy, SharedsPtrTy, Data, Privates, /*ForDup=*/true); CGF.FinishFunction(); return TaskDup; } /// Checks if destructor function is required to be generated. /// \return true if cleanups are required, false otherwise. static bool checkDestructorsRequired(const RecordDecl *KmpTaskTWithPrivatesQTyRD) { bool NeedsCleanup = false; auto FI = std::next(KmpTaskTWithPrivatesQTyRD->field_begin()); auto *PrivateRD = cast(FI->getType()->getAsTagDecl()); for (auto *FD : PrivateRD->fields()) { NeedsCleanup = NeedsCleanup || FD->getType().isDestructedType(); if (NeedsCleanup) break; } return NeedsCleanup; } CGOpenMPRuntime::TaskResultTy CGOpenMPRuntime::emitTaskInit(CodeGenFunction &CGF, SourceLocation Loc, const OMPExecutableDirective &D, llvm::Value *TaskFunction, QualType SharedsTy, Address Shareds, const OMPTaskDataTy &Data) { auto &C = CGM.getContext(); llvm::SmallVector Privates; // Aggregate privates and sort them by the alignment. auto I = Data.PrivateCopies.begin(); for (auto *E : Data.PrivateVars) { auto *VD = cast(cast(E)->getDecl()); Privates.push_back(std::make_pair( C.getDeclAlign(VD), PrivateHelpersTy(VD, cast(cast(*I)->getDecl()), /*PrivateElemInit=*/nullptr))); ++I; } I = Data.FirstprivateCopies.begin(); auto IElemInitRef = Data.FirstprivateInits.begin(); for (auto *E : Data.FirstprivateVars) { auto *VD = cast(cast(E)->getDecl()); Privates.push_back(std::make_pair( C.getDeclAlign(VD), PrivateHelpersTy( VD, cast(cast(*I)->getDecl()), cast(cast(*IElemInitRef)->getDecl())))); ++I; ++IElemInitRef; } I = Data.LastprivateCopies.begin(); for (auto *E : Data.LastprivateVars) { auto *VD = cast(cast(E)->getDecl()); Privates.push_back(std::make_pair( C.getDeclAlign(VD), PrivateHelpersTy(VD, cast(cast(*I)->getDecl()), /*PrivateElemInit=*/nullptr))); ++I; } llvm::array_pod_sort(Privates.begin(), Privates.end(), array_pod_sort_comparator); auto KmpInt32Ty = C.getIntTypeForBitwidth(/*DestWidth=*/32, /*Signed=*/1); // Build type kmp_routine_entry_t (if not built yet). emitKmpRoutineEntryT(KmpInt32Ty); // Build type kmp_task_t (if not built yet). if (KmpTaskTQTy.isNull()) { KmpTaskTQTy = C.getRecordType(createKmpTaskTRecordDecl( CGM, D.getDirectiveKind(), KmpInt32Ty, KmpRoutineEntryPtrQTy)); } auto *KmpTaskTQTyRD = cast(KmpTaskTQTy->getAsTagDecl()); // Build particular struct kmp_task_t for the given task. auto *KmpTaskTWithPrivatesQTyRD = createKmpTaskTWithPrivatesRecordDecl(CGM, KmpTaskTQTy, Privates); auto KmpTaskTWithPrivatesQTy = C.getRecordType(KmpTaskTWithPrivatesQTyRD); QualType KmpTaskTWithPrivatesPtrQTy = C.getPointerType(KmpTaskTWithPrivatesQTy); auto *KmpTaskTWithPrivatesTy = CGF.ConvertType(KmpTaskTWithPrivatesQTy); auto *KmpTaskTWithPrivatesPtrTy = KmpTaskTWithPrivatesTy->getPointerTo(); auto *KmpTaskTWithPrivatesTySize = CGF.getTypeSize(KmpTaskTWithPrivatesQTy); QualType SharedsPtrTy = C.getPointerType(SharedsTy); // Emit initial values for private copies (if any). llvm::Value *TaskPrivatesMap = nullptr; auto *TaskPrivatesMapTy = std::next(cast(TaskFunction)->getArgumentList().begin(), 3) ->getType(); if (!Privates.empty()) { auto FI = std::next(KmpTaskTWithPrivatesQTyRD->field_begin()); TaskPrivatesMap = emitTaskPrivateMappingFunction( CGM, Loc, Data.PrivateVars, Data.FirstprivateVars, Data.LastprivateVars, FI->getType(), Privates); TaskPrivatesMap = CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( TaskPrivatesMap, TaskPrivatesMapTy); } else { TaskPrivatesMap = llvm::ConstantPointerNull::get( cast(TaskPrivatesMapTy)); } // Build a proxy function kmp_int32 .omp_task_entry.(kmp_int32 gtid, // kmp_task_t *tt); auto *TaskEntry = emitProxyTaskFunction( CGM, Loc, D.getDirectiveKind(), KmpInt32Ty, KmpTaskTWithPrivatesPtrQTy, KmpTaskTWithPrivatesQTy, KmpTaskTQTy, SharedsPtrTy, TaskFunction, TaskPrivatesMap); // Build call kmp_task_t * __kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, // kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, // kmp_routine_entry_t *task_entry); // Task flags. Format is taken from // http://llvm.org/svn/llvm-project/openmp/trunk/runtime/src/kmp.h, // description of kmp_tasking_flags struct. enum { TiedFlag = 0x1, FinalFlag = 0x2, DestructorsFlag = 0x8, PriorityFlag = 0x20 }; unsigned Flags = Data.Tied ? TiedFlag : 0; bool NeedsCleanup = false; if (!Privates.empty()) { NeedsCleanup = checkDestructorsRequired(KmpTaskTWithPrivatesQTyRD); if (NeedsCleanup) Flags = Flags | DestructorsFlag; } if (Data.Priority.getInt()) Flags = Flags | PriorityFlag; auto *TaskFlags = Data.Final.getPointer() ? CGF.Builder.CreateSelect(Data.Final.getPointer(), CGF.Builder.getInt32(FinalFlag), CGF.Builder.getInt32(/*C=*/0)) : CGF.Builder.getInt32(Data.Final.getInt() ? FinalFlag : 0); TaskFlags = CGF.Builder.CreateOr(TaskFlags, CGF.Builder.getInt32(Flags)); auto *SharedsSize = CGM.getSize(C.getTypeSizeInChars(SharedsTy)); llvm::Value *AllocArgs[] = {emitUpdateLocation(CGF, Loc), getThreadID(CGF, Loc), TaskFlags, KmpTaskTWithPrivatesTySize, SharedsSize, CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( TaskEntry, KmpRoutineEntryPtrTy)}; auto *NewTask = CGF.EmitRuntimeCall( createRuntimeFunction(OMPRTL__kmpc_omp_task_alloc), AllocArgs); auto *NewTaskNewTaskTTy = CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( NewTask, KmpTaskTWithPrivatesPtrTy); LValue Base = CGF.MakeNaturalAlignAddrLValue(NewTaskNewTaskTTy, KmpTaskTWithPrivatesQTy); LValue TDBase = CGF.EmitLValueForField(Base, *KmpTaskTWithPrivatesQTyRD->field_begin()); // Fill the data in the resulting kmp_task_t record. // Copy shareds if there are any. Address KmpTaskSharedsPtr = Address::invalid(); if (!SharedsTy->getAsStructureType()->getDecl()->field_empty()) { KmpTaskSharedsPtr = Address(CGF.EmitLoadOfScalar( CGF.EmitLValueForField( TDBase, *std::next(KmpTaskTQTyRD->field_begin(), KmpTaskTShareds)), Loc), CGF.getNaturalTypeAlignment(SharedsTy)); CGF.EmitAggregateCopy(KmpTaskSharedsPtr, Shareds, SharedsTy); } // Emit initial values for private copies (if any). TaskResultTy Result; if (!Privates.empty()) { emitPrivatesInit(CGF, D, KmpTaskSharedsPtr, Base, KmpTaskTWithPrivatesQTyRD, SharedsTy, SharedsPtrTy, Data, Privates, /*ForDup=*/false); if (isOpenMPTaskLoopDirective(D.getDirectiveKind()) && (!Data.LastprivateVars.empty() || checkInitIsRequired(CGF, Privates))) { Result.TaskDupFn = emitTaskDupFunction( CGM, Loc, D, KmpTaskTWithPrivatesPtrQTy, KmpTaskTWithPrivatesQTyRD, KmpTaskTQTyRD, SharedsTy, SharedsPtrTy, Data, Privates, /*WithLastIter=*/!Data.LastprivateVars.empty()); } } // Fields of union "kmp_cmplrdata_t" for destructors and priority. enum { Priority = 0, Destructors = 1 }; // Provide pointer to function with destructors for privates. auto FI = std::next(KmpTaskTQTyRD->field_begin(), Data1); auto *KmpCmplrdataUD = (*FI)->getType()->getAsUnionType()->getDecl(); if (NeedsCleanup) { llvm::Value *DestructorFn = emitDestructorsFunction( CGM, Loc, KmpInt32Ty, KmpTaskTWithPrivatesPtrQTy, KmpTaskTWithPrivatesQTy); LValue Data1LV = CGF.EmitLValueForField(TDBase, *FI); LValue DestructorsLV = CGF.EmitLValueForField( Data1LV, *std::next(KmpCmplrdataUD->field_begin(), Destructors)); CGF.EmitStoreOfScalar(CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( DestructorFn, KmpRoutineEntryPtrTy), DestructorsLV); } // Set priority. if (Data.Priority.getInt()) { LValue Data2LV = CGF.EmitLValueForField( TDBase, *std::next(KmpTaskTQTyRD->field_begin(), Data2)); LValue PriorityLV = CGF.EmitLValueForField( Data2LV, *std::next(KmpCmplrdataUD->field_begin(), Priority)); CGF.EmitStoreOfScalar(Data.Priority.getPointer(), PriorityLV); } Result.NewTask = NewTask; Result.TaskEntry = TaskEntry; Result.NewTaskNewTaskTTy = NewTaskNewTaskTTy; Result.TDBase = TDBase; Result.KmpTaskTQTyRD = KmpTaskTQTyRD; return Result; } void CGOpenMPRuntime::emitTaskCall(CodeGenFunction &CGF, SourceLocation Loc, const OMPExecutableDirective &D, llvm::Value *TaskFunction, QualType SharedsTy, Address Shareds, const Expr *IfCond, const OMPTaskDataTy &Data) { if (!CGF.HaveInsertPoint()) return; TaskResultTy Result = emitTaskInit(CGF, Loc, D, TaskFunction, SharedsTy, Shareds, Data); llvm::Value *NewTask = Result.NewTask; llvm::Value *TaskEntry = Result.TaskEntry; llvm::Value *NewTaskNewTaskTTy = Result.NewTaskNewTaskTTy; LValue TDBase = Result.TDBase; RecordDecl *KmpTaskTQTyRD = Result.KmpTaskTQTyRD; auto &C = CGM.getContext(); // Process list of dependences. Address DependenciesArray = Address::invalid(); unsigned NumDependencies = Data.Dependences.size(); if (NumDependencies) { // Dependence kind for RTL. enum RTLDependenceKindTy { DepIn = 0x01, DepInOut = 0x3 }; enum RTLDependInfoFieldsTy { BaseAddr, Len, Flags }; RecordDecl *KmpDependInfoRD; QualType FlagsTy = C.getIntTypeForBitwidth(C.getTypeSize(C.BoolTy), /*Signed=*/false); llvm::Type *LLVMFlagsTy = CGF.ConvertTypeForMem(FlagsTy); if (KmpDependInfoTy.isNull()) { KmpDependInfoRD = C.buildImplicitRecord("kmp_depend_info"); KmpDependInfoRD->startDefinition(); addFieldToRecordDecl(C, KmpDependInfoRD, C.getIntPtrType()); addFieldToRecordDecl(C, KmpDependInfoRD, C.getSizeType()); addFieldToRecordDecl(C, KmpDependInfoRD, FlagsTy); KmpDependInfoRD->completeDefinition(); KmpDependInfoTy = C.getRecordType(KmpDependInfoRD); } else KmpDependInfoRD = cast(KmpDependInfoTy->getAsTagDecl()); CharUnits DependencySize = C.getTypeSizeInChars(KmpDependInfoTy); // Define type kmp_depend_info[]; QualType KmpDependInfoArrayTy = C.getConstantArrayType( KmpDependInfoTy, llvm::APInt(/*numBits=*/64, NumDependencies), ArrayType::Normal, /*IndexTypeQuals=*/0); // kmp_depend_info[] deps; DependenciesArray = CGF.CreateMemTemp(KmpDependInfoArrayTy, ".dep.arr.addr"); for (unsigned i = 0; i < NumDependencies; ++i) { const Expr *E = Data.Dependences[i].second; auto Addr = CGF.EmitLValue(E); llvm::Value *Size; QualType Ty = E->getType(); if (auto *ASE = dyn_cast(E->IgnoreParenImpCasts())) { LValue UpAddrLVal = CGF.EmitOMPArraySectionExpr(ASE, /*LowerBound=*/false); llvm::Value *UpAddr = CGF.Builder.CreateConstGEP1_32(UpAddrLVal.getPointer(), /*Idx0=*/1); llvm::Value *LowIntPtr = CGF.Builder.CreatePtrToInt(Addr.getPointer(), CGM.SizeTy); llvm::Value *UpIntPtr = CGF.Builder.CreatePtrToInt(UpAddr, CGM.SizeTy); Size = CGF.Builder.CreateNUWSub(UpIntPtr, LowIntPtr); } else Size = CGF.getTypeSize(Ty); auto Base = CGF.MakeAddrLValue( CGF.Builder.CreateConstArrayGEP(DependenciesArray, i, DependencySize), KmpDependInfoTy); // deps[i].base_addr = &; auto BaseAddrLVal = CGF.EmitLValueForField( Base, *std::next(KmpDependInfoRD->field_begin(), BaseAddr)); CGF.EmitStoreOfScalar( CGF.Builder.CreatePtrToInt(Addr.getPointer(), CGF.IntPtrTy), BaseAddrLVal); // deps[i].len = sizeof(); auto LenLVal = CGF.EmitLValueForField( Base, *std::next(KmpDependInfoRD->field_begin(), Len)); CGF.EmitStoreOfScalar(Size, LenLVal); // deps[i].flags = ; RTLDependenceKindTy DepKind; switch (Data.Dependences[i].first) { case OMPC_DEPEND_in: DepKind = DepIn; break; // Out and InOut dependencies must use the same code. case OMPC_DEPEND_out: case OMPC_DEPEND_inout: DepKind = DepInOut; break; case OMPC_DEPEND_source: case OMPC_DEPEND_sink: case OMPC_DEPEND_unknown: llvm_unreachable("Unknown task dependence type"); } auto FlagsLVal = CGF.EmitLValueForField( Base, *std::next(KmpDependInfoRD->field_begin(), Flags)); CGF.EmitStoreOfScalar(llvm::ConstantInt::get(LLVMFlagsTy, DepKind), FlagsLVal); } DependenciesArray = CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( CGF.Builder.CreateStructGEP(DependenciesArray, 0, CharUnits::Zero()), CGF.VoidPtrTy); } // NOTE: routine and part_id fields are intialized by __kmpc_omp_task_alloc() // libcall. // Build kmp_int32 __kmpc_omp_task_with_deps(ident_t *, kmp_int32 gtid, // kmp_task_t *new_task, kmp_int32 ndeps, kmp_depend_info_t *dep_list, // kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) if dependence // list is not empty auto *ThreadID = getThreadID(CGF, Loc); auto *UpLoc = emitUpdateLocation(CGF, Loc); llvm::Value *TaskArgs[] = { UpLoc, ThreadID, NewTask }; llvm::Value *DepTaskArgs[7]; if (NumDependencies) { DepTaskArgs[0] = UpLoc; DepTaskArgs[1] = ThreadID; DepTaskArgs[2] = NewTask; DepTaskArgs[3] = CGF.Builder.getInt32(NumDependencies); DepTaskArgs[4] = DependenciesArray.getPointer(); DepTaskArgs[5] = CGF.Builder.getInt32(0); DepTaskArgs[6] = llvm::ConstantPointerNull::get(CGF.VoidPtrTy); } auto &&ThenCodeGen = [this, Loc, &Data, TDBase, KmpTaskTQTyRD, NumDependencies, &TaskArgs, &DepTaskArgs](CodeGenFunction &CGF, PrePostActionTy &) { if (!Data.Tied) { auto PartIdFI = std::next(KmpTaskTQTyRD->field_begin(), KmpTaskTPartId); auto PartIdLVal = CGF.EmitLValueForField(TDBase, *PartIdFI); CGF.EmitStoreOfScalar(CGF.Builder.getInt32(0), PartIdLVal); } if (NumDependencies) { CGF.EmitRuntimeCall( createRuntimeFunction(OMPRTL__kmpc_omp_task_with_deps), DepTaskArgs); } else { CGF.EmitRuntimeCall(createRuntimeFunction(OMPRTL__kmpc_omp_task), TaskArgs); } // Check if parent region is untied and build return for untied task; if (auto *Region = dyn_cast_or_null(CGF.CapturedStmtInfo)) Region->emitUntiedSwitch(CGF); }; llvm::Value *DepWaitTaskArgs[6]; if (NumDependencies) { DepWaitTaskArgs[0] = UpLoc; DepWaitTaskArgs[1] = ThreadID; DepWaitTaskArgs[2] = CGF.Builder.getInt32(NumDependencies); DepWaitTaskArgs[3] = DependenciesArray.getPointer(); DepWaitTaskArgs[4] = CGF.Builder.getInt32(0); DepWaitTaskArgs[5] = llvm::ConstantPointerNull::get(CGF.VoidPtrTy); } auto &&ElseCodeGen = [&TaskArgs, ThreadID, NewTaskNewTaskTTy, TaskEntry, NumDependencies, &DepWaitTaskArgs](CodeGenFunction &CGF, PrePostActionTy &) { auto &RT = CGF.CGM.getOpenMPRuntime(); CodeGenFunction::RunCleanupsScope LocalScope(CGF); // Build void __kmpc_omp_wait_deps(ident_t *, kmp_int32 gtid, // kmp_int32 ndeps, kmp_depend_info_t *dep_list, kmp_int32 // ndeps_noalias, kmp_depend_info_t *noalias_dep_list); if dependence info // is specified. if (NumDependencies) CGF.EmitRuntimeCall(RT.createRuntimeFunction(OMPRTL__kmpc_omp_wait_deps), DepWaitTaskArgs); // Call proxy_task_entry(gtid, new_task); auto &&CodeGen = [TaskEntry, ThreadID, NewTaskNewTaskTTy]( CodeGenFunction &CGF, PrePostActionTy &Action) { Action.Enter(CGF); llvm::Value *OutlinedFnArgs[] = {ThreadID, NewTaskNewTaskTTy}; CGF.EmitCallOrInvoke(TaskEntry, OutlinedFnArgs); }; // Build void __kmpc_omp_task_begin_if0(ident_t *, kmp_int32 gtid, // kmp_task_t *new_task); // Build void __kmpc_omp_task_complete_if0(ident_t *, kmp_int32 gtid, // kmp_task_t *new_task); RegionCodeGenTy RCG(CodeGen); CommonActionTy Action( RT.createRuntimeFunction(OMPRTL__kmpc_omp_task_begin_if0), TaskArgs, RT.createRuntimeFunction(OMPRTL__kmpc_omp_task_complete_if0), TaskArgs); RCG.setAction(Action); RCG(CGF); }; if (IfCond) emitOMPIfClause(CGF, IfCond, ThenCodeGen, ElseCodeGen); else { RegionCodeGenTy ThenRCG(ThenCodeGen); ThenRCG(CGF); } } void CGOpenMPRuntime::emitTaskLoopCall(CodeGenFunction &CGF, SourceLocation Loc, const OMPLoopDirective &D, llvm::Value *TaskFunction, QualType SharedsTy, Address Shareds, const Expr *IfCond, const OMPTaskDataTy &Data) { if (!CGF.HaveInsertPoint()) return; TaskResultTy Result = emitTaskInit(CGF, Loc, D, TaskFunction, SharedsTy, Shareds, Data); // NOTE: routine and part_id fields are intialized by __kmpc_omp_task_alloc() // libcall. // Call to void __kmpc_taskloop(ident_t *loc, int gtid, kmp_task_t *task, int // if_val, kmp_uint64 *lb, kmp_uint64 *ub, kmp_int64 st, int nogroup, int // sched, kmp_uint64 grainsize, void *task_dup); llvm::Value *ThreadID = getThreadID(CGF, Loc); llvm::Value *UpLoc = emitUpdateLocation(CGF, Loc); llvm::Value *IfVal; if (IfCond) { IfVal = CGF.Builder.CreateIntCast(CGF.EvaluateExprAsBool(IfCond), CGF.IntTy, /*isSigned=*/true); } else IfVal = llvm::ConstantInt::getSigned(CGF.IntTy, /*V=*/1); LValue LBLVal = CGF.EmitLValueForField( Result.TDBase, *std::next(Result.KmpTaskTQTyRD->field_begin(), KmpTaskTLowerBound)); auto *LBVar = cast(cast(D.getLowerBoundVariable())->getDecl()); CGF.EmitAnyExprToMem(LBVar->getInit(), LBLVal.getAddress(), LBLVal.getQuals(), /*IsInitializer=*/true); LValue UBLVal = CGF.EmitLValueForField( Result.TDBase, *std::next(Result.KmpTaskTQTyRD->field_begin(), KmpTaskTUpperBound)); auto *UBVar = cast(cast(D.getUpperBoundVariable())->getDecl()); CGF.EmitAnyExprToMem(UBVar->getInit(), UBLVal.getAddress(), UBLVal.getQuals(), /*IsInitializer=*/true); LValue StLVal = CGF.EmitLValueForField( Result.TDBase, *std::next(Result.KmpTaskTQTyRD->field_begin(), KmpTaskTStride)); auto *StVar = cast(cast(D.getStrideVariable())->getDecl()); CGF.EmitAnyExprToMem(StVar->getInit(), StLVal.getAddress(), StLVal.getQuals(), /*IsInitializer=*/true); enum { NoSchedule = 0, Grainsize = 1, NumTasks = 2 }; llvm::Value *TaskArgs[] = { UpLoc, ThreadID, Result.NewTask, IfVal, LBLVal.getPointer(), UBLVal.getPointer(), CGF.EmitLoadOfScalar(StLVal, SourceLocation()), llvm::ConstantInt::getSigned(CGF.IntTy, Data.Nogroup ? 1 : 0), llvm::ConstantInt::getSigned( CGF.IntTy, Data.Schedule.getPointer() ? Data.Schedule.getInt() ? NumTasks : Grainsize : NoSchedule), Data.Schedule.getPointer() ? CGF.Builder.CreateIntCast(Data.Schedule.getPointer(), CGF.Int64Ty, /*isSigned=*/false) : llvm::ConstantInt::get(CGF.Int64Ty, /*V=*/0), Result.TaskDupFn ? CGF.Builder.CreatePointerBitCastOrAddrSpaceCast(Result.TaskDupFn, CGF.VoidPtrTy) : llvm::ConstantPointerNull::get(CGF.VoidPtrTy)}; CGF.EmitRuntimeCall(createRuntimeFunction(OMPRTL__kmpc_taskloop), TaskArgs); } /// \brief Emit reduction operation for each element of array (required for /// array sections) LHS op = RHS. /// \param Type Type of array. /// \param LHSVar Variable on the left side of the reduction operation /// (references element of array in original variable). /// \param RHSVar Variable on the right side of the reduction operation /// (references element of array in original variable). /// \param RedOpGen Generator of reduction operation with use of LHSVar and /// RHSVar. static void EmitOMPAggregateReduction( CodeGenFunction &CGF, QualType Type, const VarDecl *LHSVar, const VarDecl *RHSVar, const llvm::function_ref &RedOpGen, const Expr *XExpr = nullptr, const Expr *EExpr = nullptr, const Expr *UpExpr = nullptr) { // Perform element-by-element initialization. QualType ElementTy; Address LHSAddr = CGF.GetAddrOfLocalVar(LHSVar); Address RHSAddr = CGF.GetAddrOfLocalVar(RHSVar); // Drill down to the base element type on both arrays. auto ArrayTy = Type->getAsArrayTypeUnsafe(); auto NumElements = CGF.emitArrayLength(ArrayTy, ElementTy, LHSAddr); auto RHSBegin = RHSAddr.getPointer(); auto LHSBegin = LHSAddr.getPointer(); // Cast from pointer to array type to pointer to single element. auto LHSEnd = CGF.Builder.CreateGEP(LHSBegin, NumElements); // The basic structure here is a while-do loop. auto BodyBB = CGF.createBasicBlock("omp.arraycpy.body"); auto DoneBB = CGF.createBasicBlock("omp.arraycpy.done"); auto IsEmpty = CGF.Builder.CreateICmpEQ(LHSBegin, LHSEnd, "omp.arraycpy.isempty"); CGF.Builder.CreateCondBr(IsEmpty, DoneBB, BodyBB); // Enter the loop body, making that address the current address. auto EntryBB = CGF.Builder.GetInsertBlock(); CGF.EmitBlock(BodyBB); CharUnits ElementSize = CGF.getContext().getTypeSizeInChars(ElementTy); llvm::PHINode *RHSElementPHI = CGF.Builder.CreatePHI( RHSBegin->getType(), 2, "omp.arraycpy.srcElementPast"); RHSElementPHI->addIncoming(RHSBegin, EntryBB); Address RHSElementCurrent = Address(RHSElementPHI, RHSAddr.getAlignment().alignmentOfArrayElement(ElementSize)); llvm::PHINode *LHSElementPHI = CGF.Builder.CreatePHI( LHSBegin->getType(), 2, "omp.arraycpy.destElementPast"); LHSElementPHI->addIncoming(LHSBegin, EntryBB); Address LHSElementCurrent = Address(LHSElementPHI, LHSAddr.getAlignment().alignmentOfArrayElement(ElementSize)); // Emit copy. CodeGenFunction::OMPPrivateScope Scope(CGF); Scope.addPrivate(LHSVar, [=]() -> Address { return LHSElementCurrent; }); Scope.addPrivate(RHSVar, [=]() -> Address { return RHSElementCurrent; }); Scope.Privatize(); RedOpGen(CGF, XExpr, EExpr, UpExpr); Scope.ForceCleanup(); // Shift the address forward by one element. auto LHSElementNext = CGF.Builder.CreateConstGEP1_32( LHSElementPHI, /*Idx0=*/1, "omp.arraycpy.dest.element"); auto RHSElementNext = CGF.Builder.CreateConstGEP1_32( RHSElementPHI, /*Idx0=*/1, "omp.arraycpy.src.element"); // Check whether we've reached the end. auto Done = CGF.Builder.CreateICmpEQ(LHSElementNext, LHSEnd, "omp.arraycpy.done"); CGF.Builder.CreateCondBr(Done, DoneBB, BodyBB); LHSElementPHI->addIncoming(LHSElementNext, CGF.Builder.GetInsertBlock()); RHSElementPHI->addIncoming(RHSElementNext, CGF.Builder.GetInsertBlock()); // Done. CGF.EmitBlock(DoneBB, /*IsFinished=*/true); } /// Emit reduction combiner. If the combiner is a simple expression emit it as /// is, otherwise consider it as combiner of UDR decl and emit it as a call of /// UDR combiner function. static void emitReductionCombiner(CodeGenFunction &CGF, const Expr *ReductionOp) { if (auto *CE = dyn_cast(ReductionOp)) if (auto *OVE = dyn_cast(CE->getCallee())) if (auto *DRE = dyn_cast(OVE->getSourceExpr()->IgnoreImpCasts())) if (auto *DRD = dyn_cast(DRE->getDecl())) { std::pair Reduction = CGF.CGM.getOpenMPRuntime().getUserDefinedReduction(DRD); RValue Func = RValue::get(Reduction.first); CodeGenFunction::OpaqueValueMapping Map(CGF, OVE, Func); CGF.EmitIgnoredExpr(ReductionOp); return; } CGF.EmitIgnoredExpr(ReductionOp); } static llvm::Value *emitReductionFunction(CodeGenModule &CGM, llvm::Type *ArgsType, ArrayRef Privates, ArrayRef LHSExprs, ArrayRef RHSExprs, ArrayRef ReductionOps) { auto &C = CGM.getContext(); // void reduction_func(void *LHSArg, void *RHSArg); FunctionArgList Args; ImplicitParamDecl LHSArg(C, /*DC=*/nullptr, SourceLocation(), /*Id=*/nullptr, C.VoidPtrTy); ImplicitParamDecl RHSArg(C, /*DC=*/nullptr, SourceLocation(), /*Id=*/nullptr, C.VoidPtrTy); Args.push_back(&LHSArg); Args.push_back(&RHSArg); auto &CGFI = CGM.getTypes().arrangeBuiltinFunctionDeclaration(C.VoidTy, Args); auto *Fn = llvm::Function::Create( CGM.getTypes().GetFunctionType(CGFI), llvm::GlobalValue::InternalLinkage, ".omp.reduction.reduction_func", &CGM.getModule()); CGM.SetInternalFunctionAttributes(/*D=*/nullptr, Fn, CGFI); CodeGenFunction CGF(CGM); CGF.StartFunction(GlobalDecl(), C.VoidTy, Fn, CGFI, Args); // Dst = (void*[n])(LHSArg); // Src = (void*[n])(RHSArg); Address LHS(CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( CGF.Builder.CreateLoad(CGF.GetAddrOfLocalVar(&LHSArg)), ArgsType), CGF.getPointerAlign()); Address RHS(CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( CGF.Builder.CreateLoad(CGF.GetAddrOfLocalVar(&RHSArg)), ArgsType), CGF.getPointerAlign()); // ... // *(Type*)lhs[i] = RedOp(*(Type*)lhs[i], *(Type*)rhs[i]); // ... CodeGenFunction::OMPPrivateScope Scope(CGF); auto IPriv = Privates.begin(); unsigned Idx = 0; for (unsigned I = 0, E = ReductionOps.size(); I < E; ++I, ++IPriv, ++Idx) { auto RHSVar = cast(cast(RHSExprs[I])->getDecl()); Scope.addPrivate(RHSVar, [&]() -> Address { return emitAddrOfVarFromArray(CGF, RHS, Idx, RHSVar); }); auto LHSVar = cast(cast(LHSExprs[I])->getDecl()); Scope.addPrivate(LHSVar, [&]() -> Address { return emitAddrOfVarFromArray(CGF, LHS, Idx, LHSVar); }); QualType PrivTy = (*IPriv)->getType(); if (PrivTy->isVariablyModifiedType()) { // Get array size and emit VLA type. ++Idx; Address Elem = CGF.Builder.CreateConstArrayGEP(LHS, Idx, CGF.getPointerSize()); llvm::Value *Ptr = CGF.Builder.CreateLoad(Elem); auto *VLA = CGF.getContext().getAsVariableArrayType(PrivTy); auto *OVE = cast(VLA->getSizeExpr()); CodeGenFunction::OpaqueValueMapping OpaqueMap( CGF, OVE, RValue::get(CGF.Builder.CreatePtrToInt(Ptr, CGF.SizeTy))); CGF.EmitVariablyModifiedType(PrivTy); } } Scope.Privatize(); IPriv = Privates.begin(); auto ILHS = LHSExprs.begin(); auto IRHS = RHSExprs.begin(); for (auto *E : ReductionOps) { if ((*IPriv)->getType()->isArrayType()) { // Emit reduction for array section. auto *LHSVar = cast(cast(*ILHS)->getDecl()); auto *RHSVar = cast(cast(*IRHS)->getDecl()); EmitOMPAggregateReduction( CGF, (*IPriv)->getType(), LHSVar, RHSVar, [=](CodeGenFunction &CGF, const Expr *, const Expr *, const Expr *) { emitReductionCombiner(CGF, E); }); } else // Emit reduction for array subscript or single variable. emitReductionCombiner(CGF, E); ++IPriv; ++ILHS; ++IRHS; } Scope.ForceCleanup(); CGF.FinishFunction(); return Fn; } static void emitSingleReductionCombiner(CodeGenFunction &CGF, const Expr *ReductionOp, const Expr *PrivateRef, const DeclRefExpr *LHS, const DeclRefExpr *RHS) { if (PrivateRef->getType()->isArrayType()) { // Emit reduction for array section. auto *LHSVar = cast(LHS->getDecl()); auto *RHSVar = cast(RHS->getDecl()); EmitOMPAggregateReduction( CGF, PrivateRef->getType(), LHSVar, RHSVar, [=](CodeGenFunction &CGF, const Expr *, const Expr *, const Expr *) { emitReductionCombiner(CGF, ReductionOp); }); } else // Emit reduction for array subscript or single variable. emitReductionCombiner(CGF, ReductionOp); } void CGOpenMPRuntime::emitReduction(CodeGenFunction &CGF, SourceLocation Loc, ArrayRef Privates, ArrayRef LHSExprs, ArrayRef RHSExprs, ArrayRef ReductionOps, bool WithNowait, bool SimpleReduction) { if (!CGF.HaveInsertPoint()) return; // Next code should be emitted for reduction: // // static kmp_critical_name lock = { 0 }; // // void reduce_func(void *lhs[], void *rhs[]) { // *(Type0*)lhs[0] = ReductionOperation0(*(Type0*)lhs[0], *(Type0*)rhs[0]); // ... // *(Type-1*)lhs[-1] = ReductionOperation-1(*(Type-1*)lhs[-1], // *(Type-1*)rhs[-1]); // } // // ... // void *RedList[] = {&[0], ..., &[-1]}; // switch (__kmpc_reduce{_nowait}(, , , sizeof(RedList), // RedList, reduce_func, &)) { // case 1: // ... // [i] = RedOp(*[i], *[i]); // ... // __kmpc_end_reduce{_nowait}(, , &); // break; // case 2: // ... // Atomic([i] = RedOp(*[i], *[i])); // ... // [__kmpc_end_reduce(, , &);] // break; // default:; // } // // if SimpleReduction is true, only the next code is generated: // ... // [i] = RedOp(*[i], *[i]); // ... auto &C = CGM.getContext(); if (SimpleReduction) { CodeGenFunction::RunCleanupsScope Scope(CGF); auto IPriv = Privates.begin(); auto ILHS = LHSExprs.begin(); auto IRHS = RHSExprs.begin(); for (auto *E : ReductionOps) { emitSingleReductionCombiner(CGF, E, *IPriv, cast(*ILHS), cast(*IRHS)); ++IPriv; ++ILHS; ++IRHS; } return; } // 1. Build a list of reduction variables. // void *RedList[] = {[0], ..., [-1]}; auto Size = RHSExprs.size(); for (auto *E : Privates) { if (E->getType()->isVariablyModifiedType()) // Reserve place for array size. ++Size; } llvm::APInt ArraySize(/*unsigned int numBits=*/32, Size); QualType ReductionArrayTy = C.getConstantArrayType(C.VoidPtrTy, ArraySize, ArrayType::Normal, /*IndexTypeQuals=*/0); Address ReductionList = CGF.CreateMemTemp(ReductionArrayTy, ".omp.reduction.red_list"); auto IPriv = Privates.begin(); unsigned Idx = 0; for (unsigned I = 0, E = RHSExprs.size(); I < E; ++I, ++IPriv, ++Idx) { Address Elem = CGF.Builder.CreateConstArrayGEP(ReductionList, Idx, CGF.getPointerSize()); CGF.Builder.CreateStore( CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( CGF.EmitLValue(RHSExprs[I]).getPointer(), CGF.VoidPtrTy), Elem); if ((*IPriv)->getType()->isVariablyModifiedType()) { // Store array size. ++Idx; Elem = CGF.Builder.CreateConstArrayGEP(ReductionList, Idx, CGF.getPointerSize()); llvm::Value *Size = CGF.Builder.CreateIntCast( CGF.getVLASize( CGF.getContext().getAsVariableArrayType((*IPriv)->getType())) .first, CGF.SizeTy, /*isSigned=*/false); CGF.Builder.CreateStore(CGF.Builder.CreateIntToPtr(Size, CGF.VoidPtrTy), Elem); } } // 2. Emit reduce_func(). auto *ReductionFn = emitReductionFunction( CGM, CGF.ConvertTypeForMem(ReductionArrayTy)->getPointerTo(), Privates, LHSExprs, RHSExprs, ReductionOps); // 3. Create static kmp_critical_name lock = { 0 }; auto *Lock = getCriticalRegionLock(".reduction"); // 4. Build res = __kmpc_reduce{_nowait}(, , , sizeof(RedList), // RedList, reduce_func, &); auto *IdentTLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); auto *ThreadId = getThreadID(CGF, Loc); auto *ReductionArrayTySize = CGF.getTypeSize(ReductionArrayTy); auto *RL = CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( ReductionList.getPointer(), CGF.VoidPtrTy); llvm::Value *Args[] = { IdentTLoc, // ident_t * ThreadId, // i32 CGF.Builder.getInt32(RHSExprs.size()), // i32 ReductionArrayTySize, // size_type sizeof(RedList) RL, // void *RedList ReductionFn, // void (*) (void *, void *) Lock // kmp_critical_name *& }; auto Res = CGF.EmitRuntimeCall( createRuntimeFunction(WithNowait ? OMPRTL__kmpc_reduce_nowait : OMPRTL__kmpc_reduce), Args); // 5. Build switch(res) auto *DefaultBB = CGF.createBasicBlock(".omp.reduction.default"); auto *SwInst = CGF.Builder.CreateSwitch(Res, DefaultBB, /*NumCases=*/2); // 6. Build case 1: // ... // [i] = RedOp(*[i], *[i]); // ... // __kmpc_end_reduce{_nowait}(, , &); // break; auto *Case1BB = CGF.createBasicBlock(".omp.reduction.case1"); SwInst->addCase(CGF.Builder.getInt32(1), Case1BB); CGF.EmitBlock(Case1BB); // Add emission of __kmpc_end_reduce{_nowait}(, , &); llvm::Value *EndArgs[] = { IdentTLoc, // ident_t * ThreadId, // i32 Lock // kmp_critical_name *& }; auto &&CodeGen = [&Privates, &LHSExprs, &RHSExprs, &ReductionOps]( CodeGenFunction &CGF, PrePostActionTy &Action) { auto IPriv = Privates.begin(); auto ILHS = LHSExprs.begin(); auto IRHS = RHSExprs.begin(); for (auto *E : ReductionOps) { emitSingleReductionCombiner(CGF, E, *IPriv, cast(*ILHS), cast(*IRHS)); ++IPriv; ++ILHS; ++IRHS; } }; RegionCodeGenTy RCG(CodeGen); CommonActionTy Action( nullptr, llvm::None, createRuntimeFunction(WithNowait ? OMPRTL__kmpc_end_reduce_nowait : OMPRTL__kmpc_end_reduce), EndArgs); RCG.setAction(Action); RCG(CGF); CGF.EmitBranch(DefaultBB); // 7. Build case 2: // ... // Atomic([i] = RedOp(*[i], *[i])); // ... // break; auto *Case2BB = CGF.createBasicBlock(".omp.reduction.case2"); SwInst->addCase(CGF.Builder.getInt32(2), Case2BB); CGF.EmitBlock(Case2BB); auto &&AtomicCodeGen = [Loc, &Privates, &LHSExprs, &RHSExprs, &ReductionOps]( CodeGenFunction &CGF, PrePostActionTy &Action) { auto ILHS = LHSExprs.begin(); auto IRHS = RHSExprs.begin(); auto IPriv = Privates.begin(); for (auto *E : ReductionOps) { const Expr *XExpr = nullptr; const Expr *EExpr = nullptr; const Expr *UpExpr = nullptr; BinaryOperatorKind BO = BO_Comma; if (auto *BO = dyn_cast(E)) { if (BO->getOpcode() == BO_Assign) { XExpr = BO->getLHS(); UpExpr = BO->getRHS(); } } // Try to emit update expression as a simple atomic. auto *RHSExpr = UpExpr; if (RHSExpr) { // Analyze RHS part of the whole expression. if (auto *ACO = dyn_cast( RHSExpr->IgnoreParenImpCasts())) { // If this is a conditional operator, analyze its condition for // min/max reduction operator. RHSExpr = ACO->getCond(); } if (auto *BORHS = dyn_cast(RHSExpr->IgnoreParenImpCasts())) { EExpr = BORHS->getRHS(); BO = BORHS->getOpcode(); } } if (XExpr) { auto *VD = cast(cast(*ILHS)->getDecl()); auto &&AtomicRedGen = [BO, VD, IPriv, Loc](CodeGenFunction &CGF, const Expr *XExpr, const Expr *EExpr, const Expr *UpExpr) { LValue X = CGF.EmitLValue(XExpr); RValue E; if (EExpr) E = CGF.EmitAnyExpr(EExpr); CGF.EmitOMPAtomicSimpleUpdateExpr( X, E, BO, /*IsXLHSInRHSPart=*/true, llvm::AtomicOrdering::Monotonic, Loc, [&CGF, UpExpr, VD, IPriv, Loc](RValue XRValue) { CodeGenFunction::OMPPrivateScope PrivateScope(CGF); PrivateScope.addPrivate( VD, [&CGF, VD, XRValue, Loc]() -> Address { Address LHSTemp = CGF.CreateMemTemp(VD->getType()); CGF.emitOMPSimpleStore( CGF.MakeAddrLValue(LHSTemp, VD->getType()), XRValue, VD->getType().getNonReferenceType(), Loc); return LHSTemp; }); (void)PrivateScope.Privatize(); return CGF.EmitAnyExpr(UpExpr); }); }; if ((*IPriv)->getType()->isArrayType()) { // Emit atomic reduction for array section. auto *RHSVar = cast(cast(*IRHS)->getDecl()); EmitOMPAggregateReduction(CGF, (*IPriv)->getType(), VD, RHSVar, AtomicRedGen, XExpr, EExpr, UpExpr); } else // Emit atomic reduction for array subscript or single variable. AtomicRedGen(CGF, XExpr, EExpr, UpExpr); } else { // Emit as a critical region. auto &&CritRedGen = [E, Loc](CodeGenFunction &CGF, const Expr *, const Expr *, const Expr *) { auto &RT = CGF.CGM.getOpenMPRuntime(); RT.emitCriticalRegion( CGF, ".atomic_reduction", [=](CodeGenFunction &CGF, PrePostActionTy &Action) { Action.Enter(CGF); emitReductionCombiner(CGF, E); }, Loc); }; if ((*IPriv)->getType()->isArrayType()) { auto *LHSVar = cast(cast(*ILHS)->getDecl()); auto *RHSVar = cast(cast(*IRHS)->getDecl()); EmitOMPAggregateReduction(CGF, (*IPriv)->getType(), LHSVar, RHSVar, CritRedGen); } else CritRedGen(CGF, nullptr, nullptr, nullptr); } ++ILHS; ++IRHS; ++IPriv; } }; RegionCodeGenTy AtomicRCG(AtomicCodeGen); if (!WithNowait) { // Add emission of __kmpc_end_reduce(, , &); llvm::Value *EndArgs[] = { IdentTLoc, // ident_t * ThreadId, // i32 Lock // kmp_critical_name *& }; CommonActionTy Action(nullptr, llvm::None, createRuntimeFunction(OMPRTL__kmpc_end_reduce), EndArgs); AtomicRCG.setAction(Action); AtomicRCG(CGF); } else AtomicRCG(CGF); CGF.EmitBranch(DefaultBB); CGF.EmitBlock(DefaultBB, /*IsFinished=*/true); } void CGOpenMPRuntime::emitTaskwaitCall(CodeGenFunction &CGF, SourceLocation Loc) { if (!CGF.HaveInsertPoint()) return; // Build call kmp_int32 __kmpc_omp_taskwait(ident_t *loc, kmp_int32 // global_tid); llvm::Value *Args[] = {emitUpdateLocation(CGF, Loc), getThreadID(CGF, Loc)}; // Ignore return result until untied tasks are supported. CGF.EmitRuntimeCall(createRuntimeFunction(OMPRTL__kmpc_omp_taskwait), Args); if (auto *Region = dyn_cast_or_null(CGF.CapturedStmtInfo)) Region->emitUntiedSwitch(CGF); } void CGOpenMPRuntime::emitInlinedDirective(CodeGenFunction &CGF, OpenMPDirectiveKind InnerKind, const RegionCodeGenTy &CodeGen, bool HasCancel) { if (!CGF.HaveInsertPoint()) return; InlinedOpenMPRegionRAII Region(CGF, CodeGen, InnerKind, HasCancel); CGF.CapturedStmtInfo->EmitBody(CGF, /*S=*/nullptr); } namespace { enum RTCancelKind { CancelNoreq = 0, CancelParallel = 1, CancelLoop = 2, CancelSections = 3, CancelTaskgroup = 4 }; } // anonymous namespace static RTCancelKind getCancellationKind(OpenMPDirectiveKind CancelRegion) { RTCancelKind CancelKind = CancelNoreq; if (CancelRegion == OMPD_parallel) CancelKind = CancelParallel; else if (CancelRegion == OMPD_for) CancelKind = CancelLoop; else if (CancelRegion == OMPD_sections) CancelKind = CancelSections; else { assert(CancelRegion == OMPD_taskgroup); CancelKind = CancelTaskgroup; } return CancelKind; } void CGOpenMPRuntime::emitCancellationPointCall( CodeGenFunction &CGF, SourceLocation Loc, OpenMPDirectiveKind CancelRegion) { if (!CGF.HaveInsertPoint()) return; // Build call kmp_int32 __kmpc_cancellationpoint(ident_t *loc, kmp_int32 // global_tid, kmp_int32 cncl_kind); if (auto *OMPRegionInfo = dyn_cast_or_null(CGF.CapturedStmtInfo)) { - if (OMPRegionInfo->hasCancel()) { + // For 'cancellation point taskgroup', the task region info may not have a + // cancel. This may instead happen in another adjacent task. + if (CancelRegion == OMPD_taskgroup || OMPRegionInfo->hasCancel()) { llvm::Value *Args[] = { emitUpdateLocation(CGF, Loc), getThreadID(CGF, Loc), CGF.Builder.getInt32(getCancellationKind(CancelRegion))}; // Ignore return result until untied tasks are supported. auto *Result = CGF.EmitRuntimeCall( createRuntimeFunction(OMPRTL__kmpc_cancellationpoint), Args); // if (__kmpc_cancellationpoint()) { // exit from construct; // } auto *ExitBB = CGF.createBasicBlock(".cancel.exit"); auto *ContBB = CGF.createBasicBlock(".cancel.continue"); auto *Cmp = CGF.Builder.CreateIsNotNull(Result); CGF.Builder.CreateCondBr(Cmp, ExitBB, ContBB); CGF.EmitBlock(ExitBB); // exit from construct; auto CancelDest = CGF.getOMPCancelDestination(OMPRegionInfo->getDirectiveKind()); CGF.EmitBranchThroughCleanup(CancelDest); CGF.EmitBlock(ContBB, /*IsFinished=*/true); } } } void CGOpenMPRuntime::emitCancelCall(CodeGenFunction &CGF, SourceLocation Loc, const Expr *IfCond, OpenMPDirectiveKind CancelRegion) { if (!CGF.HaveInsertPoint()) return; // Build call kmp_int32 __kmpc_cancel(ident_t *loc, kmp_int32 global_tid, // kmp_int32 cncl_kind); if (auto *OMPRegionInfo = dyn_cast_or_null(CGF.CapturedStmtInfo)) { auto &&ThenGen = [Loc, CancelRegion, OMPRegionInfo](CodeGenFunction &CGF, PrePostActionTy &) { auto &RT = CGF.CGM.getOpenMPRuntime(); llvm::Value *Args[] = { RT.emitUpdateLocation(CGF, Loc), RT.getThreadID(CGF, Loc), CGF.Builder.getInt32(getCancellationKind(CancelRegion))}; // Ignore return result until untied tasks are supported. auto *Result = CGF.EmitRuntimeCall( RT.createRuntimeFunction(OMPRTL__kmpc_cancel), Args); // if (__kmpc_cancel()) { // exit from construct; // } auto *ExitBB = CGF.createBasicBlock(".cancel.exit"); auto *ContBB = CGF.createBasicBlock(".cancel.continue"); auto *Cmp = CGF.Builder.CreateIsNotNull(Result); CGF.Builder.CreateCondBr(Cmp, ExitBB, ContBB); CGF.EmitBlock(ExitBB); // exit from construct; auto CancelDest = CGF.getOMPCancelDestination(OMPRegionInfo->getDirectiveKind()); CGF.EmitBranchThroughCleanup(CancelDest); CGF.EmitBlock(ContBB, /*IsFinished=*/true); }; if (IfCond) emitOMPIfClause(CGF, IfCond, ThenGen, [](CodeGenFunction &, PrePostActionTy &) {}); else { RegionCodeGenTy ThenRCG(ThenGen); ThenRCG(CGF); } } } /// \brief Obtain information that uniquely identifies a target entry. This /// consists of the file and device IDs as well as line number associated with /// the relevant entry source location. static void getTargetEntryUniqueInfo(ASTContext &C, SourceLocation Loc, unsigned &DeviceID, unsigned &FileID, unsigned &LineNum) { auto &SM = C.getSourceManager(); // The loc should be always valid and have a file ID (the user cannot use // #pragma directives in macros) assert(Loc.isValid() && "Source location is expected to be always valid."); assert(Loc.isFileID() && "Source location is expected to refer to a file."); PresumedLoc PLoc = SM.getPresumedLoc(Loc); assert(PLoc.isValid() && "Source location is expected to be always valid."); llvm::sys::fs::UniqueID ID; if (llvm::sys::fs::getUniqueID(PLoc.getFilename(), ID)) llvm_unreachable("Source file with target region no longer exists!"); DeviceID = ID.getDevice(); FileID = ID.getFile(); LineNum = PLoc.getLine(); } void CGOpenMPRuntime::emitTargetOutlinedFunction( const OMPExecutableDirective &D, StringRef ParentName, llvm::Function *&OutlinedFn, llvm::Constant *&OutlinedFnID, bool IsOffloadEntry, const RegionCodeGenTy &CodeGen) { assert(!ParentName.empty() && "Invalid target region parent name!"); emitTargetOutlinedFunctionHelper(D, ParentName, OutlinedFn, OutlinedFnID, IsOffloadEntry, CodeGen); } void CGOpenMPRuntime::emitTargetOutlinedFunctionHelper( const OMPExecutableDirective &D, StringRef ParentName, llvm::Function *&OutlinedFn, llvm::Constant *&OutlinedFnID, bool IsOffloadEntry, const RegionCodeGenTy &CodeGen) { // Create a unique name for the entry function using the source location // information of the current target region. The name will be something like: // // __omp_offloading_DD_FFFF_PP_lBB // // where DD_FFFF is an ID unique to the file (device and file IDs), PP is the // mangled name of the function that encloses the target region and BB is the // line number of the target region. unsigned DeviceID; unsigned FileID; unsigned Line; getTargetEntryUniqueInfo(CGM.getContext(), D.getLocStart(), DeviceID, FileID, Line); SmallString<64> EntryFnName; { llvm::raw_svector_ostream OS(EntryFnName); OS << "__omp_offloading" << llvm::format("_%x", DeviceID) << llvm::format("_%x_", FileID) << ParentName << "_l" << Line; } const CapturedStmt &CS = *cast(D.getAssociatedStmt()); CodeGenFunction CGF(CGM, true); CGOpenMPTargetRegionInfo CGInfo(CS, CodeGen, EntryFnName); CodeGenFunction::CGCapturedStmtRAII CapInfoRAII(CGF, &CGInfo); OutlinedFn = CGF.GenerateOpenMPCapturedStmtFunction(CS); // If this target outline function is not an offload entry, we don't need to // register it. if (!IsOffloadEntry) return; // The target region ID is used by the runtime library to identify the current // target region, so it only has to be unique and not necessarily point to // anything. It could be the pointer to the outlined function that implements // the target region, but we aren't using that so that the compiler doesn't // need to keep that, and could therefore inline the host function if proven // worthwhile during optimization. In the other hand, if emitting code for the // device, the ID has to be the function address so that it can retrieved from // the offloading entry and launched by the runtime library. We also mark the // outlined function to have external linkage in case we are emitting code for // the device, because these functions will be entry points to the device. if (CGM.getLangOpts().OpenMPIsDevice) { OutlinedFnID = llvm::ConstantExpr::getBitCast(OutlinedFn, CGM.Int8PtrTy); OutlinedFn->setLinkage(llvm::GlobalValue::ExternalLinkage); } else OutlinedFnID = new llvm::GlobalVariable( CGM.getModule(), CGM.Int8Ty, /*isConstant=*/true, llvm::GlobalValue::PrivateLinkage, llvm::Constant::getNullValue(CGM.Int8Ty), ".omp_offload.region_id"); // Register the information for the entry associated with this target region. OffloadEntriesInfoManager.registerTargetRegionEntryInfo( DeviceID, FileID, ParentName, Line, OutlinedFn, OutlinedFnID, /*Flags=*/0); } /// discard all CompoundStmts intervening between two constructs static const Stmt *ignoreCompoundStmts(const Stmt *Body) { while (auto *CS = dyn_cast_or_null(Body)) Body = CS->body_front(); return Body; } /// \brief Emit the num_teams clause of an enclosed teams directive at the /// target region scope. If there is no teams directive associated with the /// target directive, or if there is no num_teams clause associated with the /// enclosed teams directive, return nullptr. static llvm::Value * emitNumTeamsClauseForTargetDirective(CGOpenMPRuntime &OMPRuntime, CodeGenFunction &CGF, const OMPExecutableDirective &D) { assert(!CGF.getLangOpts().OpenMPIsDevice && "Clauses associated with the " "teams directive expected to be " "emitted only for the host!"); // FIXME: For the moment we do not support combined directives with target and // teams, so we do not expect to get any num_teams clause in the provided // directive. Once we support that, this assertion can be replaced by the // actual emission of the clause expression. assert(D.getSingleClause() == nullptr && "Not expecting clause in directive."); // If the current target region has a teams region enclosed, we need to get // the number of teams to pass to the runtime function call. This is done // by generating the expression in a inlined region. This is required because // the expression is captured in the enclosing target environment when the // teams directive is not combined with target. const CapturedStmt &CS = *cast(D.getAssociatedStmt()); // FIXME: Accommodate other combined directives with teams when they become // available. if (auto *TeamsDir = dyn_cast_or_null( ignoreCompoundStmts(CS.getCapturedStmt()))) { if (auto *NTE = TeamsDir->getSingleClause()) { CGOpenMPInnerExprInfo CGInfo(CGF, CS); CodeGenFunction::CGCapturedStmtRAII CapInfoRAII(CGF, &CGInfo); llvm::Value *NumTeams = CGF.EmitScalarExpr(NTE->getNumTeams()); return CGF.Builder.CreateIntCast(NumTeams, CGF.Int32Ty, /*IsSigned=*/true); } // If we have an enclosed teams directive but no num_teams clause we use // the default value 0. return CGF.Builder.getInt32(0); } // No teams associated with the directive. return nullptr; } /// \brief Emit the thread_limit clause of an enclosed teams directive at the /// target region scope. If there is no teams directive associated with the /// target directive, or if there is no thread_limit clause associated with the /// enclosed teams directive, return nullptr. static llvm::Value * emitThreadLimitClauseForTargetDirective(CGOpenMPRuntime &OMPRuntime, CodeGenFunction &CGF, const OMPExecutableDirective &D) { assert(!CGF.getLangOpts().OpenMPIsDevice && "Clauses associated with the " "teams directive expected to be " "emitted only for the host!"); // FIXME: For the moment we do not support combined directives with target and // teams, so we do not expect to get any thread_limit clause in the provided // directive. Once we support that, this assertion can be replaced by the // actual emission of the clause expression. assert(D.getSingleClause() == nullptr && "Not expecting clause in directive."); // If the current target region has a teams region enclosed, we need to get // the thread limit to pass to the runtime function call. This is done // by generating the expression in a inlined region. This is required because // the expression is captured in the enclosing target environment when the // teams directive is not combined with target. const CapturedStmt &CS = *cast(D.getAssociatedStmt()); // FIXME: Accommodate other combined directives with teams when they become // available. if (auto *TeamsDir = dyn_cast_or_null( ignoreCompoundStmts(CS.getCapturedStmt()))) { if (auto *TLE = TeamsDir->getSingleClause()) { CGOpenMPInnerExprInfo CGInfo(CGF, CS); CodeGenFunction::CGCapturedStmtRAII CapInfoRAII(CGF, &CGInfo); llvm::Value *ThreadLimit = CGF.EmitScalarExpr(TLE->getThreadLimit()); return CGF.Builder.CreateIntCast(ThreadLimit, CGF.Int32Ty, /*IsSigned=*/true); } // If we have an enclosed teams directive but no thread_limit clause we use // the default value 0. return CGF.Builder.getInt32(0); } // No teams associated with the directive. return nullptr; } namespace { // \brief Utility to handle information from clauses associated with a given // construct that use mappable expressions (e.g. 'map' clause, 'to' clause). // It provides a convenient interface to obtain the information and generate // code for that information. class MappableExprsHandler { public: /// \brief Values for bit flags used to specify the mapping type for /// offloading. enum OpenMPOffloadMappingFlags { /// \brief Allocate memory on the device and move data from host to device. OMP_MAP_TO = 0x01, /// \brief Allocate memory on the device and move data from device to host. OMP_MAP_FROM = 0x02, /// \brief Always perform the requested mapping action on the element, even /// if it was already mapped before. OMP_MAP_ALWAYS = 0x04, /// \brief Delete the element from the device environment, ignoring the /// current reference count associated with the element. OMP_MAP_DELETE = 0x08, /// \brief The element being mapped is a pointer, therefore the pointee /// should be mapped as well. OMP_MAP_IS_PTR = 0x10, /// \brief This flags signals that an argument is the first one relating to /// a map/private clause expression. For some cases a single /// map/privatization results in multiple arguments passed to the runtime /// library. OMP_MAP_FIRST_REF = 0x20, /// \brief Signal that the runtime library has to return the device pointer /// in the current position for the data being mapped. OMP_MAP_RETURN_PTR = 0x40, /// \brief This flag signals that the reference being passed is a pointer to /// private data. OMP_MAP_PRIVATE_PTR = 0x80, /// \brief Pass the element to the device by value. OMP_MAP_PRIVATE_VAL = 0x100, }; /// Class that associates information with a base pointer to be passed to the /// runtime library. class BasePointerInfo { /// The base pointer. llvm::Value *Ptr = nullptr; /// The base declaration that refers to this device pointer, or null if /// there is none. const ValueDecl *DevPtrDecl = nullptr; public: BasePointerInfo(llvm::Value *Ptr, const ValueDecl *DevPtrDecl = nullptr) : Ptr(Ptr), DevPtrDecl(DevPtrDecl) {} llvm::Value *operator*() const { return Ptr; } const ValueDecl *getDevicePtrDecl() const { return DevPtrDecl; } void setDevicePtrDecl(const ValueDecl *D) { DevPtrDecl = D; } }; typedef SmallVector MapBaseValuesArrayTy; typedef SmallVector MapValuesArrayTy; typedef SmallVector MapFlagsArrayTy; private: /// \brief Directive from where the map clauses were extracted. const OMPExecutableDirective &CurDir; /// \brief Function the directive is being generated for. CodeGenFunction &CGF; /// \brief Set of all first private variables in the current directive. llvm::SmallPtrSet FirstPrivateDecls; /// Map between device pointer declarations and their expression components. /// The key value for declarations in 'this' is null. llvm::DenseMap< const ValueDecl *, SmallVector> DevPointersMap; llvm::Value *getExprTypeSize(const Expr *E) const { auto ExprTy = E->getType().getCanonicalType(); // Reference types are ignored for mapping purposes. if (auto *RefTy = ExprTy->getAs()) ExprTy = RefTy->getPointeeType().getCanonicalType(); // Given that an array section is considered a built-in type, we need to // do the calculation based on the length of the section instead of relying // on CGF.getTypeSize(E->getType()). if (const auto *OAE = dyn_cast(E)) { QualType BaseTy = OMPArraySectionExpr::getBaseOriginalType( OAE->getBase()->IgnoreParenImpCasts()) .getCanonicalType(); // If there is no length associated with the expression, that means we // are using the whole length of the base. if (!OAE->getLength() && OAE->getColonLoc().isValid()) return CGF.getTypeSize(BaseTy); llvm::Value *ElemSize; if (auto *PTy = BaseTy->getAs()) ElemSize = CGF.getTypeSize(PTy->getPointeeType().getCanonicalType()); else { auto *ATy = cast(BaseTy.getTypePtr()); assert(ATy && "Expecting array type if not a pointer type."); ElemSize = CGF.getTypeSize(ATy->getElementType().getCanonicalType()); } // If we don't have a length at this point, that is because we have an // array section with a single element. if (!OAE->getLength()) return ElemSize; auto *LengthVal = CGF.EmitScalarExpr(OAE->getLength()); LengthVal = CGF.Builder.CreateIntCast(LengthVal, CGF.SizeTy, /*isSigned=*/false); return CGF.Builder.CreateNUWMul(LengthVal, ElemSize); } return CGF.getTypeSize(ExprTy); } /// \brief Return the corresponding bits for a given map clause modifier. Add /// a flag marking the map as a pointer if requested. Add a flag marking the /// map as the first one of a series of maps that relate to the same map /// expression. unsigned getMapTypeBits(OpenMPMapClauseKind MapType, OpenMPMapClauseKind MapTypeModifier, bool AddPtrFlag, bool AddIsFirstFlag) const { unsigned Bits = 0u; switch (MapType) { case OMPC_MAP_alloc: case OMPC_MAP_release: // alloc and release is the default behavior in the runtime library, i.e. // if we don't pass any bits alloc/release that is what the runtime is // going to do. Therefore, we don't need to signal anything for these two // type modifiers. break; case OMPC_MAP_to: Bits = OMP_MAP_TO; break; case OMPC_MAP_from: Bits = OMP_MAP_FROM; break; case OMPC_MAP_tofrom: Bits = OMP_MAP_TO | OMP_MAP_FROM; break; case OMPC_MAP_delete: Bits = OMP_MAP_DELETE; break; default: llvm_unreachable("Unexpected map type!"); break; } if (AddPtrFlag) Bits |= OMP_MAP_IS_PTR; if (AddIsFirstFlag) Bits |= OMP_MAP_FIRST_REF; if (MapTypeModifier == OMPC_MAP_always) Bits |= OMP_MAP_ALWAYS; return Bits; } /// \brief Return true if the provided expression is a final array section. A /// final array section, is one whose length can't be proved to be one. bool isFinalArraySectionExpression(const Expr *E) const { auto *OASE = dyn_cast(E); // It is not an array section and therefore not a unity-size one. if (!OASE) return false; // An array section with no colon always refer to a single element. if (OASE->getColonLoc().isInvalid()) return false; auto *Length = OASE->getLength(); // If we don't have a length we have to check if the array has size 1 // for this dimension. Also, we should always expect a length if the // base type is pointer. if (!Length) { auto BaseQTy = OMPArraySectionExpr::getBaseOriginalType( OASE->getBase()->IgnoreParenImpCasts()) .getCanonicalType(); if (auto *ATy = dyn_cast(BaseQTy.getTypePtr())) return ATy->getSize().getSExtValue() != 1; // If we don't have a constant dimension length, we have to consider // the current section as having any size, so it is not necessarily // unitary. If it happen to be unity size, that's user fault. return true; } // Check if the length evaluates to 1. llvm::APSInt ConstLength; if (!Length->EvaluateAsInt(ConstLength, CGF.getContext())) return true; // Can have more that size 1. return ConstLength.getSExtValue() != 1; } /// \brief Generate the base pointers, section pointers, sizes and map type /// bits for the provided map type, map modifier, and expression components. /// \a IsFirstComponent should be set to true if the provided set of /// components is the first associated with a capture. void generateInfoForComponentList( OpenMPMapClauseKind MapType, OpenMPMapClauseKind MapTypeModifier, OMPClauseMappableExprCommon::MappableExprComponentListRef Components, MapBaseValuesArrayTy &BasePointers, MapValuesArrayTy &Pointers, MapValuesArrayTy &Sizes, MapFlagsArrayTy &Types, bool IsFirstComponentList) const { // The following summarizes what has to be generated for each map and the // types bellow. The generated information is expressed in this order: // base pointer, section pointer, size, flags // (to add to the ones that come from the map type and modifier). // // double d; // int i[100]; // float *p; // // struct S1 { // int i; // float f[50]; // } // struct S2 { // int i; // float f[50]; // S1 s; // double *p; // struct S2 *ps; // } // S2 s; // S2 *ps; // // map(d) // &d, &d, sizeof(double), noflags // // map(i) // &i, &i, 100*sizeof(int), noflags // // map(i[1:23]) // &i(=&i[0]), &i[1], 23*sizeof(int), noflags // // map(p) // &p, &p, sizeof(float*), noflags // // map(p[1:24]) // p, &p[1], 24*sizeof(float), noflags // // map(s) // &s, &s, sizeof(S2), noflags // // map(s.i) // &s, &(s.i), sizeof(int), noflags // // map(s.s.f) // &s, &(s.i.f), 50*sizeof(int), noflags // // map(s.p) // &s, &(s.p), sizeof(double*), noflags // // map(s.p[:22], s.a s.b) // &s, &(s.p), sizeof(double*), noflags // &(s.p), &(s.p[0]), 22*sizeof(double), ptr_flag + extra_flag // // map(s.ps) // &s, &(s.ps), sizeof(S2*), noflags // // map(s.ps->s.i) // &s, &(s.ps), sizeof(S2*), noflags // &(s.ps), &(s.ps->s.i), sizeof(int), ptr_flag + extra_flag // // map(s.ps->ps) // &s, &(s.ps), sizeof(S2*), noflags // &(s.ps), &(s.ps->ps), sizeof(S2*), ptr_flag + extra_flag // // map(s.ps->ps->ps) // &s, &(s.ps), sizeof(S2*), noflags // &(s.ps), &(s.ps->ps), sizeof(S2*), ptr_flag + extra_flag // &(s.ps->ps), &(s.ps->ps->ps), sizeof(S2*), ptr_flag + extra_flag // // map(s.ps->ps->s.f[:22]) // &s, &(s.ps), sizeof(S2*), noflags // &(s.ps), &(s.ps->ps), sizeof(S2*), ptr_flag + extra_flag // &(s.ps->ps), &(s.ps->ps->s.f[0]), 22*sizeof(float), ptr_flag + extra_flag // // map(ps) // &ps, &ps, sizeof(S2*), noflags // // map(ps->i) // ps, &(ps->i), sizeof(int), noflags // // map(ps->s.f) // ps, &(ps->s.f[0]), 50*sizeof(float), noflags // // map(ps->p) // ps, &(ps->p), sizeof(double*), noflags // // map(ps->p[:22]) // ps, &(ps->p), sizeof(double*), noflags // &(ps->p), &(ps->p[0]), 22*sizeof(double), ptr_flag + extra_flag // // map(ps->ps) // ps, &(ps->ps), sizeof(S2*), noflags // // map(ps->ps->s.i) // ps, &(ps->ps), sizeof(S2*), noflags // &(ps->ps), &(ps->ps->s.i), sizeof(int), ptr_flag + extra_flag // // map(ps->ps->ps) // ps, &(ps->ps), sizeof(S2*), noflags // &(ps->ps), &(ps->ps->ps), sizeof(S2*), ptr_flag + extra_flag // // map(ps->ps->ps->ps) // ps, &(ps->ps), sizeof(S2*), noflags // &(ps->ps), &(ps->ps->ps), sizeof(S2*), ptr_flag + extra_flag // &(ps->ps->ps), &(ps->ps->ps->ps), sizeof(S2*), ptr_flag + extra_flag // // map(ps->ps->ps->s.f[:22]) // ps, &(ps->ps), sizeof(S2*), noflags // &(ps->ps), &(ps->ps->ps), sizeof(S2*), ptr_flag + extra_flag // &(ps->ps->ps), &(ps->ps->ps->s.f[0]), 22*sizeof(float), ptr_flag + // extra_flag // Track if the map information being generated is the first for a capture. bool IsCaptureFirstInfo = IsFirstComponentList; // Scan the components from the base to the complete expression. auto CI = Components.rbegin(); auto CE = Components.rend(); auto I = CI; // Track if the map information being generated is the first for a list of // components. bool IsExpressionFirstInfo = true; llvm::Value *BP = nullptr; if (auto *ME = dyn_cast(I->getAssociatedExpression())) { // The base is the 'this' pointer. The content of the pointer is going // to be the base of the field being mapped. BP = CGF.EmitScalarExpr(ME->getBase()); } else { // The base is the reference to the variable. // BP = &Var. BP = CGF.EmitLValue(cast(I->getAssociatedExpression())) .getPointer(); // If the variable is a pointer and is being dereferenced (i.e. is not // the last component), the base has to be the pointer itself, not its // reference. References are ignored for mapping purposes. QualType Ty = I->getAssociatedDeclaration()->getType().getNonReferenceType(); if (Ty->isAnyPointerType() && std::next(I) != CE) { auto PtrAddr = CGF.MakeNaturalAlignAddrLValue(BP, Ty); BP = CGF.EmitLoadOfPointerLValue(PtrAddr.getAddress(), Ty->castAs()) .getPointer(); // We do not need to generate individual map information for the // pointer, it can be associated with the combined storage. ++I; } } for (; I != CE; ++I) { auto Next = std::next(I); // We need to generate the addresses and sizes if this is the last // component, if the component is a pointer or if it is an array section // whose length can't be proved to be one. If this is a pointer, it // becomes the base address for the following components. // A final array section, is one whose length can't be proved to be one. bool IsFinalArraySection = isFinalArraySectionExpression(I->getAssociatedExpression()); // Get information on whether the element is a pointer. Have to do a // special treatment for array sections given that they are built-in // types. const auto *OASE = dyn_cast(I->getAssociatedExpression()); bool IsPointer = (OASE && OMPArraySectionExpr::getBaseOriginalType(OASE) .getCanonicalType() ->isAnyPointerType()) || I->getAssociatedExpression()->getType()->isAnyPointerType(); if (Next == CE || IsPointer || IsFinalArraySection) { // If this is not the last component, we expect the pointer to be // associated with an array expression or member expression. assert((Next == CE || isa(Next->getAssociatedExpression()) || isa(Next->getAssociatedExpression()) || isa(Next->getAssociatedExpression())) && "Unexpected expression"); auto *LB = CGF.EmitLValue(I->getAssociatedExpression()).getPointer(); auto *Size = getExprTypeSize(I->getAssociatedExpression()); // If we have a member expression and the current component is a // reference, we have to map the reference too. Whenever we have a // reference, the section that reference refers to is going to be a // load instruction from the storage assigned to the reference. if (isa(I->getAssociatedExpression()) && I->getAssociatedDeclaration()->getType()->isReferenceType()) { auto *LI = cast(LB); auto *RefAddr = LI->getPointerOperand(); BasePointers.push_back(BP); Pointers.push_back(RefAddr); Sizes.push_back(CGF.getTypeSize(CGF.getContext().VoidPtrTy)); Types.push_back(getMapTypeBits( /*MapType*/ OMPC_MAP_alloc, /*MapTypeModifier=*/OMPC_MAP_unknown, !IsExpressionFirstInfo, IsCaptureFirstInfo)); IsExpressionFirstInfo = false; IsCaptureFirstInfo = false; // The reference will be the next base address. BP = RefAddr; } BasePointers.push_back(BP); Pointers.push_back(LB); Sizes.push_back(Size); // We need to add a pointer flag for each map that comes from the // same expression except for the first one. We also need to signal // this map is the first one that relates with the current capture // (there is a set of entries for each capture). Types.push_back(getMapTypeBits(MapType, MapTypeModifier, !IsExpressionFirstInfo, IsCaptureFirstInfo)); // If we have a final array section, we are done with this expression. if (IsFinalArraySection) break; // The pointer becomes the base for the next element. if (Next != CE) BP = LB; IsExpressionFirstInfo = false; IsCaptureFirstInfo = false; continue; } } } /// \brief Return the adjusted map modifiers if the declaration a capture /// refers to appears in a first-private clause. This is expected to be used /// only with directives that start with 'target'. unsigned adjustMapModifiersForPrivateClauses(const CapturedStmt::Capture &Cap, unsigned CurrentModifiers) { assert(Cap.capturesVariable() && "Expected capture by reference only!"); // A first private variable captured by reference will use only the // 'private ptr' and 'map to' flag. Return the right flags if the captured // declaration is known as first-private in this handler. if (FirstPrivateDecls.count(Cap.getCapturedVar())) return MappableExprsHandler::OMP_MAP_PRIVATE_PTR | MappableExprsHandler::OMP_MAP_TO; // We didn't modify anything. return CurrentModifiers; } public: MappableExprsHandler(const OMPExecutableDirective &Dir, CodeGenFunction &CGF) : CurDir(Dir), CGF(CGF) { // Extract firstprivate clause information. for (const auto *C : Dir.getClausesOfKind()) for (const auto *D : C->varlists()) FirstPrivateDecls.insert( cast(cast(D)->getDecl())->getCanonicalDecl()); // Extract device pointer clause information. for (const auto *C : Dir.getClausesOfKind()) for (auto L : C->component_lists()) DevPointersMap[L.first].push_back(L.second); } /// \brief Generate all the base pointers, section pointers, sizes and map /// types for the extracted mappable expressions. Also, for each item that /// relates with a device pointer, a pair of the relevant declaration and /// index where it occurs is appended to the device pointers info array. void generateAllInfo(MapBaseValuesArrayTy &BasePointers, MapValuesArrayTy &Pointers, MapValuesArrayTy &Sizes, MapFlagsArrayTy &Types) const { BasePointers.clear(); Pointers.clear(); Sizes.clear(); Types.clear(); struct MapInfo { /// Kind that defines how a device pointer has to be returned. enum ReturnPointerKind { // Don't have to return any pointer. RPK_None, // Pointer is the base of the declaration. RPK_Base, // Pointer is a member of the base declaration - 'this' RPK_Member, // Pointer is a reference and a member of the base declaration - 'this' RPK_MemberReference, }; OMPClauseMappableExprCommon::MappableExprComponentListRef Components; OpenMPMapClauseKind MapType; OpenMPMapClauseKind MapTypeModifier; ReturnPointerKind ReturnDevicePointer; MapInfo() : MapType(OMPC_MAP_unknown), MapTypeModifier(OMPC_MAP_unknown), ReturnDevicePointer(RPK_None) {} MapInfo( OMPClauseMappableExprCommon::MappableExprComponentListRef Components, OpenMPMapClauseKind MapType, OpenMPMapClauseKind MapTypeModifier, ReturnPointerKind ReturnDevicePointer) : Components(Components), MapType(MapType), MapTypeModifier(MapTypeModifier), ReturnDevicePointer(ReturnDevicePointer) {} }; // We have to process the component lists that relate with the same // declaration in a single chunk so that we can generate the map flags // correctly. Therefore, we organize all lists in a map. llvm::DenseMap> Info; // Helper function to fill the information map for the different supported // clauses. auto &&InfoGen = [&Info]( const ValueDecl *D, OMPClauseMappableExprCommon::MappableExprComponentListRef L, OpenMPMapClauseKind MapType, OpenMPMapClauseKind MapModifier, MapInfo::ReturnPointerKind ReturnDevicePointer) { const ValueDecl *VD = D ? cast(D->getCanonicalDecl()) : nullptr; Info[VD].push_back({L, MapType, MapModifier, ReturnDevicePointer}); }; // FIXME: MSVC 2013 seems to require this-> to find member CurDir. for (auto *C : this->CurDir.getClausesOfKind()) for (auto L : C->component_lists()) InfoGen(L.first, L.second, C->getMapType(), C->getMapTypeModifier(), MapInfo::RPK_None); for (auto *C : this->CurDir.getClausesOfKind()) for (auto L : C->component_lists()) InfoGen(L.first, L.second, OMPC_MAP_to, OMPC_MAP_unknown, MapInfo::RPK_None); for (auto *C : this->CurDir.getClausesOfKind()) for (auto L : C->component_lists()) InfoGen(L.first, L.second, OMPC_MAP_from, OMPC_MAP_unknown, MapInfo::RPK_None); // Look at the use_device_ptr clause information and mark the existing map // entries as such. If there is no map information for an entry in the // use_device_ptr list, we create one with map type 'alloc' and zero size // section. It is the user fault if that was not mapped before. // FIXME: MSVC 2013 seems to require this-> to find member CurDir. for (auto *C : this->CurDir.getClausesOfKind()) for (auto L : C->component_lists()) { assert(!L.second.empty() && "Not expecting empty list of components!"); const ValueDecl *VD = L.second.back().getAssociatedDeclaration(); VD = cast(VD->getCanonicalDecl()); auto *IE = L.second.back().getAssociatedExpression(); // If the first component is a member expression, we have to look into // 'this', which maps to null in the map of map information. Otherwise // look directly for the information. auto It = Info.find(isa(IE) ? nullptr : VD); // We potentially have map information for this declaration already. // Look for the first set of components that refer to it. if (It != Info.end()) { auto CI = std::find_if( It->second.begin(), It->second.end(), [VD](const MapInfo &MI) { return MI.Components.back().getAssociatedDeclaration() == VD; }); // If we found a map entry, signal that the pointer has to be returned // and move on to the next declaration. if (CI != It->second.end()) { CI->ReturnDevicePointer = isa(IE) ? (VD->getType()->isReferenceType() ? MapInfo::RPK_MemberReference : MapInfo::RPK_Member) : MapInfo::RPK_Base; continue; } } // We didn't find any match in our map information - generate a zero // size array section. // FIXME: MSVC 2013 seems to require this-> to find member CGF. llvm::Value *Ptr = this->CGF .EmitLoadOfLValue(this->CGF.EmitLValue(IE), SourceLocation()) .getScalarVal(); BasePointers.push_back({Ptr, VD}); Pointers.push_back(Ptr); Sizes.push_back(llvm::Constant::getNullValue(this->CGF.SizeTy)); Types.push_back(OMP_MAP_RETURN_PTR | OMP_MAP_FIRST_REF); } for (auto &M : Info) { // We need to know when we generate information for the first component // associated with a capture, because the mapping flags depend on it. bool IsFirstComponentList = true; for (MapInfo &L : M.second) { assert(!L.Components.empty() && "Not expecting declaration with no component lists."); // Remember the current base pointer index. unsigned CurrentBasePointersIdx = BasePointers.size(); // FIXME: MSVC 2013 seems to require this-> to find the member method. this->generateInfoForComponentList(L.MapType, L.MapTypeModifier, L.Components, BasePointers, Pointers, Sizes, Types, IsFirstComponentList); // If this entry relates with a device pointer, set the relevant // declaration and add the 'return pointer' flag. if (IsFirstComponentList && L.ReturnDevicePointer != MapInfo::RPK_None) { // If the pointer is not the base of the map, we need to skip the // base. If it is a reference in a member field, we also need to skip // the map of the reference. if (L.ReturnDevicePointer != MapInfo::RPK_Base) { ++CurrentBasePointersIdx; if (L.ReturnDevicePointer == MapInfo::RPK_MemberReference) ++CurrentBasePointersIdx; } assert(BasePointers.size() > CurrentBasePointersIdx && "Unexpected number of mapped base pointers."); auto *RelevantVD = L.Components.back().getAssociatedDeclaration(); assert(RelevantVD && "No relevant declaration related with device pointer??"); BasePointers[CurrentBasePointersIdx].setDevicePtrDecl(RelevantVD); Types[CurrentBasePointersIdx] |= OMP_MAP_RETURN_PTR; } IsFirstComponentList = false; } } } /// \brief Generate the base pointers, section pointers, sizes and map types /// associated to a given capture. void generateInfoForCapture(const CapturedStmt::Capture *Cap, llvm::Value *Arg, MapBaseValuesArrayTy &BasePointers, MapValuesArrayTy &Pointers, MapValuesArrayTy &Sizes, MapFlagsArrayTy &Types) const { assert(!Cap->capturesVariableArrayType() && "Not expecting to generate map info for a variable array type!"); BasePointers.clear(); Pointers.clear(); Sizes.clear(); Types.clear(); // We need to know when we generating information for the first component // associated with a capture, because the mapping flags depend on it. bool IsFirstComponentList = true; const ValueDecl *VD = Cap->capturesThis() ? nullptr : cast(Cap->getCapturedVar()->getCanonicalDecl()); // If this declaration appears in a is_device_ptr clause we just have to // pass the pointer by value. If it is a reference to a declaration, we just // pass its value, otherwise, if it is a member expression, we need to map // 'to' the field. if (!VD) { auto It = DevPointersMap.find(VD); if (It != DevPointersMap.end()) { for (auto L : It->second) { generateInfoForComponentList( /*MapType=*/OMPC_MAP_to, /*MapTypeModifier=*/OMPC_MAP_unknown, L, BasePointers, Pointers, Sizes, Types, IsFirstComponentList); IsFirstComponentList = false; } return; } } else if (DevPointersMap.count(VD)) { BasePointers.push_back({Arg, VD}); Pointers.push_back(Arg); Sizes.push_back(CGF.getTypeSize(CGF.getContext().VoidPtrTy)); Types.push_back(OMP_MAP_PRIVATE_VAL | OMP_MAP_FIRST_REF); return; } // FIXME: MSVC 2013 seems to require this-> to find member CurDir. for (auto *C : this->CurDir.getClausesOfKind()) for (auto L : C->decl_component_lists(VD)) { assert(L.first == VD && "We got information for the wrong declaration??"); assert(!L.second.empty() && "Not expecting declaration with no component lists."); generateInfoForComponentList(C->getMapType(), C->getMapTypeModifier(), L.second, BasePointers, Pointers, Sizes, Types, IsFirstComponentList); IsFirstComponentList = false; } return; } /// \brief Generate the default map information for a given capture \a CI, /// record field declaration \a RI and captured value \a CV. void generateDefaultMapInfo(const CapturedStmt::Capture &CI, const FieldDecl &RI, llvm::Value *CV, MapBaseValuesArrayTy &CurBasePointers, MapValuesArrayTy &CurPointers, MapValuesArrayTy &CurSizes, MapFlagsArrayTy &CurMapTypes) { // Do the default mapping. if (CI.capturesThis()) { CurBasePointers.push_back(CV); CurPointers.push_back(CV); const PointerType *PtrTy = cast(RI.getType().getTypePtr()); CurSizes.push_back(CGF.getTypeSize(PtrTy->getPointeeType())); // Default map type. CurMapTypes.push_back(OMP_MAP_TO | OMP_MAP_FROM); } else if (CI.capturesVariableByCopy()) { CurBasePointers.push_back(CV); CurPointers.push_back(CV); if (!RI.getType()->isAnyPointerType()) { // We have to signal to the runtime captures passed by value that are // not pointers. CurMapTypes.push_back(OMP_MAP_PRIVATE_VAL); CurSizes.push_back(CGF.getTypeSize(RI.getType())); } else { // Pointers are implicitly mapped with a zero size and no flags // (other than first map that is added for all implicit maps). CurMapTypes.push_back(0u); CurSizes.push_back(llvm::Constant::getNullValue(CGF.SizeTy)); } } else { assert(CI.capturesVariable() && "Expected captured reference."); CurBasePointers.push_back(CV); CurPointers.push_back(CV); const ReferenceType *PtrTy = cast(RI.getType().getTypePtr()); QualType ElementType = PtrTy->getPointeeType(); CurSizes.push_back(CGF.getTypeSize(ElementType)); // The default map type for a scalar/complex type is 'to' because by // default the value doesn't have to be retrieved. For an aggregate // type, the default is 'tofrom'. CurMapTypes.push_back(ElementType->isAggregateType() ? (OMP_MAP_TO | OMP_MAP_FROM) : OMP_MAP_TO); // If we have a capture by reference we may need to add the private // pointer flag if the base declaration shows in some first-private // clause. CurMapTypes.back() = adjustMapModifiersForPrivateClauses(CI, CurMapTypes.back()); } // Every default map produces a single argument, so, it is always the // first one. CurMapTypes.back() |= OMP_MAP_FIRST_REF; } }; enum OpenMPOffloadingReservedDeviceIDs { /// \brief Device ID if the device was not defined, runtime should get it /// from environment variables in the spec. OMP_DEVICEID_UNDEF = -1, }; } // anonymous namespace /// \brief Emit the arrays used to pass the captures and map information to the /// offloading runtime library. If there is no map or capture information, /// return nullptr by reference. static void emitOffloadingArrays(CodeGenFunction &CGF, MappableExprsHandler::MapBaseValuesArrayTy &BasePointers, MappableExprsHandler::MapValuesArrayTy &Pointers, MappableExprsHandler::MapValuesArrayTy &Sizes, MappableExprsHandler::MapFlagsArrayTy &MapTypes, CGOpenMPRuntime::TargetDataInfo &Info) { auto &CGM = CGF.CGM; auto &Ctx = CGF.getContext(); // Reset the array information. Info.clearArrayInfo(); Info.NumberOfPtrs = BasePointers.size(); if (Info.NumberOfPtrs) { // Detect if we have any capture size requiring runtime evaluation of the // size so that a constant array could be eventually used. bool hasRuntimeEvaluationCaptureSize = false; for (auto *S : Sizes) if (!isa(S)) { hasRuntimeEvaluationCaptureSize = true; break; } llvm::APInt PointerNumAP(32, Info.NumberOfPtrs, /*isSigned=*/true); QualType PointerArrayType = Ctx.getConstantArrayType(Ctx.VoidPtrTy, PointerNumAP, ArrayType::Normal, /*IndexTypeQuals=*/0); Info.BasePointersArray = CGF.CreateMemTemp(PointerArrayType, ".offload_baseptrs").getPointer(); Info.PointersArray = CGF.CreateMemTemp(PointerArrayType, ".offload_ptrs").getPointer(); // If we don't have any VLA types or other types that require runtime // evaluation, we can use a constant array for the map sizes, otherwise we // need to fill up the arrays as we do for the pointers. if (hasRuntimeEvaluationCaptureSize) { QualType SizeArrayType = Ctx.getConstantArrayType( Ctx.getSizeType(), PointerNumAP, ArrayType::Normal, /*IndexTypeQuals=*/0); Info.SizesArray = CGF.CreateMemTemp(SizeArrayType, ".offload_sizes").getPointer(); } else { // We expect all the sizes to be constant, so we collect them to create // a constant array. SmallVector ConstSizes; for (auto S : Sizes) ConstSizes.push_back(cast(S)); auto *SizesArrayInit = llvm::ConstantArray::get( llvm::ArrayType::get(CGM.SizeTy, ConstSizes.size()), ConstSizes); auto *SizesArrayGbl = new llvm::GlobalVariable( CGM.getModule(), SizesArrayInit->getType(), /*isConstant=*/true, llvm::GlobalValue::PrivateLinkage, SizesArrayInit, ".offload_sizes"); SizesArrayGbl->setUnnamedAddr(llvm::GlobalValue::UnnamedAddr::Global); Info.SizesArray = SizesArrayGbl; } // The map types are always constant so we don't need to generate code to // fill arrays. Instead, we create an array constant. llvm::Constant *MapTypesArrayInit = llvm::ConstantDataArray::get(CGF.Builder.getContext(), MapTypes); auto *MapTypesArrayGbl = new llvm::GlobalVariable( CGM.getModule(), MapTypesArrayInit->getType(), /*isConstant=*/true, llvm::GlobalValue::PrivateLinkage, MapTypesArrayInit, ".offload_maptypes"); MapTypesArrayGbl->setUnnamedAddr(llvm::GlobalValue::UnnamedAddr::Global); Info.MapTypesArray = MapTypesArrayGbl; for (unsigned i = 0; i < Info.NumberOfPtrs; ++i) { llvm::Value *BPVal = *BasePointers[i]; if (BPVal->getType()->isPointerTy()) BPVal = CGF.Builder.CreateBitCast(BPVal, CGM.VoidPtrTy); else { assert(BPVal->getType()->isIntegerTy() && "If not a pointer, the value type must be an integer."); BPVal = CGF.Builder.CreateIntToPtr(BPVal, CGM.VoidPtrTy); } llvm::Value *BP = CGF.Builder.CreateConstInBoundsGEP2_32( llvm::ArrayType::get(CGM.VoidPtrTy, Info.NumberOfPtrs), Info.BasePointersArray, 0, i); Address BPAddr(BP, Ctx.getTypeAlignInChars(Ctx.VoidPtrTy)); CGF.Builder.CreateStore(BPVal, BPAddr); if (Info.requiresDevicePointerInfo()) if (auto *DevVD = BasePointers[i].getDevicePtrDecl()) Info.CaptureDeviceAddrMap.insert(std::make_pair(DevVD, BPAddr)); llvm::Value *PVal = Pointers[i]; if (PVal->getType()->isPointerTy()) PVal = CGF.Builder.CreateBitCast(PVal, CGM.VoidPtrTy); else { assert(PVal->getType()->isIntegerTy() && "If not a pointer, the value type must be an integer."); PVal = CGF.Builder.CreateIntToPtr(PVal, CGM.VoidPtrTy); } llvm::Value *P = CGF.Builder.CreateConstInBoundsGEP2_32( llvm::ArrayType::get(CGM.VoidPtrTy, Info.NumberOfPtrs), Info.PointersArray, 0, i); Address PAddr(P, Ctx.getTypeAlignInChars(Ctx.VoidPtrTy)); CGF.Builder.CreateStore(PVal, PAddr); if (hasRuntimeEvaluationCaptureSize) { llvm::Value *S = CGF.Builder.CreateConstInBoundsGEP2_32( llvm::ArrayType::get(CGM.SizeTy, Info.NumberOfPtrs), Info.SizesArray, /*Idx0=*/0, /*Idx1=*/i); Address SAddr(S, Ctx.getTypeAlignInChars(Ctx.getSizeType())); CGF.Builder.CreateStore( CGF.Builder.CreateIntCast(Sizes[i], CGM.SizeTy, /*isSigned=*/true), SAddr); } } } } /// \brief Emit the arguments to be passed to the runtime library based on the /// arrays of pointers, sizes and map types. static void emitOffloadingArraysArgument( CodeGenFunction &CGF, llvm::Value *&BasePointersArrayArg, llvm::Value *&PointersArrayArg, llvm::Value *&SizesArrayArg, llvm::Value *&MapTypesArrayArg, CGOpenMPRuntime::TargetDataInfo &Info) { auto &CGM = CGF.CGM; if (Info.NumberOfPtrs) { BasePointersArrayArg = CGF.Builder.CreateConstInBoundsGEP2_32( llvm::ArrayType::get(CGM.VoidPtrTy, Info.NumberOfPtrs), Info.BasePointersArray, /*Idx0=*/0, /*Idx1=*/0); PointersArrayArg = CGF.Builder.CreateConstInBoundsGEP2_32( llvm::ArrayType::get(CGM.VoidPtrTy, Info.NumberOfPtrs), Info.PointersArray, /*Idx0=*/0, /*Idx1=*/0); SizesArrayArg = CGF.Builder.CreateConstInBoundsGEP2_32( llvm::ArrayType::get(CGM.SizeTy, Info.NumberOfPtrs), Info.SizesArray, /*Idx0=*/0, /*Idx1=*/0); MapTypesArrayArg = CGF.Builder.CreateConstInBoundsGEP2_32( llvm::ArrayType::get(CGM.Int32Ty, Info.NumberOfPtrs), Info.MapTypesArray, /*Idx0=*/0, /*Idx1=*/0); } else { BasePointersArrayArg = llvm::ConstantPointerNull::get(CGM.VoidPtrPtrTy); PointersArrayArg = llvm::ConstantPointerNull::get(CGM.VoidPtrPtrTy); SizesArrayArg = llvm::ConstantPointerNull::get(CGM.SizeTy->getPointerTo()); MapTypesArrayArg = llvm::ConstantPointerNull::get(CGM.Int32Ty->getPointerTo()); } } void CGOpenMPRuntime::emitTargetCall(CodeGenFunction &CGF, const OMPExecutableDirective &D, llvm::Value *OutlinedFn, llvm::Value *OutlinedFnID, const Expr *IfCond, const Expr *Device, ArrayRef CapturedVars) { if (!CGF.HaveInsertPoint()) return; assert(OutlinedFn && "Invalid outlined function!"); auto &Ctx = CGF.getContext(); // Fill up the arrays with all the captured variables. MappableExprsHandler::MapValuesArrayTy KernelArgs; MappableExprsHandler::MapBaseValuesArrayTy BasePointers; MappableExprsHandler::MapValuesArrayTy Pointers; MappableExprsHandler::MapValuesArrayTy Sizes; MappableExprsHandler::MapFlagsArrayTy MapTypes; MappableExprsHandler::MapBaseValuesArrayTy CurBasePointers; MappableExprsHandler::MapValuesArrayTy CurPointers; MappableExprsHandler::MapValuesArrayTy CurSizes; MappableExprsHandler::MapFlagsArrayTy CurMapTypes; // Get mappable expression information. MappableExprsHandler MEHandler(D, CGF); const CapturedStmt &CS = *cast(D.getAssociatedStmt()); auto RI = CS.getCapturedRecordDecl()->field_begin(); auto CV = CapturedVars.begin(); for (CapturedStmt::const_capture_iterator CI = CS.capture_begin(), CE = CS.capture_end(); CI != CE; ++CI, ++RI, ++CV) { StringRef Name; QualType Ty; CurBasePointers.clear(); CurPointers.clear(); CurSizes.clear(); CurMapTypes.clear(); // VLA sizes are passed to the outlined region by copy and do not have map // information associated. if (CI->capturesVariableArrayType()) { CurBasePointers.push_back(*CV); CurPointers.push_back(*CV); CurSizes.push_back(CGF.getTypeSize(RI->getType())); // Copy to the device as an argument. No need to retrieve it. CurMapTypes.push_back(MappableExprsHandler::OMP_MAP_PRIVATE_VAL | MappableExprsHandler::OMP_MAP_FIRST_REF); } else { // If we have any information in the map clause, we use it, otherwise we // just do a default mapping. MEHandler.generateInfoForCapture(CI, *CV, CurBasePointers, CurPointers, CurSizes, CurMapTypes); if (CurBasePointers.empty()) MEHandler.generateDefaultMapInfo(*CI, **RI, *CV, CurBasePointers, CurPointers, CurSizes, CurMapTypes); } // We expect to have at least an element of information for this capture. assert(!CurBasePointers.empty() && "Non-existing map pointer for capture!"); assert(CurBasePointers.size() == CurPointers.size() && CurBasePointers.size() == CurSizes.size() && CurBasePointers.size() == CurMapTypes.size() && "Inconsistent map information sizes!"); // The kernel args are always the first elements of the base pointers // associated with a capture. KernelArgs.push_back(*CurBasePointers.front()); // We need to append the results of this capture to what we already have. BasePointers.append(CurBasePointers.begin(), CurBasePointers.end()); Pointers.append(CurPointers.begin(), CurPointers.end()); Sizes.append(CurSizes.begin(), CurSizes.end()); MapTypes.append(CurMapTypes.begin(), CurMapTypes.end()); } // Keep track on whether the host function has to be executed. auto OffloadErrorQType = Ctx.getIntTypeForBitwidth(/*DestWidth=*/32, /*Signed=*/true); auto OffloadError = CGF.MakeAddrLValue( CGF.CreateMemTemp(OffloadErrorQType, ".run_host_version"), OffloadErrorQType); CGF.EmitStoreOfScalar(llvm::Constant::getNullValue(CGM.Int32Ty), OffloadError); // Fill up the pointer arrays and transfer execution to the device. auto &&ThenGen = [&Ctx, &BasePointers, &Pointers, &Sizes, &MapTypes, Device, OutlinedFnID, OffloadError, OffloadErrorQType, &D](CodeGenFunction &CGF, PrePostActionTy &) { auto &RT = CGF.CGM.getOpenMPRuntime(); // Emit the offloading arrays. TargetDataInfo Info; emitOffloadingArrays(CGF, BasePointers, Pointers, Sizes, MapTypes, Info); emitOffloadingArraysArgument(CGF, Info.BasePointersArray, Info.PointersArray, Info.SizesArray, Info.MapTypesArray, Info); // On top of the arrays that were filled up, the target offloading call // takes as arguments the device id as well as the host pointer. The host // pointer is used by the runtime library to identify the current target // region, so it only has to be unique and not necessarily point to // anything. It could be the pointer to the outlined function that // implements the target region, but we aren't using that so that the // compiler doesn't need to keep that, and could therefore inline the host // function if proven worthwhile during optimization. // From this point on, we need to have an ID of the target region defined. assert(OutlinedFnID && "Invalid outlined function ID!"); // Emit device ID if any. llvm::Value *DeviceID; if (Device) DeviceID = CGF.Builder.CreateIntCast(CGF.EmitScalarExpr(Device), CGF.Int32Ty, /*isSigned=*/true); else DeviceID = CGF.Builder.getInt32(OMP_DEVICEID_UNDEF); // Emit the number of elements in the offloading arrays. llvm::Value *PointerNum = CGF.Builder.getInt32(BasePointers.size()); // Return value of the runtime offloading call. llvm::Value *Return; auto *NumTeams = emitNumTeamsClauseForTargetDirective(RT, CGF, D); auto *ThreadLimit = emitThreadLimitClauseForTargetDirective(RT, CGF, D); // If we have NumTeams defined this means that we have an enclosed teams // region. Therefore we also expect to have ThreadLimit defined. These two // values should be defined in the presence of a teams directive, regardless // of having any clauses associated. If the user is using teams but no // clauses, these two values will be the default that should be passed to // the runtime library - a 32-bit integer with the value zero. if (NumTeams) { assert(ThreadLimit && "Thread limit expression should be available along " "with number of teams."); llvm::Value *OffloadingArgs[] = { DeviceID, OutlinedFnID, PointerNum, Info.BasePointersArray, Info.PointersArray, Info.SizesArray, Info.MapTypesArray, NumTeams, ThreadLimit}; Return = CGF.EmitRuntimeCall( RT.createRuntimeFunction(OMPRTL__tgt_target_teams), OffloadingArgs); } else { llvm::Value *OffloadingArgs[] = { DeviceID, OutlinedFnID, PointerNum, Info.BasePointersArray, Info.PointersArray, Info.SizesArray, Info.MapTypesArray}; Return = CGF.EmitRuntimeCall(RT.createRuntimeFunction(OMPRTL__tgt_target), OffloadingArgs); } CGF.EmitStoreOfScalar(Return, OffloadError); }; // Notify that the host version must be executed. auto &&ElseGen = [OffloadError](CodeGenFunction &CGF, PrePostActionTy &) { CGF.EmitStoreOfScalar(llvm::ConstantInt::get(CGF.Int32Ty, /*V=*/-1u), OffloadError); }; // If we have a target function ID it means that we need to support // offloading, otherwise, just execute on the host. We need to execute on host // regardless of the conditional in the if clause if, e.g., the user do not // specify target triples. if (OutlinedFnID) { if (IfCond) emitOMPIfClause(CGF, IfCond, ThenGen, ElseGen); else { RegionCodeGenTy ThenRCG(ThenGen); ThenRCG(CGF); } } else { RegionCodeGenTy ElseRCG(ElseGen); ElseRCG(CGF); } // Check the error code and execute the host version if required. auto OffloadFailedBlock = CGF.createBasicBlock("omp_offload.failed"); auto OffloadContBlock = CGF.createBasicBlock("omp_offload.cont"); auto OffloadErrorVal = CGF.EmitLoadOfScalar(OffloadError, SourceLocation()); auto Failed = CGF.Builder.CreateIsNotNull(OffloadErrorVal); CGF.Builder.CreateCondBr(Failed, OffloadFailedBlock, OffloadContBlock); CGF.EmitBlock(OffloadFailedBlock); CGF.Builder.CreateCall(OutlinedFn, KernelArgs); CGF.EmitBranch(OffloadContBlock); CGF.EmitBlock(OffloadContBlock, /*IsFinished=*/true); } void CGOpenMPRuntime::scanForTargetRegionsFunctions(const Stmt *S, StringRef ParentName) { if (!S) return; // If we find a OMP target directive, codegen the outline function and // register the result. // FIXME: Add other directives with target when they become supported. bool isTargetDirective = isa(S); if (isTargetDirective) { auto *E = cast(S); unsigned DeviceID; unsigned FileID; unsigned Line; getTargetEntryUniqueInfo(CGM.getContext(), E->getLocStart(), DeviceID, FileID, Line); // Is this a target region that should not be emitted as an entry point? If // so just signal we are done with this target region. if (!OffloadEntriesInfoManager.hasTargetRegionEntryInfo(DeviceID, FileID, ParentName, Line)) return; llvm::Function *Fn; llvm::Constant *Addr; std::tie(Fn, Addr) = CodeGenFunction::EmitOMPTargetDirectiveOutlinedFunction( CGM, cast(*E), ParentName, /*isOffloadEntry=*/true); assert(Fn && Addr && "Target region emission failed."); return; } if (const OMPExecutableDirective *E = dyn_cast(S)) { if (!E->hasAssociatedStmt()) return; scanForTargetRegionsFunctions( cast(E->getAssociatedStmt())->getCapturedStmt(), ParentName); return; } // If this is a lambda function, look into its body. if (auto *L = dyn_cast(S)) S = L->getBody(); // Keep looking for target regions recursively. for (auto *II : S->children()) scanForTargetRegionsFunctions(II, ParentName); } bool CGOpenMPRuntime::emitTargetFunctions(GlobalDecl GD) { auto &FD = *cast(GD.getDecl()); // If emitting code for the host, we do not process FD here. Instead we do // the normal code generation. if (!CGM.getLangOpts().OpenMPIsDevice) return false; // Try to detect target regions in the function. scanForTargetRegionsFunctions(FD.getBody(), CGM.getMangledName(GD)); // We should not emit any function other that the ones created during the // scanning. Therefore, we signal that this function is completely dealt // with. return true; } bool CGOpenMPRuntime::emitTargetGlobalVariable(GlobalDecl GD) { if (!CGM.getLangOpts().OpenMPIsDevice) return false; // Check if there are Ctors/Dtors in this declaration and look for target // regions in it. We use the complete variant to produce the kernel name // mangling. QualType RDTy = cast(GD.getDecl())->getType(); if (auto *RD = RDTy->getBaseElementTypeUnsafe()->getAsCXXRecordDecl()) { for (auto *Ctor : RD->ctors()) { StringRef ParentName = CGM.getMangledName(GlobalDecl(Ctor, Ctor_Complete)); scanForTargetRegionsFunctions(Ctor->getBody(), ParentName); } auto *Dtor = RD->getDestructor(); if (Dtor) { StringRef ParentName = CGM.getMangledName(GlobalDecl(Dtor, Dtor_Complete)); scanForTargetRegionsFunctions(Dtor->getBody(), ParentName); } } // If we are in target mode we do not emit any global (declare target is not // implemented yet). Therefore we signal that GD was processed in this case. return true; } bool CGOpenMPRuntime::emitTargetGlobal(GlobalDecl GD) { auto *VD = GD.getDecl(); if (isa(VD)) return emitTargetFunctions(GD); return emitTargetGlobalVariable(GD); } llvm::Function *CGOpenMPRuntime::emitRegistrationFunction() { // If we have offloading in the current module, we need to emit the entries // now and register the offloading descriptor. createOffloadEntriesAndInfoMetadata(); // Create and register the offloading binary descriptors. This is the main // entity that captures all the information about offloading in the current // compilation unit. return createOffloadingBinaryDescriptorRegistration(); } void CGOpenMPRuntime::emitTeamsCall(CodeGenFunction &CGF, const OMPExecutableDirective &D, SourceLocation Loc, llvm::Value *OutlinedFn, ArrayRef CapturedVars) { if (!CGF.HaveInsertPoint()) return; auto *RTLoc = emitUpdateLocation(CGF, Loc); CodeGenFunction::RunCleanupsScope Scope(CGF); // Build call __kmpc_fork_teams(loc, n, microtask, var1, .., varn); llvm::Value *Args[] = { RTLoc, CGF.Builder.getInt32(CapturedVars.size()), // Number of captured vars CGF.Builder.CreateBitCast(OutlinedFn, getKmpc_MicroPointerTy())}; llvm::SmallVector RealArgs; RealArgs.append(std::begin(Args), std::end(Args)); RealArgs.append(CapturedVars.begin(), CapturedVars.end()); auto RTLFn = createRuntimeFunction(OMPRTL__kmpc_fork_teams); CGF.EmitRuntimeCall(RTLFn, RealArgs); } void CGOpenMPRuntime::emitNumTeamsClause(CodeGenFunction &CGF, const Expr *NumTeams, const Expr *ThreadLimit, SourceLocation Loc) { if (!CGF.HaveInsertPoint()) return; auto *RTLoc = emitUpdateLocation(CGF, Loc); llvm::Value *NumTeamsVal = (NumTeams) ? CGF.Builder.CreateIntCast(CGF.EmitScalarExpr(NumTeams), CGF.CGM.Int32Ty, /* isSigned = */ true) : CGF.Builder.getInt32(0); llvm::Value *ThreadLimitVal = (ThreadLimit) ? CGF.Builder.CreateIntCast(CGF.EmitScalarExpr(ThreadLimit), CGF.CGM.Int32Ty, /* isSigned = */ true) : CGF.Builder.getInt32(0); // Build call __kmpc_push_num_teamss(&loc, global_tid, num_teams, thread_limit) llvm::Value *PushNumTeamsArgs[] = {RTLoc, getThreadID(CGF, Loc), NumTeamsVal, ThreadLimitVal}; CGF.EmitRuntimeCall(createRuntimeFunction(OMPRTL__kmpc_push_num_teams), PushNumTeamsArgs); } void CGOpenMPRuntime::emitTargetDataCalls( CodeGenFunction &CGF, const OMPExecutableDirective &D, const Expr *IfCond, const Expr *Device, const RegionCodeGenTy &CodeGen, TargetDataInfo &Info) { if (!CGF.HaveInsertPoint()) return; // Action used to replace the default codegen action and turn privatization // off. PrePostActionTy NoPrivAction; // Generate the code for the opening of the data environment. Capture all the // arguments of the runtime call by reference because they are used in the // closing of the region. auto &&BeginThenGen = [&D, &CGF, Device, &Info, &CodeGen, &NoPrivAction]( CodeGenFunction &CGF, PrePostActionTy &) { // Fill up the arrays with all the mapped variables. MappableExprsHandler::MapBaseValuesArrayTy BasePointers; MappableExprsHandler::MapValuesArrayTy Pointers; MappableExprsHandler::MapValuesArrayTy Sizes; MappableExprsHandler::MapFlagsArrayTy MapTypes; // Get map clause information. MappableExprsHandler MCHandler(D, CGF); MCHandler.generateAllInfo(BasePointers, Pointers, Sizes, MapTypes); // Fill up the arrays and create the arguments. emitOffloadingArrays(CGF, BasePointers, Pointers, Sizes, MapTypes, Info); llvm::Value *BasePointersArrayArg = nullptr; llvm::Value *PointersArrayArg = nullptr; llvm::Value *SizesArrayArg = nullptr; llvm::Value *MapTypesArrayArg = nullptr; emitOffloadingArraysArgument(CGF, BasePointersArrayArg, PointersArrayArg, SizesArrayArg, MapTypesArrayArg, Info); // Emit device ID if any. llvm::Value *DeviceID = nullptr; if (Device) DeviceID = CGF.Builder.CreateIntCast(CGF.EmitScalarExpr(Device), CGF.Int32Ty, /*isSigned=*/true); else DeviceID = CGF.Builder.getInt32(OMP_DEVICEID_UNDEF); // Emit the number of elements in the offloading arrays. auto *PointerNum = CGF.Builder.getInt32(Info.NumberOfPtrs); llvm::Value *OffloadingArgs[] = { DeviceID, PointerNum, BasePointersArrayArg, PointersArrayArg, SizesArrayArg, MapTypesArrayArg}; auto &RT = CGF.CGM.getOpenMPRuntime(); CGF.EmitRuntimeCall(RT.createRuntimeFunction(OMPRTL__tgt_target_data_begin), OffloadingArgs); // If device pointer privatization is required, emit the body of the region // here. It will have to be duplicated: with and without privatization. if (!Info.CaptureDeviceAddrMap.empty()) CodeGen(CGF); }; // Generate code for the closing of the data region. auto &&EndThenGen = [&CGF, Device, &Info](CodeGenFunction &CGF, PrePostActionTy &) { assert(Info.isValid() && "Invalid data environment closing arguments."); llvm::Value *BasePointersArrayArg = nullptr; llvm::Value *PointersArrayArg = nullptr; llvm::Value *SizesArrayArg = nullptr; llvm::Value *MapTypesArrayArg = nullptr; emitOffloadingArraysArgument(CGF, BasePointersArrayArg, PointersArrayArg, SizesArrayArg, MapTypesArrayArg, Info); // Emit device ID if any. llvm::Value *DeviceID = nullptr; if (Device) DeviceID = CGF.Builder.CreateIntCast(CGF.EmitScalarExpr(Device), CGF.Int32Ty, /*isSigned=*/true); else DeviceID = CGF.Builder.getInt32(OMP_DEVICEID_UNDEF); // Emit the number of elements in the offloading arrays. auto *PointerNum = CGF.Builder.getInt32(Info.NumberOfPtrs); llvm::Value *OffloadingArgs[] = { DeviceID, PointerNum, BasePointersArrayArg, PointersArrayArg, SizesArrayArg, MapTypesArrayArg}; auto &RT = CGF.CGM.getOpenMPRuntime(); CGF.EmitRuntimeCall(RT.createRuntimeFunction(OMPRTL__tgt_target_data_end), OffloadingArgs); }; // If we need device pointer privatization, we need to emit the body of the // region with no privatization in the 'else' branch of the conditional. // Otherwise, we don't have to do anything. auto &&BeginElseGen = [&Info, &CodeGen, &NoPrivAction](CodeGenFunction &CGF, PrePostActionTy &) { if (!Info.CaptureDeviceAddrMap.empty()) { CodeGen.setAction(NoPrivAction); CodeGen(CGF); } }; // We don't have to do anything to close the region if the if clause evaluates // to false. auto &&EndElseGen = [](CodeGenFunction &CGF, PrePostActionTy &) {}; if (IfCond) { emitOMPIfClause(CGF, IfCond, BeginThenGen, BeginElseGen); } else { RegionCodeGenTy RCG(BeginThenGen); RCG(CGF); } // If we don't require privatization of device pointers, we emit the body in // between the runtime calls. This avoids duplicating the body code. if (Info.CaptureDeviceAddrMap.empty()) { CodeGen.setAction(NoPrivAction); CodeGen(CGF); } if (IfCond) { emitOMPIfClause(CGF, IfCond, EndThenGen, EndElseGen); } else { RegionCodeGenTy RCG(EndThenGen); RCG(CGF); } } void CGOpenMPRuntime::emitTargetDataStandAloneCall( CodeGenFunction &CGF, const OMPExecutableDirective &D, const Expr *IfCond, const Expr *Device) { if (!CGF.HaveInsertPoint()) return; assert((isa(D) || isa(D) || isa(D)) && "Expecting either target enter, exit data, or update directives."); // Generate the code for the opening of the data environment. auto &&ThenGen = [&D, &CGF, Device](CodeGenFunction &CGF, PrePostActionTy &) { // Fill up the arrays with all the mapped variables. MappableExprsHandler::MapBaseValuesArrayTy BasePointers; MappableExprsHandler::MapValuesArrayTy Pointers; MappableExprsHandler::MapValuesArrayTy Sizes; MappableExprsHandler::MapFlagsArrayTy MapTypes; // Get map clause information. MappableExprsHandler MEHandler(D, CGF); MEHandler.generateAllInfo(BasePointers, Pointers, Sizes, MapTypes); // Fill up the arrays and create the arguments. TargetDataInfo Info; emitOffloadingArrays(CGF, BasePointers, Pointers, Sizes, MapTypes, Info); emitOffloadingArraysArgument(CGF, Info.BasePointersArray, Info.PointersArray, Info.SizesArray, Info.MapTypesArray, Info); // Emit device ID if any. llvm::Value *DeviceID = nullptr; if (Device) DeviceID = CGF.Builder.CreateIntCast(CGF.EmitScalarExpr(Device), CGF.Int32Ty, /*isSigned=*/true); else DeviceID = CGF.Builder.getInt32(OMP_DEVICEID_UNDEF); // Emit the number of elements in the offloading arrays. auto *PointerNum = CGF.Builder.getInt32(BasePointers.size()); llvm::Value *OffloadingArgs[] = { DeviceID, PointerNum, Info.BasePointersArray, Info.PointersArray, Info.SizesArray, Info.MapTypesArray}; auto &RT = CGF.CGM.getOpenMPRuntime(); // Select the right runtime function call for each expected standalone // directive. OpenMPRTLFunction RTLFn; switch (D.getDirectiveKind()) { default: llvm_unreachable("Unexpected standalone target data directive."); break; case OMPD_target_enter_data: RTLFn = OMPRTL__tgt_target_data_begin; break; case OMPD_target_exit_data: RTLFn = OMPRTL__tgt_target_data_end; break; case OMPD_target_update: RTLFn = OMPRTL__tgt_target_data_update; break; } CGF.EmitRuntimeCall(RT.createRuntimeFunction(RTLFn), OffloadingArgs); }; // In the event we get an if clause, we don't have to take any action on the // else side. auto &&ElseGen = [](CodeGenFunction &CGF, PrePostActionTy &) {}; if (IfCond) { emitOMPIfClause(CGF, IfCond, ThenGen, ElseGen); } else { RegionCodeGenTy ThenGenRCG(ThenGen); ThenGenRCG(CGF); } } namespace { /// Kind of parameter in a function with 'declare simd' directive. enum ParamKindTy { LinearWithVarStride, Linear, Uniform, Vector }; /// Attribute set of the parameter. struct ParamAttrTy { ParamKindTy Kind = Vector; llvm::APSInt StrideOrArg; llvm::APSInt Alignment; }; } // namespace static unsigned evaluateCDTSize(const FunctionDecl *FD, ArrayRef ParamAttrs) { // Every vector variant of a SIMD-enabled function has a vector length (VLEN). // If OpenMP clause "simdlen" is used, the VLEN is the value of the argument // of that clause. The VLEN value must be power of 2. // In other case the notion of the function`s "characteristic data type" (CDT) // is used to compute the vector length. // CDT is defined in the following order: // a) For non-void function, the CDT is the return type. // b) If the function has any non-uniform, non-linear parameters, then the // CDT is the type of the first such parameter. // c) If the CDT determined by a) or b) above is struct, union, or class // type which is pass-by-value (except for the type that maps to the // built-in complex data type), the characteristic data type is int. // d) If none of the above three cases is applicable, the CDT is int. // The VLEN is then determined based on the CDT and the size of vector // register of that ISA for which current vector version is generated. The // VLEN is computed using the formula below: // VLEN = sizeof(vector_register) / sizeof(CDT), // where vector register size specified in section 3.2.1 Registers and the // Stack Frame of original AMD64 ABI document. QualType RetType = FD->getReturnType(); if (RetType.isNull()) return 0; ASTContext &C = FD->getASTContext(); QualType CDT; if (!RetType.isNull() && !RetType->isVoidType()) CDT = RetType; else { unsigned Offset = 0; if (auto *MD = dyn_cast(FD)) { if (ParamAttrs[Offset].Kind == Vector) CDT = C.getPointerType(C.getRecordType(MD->getParent())); ++Offset; } if (CDT.isNull()) { for (unsigned I = 0, E = FD->getNumParams(); I < E; ++I) { if (ParamAttrs[I + Offset].Kind == Vector) { CDT = FD->getParamDecl(I)->getType(); break; } } } } if (CDT.isNull()) CDT = C.IntTy; CDT = CDT->getCanonicalTypeUnqualified(); if (CDT->isRecordType() || CDT->isUnionType()) CDT = C.IntTy; return C.getTypeSize(CDT); } static void emitX86DeclareSimdFunction(const FunctionDecl *FD, llvm::Function *Fn, const llvm::APSInt &VLENVal, ArrayRef ParamAttrs, OMPDeclareSimdDeclAttr::BranchStateTy State) { struct ISADataTy { char ISA; unsigned VecRegSize; }; ISADataTy ISAData[] = { { 'b', 128 }, // SSE { 'c', 256 }, // AVX { 'd', 256 }, // AVX2 { 'e', 512 }, // AVX512 }; llvm::SmallVector Masked; switch (State) { case OMPDeclareSimdDeclAttr::BS_Undefined: Masked.push_back('N'); Masked.push_back('M'); break; case OMPDeclareSimdDeclAttr::BS_Notinbranch: Masked.push_back('N'); break; case OMPDeclareSimdDeclAttr::BS_Inbranch: Masked.push_back('M'); break; } for (auto Mask : Masked) { for (auto &Data : ISAData) { SmallString<256> Buffer; llvm::raw_svector_ostream Out(Buffer); Out << "_ZGV" << Data.ISA << Mask; if (!VLENVal) { Out << llvm::APSInt::getUnsigned(Data.VecRegSize / evaluateCDTSize(FD, ParamAttrs)); } else Out << VLENVal; for (auto &ParamAttr : ParamAttrs) { switch (ParamAttr.Kind){ case LinearWithVarStride: Out << 's' << ParamAttr.StrideOrArg; break; case Linear: Out << 'l'; if (!!ParamAttr.StrideOrArg) Out << ParamAttr.StrideOrArg; break; case Uniform: Out << 'u'; break; case Vector: Out << 'v'; break; } if (!!ParamAttr.Alignment) Out << 'a' << ParamAttr.Alignment; } Out << '_' << Fn->getName(); Fn->addFnAttr(Out.str()); } } } void CGOpenMPRuntime::emitDeclareSimdFunction(const FunctionDecl *FD, llvm::Function *Fn) { ASTContext &C = CGM.getContext(); FD = FD->getCanonicalDecl(); // Map params to their positions in function decl. llvm::DenseMap ParamPositions; if (isa(FD)) ParamPositions.insert({FD, 0}); unsigned ParamPos = ParamPositions.size(); for (auto *P : FD->parameters()) { ParamPositions.insert({P->getCanonicalDecl(), ParamPos}); ++ParamPos; } for (auto *Attr : FD->specific_attrs()) { llvm::SmallVector ParamAttrs(ParamPositions.size()); // Mark uniform parameters. for (auto *E : Attr->uniforms()) { E = E->IgnoreParenImpCasts(); unsigned Pos; if (isa(E)) Pos = ParamPositions[FD]; else { auto *PVD = cast(cast(E)->getDecl()) ->getCanonicalDecl(); Pos = ParamPositions[PVD]; } ParamAttrs[Pos].Kind = Uniform; } // Get alignment info. auto NI = Attr->alignments_begin(); for (auto *E : Attr->aligneds()) { E = E->IgnoreParenImpCasts(); unsigned Pos; QualType ParmTy; if (isa(E)) { Pos = ParamPositions[FD]; ParmTy = E->getType(); } else { auto *PVD = cast(cast(E)->getDecl()) ->getCanonicalDecl(); Pos = ParamPositions[PVD]; ParmTy = PVD->getType(); } ParamAttrs[Pos].Alignment = (*NI) ? (*NI)->EvaluateKnownConstInt(C) : llvm::APSInt::getUnsigned( C.toCharUnitsFromBits(C.getOpenMPDefaultSimdAlign(ParmTy)) .getQuantity()); ++NI; } // Mark linear parameters. auto SI = Attr->steps_begin(); auto MI = Attr->modifiers_begin(); for (auto *E : Attr->linears()) { E = E->IgnoreParenImpCasts(); unsigned Pos; if (isa(E)) Pos = ParamPositions[FD]; else { auto *PVD = cast(cast(E)->getDecl()) ->getCanonicalDecl(); Pos = ParamPositions[PVD]; } auto &ParamAttr = ParamAttrs[Pos]; ParamAttr.Kind = Linear; if (*SI) { if (!(*SI)->EvaluateAsInt(ParamAttr.StrideOrArg, C, Expr::SE_AllowSideEffects)) { if (auto *DRE = cast((*SI)->IgnoreParenImpCasts())) { if (auto *StridePVD = cast(DRE->getDecl())) { ParamAttr.Kind = LinearWithVarStride; ParamAttr.StrideOrArg = llvm::APSInt::getUnsigned( ParamPositions[StridePVD->getCanonicalDecl()]); } } } } ++SI; ++MI; } llvm::APSInt VLENVal; if (const Expr *VLEN = Attr->getSimdlen()) VLENVal = VLEN->EvaluateKnownConstInt(C); OMPDeclareSimdDeclAttr::BranchStateTy State = Attr->getBranchState(); if (CGM.getTriple().getArch() == llvm::Triple::x86 || CGM.getTriple().getArch() == llvm::Triple::x86_64) emitX86DeclareSimdFunction(FD, Fn, VLENVal, ParamAttrs, State); } } namespace { /// Cleanup action for doacross support. class DoacrossCleanupTy final : public EHScopeStack::Cleanup { public: static const int DoacrossFinArgs = 2; private: llvm::Value *RTLFn; llvm::Value *Args[DoacrossFinArgs]; public: DoacrossCleanupTy(llvm::Value *RTLFn, ArrayRef CallArgs) : RTLFn(RTLFn) { assert(CallArgs.size() == DoacrossFinArgs); std::copy(CallArgs.begin(), CallArgs.end(), std::begin(Args)); } void Emit(CodeGenFunction &CGF, Flags /*flags*/) override { if (!CGF.HaveInsertPoint()) return; CGF.EmitRuntimeCall(RTLFn, Args); } }; } // namespace void CGOpenMPRuntime::emitDoacrossInit(CodeGenFunction &CGF, const OMPLoopDirective &D) { if (!CGF.HaveInsertPoint()) return; ASTContext &C = CGM.getContext(); QualType Int64Ty = C.getIntTypeForBitwidth(/*DestWidth=*/64, /*Signed=*/true); RecordDecl *RD; if (KmpDimTy.isNull()) { // Build struct kmp_dim { // loop bounds info casted to kmp_int64 // kmp_int64 lo; // lower // kmp_int64 up; // upper // kmp_int64 st; // stride // }; RD = C.buildImplicitRecord("kmp_dim"); RD->startDefinition(); addFieldToRecordDecl(C, RD, Int64Ty); addFieldToRecordDecl(C, RD, Int64Ty); addFieldToRecordDecl(C, RD, Int64Ty); RD->completeDefinition(); KmpDimTy = C.getRecordType(RD); } else RD = cast(KmpDimTy->getAsTagDecl()); Address DimsAddr = CGF.CreateMemTemp(KmpDimTy, "dims"); CGF.EmitNullInitialization(DimsAddr, KmpDimTy); enum { LowerFD = 0, UpperFD, StrideFD }; // Fill dims with data. LValue DimsLVal = CGF.MakeAddrLValue(DimsAddr, KmpDimTy); // dims.upper = num_iterations; LValue UpperLVal = CGF.EmitLValueForField(DimsLVal, *std::next(RD->field_begin(), UpperFD)); llvm::Value *NumIterVal = CGF.EmitScalarConversion( CGF.EmitScalarExpr(D.getNumIterations()), D.getNumIterations()->getType(), Int64Ty, D.getNumIterations()->getExprLoc()); CGF.EmitStoreOfScalar(NumIterVal, UpperLVal); // dims.stride = 1; LValue StrideLVal = CGF.EmitLValueForField(DimsLVal, *std::next(RD->field_begin(), StrideFD)); CGF.EmitStoreOfScalar(llvm::ConstantInt::getSigned(CGM.Int64Ty, /*V=*/1), StrideLVal); // Build call void __kmpc_doacross_init(ident_t *loc, kmp_int32 gtid, // kmp_int32 num_dims, struct kmp_dim * dims); llvm::Value *Args[] = {emitUpdateLocation(CGF, D.getLocStart()), getThreadID(CGF, D.getLocStart()), llvm::ConstantInt::getSigned(CGM.Int32Ty, 1), CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( DimsAddr.getPointer(), CGM.VoidPtrTy)}; llvm::Value *RTLFn = createRuntimeFunction(OMPRTL__kmpc_doacross_init); CGF.EmitRuntimeCall(RTLFn, Args); llvm::Value *FiniArgs[DoacrossCleanupTy::DoacrossFinArgs] = { emitUpdateLocation(CGF, D.getLocEnd()), getThreadID(CGF, D.getLocEnd())}; llvm::Value *FiniRTLFn = createRuntimeFunction(OMPRTL__kmpc_doacross_fini); CGF.EHStack.pushCleanup(NormalAndEHCleanup, FiniRTLFn, llvm::makeArrayRef(FiniArgs)); } void CGOpenMPRuntime::emitDoacrossOrdered(CodeGenFunction &CGF, const OMPDependClause *C) { QualType Int64Ty = CGM.getContext().getIntTypeForBitwidth(/*DestWidth=*/64, /*Signed=*/1); const Expr *CounterVal = C->getCounterValue(); assert(CounterVal); llvm::Value *CntVal = CGF.EmitScalarConversion(CGF.EmitScalarExpr(CounterVal), CounterVal->getType(), Int64Ty, CounterVal->getExprLoc()); Address CntAddr = CGF.CreateMemTemp(Int64Ty, ".cnt.addr"); CGF.EmitStoreOfScalar(CntVal, CntAddr, /*Volatile=*/false, Int64Ty); llvm::Value *Args[] = {emitUpdateLocation(CGF, C->getLocStart()), getThreadID(CGF, C->getLocStart()), CntAddr.getPointer()}; llvm::Value *RTLFn; if (C->getDependencyKind() == OMPC_DEPEND_source) RTLFn = createRuntimeFunction(OMPRTL__kmpc_doacross_post); else { assert(C->getDependencyKind() == OMPC_DEPEND_sink); RTLFn = createRuntimeFunction(OMPRTL__kmpc_doacross_wait); } CGF.EmitRuntimeCall(RTLFn, Args); } Index: vendor/clang/dist/lib/Frontend/InitPreprocessor.cpp =================================================================== --- vendor/clang/dist/lib/Frontend/InitPreprocessor.cpp (revision 314259) +++ vendor/clang/dist/lib/Frontend/InitPreprocessor.cpp (revision 314260) @@ -1,1100 +1,1102 @@ //===--- InitPreprocessor.cpp - PP initialization code. ---------*- C++ -*-===// // // The LLVM Compiler Infrastructure // // This file is distributed under the University of Illinois Open Source // License. See LICENSE.TXT for details. // //===----------------------------------------------------------------------===// // // This file implements the clang::InitializePreprocessor function. // //===----------------------------------------------------------------------===// #include "clang/Basic/FileManager.h" #include "clang/Basic/MacroBuilder.h" #include "clang/Basic/SourceManager.h" #include "clang/Basic/TargetInfo.h" #include "clang/Basic/Version.h" #include "clang/Frontend/FrontendDiagnostic.h" #include "clang/Frontend/FrontendOptions.h" #include "clang/Frontend/Utils.h" #include "clang/Lex/HeaderSearch.h" #include "clang/Lex/PTHManager.h" #include "clang/Lex/Preprocessor.h" #include "clang/Lex/PreprocessorOptions.h" #include "clang/Serialization/ASTReader.h" #include "llvm/ADT/APFloat.h" using namespace clang; static bool MacroBodyEndsInBackslash(StringRef MacroBody) { while (!MacroBody.empty() && isWhitespace(MacroBody.back())) MacroBody = MacroBody.drop_back(); return !MacroBody.empty() && MacroBody.back() == '\\'; } // Append a #define line to Buf for Macro. Macro should be of the form XXX, // in which case we emit "#define XXX 1" or "XXX=Y z W" in which case we emit // "#define XXX Y z W". To get a #define with no value, use "XXX=". static void DefineBuiltinMacro(MacroBuilder &Builder, StringRef Macro, DiagnosticsEngine &Diags) { std::pair MacroPair = Macro.split('='); StringRef MacroName = MacroPair.first; StringRef MacroBody = MacroPair.second; if (MacroName.size() != Macro.size()) { // Per GCC -D semantics, the macro ends at \n if it exists. StringRef::size_type End = MacroBody.find_first_of("\n\r"); if (End != StringRef::npos) Diags.Report(diag::warn_fe_macro_contains_embedded_newline) << MacroName; MacroBody = MacroBody.substr(0, End); // We handle macro bodies which end in a backslash by appending an extra // backslash+newline. This makes sure we don't accidentally treat the // backslash as a line continuation marker. if (MacroBodyEndsInBackslash(MacroBody)) Builder.defineMacro(MacroName, Twine(MacroBody) + "\\\n"); else Builder.defineMacro(MacroName, MacroBody); } else { // Push "macroname 1". Builder.defineMacro(Macro); } } /// AddImplicitInclude - Add an implicit \#include of the specified file to the /// predefines buffer. /// As these includes are generated by -include arguments the header search /// logic is going to search relatively to the current working directory. static void AddImplicitInclude(MacroBuilder &Builder, StringRef File) { Builder.append(Twine("#include \"") + File + "\""); } static void AddImplicitIncludeMacros(MacroBuilder &Builder, StringRef File) { Builder.append(Twine("#__include_macros \"") + File + "\""); // Marker token to stop the __include_macros fetch loop. Builder.append("##"); // ##? } /// AddImplicitIncludePTH - Add an implicit \#include using the original file /// used to generate a PTH cache. static void AddImplicitIncludePTH(MacroBuilder &Builder, Preprocessor &PP, StringRef ImplicitIncludePTH) { PTHManager *P = PP.getPTHManager(); // Null check 'P' in the corner case where it couldn't be created. const char *OriginalFile = P ? P->getOriginalSourceFile() : nullptr; if (!OriginalFile) { PP.getDiagnostics().Report(diag::err_fe_pth_file_has_no_source_header) << ImplicitIncludePTH; return; } AddImplicitInclude(Builder, OriginalFile); } /// \brief Add an implicit \#include using the original file used to generate /// a PCH file. static void AddImplicitIncludePCH(MacroBuilder &Builder, Preprocessor &PP, const PCHContainerReader &PCHContainerRdr, StringRef ImplicitIncludePCH) { std::string OriginalFile = ASTReader::getOriginalSourceFile(ImplicitIncludePCH, PP.getFileManager(), PCHContainerRdr, PP.getDiagnostics()); if (OriginalFile.empty()) return; AddImplicitInclude(Builder, OriginalFile); } /// PickFP - This is used to pick a value based on the FP semantics of the /// specified FP model. template static T PickFP(const llvm::fltSemantics *Sem, T IEEESingleVal, T IEEEDoubleVal, T X87DoubleExtendedVal, T PPCDoubleDoubleVal, T IEEEQuadVal) { if (Sem == (const llvm::fltSemantics*)&llvm::APFloat::IEEEsingle()) return IEEESingleVal; if (Sem == (const llvm::fltSemantics*)&llvm::APFloat::IEEEdouble()) return IEEEDoubleVal; if (Sem == (const llvm::fltSemantics*)&llvm::APFloat::x87DoubleExtended()) return X87DoubleExtendedVal; if (Sem == (const llvm::fltSemantics*)&llvm::APFloat::PPCDoubleDouble()) return PPCDoubleDoubleVal; assert(Sem == (const llvm::fltSemantics*)&llvm::APFloat::IEEEquad()); return IEEEQuadVal; } static void DefineFloatMacros(MacroBuilder &Builder, StringRef Prefix, const llvm::fltSemantics *Sem, StringRef Ext) { const char *DenormMin, *Epsilon, *Max, *Min; DenormMin = PickFP(Sem, "1.40129846e-45", "4.9406564584124654e-324", "3.64519953188247460253e-4951", "4.94065645841246544176568792868221e-324", "6.47517511943802511092443895822764655e-4966"); int Digits = PickFP(Sem, 6, 15, 18, 31, 33); int DecimalDigits = PickFP(Sem, 9, 17, 21, 33, 36); Epsilon = PickFP(Sem, "1.19209290e-7", "2.2204460492503131e-16", "1.08420217248550443401e-19", "4.94065645841246544176568792868221e-324", "1.92592994438723585305597794258492732e-34"); int MantissaDigits = PickFP(Sem, 24, 53, 64, 106, 113); int Min10Exp = PickFP(Sem, -37, -307, -4931, -291, -4931); int Max10Exp = PickFP(Sem, 38, 308, 4932, 308, 4932); int MinExp = PickFP(Sem, -125, -1021, -16381, -968, -16381); int MaxExp = PickFP(Sem, 128, 1024, 16384, 1024, 16384); Min = PickFP(Sem, "1.17549435e-38", "2.2250738585072014e-308", "3.36210314311209350626e-4932", "2.00416836000897277799610805135016e-292", "3.36210314311209350626267781732175260e-4932"); Max = PickFP(Sem, "3.40282347e+38", "1.7976931348623157e+308", "1.18973149535723176502e+4932", "1.79769313486231580793728971405301e+308", "1.18973149535723176508575932662800702e+4932"); SmallString<32> DefPrefix; DefPrefix = "__"; DefPrefix += Prefix; DefPrefix += "_"; Builder.defineMacro(DefPrefix + "DENORM_MIN__", Twine(DenormMin)+Ext); Builder.defineMacro(DefPrefix + "HAS_DENORM__"); Builder.defineMacro(DefPrefix + "DIG__", Twine(Digits)); Builder.defineMacro(DefPrefix + "DECIMAL_DIG__", Twine(DecimalDigits)); Builder.defineMacro(DefPrefix + "EPSILON__", Twine(Epsilon)+Ext); Builder.defineMacro(DefPrefix + "HAS_INFINITY__"); Builder.defineMacro(DefPrefix + "HAS_QUIET_NAN__"); Builder.defineMacro(DefPrefix + "MANT_DIG__", Twine(MantissaDigits)); Builder.defineMacro(DefPrefix + "MAX_10_EXP__", Twine(Max10Exp)); Builder.defineMacro(DefPrefix + "MAX_EXP__", Twine(MaxExp)); Builder.defineMacro(DefPrefix + "MAX__", Twine(Max)+Ext); Builder.defineMacro(DefPrefix + "MIN_10_EXP__","("+Twine(Min10Exp)+")"); Builder.defineMacro(DefPrefix + "MIN_EXP__", "("+Twine(MinExp)+")"); Builder.defineMacro(DefPrefix + "MIN__", Twine(Min)+Ext); } /// DefineTypeSize - Emit a macro to the predefines buffer that declares a macro /// named MacroName with the max value for a type with width 'TypeWidth' a /// signedness of 'isSigned' and with a value suffix of 'ValSuffix' (e.g. LL). static void DefineTypeSize(const Twine &MacroName, unsigned TypeWidth, StringRef ValSuffix, bool isSigned, MacroBuilder &Builder) { llvm::APInt MaxVal = isSigned ? llvm::APInt::getSignedMaxValue(TypeWidth) : llvm::APInt::getMaxValue(TypeWidth); Builder.defineMacro(MacroName, MaxVal.toString(10, isSigned) + ValSuffix); } /// DefineTypeSize - An overloaded helper that uses TargetInfo to determine /// the width, suffix, and signedness of the given type static void DefineTypeSize(const Twine &MacroName, TargetInfo::IntType Ty, const TargetInfo &TI, MacroBuilder &Builder) { DefineTypeSize(MacroName, TI.getTypeWidth(Ty), TI.getTypeConstantSuffix(Ty), TI.isTypeSigned(Ty), Builder); } static void DefineFmt(const Twine &Prefix, TargetInfo::IntType Ty, const TargetInfo &TI, MacroBuilder &Builder) { bool IsSigned = TI.isTypeSigned(Ty); StringRef FmtModifier = TI.getTypeFormatModifier(Ty); for (const char *Fmt = IsSigned ? "di" : "ouxX"; *Fmt; ++Fmt) { Builder.defineMacro(Prefix + "_FMT" + Twine(*Fmt) + "__", Twine("\"") + FmtModifier + Twine(*Fmt) + "\""); } } static void DefineType(const Twine &MacroName, TargetInfo::IntType Ty, MacroBuilder &Builder) { Builder.defineMacro(MacroName, TargetInfo::getTypeName(Ty)); } static void DefineTypeWidth(StringRef MacroName, TargetInfo::IntType Ty, const TargetInfo &TI, MacroBuilder &Builder) { Builder.defineMacro(MacroName, Twine(TI.getTypeWidth(Ty))); } static void DefineTypeSizeof(StringRef MacroName, unsigned BitWidth, const TargetInfo &TI, MacroBuilder &Builder) { Builder.defineMacro(MacroName, Twine(BitWidth / TI.getCharWidth())); } static void DefineExactWidthIntType(TargetInfo::IntType Ty, const TargetInfo &TI, MacroBuilder &Builder) { int TypeWidth = TI.getTypeWidth(Ty); bool IsSigned = TI.isTypeSigned(Ty); // Use the target specified int64 type, when appropriate, so that [u]int64_t // ends up being defined in terms of the correct type. if (TypeWidth == 64) Ty = IsSigned ? TI.getInt64Type() : TI.getUInt64Type(); const char *Prefix = IsSigned ? "__INT" : "__UINT"; DefineType(Prefix + Twine(TypeWidth) + "_TYPE__", Ty, Builder); DefineFmt(Prefix + Twine(TypeWidth), Ty, TI, Builder); StringRef ConstSuffix(TI.getTypeConstantSuffix(Ty)); Builder.defineMacro(Prefix + Twine(TypeWidth) + "_C_SUFFIX__", ConstSuffix); } static void DefineExactWidthIntTypeSize(TargetInfo::IntType Ty, const TargetInfo &TI, MacroBuilder &Builder) { int TypeWidth = TI.getTypeWidth(Ty); bool IsSigned = TI.isTypeSigned(Ty); // Use the target specified int64 type, when appropriate, so that [u]int64_t // ends up being defined in terms of the correct type. if (TypeWidth == 64) Ty = IsSigned ? TI.getInt64Type() : TI.getUInt64Type(); const char *Prefix = IsSigned ? "__INT" : "__UINT"; DefineTypeSize(Prefix + Twine(TypeWidth) + "_MAX__", Ty, TI, Builder); } static void DefineLeastWidthIntType(unsigned TypeWidth, bool IsSigned, const TargetInfo &TI, MacroBuilder &Builder) { TargetInfo::IntType Ty = TI.getLeastIntTypeByWidth(TypeWidth, IsSigned); if (Ty == TargetInfo::NoInt) return; const char *Prefix = IsSigned ? "__INT_LEAST" : "__UINT_LEAST"; DefineType(Prefix + Twine(TypeWidth) + "_TYPE__", Ty, Builder); DefineTypeSize(Prefix + Twine(TypeWidth) + "_MAX__", Ty, TI, Builder); DefineFmt(Prefix + Twine(TypeWidth), Ty, TI, Builder); } static void DefineFastIntType(unsigned TypeWidth, bool IsSigned, const TargetInfo &TI, MacroBuilder &Builder) { // stdint.h currently defines the fast int types as equivalent to the least // types. TargetInfo::IntType Ty = TI.getLeastIntTypeByWidth(TypeWidth, IsSigned); if (Ty == TargetInfo::NoInt) return; const char *Prefix = IsSigned ? "__INT_FAST" : "__UINT_FAST"; DefineType(Prefix + Twine(TypeWidth) + "_TYPE__", Ty, Builder); DefineTypeSize(Prefix + Twine(TypeWidth) + "_MAX__", Ty, TI, Builder); DefineFmt(Prefix + Twine(TypeWidth), Ty, TI, Builder); } /// Get the value the ATOMIC_*_LOCK_FREE macro should have for a type with /// the specified properties. -static const char *getLockFreeValue(unsigned TypeWidth, unsigned InlineWidth) { +static const char *getLockFreeValue(unsigned TypeWidth, unsigned TypeAlign, + unsigned InlineWidth) { // Fully-aligned, power-of-2 sizes no larger than the inline // width will be inlined as lock-free operations. - // Note: we do not need to check alignment since _Atomic(T) is always - // appropriately-aligned in clang. - if ((TypeWidth & (TypeWidth - 1)) == 0 && TypeWidth <= InlineWidth) + if (TypeWidth == TypeAlign && (TypeWidth & (TypeWidth - 1)) == 0 && + TypeWidth <= InlineWidth) return "2"; // "always lock free" // We cannot be certain what operations the lib calls might be // able to implement as lock-free on future processors. return "1"; // "sometimes lock free" } /// \brief Add definitions required for a smooth interaction between /// Objective-C++ automated reference counting and libstdc++ (4.2). static void AddObjCXXARCLibstdcxxDefines(const LangOptions &LangOpts, MacroBuilder &Builder) { Builder.defineMacro("_GLIBCXX_PREDEFINED_OBJC_ARC_IS_SCALAR"); std::string Result; { // Provide specializations for the __is_scalar type trait so that // lifetime-qualified objects are not considered "scalar" types, which // libstdc++ uses as an indicator of the presence of trivial copy, assign, // default-construct, and destruct semantics (none of which hold for // lifetime-qualified objects in ARC). llvm::raw_string_ostream Out(Result); Out << "namespace std {\n" << "\n" << "struct __true_type;\n" << "struct __false_type;\n" << "\n"; Out << "template struct __is_scalar;\n" << "\n"; if (LangOpts.ObjCAutoRefCount) { Out << "template\n" << "struct __is_scalar<__attribute__((objc_ownership(strong))) _Tp> {\n" << " enum { __value = 0 };\n" << " typedef __false_type __type;\n" << "};\n" << "\n"; } if (LangOpts.ObjCWeak) { Out << "template\n" << "struct __is_scalar<__attribute__((objc_ownership(weak))) _Tp> {\n" << " enum { __value = 0 };\n" << " typedef __false_type __type;\n" << "};\n" << "\n"; } if (LangOpts.ObjCAutoRefCount) { Out << "template\n" << "struct __is_scalar<__attribute__((objc_ownership(autoreleasing)))" << " _Tp> {\n" << " enum { __value = 0 };\n" << " typedef __false_type __type;\n" << "};\n" << "\n"; } Out << "}\n"; } Builder.append(Result); } static void InitializeStandardPredefinedMacros(const TargetInfo &TI, const LangOptions &LangOpts, const FrontendOptions &FEOpts, MacroBuilder &Builder) { if (!LangOpts.MSVCCompat && !LangOpts.TraditionalCPP) Builder.defineMacro("__STDC__"); if (LangOpts.Freestanding) Builder.defineMacro("__STDC_HOSTED__", "0"); else Builder.defineMacro("__STDC_HOSTED__"); if (!LangOpts.CPlusPlus) { if (LangOpts.C11) Builder.defineMacro("__STDC_VERSION__", "201112L"); else if (LangOpts.C99) Builder.defineMacro("__STDC_VERSION__", "199901L"); else if (!LangOpts.GNUMode && LangOpts.Digraphs) Builder.defineMacro("__STDC_VERSION__", "199409L"); } else { // FIXME: Use correct value for C++17. if (LangOpts.CPlusPlus1z) Builder.defineMacro("__cplusplus", "201406L"); // C++1y [cpp.predefined]p1: // The name __cplusplus is defined to the value 201402L when compiling a // C++ translation unit. else if (LangOpts.CPlusPlus14) Builder.defineMacro("__cplusplus", "201402L"); // C++11 [cpp.predefined]p1: // The name __cplusplus is defined to the value 201103L when compiling a // C++ translation unit. else if (LangOpts.CPlusPlus11) Builder.defineMacro("__cplusplus", "201103L"); // C++03 [cpp.predefined]p1: // The name __cplusplus is defined to the value 199711L when compiling a // C++ translation unit. else Builder.defineMacro("__cplusplus", "199711L"); // C++1z [cpp.predefined]p1: // An integer literal of type std::size_t whose value is the alignment // guaranteed by a call to operator new(std::size_t) // // We provide this in all language modes, since it seems generally useful. Builder.defineMacro("__STDCPP_DEFAULT_NEW_ALIGNMENT__", Twine(TI.getNewAlign() / TI.getCharWidth()) + TI.getTypeConstantSuffix(TI.getSizeType())); } // In C11 these are environment macros. In C++11 they are only defined // as part of . To prevent breakage when mixing C and C++ // code, define these macros unconditionally. We can define them // unconditionally, as Clang always uses UTF-16 and UTF-32 for 16-bit // and 32-bit character literals. Builder.defineMacro("__STDC_UTF_16__", "1"); Builder.defineMacro("__STDC_UTF_32__", "1"); if (LangOpts.ObjC1) Builder.defineMacro("__OBJC__"); // OpenCL v1.0/1.1 s6.9, v1.2/2.0 s6.10: Preprocessor Directives and Macros. if (LangOpts.OpenCL) { // OpenCL v1.0 and v1.1 do not have a predefined macro to indicate the // language standard with which the program is compiled. __OPENCL_VERSION__ // is for the OpenCL version supported by the OpenCL device, which is not // necessarily the language standard with which the program is compiled. // A shared OpenCL header file requires a macro to indicate the language // standard. As a workaround, __OPENCL_C_VERSION__ is defined for // OpenCL v1.0 and v1.1. switch (LangOpts.OpenCLVersion) { case 100: Builder.defineMacro("__OPENCL_C_VERSION__", "100"); break; case 110: Builder.defineMacro("__OPENCL_C_VERSION__", "110"); break; case 120: Builder.defineMacro("__OPENCL_C_VERSION__", "120"); break; case 200: Builder.defineMacro("__OPENCL_C_VERSION__", "200"); break; default: llvm_unreachable("Unsupported OpenCL version"); } Builder.defineMacro("CL_VERSION_1_0", "100"); Builder.defineMacro("CL_VERSION_1_1", "110"); Builder.defineMacro("CL_VERSION_1_2", "120"); Builder.defineMacro("CL_VERSION_2_0", "200"); if (TI.isLittleEndian()) Builder.defineMacro("__ENDIAN_LITTLE__"); if (LangOpts.FastRelaxedMath) Builder.defineMacro("__FAST_RELAXED_MATH__"); } // Not "standard" per se, but available even with the -undef flag. if (LangOpts.AsmPreprocessor) Builder.defineMacro("__ASSEMBLER__"); if (LangOpts.CUDA) Builder.defineMacro("__CUDA__"); } /// Initialize the predefined C++ language feature test macros defined in /// ISO/IEC JTC1/SC22/WG21 (C++) SD-6: "SG10 Feature Test Recommendations". static void InitializeCPlusPlusFeatureTestMacros(const LangOptions &LangOpts, MacroBuilder &Builder) { // C++98 features. if (LangOpts.RTTI) Builder.defineMacro("__cpp_rtti", "199711"); if (LangOpts.CXXExceptions) Builder.defineMacro("__cpp_exceptions", "199711"); // C++11 features. if (LangOpts.CPlusPlus11) { Builder.defineMacro("__cpp_unicode_characters", "200704"); Builder.defineMacro("__cpp_raw_strings", "200710"); Builder.defineMacro("__cpp_unicode_literals", "200710"); Builder.defineMacro("__cpp_user_defined_literals", "200809"); Builder.defineMacro("__cpp_lambdas", "200907"); Builder.defineMacro("__cpp_constexpr", LangOpts.CPlusPlus14 ? "201304" : "200704"); Builder.defineMacro("__cpp_range_based_for", LangOpts.CPlusPlus1z ? "201603" : "200907"); Builder.defineMacro("__cpp_static_assert", LangOpts.CPlusPlus1z ? "201411" : "200410"); Builder.defineMacro("__cpp_decltype", "200707"); Builder.defineMacro("__cpp_attributes", "200809"); Builder.defineMacro("__cpp_rvalue_references", "200610"); Builder.defineMacro("__cpp_variadic_templates", "200704"); Builder.defineMacro("__cpp_initializer_lists", "200806"); Builder.defineMacro("__cpp_delegating_constructors", "200604"); Builder.defineMacro("__cpp_nsdmi", "200809"); Builder.defineMacro("__cpp_inheriting_constructors", "201511"); Builder.defineMacro("__cpp_ref_qualifiers", "200710"); Builder.defineMacro("__cpp_alias_templates", "200704"); } // C++14 features. if (LangOpts.CPlusPlus14) { Builder.defineMacro("__cpp_binary_literals", "201304"); Builder.defineMacro("__cpp_digit_separators", "201309"); Builder.defineMacro("__cpp_init_captures", "201304"); Builder.defineMacro("__cpp_generic_lambdas", "201304"); Builder.defineMacro("__cpp_decltype_auto", "201304"); Builder.defineMacro("__cpp_return_type_deduction", "201304"); Builder.defineMacro("__cpp_aggregate_nsdmi", "201304"); Builder.defineMacro("__cpp_variable_templates", "201304"); } if (LangOpts.SizedDeallocation) Builder.defineMacro("__cpp_sized_deallocation", "201309"); // C++17 features. if (LangOpts.CPlusPlus1z) { Builder.defineMacro("__cpp_hex_float", "201603"); Builder.defineMacro("__cpp_inline_variables", "201606"); Builder.defineMacro("__cpp_noexcept_function_type", "201510"); Builder.defineMacro("__cpp_capture_star_this", "201603"); Builder.defineMacro("__cpp_if_constexpr", "201606"); Builder.defineMacro("__cpp_template_auto", "201606"); Builder.defineMacro("__cpp_namespace_attributes", "201411"); Builder.defineMacro("__cpp_enumerator_attributes", "201411"); Builder.defineMacro("__cpp_nested_namespace_definitions", "201411"); Builder.defineMacro("__cpp_variadic_using", "201611"); Builder.defineMacro("__cpp_aggregate_bases", "201603"); Builder.defineMacro("__cpp_structured_bindings", "201606"); Builder.defineMacro("__cpp_nontype_template_args", "201411"); Builder.defineMacro("__cpp_fold_expressions", "201603"); } if (LangOpts.AlignedAllocation) Builder.defineMacro("__cpp_aligned_new", "201606"); // TS features. if (LangOpts.ConceptsTS) Builder.defineMacro("__cpp_experimental_concepts", "1"); if (LangOpts.CoroutinesTS) Builder.defineMacro("__cpp_coroutines", "1"); } static void InitializePredefinedMacros(const TargetInfo &TI, const LangOptions &LangOpts, const FrontendOptions &FEOpts, MacroBuilder &Builder) { // Compiler version introspection macros. Builder.defineMacro("__llvm__"); // LLVM Backend Builder.defineMacro("__clang__"); // Clang Frontend #define TOSTR2(X) #X #define TOSTR(X) TOSTR2(X) Builder.defineMacro("__clang_major__", TOSTR(CLANG_VERSION_MAJOR)); Builder.defineMacro("__clang_minor__", TOSTR(CLANG_VERSION_MINOR)); Builder.defineMacro("__clang_patchlevel__", TOSTR(CLANG_VERSION_PATCHLEVEL)); #undef TOSTR #undef TOSTR2 Builder.defineMacro("__clang_version__", "\"" CLANG_VERSION_STRING " " + getClangFullRepositoryVersion() + "\""); if (!LangOpts.MSVCCompat) { // Currently claim to be compatible with GCC 4.2.1-5621, but only if we're // not compiling for MSVC compatibility Builder.defineMacro("__GNUC_MINOR__", "2"); Builder.defineMacro("__GNUC_PATCHLEVEL__", "1"); Builder.defineMacro("__GNUC__", "4"); Builder.defineMacro("__GXX_ABI_VERSION", "1002"); } // Define macros for the C11 / C++11 memory orderings Builder.defineMacro("__ATOMIC_RELAXED", "0"); Builder.defineMacro("__ATOMIC_CONSUME", "1"); Builder.defineMacro("__ATOMIC_ACQUIRE", "2"); Builder.defineMacro("__ATOMIC_RELEASE", "3"); Builder.defineMacro("__ATOMIC_ACQ_REL", "4"); Builder.defineMacro("__ATOMIC_SEQ_CST", "5"); // Support for #pragma redefine_extname (Sun compatibility) Builder.defineMacro("__PRAGMA_REDEFINE_EXTNAME", "1"); // As sad as it is, enough software depends on the __VERSION__ for version // checks that it is necessary to report 4.2.1 (the base GCC version we claim // compatibility with) first. Builder.defineMacro("__VERSION__", "\"4.2.1 Compatible " + Twine(getClangFullCPPVersion()) + "\""); // Initialize language-specific preprocessor defines. // Standard conforming mode? if (!LangOpts.GNUMode && !LangOpts.MSVCCompat) Builder.defineMacro("__STRICT_ANSI__"); if (!LangOpts.MSVCCompat && LangOpts.CPlusPlus11) Builder.defineMacro("__GXX_EXPERIMENTAL_CXX0X__"); if (LangOpts.ObjC1) { if (LangOpts.ObjCRuntime.isNonFragile()) { Builder.defineMacro("__OBJC2__"); if (LangOpts.ObjCExceptions) Builder.defineMacro("OBJC_ZEROCOST_EXCEPTIONS"); } Builder.defineMacro("__OBJC_BOOL_IS_BOOL", Twine(TI.useSignedCharForObjCBool() ? "0" : "1")); if (LangOpts.getGC() != LangOptions::NonGC) Builder.defineMacro("__OBJC_GC__"); if (LangOpts.ObjCRuntime.isNeXTFamily()) Builder.defineMacro("__NEXT_RUNTIME__"); if (LangOpts.ObjCRuntime.getKind() == ObjCRuntime::ObjFW) { VersionTuple tuple = LangOpts.ObjCRuntime.getVersion(); unsigned minor = 0; if (tuple.getMinor().hasValue()) minor = tuple.getMinor().getValue(); unsigned subminor = 0; if (tuple.getSubminor().hasValue()) subminor = tuple.getSubminor().getValue(); Builder.defineMacro("__OBJFW_RUNTIME_ABI__", Twine(tuple.getMajor() * 10000 + minor * 100 + subminor)); } Builder.defineMacro("IBOutlet", "__attribute__((iboutlet))"); Builder.defineMacro("IBOutletCollection(ClassName)", "__attribute__((iboutletcollection(ClassName)))"); Builder.defineMacro("IBAction", "void)__attribute__((ibaction)"); Builder.defineMacro("IBInspectable", ""); Builder.defineMacro("IB_DESIGNABLE", ""); } if (LangOpts.CPlusPlus) InitializeCPlusPlusFeatureTestMacros(LangOpts, Builder); // darwin_constant_cfstrings controls this. This is also dependent // on other things like the runtime I believe. This is set even for C code. if (!LangOpts.NoConstantCFStrings) Builder.defineMacro("__CONSTANT_CFSTRINGS__"); if (LangOpts.ObjC2) Builder.defineMacro("OBJC_NEW_PROPERTIES"); if (LangOpts.PascalStrings) Builder.defineMacro("__PASCAL_STRINGS__"); if (LangOpts.Blocks) { Builder.defineMacro("__block", "__attribute__((__blocks__(byref)))"); Builder.defineMacro("__BLOCKS__"); } if (!LangOpts.MSVCCompat && LangOpts.Exceptions) Builder.defineMacro("__EXCEPTIONS"); if (!LangOpts.MSVCCompat && LangOpts.RTTI) Builder.defineMacro("__GXX_RTTI"); if (LangOpts.SjLjExceptions) Builder.defineMacro("__USING_SJLJ_EXCEPTIONS__"); if (LangOpts.Deprecated) Builder.defineMacro("__DEPRECATED"); if (!LangOpts.MSVCCompat && LangOpts.CPlusPlus) { Builder.defineMacro("__GNUG__", "4"); Builder.defineMacro("__GXX_WEAK__"); Builder.defineMacro("__private_extern__", "extern"); } if (LangOpts.MicrosoftExt) { if (LangOpts.WChar) { // wchar_t supported as a keyword. Builder.defineMacro("_WCHAR_T_DEFINED"); Builder.defineMacro("_NATIVE_WCHAR_T_DEFINED"); } } if (LangOpts.Optimize) Builder.defineMacro("__OPTIMIZE__"); if (LangOpts.OptimizeSize) Builder.defineMacro("__OPTIMIZE_SIZE__"); if (LangOpts.FastMath) Builder.defineMacro("__FAST_MATH__"); // Initialize target-specific preprocessor defines. // __BYTE_ORDER__ was added in GCC 4.6. It's analogous // to the macro __BYTE_ORDER (no trailing underscores) // from glibc's header. // We don't support the PDP-11 as a target, but include // the define so it can still be compared against. Builder.defineMacro("__ORDER_LITTLE_ENDIAN__", "1234"); Builder.defineMacro("__ORDER_BIG_ENDIAN__", "4321"); Builder.defineMacro("__ORDER_PDP_ENDIAN__", "3412"); if (TI.isBigEndian()) { Builder.defineMacro("__BYTE_ORDER__", "__ORDER_BIG_ENDIAN__"); Builder.defineMacro("__BIG_ENDIAN__"); } else { Builder.defineMacro("__BYTE_ORDER__", "__ORDER_LITTLE_ENDIAN__"); Builder.defineMacro("__LITTLE_ENDIAN__"); } if (TI.getPointerWidth(0) == 64 && TI.getLongWidth() == 64 && TI.getIntWidth() == 32) { Builder.defineMacro("_LP64"); Builder.defineMacro("__LP64__"); } if (TI.getPointerWidth(0) == 32 && TI.getLongWidth() == 32 && TI.getIntWidth() == 32) { Builder.defineMacro("_ILP32"); Builder.defineMacro("__ILP32__"); } // Define type sizing macros based on the target properties. assert(TI.getCharWidth() == 8 && "Only support 8-bit char so far"); Builder.defineMacro("__CHAR_BIT__", Twine(TI.getCharWidth())); DefineTypeSize("__SCHAR_MAX__", TargetInfo::SignedChar, TI, Builder); DefineTypeSize("__SHRT_MAX__", TargetInfo::SignedShort, TI, Builder); DefineTypeSize("__INT_MAX__", TargetInfo::SignedInt, TI, Builder); DefineTypeSize("__LONG_MAX__", TargetInfo::SignedLong, TI, Builder); DefineTypeSize("__LONG_LONG_MAX__", TargetInfo::SignedLongLong, TI, Builder); DefineTypeSize("__WCHAR_MAX__", TI.getWCharType(), TI, Builder); DefineTypeSize("__INTMAX_MAX__", TI.getIntMaxType(), TI, Builder); DefineTypeSize("__SIZE_MAX__", TI.getSizeType(), TI, Builder); DefineTypeSize("__UINTMAX_MAX__", TI.getUIntMaxType(), TI, Builder); DefineTypeSize("__PTRDIFF_MAX__", TI.getPtrDiffType(0), TI, Builder); DefineTypeSize("__INTPTR_MAX__", TI.getIntPtrType(), TI, Builder); DefineTypeSize("__UINTPTR_MAX__", TI.getUIntPtrType(), TI, Builder); DefineTypeSizeof("__SIZEOF_DOUBLE__", TI.getDoubleWidth(), TI, Builder); DefineTypeSizeof("__SIZEOF_FLOAT__", TI.getFloatWidth(), TI, Builder); DefineTypeSizeof("__SIZEOF_INT__", TI.getIntWidth(), TI, Builder); DefineTypeSizeof("__SIZEOF_LONG__", TI.getLongWidth(), TI, Builder); DefineTypeSizeof("__SIZEOF_LONG_DOUBLE__",TI.getLongDoubleWidth(),TI,Builder); DefineTypeSizeof("__SIZEOF_LONG_LONG__", TI.getLongLongWidth(), TI, Builder); DefineTypeSizeof("__SIZEOF_POINTER__", TI.getPointerWidth(0), TI, Builder); DefineTypeSizeof("__SIZEOF_SHORT__", TI.getShortWidth(), TI, Builder); DefineTypeSizeof("__SIZEOF_PTRDIFF_T__", TI.getTypeWidth(TI.getPtrDiffType(0)), TI, Builder); DefineTypeSizeof("__SIZEOF_SIZE_T__", TI.getTypeWidth(TI.getSizeType()), TI, Builder); DefineTypeSizeof("__SIZEOF_WCHAR_T__", TI.getTypeWidth(TI.getWCharType()), TI, Builder); DefineTypeSizeof("__SIZEOF_WINT_T__", TI.getTypeWidth(TI.getWIntType()), TI, Builder); if (TI.hasInt128Type()) DefineTypeSizeof("__SIZEOF_INT128__", 128, TI, Builder); DefineType("__INTMAX_TYPE__", TI.getIntMaxType(), Builder); DefineFmt("__INTMAX", TI.getIntMaxType(), TI, Builder); Builder.defineMacro("__INTMAX_C_SUFFIX__", TI.getTypeConstantSuffix(TI.getIntMaxType())); DefineType("__UINTMAX_TYPE__", TI.getUIntMaxType(), Builder); DefineFmt("__UINTMAX", TI.getUIntMaxType(), TI, Builder); Builder.defineMacro("__UINTMAX_C_SUFFIX__", TI.getTypeConstantSuffix(TI.getUIntMaxType())); DefineTypeWidth("__INTMAX_WIDTH__", TI.getIntMaxType(), TI, Builder); DefineType("__PTRDIFF_TYPE__", TI.getPtrDiffType(0), Builder); DefineFmt("__PTRDIFF", TI.getPtrDiffType(0), TI, Builder); DefineTypeWidth("__PTRDIFF_WIDTH__", TI.getPtrDiffType(0), TI, Builder); DefineType("__INTPTR_TYPE__", TI.getIntPtrType(), Builder); DefineFmt("__INTPTR", TI.getIntPtrType(), TI, Builder); DefineTypeWidth("__INTPTR_WIDTH__", TI.getIntPtrType(), TI, Builder); DefineType("__SIZE_TYPE__", TI.getSizeType(), Builder); DefineFmt("__SIZE", TI.getSizeType(), TI, Builder); DefineTypeWidth("__SIZE_WIDTH__", TI.getSizeType(), TI, Builder); DefineType("__WCHAR_TYPE__", TI.getWCharType(), Builder); DefineTypeWidth("__WCHAR_WIDTH__", TI.getWCharType(), TI, Builder); DefineType("__WINT_TYPE__", TI.getWIntType(), Builder); DefineTypeWidth("__WINT_WIDTH__", TI.getWIntType(), TI, Builder); DefineTypeWidth("__SIG_ATOMIC_WIDTH__", TI.getSigAtomicType(), TI, Builder); DefineTypeSize("__SIG_ATOMIC_MAX__", TI.getSigAtomicType(), TI, Builder); DefineType("__CHAR16_TYPE__", TI.getChar16Type(), Builder); DefineType("__CHAR32_TYPE__", TI.getChar32Type(), Builder); DefineTypeWidth("__UINTMAX_WIDTH__", TI.getUIntMaxType(), TI, Builder); DefineType("__UINTPTR_TYPE__", TI.getUIntPtrType(), Builder); DefineFmt("__UINTPTR", TI.getUIntPtrType(), TI, Builder); DefineTypeWidth("__UINTPTR_WIDTH__", TI.getUIntPtrType(), TI, Builder); DefineFloatMacros(Builder, "FLT", &TI.getFloatFormat(), "F"); DefineFloatMacros(Builder, "DBL", &TI.getDoubleFormat(), ""); DefineFloatMacros(Builder, "LDBL", &TI.getLongDoubleFormat(), "L"); // Define a __POINTER_WIDTH__ macro for stdint.h. Builder.defineMacro("__POINTER_WIDTH__", Twine((int)TI.getPointerWidth(0))); // Define __BIGGEST_ALIGNMENT__ to be compatible with gcc. Builder.defineMacro("__BIGGEST_ALIGNMENT__", Twine(TI.getSuitableAlign() / TI.getCharWidth()) ); if (!LangOpts.CharIsSigned) Builder.defineMacro("__CHAR_UNSIGNED__"); if (!TargetInfo::isTypeSigned(TI.getWCharType())) Builder.defineMacro("__WCHAR_UNSIGNED__"); if (!TargetInfo::isTypeSigned(TI.getWIntType())) Builder.defineMacro("__WINT_UNSIGNED__"); // Define exact-width integer types for stdint.h DefineExactWidthIntType(TargetInfo::SignedChar, TI, Builder); if (TI.getShortWidth() > TI.getCharWidth()) DefineExactWidthIntType(TargetInfo::SignedShort, TI, Builder); if (TI.getIntWidth() > TI.getShortWidth()) DefineExactWidthIntType(TargetInfo::SignedInt, TI, Builder); if (TI.getLongWidth() > TI.getIntWidth()) DefineExactWidthIntType(TargetInfo::SignedLong, TI, Builder); if (TI.getLongLongWidth() > TI.getLongWidth()) DefineExactWidthIntType(TargetInfo::SignedLongLong, TI, Builder); DefineExactWidthIntType(TargetInfo::UnsignedChar, TI, Builder); DefineExactWidthIntTypeSize(TargetInfo::UnsignedChar, TI, Builder); DefineExactWidthIntTypeSize(TargetInfo::SignedChar, TI, Builder); if (TI.getShortWidth() > TI.getCharWidth()) { DefineExactWidthIntType(TargetInfo::UnsignedShort, TI, Builder); DefineExactWidthIntTypeSize(TargetInfo::UnsignedShort, TI, Builder); DefineExactWidthIntTypeSize(TargetInfo::SignedShort, TI, Builder); } if (TI.getIntWidth() > TI.getShortWidth()) { DefineExactWidthIntType(TargetInfo::UnsignedInt, TI, Builder); DefineExactWidthIntTypeSize(TargetInfo::UnsignedInt, TI, Builder); DefineExactWidthIntTypeSize(TargetInfo::SignedInt, TI, Builder); } if (TI.getLongWidth() > TI.getIntWidth()) { DefineExactWidthIntType(TargetInfo::UnsignedLong, TI, Builder); DefineExactWidthIntTypeSize(TargetInfo::UnsignedLong, TI, Builder); DefineExactWidthIntTypeSize(TargetInfo::SignedLong, TI, Builder); } if (TI.getLongLongWidth() > TI.getLongWidth()) { DefineExactWidthIntType(TargetInfo::UnsignedLongLong, TI, Builder); DefineExactWidthIntTypeSize(TargetInfo::UnsignedLongLong, TI, Builder); DefineExactWidthIntTypeSize(TargetInfo::SignedLongLong, TI, Builder); } DefineLeastWidthIntType(8, true, TI, Builder); DefineLeastWidthIntType(8, false, TI, Builder); DefineLeastWidthIntType(16, true, TI, Builder); DefineLeastWidthIntType(16, false, TI, Builder); DefineLeastWidthIntType(32, true, TI, Builder); DefineLeastWidthIntType(32, false, TI, Builder); DefineLeastWidthIntType(64, true, TI, Builder); DefineLeastWidthIntType(64, false, TI, Builder); DefineFastIntType(8, true, TI, Builder); DefineFastIntType(8, false, TI, Builder); DefineFastIntType(16, true, TI, Builder); DefineFastIntType(16, false, TI, Builder); DefineFastIntType(32, true, TI, Builder); DefineFastIntType(32, false, TI, Builder); DefineFastIntType(64, true, TI, Builder); DefineFastIntType(64, false, TI, Builder); char UserLabelPrefix[2] = {TI.getDataLayout().getGlobalPrefix(), 0}; Builder.defineMacro("__USER_LABEL_PREFIX__", UserLabelPrefix); if (LangOpts.FastMath || LangOpts.FiniteMathOnly) Builder.defineMacro("__FINITE_MATH_ONLY__", "1"); else Builder.defineMacro("__FINITE_MATH_ONLY__", "0"); if (!LangOpts.MSVCCompat) { if (LangOpts.GNUInline || LangOpts.CPlusPlus) Builder.defineMacro("__GNUC_GNU_INLINE__"); else Builder.defineMacro("__GNUC_STDC_INLINE__"); // The value written by __atomic_test_and_set. // FIXME: This is target-dependent. Builder.defineMacro("__GCC_ATOMIC_TEST_AND_SET_TRUEVAL", "1"); // Used by libc++ and libstdc++ to implement ATOMIC__LOCK_FREE. unsigned InlineWidthBits = TI.getMaxAtomicInlineWidth(); #define DEFINE_LOCK_FREE_MACRO(TYPE, Type) \ Builder.defineMacro("__GCC_ATOMIC_" #TYPE "_LOCK_FREE", \ getLockFreeValue(TI.get##Type##Width(), \ + TI.get##Type##Align(), \ InlineWidthBits)); DEFINE_LOCK_FREE_MACRO(BOOL, Bool); DEFINE_LOCK_FREE_MACRO(CHAR, Char); DEFINE_LOCK_FREE_MACRO(CHAR16_T, Char16); DEFINE_LOCK_FREE_MACRO(CHAR32_T, Char32); DEFINE_LOCK_FREE_MACRO(WCHAR_T, WChar); DEFINE_LOCK_FREE_MACRO(SHORT, Short); DEFINE_LOCK_FREE_MACRO(INT, Int); DEFINE_LOCK_FREE_MACRO(LONG, Long); DEFINE_LOCK_FREE_MACRO(LLONG, LongLong); Builder.defineMacro("__GCC_ATOMIC_POINTER_LOCK_FREE", getLockFreeValue(TI.getPointerWidth(0), + TI.getPointerAlign(0), InlineWidthBits)); #undef DEFINE_LOCK_FREE_MACRO } if (LangOpts.NoInlineDefine) Builder.defineMacro("__NO_INLINE__"); if (unsigned PICLevel = LangOpts.PICLevel) { Builder.defineMacro("__PIC__", Twine(PICLevel)); Builder.defineMacro("__pic__", Twine(PICLevel)); if (LangOpts.PIE) { Builder.defineMacro("__PIE__", Twine(PICLevel)); Builder.defineMacro("__pie__", Twine(PICLevel)); } } // Macros to control C99 numerics and Builder.defineMacro("__FLT_EVAL_METHOD__", Twine(TI.getFloatEvalMethod())); Builder.defineMacro("__FLT_RADIX__", "2"); Builder.defineMacro("__DECIMAL_DIG__", "__LDBL_DECIMAL_DIG__"); if (LangOpts.getStackProtector() == LangOptions::SSPOn) Builder.defineMacro("__SSP__"); else if (LangOpts.getStackProtector() == LangOptions::SSPStrong) Builder.defineMacro("__SSP_STRONG__", "2"); else if (LangOpts.getStackProtector() == LangOptions::SSPReq) Builder.defineMacro("__SSP_ALL__", "3"); // Define a macro that exists only when using the static analyzer. if (FEOpts.ProgramAction == frontend::RunAnalysis) Builder.defineMacro("__clang_analyzer__"); if (LangOpts.FastRelaxedMath) Builder.defineMacro("__FAST_RELAXED_MATH__"); if (FEOpts.ProgramAction == frontend::RewriteObjC || LangOpts.getGC() != LangOptions::NonGC) { Builder.defineMacro("__weak", "__attribute__((objc_gc(weak)))"); Builder.defineMacro("__strong", "__attribute__((objc_gc(strong)))"); Builder.defineMacro("__autoreleasing", ""); Builder.defineMacro("__unsafe_unretained", ""); } else if (LangOpts.ObjC1) { Builder.defineMacro("__weak", "__attribute__((objc_ownership(weak)))"); Builder.defineMacro("__strong", "__attribute__((objc_ownership(strong)))"); Builder.defineMacro("__autoreleasing", "__attribute__((objc_ownership(autoreleasing)))"); Builder.defineMacro("__unsafe_unretained", "__attribute__((objc_ownership(none)))"); } // On Darwin, there are __double_underscored variants of the type // nullability qualifiers. if (TI.getTriple().isOSDarwin()) { Builder.defineMacro("__nonnull", "_Nonnull"); Builder.defineMacro("__null_unspecified", "_Null_unspecified"); Builder.defineMacro("__nullable", "_Nullable"); } // OpenMP definition // OpenMP 2.2: // In implementations that support a preprocessor, the _OPENMP // macro name is defined to have the decimal value yyyymm where // yyyy and mm are the year and the month designations of the // version of the OpenMP API that the implementation support. switch (LangOpts.OpenMP) { case 0: break; case 40: Builder.defineMacro("_OPENMP", "201307"); break; case 45: Builder.defineMacro("_OPENMP", "201511"); break; default: // Default version is OpenMP 3.1 Builder.defineMacro("_OPENMP", "201107"); break; } // CUDA device path compilaton if (LangOpts.CUDAIsDevice) { // The CUDA_ARCH value is set for the GPU target specified in the NVPTX // backend's target defines. Builder.defineMacro("__CUDA_ARCH__"); } // We need to communicate this to our CUDA header wrapper, which in turn // informs the proper CUDA headers of this choice. if (LangOpts.CUDADeviceApproxTranscendentals || LangOpts.FastMath) { Builder.defineMacro("__CLANG_CUDA_APPROX_TRANSCENDENTALS__"); } // OpenCL definitions. if (LangOpts.OpenCL) { #define OPENCLEXT(Ext) \ if (TI.getSupportedOpenCLOpts().isSupported(#Ext, \ LangOpts.OpenCLVersion)) \ Builder.defineMacro(#Ext); #include "clang/Basic/OpenCLExtensions.def" } if (TI.hasInt128Type() && LangOpts.CPlusPlus && LangOpts.GNUMode) { // For each extended integer type, g++ defines a macro mapping the // index of the type (0 in this case) in some list of extended types // to the type. Builder.defineMacro("__GLIBCXX_TYPE_INT_N_0", "__int128"); Builder.defineMacro("__GLIBCXX_BITSIZE_INT_N_0", "128"); } // Get other target #defines. TI.getTargetDefines(LangOpts, Builder); } /// InitializePreprocessor - Initialize the preprocessor getting it and the /// environment ready to process a single file. This returns true on error. /// void clang::InitializePreprocessor( Preprocessor &PP, const PreprocessorOptions &InitOpts, const PCHContainerReader &PCHContainerRdr, const FrontendOptions &FEOpts) { const LangOptions &LangOpts = PP.getLangOpts(); std::string PredefineBuffer; PredefineBuffer.reserve(4080); llvm::raw_string_ostream Predefines(PredefineBuffer); MacroBuilder Builder(Predefines); // Emit line markers for various builtin sections of the file. We don't do // this in asm preprocessor mode, because "# 4" is not a line marker directive // in this mode. if (!PP.getLangOpts().AsmPreprocessor) Builder.append("# 1 \"\" 3"); // Install things like __POWERPC__, __GNUC__, etc into the macro table. if (InitOpts.UsePredefines) { if (LangOpts.CUDA && PP.getAuxTargetInfo()) InitializePredefinedMacros(*PP.getAuxTargetInfo(), LangOpts, FEOpts, Builder); InitializePredefinedMacros(PP.getTargetInfo(), LangOpts, FEOpts, Builder); // Install definitions to make Objective-C++ ARC work well with various // C++ Standard Library implementations. if (LangOpts.ObjC1 && LangOpts.CPlusPlus && (LangOpts.ObjCAutoRefCount || LangOpts.ObjCWeak)) { switch (InitOpts.ObjCXXARCStandardLibrary) { case ARCXX_nolib: case ARCXX_libcxx: break; case ARCXX_libstdcxx: AddObjCXXARCLibstdcxxDefines(LangOpts, Builder); break; } } } // Even with predefines off, some macros are still predefined. // These should all be defined in the preprocessor according to the // current language configuration. InitializeStandardPredefinedMacros(PP.getTargetInfo(), PP.getLangOpts(), FEOpts, Builder); // Add on the predefines from the driver. Wrap in a #line directive to report // that they come from the command line. if (!PP.getLangOpts().AsmPreprocessor) Builder.append("# 1 \"\" 1"); // Process #define's and #undef's in the order they are given. for (unsigned i = 0, e = InitOpts.Macros.size(); i != e; ++i) { if (InitOpts.Macros[i].second) // isUndef Builder.undefineMacro(InitOpts.Macros[i].first); else DefineBuiltinMacro(Builder, InitOpts.Macros[i].first, PP.getDiagnostics()); } // Exit the command line and go back to (2 is LC_LEAVE). if (!PP.getLangOpts().AsmPreprocessor) Builder.append("# 1 \"\" 2"); // If -imacros are specified, include them now. These are processed before // any -include directives. for (unsigned i = 0, e = InitOpts.MacroIncludes.size(); i != e; ++i) AddImplicitIncludeMacros(Builder, InitOpts.MacroIncludes[i]); // Process -include-pch/-include-pth directives. if (!InitOpts.ImplicitPCHInclude.empty()) AddImplicitIncludePCH(Builder, PP, PCHContainerRdr, InitOpts.ImplicitPCHInclude); if (!InitOpts.ImplicitPTHInclude.empty()) AddImplicitIncludePTH(Builder, PP, InitOpts.ImplicitPTHInclude); // Process -include directives. for (unsigned i = 0, e = InitOpts.Includes.size(); i != e; ++i) { const std::string &Path = InitOpts.Includes[i]; AddImplicitInclude(Builder, Path); } // Instruct the preprocessor to skip the preamble. PP.setSkipMainFilePreamble(InitOpts.PrecompiledPreambleBytes.first, InitOpts.PrecompiledPreambleBytes.second); // Copy PredefinedBuffer into the Preprocessor. PP.setPredefines(Predefines.str()); } Index: vendor/clang/dist/lib/StaticAnalyzer/Checkers/VirtualCallChecker.cpp =================================================================== --- vendor/clang/dist/lib/StaticAnalyzer/Checkers/VirtualCallChecker.cpp (revision 314259) +++ vendor/clang/dist/lib/StaticAnalyzer/Checkers/VirtualCallChecker.cpp (revision 314260) @@ -1,291 +1,292 @@ //=======- VirtualCallChecker.cpp --------------------------------*- C++ -*-==// // // The LLVM Compiler Infrastructure // // This file is distributed under the University of Illinois Open Source // License. See LICENSE.TXT for details. // //===----------------------------------------------------------------------===// // // This file defines a checker that checks virtual function calls during // construction or destruction of C++ objects. // //===----------------------------------------------------------------------===// #include "ClangSACheckers.h" #include "clang/AST/DeclCXX.h" #include "clang/AST/StmtVisitor.h" #include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h" #include "clang/StaticAnalyzer/Core/Checker.h" #include "clang/StaticAnalyzer/Core/PathSensitive/AnalysisManager.h" #include "llvm/ADT/SmallString.h" #include "llvm/Support/SaveAndRestore.h" #include "llvm/Support/raw_ostream.h" using namespace clang; using namespace ento; namespace { class WalkAST : public StmtVisitor { const CheckerBase *Checker; BugReporter &BR; AnalysisDeclContext *AC; /// The root constructor or destructor whose callees are being analyzed. const CXXMethodDecl *RootMethod = nullptr; /// Whether the checker should walk into bodies of called functions. /// Controlled by the "Interprocedural" analyzer-config option. bool IsInterprocedural = false; /// Whether the checker should only warn for calls to pure virtual functions /// (which is undefined behavior) or for all virtual functions (which may /// may result in unexpected behavior). bool ReportPureOnly = false; typedef const CallExpr * WorkListUnit; typedef SmallVector DFSWorkList; /// A vector representing the worklist which has a chain of CallExprs. DFSWorkList WList; // PreVisited : A CallExpr to this FunctionDecl is in the worklist, but the // body has not been visited yet. // PostVisited : A CallExpr to this FunctionDecl is in the worklist, and the // body has been visited. enum Kind { NotVisited, PreVisited, /**< A CallExpr to this FunctionDecl is in the worklist, but the body has not yet been visited. */ PostVisited /**< A CallExpr to this FunctionDecl is in the worklist, and the body has been visited. */ }; /// A DenseMap that records visited states of FunctionDecls. llvm::DenseMap VisitedFunctions; /// The CallExpr whose body is currently being visited. This is used for /// generating bug reports. This is null while visiting the body of a /// constructor or destructor. const CallExpr *visitingCallExpr; public: WalkAST(const CheckerBase *checker, BugReporter &br, AnalysisDeclContext *ac, const CXXMethodDecl *rootMethod, bool isInterprocedural, bool reportPureOnly) : Checker(checker), BR(br), AC(ac), RootMethod(rootMethod), IsInterprocedural(isInterprocedural), ReportPureOnly(reportPureOnly), visitingCallExpr(nullptr) { // Walking should always start from either a constructor or a destructor. assert(isa(rootMethod) || isa(rootMethod)); } bool hasWork() const { return !WList.empty(); } /// This method adds a CallExpr to the worklist and marks the callee as /// being PreVisited. void Enqueue(WorkListUnit WLUnit) { const FunctionDecl *FD = WLUnit->getDirectCallee(); if (!FD || !FD->getBody()) return; Kind &K = VisitedFunctions[FD]; if (K != NotVisited) return; K = PreVisited; WList.push_back(WLUnit); } /// This method returns an item from the worklist without removing it. WorkListUnit Dequeue() { assert(!WList.empty()); return WList.back(); } void Execute() { while (hasWork()) { WorkListUnit WLUnit = Dequeue(); const FunctionDecl *FD = WLUnit->getDirectCallee(); assert(FD && FD->getBody()); if (VisitedFunctions[FD] == PreVisited) { // If the callee is PreVisited, walk its body. // Visit the body. SaveAndRestore SaveCall(visitingCallExpr, WLUnit); Visit(FD->getBody()); // Mark the function as being PostVisited to indicate we have // scanned the body. VisitedFunctions[FD] = PostVisited; continue; } // Otherwise, the callee is PostVisited. // Remove it from the worklist. assert(VisitedFunctions[FD] == PostVisited); WList.pop_back(); } } // Stmt visitor methods. void VisitCallExpr(CallExpr *CE); void VisitCXXMemberCallExpr(CallExpr *CE); void VisitStmt(Stmt *S) { VisitChildren(S); } void VisitChildren(Stmt *S); void ReportVirtualCall(const CallExpr *CE, bool isPure); }; } // end anonymous namespace //===----------------------------------------------------------------------===// // AST walking. //===----------------------------------------------------------------------===// void WalkAST::VisitChildren(Stmt *S) { for (Stmt *Child : S->children()) if (Child) Visit(Child); } void WalkAST::VisitCallExpr(CallExpr *CE) { VisitChildren(CE); if (IsInterprocedural) Enqueue(CE); } void WalkAST::VisitCXXMemberCallExpr(CallExpr *CE) { VisitChildren(CE); bool callIsNonVirtual = false; // Several situations to elide for checking. if (MemberExpr *CME = dyn_cast(CE->getCallee())) { // If the member access is fully qualified (i.e., X::F), then treat // this as a non-virtual call and do not warn. if (CME->getQualifier()) callIsNonVirtual = true; if (Expr *base = CME->getBase()->IgnoreImpCasts()) { // Elide analyzing the call entirely if the base pointer is not 'this'. if (!isa(base)) return; // If the most derived class is marked final, we know that now subclass // can override this member. if (base->getBestDynamicClassType()->hasAttr()) callIsNonVirtual = true; } } // Get the callee. - const CXXMethodDecl *MD = dyn_cast(CE->getDirectCallee()); + const CXXMethodDecl *MD = + dyn_cast_or_null(CE->getDirectCallee()); if (MD && MD->isVirtual() && !callIsNonVirtual && !MD->hasAttr() && !MD->getParent()->hasAttr()) ReportVirtualCall(CE, MD->isPure()); if (IsInterprocedural) Enqueue(CE); } void WalkAST::ReportVirtualCall(const CallExpr *CE, bool isPure) { if (ReportPureOnly && !isPure) return; SmallString<100> buf; llvm::raw_svector_ostream os(buf); // FIXME: The interprocedural diagnostic experience here is not good. // Ultimately this checker should be re-written to be path sensitive. // For now, only diagnose intraprocedurally, by default. if (IsInterprocedural) { os << "Call Path : "; // Name of current visiting CallExpr. os << *CE->getDirectCallee(); // Name of the CallExpr whose body is current being walked. if (visitingCallExpr) os << " <-- " << *visitingCallExpr->getDirectCallee(); // Names of FunctionDecls in worklist with state PostVisited. for (SmallVectorImpl::iterator I = WList.end(), E = WList.begin(); I != E; --I) { const FunctionDecl *FD = (*(I-1))->getDirectCallee(); assert(FD); if (VisitedFunctions[FD] == PostVisited) os << " <-- " << *FD; } os << "\n"; } PathDiagnosticLocation CELoc = PathDiagnosticLocation::createBegin(CE, BR.getSourceManager(), AC); SourceRange R = CE->getCallee()->getSourceRange(); os << "Call to "; if (isPure) os << "pure "; os << "virtual function during "; if (isa(RootMethod)) os << "construction "; else os << "destruction "; if (isPure) os << "has undefined behavior"; else os << "will not dispatch to derived class"; BR.EmitBasicReport(AC->getDecl(), Checker, "Call to virtual function during construction or " "destruction", "C++ Object Lifecycle", os.str(), CELoc, R); } //===----------------------------------------------------------------------===// // VirtualCallChecker //===----------------------------------------------------------------------===// namespace { class VirtualCallChecker : public Checker > { public: DefaultBool isInterprocedural; DefaultBool isPureOnly; void checkASTDecl(const CXXRecordDecl *RD, AnalysisManager& mgr, BugReporter &BR) const { AnalysisDeclContext *ADC = mgr.getAnalysisDeclContext(RD); // Check the constructors. for (const auto *I : RD->ctors()) { if (!I->isCopyOrMoveConstructor()) if (Stmt *Body = I->getBody()) { WalkAST walker(this, BR, ADC, I, isInterprocedural, isPureOnly); walker.Visit(Body); walker.Execute(); } } // Check the destructor. if (CXXDestructorDecl *DD = RD->getDestructor()) if (Stmt *Body = DD->getBody()) { WalkAST walker(this, BR, ADC, DD, isInterprocedural, isPureOnly); walker.Visit(Body); walker.Execute(); } } }; } void ento::registerVirtualCallChecker(CheckerManager &mgr) { VirtualCallChecker *checker = mgr.registerChecker(); checker->isInterprocedural = mgr.getAnalyzerOptions().getBooleanOption("Interprocedural", false, checker); checker->isPureOnly = mgr.getAnalyzerOptions().getBooleanOption("PureOnly", false, checker); } Index: vendor/clang/dist/test/Analysis/virtualcall.cpp =================================================================== --- vendor/clang/dist/test/Analysis/virtualcall.cpp (revision 314259) +++ vendor/clang/dist/test/Analysis/virtualcall.cpp (revision 314260) @@ -1,130 +1,141 @@ // RUN: %clang_cc1 -analyze -analyzer-checker=optin.cplusplus.VirtualCall -analyzer-store region -verify -std=c++11 %s // RUN: %clang_cc1 -analyze -analyzer-checker=optin.cplusplus.VirtualCall -analyzer-store region -analyzer-config optin.cplusplus.VirtualCall:Interprocedural=true -DINTERPROCEDURAL=1 -verify -std=c++11 %s // RUN: %clang_cc1 -analyze -analyzer-checker=optin.cplusplus.VirtualCall -analyzer-store region -analyzer-config optin.cplusplus.VirtualCall:PureOnly=true -DPUREONLY=1 -verify -std=c++11 %s /* When INTERPROCEDURAL is set, we expect diagnostics in all functions reachable from a constructor or destructor. If it is not set, we expect diagnostics only in the constructor or destructor. When PUREONLY is set, we expect diagnostics only for calls to pure virtual functions not to non-pure virtual functions. */ class A { public: A(); A(int i); ~A() {}; virtual int foo() = 0; // from Sema: expected-note {{'foo' declared here}} virtual void bar() = 0; void f() { foo(); #if INTERPROCEDURAL // expected-warning-re@-2 {{{{^}}Call Path : foo <-- fCall to pure virtual function during construction has undefined behavior}} #endif } }; class B : public A { public: B() { foo(); #if !PUREONLY #if INTERPROCEDURAL // expected-warning-re@-3 {{{{^}}Call Path : fooCall to virtual function during construction will not dispatch to derived class}} #else // expected-warning-re@-5 {{{{^}}Call to virtual function during construction will not dispatch to derived class}} #endif #endif } ~B(); virtual int foo(); virtual void bar() { foo(); } #if INTERPROCEDURAL // expected-warning-re@-2 {{{{^}}Call Path : foo <-- barCall to virtual function during destruction will not dispatch to derived class}} #endif }; A::A() { f(); } A::A(int i) { foo(); // From Sema: expected-warning {{call to pure virtual member function 'foo' has undefined behavior}} #if INTERPROCEDURAL // expected-warning-re@-2 {{{{^}}Call Path : fooCall to pure virtual function during construction has undefined behavior}} #else // expected-warning-re@-4 {{{{^}}Call to pure virtual function during construction has undefined behavior}} #endif } B::~B() { this->B::foo(); // no-warning this->B::bar(); this->foo(); #if !PUREONLY #if INTERPROCEDURAL // expected-warning-re@-3 {{{{^}}Call Path : fooCall to virtual function during destruction will not dispatch to derived class}} #else // expected-warning-re@-5 {{{{^}}Call to virtual function during destruction will not dispatch to derived class}} #endif #endif } class C : public B { public: C(); ~C(); virtual int foo(); void f(int i); }; C::C() { f(foo()); #if !PUREONLY #if INTERPROCEDURAL // expected-warning-re@-3 {{{{^}}Call Path : fooCall to virtual function during construction will not dispatch to derived class}} #else // expected-warning-re@-5 {{{{^}}Call to virtual function during construction will not dispatch to derived class}} #endif #endif } class D : public B { public: D() { foo(); // no-warning } ~D() { bar(); } int foo() final; void bar() final { foo(); } // no-warning }; class E final : public B { public: E() { foo(); // no-warning } ~E() { bar(); } int foo() override; }; +// Regression test: don't crash when there's no direct callee. +class F { +public: + F() { + void (F::* ptr)() = &F::foo; + (this->*ptr)(); + } + void foo(); +}; + int main() { A *a; B *b; C *c; D *d; E *e; + F *f; } #include "virtualcall.h" #define AS_SYSTEM #include "virtualcall.h" #undef AS_SYSTEM Index: vendor/clang/dist/test/OpenMP/cancellation_point_codegen.cpp =================================================================== --- vendor/clang/dist/test/OpenMP/cancellation_point_codegen.cpp (revision 314259) +++ vendor/clang/dist/test/OpenMP/cancellation_point_codegen.cpp (revision 314260) @@ -1,171 +1,186 @@ // RUN: %clang_cc1 -verify -fopenmp -triple x86_64-apple-darwin13.4.0 -emit-llvm -o - %s | FileCheck %s // RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-apple-darwin13.4.0 -emit-pch -o %t %s // RUN: %clang_cc1 -fopenmp -std=c++11 -include-pch %t -fsyntax-only -verify %s -triple x86_64-apple-darwin13.4.0 -emit-llvm -o - | FileCheck %s // expected-no-diagnostics #ifndef HEADER #define HEADER int main (int argc, char **argv) { // CHECK: [[GTID:%.+]] = call i32 @__kmpc_global_thread_num( #pragma omp parallel { #pragma omp cancellation point parallel #pragma omp cancel parallel argv[0][0] = argc; } // CHECK: call void (%ident_t*, i32, void (i32*, i32*, ...)*, ...) @__kmpc_fork_call( #pragma omp sections { { #pragma omp cancellation point sections #pragma omp cancel sections } } // CHECK: call void @__kmpc_for_static_init_4( // CHECK: [[RES:%.+]] = call i32 @__kmpc_cancellationpoint(%ident_t* {{[^,]+}}, i32 [[GTID]], i32 3) // CHECK: [[CMP:%.+]] = icmp ne i32 [[RES]], 0 // CHECK: br i1 [[CMP]], label %[[EXIT:[^,].+]], label %[[CONTINUE:.+]] // CHECK: [[EXIT]] // CHECK: br label // CHECK: [[CONTINUE]] // CHECK: br label // CHECK: call void @__kmpc_for_static_fini( // CHECK: call void @__kmpc_barrier(%ident_t* #pragma omp sections { #pragma omp cancellation point sections #pragma omp section { #pragma omp cancellation point sections #pragma omp cancel sections } } // CHECK: call void @__kmpc_for_static_init_4( // CHECK: [[RES:%.+]] = call i32 @__kmpc_cancellationpoint(%ident_t* {{[^,]+}}, i32 [[GTID]], i32 3) // CHECK: [[CMP:%.+]] = icmp ne i32 [[RES]], 0 // CHECK: br i1 [[CMP]], label %[[EXIT:[^,].+]], label %[[CONTINUE:.+]] // CHECK: [[EXIT]] // CHECK: br label // CHECK: [[CONTINUE]] // CHECK: br label // CHECK: [[RES:%.+]] = call i32 @__kmpc_cancellationpoint(%ident_t* {{[^,]+}}, i32 [[GTID]], i32 3) // CHECK: [[CMP:%.+]] = icmp ne i32 [[RES]], 0 // CHECK: br i1 [[CMP]], label %[[EXIT:[^,].+]], label %[[CONTINUE:.+]] // CHECK: [[EXIT]] // CHECK: br label // CHECK: [[CONTINUE]] // CHECK: br label // CHECK: call void @__kmpc_for_static_fini( #pragma omp for for (int i = 0; i < argc; ++i) { #pragma omp cancellation point for #pragma omp cancel for } // CHECK: call void @__kmpc_for_static_init_4( // CHECK: [[RES:%.+]] = call i32 @__kmpc_cancellationpoint(%ident_t* {{[^,]+}}, i32 [[GTID]], i32 2) // CHECK: [[CMP:%.+]] = icmp ne i32 [[RES]], 0 // CHECK: br i1 [[CMP]], label %[[EXIT:[^,].+]], label %[[CONTINUE:.+]] // CHECK: [[EXIT]] // CHECK: br label // CHECK: [[CONTINUE]] // CHECK: br label // CHECK: call void @__kmpc_for_static_fini( // CHECK: call void @__kmpc_barrier(%ident_t* #pragma omp task { #pragma omp cancellation point taskgroup #pragma omp cancel taskgroup } // CHECK: call i8* @__kmpc_omp_task_alloc( // CHECK: call i32 @__kmpc_omp_task( +#pragma omp task +{ +#pragma omp cancellation point taskgroup +} +// CHECK: call i8* @__kmpc_omp_task_alloc( +// CHECK: call i32 @__kmpc_omp_task( #pragma omp parallel sections { { #pragma omp cancellation point sections #pragma omp cancel sections } } // CHECK: call void (%ident_t*, i32, void (i32*, i32*, ...)*, ...) @__kmpc_fork_call( #pragma omp parallel sections { { #pragma omp cancellation point sections #pragma omp cancel sections } #pragma omp section { #pragma omp cancellation point sections } } // CHECK: call void (%ident_t*, i32, void (i32*, i32*, ...)*, ...) @__kmpc_fork_call( #pragma omp parallel for for (int i = 0; i < argc; ++i) { #pragma omp cancellation point for #pragma omp cancel for } // CHECK: call void (%ident_t*, i32, void (i32*, i32*, ...)*, ...) @__kmpc_fork_call( return argc; } // CHECK: define internal void @{{[^(]+}}(i32* {{[^,]+}}, i32* {{[^,]+}}, // CHECK: [[RES:%.+]] = call i32 @__kmpc_cancellationpoint(%ident_t* {{[^,]+}}, i32 {{[^,]+}}, i32 1) // CHECK: [[CMP:%.+]] = icmp ne i32 [[RES]], 0 // CHECK: br i1 [[CMP]], label %[[EXIT:[^,]+]], // CHECK: [[EXIT]] // CHECK: br label %[[RETURN:.+]] // CHECK: [[RETURN]] // CHECK: ret void + +// CHECK: define internal i32 @{{[^(]+}}(i32 +// CHECK: [[RES:%.+]] = call i32 @__kmpc_cancellationpoint(%ident_t* {{[^,]+}}, i32 {{[^,]+}}, i32 4) +// CHECK: [[CMP:%.+]] = icmp ne i32 [[RES]], 0 +// CHECK: br i1 [[CMP]], label %[[EXIT:[^,]+]], +// CHECK: [[EXIT]] +// CHECK: br label %[[RETURN:.+]] +// CHECK: [[RETURN]] +// CHECK: ret i32 0 // CHECK: define internal i32 @{{[^(]+}}(i32 // CHECK: [[RES:%.+]] = call i32 @__kmpc_cancellationpoint(%ident_t* {{[^,]+}}, i32 {{[^,]+}}, i32 4) // CHECK: [[CMP:%.+]] = icmp ne i32 [[RES]], 0 // CHECK: br i1 [[CMP]], label %[[EXIT:[^,]+]], // CHECK: [[EXIT]] // CHECK: br label %[[RETURN:.+]] // CHECK: [[RETURN]] // CHECK: ret i32 0 // CHECK: define internal void @{{[^(]+}}(i32* {{[^,]+}}, i32* {{[^,]+}}) // CHECK: call void @__kmpc_for_static_init_4( // CHECK: [[RES:%.+]] = call i32 @__kmpc_cancellationpoint(%ident_t* {{[^,]+}}, i32 [[GTID:%.+]], i32 3) // CHECK: [[CMP:%.+]] = icmp ne i32 [[RES]], 0 // CHECK: br i1 [[CMP]], label %[[EXIT:[^,].+]], label %[[CONTINUE:.+]] // CHECK: [[EXIT]] // CHECK: br label // CHECK: [[CONTINUE]] // CHECK: br label // CHECK: call void @__kmpc_for_static_fini( // CHECK: ret void // CHECK: define internal void @{{[^(]+}}(i32* {{[^,]+}}, i32* {{[^,]+}}) // CHECK: call void @__kmpc_for_static_init_4( // CHECK: [[RES:%.+]] = call i32 @__kmpc_cancellationpoint(%ident_t* {{[^,]+}}, i32 [[GTID:%.+]], i32 3) // CHECK: [[CMP:%.+]] = icmp ne i32 [[RES]], 0 // CHECK: br i1 [[CMP]], label %[[EXIT:[^,].+]], label %[[CONTINUE:.+]] // CHECK: [[EXIT]] // CHECK: br label // CHECK: [[CONTINUE]] // CHECK: br label // CHECK: [[RES:%.+]] = call i32 @__kmpc_cancellationpoint(%ident_t* {{[^,]+}}, i32 [[GTID]], i32 3) // CHECK: [[CMP:%.+]] = icmp ne i32 [[RES]], 0 // CHECK: br i1 [[CMP]], label %[[EXIT:[^,].+]], label %[[CONTINUE:.+]] // CHECK: [[EXIT]] // CHECK: br label // CHECK: [[CONTINUE]] // CHECK: br label // CHECK: call void @__kmpc_for_static_fini( // CHECK: ret void // CHECK: define internal void @{{[^(]+}}(i32* {{[^,]+}}, i32* {{[^,]+}}, // CHECK: call void @__kmpc_for_static_init_4( // CHECK: [[RES:%.+]] = call i32 @__kmpc_cancellationpoint(%ident_t* {{[^,]+}}, i32 [[GTID:%.+]], i32 2) // CHECK: [[CMP:%.+]] = icmp ne i32 [[RES]], 0 // CHECK: br i1 [[CMP]], label %[[EXIT:[^,].+]], label %[[CONTINUE:.+]] // CHECK: [[EXIT]] // CHECK: br label // CHECK: [[CONTINUE]] // CHECK: br label // CHECK: call void @__kmpc_for_static_fini( // CHECK: ret void #endif Index: vendor/clang/dist/test/Sema/atomic-ops.c =================================================================== --- vendor/clang/dist/test/Sema/atomic-ops.c (revision 314259) +++ vendor/clang/dist/test/Sema/atomic-ops.c (revision 314260) @@ -1,507 +1,511 @@ // RUN: %clang_cc1 %s -verify -ffreestanding -fsyntax-only -triple=i686-linux-gnu -std=c11 // Basic parsing/Sema tests for __c11_atomic_* #include struct S { char c[3]; }; _Static_assert(__GCC_ATOMIC_BOOL_LOCK_FREE == 2, ""); _Static_assert(__GCC_ATOMIC_CHAR_LOCK_FREE == 2, ""); _Static_assert(__GCC_ATOMIC_CHAR16_T_LOCK_FREE == 2, ""); _Static_assert(__GCC_ATOMIC_CHAR32_T_LOCK_FREE == 2, ""); _Static_assert(__GCC_ATOMIC_WCHAR_T_LOCK_FREE == 2, ""); _Static_assert(__GCC_ATOMIC_SHORT_LOCK_FREE == 2, ""); _Static_assert(__GCC_ATOMIC_INT_LOCK_FREE == 2, ""); _Static_assert(__GCC_ATOMIC_LONG_LOCK_FREE == 2, ""); +#ifdef __i386__ +_Static_assert(__GCC_ATOMIC_LLONG_LOCK_FREE == 1, ""); +#else _Static_assert(__GCC_ATOMIC_LLONG_LOCK_FREE == 2, ""); +#endif _Static_assert(__GCC_ATOMIC_POINTER_LOCK_FREE == 2, ""); _Static_assert(__c11_atomic_is_lock_free(1), ""); _Static_assert(__c11_atomic_is_lock_free(2), ""); _Static_assert(__c11_atomic_is_lock_free(3), ""); // expected-error {{not an integral constant expression}} _Static_assert(__c11_atomic_is_lock_free(4), ""); _Static_assert(__c11_atomic_is_lock_free(8), ""); _Static_assert(__c11_atomic_is_lock_free(16), ""); // expected-error {{not an integral constant expression}} _Static_assert(__c11_atomic_is_lock_free(17), ""); // expected-error {{not an integral constant expression}} _Static_assert(__atomic_is_lock_free(1, 0), ""); _Static_assert(__atomic_is_lock_free(2, 0), ""); _Static_assert(__atomic_is_lock_free(3, 0), ""); // expected-error {{not an integral constant expression}} _Static_assert(__atomic_is_lock_free(4, 0), ""); _Static_assert(__atomic_is_lock_free(8, 0), ""); _Static_assert(__atomic_is_lock_free(16, 0), ""); // expected-error {{not an integral constant expression}} _Static_assert(__atomic_is_lock_free(17, 0), ""); // expected-error {{not an integral constant expression}} _Static_assert(atomic_is_lock_free((atomic_char*)0), ""); _Static_assert(atomic_is_lock_free((atomic_short*)0), ""); _Static_assert(atomic_is_lock_free((atomic_int*)0), ""); _Static_assert(atomic_is_lock_free((atomic_long*)0), ""); // expected-error@+1 {{__int128 is not supported on this target}} _Static_assert(atomic_is_lock_free((_Atomic(__int128)*)0), ""); // expected-error {{not an integral constant expression}} _Static_assert(atomic_is_lock_free(0 + (atomic_char*)0), ""); char i8; short i16; int i32; int __attribute__((vector_size(8))) i64; struct Incomplete *incomplete; // expected-note {{forward declaration of 'struct Incomplete'}} _Static_assert(__atomic_is_lock_free(1, &i8), ""); _Static_assert(__atomic_is_lock_free(1, &i64), ""); _Static_assert(__atomic_is_lock_free(2, &i8), ""); // expected-error {{not an integral constant expression}} _Static_assert(__atomic_is_lock_free(2, &i16), ""); _Static_assert(__atomic_is_lock_free(2, &i64), ""); _Static_assert(__atomic_is_lock_free(4, &i16), ""); // expected-error {{not an integral constant expression}} _Static_assert(__atomic_is_lock_free(4, &i32), ""); _Static_assert(__atomic_is_lock_free(4, &i64), ""); _Static_assert(__atomic_is_lock_free(8, &i32), ""); // expected-error {{not an integral constant expression}} _Static_assert(__atomic_is_lock_free(8, &i64), ""); _Static_assert(__atomic_always_lock_free(1, 0), ""); _Static_assert(__atomic_always_lock_free(2, 0), ""); _Static_assert(!__atomic_always_lock_free(3, 0), ""); _Static_assert(__atomic_always_lock_free(4, 0), ""); _Static_assert(__atomic_always_lock_free(8, 0), ""); _Static_assert(!__atomic_always_lock_free(16, 0), ""); _Static_assert(!__atomic_always_lock_free(17, 0), ""); _Static_assert(__atomic_always_lock_free(1, incomplete), ""); _Static_assert(!__atomic_always_lock_free(2, incomplete), ""); _Static_assert(!__atomic_always_lock_free(4, incomplete), ""); _Static_assert(__atomic_always_lock_free(1, &i8), ""); _Static_assert(__atomic_always_lock_free(1, &i64), ""); _Static_assert(!__atomic_always_lock_free(2, &i8), ""); _Static_assert(__atomic_always_lock_free(2, &i16), ""); _Static_assert(__atomic_always_lock_free(2, &i64), ""); _Static_assert(!__atomic_always_lock_free(4, &i16), ""); _Static_assert(__atomic_always_lock_free(4, &i32), ""); _Static_assert(__atomic_always_lock_free(4, &i64), ""); _Static_assert(!__atomic_always_lock_free(8, &i32), ""); _Static_assert(__atomic_always_lock_free(8, &i64), ""); #define _AS1 __attribute__((address_space(1))) #define _AS2 __attribute__((address_space(2))) void f(_Atomic(int) *i, const _Atomic(int) *ci, _Atomic(int*) *p, _Atomic(float) *d, int *I, const int *CI, int **P, float *D, struct S *s1, struct S *s2) { __c11_atomic_init(I, 5); // expected-error {{pointer to _Atomic}} __c11_atomic_init(ci, 5); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const _Atomic(int) *' invalid)}} __c11_atomic_load(0); // expected-error {{too few arguments to function}} __c11_atomic_load(0,0,0); // expected-error {{too many arguments to function}} __c11_atomic_store(0,0,0); // expected-error {{address argument to atomic builtin must be a pointer}} __c11_atomic_store((int*)0,0,0); // expected-error {{address argument to atomic operation must be a pointer to _Atomic}} __c11_atomic_store(i, 0, memory_order_relaxed); __c11_atomic_store(ci, 0, memory_order_relaxed); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const _Atomic(int) *' invalid)}} __c11_atomic_load(i, memory_order_seq_cst); __c11_atomic_load(p, memory_order_seq_cst); __c11_atomic_load(d, memory_order_seq_cst); __c11_atomic_load(ci, memory_order_seq_cst); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const _Atomic(int) *' invalid)}} int load_n_1 = __atomic_load_n(I, memory_order_relaxed); int *load_n_2 = __atomic_load_n(P, memory_order_relaxed); float load_n_3 = __atomic_load_n(D, memory_order_relaxed); // expected-error {{must be a pointer to integer or pointer}} __atomic_load_n(s1, memory_order_relaxed); // expected-error {{must be a pointer to integer or pointer}} load_n_1 = __atomic_load_n(CI, memory_order_relaxed); __atomic_load(i, I, memory_order_relaxed); // expected-error {{must be a pointer to a trivially-copyable type}} __atomic_load(CI, I, memory_order_relaxed); __atomic_load(I, i, memory_order_relaxed); // expected-warning {{passing '_Atomic(int) *' to parameter of type 'int *'}} __atomic_load(I, *P, memory_order_relaxed); __atomic_load(I, *P, memory_order_relaxed, 42); // expected-error {{too many arguments}} (int)__atomic_load(I, I, memory_order_seq_cst); // expected-error {{operand of type 'void'}} __atomic_load(s1, s2, memory_order_acquire); __atomic_load(CI, I, memory_order_relaxed); __atomic_load(I, CI, memory_order_relaxed); // expected-warning {{passing 'const int *' to parameter of type 'int *' discards qualifiers}} __atomic_load(CI, CI, memory_order_relaxed); // expected-warning {{passing 'const int *' to parameter of type 'int *' discards qualifiers}} __c11_atomic_store(i, 1, memory_order_seq_cst); __c11_atomic_store(p, 1, memory_order_seq_cst); // expected-warning {{incompatible integer to pointer conversion}} (int)__c11_atomic_store(d, 1, memory_order_seq_cst); // expected-error {{operand of type 'void'}} __atomic_store_n(I, 4, memory_order_release); __atomic_store_n(I, 4.0, memory_order_release); __atomic_store_n(CI, 4, memory_order_release); // expected-error {{address argument to atomic operation must be a pointer to non-const type ('const int *' invalid)}} __atomic_store_n(I, P, memory_order_release); // expected-warning {{parameter of type 'int'}} __atomic_store_n(i, 1, memory_order_release); // expected-error {{must be a pointer to integer or pointer}} __atomic_store_n(s1, *s2, memory_order_release); // expected-error {{must be a pointer to integer or pointer}} __atomic_store_n(I, I, memory_order_release); // expected-warning {{incompatible pointer to integer conversion passing 'int *' to parameter of type 'int'; dereference with *}} __atomic_store(I, *P, memory_order_release); __atomic_store(CI, I, memory_order_release); // expected-error {{address argument to atomic operation must be a pointer to non-const type ('const int *' invalid)}} __atomic_store(s1, s2, memory_order_release); __atomic_store(i, I, memory_order_release); // expected-error {{trivially-copyable}} int exchange_1 = __c11_atomic_exchange(i, 1, memory_order_seq_cst); int exchange_2 = __c11_atomic_exchange(I, 1, memory_order_seq_cst); // expected-error {{must be a pointer to _Atomic}} int exchange_3 = __atomic_exchange_n(i, 1, memory_order_seq_cst); // expected-error {{must be a pointer to integer or pointer}} int exchange_4 = __atomic_exchange_n(I, 1, memory_order_seq_cst); __atomic_exchange(s1, s2, s2, memory_order_seq_cst); __atomic_exchange(s1, I, P, memory_order_seq_cst); // expected-warning 2{{parameter of type 'struct S *'}} (int)__atomic_exchange(s1, s2, s2, memory_order_seq_cst); // expected-error {{operand of type 'void'}} __atomic_exchange(I, I, I, memory_order_seq_cst); __atomic_exchange(CI, I, I, memory_order_seq_cst); // expected-error {{address argument to atomic operation must be a pointer to non-const type ('const int *' invalid)}} __atomic_exchange(I, I, CI, memory_order_seq_cst); // expected-warning {{passing 'const int *' to parameter of type 'int *' discards qualifiers}} __c11_atomic_fetch_add(i, 1, memory_order_seq_cst); __c11_atomic_fetch_add(p, 1, memory_order_seq_cst); __c11_atomic_fetch_add(d, 1, memory_order_seq_cst); // expected-error {{must be a pointer to atomic integer or pointer}} __atomic_fetch_add(i, 3, memory_order_seq_cst); // expected-error {{pointer to integer or pointer}} __atomic_fetch_sub(I, 3, memory_order_seq_cst); __atomic_fetch_sub(P, 3, memory_order_seq_cst); __atomic_fetch_sub(D, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer or pointer}} __atomic_fetch_sub(s1, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer or pointer}} __c11_atomic_fetch_and(i, 1, memory_order_seq_cst); __c11_atomic_fetch_and(p, 1, memory_order_seq_cst); // expected-error {{must be a pointer to atomic integer}} __c11_atomic_fetch_and(d, 1, memory_order_seq_cst); // expected-error {{must be a pointer to atomic integer}} __atomic_fetch_and(i, 3, memory_order_seq_cst); // expected-error {{pointer to integer}} __atomic_fetch_or(I, 3, memory_order_seq_cst); __atomic_fetch_xor(P, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer}} __atomic_fetch_or(D, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer}} __atomic_fetch_and(s1, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer}} _Bool cmpexch_1 = __c11_atomic_compare_exchange_strong(i, I, 1, memory_order_seq_cst, memory_order_seq_cst); _Bool cmpexch_2 = __c11_atomic_compare_exchange_strong(p, P, (int*)1, memory_order_seq_cst, memory_order_seq_cst); _Bool cmpexch_3 = __c11_atomic_compare_exchange_strong(d, I, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{incompatible pointer types}} (void)__c11_atomic_compare_exchange_strong(i, CI, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{passing 'const int *' to parameter of type 'int *' discards qualifiers}} _Bool cmpexchw_1 = __c11_atomic_compare_exchange_weak(i, I, 1, memory_order_seq_cst, memory_order_seq_cst); _Bool cmpexchw_2 = __c11_atomic_compare_exchange_weak(p, P, (int*)1, memory_order_seq_cst, memory_order_seq_cst); _Bool cmpexchw_3 = __c11_atomic_compare_exchange_weak(d, I, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{incompatible pointer types}} (void)__c11_atomic_compare_exchange_weak(i, CI, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{passing 'const int *' to parameter of type 'int *' discards qualifiers}} _Bool cmpexch_4 = __atomic_compare_exchange_n(I, I, 5, 1, memory_order_seq_cst, memory_order_seq_cst); _Bool cmpexch_5 = __atomic_compare_exchange_n(I, P, 5, 0, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{; dereference with *}} _Bool cmpexch_6 = __atomic_compare_exchange_n(I, I, P, 0, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{passing 'int **' to parameter of type 'int'}} (void)__atomic_compare_exchange_n(CI, I, 5, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-error {{address argument to atomic operation must be a pointer to non-const type ('const int *' invalid)}} (void)__atomic_compare_exchange_n(I, CI, 5, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{passing 'const int *' to parameter of type 'int *' discards qualifiers}} _Bool cmpexch_7 = __atomic_compare_exchange(I, I, 5, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{passing 'int' to parameter of type 'int *'}} _Bool cmpexch_8 = __atomic_compare_exchange(I, P, I, 0, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{; dereference with *}} _Bool cmpexch_9 = __atomic_compare_exchange(I, I, I, 0, memory_order_seq_cst, memory_order_seq_cst); (void)__atomic_compare_exchange(CI, I, I, 0, memory_order_seq_cst, memory_order_seq_cst); // expected-error {{address argument to atomic operation must be a pointer to non-const type ('const int *' invalid)}} (void)__atomic_compare_exchange(I, CI, I, 0, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{passing 'const int *' to parameter of type 'int *' discards qualifiers}} // Pointers to different address spaces are allowed. _Bool cmpexch_10 = __c11_atomic_compare_exchange_strong((_Atomic int _AS1 *)0x308, (int _AS2 *)0x309, 1, memory_order_seq_cst, memory_order_seq_cst); const volatile int flag_k = 0; volatile int flag = 0; (void)(int)__atomic_test_and_set(&flag_k, memory_order_seq_cst); // expected-warning {{passing 'const volatile int *' to parameter of type 'volatile void *'}} (void)(int)__atomic_test_and_set(&flag, memory_order_seq_cst); __atomic_clear(&flag_k, memory_order_seq_cst); // expected-warning {{passing 'const volatile int *' to parameter of type 'volatile void *'}} __atomic_clear(&flag, memory_order_seq_cst); (int)__atomic_clear(&flag, memory_order_seq_cst); // expected-error {{operand of type 'void'}} __c11_atomic_init(ci, 0); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const _Atomic(int) *' invalid)}} __c11_atomic_store(ci, 0, memory_order_release); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const _Atomic(int) *' invalid)}} __c11_atomic_load(ci, memory_order_acquire); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const _Atomic(int) *' invalid)}} // Ensure the macros behave appropriately. atomic_int n = ATOMIC_VAR_INIT(123); atomic_init(&n, 456); atomic_init(&n, (void*)0); // expected-warning {{passing 'void *' to parameter of type 'int'}} const atomic_wchar_t cawt; atomic_init(&cawt, L'x'); // expected-error {{non-const}} atomic_wchar_t awt; atomic_init(&awt, L'x'); int x = kill_dependency(12); atomic_thread_fence(); // expected-error {{too few arguments to function call}} atomic_thread_fence(memory_order_seq_cst); atomic_signal_fence(memory_order_seq_cst); void (*pfn)(memory_order) = &atomic_thread_fence; pfn = &atomic_signal_fence; int k = atomic_load_explicit(&n, memory_order_relaxed); atomic_store_explicit(&n, k, memory_order_relaxed); atomic_store(&n, atomic_load(&n)); k = atomic_exchange(&n, 72); k = atomic_exchange_explicit(&n, k, memory_order_release); atomic_compare_exchange_strong(&n, k, k); // expected-warning {{take the address with &}} atomic_compare_exchange_weak(&n, &k, k); atomic_compare_exchange_strong_explicit(&n, &k, k, memory_order_seq_cst); // expected-error {{too few arguments}} atomic_compare_exchange_weak_explicit(&n, &k, k, memory_order_seq_cst, memory_order_acquire); atomic_fetch_add(&k, n); // expected-error {{must be a pointer to _Atomic}} k = atomic_fetch_add(&n, k); k = atomic_fetch_sub(&n, k); k = atomic_fetch_and(&n, k); k = atomic_fetch_or(&n, k); k = atomic_fetch_xor(&n, k); k = atomic_fetch_add_explicit(&n, k, memory_order_acquire); k = atomic_fetch_sub_explicit(&n, k, memory_order_release); k = atomic_fetch_and_explicit(&n, k, memory_order_acq_rel); k = atomic_fetch_or_explicit(&n, k, memory_order_consume); k = atomic_fetch_xor_explicit(&n, k, memory_order_relaxed); // C11 7.17.1/4: atomic_flag is a structure type. struct atomic_flag must_be_struct = ATOMIC_FLAG_INIT; // C11 7.17.8/5 implies that it is also a typedef type. atomic_flag guard = ATOMIC_FLAG_INIT; _Bool old_val = atomic_flag_test_and_set(&guard); if (old_val) atomic_flag_clear(&guard); old_val = (atomic_flag_test_and_set)(&guard); if (old_val) (atomic_flag_clear)(&guard); const atomic_flag const_guard; atomic_flag_test_and_set(&const_guard); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const atomic_bool *' (aka 'const _Atomic(_Bool) *') invalid)}} atomic_flag_clear(&const_guard); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const atomic_bool *' (aka 'const _Atomic(_Bool) *') invalid)}} } _Atomic(int*) PR12527_a; void PR12527() { int *b = PR12527_a; } void PR16931(int* x) { // expected-note {{passing argument to parameter 'x' here}} typedef struct { _Atomic(_Bool) flag; } flag; flag flagvar = { 0 }; PR16931(&flagvar); // expected-warning {{incompatible pointer types}} } void memory_checks(_Atomic(int) *Ap, int *p, int val) { (void)__c11_atomic_load(Ap, memory_order_relaxed); (void)__c11_atomic_load(Ap, memory_order_acquire); (void)__c11_atomic_load(Ap, memory_order_consume); (void)__c11_atomic_load(Ap, memory_order_release); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__c11_atomic_load(Ap, memory_order_acq_rel); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__c11_atomic_load(Ap, memory_order_seq_cst); (void)__c11_atomic_load(Ap, val); (void)__c11_atomic_load(Ap, -1); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__c11_atomic_load(Ap, 42); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__c11_atomic_store(Ap, val, memory_order_relaxed); (void)__c11_atomic_store(Ap, val, memory_order_acquire); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__c11_atomic_store(Ap, val, memory_order_consume); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__c11_atomic_store(Ap, val, memory_order_release); (void)__c11_atomic_store(Ap, val, memory_order_acq_rel); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__c11_atomic_store(Ap, val, memory_order_seq_cst); (void)__c11_atomic_fetch_add(Ap, 1, memory_order_relaxed); (void)__c11_atomic_fetch_add(Ap, 1, memory_order_acquire); (void)__c11_atomic_fetch_add(Ap, 1, memory_order_consume); (void)__c11_atomic_fetch_add(Ap, 1, memory_order_release); (void)__c11_atomic_fetch_add(Ap, 1, memory_order_acq_rel); (void)__c11_atomic_fetch_add(Ap, 1, memory_order_seq_cst); (void)__c11_atomic_fetch_add( (struct Incomplete * _Atomic *)0, // expected-error {{incomplete type 'struct Incomplete'}} 1, memory_order_seq_cst); (void)__c11_atomic_init(Ap, val); (void)__c11_atomic_init(Ap, val); (void)__c11_atomic_init(Ap, val); (void)__c11_atomic_init(Ap, val); (void)__c11_atomic_init(Ap, val); (void)__c11_atomic_init(Ap, val); (void)__c11_atomic_fetch_sub(Ap, val, memory_order_relaxed); (void)__c11_atomic_fetch_sub(Ap, val, memory_order_acquire); (void)__c11_atomic_fetch_sub(Ap, val, memory_order_consume); (void)__c11_atomic_fetch_sub(Ap, val, memory_order_release); (void)__c11_atomic_fetch_sub(Ap, val, memory_order_acq_rel); (void)__c11_atomic_fetch_sub(Ap, val, memory_order_seq_cst); (void)__c11_atomic_fetch_and(Ap, val, memory_order_relaxed); (void)__c11_atomic_fetch_and(Ap, val, memory_order_acquire); (void)__c11_atomic_fetch_and(Ap, val, memory_order_consume); (void)__c11_atomic_fetch_and(Ap, val, memory_order_release); (void)__c11_atomic_fetch_and(Ap, val, memory_order_acq_rel); (void)__c11_atomic_fetch_and(Ap, val, memory_order_seq_cst); (void)__c11_atomic_fetch_or(Ap, val, memory_order_relaxed); (void)__c11_atomic_fetch_or(Ap, val, memory_order_acquire); (void)__c11_atomic_fetch_or(Ap, val, memory_order_consume); (void)__c11_atomic_fetch_or(Ap, val, memory_order_release); (void)__c11_atomic_fetch_or(Ap, val, memory_order_acq_rel); (void)__c11_atomic_fetch_or(Ap, val, memory_order_seq_cst); (void)__c11_atomic_fetch_xor(Ap, val, memory_order_relaxed); (void)__c11_atomic_fetch_xor(Ap, val, memory_order_acquire); (void)__c11_atomic_fetch_xor(Ap, val, memory_order_consume); (void)__c11_atomic_fetch_xor(Ap, val, memory_order_release); (void)__c11_atomic_fetch_xor(Ap, val, memory_order_acq_rel); (void)__c11_atomic_fetch_xor(Ap, val, memory_order_seq_cst); (void)__c11_atomic_exchange(Ap, val, memory_order_relaxed); (void)__c11_atomic_exchange(Ap, val, memory_order_acquire); (void)__c11_atomic_exchange(Ap, val, memory_order_consume); (void)__c11_atomic_exchange(Ap, val, memory_order_release); (void)__c11_atomic_exchange(Ap, val, memory_order_acq_rel); (void)__c11_atomic_exchange(Ap, val, memory_order_seq_cst); (void)__c11_atomic_compare_exchange_strong(Ap, p, val, memory_order_relaxed, memory_order_relaxed); (void)__c11_atomic_compare_exchange_strong(Ap, p, val, memory_order_acquire, memory_order_relaxed); (void)__c11_atomic_compare_exchange_strong(Ap, p, val, memory_order_consume, memory_order_relaxed); (void)__c11_atomic_compare_exchange_strong(Ap, p, val, memory_order_release, memory_order_relaxed); (void)__c11_atomic_compare_exchange_strong(Ap, p, val, memory_order_acq_rel, memory_order_relaxed); (void)__c11_atomic_compare_exchange_strong(Ap, p, val, memory_order_seq_cst, memory_order_relaxed); (void)__c11_atomic_compare_exchange_weak(Ap, p, val, memory_order_relaxed, memory_order_relaxed); (void)__c11_atomic_compare_exchange_weak(Ap, p, val, memory_order_acquire, memory_order_relaxed); (void)__c11_atomic_compare_exchange_weak(Ap, p, val, memory_order_consume, memory_order_relaxed); (void)__c11_atomic_compare_exchange_weak(Ap, p, val, memory_order_release, memory_order_relaxed); (void)__c11_atomic_compare_exchange_weak(Ap, p, val, memory_order_acq_rel, memory_order_relaxed); (void)__c11_atomic_compare_exchange_weak(Ap, p, val, memory_order_seq_cst, memory_order_relaxed); (void)__atomic_load_n(p, memory_order_relaxed); (void)__atomic_load_n(p, memory_order_acquire); (void)__atomic_load_n(p, memory_order_consume); (void)__atomic_load_n(p, memory_order_release); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__atomic_load_n(p, memory_order_acq_rel); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__atomic_load_n(p, memory_order_seq_cst); (void)__atomic_load(p, p, memory_order_relaxed); (void)__atomic_load(p, p, memory_order_acquire); (void)__atomic_load(p, p, memory_order_consume); (void)__atomic_load(p, p, memory_order_release); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__atomic_load(p, p, memory_order_acq_rel); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__atomic_load(p, p, memory_order_seq_cst); (void)__atomic_store(p, p, memory_order_relaxed); (void)__atomic_store(p, p, memory_order_acquire); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__atomic_store(p, p, memory_order_consume); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__atomic_store(p, p, memory_order_release); (void)__atomic_store(p, p, memory_order_acq_rel); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__atomic_store(p, p, memory_order_seq_cst); (void)__atomic_store_n(p, val, memory_order_relaxed); (void)__atomic_store_n(p, val, memory_order_acquire); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__atomic_store_n(p, val, memory_order_consume); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__atomic_store_n(p, val, memory_order_release); (void)__atomic_store_n(p, val, memory_order_acq_rel); // expected-warning {{memory order argument to atomic operation is invalid}} (void)__atomic_store_n(p, val, memory_order_seq_cst); (void)__atomic_fetch_add(p, val, memory_order_relaxed); (void)__atomic_fetch_add(p, val, memory_order_acquire); (void)__atomic_fetch_add(p, val, memory_order_consume); (void)__atomic_fetch_add(p, val, memory_order_release); (void)__atomic_fetch_add(p, val, memory_order_acq_rel); (void)__atomic_fetch_add(p, val, memory_order_seq_cst); (void)__atomic_fetch_sub(p, val, memory_order_relaxed); (void)__atomic_fetch_sub(p, val, memory_order_acquire); (void)__atomic_fetch_sub(p, val, memory_order_consume); (void)__atomic_fetch_sub(p, val, memory_order_release); (void)__atomic_fetch_sub(p, val, memory_order_acq_rel); (void)__atomic_fetch_sub(p, val, memory_order_seq_cst); (void)__atomic_add_fetch(p, val, memory_order_relaxed); (void)__atomic_add_fetch(p, val, memory_order_acquire); (void)__atomic_add_fetch(p, val, memory_order_consume); (void)__atomic_add_fetch(p, val, memory_order_release); (void)__atomic_add_fetch(p, val, memory_order_acq_rel); (void)__atomic_add_fetch(p, val, memory_order_seq_cst); (void)__atomic_sub_fetch(p, val, memory_order_relaxed); (void)__atomic_sub_fetch(p, val, memory_order_acquire); (void)__atomic_sub_fetch(p, val, memory_order_consume); (void)__atomic_sub_fetch(p, val, memory_order_release); (void)__atomic_sub_fetch(p, val, memory_order_acq_rel); (void)__atomic_sub_fetch(p, val, memory_order_seq_cst); (void)__atomic_fetch_and(p, val, memory_order_relaxed); (void)__atomic_fetch_and(p, val, memory_order_acquire); (void)__atomic_fetch_and(p, val, memory_order_consume); (void)__atomic_fetch_and(p, val, memory_order_release); (void)__atomic_fetch_and(p, val, memory_order_acq_rel); (void)__atomic_fetch_and(p, val, memory_order_seq_cst); (void)__atomic_fetch_or(p, val, memory_order_relaxed); (void)__atomic_fetch_or(p, val, memory_order_acquire); (void)__atomic_fetch_or(p, val, memory_order_consume); (void)__atomic_fetch_or(p, val, memory_order_release); (void)__atomic_fetch_or(p, val, memory_order_acq_rel); (void)__atomic_fetch_or(p, val, memory_order_seq_cst); (void)__atomic_fetch_xor(p, val, memory_order_relaxed); (void)__atomic_fetch_xor(p, val, memory_order_acquire); (void)__atomic_fetch_xor(p, val, memory_order_consume); (void)__atomic_fetch_xor(p, val, memory_order_release); (void)__atomic_fetch_xor(p, val, memory_order_acq_rel); (void)__atomic_fetch_xor(p, val, memory_order_seq_cst); (void)__atomic_fetch_nand(p, val, memory_order_relaxed); (void)__atomic_fetch_nand(p, val, memory_order_acquire); (void)__atomic_fetch_nand(p, val, memory_order_consume); (void)__atomic_fetch_nand(p, val, memory_order_release); (void)__atomic_fetch_nand(p, val, memory_order_acq_rel); (void)__atomic_fetch_nand(p, val, memory_order_seq_cst); (void)__atomic_and_fetch(p, val, memory_order_relaxed); (void)__atomic_and_fetch(p, val, memory_order_acquire); (void)__atomic_and_fetch(p, val, memory_order_consume); (void)__atomic_and_fetch(p, val, memory_order_release); (void)__atomic_and_fetch(p, val, memory_order_acq_rel); (void)__atomic_and_fetch(p, val, memory_order_seq_cst); (void)__atomic_or_fetch(p, val, memory_order_relaxed); (void)__atomic_or_fetch(p, val, memory_order_acquire); (void)__atomic_or_fetch(p, val, memory_order_consume); (void)__atomic_or_fetch(p, val, memory_order_release); (void)__atomic_or_fetch(p, val, memory_order_acq_rel); (void)__atomic_or_fetch(p, val, memory_order_seq_cst); (void)__atomic_xor_fetch(p, val, memory_order_relaxed); (void)__atomic_xor_fetch(p, val, memory_order_acquire); (void)__atomic_xor_fetch(p, val, memory_order_consume); (void)__atomic_xor_fetch(p, val, memory_order_release); (void)__atomic_xor_fetch(p, val, memory_order_acq_rel); (void)__atomic_xor_fetch(p, val, memory_order_seq_cst); (void)__atomic_nand_fetch(p, val, memory_order_relaxed); (void)__atomic_nand_fetch(p, val, memory_order_acquire); (void)__atomic_nand_fetch(p, val, memory_order_consume); (void)__atomic_nand_fetch(p, val, memory_order_release); (void)__atomic_nand_fetch(p, val, memory_order_acq_rel); (void)__atomic_nand_fetch(p, val, memory_order_seq_cst); (void)__atomic_exchange_n(p, val, memory_order_relaxed); (void)__atomic_exchange_n(p, val, memory_order_acquire); (void)__atomic_exchange_n(p, val, memory_order_consume); (void)__atomic_exchange_n(p, val, memory_order_release); (void)__atomic_exchange_n(p, val, memory_order_acq_rel); (void)__atomic_exchange_n(p, val, memory_order_seq_cst); (void)__atomic_exchange(p, p, p, memory_order_relaxed); (void)__atomic_exchange(p, p, p, memory_order_acquire); (void)__atomic_exchange(p, p, p, memory_order_consume); (void)__atomic_exchange(p, p, p, memory_order_release); (void)__atomic_exchange(p, p, p, memory_order_acq_rel); (void)__atomic_exchange(p, p, p, memory_order_seq_cst); (void)__atomic_compare_exchange(p, p, p, 0, memory_order_relaxed, memory_order_relaxed); (void)__atomic_compare_exchange(p, p, p, 0, memory_order_acquire, memory_order_relaxed); (void)__atomic_compare_exchange(p, p, p, 0, memory_order_consume, memory_order_relaxed); (void)__atomic_compare_exchange(p, p, p, 0, memory_order_release, memory_order_relaxed); (void)__atomic_compare_exchange(p, p, p, 0, memory_order_acq_rel, memory_order_relaxed); (void)__atomic_compare_exchange(p, p, p, 0, memory_order_seq_cst, memory_order_relaxed); (void)__atomic_compare_exchange_n(p, p, val, 0, memory_order_relaxed, memory_order_relaxed); (void)__atomic_compare_exchange_n(p, p, val, 0, memory_order_acquire, memory_order_relaxed); (void)__atomic_compare_exchange_n(p, p, val, 0, memory_order_consume, memory_order_relaxed); (void)__atomic_compare_exchange_n(p, p, val, 0, memory_order_release, memory_order_relaxed); (void)__atomic_compare_exchange_n(p, p, val, 0, memory_order_acq_rel, memory_order_relaxed); (void)__atomic_compare_exchange_n(p, p, val, 0, memory_order_seq_cst, memory_order_relaxed); } void nullPointerWarning(_Atomic(int) *Ap, int *p, int val) { // The 'expected' pointer shouldn't be NULL. (void)__c11_atomic_compare_exchange_strong(Ap, NULL, val, memory_order_relaxed, memory_order_relaxed); // expected-warning {{null passed to a callee that requires a non-null argument}} (void)atomic_compare_exchange_weak(Ap, ((void*)0), val); // expected-warning {{null passed to a callee that requires a non-null argument}} (void)__atomic_compare_exchange_n(p, NULL, val, 0, memory_order_relaxed, memory_order_relaxed); // expected-warning {{null passed to a callee that requires a non-null argument}} }