Skip to content

[C++][Modules] A module directive may only appear as the first preprocessing tokens in a file #144233

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

yronglin
Copy link
Contributor

This PR is 2nd part of P1857R3 implementation, and mainly implement the restriction A module directive may only appear as the first preprocessing tokens in a file (excluding the global module fragment.):
cpp.pre:

module-file:
    pp-global-module-fragment[opt] pp-module group[opt] pp-private-module-fragment[opt]

We also refine tests use split-file instead of conditional macro.

…first preprocessing tokens in a file

Signed-off-by: yronglin <[email protected]>
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules labels Jun 14, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 14, 2025

@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-clang

Author: None (yronglin)

Changes

This PR is 2nd part of P1857R3 implementation, and mainly implement the restriction A module directive may only appear as the first preprocessing tokens in a file (excluding the global module fragment.):
cpp.pre:

module-file:
    pp-global-module-fragment[opt] pp-module group[opt] pp-private-module-fragment[opt]

We also refine tests use split-file instead of conditional macro.


Patch is 64.63 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/144233.diff

25 Files Affected:

  • (modified) clang/include/clang/Lex/Lexer.h (+3)
  • (modified) clang/include/clang/Lex/Preprocessor.h (+5)
  • (modified) clang/include/clang/Lex/Token.h (+9-3)
  • (modified) clang/include/clang/Sema/Sema.h (+2-1)
  • (modified) clang/lib/Lex/Lexer.cpp (+10)
  • (modified) clang/lib/Parse/Parser.cpp (+4-3)
  • (modified) clang/lib/Sema/SemaModule.cpp (+8-6)
  • (modified) clang/test/CXX/basic/basic.link/p1.cpp (+107-36)
  • (modified) clang/test/CXX/basic/basic.link/p2.cpp (+13-13)
  • (modified) clang/test/CXX/basic/basic.scope/basic.scope.namespace/p2.cpp (+56-26)
  • (modified) clang/test/CXX/module/basic/basic.def.odr/p6.cppm (+134-40)
  • (modified) clang/test/CXX/module/basic/basic.link/module-declaration.cpp (+35-29)
  • (modified) clang/test/CXX/module/dcl.dcl/dcl.module/dcl.module.import/p1.cppm (+23-9)
  • (modified) clang/test/CXX/module/dcl.dcl/dcl.module/dcl.module.interface/p1.cppm (+18-21)
  • (modified) clang/test/CXX/module/dcl.dcl/dcl.module/p1.cpp (+30-14)
  • (modified) clang/test/CXX/module/dcl.dcl/dcl.module/p5.cpp (+48-17)
  • (modified) clang/test/CXX/module/module.interface/p2.cpp (+12-14)
  • (modified) clang/test/CXX/module/module.unit/p8.cpp (+28-20)
  • (modified) clang/test/Driver/modules.cpp (+18-13)
  • (modified) clang/test/Modules/named-modules-adl-3.cppm (+1)
  • (modified) clang/test/Modules/reserved-names-1.cppm (+10)
  • (modified) clang/test/Modules/reserved-names-system-header-1.cpp (+1)
  • (modified) clang/test/Modules/reserved-names-system-header-2.cpp (+1)
  • (modified) clang/test/SemaCXX/modules.cppm (+46-29)
  • (modified) clang/test/SemaCXX/type-aware-new-delete-transparent-contexts.cpp (+15-5)
diff --git a/clang/include/clang/Lex/Lexer.h b/clang/include/clang/Lex/Lexer.h
index bb65ae010cffa..ca812ba1583fb 100644
--- a/clang/include/clang/Lex/Lexer.h
+++ b/clang/include/clang/Lex/Lexer.h
@@ -143,6 +143,9 @@ class Lexer : public PreprocessorLexer {
   /// True if this is the first time we're lexing the input file.
   bool IsFirstTimeLexingFile;
 
+  /// True if current lexing token is the first pp-token.
+  bool IsFirstPPToken;
+
   // NewLinePtr - A pointer to new line character '\n' being lexed. For '\r\n',
   // it also points to '\n.'
   const char *NewLinePtr;
diff --git a/clang/include/clang/Lex/Preprocessor.h b/clang/include/clang/Lex/Preprocessor.h
index 78be2bd64d61c..436ac783928f4 100644
--- a/clang/include/clang/Lex/Preprocessor.h
+++ b/clang/include/clang/Lex/Preprocessor.h
@@ -350,6 +350,9 @@ class Preprocessor {
   /// Whether the last token we lexed was an '@'.
   bool LastTokenWasAt = false;
 
+  /// First pp-token in current translation unit.
+  Token FirstPPToken;
+
   /// A position within a C++20 import-seq.
   class StdCXXImportSeq {
   public:
@@ -1766,6 +1769,8 @@ class Preprocessor {
   std::optional<LexEmbedParametersResult> LexEmbedParameters(Token &Current,
                                                              bool ForHasEmbed);
 
+  void setFirstPPToken(const Token &Tok) { FirstPPToken = Tok; }
+  Token getFirstPPToken() const { return FirstPPToken; }
   bool LexAfterModuleImport(Token &Result);
   void CollectPpImportSuffix(SmallVectorImpl<Token> &Toks);
 
diff --git a/clang/include/clang/Lex/Token.h b/clang/include/clang/Lex/Token.h
index 4f29fb7d11415..d4dfd7b44d9af 100644
--- a/clang/include/clang/Lex/Token.h
+++ b/clang/include/clang/Lex/Token.h
@@ -86,9 +86,12 @@ class Token {
                                 // macro stringizing or charizing operator.
     CommaAfterElided = 0x200, // The comma following this token was elided (MS).
     IsEditorPlaceholder = 0x400, // This identifier is a placeholder.
-    IsReinjected = 0x800, // A phase 4 token that was produced before and
-                          // re-added, e.g. via EnterTokenStream. Annotation
-                          // tokens are *not* reinjected.
+
+    IsReinjected = 0x800,  // A phase 4 token that was produced before and
+                           // re-added, e.g. via EnterTokenStream. Annotation
+                           // tokens are *not* reinjected.
+    FirstPPToken = 0x1000, // This token is the first pp token in the
+                           // translation unit.
   };
 
   tok::TokenKind getKind() const { return Kind; }
@@ -318,6 +321,9 @@ class Token {
   /// represented as characters between '<#' and '#>' in the source code. The
   /// lexer uses identifier tokens to represent placeholders.
   bool isEditorPlaceholder() const { return getFlag(IsEditorPlaceholder); }
+
+  /// Returns true if this token is the first pp-token.
+  bool isFirstPPToken() const { return getFlag(FirstPPToken); }
 };
 
 /// Information about the conditional stack (\#if directives)
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 29452bb37260d..9397546c8fc5d 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -9822,7 +9822,8 @@ class Sema final : public SemaBase {
   DeclGroupPtrTy ActOnModuleDecl(SourceLocation StartLoc,
                                  SourceLocation ModuleLoc, ModuleDeclKind MDK,
                                  ModuleIdPath Path, ModuleIdPath Partition,
-                                 ModuleImportState &ImportState);
+                                 ModuleImportState &ImportState,
+                                 bool IntroducerIsFirstPPToken);
 
   /// The parser has processed a global-module-fragment declaration that begins
   /// the definition of the global module fragment of the current module unit.
diff --git a/clang/lib/Lex/Lexer.cpp b/clang/lib/Lex/Lexer.cpp
index 93200458f04b4..06a74fa759794 100644
--- a/clang/lib/Lex/Lexer.cpp
+++ b/clang/lib/Lex/Lexer.cpp
@@ -174,6 +174,8 @@ void Lexer::InitLexer(const char *BufStart, const char *BufPtr,
   ExtendedTokenMode = 0;
 
   NewLinePtr = nullptr;
+
+  IsFirstPPToken = true;
 }
 
 /// Lexer constructor - Create a new lexer object for the specified buffer
@@ -3725,6 +3727,11 @@ bool Lexer::Lex(Token &Result) {
     HasLeadingEmptyMacro = false;
   }
 
+  if (IsFirstPPToken) {
+    Result.setFlag(Token::FirstPPToken);
+    IsFirstPPToken = false;
+  }
+
   bool atPhysicalStartOfLine = IsAtPhysicalStartOfLine;
   IsAtPhysicalStartOfLine = false;
   bool isRawLex = isLexingRawMode();
@@ -3732,6 +3739,9 @@ bool Lexer::Lex(Token &Result) {
   bool returnedToken = LexTokenInternal(Result, atPhysicalStartOfLine);
   // (After the LexTokenInternal call, the lexer might be destroyed.)
   assert((returnedToken || !isRawLex) && "Raw lex must succeed");
+
+  if (returnedToken && Result.isFirstPPToken() && PP)
+    PP->setFirstPPToken(Result);
   return returnedToken;
 }
 
diff --git a/clang/lib/Parse/Parser.cpp b/clang/lib/Parse/Parser.cpp
index 788ed79e0c1fa..18f399aca59e8 100644
--- a/clang/lib/Parse/Parser.cpp
+++ b/clang/lib/Parse/Parser.cpp
@@ -2340,7 +2340,8 @@ void Parser::ParseMicrosoftIfExistsExternalDeclaration() {
 
 Parser::DeclGroupPtrTy
 Parser::ParseModuleDecl(Sema::ModuleImportState &ImportState) {
-  SourceLocation StartLoc = Tok.getLocation();
+  Token Introducer = Tok;
+  SourceLocation StartLoc = Introducer.getLocation();
 
   Sema::ModuleDeclKind MDK = TryConsumeToken(tok::kw_export)
                                  ? Sema::ModuleDeclKind::Interface
@@ -2359,7 +2360,7 @@ Parser::ParseModuleDecl(Sema::ModuleImportState &ImportState) {
   // Parse a global-module-fragment, if present.
   if (getLangOpts().CPlusPlusModules && Tok.is(tok::semi)) {
     SourceLocation SemiLoc = ConsumeToken();
-    if (ImportState != Sema::ModuleImportState::FirstDecl) {
+    if (!Introducer.isFirstPPToken()) {
       Diag(StartLoc, diag::err_global_module_introducer_not_at_start)
         << SourceRange(StartLoc, SemiLoc);
       return nullptr;
@@ -2416,7 +2417,7 @@ Parser::ParseModuleDecl(Sema::ModuleImportState &ImportState) {
   ExpectAndConsumeSemi(diag::err_module_expected_semi);
 
   return Actions.ActOnModuleDecl(StartLoc, ModuleLoc, MDK, Path, Partition,
-                                 ImportState);
+                                 ImportState, Introducer.isFirstPPToken());
 }
 
 Decl *Parser::ParseModuleImport(SourceLocation AtLoc,
diff --git a/clang/lib/Sema/SemaModule.cpp b/clang/lib/Sema/SemaModule.cpp
index 9fcaad48d3058..486aa3d395134 100644
--- a/clang/lib/Sema/SemaModule.cpp
+++ b/clang/lib/Sema/SemaModule.cpp
@@ -258,11 +258,11 @@ static bool DiagReservedModuleName(Sema &S, const IdentifierInfo *II,
 Sema::DeclGroupPtrTy
 Sema::ActOnModuleDecl(SourceLocation StartLoc, SourceLocation ModuleLoc,
                       ModuleDeclKind MDK, ModuleIdPath Path,
-                      ModuleIdPath Partition, ModuleImportState &ImportState) {
+                      ModuleIdPath Partition, ModuleImportState &ImportState,
+                      bool IntroducerIsFirstPPToken) {
   assert(getLangOpts().CPlusPlusModules &&
          "should only have module decl in standard C++ modules");
 
-  bool IsFirstDecl = ImportState == ModuleImportState::FirstDecl;
   bool SeenGMF = ImportState == ModuleImportState::GlobalFragment;
   // If any of the steps here fail, we count that as invalidating C++20
   // module state;
@@ -328,12 +328,14 @@ Sema::ActOnModuleDecl(SourceLocation StartLoc, SourceLocation ModuleLoc,
           SeenGMF == (bool)this->TheGlobalModuleFragment) &&
          "mismatched global module state");
 
-  // In C++20, the module-declaration must be the first declaration if there
-  // is no global module fragment.
-  if (getLangOpts().CPlusPlusModules && !IsFirstDecl && !SeenGMF) {
+  // In C++20, A module directive may only appear as the first preprocessing
+  // tokens in a file (excluding the global module fragment.).
+  if (getLangOpts().CPlusPlusModules && !IntroducerIsFirstPPToken && !SeenGMF) {
     Diag(ModuleLoc, diag::err_module_decl_not_at_start);
     SourceLocation BeginLoc =
-        ModuleScopes.empty()
+        PP.getFirstPPToken().getLocation().isValid()
+            ? PP.getFirstPPToken().getLocation()
+        : ModuleScopes.empty()
             ? SourceMgr.getLocForStartOfFile(SourceMgr.getMainFileID())
             : ModuleScopes.back().BeginLoc;
     if (BeginLoc.isValid()) {
diff --git a/clang/test/CXX/basic/basic.link/p1.cpp b/clang/test/CXX/basic/basic.link/p1.cpp
index c6a119aa7f47c..26a5f025f42fe 100644
--- a/clang/test/CXX/basic/basic.link/p1.cpp
+++ b/clang/test/CXX/basic/basic.link/p1.cpp
@@ -1,57 +1,128 @@
-// RUN: %clang_cc1 -std=c++2a -verify %s
-// RUN: %clang_cc1 -std=c++2a -verify -DNO_GLOBAL_FRAG %s
-// RUN: %clang_cc1 -std=c++2a -verify -DNO_MODULE_DECL %s
-// RUN: %clang_cc1 -std=c++2a -verify -DNO_PRIVATE_FRAG %s
-// RUN: %clang_cc1 -std=c++2a -verify -DNO_MODULE_DECL -DNO_PRIVATE_FRAG %s
-// RUN: %clang_cc1 -std=c++2a -verify -DNO_GLOBAL_FRAG -DNO_PRIVATE_FRAG %s
-// RUN: %clang_cc1 -std=c++2a -verify -DNO_GLOBAL_FRAG -DNO_MODULE_DECL %s
-// RUN: %clang_cc1 -std=c++2a -verify -DNO_GLOBAL_FRAG -DNO_MODULE_DECL -DNO_PRIVATE_FRAG %s
-// RUN: %clang_cc1 -std=c++2a -verify -DEXPORT_FRAGS %s
-
-#ifndef NO_GLOBAL_FRAG
-#ifdef EXPORT_FRAGS
-export // expected-error {{global module fragment cannot be exported}}
-#endif
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+// RUN: %clang_cc1 -std=c++2a -verify %t/M.cppm
+// RUN: %clang_cc1 -std=c++2a -verify %t/NoGlobalFrag.cppm
+// RUN: %clang_cc1 -std=c++2a -verify %t/NoModuleDecl.cppm
+// RUN: %clang_cc1 -std=c++2a -verify %t/NoPrivateFrag.cppm
+// RUN: %clang_cc1 -std=c++2a -verify %t/NoModuleDeclAndNoPrivateFrag.cppm
+// RUN: %clang_cc1 -std=c++2a -verify %t/NoGlobalFragAndNoPrivateFrag.cppm
+// RUN: %clang_cc1 -std=c++2a -verify %t/NoGlobalFragAndNoModuleDecl.cppm
+// RUN: %clang_cc1 -std=c++2a -verify %t/NoGlobalFragAndNoModuleDeclAndNoPrivateFrag.cppm
+// RUN: %clang_cc1 -std=c++2a -verify %t/ExportFrags.cppm
+
+//--- M.cppm
 module;
-#ifdef NO_MODULE_DECL
-// expected-error@-2 {{missing 'module' declaration at end of global module fragment introduced here}}
-#endif
-#endif
+extern int a; // #a1
+export module Foo;
+
+int a; // expected-error {{declaration of 'a' in module Foo follows declaration in the global module}}
+       // expected-note@#a1 {{previous decl}}
+extern int b;
+
+module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
+module :private; // #priv-frag
+int b; // ok
+module :private; // expected-error {{private module fragment redefined}}
+                 // expected-note@#priv-frag {{previous definition is here}}
+
+//--- NoGlobalFrag.cppm
+
+extern int a; // #a1
+export module Foo; // expected-error {{module declaration must occur at the start of the translation unit}}
+                   // expected-note@-2 {{add 'module;' to the start of the file to introduce a global module fragment}}
+
+// expected-error@#a2 {{declaration of 'a' in module Foo follows declaration in the global module}}
+// expected-note@#a1 {{previous decl}}
+
+int a; // #a2
+extern int b;
+module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
+module :private; // #priv-frag
+int b; // ok
+module :private; // expected-error {{private module fragment redefined}}
+// expected-note@#priv-frag {{previous definition is here}}
 
+//--- NoModuleDecl.cppm
+module; // expected-error {{missing 'module' declaration at end of global module fragment introduced here}}
 extern int a; // #a1
+int a; // #a2
+extern int b;
+module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
+module :private; // expected-error {{private module fragment declaration with no preceding module declaration}}
+int b; // ok
 
-#ifndef NO_MODULE_DECL
+//--- NoPrivateFrag.cppm
+module;
+extern int a; // #a1
 export module Foo;
-#ifdef NO_GLOBAL_FRAG
-// expected-error@-2 {{module declaration must occur at the start of the translation unit}}
+
+// expected-error@#a2 {{declaration of 'a' in module Foo follows declaration in the global module}}
+// expected-note@#a1 {{previous decl}}
+int a; // #a2
+extern int b;
+
+module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
+int b; // ok
+
+
+//--- NoModuleDeclAndNoPrivateFrag.cppm
+module; // expected-error {{missing 'module' declaration at end of global module fragment introduced here}}
+extern int a; // #a1
+int a; // #a2
+extern int b;
+
+module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
+
+int b; // ok
+
+//--- NoGlobalFragAndNoPrivateFrag.cppm
+extern int a; // #a1
+export module Foo; // expected-error {{module declaration must occur at the start of the translation unit}}
 // expected-note@1 {{add 'module;' to the start of the file to introduce a global module fragment}}
-#endif
 
 // expected-error@#a2 {{declaration of 'a' in module Foo follows declaration in the global module}}
 // expected-note@#a1 {{previous decl}}
-#endif
 
 int a; // #a2
 extern int b;
 
 module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
 
-#ifndef NO_PRIVATE_FRAG
-#ifdef EXPORT_FRAGS
-export // expected-error {{private module fragment cannot be exported}}
-#endif
+int b; // ok
+
+//--- NoGlobalFragAndNoModuleDecl.cppm
+extern int a; // #a1
+int a; // #a2
+extern int b;
+module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
 module :private; // #priv-frag
-#ifdef NO_MODULE_DECL
-// expected-error@-2 {{private module fragment declaration with no preceding module declaration}}
-#endif
-#endif
+// expected-error@-1 {{private module fragment declaration with no preceding module declaration}}
+int b; // ok
 
+
+//--- NoGlobalFragAndNoModuleDeclAndNoPrivateFrag.cppm
+extern int a; // #a1
+int a; // #a2
+extern int b;
+
+module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
 int b; // ok
 
+//--- ExportFrags.cppm
+export module; // expected-error {{global module fragment cannot be exported}}
+extern int a; // #a1
+export module Foo;
+// expected-error@#a2 {{declaration of 'a' in module Foo follows declaration in the global module}}
+// expected-note@#a1 {{previous decl}}
 
-#ifndef NO_PRIVATE_FRAG
-#ifndef NO_MODULE_DECL
+int a; // #a2
+extern int b;
+
+module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
+
+module :private; // #priv-frag
+
+int b; // ok
 module :private; // expected-error {{private module fragment redefined}}
-// expected-note@#priv-frag {{previous definition is here}}
-#endif
-#endif
+                 // expected-note@#priv-frag {{previous definition is here}}
diff --git a/clang/test/CXX/basic/basic.link/p2.cpp b/clang/test/CXX/basic/basic.link/p2.cpp
index ccad42022ee80..94cbc62490b2f 100644
--- a/clang/test/CXX/basic/basic.link/p2.cpp
+++ b/clang/test/CXX/basic/basic.link/p2.cpp
@@ -1,16 +1,16 @@
-// RUN: %clang_cc1 -std=c++2a -DEXPORT %s -verify
-// RUN: %clang_cc1 -std=c++2a -DEXPORT %s -emit-module-interface -o %t.pcm
-// RUN: %clang_cc1 -std=c++2a -UEXPORT %s -verify -fmodule-file=M=%t.pcm
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+// RUN: %clang_cc1 -std=c++2a %t/pmf_in_interface.cpp -verify
+// RUN: %clang_cc1 -std=c++2a %t/pmf_in_interface.cpp -emit-module-interface -o %t.pcm
+// RUN: %clang_cc1 -std=c++2a %t/pmf_in_implementation.cpp -verify -fmodule-file=M=%t.pcm
 
-#ifdef EXPORT
-// expected-no-diagnostics
-export
-#else
-// expected-note@+2 {{add 'export' here}}
-#endif
-module M;
 
-#ifndef EXPORT
-// expected-error@+2 {{private module fragment in module implementation unit}}
-#endif
+//--- pmf_in_interface.cpp
+// expected-no-diagnostics
+export module M;
 module :private;
+
+//--- pmf_in_implementation.cpp
+module M; // expected-note {{add 'export' here}}
+module :private; // expected-error {{private module fragment in module implementation unit}}
diff --git a/clang/test/CXX/basic/basic.scope/basic.scope.namespace/p2.cpp b/clang/test/CXX/basic/basic.scope/basic.scope.namespace/p2.cpp
index d70eb7de22c6a..fd0038b3f7745 100644
--- a/clang/test/CXX/basic/basic.scope/basic.scope.namespace/p2.cpp
+++ b/clang/test/CXX/basic/basic.scope/basic.scope.namespace/p2.cpp
@@ -1,14 +1,16 @@
 // RUN: rm -rf %t
-// RUN: mkdir -p %t
-// RUN: echo '#ifndef FOO_H' > %t/foo.h
-// RUN: echo '#define FOO_H' >> %t/foo.h
-// RUN: echo 'extern int in_header;' >> %t/foo.h
-// RUN: echo '#endif' >> %t/foo.h
-// RUN: %clang_cc1 -std=c++2a -I%t -emit-module-interface -DINTERFACE %s -o %t.pcm
-// RUN: %clang_cc1 -std=c++2a -I%t -fmodule-file=A=%t.pcm -DIMPLEMENTATION %s -verify -fno-modules-error-recovery
-// RUN: %clang_cc1 -std=c++2a -I%t -fmodule-file=A=%t.pcm %s -verify -fno-modules-error-recovery
-
-#ifdef INTERFACE
+// RUN: split-file %s %t
+// RUN: %clang_cc1 -std=c++2a -I%t -emit-module-interface %t/interface.cppm -o %t.pcm
+// RUN: %clang_cc1 -std=c++2a -I%t -fmodule-file=A=%t.pcm %t/implA.cppm -verify -fno-modules-error-recovery
+// RUN: %clang_cc1 -std=c++2a -I%t -fmodule-file=A=%t.pcm %t/implB.cppm -verify -fno-modules-error-recovery
+
+//--- foo.h
+#ifndef FOO_H
+#define FOO_H
+extern int in_header;
+#endif
+
+//--- interface.cppm
 module;
 #include "foo.h"
 // FIXME: The following need to be moved to a header file. The global module
@@ -22,11 +24,9 @@ static int internal;
 module :private;
 int not_exported_private;
 static int internal_private;
-#else
 
-#ifdef IMPLEMENTATION
+//--- implA.cppm
 module;
-#endif
 
 void test_early() {
   in_header = 1; // expected-error {{use of undeclared identifier 'in_header'}}
@@ -46,11 +46,7 @@ void test_early() {
   internal_private = 1; // expected-error {{undeclared identifier}}
 }
 
-#ifdef IMPLEMENTATION
 module A;
-#else
-import A;
-#endif
 
 void test_late() {
   in_header = 1; // expected-error {{missing '#include "foo.h"'; 'in_header' must be declared before it is used}}
@@ -61,20 +57,54 @@ void test_late() {
   exported = 1;
 
   not_exported = 1;
-#ifndef IMPLEMENTATION
-  // expected-error@-2 {{use of undeclared identifier 'not_exported'; did you mean 'exported'?}}
-  // [email protected]:18 {{'exported' declared here}}
-#endif
 
   internal = 1; // expected-error {{use of undeclared identifier 'internal'}}
 
   not_exported_private = 1;
-#ifndef IMPLEMENTATION
-  // FIXME: should not be visible here
-  // expected-error@-3 {{undeclared identifier}}
-#endif
 
   internal_private = 1; // expected-error {{use of undeclared identifier 'internal_private'}}
 }
 
-#endif
+//--- implB.cppm
+module;
+
+void test_early() {
+  in_header = 1; // expected-error {{use of undeclared identifier 'in_header'}}
+  // expected-note@* {{not visible}}
+
+  global_module_fragment = 1; // expected-error {{use of undeclared identifier 'global_module_fragment'}}
+
+  exported = 1; // expected-error {{use of undeclared identifier 'exported'}}
+
+  not_exported = 1; // expected-error {{use of undeclared identifier 'not_exported'}}
+
+  // FIXME: We need better diagnostic message for static variable.
+  internal = 1; // expected-error {{use of undeclared identifier 'internal'}}
+
+  not_exported_private = 1; // expected-error {{undeclared identifier}}
+
+  internal_private = 1; // expected-error {{undeclared identifier}}
+}
+
+export module B;
+import A;
+
+void test_late() {
+  in_header = 1; // expected-error {{missing '#include "foo.h"'; 'in_header' must be declared before it is used}}
+  // expected-note@* {{not visible}}
+
+  global_module_fragment = 1; // expected-error {{missing '#include'; 'global_module_fragment' must be declared before it is used}}
+
+  exported = 1;
+
+  not_exported = 1; // expected-error {{use of undeclared identifier 'not_exported'; did you mean 'exported'?}}
+  // expected-note@* {{'exported' declared here}}
+
+  internal = 1; // expected-error {{use of undeclared identifier 'internal'}}
+
+  not_exported_private = 1;
+  // F...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Jun 14, 2025

@llvm/pr-subscribers-clang-modules

Author: None (yronglin)

Changes

This PR is 2nd part of P1857R3 implementation, and mainly implement the restriction A module directive may only appear as the first preprocessing tokens in a file (excluding the global module fragment.):
cpp.pre:

module-file:
    pp-global-module-fragment[opt] pp-module group[opt] pp-private-module-fragment[opt]

We also refine tests use split-file instead of conditional macro.


Patch is 64.63 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/144233.diff

25 Files Affected:

  • (modified) clang/include/clang/Lex/Lexer.h (+3)
  • (modified) clang/include/clang/Lex/Preprocessor.h (+5)
  • (modified) clang/include/clang/Lex/Token.h (+9-3)
  • (modified) clang/include/clang/Sema/Sema.h (+2-1)
  • (modified) clang/lib/Lex/Lexer.cpp (+10)
  • (modified) clang/lib/Parse/Parser.cpp (+4-3)
  • (modified) clang/lib/Sema/SemaModule.cpp (+8-6)
  • (modified) clang/test/CXX/basic/basic.link/p1.cpp (+107-36)
  • (modified) clang/test/CXX/basic/basic.link/p2.cpp (+13-13)
  • (modified) clang/test/CXX/basic/basic.scope/basic.scope.namespace/p2.cpp (+56-26)
  • (modified) clang/test/CXX/module/basic/basic.def.odr/p6.cppm (+134-40)
  • (modified) clang/test/CXX/module/basic/basic.link/module-declaration.cpp (+35-29)
  • (modified) clang/test/CXX/module/dcl.dcl/dcl.module/dcl.module.import/p1.cppm (+23-9)
  • (modified) clang/test/CXX/module/dcl.dcl/dcl.module/dcl.module.interface/p1.cppm (+18-21)
  • (modified) clang/test/CXX/module/dcl.dcl/dcl.module/p1.cpp (+30-14)
  • (modified) clang/test/CXX/module/dcl.dcl/dcl.module/p5.cpp (+48-17)
  • (modified) clang/test/CXX/module/module.interface/p2.cpp (+12-14)
  • (modified) clang/test/CXX/module/module.unit/p8.cpp (+28-20)
  • (modified) clang/test/Driver/modules.cpp (+18-13)
  • (modified) clang/test/Modules/named-modules-adl-3.cppm (+1)
  • (modified) clang/test/Modules/reserved-names-1.cppm (+10)
  • (modified) clang/test/Modules/reserved-names-system-header-1.cpp (+1)
  • (modified) clang/test/Modules/reserved-names-system-header-2.cpp (+1)
  • (modified) clang/test/SemaCXX/modules.cppm (+46-29)
  • (modified) clang/test/SemaCXX/type-aware-new-delete-transparent-contexts.cpp (+15-5)
diff --git a/clang/include/clang/Lex/Lexer.h b/clang/include/clang/Lex/Lexer.h
index bb65ae010cffa..ca812ba1583fb 100644
--- a/clang/include/clang/Lex/Lexer.h
+++ b/clang/include/clang/Lex/Lexer.h
@@ -143,6 +143,9 @@ class Lexer : public PreprocessorLexer {
   /// True if this is the first time we're lexing the input file.
   bool IsFirstTimeLexingFile;
 
+  /// True if current lexing token is the first pp-token.
+  bool IsFirstPPToken;
+
   // NewLinePtr - A pointer to new line character '\n' being lexed. For '\r\n',
   // it also points to '\n.'
   const char *NewLinePtr;
diff --git a/clang/include/clang/Lex/Preprocessor.h b/clang/include/clang/Lex/Preprocessor.h
index 78be2bd64d61c..436ac783928f4 100644
--- a/clang/include/clang/Lex/Preprocessor.h
+++ b/clang/include/clang/Lex/Preprocessor.h
@@ -350,6 +350,9 @@ class Preprocessor {
   /// Whether the last token we lexed was an '@'.
   bool LastTokenWasAt = false;
 
+  /// First pp-token in current translation unit.
+  Token FirstPPToken;
+
   /// A position within a C++20 import-seq.
   class StdCXXImportSeq {
   public:
@@ -1766,6 +1769,8 @@ class Preprocessor {
   std::optional<LexEmbedParametersResult> LexEmbedParameters(Token &Current,
                                                              bool ForHasEmbed);
 
+  void setFirstPPToken(const Token &Tok) { FirstPPToken = Tok; }
+  Token getFirstPPToken() const { return FirstPPToken; }
   bool LexAfterModuleImport(Token &Result);
   void CollectPpImportSuffix(SmallVectorImpl<Token> &Toks);
 
diff --git a/clang/include/clang/Lex/Token.h b/clang/include/clang/Lex/Token.h
index 4f29fb7d11415..d4dfd7b44d9af 100644
--- a/clang/include/clang/Lex/Token.h
+++ b/clang/include/clang/Lex/Token.h
@@ -86,9 +86,12 @@ class Token {
                                 // macro stringizing or charizing operator.
     CommaAfterElided = 0x200, // The comma following this token was elided (MS).
     IsEditorPlaceholder = 0x400, // This identifier is a placeholder.
-    IsReinjected = 0x800, // A phase 4 token that was produced before and
-                          // re-added, e.g. via EnterTokenStream. Annotation
-                          // tokens are *not* reinjected.
+
+    IsReinjected = 0x800,  // A phase 4 token that was produced before and
+                           // re-added, e.g. via EnterTokenStream. Annotation
+                           // tokens are *not* reinjected.
+    FirstPPToken = 0x1000, // This token is the first pp token in the
+                           // translation unit.
   };
 
   tok::TokenKind getKind() const { return Kind; }
@@ -318,6 +321,9 @@ class Token {
   /// represented as characters between '<#' and '#>' in the source code. The
   /// lexer uses identifier tokens to represent placeholders.
   bool isEditorPlaceholder() const { return getFlag(IsEditorPlaceholder); }
+
+  /// Returns true if this token is the first pp-token.
+  bool isFirstPPToken() const { return getFlag(FirstPPToken); }
 };
 
 /// Information about the conditional stack (\#if directives)
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 29452bb37260d..9397546c8fc5d 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -9822,7 +9822,8 @@ class Sema final : public SemaBase {
   DeclGroupPtrTy ActOnModuleDecl(SourceLocation StartLoc,
                                  SourceLocation ModuleLoc, ModuleDeclKind MDK,
                                  ModuleIdPath Path, ModuleIdPath Partition,
-                                 ModuleImportState &ImportState);
+                                 ModuleImportState &ImportState,
+                                 bool IntroducerIsFirstPPToken);
 
   /// The parser has processed a global-module-fragment declaration that begins
   /// the definition of the global module fragment of the current module unit.
diff --git a/clang/lib/Lex/Lexer.cpp b/clang/lib/Lex/Lexer.cpp
index 93200458f04b4..06a74fa759794 100644
--- a/clang/lib/Lex/Lexer.cpp
+++ b/clang/lib/Lex/Lexer.cpp
@@ -174,6 +174,8 @@ void Lexer::InitLexer(const char *BufStart, const char *BufPtr,
   ExtendedTokenMode = 0;
 
   NewLinePtr = nullptr;
+
+  IsFirstPPToken = true;
 }
 
 /// Lexer constructor - Create a new lexer object for the specified buffer
@@ -3725,6 +3727,11 @@ bool Lexer::Lex(Token &Result) {
     HasLeadingEmptyMacro = false;
   }
 
+  if (IsFirstPPToken) {
+    Result.setFlag(Token::FirstPPToken);
+    IsFirstPPToken = false;
+  }
+
   bool atPhysicalStartOfLine = IsAtPhysicalStartOfLine;
   IsAtPhysicalStartOfLine = false;
   bool isRawLex = isLexingRawMode();
@@ -3732,6 +3739,9 @@ bool Lexer::Lex(Token &Result) {
   bool returnedToken = LexTokenInternal(Result, atPhysicalStartOfLine);
   // (After the LexTokenInternal call, the lexer might be destroyed.)
   assert((returnedToken || !isRawLex) && "Raw lex must succeed");
+
+  if (returnedToken && Result.isFirstPPToken() && PP)
+    PP->setFirstPPToken(Result);
   return returnedToken;
 }
 
diff --git a/clang/lib/Parse/Parser.cpp b/clang/lib/Parse/Parser.cpp
index 788ed79e0c1fa..18f399aca59e8 100644
--- a/clang/lib/Parse/Parser.cpp
+++ b/clang/lib/Parse/Parser.cpp
@@ -2340,7 +2340,8 @@ void Parser::ParseMicrosoftIfExistsExternalDeclaration() {
 
 Parser::DeclGroupPtrTy
 Parser::ParseModuleDecl(Sema::ModuleImportState &ImportState) {
-  SourceLocation StartLoc = Tok.getLocation();
+  Token Introducer = Tok;
+  SourceLocation StartLoc = Introducer.getLocation();
 
   Sema::ModuleDeclKind MDK = TryConsumeToken(tok::kw_export)
                                  ? Sema::ModuleDeclKind::Interface
@@ -2359,7 +2360,7 @@ Parser::ParseModuleDecl(Sema::ModuleImportState &ImportState) {
   // Parse a global-module-fragment, if present.
   if (getLangOpts().CPlusPlusModules && Tok.is(tok::semi)) {
     SourceLocation SemiLoc = ConsumeToken();
-    if (ImportState != Sema::ModuleImportState::FirstDecl) {
+    if (!Introducer.isFirstPPToken()) {
       Diag(StartLoc, diag::err_global_module_introducer_not_at_start)
         << SourceRange(StartLoc, SemiLoc);
       return nullptr;
@@ -2416,7 +2417,7 @@ Parser::ParseModuleDecl(Sema::ModuleImportState &ImportState) {
   ExpectAndConsumeSemi(diag::err_module_expected_semi);
 
   return Actions.ActOnModuleDecl(StartLoc, ModuleLoc, MDK, Path, Partition,
-                                 ImportState);
+                                 ImportState, Introducer.isFirstPPToken());
 }
 
 Decl *Parser::ParseModuleImport(SourceLocation AtLoc,
diff --git a/clang/lib/Sema/SemaModule.cpp b/clang/lib/Sema/SemaModule.cpp
index 9fcaad48d3058..486aa3d395134 100644
--- a/clang/lib/Sema/SemaModule.cpp
+++ b/clang/lib/Sema/SemaModule.cpp
@@ -258,11 +258,11 @@ static bool DiagReservedModuleName(Sema &S, const IdentifierInfo *II,
 Sema::DeclGroupPtrTy
 Sema::ActOnModuleDecl(SourceLocation StartLoc, SourceLocation ModuleLoc,
                       ModuleDeclKind MDK, ModuleIdPath Path,
-                      ModuleIdPath Partition, ModuleImportState &ImportState) {
+                      ModuleIdPath Partition, ModuleImportState &ImportState,
+                      bool IntroducerIsFirstPPToken) {
   assert(getLangOpts().CPlusPlusModules &&
          "should only have module decl in standard C++ modules");
 
-  bool IsFirstDecl = ImportState == ModuleImportState::FirstDecl;
   bool SeenGMF = ImportState == ModuleImportState::GlobalFragment;
   // If any of the steps here fail, we count that as invalidating C++20
   // module state;
@@ -328,12 +328,14 @@ Sema::ActOnModuleDecl(SourceLocation StartLoc, SourceLocation ModuleLoc,
           SeenGMF == (bool)this->TheGlobalModuleFragment) &&
          "mismatched global module state");
 
-  // In C++20, the module-declaration must be the first declaration if there
-  // is no global module fragment.
-  if (getLangOpts().CPlusPlusModules && !IsFirstDecl && !SeenGMF) {
+  // In C++20, A module directive may only appear as the first preprocessing
+  // tokens in a file (excluding the global module fragment.).
+  if (getLangOpts().CPlusPlusModules && !IntroducerIsFirstPPToken && !SeenGMF) {
     Diag(ModuleLoc, diag::err_module_decl_not_at_start);
     SourceLocation BeginLoc =
-        ModuleScopes.empty()
+        PP.getFirstPPToken().getLocation().isValid()
+            ? PP.getFirstPPToken().getLocation()
+        : ModuleScopes.empty()
             ? SourceMgr.getLocForStartOfFile(SourceMgr.getMainFileID())
             : ModuleScopes.back().BeginLoc;
     if (BeginLoc.isValid()) {
diff --git a/clang/test/CXX/basic/basic.link/p1.cpp b/clang/test/CXX/basic/basic.link/p1.cpp
index c6a119aa7f47c..26a5f025f42fe 100644
--- a/clang/test/CXX/basic/basic.link/p1.cpp
+++ b/clang/test/CXX/basic/basic.link/p1.cpp
@@ -1,57 +1,128 @@
-// RUN: %clang_cc1 -std=c++2a -verify %s
-// RUN: %clang_cc1 -std=c++2a -verify -DNO_GLOBAL_FRAG %s
-// RUN: %clang_cc1 -std=c++2a -verify -DNO_MODULE_DECL %s
-// RUN: %clang_cc1 -std=c++2a -verify -DNO_PRIVATE_FRAG %s
-// RUN: %clang_cc1 -std=c++2a -verify -DNO_MODULE_DECL -DNO_PRIVATE_FRAG %s
-// RUN: %clang_cc1 -std=c++2a -verify -DNO_GLOBAL_FRAG -DNO_PRIVATE_FRAG %s
-// RUN: %clang_cc1 -std=c++2a -verify -DNO_GLOBAL_FRAG -DNO_MODULE_DECL %s
-// RUN: %clang_cc1 -std=c++2a -verify -DNO_GLOBAL_FRAG -DNO_MODULE_DECL -DNO_PRIVATE_FRAG %s
-// RUN: %clang_cc1 -std=c++2a -verify -DEXPORT_FRAGS %s
-
-#ifndef NO_GLOBAL_FRAG
-#ifdef EXPORT_FRAGS
-export // expected-error {{global module fragment cannot be exported}}
-#endif
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+// RUN: %clang_cc1 -std=c++2a -verify %t/M.cppm
+// RUN: %clang_cc1 -std=c++2a -verify %t/NoGlobalFrag.cppm
+// RUN: %clang_cc1 -std=c++2a -verify %t/NoModuleDecl.cppm
+// RUN: %clang_cc1 -std=c++2a -verify %t/NoPrivateFrag.cppm
+// RUN: %clang_cc1 -std=c++2a -verify %t/NoModuleDeclAndNoPrivateFrag.cppm
+// RUN: %clang_cc1 -std=c++2a -verify %t/NoGlobalFragAndNoPrivateFrag.cppm
+// RUN: %clang_cc1 -std=c++2a -verify %t/NoGlobalFragAndNoModuleDecl.cppm
+// RUN: %clang_cc1 -std=c++2a -verify %t/NoGlobalFragAndNoModuleDeclAndNoPrivateFrag.cppm
+// RUN: %clang_cc1 -std=c++2a -verify %t/ExportFrags.cppm
+
+//--- M.cppm
 module;
-#ifdef NO_MODULE_DECL
-// expected-error@-2 {{missing 'module' declaration at end of global module fragment introduced here}}
-#endif
-#endif
+extern int a; // #a1
+export module Foo;
+
+int a; // expected-error {{declaration of 'a' in module Foo follows declaration in the global module}}
+       // expected-note@#a1 {{previous decl}}
+extern int b;
+
+module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
+module :private; // #priv-frag
+int b; // ok
+module :private; // expected-error {{private module fragment redefined}}
+                 // expected-note@#priv-frag {{previous definition is here}}
+
+//--- NoGlobalFrag.cppm
+
+extern int a; // #a1
+export module Foo; // expected-error {{module declaration must occur at the start of the translation unit}}
+                   // expected-note@-2 {{add 'module;' to the start of the file to introduce a global module fragment}}
+
+// expected-error@#a2 {{declaration of 'a' in module Foo follows declaration in the global module}}
+// expected-note@#a1 {{previous decl}}
+
+int a; // #a2
+extern int b;
+module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
+module :private; // #priv-frag
+int b; // ok
+module :private; // expected-error {{private module fragment redefined}}
+// expected-note@#priv-frag {{previous definition is here}}
 
+//--- NoModuleDecl.cppm
+module; // expected-error {{missing 'module' declaration at end of global module fragment introduced here}}
 extern int a; // #a1
+int a; // #a2
+extern int b;
+module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
+module :private; // expected-error {{private module fragment declaration with no preceding module declaration}}
+int b; // ok
 
-#ifndef NO_MODULE_DECL
+//--- NoPrivateFrag.cppm
+module;
+extern int a; // #a1
 export module Foo;
-#ifdef NO_GLOBAL_FRAG
-// expected-error@-2 {{module declaration must occur at the start of the translation unit}}
+
+// expected-error@#a2 {{declaration of 'a' in module Foo follows declaration in the global module}}
+// expected-note@#a1 {{previous decl}}
+int a; // #a2
+extern int b;
+
+module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
+int b; // ok
+
+
+//--- NoModuleDeclAndNoPrivateFrag.cppm
+module; // expected-error {{missing 'module' declaration at end of global module fragment introduced here}}
+extern int a; // #a1
+int a; // #a2
+extern int b;
+
+module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
+
+int b; // ok
+
+//--- NoGlobalFragAndNoPrivateFrag.cppm
+extern int a; // #a1
+export module Foo; // expected-error {{module declaration must occur at the start of the translation unit}}
 // expected-note@1 {{add 'module;' to the start of the file to introduce a global module fragment}}
-#endif
 
 // expected-error@#a2 {{declaration of 'a' in module Foo follows declaration in the global module}}
 // expected-note@#a1 {{previous decl}}
-#endif
 
 int a; // #a2
 extern int b;
 
 module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
 
-#ifndef NO_PRIVATE_FRAG
-#ifdef EXPORT_FRAGS
-export // expected-error {{private module fragment cannot be exported}}
-#endif
+int b; // ok
+
+//--- NoGlobalFragAndNoModuleDecl.cppm
+extern int a; // #a1
+int a; // #a2
+extern int b;
+module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
 module :private; // #priv-frag
-#ifdef NO_MODULE_DECL
-// expected-error@-2 {{private module fragment declaration with no preceding module declaration}}
-#endif
-#endif
+// expected-error@-1 {{private module fragment declaration with no preceding module declaration}}
+int b; // ok
 
+
+//--- NoGlobalFragAndNoModuleDeclAndNoPrivateFrag.cppm
+extern int a; // #a1
+int a; // #a2
+extern int b;
+
+module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
 int b; // ok
 
+//--- ExportFrags.cppm
+export module; // expected-error {{global module fragment cannot be exported}}
+extern int a; // #a1
+export module Foo;
+// expected-error@#a2 {{declaration of 'a' in module Foo follows declaration in the global module}}
+// expected-note@#a1 {{previous decl}}
 
-#ifndef NO_PRIVATE_FRAG
-#ifndef NO_MODULE_DECL
+int a; // #a2
+extern int b;
+
+module; // expected-error {{'module;' introducing a global module fragment can appear only at the start of the translation unit}}
+
+module :private; // #priv-frag
+
+int b; // ok
 module :private; // expected-error {{private module fragment redefined}}
-// expected-note@#priv-frag {{previous definition is here}}
-#endif
-#endif
+                 // expected-note@#priv-frag {{previous definition is here}}
diff --git a/clang/test/CXX/basic/basic.link/p2.cpp b/clang/test/CXX/basic/basic.link/p2.cpp
index ccad42022ee80..94cbc62490b2f 100644
--- a/clang/test/CXX/basic/basic.link/p2.cpp
+++ b/clang/test/CXX/basic/basic.link/p2.cpp
@@ -1,16 +1,16 @@
-// RUN: %clang_cc1 -std=c++2a -DEXPORT %s -verify
-// RUN: %clang_cc1 -std=c++2a -DEXPORT %s -emit-module-interface -o %t.pcm
-// RUN: %clang_cc1 -std=c++2a -UEXPORT %s -verify -fmodule-file=M=%t.pcm
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+// RUN: %clang_cc1 -std=c++2a %t/pmf_in_interface.cpp -verify
+// RUN: %clang_cc1 -std=c++2a %t/pmf_in_interface.cpp -emit-module-interface -o %t.pcm
+// RUN: %clang_cc1 -std=c++2a %t/pmf_in_implementation.cpp -verify -fmodule-file=M=%t.pcm
 
-#ifdef EXPORT
-// expected-no-diagnostics
-export
-#else
-// expected-note@+2 {{add 'export' here}}
-#endif
-module M;
 
-#ifndef EXPORT
-// expected-error@+2 {{private module fragment in module implementation unit}}
-#endif
+//--- pmf_in_interface.cpp
+// expected-no-diagnostics
+export module M;
 module :private;
+
+//--- pmf_in_implementation.cpp
+module M; // expected-note {{add 'export' here}}
+module :private; // expected-error {{private module fragment in module implementation unit}}
diff --git a/clang/test/CXX/basic/basic.scope/basic.scope.namespace/p2.cpp b/clang/test/CXX/basic/basic.scope/basic.scope.namespace/p2.cpp
index d70eb7de22c6a..fd0038b3f7745 100644
--- a/clang/test/CXX/basic/basic.scope/basic.scope.namespace/p2.cpp
+++ b/clang/test/CXX/basic/basic.scope/basic.scope.namespace/p2.cpp
@@ -1,14 +1,16 @@
 // RUN: rm -rf %t
-// RUN: mkdir -p %t
-// RUN: echo '#ifndef FOO_H' > %t/foo.h
-// RUN: echo '#define FOO_H' >> %t/foo.h
-// RUN: echo 'extern int in_header;' >> %t/foo.h
-// RUN: echo '#endif' >> %t/foo.h
-// RUN: %clang_cc1 -std=c++2a -I%t -emit-module-interface -DINTERFACE %s -o %t.pcm
-// RUN: %clang_cc1 -std=c++2a -I%t -fmodule-file=A=%t.pcm -DIMPLEMENTATION %s -verify -fno-modules-error-recovery
-// RUN: %clang_cc1 -std=c++2a -I%t -fmodule-file=A=%t.pcm %s -verify -fno-modules-error-recovery
-
-#ifdef INTERFACE
+// RUN: split-file %s %t
+// RUN: %clang_cc1 -std=c++2a -I%t -emit-module-interface %t/interface.cppm -o %t.pcm
+// RUN: %clang_cc1 -std=c++2a -I%t -fmodule-file=A=%t.pcm %t/implA.cppm -verify -fno-modules-error-recovery
+// RUN: %clang_cc1 -std=c++2a -I%t -fmodule-file=A=%t.pcm %t/implB.cppm -verify -fno-modules-error-recovery
+
+//--- foo.h
+#ifndef FOO_H
+#define FOO_H
+extern int in_header;
+#endif
+
+//--- interface.cppm
 module;
 #include "foo.h"
 // FIXME: The following need to be moved to a header file. The global module
@@ -22,11 +24,9 @@ static int internal;
 module :private;
 int not_exported_private;
 static int internal_private;
-#else
 
-#ifdef IMPLEMENTATION
+//--- implA.cppm
 module;
-#endif
 
 void test_early() {
   in_header = 1; // expected-error {{use of undeclared identifier 'in_header'}}
@@ -46,11 +46,7 @@ void test_early() {
   internal_private = 1; // expected-error {{undeclared identifier}}
 }
 
-#ifdef IMPLEMENTATION
 module A;
-#else
-import A;
-#endif
 
 void test_late() {
   in_header = 1; // expected-error {{missing '#include "foo.h"'; 'in_header' must be declared before it is used}}
@@ -61,20 +57,54 @@ void test_late() {
   exported = 1;
 
   not_exported = 1;
-#ifndef IMPLEMENTATION
-  // expected-error@-2 {{use of undeclared identifier 'not_exported'; did you mean 'exported'?}}
-  // [email protected]:18 {{'exported' declared here}}
-#endif
 
   internal = 1; // expected-error {{use of undeclared identifier 'internal'}}
 
   not_exported_private = 1;
-#ifndef IMPLEMENTATION
-  // FIXME: should not be visible here
-  // expected-error@-3 {{undeclared identifier}}
-#endif
 
   internal_private = 1; // expected-error {{use of undeclared identifier 'internal_private'}}
 }
 
-#endif
+//--- implB.cppm
+module;
+
+void test_early() {
+  in_header = 1; // expected-error {{use of undeclared identifier 'in_header'}}
+  // expected-note@* {{not visible}}
+
+  global_module_fragment = 1; // expected-error {{use of undeclared identifier 'global_module_fragment'}}
+
+  exported = 1; // expected-error {{use of undeclared identifier 'exported'}}
+
+  not_exported = 1; // expected-error {{use of undeclared identifier 'not_exported'}}
+
+  // FIXME: We need better diagnostic message for static variable.
+  internal = 1; // expected-error {{use of undeclared identifier 'internal'}}
+
+  not_exported_private = 1; // expected-error {{undeclared identifier}}
+
+  internal_private = 1; // expected-error {{undeclared identifier}}
+}
+
+export module B;
+import A;
+
+void test_late() {
+  in_header = 1; // expected-error {{missing '#include "foo.h"'; 'in_header' must be declared before it is used}}
+  // expected-note@* {{not visible}}
+
+  global_module_fragment = 1; // expected-error {{missing '#include'; 'global_module_fragment' must be declared before it is used}}
+
+  exported = 1;
+
+  not_exported = 1; // expected-error {{use of undeclared identifier 'not_exported'; did you mean 'exported'?}}
+  // expected-note@* {{'exported' declared here}}
+
+  internal = 1; // expected-error {{use of undeclared identifier 'internal'}}
+
+  not_exported_private = 1;
+  // F...
[truncated]

@vogelsgesang
Copy link
Member

Should this commit also mark the papers as implemented on https://clang.llvm.org/cxx_status.html?

@yronglin
Copy link
Contributor Author

yronglin commented Jun 15, 2025

Should this commit also mark the papers as implemented on https://clang.llvm.org/cxx_status.html?

Thanks for the review! This PR is only part of it, so I want to mark it as "implemented" after the paper feature is fully implemented.

bool atPhysicalStartOfLine = IsAtPhysicalStartOfLine;
IsAtPhysicalStartOfLine = false;
bool isRawLex = isLexingRawMode();
(void) isRawLex;
bool returnedToken = LexTokenInternal(Result, atPhysicalStartOfLine);
// (After the LexTokenInternal call, the lexer might be destroyed.)
assert((returnedToken || !isRawLex) && "Raw lex must succeed");

if (returnedToken && Result.isFirstPPToken() && PP)
PP->setFirstPPToken(Result);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is incorrect to save the first pp-token here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants