[flang] Initial documentation for .mod files

Original-commit: flang-compiler/f18@f1809b833f
Reviewed-on: https://github.com/flang-compiler/f18/pull/126
Tree-same-pre-rewrite: false
2018-07-16 16:26:14 -07:00 · 2018-07-16 16:26:14 -07:00 · dda6fa8eba
parent 43f2ce0739
commit dda6fa8eba
1 changed files with 124 additions and 0 deletions
--- a/flang/documentation/mod-files.md
+++ b/flang/documentation/mod-files.md
@@ -0,0 +1,124 @@
+# Module Files
+
+Module files hold information from a module that is necessary to compile 
+program units that depend on the module.
+
+## Name
+
+Module files must be searchable by module name. They are typically named
+`<modulename>.mod`. The advantage of using `.mod` is that it is consistent with
+other compilers so users will know what they are. Also, makefiles and scripts
+often use `rm *.mod` to clean up.
+
+The disadvantage of using the same name as other compilers is that it is not
+clear which compiler created a `.mod` file and files from multiple compilers
+cannot be in the same directory. This could be solved by adding something
+between the module name and extension, e.g. `<modulename>-f18.mod`.
+
+## Format
+
+The proposed format for module files is Fortran source.
+Declarations of all visible entities will be included, along with private
+entities that they depend on. Executable statements will be omitted.
+
+### Header
+
+There will be a header containing extra information that cannot be expressed
+in Fortran. This will take the form of a comment or directive
+at the beginning of the file.
+
+If it's a comment, the module file reader would have to strip it out and
+perform *ad hoc* parsing on it. If it's a directive the compiler could
+parse it like other directives as part of the grammar.
+Processing the header before parsing might result in better error messages
+when the `.mod` file is invalid.
+
+Regardless of whether the header is a comment or directive we can use the
+same string to introduce it: `!mod$`.
+
+Information in the header:
+- Magic string to confirm it is an f18 `.mod` file
+- Version information: to indicate the version of the file format, in case it changes,
+  and the version of the compiler that wrote the file, for diagnostics.
+- Checksum of the body of the current file
+- Modules we depend on and the checksum of their module file when the current
+  module file is created
+- Source file dependency information?
+- Compilation options?
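+
+As a rough illustration only, a header carrying some of the fields listed above
+might look like the sketch below. Everything after `!mod$` (the field names,
+their order, and the angle-bracket placeholders) is a hypothetical layout
+invented for this example; the actual syntax has not been decided here.
+
+```fortran
+!mod$ f18 format:<format-version> compiler:<compiler-version> sum:<checksum-of-body>
+!mod$ depends:<modulename> sum:<checksum-of-its-module-file>
+module m
+  ! declarations from the body follow the header
+end module
+```
+
+Written this way, the header lines are ordinary comments to any Fortran
+processor that does not recognize `!mod$`.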
+
+### Body
+
+The body will consist of minimal Fortran source for the required declarations.
+Declarations will appear in the order in which they first appeared in the source.
+
+Some normalization will take place:
+- extraneous spaces will be removed
+- implicit types will be made explicit
+- attributes will be written in a consistent order
+- entity declarations will be combined into a single declaration
+- function return types specified in a *prefix-spec* will be replaced by
+  an entity declaration
+- etc.
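+
+As a sketch of what these normalizations could amount to, consider a small
+hypothetical module (the names `m`, `a`, `n`, `f`, and `x` are made up for this
+example, and the output shown is one plausible reading of the rules above, not
+a committed format). Given this source:
+
+```fortran
+module m
+  real :: a
+  dimension :: a(10)
+  integer   ::   n
+contains
+  real function f(x)
+    f = 2.0 * x
+  end function
+end module
+```
+
+the body of the module file might read roughly as follows: the two statements
+describing `a` are merged, extra spaces are dropped, the implicitly typed dummy
+argument `x` gets an explicit type, the result type moves from the prefix to an
+entity declaration, and the executable statement is omitted.
+
+```fortran
+module m
+  real :: a(10)
+  integer :: n
+contains
+  function f(x)
+    real :: f
+    real :: x
+  end function
+end module
+```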
+
+#### Symbols included
+
+All public symbols from the module need to be included.
+
+In addition, some private symbols are needed:
+- private types that appear in the public API
+- private components of non-private derived types
+- private parameters used in non-private declarations (initial values, kind parameters)
+- others?
+
+It might be possible to anonymize private names if users don't want them exposed
+in the `.mod` file. (Currently they are readable in PGI `.mod` files.)
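+
+To illustrate why some private symbols cannot be dropped, here is a small
+hypothetical module (all names are invented for this example). The parameter
+`k` and the type `t` are private, yet the public declarations of `scale` and
+`origin` cannot be reconstructed without them, so both would have to appear in
+the module file.
+
+```fortran
+module m
+  implicit none
+  private
+  ! private parameter used as a kind parameter in public declarations
+  integer, parameter :: k = kind(1.0d0)
+  ! private type that appears in the public API
+  type :: t
+    real(k) :: x
+  end type
+  type(t), public :: origin
+  real(k), public :: scale = 1.0_k
+end module
+```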
+
+#### USE association
+
+A module that contains `USE` statements needs them represented in the
+`.mod` file.
+Each use-associated symbol will be written as a separate *use-only* statement,
+possibly with renaming.
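+
+A minimal sketch of this scheme, assuming a hypothetical module `m1` that
+declares `a`, `b`, and `c`, and a module `m2` that uses them (all names are
+illustrative, and the exact output is an assumption):
+
+```fortran
+! Source of the module being compiled:
+!   module m2
+!     use m1, only: a, b
+!     use m1, only: z => c
+!   end module
+!
+! USE statements as they might be written to the module file,
+! one use-only statement per use-associated symbol:
+module m2
+  use m1, only: a
+  use m1, only: b
+  use m1, only: z => c
+end module
+```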
+
+Alternatives:
+- Emit a single `USE` for each module, listing all of the symbols that were
+  use-associated in the *only-list*.
+- Detect when all of the symbols from a module are imported (either by a *use-stmt*
+  without an *only-list* or because all of the public symbols of the module
+  have been listed in *only-list*s). In that case collapse them into a single *use-stmt*.
+- Emit the *use-stmt*s that appeared in the original source.
+
+## Reading and writing module files
+
+A command-line option (e.g. `-module`) will specify a directory to
+search for `.mod` files and to write them to.
+If not specified, it defaults to the current directory.
+
+### Writing module files
+
+When writing a module file, if the existing one matches what would be written,
+the timestamp is not updated.
+
+Module files will be written after semantics, i.e. after the compiler has
+determined the module is valid Fortran.<br>
+**NOTE:** PGI does create `.mod` files sometimes even when the module has a
+compilation error.
+
+When the compiler can get far enough to determine it is compiling a module
+but then encounters an error, it will delete the existing `.mod` file
+if present.
+
+### Reading module files
+
+When the compiler finds a `.mod` file it needs to read, it first checks the first
+line and verifies it is a valid module file. It can also verify checksums of
+modules it depends on and report if they are out of date.
+
+If the header is valid, the module file will be run through the parser and name
+resolution to recreate the symbols from the module. Once the symbol table is
+populated the parse tree can be discarded.
+
+When processing `.mod` files we know they are valid Fortran with these properties:
+1. The input (without the header) is already in the "cooked input" format.
+2. No preprocessing is necessary.
+3. No errors can occur.