The Fortran language recognizer here is an LL recursive descent parser composed from a "parser combinator" library that defines a few fundamental parsers and a few ways to compose them into more powerful parsers.

For our purposes here, a *parser* is any object that can attempt to recognize an instance of some syntax from an input stream. It may succeed or fail. On success, it may return some semantic value to its caller.

In C++ terms, a parser is any instance of a class that (1) has a constexpr default constructor, (2) defines a resultType typedef, and (3) provides a member or static function

    std::optional<resultType> Parse(ParseState *) const;
    static std::optional<resultType> Parse(ParseState *);

that accepts a pointer to a ParseState as its argument and returns a std::optional<resultType> as a result, with the presence or absence of a value in the std::optional<> signifying success or failure respectively.

The resultType of a parser is typically the class type of some particular node type in the parse tree.

ParseState is a class that encapsulates a position in the source stream, collects messages, and holds a few state flags that can affect tokenization (e.g., are we in a character literal?). Instances of ParseState are independent and complete -- they are cheap to duplicate when necessary to implement backtracking.

The constexpr default constructor of a parser is important. The functions (below) that operate on instances of parsers are themselves all constexpr. This use of compile-time expressions allows the entirety of a recursive descent parser for a language to be constructed at compilation time through the use of templates.
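The three requirements on a parser class can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: this ParseState is a drastically reduced stand-in that holds only the unconsumed input (the real class also tracks messages and tokenization flags), and DigitValue, digitValue, and Example are hypothetical names that do not belong to the library.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical, heavily simplified stand-in for the real ParseState: it
// holds only the unconsumed input, so copying it to backtrack is cheap.
struct ParseState {
  std::string_view remaining;
};

// A hypothetical parser for one decimal digit, returning its numeric value.
struct DigitValue {
  using resultType = int;    // (2) the type of the semantic value
  constexpr DigitValue() {}  // (1) constexpr default constructor
  // (3) the static form of Parse(); a const member function also qualifies.
  static std::optional<resultType> Parse(ParseState *state) {
    if (state->remaining.empty()) {
      return std::nullopt;  // failure at end of input
    }
    const char ch{state->remaining.front()};
    if (ch < '0' || ch > '9') {
      return std::nullopt;  // failure, nothing consumed
    }
    state->remaining.remove_prefix(1);  // success: advance past the digit
    return ch - '0';
  }
};

// The constexpr default constructor lets the parser object itself be
// created at compilation time.
constexpr DigitValue digitValue;

void Example() {
  ParseState state{"4x"};
  ParseState saved{state};                // snapshot for backtracking
  assert(digitValue.Parse(&state) == 4);  // success carries a value
  assert(!digitValue.Parse(&state));      // 'x' is not a digit: failure
  state = saved;                          // backtrack by restoring the copy
  assert(state.remaining == "4x");
}
```

Success and failure are carried entirely by the returned std::optional, and backtracking is nothing more than restoring a saved copy of the state.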
These objects and functions are (or return) the fundamental parsers:

    ok                        always succeeds without advancing
    pure(x)                   always succeeds without advancing, returning
                              some value x
    fail(msg)                 always fails with the given message;
                              optionally typed
    cut                       always fails, with no message
    guard(pred)               succeeds if the predicate expression evaluates
                              to true
    rawNextChar               returns the next raw character; fails at EOF
    cookedNextChar            returns the next character after preprocessing,
                              skipping Fortran line continuations and
                              comments; fails at EOF

These functions and operators generate new parsers from combinations of other parsers:

    !p                        ok if p fails, cut if p succeeds
    p >> q                    match p, then q, returning q's value
    p / q                     match p, then q, returning p's value
    p || q                    match p if it succeeds, else match q;
                              p and q must be same type
    lookAhead(p)              succeeds iff p does, but doesn't modify state
    attempt(p)                succeeds iff p does, safely preserving state
                              on failure
    many(p)                   a greedy sequence of zero or more nonempty
                              successes of p; returns std::list<> of values
    some(p)                   a greedy sequence of one or more successes of p
    skipMany(p)               same as many(p), but discards the result
                              (a performance optimization)
    maybe(p)                  try to match p, returning a std::optional<>
                              of p's value
    defaulted(p)              matches p, or else returns a
                              default-constructed instance of p's resultType
    nonemptySeparated(p, q)   repeatedly match p q p q p q ... p, returning
                              the values of the p's
    extension(p)              parses p if strict standard compliance is
                              disabled, with a warning if nonstandard usage
                              warnings are enabled
    deprecated(p)             parses p if strict standard compliance is
                              disabled, with a warning if deprecated usage
                              warnings are enabled
    inContext("...", p)       run p within an error message context

Note that "a >> b >> c / d / e" matches a sequence of five parsers, but returns only the result that was obtained by matching c.

The following "applicative" combinators modify or combine the values returned by parsers:

    construct<T>{}(p1, p2, ...)
        matches zero or more parsers in succession, collecting their
        results and then passing them with move semantics to a constructor
        for the type T if they all succeed
    applyFunction(f, p1, p2, ...)
        matches one or more parsers in succession, collecting their
        results and passing them as rvalue reference arguments to some
        function, returning its result
    applyLambda([](&&x){}, p1, p2, ...)
        is the same thing, but for lambdas and other function objects
    applyMem(mf, p1, p2, ...)
        is the same thing, but invokes a member function of the result of
        the first parser

These are non-advancing state inquiry and update parsers:

    getColumn                 returns 1-based column position
    inCharLiteral             succeeds under withinCharLiteral
    inFortran                 succeeds unless in a preprocessing directive
    inFixedForm               succeeds in fixed-form source
    setInFixedForm            sets the fixed-form flag, returns prior value
    columns                   returns the 1-based column number after which
                              source is clipped
    setColumns(c)             sets "columns", returns prior value

When parsing depends on the result values of earlier parses, the "monadic bind" combinator is available (but please try to avoid using it, as it makes automatic analysis of the grammar difficult):

    p >>= f                   match p, yielding some value x on success, then
                              match the parser returned from the function
                              call f(x)

Last, we have these basic parsers on which the actual grammar of Fortran is built. All of the following parsers consume characters acquired from "cookedNextChar".

    spaces                    always succeeds after consuming any spaces
                              or tabs
    digit                     matches one cooked decimal digit (0-9)
    letter                    matches one cooked letter (A-Z)
    CharMatch<'c'>{}          matches one specific cooked character
    "..."_tok                 match contents, skipping spaces before and
                              after, and with multiple spaces accepted for
                              any internal space
    "..." >> p                the _tok suffix is optional on a string
                              before >> and after /
    parenthesized(p)          shorthand for "(" >> p / ")"
    bracketed(p)              shorthand for "[" >> p / "]"
    withinCharLiteral(p)      apply p, tokenizing for CHARACTER/Hollerith
                              literals
    nonEmptyListOf(p)         matches a comma-separated list of one or
                              more p's
    optionalListOf(p)         ditto, but can be empty
    "..."_debug               emit the string and succeed, for parser
                              debugging
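
To make the composition idea concrete, here is a hedged, self-contained sketch of how combinators such as p >> q, p || q, and many(p) might be built from constexpr functions and templates. It is not the library's implementation: ParseState is again reduced to the unconsumed input, and Ch, Then, Or, Many, grammar, and Example are hypothetical names introduced only for this illustration (Ch is loosely modeled on CharMatch<'c'>{}).

```cpp
#include <cassert>
#include <list>
#include <optional>
#include <string_view>
#include <type_traits>

// Hypothetical, simplified stand-in for ParseState: just the unconsumed input.
struct ParseState {
  std::string_view remaining;
};

// Primitive parser: matches one specific character.
template <char C> struct Ch {
  using resultType = char;
  constexpr Ch() {}
  std::optional<char> Parse(ParseState *s) const {
    if (s->remaining.empty() || s->remaining.front() != C) {
      return std::nullopt;
    }
    s->remaining.remove_prefix(1);
    return C;
  }
};

// p >> q: match p, then q, returning q's value.
template <typename PA, typename PB> struct Then {
  using resultType = typename PB::resultType;
  PA pa;
  PB pb;
  std::optional<resultType> Parse(ParseState *s) const {
    if (pa.Parse(s)) {
      return pb.Parse(s);
    }
    return std::nullopt;
  }
};

// p || q: match p if it succeeds, else restore the state and match q.
template <typename PA, typename PB> struct Or {
  static_assert(std::is_same_v<typename PA::resultType,
                    typename PB::resultType>,
      "alternatives must have the same result type");
  using resultType = typename PA::resultType;
  PA pa;
  PB pb;
  std::optional<resultType> Parse(ParseState *s) const {
    ParseState backtrack{*s};  // cheap copy
    if (auto result{pa.Parse(s)}) {
      return result;
    }
    *s = backtrack;
    return pb.Parse(s);
  }
};

// many(p): greedy sequence of zero or more nonempty successes of p.
template <typename P> struct Many {
  using resultType = std::list<typename P::resultType>;
  P p;
  std::optional<resultType> Parse(ParseState *s) const {
    resultType values;
    while (true) {
      ParseState backtrack{*s};
      auto result{p.Parse(s)};
      if (!result || s->remaining.size() == backtrack.remaining.size()) {
        *s = backtrack;  // stop on failure or on an empty match
        break;
      }
      values.push_back(*result);
    }
    return values;  // always succeeds, possibly with an empty list
  }
};

// The combinator functions are constexpr and constrained to types that
// define resultType, so whole grammars can be assembled at compilation time.
template <typename PA, typename PB, typename = typename PA::resultType,
    typename = typename PB::resultType>
constexpr Then<PA, PB> operator>>(PA pa, PB pb) { return {pa, pb}; }

template <typename PA, typename PB, typename = typename PA::resultType,
    typename = typename PB::resultType>
constexpr Or<PA, PB> operator||(PA pa, PB pb) { return {pa, pb}; }

template <typename P, typename = typename P::resultType>
constexpr Many<P> many(P p) { return {p}; }

// A tiny grammar: '(' followed by any run of 'a' and 'b' characters.
constexpr auto grammar{Ch<'('>{} >> many(Ch<'a'>{} || Ch<'b'>{})};

void Example() {
  ParseState state{"(abba)"};
  auto letters{grammar.Parse(&state)};
  assert(letters && letters->size() == 4);  // the value comes from many(...)
  assert(state.remaining == ")");           // the ')' was never consumed
}
```

In this sketch the parser object named grammar is a compile-time constant whose type encodes the structure of the grammar; only the Parse() calls do work at run time.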