Changing CPython’s Grammar

Abstract

There’s more to changing Python’s grammar than editing Grammar/Grammar and Python/compile.c. This document aims to be a checklist of places that must also be fixed.

It is probably incomplete. If you see omissions, submit a bug or patch.

This document is not intended to be an instruction manual on Python grammar hacking, for several reasons.

Rationale

People are getting this wrong all the time; it took well over a year before someone noticed that adding the floor division operator (//) broke the parser module.

Checklist

  • Grammar/Grammar: OK, you’d probably worked this one out :)
  • Parser/Python.asdl may need changes to match the Grammar. Run make to regenerate Include/Python-ast.h and Python/Python-ast.c.
  • Python/ast.c will need changes to create the AST objects involved with the Grammar change.
  • Parser/pgen needs to be rerun to regenerate Include/graminit.h and Python/graminit.c. (make should handle this for you.)
  • Python/symtable.c: This handles the symbol collection pass that happens immediately before the compilation pass.
  • Python/compile.c: You will need to create or modify the compiler_* functions to generate opcodes for your productions.
  • You may need to regenerate Lib/symbol.py and/or Lib/token.py and/or Lib/keyword.py.
  • The parser module. Add some of your new syntax to test_parser, bang on Modules/parsermodule.c until it passes.
  • Add some usage of your new syntax to test_grammar.py
  • If you’ve gone so far as to change the token structure of Python, then the Lib/tokenizer.py library module will need to be changed.
  • Certain changes may require tweaks to the library module pyclbr.
  • Lib/lib2to3/Grammar.txt may need changes to match the Grammar.
  • Documentation must be written!
  • After everything has been checked in, you’re likely to see a new change to Python/Python-ast.c. This is because this (generated) file contains the git version of the source from which it was generated. There’s no way to avoid this; you just have to submit this file separately.