Learn CPython Internals
Python Grammar:
An introduction to Python grammar, the compile-and-execute process, and UML diagrams of the CPython architecture
repo: https://github.com/python/cpython/tree/29d018aa63b72161cfc67602dc3dbd386272da64
Main [Programs/python.c]
=> Py_Main [Modules/main.c]
=> pymain_main
=> pymain_init
=> _PyRuntime_Initialize
=> _Py_InitializeFromWideArgs
=> init_python
=> _Py_InitializeMainInterpreter
=> _Py_RunMain
=> PyRun_AnyFileExFlags
=> PyParser_ASTFromFileObject
=> PyParser_ParseFileObject
=> PyTokenizer_FromFile
=> parsetok: for (;;) {PyTokenizer_Get}
=> PyAST_FromNodeObject
=> run_mod
=> PyAST_CompileObject
=> PySymtable_BuildObject:
symtable_visit_stmt(st,stmt_ty) for stmt_ty in asdl_seq
=> compiler_mod
=> compiler_enter_scope
=> compiler_body:
VISIT(c, stmt, stmt_ty) for stmt_ty in asdl_seq
=> compiler_exit_scope
=> assemble
=> run_eval_code_obj
=> PyEval_EvalCode
=> PyEval_EvalCodeEx
=> _PyEval_EvalCodeWithName
=> _PyFrame_New_NoTrack
=> PyEval_EvalFrameEx
=> eval_frame
=> _PyEval_EvalFrameDefault:
main_loop
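The call chain above can be approximated from Python itself, as a rough sketch rather than the exact C code path. It assumes that `ast.parse`, `compile` and `exec` stand in for the parse, compile and evaluate stages; the file name `<demo>` and the variable names are illustrative only.

```python
import ast

source = "x = 6 * 7\n"

# Parse: source text -> AST (roughly the PyParser_ASTFromFileObject stage).
tree = ast.parse(source, filename="<demo>", mode="exec")

# Compile: AST -> code object (roughly the PyAST_CompileObject stage).
code = compile(tree, filename="<demo>", mode="exec")

# Evaluate: run the code object in a namespace (roughly PyEval_EvalCode).
namespace = {}
exec(code, namespace)
result = namespace["x"]

print(type(tree).__name__, type(code).__name__, result)  # Module code 42
```

Each intermediate value (`tree`, `code`) corresponds to one hand-off point between the stages in the chain.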
https://cpython-devguide.readthedocs.io/compiler
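Compiler process:
1) Parse source code into a parse tree: Parser/parsetok.c
2) Transform the parse tree into an AST: Python/ast.c
3) Transform the AST into a control flow graph (CFG): Python/compile.c
4) Emit bytecode from the CFG: Python/compile.c
5) Execution: Python/ceval.c
An LL(1) parser: see Compilers: Principles, Techniques, and Tools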
Python grammar: Grammar/Grammar
Include/graminit.h
Python tokens: Grammar/Tokens
Include/token.h
The parse tree: Include/node.h
CHILD(node *, int)
RCHILD(node *, int)
NCH(node *): number of children
STR(node *)
TYPE(node *)
REQ(node *, TYPE)
LINENO(node *)
Parser/parsetok.c
parsetok
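To see what the tokenizer stage produces, the standard-library `tokenize` module can be used as a stand-in. This is only a sketch: it exposes the same token names as Include/token.h, but it is not the `PyTokenizer_Get` loop inside `parsetok` itself, and the sample source line is an assumption.

```python
import io
import token
import tokenize

source = "x = 1 + 2\n"

# generate_tokens() takes a readline callable that returns str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Map each numeric token type to its symbolic name.
names = [token.tok_name[tok.type] for tok in tokens]

for tok, name in zip(tokens, names):
    print(name, repr(tok.string))
```

The stream ends with an `ENDMARKER` token, which is what tells the parser that the input is complete.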
The Zephyr Abstract Syntax Description Language - Princeton CS
Python AST nodes: Parser/Python.asdl
Parser/asdl.py
Python/asdl.c
Include/asdl.h
Python/Python-ast.c
Include/Python-ast.h
xxx_ty: AST node
asdl_seq *: a sequence of AST nodes
_Py_asdl_seq_new(Py_ssize_t, PyArena *)
asdl_seq_GET(asdl_seq *, int)
asdl_seq_SET(asdl_seq *, int, stmt_ty)
asdl_seq_LEN(asdl_seq *)
An arena: memory is pooled in a single location so that it can be allocated easily and released all at once.
Include/pyarena.h
Python/pyarena.c
PyArena structure
PyArena_New()
PyArena_Free()
PyArena_AddPyObject()
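The arena idea can be illustrated with a toy model. This is a minimal sketch in Python, not the PyArena implementation: the `Arena` class and its `add` and `free` methods are hypothetical names chosen to echo `PyArena_AddPyObject()` and `PyArena_Free()` in spirit only.

```python
class Arena:
    """Toy arena: owns everything registered with it until free() is called."""

    def __init__(self):
        self._objects = []

    def add(self, obj):
        # In the spirit of PyArena_AddPyObject(): the arena keeps the
        # object alive on behalf of the caller.
        self._objects.append(obj)
        return obj

    def free(self):
        # In the spirit of PyArena_Free(): release everything in one step.
        count = len(self._objects)
        self._objects.clear()
        return count


arena = Arena()
arena.add("identifier")
arena.add(("a", "tuple"))
released = arena.free()
print(released)  # 2
```

The point of the design is that callers never free individual items; the whole pool goes away together once compilation is finished.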
Python/ast.c
PyAST_FromNode()
PyAST_FromNodeObject()
ast_for_xxx => xxx_ty
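The node types described by Parser/Python.asdl are visible from Python through the `ast` module, which makes a convenient way to explore them. The snippet below is a sketch; the sample statement is an assumption, and the Python-level objects are a mirror of the C `xxx_ty` structures rather than the structures themselves.

```python
import ast

tree = ast.parse("total = price * 2\n")

# Module.body is a list of statement nodes (a sequence of stmt_ty in C).
stmt = tree.body[0]

print(type(stmt).__name__)           # Assign
print(type(stmt.value).__name__)     # BinOp
print(type(stmt.value.op).__name__)  # Mult
print(ast.dump(stmt.targets[0]))
```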
a directed graph: models the flow of a program using basic blocks
Python bytecode: intermediate representation (IR)
Basic blocks: a block of IR
Code is directly generated from the basic blocks (with jump targets adjusted based on the output order) by doing a post-order depth-first search on the CFG following the edges.
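The post-order traversal described above can be sketched as follows. This is a simplified model, not the code in Python/compile.c: the `Block` class, its fields and the byte sizes are hypothetical, blocks are emitted in reverse post-order, and the real assembler may need to repeat the offset computation when jump arguments grow.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Block:
    name: str
    size: int                            # hypothetical size in bytes
    next: Optional["Block"] = None       # fall-through successor
    jumps: List["Block"] = field(default_factory=list)  # jump targets
    seen: bool = False


def postorder(block: Optional[Block], out: List[Block]) -> None:
    """Append each block to `out` after all of its successors."""
    if block is None or block.seen:
        return
    block.seen = True
    for target in block.jumps:
        postorder(target, out)
    # Visit the fall-through block last so that it directly follows
    # its predecessor once the order is reversed.
    postorder(block.next, out)
    out.append(block)


def layout(entry: Block) -> Dict[str, int]:
    """Assign a start offset to every reachable block."""
    order: List[Block] = []
    postorder(entry, order)
    offsets: Dict[str, int] = {}
    position = 0
    for block in reversed(order):
        offsets[block.name] = position
        position += block.size
    return offsets


# An if/else shape: entry jumps to orelse or falls through to body.
end = Block("end", 2)
orelse = Block("orelse", 2, next=end)
body = Block("body", 6, next=orelse, jumps=[end])
entry = Block("entry", 4, next=body, jumps=[orelse])

offsets = layout(entry)
print(offsets)  # {'entry': 0, 'body': 4, 'orelse': 10, 'end': 12}
```

Once every block has an offset, each jump can be rewritten to point at the offset of its target block.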
Python/compile.c
PyAST_CompileObject()
PySymtable_BuildObject(): Python/symtable.c
symtable_visit_xxx => symbol table
compiler_mod()
compiler_body(struct compiler *c, asdl_seq *stmts)
VISIT(c, stmt, stmt_ty) for stmt_ty in stmts
assemble(struct compiler *c, int addNone) => PyCodeObject *co
dfs(c, entryblock, &a, nblocks)
assemble_jump_offsets(&a, c)
assemble_emit
co = makecode(c, &a)
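The symbol-table pass that runs before code generation can be inspected with the standard-library `symtable` module. The snippet is a sketch under the assumption that this module reflects what `PySymtable_BuildObject()` computes; the sample source and the name `area` are illustrative.

```python
import symtable

source = (
    "import os\n"
    "\n"
    "def area(width, height):\n"
    "    return width * height\n"
)

top = symtable.symtable(source, "<demo>", "exec")

# Top-level scope: names bound or used at module level.
top_names = sorted(sym.get_name() for sym in top.get_symbols())

# Each nested function gets its own child table.
func = top.get_children()[0]
params = func.get_parameters()

print(top_names)               # ['area', 'os']
print(func.get_name(), params)
```

The compiler uses this scope information to decide, for each name, whether to emit a local, global or closure access.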
Include/code.h
PyCodeObject
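A code object can be examined from Python to get a feel for the fields declared in Include/code.h. This is a sketch: the sample function `greet` is an assumption, and only attributes that are exposed at the Python level are shown.

```python
source = "def greet(name):\n    return 'hello ' + name\n"
module_code = compile(source, "<demo>", "exec")

# The function body is a nested code object stored among the module constants.
func_code = next(
    const for const in module_code.co_consts if hasattr(const, "co_code")
)

print(func_code.co_name)      # greet
print(func_code.co_argcount)  # 1
print(func_code.co_varnames)  # ('name',)
print(func_code.co_consts)
print(len(func_code.co_code), "bytes of bytecode")
```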
Python/ceval.c
_PyEval_EvalFrameDefault()
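The evaluation loop is, at heart, a dispatch loop over instructions that operate on a value stack. The following toy interpreter is a sketch of that idea only: the opcode names, the `(opcode, argument)` tuple format and the `run` function are hypothetical and do not match CPython's real instruction encoding.

```python
def run(instructions, names):
    """Evaluate a list of (opcode, argument) pairs on a value stack."""
    stack = []
    pc = 0
    while True:  # the main loop: fetch, advance, dispatch
        opcode, arg = instructions[pc]
        pc += 1
        if opcode == "LOAD_CONST":
            stack.append(arg)
        elif opcode == "LOAD_NAME":
            stack.append(names[arg])
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MULTIPLY":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opcode == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")


# Toy program for the expression: x * 2 + 1
program = [
    ("LOAD_NAME", "x"),
    ("LOAD_CONST", 2),
    ("MULTIPLY", None),
    ("LOAD_CONST", 1),
    ("ADD", None),
    ("RETURN_VALUE", None),
]

result = run(program, {"x": 20})
print(result)  # 41
```

For real bytecode, the standard-library `dis` module prints the actual instructions of any code object.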
Title | Brief | Author | Version |
---|---|---|---|
A guide from parser to objects, observed using GDB | Code walk from Parser, AST, Sym Table and Objects | Louie Lu | 3.7.a0 |
Green Tree Snakes | The missing Python AST docs | Thomas Kluyver | 3.6 |
Yet another guided tour of CPython | A guide for how CPython REPL works | Guido van Rossum | 3.5 |
Python Asynchronous I/O Walkthrough | How CPython async I/O, generator and coroutine works | Philip Guo | 3.5 |
Coding Patterns for Python Extensions | Reliable patterns of coding Python Extensions in C | Paul Ross | 3.4 |
Title | Brief | Author | Version |
---|---|---|---|
Python’s Innards Series | ceval, objects, pystate and miscellaneous topics | Yaniv Aknin | 3.1 |
Eli Bendersky’s Python Internals | Objects, Symbol tables and miscellaneous topics | Eli Bendersky | 3.x |
A guide from parser to objects, observed using Eclipse | Code walk from Parser, AST, Sym Table and Objects | Prashanth Raghu | 2.7.12 |
CPython internals: A ten-hour codewalk through the Python interpreter source code | Code walk from source code to generators | Philip Guo | 2.7.8 |