Address review
Eclips4 committed Jan 13, 2025
commit 26d55595970fea72bd596a2e424453e2b789ca2d
2 changes: 0 additions & 2 deletions InternalDocs/README.md
@@ -21,5 +21,3 @@ it is not, please report that through the
[The Source Code Locations Table](locations.md)

[Exception Handling](exception_handling.md)

[The Virtual Machine](vm-state.md)
16 changes: 16 additions & 0 deletions InternalDocs/frames.md
@@ -36,6 +36,20 @@ This seems to provide the best performance without excessive complexity.
The specials have a fixed size, so the offset of the locals is known. The
interpreter needs to hold two pointers, a frame pointer and a stack pointer.

### Fast locals and evaluation stack

The frame contains a single array of object pointers, `localsplus`,
which contains both the fast locals and the stack. The top of the
stack, including the locals, is indicated by `stacktop`.
For example, in a function with three locals, if the stack contains
one value, `frame->stacktop == 4`.

The Tier 1 and Tier 2 interpreters share this in-memory representation,
but each caches the current stack depth (as a pointer) in a C local, `stack_pointer`.
We aren't sure yet exactly how the JIT will implement the stack;
likely some of the values near the top of the stack will be held in registers.


#### Alternative layout

An alternative layout that was used for part of 3.11 alpha was:
@@ -124,6 +138,8 @@ if the frame were to resume. After `frame.f_lineno` is set, `instr_ptr` points to
the next instruction to be executed. During a call to a Python function,
`instr_ptr` points to the call instruction, because this is what we would expect
to see in an exception traceback.
Dispatching on `instr_ptr` would be very inefficient, so in Tier 1 we cache the
upcoming value of `instr_ptr` in the C local `next_instr`.

The `return_offset` field determines where a `RETURN` should go in the caller,
relative to `instr_ptr`. It is only meaningful to the callee, so it needs to
10 changes: 10 additions & 0 deletions InternalDocs/tier2_engine.md → Python/tier2_engine.md
@@ -148,3 +148,13 @@ TO DO.
The implementation will change soon, so there is no point in
documenting it until then.


# Tier 2 IR format

The tier 2 IR (Internal Representation) format is also the basis for the Tier 2 interpreter (though the two formats may eventually differ). This format is also used as the input to the machine code generator (the JIT compiler).

Tier 2 IR entries are all the same size; there is no equivalent to `EXTENDED_ARG` or trailing inline cache entries. Each instruction is a struct with the following fields (all integers of varying sizes):

- **opcode**: Sometimes the same as a Tier 1 opcode, sometimes a separate micro opcode. Tier 2 opcodes are 9 bits (as opposed to Tier 1 opcodes, which fit in 8 bits). By convention, Tier 2 opcode names start with `_`.
- **oparg**: The argument. Usually the same as the Tier 1 oparg after expansion of `EXTENDED_ARG` prefixes. Up to 32 bits.
- **operand**: An additional argument, typically the value of *one* cache item from the Tier 1 inline cache. Up to 64 bits.
22 changes: 2 additions & 20 deletions InternalDocs/vm-state.md → Python/vm-state.md
@@ -3,15 +3,11 @@
## Definition of Tiers

- **Tier 1** is the classic Python bytecode interpreter.
This includes the specializing [adaptive interpreter](adaptive.md).
- **Tier 2**, also known as the micro-instruction ("uop") interpreter, is a new interpreter with a different instruction format.
This includes the specializing [adaptive interpreter](../InternalDocs/adaptive.md).
- **Tier 2**, also known as the micro-instruction ("uop") interpreter, is a new execution engine.
It was introduced in Python 3.13, and also forms the basis for a JIT using copy-and-patch technology. See [Tier 2](tier2_engine.md) for more information.


# Frame state

Almost all interpreter state is nominally stored in the frame structure.
A pointer to the current frame is held in `frame`, for more information about what `frame` contains see [Frames](frames.md):


@iritkatriel iritkatriel Sep 27, 2024


CC @markshannon

I can't comment on lines that haven't changed, so this is for a number of comments on vm-state.md.

L 21: The interpreters share an implementation of what? The frame? Caches the depth - is this stack depth?

L 40: Add a link to exception_handling.md.

L45: Not sure what you mean here: "The implementation of jumps within a single Tier 2 superblock/trace is just that, an implementation."

L51: "within the superblock" is repeated twice.
L52: what cannot be modified?

I think it might be worth moving the contents of the "Thread state and interpreter state" section to the beginning, as a high level overview of the components of the state, and then drill into the parts.

The "Tier 2 IR format" section doesn't seem to belong to the vm-state topic at all.



The interpreters share an implementation of what?

Both tier 1 and tier 2 use the same canonical in-memory representation. Tier 2 might store some values temporarily in registers, but that should be invisible to other code. The reason this is noteworthy is that other VMs, e.g. HotSpot, can have different frame layouts for the compiler and the interpreter.

# Thread state and interpreter state

@@ -23,20 +19,6 @@ The thread state is also used to access the **interpreter state** (`tstate->inte
The interpreter state also holds the optimizer state (`optimizer` and some counters).
Note that the eval breaker may be moved to the thread state soon as part of the multicore (PEP 703) work.

## Fast locals and evaluation stack

The frame contains a single array of object pointers, `localsplus`, which contains both the fast locals and the stack.
The top of the stack, including the locals, is indicated by `stacktop`.
For example, in a function with three locals, if the stack contains one value, `frame->stacktop == 4`.

The interpreters share an implementation which uses the same memory but caches the depth (as a pointer) in a C local, `stack_pointer`.
We aren't sure yet exactly how the JIT will implement the stack; likely some of the values near the top of the stack will be held in registers.

## Instruction pointer

The canonical, in-memory, representation of the instruction pointer is `frame->instr_ptr`.
It always points to an instruction in the bytecode array of the frame's code object.
Dispatching on `frame->instr_ptr` would be very inefficient, so in Tier 1 we cache the upcoming value of `frame->instr_ptr` in the C local `next_instr`.

## Tier 2
