src
Directory actions
More options
Directory actions
More options
src
Folders and files
| Name | Name | Last commit date | ||
|---|---|---|---|---|
parent directory.. | ||||
pg-python architecture ====================== This README provides a high-level overview of the implementation. Native Typing ------------- pg-python was designed with the goal of providing Python interfaces to PostgreSQL objects--often conceptual rather than literal. This marks a sharp contrast with traditional PL implementations as parameters are *not* sent through conversion functions to become common language objects. Rather, instances of "interface types" are used to represent values coming from the database. Conversion mechanisms are, however, provided through standard language interfaces as Postgres objects are not often directly usable by Python libraries. Native typing provides Python interfaces to Postgres types using Python types. This is achieved by extending the PyTypeObject structure with Postgres type information(PyPgTypeInfo) extracted from a type's corresponding pg_type row. Those types are 'Postgres.Type' objects. Postgres.Type's are conceptual, that is, they are *not* instances of the 'pg_type' relation type. Many Postgres.Type's have custom implementations providing access to an object's data. Notably, time types provide properties that allow the user to leverage much of date_part()'s functionality directly in Python. These convenience interfaces for standard Postgres types are common and contribute to most of the size of types/*.c files. Support for Postgres operators are provided. Normally, this is a direct mapping of the Python operator's textual representation, so "pg_ob1 * pg_ob2" will resolve the "*" operator based on the types of pg_ob1 and pg_ob2. When only one Postgres object is present, the other operand will be converted to that type. Error Management ---------------- pg-python handles database errors by raising a Python exception representing the error. By doing this, pg-python must record when this has been done so that future database actions can be prohibited. The `pl_state` global is used to track the error state of the PL. When a database error(ereport, elog) occurs, pl_state is altered to indicate that subsequent use of database functionality should be restricted by an exception being raised. The "DB_IS_NOT_READY()" macro should be used at the beginning of every Python interface that requires that the database is in a non-error state. Before the language handler exits, the PL must check the state to validate that a database error has not been suffocated. Exiting the handler indicating success in a failed transaction is likely to cause trouble for the caller, so this check is employed to guarantee that the error state is properly indicated. The following error codes have been added: ERRCODE_PYTHON_ERROR This is used to describe common PL or language errors. These may have been indirectly caused by the user. This code is used to indicate that the Python error came from the edges of the PL, not directly from the user's code. ERRCODE_PYTHON_EXCEPTION A Python exception was raised by the function's code. ERRCODE_PYTHON_PROTOCOL_VIOLATION The user failed to meet the PL's expectations. Currently, this is used by check_state and triggers. ERRCODE_PYTHON_RELAY Used internally by the PL to designate that the failure is a Python exception and it should be raised (Python) or thrown (Postgres) accordingly. Transaction Management (Internal Subtransactions) ------------------------------------------------- Subtransactions are managed using a simple counter. Each Postgres.Transaction instance will get a local transaction identifier that is used to validate that transactions are exited in the appropriate order. Prepared Statements and Cursors (SPI) ------------------------------------- Database access is implemented using two types: the Postgres.Statement and the Postgres.Cursor. The statement objects provide an interface to a parameterized plan. When invoked, a Postgres.Cursor object is created to manage the result set of the Portal. Scrollable cursors are supported using the statement's "declare" method. When a scrollable cursor is created, the seek and read methods are available for use. Postgres Interface Module ------------------------- All Postgres interfaces are provided using the "Postgres" module. This is a built-in module that is extended with the contents of the "module.py" file. Specifically, once the C portions of module are initialized, the Python portion is executed in the module's context. This provides an easy way to do further, high-level initialization(@pytypes is implemented in pure-Python). First Call Initialization ------------------------- Much of the PL is initialized in _PG_init. However, it is possible for _PG_init to be executed outside of a transaction(preloading), which means that we cannot fully initialize interface module and the Python environment as it depends on the system cache being available--specifically, regprocedure is used to implement "proc(...)", and regtype is used to provide the "Postgres.types" pseudo-module. Inline Execution ---------------- DO statements are handled using a fake PyPgFunction object--that is, there is no corresponding pg_proc entry. The function's module is set to the InlineExecutor class defined in the pure-Python portion of the Postgres module(See src/module.py). This class provides the "main" entry point which takes a single argument, the source to execute. Interrupt Support ----------------- Interrupts are supported by overriding the default signal handlers for SIGTERM and SIGINT. While scary at first, these overrides are *not* installed unless the PL has been invoked--they are installed at first call, not on _PG_init. Additionally, when they have been installed, the new functionality is only invoked in transactions where there has been PL activity. Specifically, the `handler_count` must be greater than zero *and* there must be a Python stack frame. Managing interrupts during PL execution is fairly tricky. When an interrupt occurs, the KeyboardInterrupt exception is set via Py_AddPendingCall iff CFI doesn't raise. If the Python interpreter never actually calls it, it's important the Py_MakePendingCalls is performed between transactions in order to avoid a spurious exception upon re-entry.