This directory contains the Go logic that's executed by the `EmbeddedOnlineFeatureServer` from Python.

## Building and Linking
[gopy](https://github.com/go-python/gopy) generates (and compiles) a CPython extension module from a Go package. That's what we're using here, as visible in [setup.py](../setup.py).

Under the hood, gopy invokes `go build` and then templates `cgo` stubs that expose the public functions of the Go module as C functions.
For our project, the generated files can be found at `sdk/python/feast/embedded_go/lib/embedded.go` & `sdk/python/feast/embedded_go/lib/embedded_go.h` after running `make compile-go-lib`.

## Arrow memory management
Understanding this is the trickiest part of this integration.

At a high level, when using the Python<>Go integration, the Python layer exports request data into an [Arrow RecordBatch](https://arrow.apache.org/docs/python/data.html), which is transferred to Go using Arrow's zero-copy mechanism.
Similarly, the Go layer converts feature values read from the online store into a RecordBatch that's exported to Python using the same mechanics.

The first thing to note is that from the Python perspective, all the export logic assumes that we're exporting to & importing from C, not Go. This is because pyarrow only interops with C, and the fact that we're using Go is an implementation detail not relevant to the Python layer.

### Export Entities & Request data from Python to Go
The code exporting to C lives in [online_features_service.py](../sdk/python/feast/embedded_go/online_features_service.py):
```python
(
    entities_c_schema,
    entities_ptr_schema,
    entities_c_array,
    entities_ptr_array,
) = allocate_schema_and_array()
(
    req_data_c_schema,
    req_data_ptr_schema,
    req_data_c_array,
    req_data_ptr_array,
) = allocate_schema_and_array()

batch, schema = map_to_record_batch(entities, join_keys_types)
schema._export_to_c(entities_ptr_schema)
batch._export_to_c(entities_ptr_array)

batch, schema = map_to_record_batch(request_data)
schema._export_to_c(req_data_ptr_schema)
batch._export_to_c(req_data_ptr_array)
```

Under the hood, `allocate_schema_and_array` allocates pointers (`struct ArrowSchema*` and `struct ArrowArray*`) in native memory (i.e. the C layer) using `cffi`.
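As a sketch of what that allocation amounts to, here is a minimal stdlib-`ctypes` version (Feast uses `cffi`, and the helper name `allocate_schema` below is illustrative, not Feast's API). The struct layout comes from the Arrow C Data Interface specification; every field is pointer-sized or `int64`, so the struct is 72 bytes on a 64-bit platform:

```python
import ctypes

# struct ArrowSchema, as defined by the Arrow C Data Interface spec.
class ArrowSchema(ctypes.Structure):
    _fields_ = [
        ("format", ctypes.c_char_p),
        ("name", ctypes.c_char_p),
        ("metadata", ctypes.c_char_p),
        ("flags", ctypes.c_int64),
        ("n_children", ctypes.c_int64),
        ("children", ctypes.c_void_p),    # struct ArrowSchema**
        ("dictionary", ctypes.c_void_p),  # struct ArrowSchema*
        ("release", ctypes.c_void_p),     # void (*release)(struct ArrowSchema*)
        ("private_data", ctypes.c_void_p),
    ]

def allocate_schema():
    """Allocate a zeroed ArrowSchema and return (struct, raw address).

    The raw address (a plain int) is what crosses the language
    boundary, e.g. as the argument to pyarrow's `_export_to_c`.
    """
    c_schema = ArrowSchema()
    return c_schema, ctypes.addressof(c_schema)
```

The Python object (`c_schema`) must be kept alive alongside the raw address, since the address points into memory the object owns; that's why `allocate_schema_and_array` returns both.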
Next, the RecordBatch is exported to these pointers using [`_export_to_c`](https://github.com/apache/arrow/blob/master/python/pyarrow/table.pxi#L2509), which uses [`ExportRecordBatch`](https://arrow.apache.org/docs/cpp/api/c_abi.html#_CPPv417ExportRecordBatchRK11RecordBatchP10ArrowArrayP11ArrowSchema) under the hood.

As per the documentation for `ExportRecordBatch`:
> Status ExportRecordBatch(const RecordBatch &batch, struct ArrowArray *out, struct ArrowSchema *out_schema = NULLPTR)
>
> Export C++ RecordBatch using the C data interface format.
>
> The record batch is exported as if it were a struct array. The resulting ArrowArray struct keeps the record batch data and buffers alive until its release callback is called by the consumer.

This is why `GetOnlineFeatures()` in `online_features.go` calls `record.Release()`, as below:
```go
entitiesRecord, err := readArrowRecord(entities)
if err != nil {
	return err
}
defer entitiesRecord.Release()
...
requestDataRecords, err := readArrowRecord(requestData)
if err != nil {
	return err
}
defer requestDataRecords.Release()
```
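The ownership contract behind those `Release()` calls can be sketched as a toy, pure-Python model (this is not real Arrow code, just the release-callback convention from the C Data Interface spec): the producer keeps its buffers alive until the consumer invokes the release callback exactly once, and the callback marks the struct released by clearing itself.

```python
class FakeArrowArray:
    """Toy stand-in for struct ArrowArray with a release callback.

    Models the C Data Interface rule: the producer attaches a release
    callback; the consumer must call it exactly once when done, and
    the callback marks the struct released by clearing the callback
    field (the C equivalent of setting `release` to NULL).
    """

    def __init__(self, buffers):
        self.buffers = buffers        # producer-owned data kept alive
        self.release = self._release  # consumer calls this when done

    def _release(self):
        self.buffers = None           # drop the producer's data
        self.release = None           # spec: released struct has release == NULL

def consume(arr):
    # A well-behaved consumer (like the Go side's deferred Release)
    # reads the data, then releases it.
    total = sum(arr.buffers)
    if arr.release is not None:
        arr.release()
    return total
```

In the real integration, Go's `record.Release()` decrements the record's refcount, and when it reaches zero the imported array's release callback fires, letting Python free the exported buffers.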

Additionally, we need to pass a pair of pointers to `GetOnlineFeatures()` that are populated by the Go layer, so that the resultant feature values can be passed back to Python (via the C layer) using zero-copy semantics.
That happens as follows:
```python
(
    features_c_schema,
    features_ptr_schema,
    features_c_array,
    features_ptr_array,
) = allocate_schema_and_array()

...

record_batch = pa.RecordBatch._import_from_c(
    features_ptr_array, features_ptr_schema
)
```

The corresponding Go code that exports this data is:
```go
result := array.NewRecord(arrow.NewSchema(outputFields, nil), outputColumns, int64(numRows))

cdata.ExportArrowRecordBatch(result,
	cdata.ArrayFromPtr(output.DataPtr),
	cdata.SchemaFromPtr(output.SchemaPtr))
```

The documentation for `ExportArrowRecordBatch` includes this important caveat:

> // The release function on the populated CArrowArray will properly decrease the reference counts,
> // and release the memory if the record has already been released. But since this must be explicitly
> // done, make sure it is released so that you do not create a memory leak.

This implies that the receiver is on the hook for explicitly releasing this memory.

However, we're using `_import_from_c`, which uses [`ImportRecordBatch`](https://arrow.apache.org/docs/cpp/api/c_abi.html#_CPPv417ImportRecordBatchP10ArrowArrayP11ArrowSchema), meaning that the receiver of the RecordBatch becomes the new owner of the data.
This is wrapped by pyarrow: when the corresponding Python object goes out of scope, it should clean up the underlying record batch.
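A toy model of that hand-off (again not real pyarrow internals, just the contract): the importer takes ownership of the release callback and invokes it when the wrapper object is finalized, so no explicit free is needed on the Python side.

```python
class CArray:
    """Toy struct carrying a release callback (stands in for ArrowArray)."""

    def __init__(self):
        self.released = False
        self.release = self._release

    def _release(self):
        self.released = True
        self.release = None  # spec: released struct has release == NULL

class ImportedBatch:
    """Toy importer: like ImportRecordBatch, it becomes the owner of the
    data and is responsible for calling release exactly once, here tied
    to the wrapper's lifetime."""

    def __init__(self, c_array):
        self._c = c_array

    def __del__(self):
        # Release when the wrapper is garbage-collected, mirroring how
        # pyarrow frees the imported batch once the Python object dies.
        if self._c.release is not None:
            self._c.release()
```

Tying the release to `__del__` is a simplification; the point is only that after import, ownership (and the obligation to release) lives with the Python-side wrapper.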

Another thing to note (which may or may not be a source of issues) is that Arrow has the concept of [Memory Pools](https://arrow.apache.org/docs/python/api/memory.html#memory-pools).
Memory pools can be set in Python as well as in Go. I *believe* that if we use the `CGoArrowAllocator`, it uses whatever pool C++ uses, which should be the same as the one used by pyarrow. But this should be vetted.

### References
- https://arrow.apache.org/docs/format/CDataInterface.html#memory-management
- https://arrow.apache.org/docs/python/memory.html