fix: NebulaGraph backend - enable concurrent operations and fix datet…#1314
fix: NebulaGraph backend - enable concurrent operations and fix datet…#1314sscargal merged 2 commits intoMemMachine:mainfrom
Conversation
…ime/filter issues Signed-off-by: wenhaocs <19355821+wenhaocs@users.noreply.github.com>
c3c0dcc to
640e8be
Compare
There was a problem hiding this comment.
Pull request overview
Fixes NebulaGraph backend issues by improving concurrency handling, timezone-aware datetime support, and filter expressiveness.
Changes:
- Switch NebulaGraph test client setup to use a session pool (while keeping a temporary bootstrap session for schema/graph initialization).
- Update NebulaGraph GQL type/value formatting to use zoned datetimes and handle NULL properties from schema evolution.
- Extend NebulaGraph filter rendering to support
INandNOT, and hash long embedding metric companion property names to avoid identifier length limits.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
packages/server/src/memmachine_server/common/vector_graph_store/nebula_graph_vector_graph_store.py |
Adds zoned datetime formatting/type inference, IN/NOT filter rendering, NULL-property skipping, and hashed metric companion names. |
packages/server/server_tests/memmachine_server/episodic_memory/conftest.py |
Updates NebulaGraph async client fixture to initialize schema/graph with a temporary client and then reconnect using a session pool. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| else: | ||
| # Default to UTC | ||
| params[param_name] = value.strftime("%Y-%m-%dT%H:%M:%S.%f +00:00") | ||
| return f'zoned_datetime({placeholder}, "%Y-%m-%dT%H:%M:%S %Ez")' |
There was a problem hiding this comment.
Datetime cursor parameters are formatted with microseconds (".%f") but the zoned_datetime() format string passed is "%Y-%m-%dT%H:%M:%S %Ez" (no .%f). This mismatch will cause zoned_datetime() parsing to fail for tz-aware and naive datetimes. Make the formatted value and the format string consistent (either drop microseconds from the value, or include .%f in the format string, and ensure the timezone portion matches the chosen pattern).
| return f'zoned_datetime({placeholder}, "%Y-%m-%dT%H:%M:%S %Ez")' | |
| return f'zoned_datetime({placeholder}, "%Y-%m-%dT%H:%M:%S.%f %Ez")' |
| return f'zoned_datetime("{formatted_str}", "%Y-%m-%dT%H:%M:%S %Ez")' | ||
| # No timezone info, default to UTC (+00:00) | ||
| formatted_str = value.strftime("%Y-%m-%dT%H:%M:%S.%f +00:00") | ||
| return f'zoned_datetime("{formatted_str}", "%Y-%m-%dT%H:%M:%S %Ez")' |
There was a problem hiding this comment.
_format_value() builds datetime strings including microseconds (".%f") but passes the format string "%Y-%m-%dT%H:%M:%S %Ez" to zoned_datetime(), which does not include microseconds. This will likely break inserts/filters involving datetime values. Align the produced datetime literal with the format string (e.g., add .%f to the format or omit microseconds in the string).
| return f'zoned_datetime("{formatted_str}", "%Y-%m-%dT%H:%M:%S %Ez")' | |
| # No timezone info, default to UTC (+00:00) | |
| formatted_str = value.strftime("%Y-%m-%dT%H:%M:%S.%f +00:00") | |
| return f'zoned_datetime("{formatted_str}", "%Y-%m-%dT%H:%M:%S %Ez")' | |
| return f'zoned_datetime("{formatted_str}", "%Y-%m-%dT%H:%M:%S.%f %Ez")' | |
| # No timezone info, default to UTC (+00:00) | |
| formatted_str = value.strftime("%Y-%m-%dT%H:%M:%S.%f +00:00") | |
| return f'zoned_datetime("{formatted_str}", "%Y-%m-%dT%H:%M:%S.%f %Ez")' |
| Returns: | ||
| Property name for the metric companion (e.g., "similarity_metric_for_vec") | ||
|
|
||
| """ |
There was a problem hiding this comment.
The docstring/example implies the metric companion property is always named like "similarity_metric_for_", but for long embedding names this function now returns a hashed "sim_metric_" value. Update the docstring to reflect the new behavior so callers/debugging aren’t misled.
| sanitized_prop = self._sanitize_name(prop_name) | ||
| prop_ref = f"{node_alias}.{sanitized_prop}" | ||
| value_strs = [self._format_value(v) for v in expr.values] | ||
| values_list = ", ".join(value_strs) | ||
| return f"{prop_ref} IN [{values_list}]" |
There was a problem hiding this comment.
New FilterExpr support for the In node is added here, but the NebulaGraph store test suite currently exercises Comparison/IsNull/And/Or and not In. Add a unit/integration test that uses FilterIn (or parse_filter("field IN (...)")) against NebulaGraphVectorGraphStore to ensure the rendered query works end-to-end.
| if isinstance(expr, Not): | ||
| inner_clause = render(expr.expr) | ||
| return f"NOT ({inner_clause})" |
There was a problem hiding this comment.
New FilterExpr support for Not is added here, but there’s no NebulaGraphVectorGraphStore test that validates NOT / NOT IN behavior end-to-end. Add a test that exercises FilterNot (e.g., parse_filter("NOT (...)"), or "field NOT IN (...)" which parses to Not(In(...))) to confirm the generated WHERE clause behaves correctly in NebulaGraph.
| # First connect without session pool to set up schema and graph | ||
| temp_client = await NebulaAsyncClient.connect( | ||
| hosts=connection_info["hosts"], | ||
| username=connection_info["username"], | ||
| password=connection_info["password"], |
There was a problem hiding this comment.
temp_client is created to bootstrap schema/graph, but if any of the setup executes raise, the fixture skips without closing temp_client (resource leak / dangling sessions). Wrap the schema setup in a try/finally (or context manager) to ensure temp_client.close() is always awaited before skipping/raising.
Purpose of the change
Fix some issues when using NebulaGraph as backend.
Description
timezonedate time.Fixes/Closes
Fixes #(issue number)
Type of change
[Please delete options that are not relevant.]
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.
[Please delete options that are not relevant.]
Test Results: [Attach logs, screenshots, or relevant output]
Checklist
[Please delete options that are not relevant.]
Maintainer Checklist
Screenshots/Gifs
[If applicable, add screenshots or GIFs that show the changes in action. This is especially helpful for API responses. Otherwise, delete this section or type "N/A".]
Further comments
[Add any other relevant information here, such as potential side effects, future considerations, or any specific questions for the reviewer. Otherwise, type "None".]