Skip to content

feat(api): restrict field/property names to a portable, readable rule (#1383)#1396

Open
alighazi288 wants to merge 3 commits into
MemMachine:mainfrom
alighazi288:feat/1383-safe-field-names
Open

feat(api): restrict field/property names to a portable, readable rule (#1383)#1396
alighazi288 wants to merge 3 commits into
MemMachine:mainfrom
alighazi288:feat/1383-safe-field-names

Conversation

@alighazi288
Copy link
Copy Markdown
Contributor

Purpose of the change

Implement the validation rule requested in #1383 for user-supplied field/property names to ensure data portability and readability across storage backends (AWS, Chroma, Neo4j) without requiring aggressive translation layers.

Description

Adds a SafeFieldName annotated type in packages/common/src/memmachine_common/api/spec.py enforcing the rule from the issue on user-supplied field/property names:

  • starts with a lowercase letter, then [a-z0-9_]
  • max 32 characters
  • leading underscore reserved for system use

Distinct from the existing _is_valid_name / SafeId (which validates identifiersorg_id, project_id, UIDs — and stays permissive).

Where it's applied

Input/spec models only:

  • Dict-key types: MemoryMessage.metadata, the four *.set_metadata filter dicts (Search / List / GetSemanticSetId / ListSemanticSetIds), AddFeatureSpec.feature_metadata, UpdateFeatureSpec.metadata.

  • Scalar fields: AddFeatureSpec.{tag, feature}, UpdateFeatureSpec.{tag, feature}, CreateSemanticSetTypeSpec {name, metadata_tags items}, GetSemanticSetIdSpec.metadata_tags items, AddSemanticCategorySpec.category_name, AddSemanticCategoryTemplateSpec.category_name, AddSemanticTagSpec.tag_name.

Deliberately not touched

  • Response/entity models (Episode.metadata, SemanticFeature.*, etc.): permissive so legacy records still
    deserialize. Strict input + permissive output drains the corpus as records are rewritten.

  • Lookup-by-name fields (category_name on AddFeatureSpec, UpdateFeatureSpec, DisableSemanticCategorySpec): resolved against existing categories at semantic_memory.py:299; tightening would orphan legacy categories. Pinned by test_lookup_carveout_stays_permissive.

  • set_id: identifier, not a field/property name. Out of issue scope.

Fixes/Closes

Fixes #1383

Type of change

  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

How Has This Been Tested?

  • Unit Test

packages/server/server_tests/memmachine_server/server/api_v2/test_spec.py gains 7 test functions (~19 cases via parametrise):

  • Validator unit tests (valid + invalid, with per-case error-phrase assertion).
  • One Pydantic wiring test per surface category: dict-key, scalar, list-item.
  • Lookup-permissive carveout test.
  • UpdateFeatureSpec partial-update test (None defaults skip validation).

Plus two existing fixtures in test_router.py / test_memory.py renamed from "User Sessions""user_sessions".

Checklist

  • I have signed the commit(s) within this pull request
  • My code follows the style guidelines of this project (See STYLE_GUIDE.md)
  • I have performed a self-review of my own code
  • I have commented my code
  • My changes generate no new warnings
  • I have added unit tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have checked my code and corrected any misspellings

Maintainer Checklist

  • Confirmed all checks passed
  • Contributor has signed the commit(s)
  • Reviewed the code
  • Run, Tested, and Verified the change(s) work as expected

Further comments

Compatibility Note: This is a breaking change for REST clients attempting to submit field/property names outside the new rule. A 422 Unprocessable Entity will be returned citing the exact rule violated (length, reserved-prefix, or charset).
For read-modify-write patterns, updating a legacy record containing non-conforming keys will result in a 422 until the client scrubs or re-keys the data.

…MemMachine#1383)

Implement the validation rule requested in MemMachine#1383 for user-supplied
field/property names:
- starts with a lowercase letter, then [a-z0-9_]
- max 32 characters
- leading underscore reserved for system use

Adds `_is_valid_field_name` and `SafeFieldName` in `common/api/spec.py`, alongside the existing `_is_valid_name` / `SafeId` (which validates identifiers — kept permissive). The new rule is applied to user-supplied field/property names on input/spec models. Response/entity models stay permissive so legacy records continue to deserialize, and lookup-by-name fields (`*.category_name` on `AddFeatureSpec`, `UpdateFeatureSpec`, `DisableSemanticCategorySpec`) are kept permissive — those resolve against existing categories at `semantic_memory.py:299`, so tightening would block operations on legacy categories.
CI on MemMachine#1396 surfaced two issues missed in the initial pass:

- ty static check (server, 3.12/3.14): ty doesn't honor Pydantic's
  metaclass-generated `__init__` defaults, so direct kwargs
  construction (`AddFeatureSpec(...)`) of specs with optional fields
  was flagged as missing-argument. The rest of the file uses
  `Model.model_validate({...})` for exactly this reason; converted
  the four new tests to match.

- test-install integration tests: six existing `create_semantic_set_type`
  fixtures in `test_integration_complete.py` used `name=` values
  with spaces and capitals ("User Sessions", "List Test Set Type",
  etc.). The live server now enforces `SafeFieldName` on
  `CreateSemanticSetTypeSpec.name`, so these 422'd. Renamed each to
  snake_case.
@alighazi288
Copy link
Copy Markdown
Contributor Author

@edwinyyyu Would appreciate a review when you have the time!

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a stricter validation rule for user-supplied field/property names in the API v2 spec models to improve metadata portability/readability across storage backends (per #1383), and updates tests/fixtures to comply.

Changes:

  • Introduces SafeFieldName (and _is_valid_field_name) with a 1–32 char, lowercase-start, [a-z0-9_] rule and reserved leading _.
  • Applies SafeFieldName to selected request/spec inputs (dict keys, scalars, list items) while keeping certain lookup-by-name fields permissive for legacy compatibility.
  • Updates server/client tests to use compliant semantic set type names and adds unit tests for validation + Pydantic wiring.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
packages/common/src/memmachine_common/api/spec.py Adds SafeFieldName validation and applies it to relevant request/spec model fields.
packages/server/server_tests/memmachine_server/server/api_v2/test_spec.py Adds validator/wiring test coverage for SafeFieldName and carveout behavior.
packages/server/server_tests/memmachine_server/server/api_v2/test_router.py Updates REST router test payload/assertions for new name/tag rules.
packages/client/client_tests/test_memory.py Updates client test expectations for semantic set type name.
packages/client/client_tests/test_integration_complete.py Updates integration tests to use compliant semantic set type name values.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +193 to +196
raise InvalidNameError(
"Field/property name must start with a-z and contain only "
f"[a-z0-9_]: '{v}'",
)
Comment on lines +182 to +186
if not v or len(v) > MAX_FIELD_NAME_LENGTH:
raise InvalidNameError(
f"Field/property name must be 1-{MAX_FIELD_NAME_LENGTH} chars, "
f"got {len(v)}: '{v}'",
)
Comment on lines +467 to +468
reused on `tag`, `feature`, `name`, `category_name` (create paths),
`tag_name`."""
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feat]: Limit characters allowed in field/property names; reserve field/property names beginning with underscore

3 participants