Background
We're on Pydantic ≥2.11. Many models map snake_case Python fields to camelCase external keys (storage/state/API JSON), and today every field spells its alias out by hand. For example in src/crawlee/statistics/_models.py:
stats_id: Annotated[int | None, Field(alias='statsId')] = None
requests_finished: Annotated[int, Field(alias='requestsFinished')] = 0
crawler_started_at: Annotated[datetime | None, Field(alias='crawlerStartedAt')] = None
Proposal
Set an alias generator once in the model config so the camelCase aliases are derived automatically:
from pydantic.alias_generators import to_camel
model_config = ConfigDict(validate_by_name=True, validate_by_alias=True, alias_generator=to_camel)
Fields then drop the manual alias (stats_id: int | None = None) and still (de)serialize as statsId. We already standardize on validate_by_name=True, validate_by_alias=True (the 2.11 successors of populate_by_name), so the only new piece is alias_generator. The config is best placed on a shared base model, since subclasses currently redeclare it.
Where it makes sense
- Apply to models that already use camelCase aliases: Request, SessionModel, StatisticsState, the storage_clients/models.py set, event data classes, etc.
- Keep an explicit Field(alias=...) wherever to_camel produces the wrong key (acronyms / irregular casing). Each field's generated alias must equal the current one.
- Skip models whose serialized form is snake_case, e.g. RequestQueueState in _file_system/_request_queue_client.py, and Configuration (BaseSettings with env-var AliasChoices).
Scope
Refactor only. The wire format must not change; generated aliases have to match the existing ones, guarded by the current (de)serialization tests.
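One way to check the "generated aliases match the existing ones" constraint before removing any manual alias is to compare each hand-written alias with what `to_camel` would produce. The helper below is a rough sketch with a hypothetical name (`find_alias_mismatches`); the sample mapping uses only the three fields quoted in the Background section.

```python
from pydantic.alias_generators import to_camel

# Field name -> alias currently written by hand (taken from the Background example).
CURRENT_ALIASES = {
    'stats_id': 'statsId',
    'requests_finished': 'requestsFinished',
    'crawler_started_at': 'crawlerStartedAt',
}


def find_alias_mismatches(current_aliases: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return {field: (current_alias, generated_alias)} for fields where they differ."""
    return {
        name: (alias, to_camel(name))
        for name, alias in current_aliases.items()
        if to_camel(name) != alias
    }


# An empty result means these fields can safely drop their manual alias.
mismatches = find_alias_mismatches(CURRENT_ALIASES)
```

Any field reported by such a check would keep its explicit `Field(alias=...)`.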