Expected Behavior
FeatureStore did not previously perform any I/O on instantiation, and this should remain the expected behavior. As far as I can tell, all other abstractions follow this convention. Examples include all implementations of Provider, DataSource, RegistryStore, InfraObject, and RetrievalJob.
This is also consistent with other Python APIs, such as Amazon's boto3:
import boto3
s3 = boto3.client('s3')  # no I/O; doesn't try to read credentials from config file on disk; etc
s3.list_buckets()  # explicit I/O operation
# BucketPolicy belongs to the resource API, not the client API
bucket_policy = boto3.resource('s3').BucketPolicy('bucket_name')  # no I/O; doesn't verify if bucket exists
bucket_policy.load()  # explicit I/O operation
However, this convention is not followed consistently across PEP 249 implementations. Trino doesn't perform any I/O, but by default the BigQuery SDK tries to load a credentials file from disk, the psycopg2 library tries to establish a valid connection, and sqlite3 tries to create or open a database file.
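The sqlite3 case is easy to check with the standard library alone. The snippet below is a small sketch, not part of the original report: it points `sqlite3.connect` at a file inside a directory that does not exist and records whether the error surfaces at connect time rather than on the first query. The helper name `connect_fails_eagerly` is made up for illustration.

```python
import os
import sqlite3
import tempfile


def connect_fails_eagerly() -> bool:
    """Return True if sqlite3.connect itself raises for an unopenable path."""
    with tempfile.TemporaryDirectory() as tmp:
        # The parent directory "no_such_dir" is never created, so the
        # database file cannot be opened or created.
        path = os.path.join(tmp, "no_such_dir", "registry.db")
        try:
            conn = sqlite3.connect(path)
        except sqlite3.OperationalError:
            # The failure happens while constructing the connection,
            # before any statement is executed.
            return True
        conn.close()
        return False


if __name__ == "__main__":
    print(connect_fails_eagerly())
```

This is the same class of failure as the one reported here: constructing the object is enough to hit the backend.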
Current Behavior
#2256 added the self._registry._initialize_registry() line to FeatureStore's __init__ method. This line tries to fetch the registry's proto from the registry store, and it fails if the proto does not exist or if the remote registry is not accessible from the current environment.
Steps to reproduce
>>> from feast import FeatureStore, RepoConfig
>>> store = FeatureStore(
...     config=RepoConfig(
...         project="foo",
...         provider="local",
...         registry="s3://my-bucket/registry.db",
...     )
... )
[...]
  File ".venv/lib/python3.9/site-packages/botocore/signers.py", line 103, in handler
    return self.sign(operation_name, request)
  File ".venv/lib/python3.9/site-packages/botocore/signers.py", line 187, in sign
    auth.add_auth(request)
  File ".venv/lib/python3.9/site-packages/botocore/auth.py", line 405, in add_auth
    raise NoCredentialsError()
botocore.exceptions.NoCredentialsError: Unable to locate credentials
Specifications
- Version: 0.20.2
- Platform: macOS
- Subsystem:
Possible Solution
Lazily load the registry's proto. I think this can be achieved by removing self._registry._initialize_registry() from the __init__ method. Maybe @adchia @felixwang9817 can provide more context here. I'm also interested in hearing your take on this.
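To make the idea concrete, here is a minimal sketch of the lazy-loading pattern, assuming the registry proto can be fetched through a single callable. `LazyRegistry`, `Store`, `fetch_proto`, and `read_registry` are hypothetical stand-ins and not Feast's actual classes or methods; the real change may be as small as dropping the call from `__init__`.

```python
from typing import Callable, Optional


class LazyRegistry:
    """Hypothetical registry wrapper that defers I/O until first use."""

    def __init__(self, fetch_proto: Callable[[], bytes]) -> None:
        # Only remember how to fetch the proto; nothing is read here.
        self._fetch_proto = fetch_proto
        self._cached_proto: Optional[bytes] = None

    def get_proto(self) -> bytes:
        # The first call performs the (possibly remote) read.
        # Later calls reuse the cached value.
        if self._cached_proto is None:
            self._cached_proto = self._fetch_proto()
        return self._cached_proto


class Store:
    """Hypothetical stand-in for FeatureStore."""

    def __init__(self, fetch_proto: Callable[[], bytes]) -> None:
        # Instantiation never touches the registry backend.
        self._registry = LazyRegistry(fetch_proto)

    def read_registry(self) -> bytes:
        # An explicit operation is what triggers the I/O.
        return self._registry.get_proto()
```

With this shape, a missing registry file or missing credentials would surface on the first explicit operation instead of at construction time, which matches the behavior described under Expected Behavior.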