This is Unstract's Python package for connecting to a range of filesystems and databases.
Filesystems are supported through fsspec libraries, which provide a uniform interface to these connectors.
The following filesystems are supported:
- Google Drive
- S3/Minio
- Unstract Cloud Storage
- Box
- Dropbox (issues exist around file discovery/listing)
- HTTP(S)
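
To illustrate what a uniform fsspec interface means in practice, here is a minimal sketch that lists and reads files through generic fsspec calls. It uses fsspec's built-in in-memory filesystem so it runs without credentials; the helper name `list_and_read` and the `/demo` path are hypothetical and are not part of this package's API.

```python
import fsspec


def list_and_read(fs, directory):
    """Return {path: bytes} for every file directly under `directory`.

    Hypothetical helper: it relies only on the generic fsspec methods
    ls(), isfile() and open(), so it does not depend on the backend.
    """
    contents = {}
    for path in fs.ls(directory, detail=False):
        if fs.isfile(path):
            with fs.open(path, "rb") as handle:
                contents[path] = handle.read()
    return contents


# The in-memory backend stands in for a real connector here.
fs = fsspec.filesystem("memory")
fs.pipe("/demo/hello.txt", b"hello")  # write a small file to read back
files = list_and_read(fs, "/demo")
```

The same helper should work with any other fsspec filesystem object, provided the matching backend library is installed and configured.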
The following databases are supported:
- Snowflake
- PostgreSQL
- MySQL
- MSSQL
- Redshift
- MariaDB
- BigQuery
To get started with local development:
- Create and source a virtual environment if you haven't already.
- If you're on a Mac, install the libraries below, which PyMSSQL needs:
brew install pkg-config freetds
- Install the required dependencies with:
pdm install

If the GCSHelper is used, the following environment variables need to be set:
- GOOGLE_SERVICE_ACCOUNT : The service account JSON used to authenticate with the Google Cloud Storage account.
- GOOGLE_PROJECT_ID : The project ID associated with the Google Cloud Storage account.
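
As a rough sketch of how these two variables might be checked before use, the snippet below reads them and fails early if either is missing. It assumes GOOGLE_SERVICE_ACCOUNT holds the JSON content itself rather than a file path; the function name `load_gcs_settings` is hypothetical and does not reflect how GCSHelper is actually implemented.

```python
import json
import os

REQUIRED_VARS = ("GOOGLE_SERVICE_ACCOUNT", "GOOGLE_PROJECT_ID")


def load_gcs_settings(environ=os.environ):
    """Return (service_account_dict, project_id) from the environment.

    Hypothetical helper. Assumes GOOGLE_SERVICE_ACCOUNT contains the
    service account JSON as a string, not a path to a file.
    """
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    service_account = json.loads(environ["GOOGLE_SERVICE_ACCOUNT"])
    return service_account, environ["GOOGLE_PROJECT_ID"]
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise without touching the real environment.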
TODO: Use a test framework and document way to run tests
