Fix time zone issue with get_historical_features #1475
Merged
Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>
woop
reviewed
Apr 19, 2021
"customer_id",
]
).reset_index(drop=True),
check_dtype=False,
Member
Is there a reason for this change?
Collaborator
Author
There were errors from 32-bit vs. 64-bit dtype mismatches, and I didn't think it was important to fix them. I can look into it if you're concerned, though.
Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>
woop
approved these changes
Apr 19, 2021
Collaborator
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: tsotnet, woop
Member
/lgtm
Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>
What this PR does / why we need it: FeatureStore.get_historical_features failed when it was passed records whose timestamps had non-UTC time zones (pandas merge_asof threw an exception). This PR fixes the issue by converting all entity timestamps to UTC. If a timestamp is tz-naive, we assume it is already in UTC and localize it to UTC; if it is tz-aware, we convert it to UTC.
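The normalization rule described above can be sketched roughly as follows. This is only an illustration, not the code from this PR: the helper name to_utc is hypothetical, and it assumes the timestamps arrive as a pandas datetime Series with a single (or no) time zone.

```python
import pandas as pd


def to_utc(timestamps: pd.Series) -> pd.Series:
    """Return the timestamps as tz-aware UTC values (hypothetical helper)."""
    if timestamps.dt.tz is None:
        # tz-naive: assume the wall-clock values are already UTC and attach the zone.
        return timestamps.dt.tz_localize("UTC")
    # tz-aware: keep the same instant, but express it in UTC.
    return timestamps.dt.tz_convert("UTC")


naive = pd.Series(pd.to_datetime(["2021-04-19 10:00:00"]))
aware = pd.Series(pd.to_datetime(["2021-04-19 12:00:00+02:00"]))

# Both inputs end up as the same UTC instant with a matching dtype,
# so they can be used together as merge keys.
print(to_utc(naive)[0])
print(to_utc(aware)[0])
```

Once both sides of a join carry the same tz-aware UTC dtype, a merge on the timestamp column no longer has to reconcile mismatched time zones.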
I also made significant changes to the tests. First, when generating entities, we now generate timestamps in a variety of formats. Second, the get_expected_training_df function in test_historical_retrieval.py used logic very similar to the local offline store (based on pandas merge_asof), meaning that any bug in the local offline store would be replicated in the test. Instead of duplicating that logic in the test, I rewrote the function to do a manual (non-pandas) join. This should give us more confidence in the offline store's correctness.
Which issue(s) this PR fixes:
Fixes #
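For readers unfamiliar with the idea, the manual (non-pandas) join mentioned in the test changes might look something like the sketch below. It is not the actual get_expected_training_df implementation: the function name point_in_time_join, the dictionary keys, and the sample data are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def point_in_time_join(entity_rows, feature_rows, key):
    """For each entity row, attach the latest feature value with the same key
    whose timestamp is at or before the entity timestamp (hypothetical helper)."""
    joined = []
    for entity in entity_rows:
        best = None
        for feature in feature_rows:
            if feature[key] != entity[key]:
                continue  # different entity
            if feature["event_timestamp"] > entity["event_timestamp"]:
                continue  # feature is from the future relative to this row
            if best is None or feature["event_timestamp"] > best["event_timestamp"]:
                best = feature  # most recent eligible feature so far
        row = dict(entity)
        row["feature_value"] = best["value"] if best is not None else None
        joined.append(row)
    return joined


def utc(hour):
    # Hypothetical sample data helper: a UTC timestamp on the PR date.
    return datetime(2021, 4, 19, hour, tzinfo=timezone.utc)


features = [
    {"customer_id": 1, "event_timestamp": utc(9), "value": 10},
    {"customer_id": 1, "event_timestamp": utc(11), "value": 20},
    {"customer_id": 2, "event_timestamp": utc(13), "value": 30},
]
entities = [
    {"customer_id": 1, "event_timestamp": utc(10)},
    {"customer_id": 1, "event_timestamp": utc(12)},
    {"customer_id": 2, "event_timestamp": utc(12)},
]

result = point_in_time_join(entities, features, key="customer_id")
print([row["feature_value"] for row in result])
```

Because the nested loops share no code with merge_asof, a bug in the pandas-based store is unlikely to be mirrored in expected values computed this way.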
Does this PR introduce a user-facing change?: