[adapters] Delta output: enable checkpoints.
The Delta output connector did not create periodic checkpoints.
Besides being a problem in its own right, this made the connector
slower over time because of a delta-rs bug that causes the
`update_incremental` function to scan the entire transaction log on
every commit:
delta-io/delta-kernel-rs#2103.
This commit:
- Introduces the `checkpoint_interval` option, which tells
the connector to configure the checkpoint interval when creating
the table.
- Creates a CommitBuilder that is actually set up to create
checkpoints.
Without this fix the time to create a trivial delta commit increases from
1.5s to 6s after ~1000 commits. With the fix it remains constant at
~2s.
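
As a rough illustration of why the checkpoint interval matters, here is a
minimal sketch of a simplified model. It is not the connector's or delta-rs's
code; `should_checkpoint` and `commits_to_replay` are hypothetical helpers,
and treating an interval of 0 as "checkpoints disabled" is an assumption made
for this example only. It assumes a checkpoint is written whenever the commit
version is a multiple of the interval, and counts how many commit files must
be replayed to reconstruct the table state.

```rust
/// Hypothetical helper: returns true when the commit at `version` should be
/// followed by a checkpoint. An interval of 0 is treated here as
/// "checkpoints disabled" (an assumption of this sketch).
fn should_checkpoint(version: u64, interval: u64) -> bool {
    interval != 0 && version > 0 && version % interval == 0
}

/// Hypothetical helper: number of commit files that must be replayed to
/// reconstruct the table state at `version`. With a checkpoint available,
/// only the commits after the latest checkpoint are replayed; without one,
/// the whole log (versions 0..=version) is scanned.
fn commits_to_replay(version: u64, interval: u64) -> u64 {
    if interval == 0 || version < interval {
        return version + 1;
    }
    // Latest version at or below `version` that is a multiple of `interval`.
    let last_checkpoint = version - version % interval;
    version - last_checkpoint
}
```

Under this model, `commits_to_replay(1000, 0)` covers the entire log, while
`commits_to_replay(1005, 10)` covers only the five commits since the
checkpoint at version 1000, so the work per commit stays bounded by the
interval.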
Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
[adapters]: parallel delta output encoder
Use SplitCursor to split the batch and distribute it across tasks.
Each task encodes its part and writes it to Delta Lake, retrying on
failure, and then returns the resulting Add actions. The main task
then commits these actions to Delta Lake, also retrying on failure.
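
The split, encode, and commit flow described above can be sketched with
standard-library threads. This is an illustrative sketch only, not the
connector's implementation: it uses slice chunking in place of SplitCursor,
and `AddAction`, `retry`, `encode_and_write`, `commit`, and `write_batch` are
hypothetical names that stand in for the real encoder and the Delta commit.

```rust
use std::thread;

/// Hypothetical stand-in for a Delta `Add` action: describes one written file.
#[derive(Debug, Clone, PartialEq)]
struct AddAction {
    path: String,
    num_records: usize,
}

/// Calls `op` until it succeeds or `max_attempts` attempts have failed.
/// `op` receives the zero-based attempt number and always runs at least once.
fn retry<T, E>(max_attempts: usize, mut op: impl FnMut(usize) -> Result<T, E>) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => attempt += 1,
        }
    }
}

/// Hypothetical encoder: pretends to write one part of the batch to storage
/// and returns the action describing the resulting file.
fn encode_and_write(part_index: usize, records: &[u64]) -> Result<AddAction, String> {
    Ok(AddAction {
        path: format!("part-{}.parquet", part_index),
        num_records: records.len(),
    })
}

/// Hypothetical commit: pretends to commit all actions in one transaction
/// and returns the total number of records committed.
fn commit(actions: &[AddAction]) -> Result<usize, String> {
    Ok(actions.iter().map(|a| a.num_records).sum())
}

/// Splits `batch` into at most `num_tasks` parts, encodes each part on its
/// own thread with retries, then commits all actions from the calling thread.
fn write_batch(batch: &[u64], num_tasks: usize) -> Result<Vec<AddAction>, String> {
    let tasks = num_tasks.max(1);
    let chunk_len = ((batch.len() + tasks - 1) / tasks).max(1);
    let actions = thread::scope(|s| {
        let handles: Vec<_> = batch
            .chunks(chunk_len)
            .enumerate()
            .map(|(i, part)| s.spawn(move || retry(3, |_| encode_and_write(i, part))))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("encoder task panicked"))
            .collect::<Result<Vec<_>, String>>()
    })?;
    // Only the calling (main) task commits, so there is a single transaction.
    retry(3, |_| commit(&actions))?;
    Ok(actions)
}
```

Keeping the commit on the main task means the parallel tasks never contend on
the transaction log; they only produce files and the actions that describe
them.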
Signed-off-by: Swanand Mulay <73115739+swanandx@users.noreply.github.com>