Skip to content

Latest commit

 

History

History

README.md

Examples of writing to Sinks

This module contains example pipelines that use the Beam IO connectors also known as Sinks to write in streaming and batch.

Batch

test_write_bounded.py - a simple pipeline taking a bounded PCollection as input using the Create transform (useful for testing) and writing it to files using multiple IOs.

Running the pipeline

To run the pipeline locally:

python -m apache_beam.examples.sinks.test_write_bounded

Streaming

Two example pipelines that use 2 different approches for creating the input.

test_write_unbounded.py uses TestStream, a method where you can control when data arrives and how watermark advances. This is especially useful in unit tests.

test_periodicimpulse.py uses PeriodicImpulse, a method useful to test pipelines in realtime. You can run it to Dataflow as well.

Running the pipeline

To run the pipelines locally:

python -m apache_beam.examples.sinks.test_write_unbounded
python -m apache_beam.examples.sinks.test_periodicimpulse