Commit b14971f

update kafka instructions for dealing with messages etc

1 parent: 2a77e77

2 files changed: 47 additions & 4 deletions

dataflow/flex-templates/kafka_to_bigquery/README.md

Lines changed: 46 additions & 3 deletions
````diff
@@ -130,9 +130,9 @@ For this, we need two parts running:
 > </details>
 
 The Kafka server must be accessible to *external* applications.
-For this we need a
-[static IP address](https://cloud.google.com/compute/docs/ip-addresses/reserve-static-external-ip-address)
-for the Kafka server to live.
+For this we need an
+[external static IP address](https://cloud.google.com/compute/docs/ip-addresses/reserve-static-external-ip-address)
+for the Kafka server to live, not an internal IP address.
 
 > ℹ️ If you already have a Kafka server running you can skip this section.
 > Just make sure to store its IP address into an environment variable.
````
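The reservation step referenced by that link can be sketched with `gcloud compute addresses`; the address name `kafka-address` and region `us-central1` below are illustrative assumptions, not values taken from this diff:

```shell
# Sketch: reserve an external static IP and store it in an environment
# variable for the later steps. The gcloud calls are shown as comments
# because they require a configured project; "kafka-address" and
# "us-central1" are assumed names, not from the README.
#
#   gcloud compute addresses create kafka-address --region=us-central1
#   KAFKA_ADDRESS=$(gcloud compute addresses describe kafka-address \
#       --region=us-central1 --format='value(address)')
#
# Placeholder external IP (TEST-NET-3 documentation range):
KAFKA_ADDRESS="203.0.113.10"
export KAFKA_ADDRESS
echo "$KAFKA_ADDRESS"
```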
````diff
@@ -186,6 +186,14 @@ export KAFKA_IMAGE="gcr.io/$PROJECT/samples/dataflow/kafka:latest"
 # Build the Kafka server image into Container Registry.
 gcloud builds submit --tag $KAFKA_IMAGE kafka/
 
+# If a different topic, address, Kafka port, or ZooKeeper port is desired,
+# update the following environment variables before starting the server.
+# Otherwise, the default values from the Dockerfile will be used:
+export KAFKA_TOPIC=<topic-name>
+export KAFKA_ADDRESS=<kafka-address>
+export KAFKA_PORT=<kafka-port>
+export ZOOKEEPER_PORT=<zookeeper-port>
+
 # Create and start a new instance.
 # The --address flag binds the VM's address to the static address we created.
 # The --container-env KAFKA_ADDRESS is an environment variable passed to the
````
````diff
@@ -200,6 +208,41 @@ gcloud compute instances create-with-container kafka-vm \
   --tags "kafka-server"
 ```
 
+### Sending messages to the Kafka server
+
+The Kafka server should be running at this point, but no messages are being
+sent to a topic yet, which will cause the KafkaToBigQuery template to fail.
+SSH into the `kafka-vm` instance created earlier and run whichever of the
+commands below you need. Messages sent before the template is started are
+still available once it starts; messages sent after the template has
+started are processed as they arrive.
+
+```sh
+# 1. If the existing topic is not sufficient, create a new one:
+docker run --rm --network host bitnami/kafka:3.4.0 \
+  /opt/bitnami/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 \
+  --create --topic <topic-name> --partitions 1 --replication-factor 1
+
+# 2. If the existing topic needs to be deleted, delete it:
+docker run --rm --network host bitnami/kafka:3.4.0 \
+  /opt/bitnami/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 \
+  --delete --topic <topic-name>
+
+# 3. To send messages to the Kafka topic, run the following command, then
+# type each message and press Enter to send it. End with Ctrl+C:
+docker run -i --rm --network host bitnami/kafka:3.4.0 \
+  /opt/bitnami/kafka/bin/kafka-console-producer.sh \
+  --bootstrap-server localhost:9092 --topic messages
+
+# 4. To verify that the messages exist, run this command and end with
+# Ctrl+C:
+docker run -it --rm --network host bitnami/kafka:3.4.0 \
+  /opt/bitnami/kafka/bin/kafka-console-consumer.sh \
+  --bootstrap-server localhost:9092 --topic messages --from-beginning
+```
+
+
 ### Creating and running a Flex Template
 
 > <details><summary>
````
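The console producer in that section reads one message per input line, so messages can also be piped in non-interactively instead of typed by hand. A sketch of that variant (the JSON payloads are made-up examples, not a schema from the sample):

```shell
# Non-interactive variant of the producer step: printf emits one line per
# argument, and the console producer turns each input line into one Kafka
# message. The docker invocation (commented out here, since it needs a
# running broker) is the same one the README uses:
#
#   printf '%s\n' '{"id": 1}' '{"id": 2}' | \
#     docker run -i --rm --network host bitnami/kafka:3.4.0 \
#       /opt/bitnami/kafka/bin/kafka-console-producer.sh \
#       --bootstrap-server localhost:9092 --topic messages
#
# The line-per-message behavior of printf itself:
printf '%s\n' '{"id": 1}' '{"id": 2}'
```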

dataflow/flex-templates/kafka_to_bigquery/kafka/Dockerfile

Lines changed: 1 addition & 1 deletion

````diff
@@ -33,7 +33,7 @@ ENV ZOOKEEPER_PORT="${ZOOKEEPER_PORT:-2181}"
 
 # Download and install Apache Kafka.
 RUN apk add --no-cache bash \
-    && wget http://apache.mirrors.spacedump.net/kafka/${KAFKA_VERSION}/kafka_${SCALA_VERSION}-${KAFKA_VERSION}.tgz \
+    && wget http://archive.apache.org/dist/kafka/${KAFKA_VERSION}/kafka_${SCALA_VERSION}-${KAFKA_VERSION}.tgz \
         -O /tmp/kafka.tgz \
     && tar xzf /tmp/kafka.tgz -C /opt && rm /tmp/kafka.tgz \
     && ln -s /opt/kafka_${SCALA_VERSION}-${KAFKA_VERSION} /opt/kafka
````
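Context for this URL change: Apache download mirrors only carry current releases, while `archive.apache.org` retains all past releases, so a Dockerfile pinned to a specific Kafka version keeps resolving. A sketch of how the URL is assembled from the build variables (the version numbers below are illustrative, not the ones pinned in the Dockerfile):

```shell
# Illustrative versions; the Dockerfile pins its own via ARG/ENV.
KAFKA_VERSION="2.8.2"
SCALA_VERSION="2.13"
echo "http://archive.apache.org/dist/kafka/${KAFKA_VERSION}/kafka_${SCALA_VERSION}-${KAFKA_VERSION}.tgz"
```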
