py: fix Kafka Avro tests for improved performance #5880
Conversation
rivudhk
commented
Mar 20, 2026
- Set start_from: "earliest" for existing variants to prevent the consumer from missing messages
- Add a delay after pipeline starts for variants without pre-created topics
- Use UUID suffixes for view and topic names to prevent schema/topic collisions
- Remove default 'num_partitions' and 'replication_factor' in create_kafka_topic (previously set to 1)
- Increase 'timeout_s' to 3600 seconds for 'wait_for_rows'
- Rename variable 'futures' in 'create_kafka_topics' to 'tpcs' for consistency with 'delete_kafka_topics'
- Remove PYTEST_EXTRA_ARGS in YAML workflows to enable the Kafka tests
- Add variable RUN_ID: 1 in YAML workflow for Python workload tests for the Kafka test
mythical-fred
left a comment
Commit subject ends in a colon ("as follows:") making it a fragment rather than a standalone summary. Please rewrite, e.g.: "py: fix Kafka Avro tests: topic naming, start_from, timeouts".
-    futures = get_kafka_admin().create_topics([new_topic])
-    for t, f in futures.items():
+    tpcs = get_kafka_admin().create_topics([new_topic])
+    for topic, tpcs in tpcs.items():
Variable shadowing: tpcs (the dict from create_topics) is immediately reused as the loop variable in for topic, tpcs in tpcs.items(). Python evaluates .items() before entering the loop so it works, but it is confusing — and it mirrors the same pattern in delete_kafka_topics. Suggest using fut for the loop variable:
for topic, fut in tpcs.items():
    try:
        fut.result()
+    # For variants without pre-created topics, wait for the output connector to create them
+    if not v.create_topic:
+        time.sleep(3)
Unconditional time.sleep(3) to wait for auto-topic creation is a flakiness risk — three seconds may not be enough under load. Is there a way to poll until the topic is visible (e.g. via AdminClient.list_topics()) instead of sleeping?
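One possible shape for such a poll, sketched under the assumption that the suite uses confluent-kafka's `AdminClient` (whose `list_topics()` returns cluster metadata with a `.topics` dict); the helper name `poll_until_topic_exists` is illustrative:

```python
import time


def poll_until_topic_exists(admin, topic, timeout_s=60, poll_interval_s=1):
    """Poll the Kafka admin client until `topic` is visible or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        # list_topics() returns ClusterMetadata; .topics is a dict keyed by topic name
        if topic in admin.list_topics(timeout=5).topics:
            return
        time.sleep(poll_interval_s)
    raise TimeoutError(f"Topic {topic!r} not visible within {timeout_s}s")
```

Unlike a fixed sleep, this returns as soon as the topic appears and fails loudly with a clear error if it never does.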
gz
left a comment
Is there code that cleans up the topics?
- Need to clean up topics if the test is successful.
- Need to clean up old topics left behind by failed tests after a few days.
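Kafka metadata does not directly expose a topic's creation time, so one way to support the "older than a few days" cleanup is to embed a creation timestamp in the topic name and parse it back out when sweeping. A sketch under that assumption (the naming scheme and three-day threshold are illustrative):

```python
import time

MAX_AGE_S = 3 * 24 * 3600  # illustrative: delete test topics older than 3 days


def make_topic_name(test_id: str, suffix: str) -> str:
    # Embed the creation epoch so later runs can judge a topic's age from its name.
    return f"local_test_kafka_avro_{test_id}_{int(time.time())}_{suffix}"


def stale_test_topics(topic_names, now=None):
    """Return test topics whose embedded timestamp is older than MAX_AGE_S."""
    now = now if now is not None else time.time()
    stale = []
    for name in topic_names:
        if not name.startswith("local_test_kafka_avro_"):
            continue  # never touch topics that are not ours
        parts = name.split("_")
        try:
            created = int(parts[-2])  # timestamp sits just before the random suffix
        except (IndexError, ValueError):
            continue  # name does not follow the scheme; leave it alone
        if now - created > MAX_AGE_S:
            stale.append(name)
    return stale
```

A setUpClass hook could then pass `stale_test_topics(admin.list_topics().topics)` to `delete_topics()` before creating its own topics.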
  - name: Run platform tests
    if: ${{ vars.CI_DRY_RUN != 'true' }}
-   run: uv run --locked pytest -n ${{ vars.PYTEST_WORKERS }} tests/platform --timeout=1500 -vv ${{ vars.PYTEST_EXTRA }}
+   run: uv run --locked pytest -n ${{ vars.PYTEST_WORKERS }} tests/platform --timeout=1500 -vv
Let's keep the variable in the YAML for the future, but we can remove the test exclusion from the variable's value.
  - name: Python workload tests
    if: ${{ vars.CI_DRY_RUN != 'true' && !contains(vars.CI_SKIP_JOBS, 'runtime-workload') }}
-   run: uv run --locked pytest -n ${{ vars.PYTEST_WORKERS }} tests/workloads --timeout=3600 -vv ${{ vars.PYTEST_EXTRA }}
+   run: uv run --locked pytest -n ${{ vars.PYTEST_WORKERS }} tests/workloads --timeout=3600 -vv
-    self.topic1 = f"my_topic_avro{suffix}"
-    self.topic2 = f"my_topic_avro2{suffix}"
+    self.topic1 = f"my_topic_avro_{self.id}_{suffix}"
Starting this with the name of the test instead of my_... will make it easier to attribute the topic to a test in the Kafka instance.
-    self.topic1 = f"my_topic_avro{suffix}"
-    self.topic2 = f"my_topic_avro2{suffix}"
+    self.topic1 = f"my_topic_avro_{self.id}_{suffix}"
+    self.topic2 = f"my_topic_avro2_{self.id}_{suffix}"
It is manually set in the TEST_CONFIGS array within the TestKafkaAvro class.
- Set start_from: "earliest" for existing variants to prevent the consumer from missing messages
- Add a delay after pipeline starts for variants without pre-created topics
- Use UUID suffixes for view and topic names to prevent schema/topic collisions
- Remove default 'num_partitions' and 'replication_factor' in create_kafka_topic (previously set to 1)
- Increase 'timeout_s' to 3600 seconds for 'wait_for_rows'
- Rename variable 'futures' in 'create_kafka_topics' to 'tpcs' for consistency with 'delete_kafka_topics'

Signed-off-by: rivudhk <rivudhkr@gmail.com>
- Rename variable tpcs to fut for clarity
- Replace flaky time.sleep() with polling (poll_until_topic_exists)
- Update topic naming mechanism to use pipeline name and other suffixes
- Add setUpClass cleanup to delete topics older than 3 days from previous runs
- Add variable RUN_ID: 1 in YAML workflow for Python workload tests for the Kafka test

Signed-off-by: rivudhk <rivudhkr@gmail.com>
4002fdc to
bbdb2f1
mythical-fred
left a comment
Blockers addressed — commit subject cleaned up, variable renamed, sleep replaced with polling. LGTM.
- def wait_for_rows(pipeline, expected_rows, timeout_s=1800, poll_interval_s=5):
+ def wait_for_rows(pipeline, expected_rows, timeout_s=3600, poll_interval_s=5):
Any reason we need a 1h timeout? It seems like overkill; if it doesn't arrive in 30 min, it probably won't arrive in 1 h.
even 1800 seems too high imo. I'd give this 10min max
    FELDERA_TLS_INSECURE: true
    KAFKA_BOOTSTRAP_SERVERS: ${{ vars.CI_KAFKA_BOOTSTRAP }}
    SCHEMA_REGISTRY_URL: ${{ vars.CI_SCHEMA_REGISTRY }}
+   RUN_ID: 1 # we use this to run a single variant of the Kafka tests in test_kafka_avro.py
Should this be github.run_id? (I don't know, just asking.)
- suffix = uuid.uuid4().hex[:4]
+ suffix = uuid.uuid4().hex[:8]
+ self.pipeline_name = pipeline_name
You can use unique_pipeline_name in feldera.testutils, I think.
-    f.result()
-    print(f"Topic {t} created with {num_partitions} partitions")
+    fut.result()
+    print(
Do we use print in other tests? I think we have some form of logging framework. cc @abhizer
Within the framework we use the logging framework, but within tests we mostly just use print with stderr.
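For reference, a minimal sketch of the print-to-stderr convention described here (the helper name `eprint` is illustrative, not from the codebase):

```python
import sys


def eprint(*args, **kwargs):
    # Same as print(), but writes to stderr instead of stdout.
    print(*args, file=sys.stderr, **kwargs)
```

Usage would look like `eprint(f"Topic {topic} created")` in place of a bare `print(...)`.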
+    topics_to_delete = []
+    for topic in admin.list_topics().topics:
+        if topic.startswith("local_test_kafka_avro_"):
-        if topic.startswith("local_test_kafka_avro_"):
+        if topic.startswith($SOME_VAR):

Should this name be hardcoded?
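One way to avoid the hardcoded prefix is to read it from the environment with the current literal as the fallback; the variable name KAFKA_TEST_TOPIC_PREFIX is hypothetical:

```python
import os

# Hypothetical env var so CI can override the cleanup prefix; defaults to the
# literal currently hardcoded in the test.
TOPIC_PREFIX = os.environ.get("KAFKA_TEST_TOPIC_PREFIX", "local_test_kafka_avro_")


def topics_to_delete(all_topics):
    """Select only topics matching the test prefix, so cleanup never touches others."""
    return [t for t in all_topics if t.startswith(TOPIC_PREFIX)]
```

This keeps local runs unchanged while letting a CI job scope cleanup to its own namespace.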