Commit da0c7c3

benchmark: Update the benchmark for revised Flink and Nexmark benchmarks.
Signed-off-by: Ben Pfaff <blp@feldera.com>
1 parent 70686c9 commit da0c7c3

16 files changed, +112 -492 lines changed

benchmark/README.md

Lines changed: 21 additions & 58 deletions
@@ -1,12 +1,12 @@
 # Comparative Benchmarking

-It's useful for comparative purposes to be able to run the same
-benchmark on DBSP and other systems. The `run-nexmark.sh` script in
-this directory supports running the
-[Nexmark](https://datalab.cs.pdx.edu/niagara/NEXMark/) benchmarks in
-comparable ways across a few different systems, currently:
+It's useful to be able to run the same benchmark on Feldera and other
+systems. The `run-nexmark.sh` script in this directory supports
+running the [Nexmark](https://datalab.cs.pdx.edu/niagara/NEXMark/)
+benchmarks in comparable ways across a few different systems,
+currently:

-* DBSP.
+* Feldera.

 * Flink.

@@ -20,7 +20,7 @@ options are:

 * The underlying system to use, with `--runner`:

-  - `--runner=dbsp` for DBSP.
+  - `--runner=feldera` for Feldera.

   - `--runner=flink` for standalone Flink.

@@ -51,7 +51,7 @@ options are:
   formulated that way.

 * Set the number of cores to use with `--cores`. The default is
-  however many cores your system has, but no more than 16. DBSP
+  however many cores your system has, but no more than 16. Feldera
   uses the exact number of cores specified; some of the other
   runners only approximate the number of cores.
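
Note on the hunk above: the `--cores` default it describes (available cores, capped at 16) can be sketched in a few lines of shell. `cap_cores` is a hypothetical helper, not a function taken from `run-nexmark.sh`, and `nproc` from GNU coreutils is assumed as the source of the core count; the script's actual logic may differ.

```shell
#!/bin/sh
# Hypothetical helper (not from run-nexmark.sh): print the smaller of
# the available core count ($1) and the cap ($2).
cap_cores() {
    if [ "$1" -gt "$2" ]; then
        echo "$2"
    else
        echo "$1"
    fi
}

# Assumes GNU coreutils `nproc` for the number of available cores.
cores=$(cap_cores "$(nproc)" 16)
echo "default --cores would be: $cores"
```

Passing `--cores` explicitly makes runs on machines of different sizes easier to compare than relying on a capped default.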

@@ -69,9 +69,19 @@ options, or you can use `suite.mk`, which is a wrapper around

 # Setting up the runners

-## DBSP setup
+## Feldera setup

-`run-nexmark.sh` supports DBSP without special setup.
+`run-nexmark.sh` supports Feldera without special setup.
+
+Feldera uses temporary files for storage, by default in `/tmp`. If
+`/tmp` is `tmpfs`, as is the default on Fedora and some other
+distributions, then these files will really be written into memory.
+In that case, consider setting `TMPDIR` in the environment to a
+directory on a real filesystem, e.g.:
+
+```
+TMPDIR=/var/run ./run-nexmark.sh
+```

 ## Flink setup
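
Note on the `TMPDIR` advice in the hunk above: before picking a directory, it may help to check which filesystem actually backs it. On many systemd-based distributions `/var/run` is a symlink to `/run`, which is itself `tmpfs`, so the directory in the example is worth verifying on the target machine. The sketch below assumes GNU coreutils `stat`; `fs_type` and `is_tmpfs` are hypothetical helper names, not part of `run-nexmark.sh`.

```shell
#!/bin/sh
# Print the filesystem type backing a directory.
# Assumes GNU coreutils `stat` (-f: filesystem status, %T: type name).
fs_type() {
    stat -f -c %T "$1"
}

# Succeed if the directory lives on tmpfs, i.e. in memory.
is_tmpfs() {
    [ "$(fs_type "$1")" = "tmpfs" ]
}

dir="${TMPDIR:-/tmp}"
if is_tmpfs "$dir"; then
    echo "$dir is tmpfs: temporary files would be held in memory"
else
    echo "$dir is backed by $(fs_type "$dir")"
fi
```

If the check reports `tmpfs`, point `TMPDIR` at a directory for which `is_tmpfs` fails before running the benchmark.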

@@ -88,7 +98,7 @@ To use `run-nexmark.sh` with Beam, first follow the instructions for
 building Nexmark in `beam/README.md`. This can be as simple as:

 ```
-(cd flink && ./setup.sh)
+(cd beam && ./setup.sh)
 ```

 ### Google Cloud Dataflow on Beam
@@ -180,50 +190,3 @@ To make sure that you don't get billed further, you can delete the
 project that you created for benchmarking. The ongoing cost of the
 project is minimal (perhaps $1 per month or less) as long as no jobs
 run, so there is not much reason to delete it.
-
-# Sample results
-
-The following table, produced using `suite.mk` and `analysis.sps`,
-shows results obtained on one particular system:
-
-```
-16-core Nexmark Streaming Performance
-╭────────┬───────────────────────────────────────────────────────────────────────────╮
-│        │                                  runner                                   │
-│        ├─────────┬─────────┬──────────────────────────┬────────────────────────────┤
-│        │   DBSP  │  Flink  │      Flink on Beam       │      Dataflow on Beam      │
-│        ├─────────┼─────────┼──────────────────────────┼────────────────────────────┤
-│        │ language│ language│         language         │          language          │
-│        ├─────────┼─────────┼────────┬────────┬────────┼─────────┬─────────┬────────┤
-│        │ default │ default │ default│   SQL  │ ZetaSQL│ default │   SQL   │ ZetaSQL│
-│        ├─────────┼─────────┼────────┼────────┼────────┼─────────┼─────────┼────────┤
-│        │ events/s│ events/s│events/s│events/s│events/s│ events/s│ events/s│events/s│
-│        ├─────────┼─────────┼────────┼────────┼────────┼─────────┼─────────┼────────┤
-│        │   Mean  │   Mean  │  Mean  │  Mean  │  Mean  │   Mean  │   Mean  │  Mean  │
-├────────┼─────────┼─────────┼────────┼────────┼────────┼─────────┼─────────┼────────┤
-│query 0 │9,926,544│1,889,538│ 283,366│ 225,276│ 208,986│  697,837│  679,810│ 197,746│
-│      1 │9,942,334│  516,358│ 316,056│ 305,904│        │1,023,541│  819,001│        │
-│      2 │9,927,529│1,834,189│ 517,331│ 500,250│ 417,014│1,824,818│1,773,050│ 329,598│
-│      3 │9,936,407│  617,623│ 555,247│ 415,455│ 402,414│  793,651│  694,444│ 413,394│
-│      4 │9,768,487│  423,881│  93,712│        │        │   63,068│         │        │
-│      5 │9,906,875│  362,190│ 251,699│        │        │  114,758│         │        │
-│      6 │9,829,942│         │  88,488│        │        │   63,577│         │        │
-│      7 │7,380,618│  208,461│ 194,970│  68,069│        │  102,491│    5,518│        │
-│      8 │9,380,863│  532,430│ 396,511│        │        │  418,760│         │        │
-│      9 │2,107,437│  205,650│  88,715│        │        │   69,701│         │        │
-│      10│         │         │ 199,681│        │        │  106,758│         │        │
-│      11│         │  509,767│ 194,326│        │        │   83,731│         │        │
-│      12│9,134,088│  753,699│ 365,631│        │        │  176,149│         │        │
-│      13│5,778,009│  584,037│ 225,734│ 141,844│ 134,445│1,002,004│  564,016│ 110,681│
-│      14│9,928,515│  539,046│ 200,803│        │        │  108,319│         │        │
-│      15│8,911,862│  431,345│        │        │        │         │         │        │
-│      16│3,094,251│  236,567│ 339,443│        │        │  208,030│         │        │
-│      17│7,127,076│1,382,246│        │        │        │         │         │        │
-│      18│3,377,351│1,079,844│        │        │        │         │         │        │
-│      19│2,732,390│  556,923│        │        │        │         │         │        │
-│      20│3,444,356│  367,674│        │        │        │         │         │        │
-│      21│9,760,859│  780,567│        │        │        │         │         │        │
-│      22│9,935,420│  433,862│        │        │        │         │         │        │
-╰────────┴─────────┴─────────┴────────┴────────┴────────┴─────────┴─────────┴────────╯
-Beam Spark performance omitted because Nexmark hangs in streaming mode.
-```

benchmark/analysis.sps

Lines changed: 7 additions & 8 deletions
@@ -1,6 +1,5 @@
-/* This is SPSS syntax for generating a summary table. It works with
-/* GNU PSPP's current tip-of-master (the CTABLES command isn't supported
-/* in the latest release version).
+/* This is SPSS syntax for generating a summary table.
+/* It also works with GNU PSPP 2.0 or later.

 DATA LIST LIST(',') NOTABLE FILE='|grep -vh when *-100M.csv'
 /when (YMDHMS17)
@@ -13,7 +12,7 @@ DATA LIST LIST(',') NOTABLE FILE='|grep -vh when *-100M.csv'
 elapsed (F5.3).
 VARIABLE LEVEL cores events elapsed (SCALE) query (NOMINAL).

-SELECT IF mode='stream'.
+*SELECT IF mode='stream'.

 VALUE LABELS language
 'sql' 'SQL'
@@ -26,6 +25,7 @@ VARIABLE LEVEL querynum (NOMINAL).

 RECODE runner
 ('dbsp'=0)
+('feldera'=0)
 ('flink'=1)
 ('beam.direct'=2)
 ('beam.flink'=3)
@@ -35,7 +35,7 @@ RECODE runner
 INTO nrunner.
 VARIABLE LABEL nrunner 'runner'.
 VALUE LABELS nrunner
-0 'DBSP'
+0 'Feldera'
 1 'Flink'
 2 'Beam (direct)'
 3 'Flink on Beam'
@@ -47,7 +47,6 @@ COMPUTE eps=events/elapsed.
 VARIABLE LABEL eps 'events/s'.
 FORMATS eps(COMMA10).
 CTABLES
-/TABLE=querynum BY nrunner > language > eps
+/TABLE=querynum BY nrunner > mode > language > eps
 /TITLES
-TITLE='16-core Nexmark Streaming Performance'
-CAPTION='Beam Spark performance omitted because Nexmark hangs in streaming mode.'.
+TITLE='16-core Nexmark Streaming Performance'.
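
Note on `analysis.sps`: the syntax computes `eps=events/elapsed` per run and tabulates its mean per query and runner, with `dbsp` and `feldera` recoded to the same runner value. For readers without SPSS or PSPP, the same arithmetic can be sketched with `awk`. The four-column input layout used here (`runner,query,events,elapsed`) is a simplifying assumption, not the real CSV layout, which the diff does not show in full; `mean_eps` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical input layout (NOT the real CSV columns, which the diff
# does not show in full): runner,query,events,elapsed-seconds
mean_eps() {
    awk -F, '
        $4 > 0 {
            # Mirror the RECODE step: "dbsp" and "feldera" share one group.
            runner = ($1 == "dbsp") ? "feldera" : $1
            key = runner "," $2
            sum[key] += $3 / $4    # eps = events / elapsed
            n[key]++
        }
        END {
            for (k in sum) printf "%s,%.0f\n", k, sum[k] / n[k]
        }
    ' | sort
}

printf '%s\n' \
    'dbsp,q0,100000000,10.0' \
    'feldera,q0,100000000,12.5' \
    'flink,q0,100000000,50.0' | mean_eps
# prints:
# feldera,q0,9000000
# flink,q0,2000000
```

The sketch averages the per-run rates rather than dividing total events by total time, which matches taking the mean of the `eps` variable within each table cell.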

benchmark/beam/README.md

Lines changed: 4 additions & 4 deletions
@@ -28,8 +28,8 @@ https://beam.apache.org/documentation/sdks/java/testing/nexmark/.

 ## Prerequisites

-Install the Java Development Kit. JDK versions 8, 11, and 17 should
-work. These instructions were tested with JDK 11.
+Install the Java Development Kit. These instructions were tested with
+OpenJDK 21.0.2.

 ## Setting up Beam

@@ -42,10 +42,10 @@ You can follow the instructions below to build Nexmark, or run
 git clone https://github.com/apache/beam.git
 ```

-If you wish to benchmark a particular version, check out its tag:
+If you wish to benchmark a particular version, check it out:

 ```
-(cd beam && git checkout v2.46.0)
+(cd beam && git checkout origin/release-2.55.0)
 ```

 2. Apply `configurable-spark-master.patch`:
0 commit comments

Comments
 (0)