Skip to content

Commit 163140b

Browse files
authored
Support for i8, Time. Documented PIVOT operation. (#383)
* Support for i8, Time. Documented PIVOT operation. * [sql] Correct code generation for conditional aggregates Signed-off-by: Mihai Budiu <mbudiu@feldera.com>
1 parent 585a192 commit 163140b

44 files changed

Lines changed: 2364 additions & 1391 deletions

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

CONTRIBUTING.md

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,29 @@ it on as an open-source patch. The e-mail address used to sign must match the e
88
author. If you set your `user.name` and `user.email` git config values, you can sign your commit automatically
99
with `git commit -s`.
1010

11+
## Dependencies
12+
13+
We develop and test under Linux. Windows Subsystem for Linux works fine.
14+
The setup can be seen in this [Earthfile](Earthfile).
15+
16+
Our known dependencies are:
17+
- Runtime
18+
- a Rust tool chain (install rustup and the default toolchain)
19+
- this will need a C and C++ compiler installed (e.g., gcc, gcc++)
20+
- cmake
21+
- libdev-ssl
22+
- SQL Compiler
23+
- a Java Virtual Machine (at least Java 8)
24+
- maven
25+
- graphviz
26+
- Cloud and UI
27+
- Python 3
28+
- typescript
29+
- Redpanda or Kafka
30+
31+
Additional dependencies are automatically installed by the Rust,
32+
maven, Python, and typescript build tools.
33+
1134
## Contribution Flow
1235

1336
### Forking

README.md

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,10 +2,11 @@
22

33
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
44
[![CI workflow](https://github.com/feldera/dbsp/actions/workflows/ci.yml/badge.svg)](https://github.com/feldera/dbsp/actions)
5+
<!--
56
[![codecov](https://codecov.io/gh/feldera/dbsp/branch/main/graph/badge.svg?token=0wZcmD11gt)](https://codecov.io/gh/feldera/dbsp)
7+
-->
68
[![nightly](https://github.com/feldera/dbsp/actions/workflows/containers.yml/badge.svg)](https://github.com/feldera/dbsp/actions/workflows/containers.yml)
79

8-
910
The [Feldera Continuous Analytics Platform](https://www.feldera.com/), or Feldera in short, is a
1011
fast computational engine for *continuous analytics* over data in-motion. It
1112
allows users to build data pipelines as SQL programs that are continuously
@@ -92,6 +93,11 @@ To learn more about Feldera, we recommend going through the [documentation](http
9293
* [SQL reference](https://docs.feldera.io/docs/sql/intro)
9394
* [API reference](https://docs.feldera.io/docs/api/rest/)
9495

96+
## Contributing
97+
98+
Most of the software in this repository is governed by an open-source license.
99+
We welcome contributions. Here are some [guidelines](CONTRIBUTING.md).
100+
95101
## Theory
96102

97103
Feldera achieves its objectives by building on a solid mathematical

crates/dbsp/Cargo.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -43,6 +43,7 @@ uuid = { version = "1.1.2", features = ["v4"], optional = true }
4343
arc-swap = "1.5.1"
4444
mimalloc-rust-sys = "1.7.2"
4545
rand = "0.8.5"
46+
rust_decimal = "1.29"
4647

4748
[dependencies.size-of]
4849
version = "0.1.5"

crates/dbsp/src/algebra/mod.rs

Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,7 @@ pub use order::{PartialOrder, TotalOrder};
1717
pub use present::Present;
1818
pub use zset::{IndexedZSet, ZSet};
1919

20+
use rust_decimal::Decimal;
2021
use size_of::SizeOf;
2122
use std::{
2223
marker::PhantomData,
@@ -327,6 +328,24 @@ impl MulByRef<isize> for F64 {
327328
}
328329
}
329330

331+
impl MulByRef<isize> for Decimal {
332+
type Output = Self;
333+
334+
#[inline]
335+
fn mul_by_ref(&self, w: &isize) -> Self::Output {
336+
*self * Decimal::from(*w)
337+
}
338+
}
339+
340+
impl MulByRef<isize> for Option<Decimal> {
341+
type Output = Self;
342+
343+
#[inline]
344+
fn mul_by_ref(&self, w: &isize) -> Self::Output {
345+
self.as_ref().map(|x| (*x * Decimal::from(*w)))
346+
}
347+
}
348+
330349
impl MulByRef<isize> for Option<i32> {
331350
type Output = Self;
332351

@@ -437,6 +456,24 @@ impl MulByRef<i64> for F64 {
437456
}
438457
}
439458

459+
impl MulByRef<i64> for Decimal {
460+
type Output = Self;
461+
462+
#[inline]
463+
fn mul_by_ref(&self, w: &i64) -> Self::Output {
464+
*self * Decimal::from(*w)
465+
}
466+
}
467+
468+
impl MulByRef<i64> for Option<Decimal> {
469+
type Output = Self;
470+
471+
#[inline]
472+
fn mul_by_ref(&self, w: &i64) -> Self::Output {
473+
self.as_ref().map(|x| (*x * Decimal::from(*w)))
474+
}
475+
}
476+
440477
impl MulByRef<i64> for Option<i32> {
441478
type Output = Self;
442479

docs/docs/sql/aggregates.md

Lines changed: 113 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -68,3 +68,116 @@ aggregate function.
6868
<td>Same as <code>EVERY</code></td>
6969
</tr>
7070
</table>
71+
72+
If `FILTER` is specified, then only the input rows for which the
73+
filter_clause evaluates to true are fed to the aggregate function;
74+
other rows are discarded. For example:
75+
76+
```sql
77+
SELECT
78+
count(*) AS unfiltered,
79+
count(*) FILTER (WHERE i < 5) AS filtered
80+
FROM TABLE
81+
```
82+
83+
## Pivots
84+
85+
The SQL `PIVOT` operation can be used to turn rows into columns. It
86+
usually replaces a `GROUP-BY` operation when the group keys are known
87+
in advance. Instead of producing one row for each group, `PIVOT` can
88+
produce one *column* for each group.
89+
90+
### Syntax
91+
92+
```
93+
PIVOT ( { aggregate_expression [ AS aggregate_expression_alias ] } [ , ... ]
94+
FOR column_with_data IN ( column_list ) )
95+
```
96+
97+
### Parameters
98+
99+
- aggregate_expression
100+
Specifies an aggregate expression (`SUM`, `COUNT(DISTINCT )`, etc.).
101+
102+
- aggregate_expression_alias
103+
Specifies a column name for the aggregate expression.
104+
105+
- column_with_data
106+
A column that produces all the values that will become new
107+
column names.
108+
109+
- column_list
110+
Columns that show the pivoted data.
111+
112+
### Example
113+
114+
```sql
115+
CREATE TABLE FURNITURE (
116+
type VARCHAR,
117+
year INTEGER,
118+
count INTEGER
119+
);
120+
INSERT INTO FURNITURE VALUES
121+
('chair', 2020, 4),
122+
('table', 2021, 3),
123+
('chair', 2021, 4),
124+
('desk', 2023, 1),
125+
('table', 2023, 2);
126+
127+
SELECT year, type, SUM(count) FROM FURNITURE GROUP BY year,type;
128+
year | type | sum
129+
-------------------
130+
2020 | chair | 4
131+
2021 | table | 3
132+
2021 | chair | 4
133+
2023 | desk | 1
134+
2023 | table | 2
135+
(5 rows)
136+
137+
SELECT * FROM FURNITURE
138+
PIVOT (
139+
SUM(count) AS ct
140+
FOR type IN ('desk' AS desks, 'table' AS tables, 'chair' as chairs)
141+
);
142+
143+
year | desks | tables | chairs
144+
------------------------------
145+
2020 | | | 4
146+
2021 | | 3 | 4
147+
2023 | 1 | 2 |
148+
(3 rows)
149+
```
150+
151+
Notice how the same information is presented in a tabular form where
152+
we have a column for each type of object. PIVOTs require all the
153+
possible "type"s to be specified when the query is written. Notice
154+
that if we add an additional type, the `GROUP BY` query will produce a
155+
correct result, while the `PIVOT` query will produce the same result.
156+
157+
```sql
158+
INSERT INTO FURNITURE VALUES ('bed', 2020, 5);
159+
SELECT year, type, SUM(count) FROM FURNITURE GROUP BY year,type;
160+
year | type | sum
161+
-------------------
162+
2020 | chair | 4
163+
2020 | bed | 5
164+
2021 | table | 3
165+
2021 | chair | 4
166+
2023 | desk | 1
167+
2023 | table | 2
168+
(6 rows)
169+
170+
SELECT * FROM FURNITURE
171+
PIVOT (
172+
SUM(count) AS ct
173+
FOR type IN ('desk' AS desks, 'table' AS tables, 'chair' as chairs)
174+
);
175+
176+
year | desks | tables | chairs
177+
------------------------------
178+
2020 | | | 4
179+
2021 | | 3 | 4
180+
2023 | 1 | 2 |
181+
(3 rows)
182+
```
183+

docs/docs/sql/string.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@ strings of any length.
1717
Trailing spaces are removed when converting a character value to one
1818
of the other string types. Note that trailing spaces are semantically
1919
significant in character varying and text values, and when using
20-
pattern matching, that is LIKE and regular expressions.
20+
pattern matching (e.g., LIKE and regular expressions).
2121

2222
## String constants (literals)
2323

docs/docs/sql/types.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -24,6 +24,7 @@ The compiler supports the following SQL data types:
2424
- `INTERVAL`, a SQL interval. Two types of intervals are supported:
2525
long intervals (comprising years and months), and short intervals,
2626
comprising days, hours, minutes, seconds.
27+
- `TIME`, the time of the day, with a precision of nanoseconds.
2728
- `TIMESTAMP`, a SQL timestamp without a timezone. A timestamp
2829
represents a value containing a date and a time, with a precision up
2930
to a millisecond.

sql-to-dbsp-compiler/SQL-compiler/src/main/java/org/dbsp/sqlCompiler/compiler/backend/DBSPCompiler.java

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -229,21 +229,25 @@ private void compileInternal(String statements, boolean many, @Nullable String c
229229
} catch (SqlParseException e) {
230230
this.messages.reportError(e);
231231
if (this.options.optimizerOptions.throwOnError) {
232+
System.err.println(this.messages.toString());
232233
throw new RuntimeException(e);
233234
}
234235
} catch (CalciteContextException e) {
235236
this.messages.reportError(e);
236237
if (this.options.optimizerOptions.throwOnError) {
238+
System.err.println(this.messages.toString());
237239
throw new RuntimeException(e);
238240
}
239241
} catch (BaseCompilerException e) {
240242
this.messages.reportError(e);
241243
if (this.options.optimizerOptions.throwOnError) {
244+
System.err.println(this.messages.toString());
242245
throw new RuntimeException(e);
243246
}
244247
} catch (Throwable e) {
245248
this.messages.reportError(e);
246249
if (this.options.optimizerOptions.throwOnError) {
250+
System.err.println(this.messages.toString());
247251
throw new RuntimeException(e);
248252
}
249253
}

0 commit comments

Comments
 (0)