Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 23 additions & 0 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,29 @@ it on as an open-source patch. The e-mail address used to sign must match the e
author. If you set your `user.name` and `user.email` git config values, you can sign your commit automatically
with `git commit -s`.

## Dependencies

We develop and test under Linux. Windows Subsystem for Linux works fine.
The setup can be seen in this [Earthfile](Earthfile).

Our known dependencies are:
- Runtime
- a Rust tool chain (install rustup and the default toolchain)
- this will need a C and C++ compiler installed (e.g., gcc, gcc++)
- cmake
- libdev-ssl
- SQL Compiler
- a Java Virtual Machine (at least Java 8)
- maven
- graphviz
- Cloud and UI
- Python 3
- typescript
- Redpanda or Kafka

Additional dependencies are automatically installed by the Rust,
maven, Python, and typescript build tools.

## Contribution Flow

### Forking
Expand Down
8 changes: 7 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,11 @@

[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
[![CI workflow](https://github.com/feldera/dbsp/actions/workflows/ci.yml/badge.svg)](https://github.com/feldera/dbsp/actions)
<!--
[![codecov](https://codecov.io/gh/feldera/dbsp/branch/main/graph/badge.svg?token=0wZcmD11gt)](https://codecov.io/gh/feldera/dbsp)
-->
[![nightly](https://github.com/feldera/dbsp/actions/workflows/containers.yml/badge.svg)](https://github.com/feldera/dbsp/actions/workflows/containers.yml)


The [Feldera Continuous Analytics Platform](https://www.feldera.com/), or Feldera in short, is a
fast computational engine for *continuous analytics* over data in-motion. It
allows users to build data pipelines as SQL programs that are continuously
Expand Down Expand Up @@ -92,6 +93,11 @@ To learn more about Feldera, we recommend going through the [documentation](http
* [SQL reference](https://docs.feldera.io/docs/sql/intro)
* [API reference](https://docs.feldera.io/docs/api/rest/)

## Contributing

Most of the software in this repository is governed by an open-source license.
We welcome contributions. Here are some [guidelines](CONTRIBUTING.md).

## Theory

Feldera achieves its objectives by building on a solid mathematical
Expand Down
1 change: 1 addition & 0 deletions crates/dbsp/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,7 @@ uuid = { version = "1.1.2", features = ["v4"], optional = true }
arc-swap = "1.5.1"
mimalloc-rust-sys = "1.7.2"
rand = "0.8.5"
rust_decimal = "1.29"

[dependencies.size-of]
version = "0.1.5"
Expand Down
37 changes: 37 additions & 0 deletions crates/dbsp/src/algebra/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ pub use order::{PartialOrder, TotalOrder};
pub use present::Present;
pub use zset::{IndexedZSet, ZSet};

use rust_decimal::Decimal;
use size_of::SizeOf;
use std::{
marker::PhantomData,
Expand Down Expand Up @@ -327,6 +328,24 @@ impl MulByRef<isize> for F64 {
}
}

impl MulByRef<isize> for Decimal {
type Output = Self;

#[inline]
fn mul_by_ref(&self, w: &isize) -> Self::Output {
*self * Decimal::from(*w)
}
}

impl MulByRef<isize> for Option<Decimal> {
type Output = Self;

#[inline]
fn mul_by_ref(&self, w: &isize) -> Self::Output {
self.as_ref().map(|x| (*x * Decimal::from(*w)))
}
}

impl MulByRef<isize> for Option<i32> {
type Output = Self;

Expand Down Expand Up @@ -437,6 +456,24 @@ impl MulByRef<i64> for F64 {
}
}

impl MulByRef<i64> for Decimal {
type Output = Self;

#[inline]
fn mul_by_ref(&self, w: &i64) -> Self::Output {
*self * Decimal::from(*w)
}
}

impl MulByRef<i64> for Option<Decimal> {
type Output = Self;

#[inline]
fn mul_by_ref(&self, w: &i64) -> Self::Output {
self.as_ref().map(|x| (*x * Decimal::from(*w)))
}
}

impl MulByRef<i64> for Option<i32> {
type Output = Self;

Expand Down
113 changes: 113 additions & 0 deletions docs/docs/sql/aggregates.md
Original file line number Diff line number Diff line change
Expand Up @@ -68,3 +68,116 @@ aggregate function.
<td>Same as <code>EVERY</code></td>
</tr>
</table>

If `FILTER` is specified, then only the input rows for which the
filter_clause evaluates to true are fed to the aggregate function;
other rows are discarded. For example:

```sql
SELECT
count(*) AS unfiltered,
count(*) FILTER (WHERE i < 5) AS filtered
FROM TABLE
```

## Pivots

The SQL `PIVOT` operation can be used to turn rows into columns. It
usually replaces a `GROUP-BY` operation when the group keys are known
in advance. Instead of producing one row for each group, `PIVOT` can
produce one *column* for each group.

### Syntax

```
PIVOT ( { aggregate_expression [ AS aggregate_expression_alias ] } [ , ... ]
FOR column_with_data IN ( column_list ) )
```

### Parameters

- aggregate_expression
Specifies an aggregate expression (`SUM`, `COUNT(DISTINCT )`, etc.).

- aggregate_expression_alias
Specifies a column name for the aggregate expression.

- column_with_data
A column that produces all the values that will become new
column names.

- column_list
Columns that show the pivoted data.

### Example

```sql
CREATE TABLE FURNITURE (
type VARCHAR,
year INTEGER,
count INTEGER
);
INSERT INTO FURNITURE VALUES
('chair', 2020, 4),
('table', 2021, 3),
('chair', 2021, 4),
('desk', 2023, 1),
('table', 2023, 2);

SELECT year, type, SUM(count) FROM FURNITURE GROUP BY year,type;
year | type | sum
-------------------
2020 | chair | 4
2021 | table | 3
2021 | chair | 4
2023 | desk | 1
2023 | table | 2
(5 rows)

SELECT * FROM FURNITURE
PIVOT (
SUM(count) AS ct
FOR type IN ('desk' AS desks, 'table' AS tables, 'chair' as chairs)
);

year | desks | tables | chairs
------------------------------
2020 | | | 4
2021 | | 3 | 4
2023 | 1 | 2 |
(3 rows)
```

Notice how the same information is presented in a tabular form where
we have a column for each type of object. PIVOTs require all the
possible "type"s to be specified when the query is written. Notice
that if we add an additional type, the `GROUP BY` query will produce a
correct result, while the `PIVOT` query will produce the same result.

```sql
INSERT INTO FURNITURE VALUES ('bed', 2020, 5);
SELECT year, type, SUM(count) FROM FURNITURE GROUP BY year,type;
year | type | sum
-------------------
2020 | chair | 4
2020 | bed | 5
2021 | table | 3
2021 | chair | 4
2023 | desk | 1
2023 | table | 2
(6 rows)

SELECT * FROM FURNITURE
PIVOT (
SUM(count) AS ct
FOR type IN ('desk' AS desks, 'table' AS tables, 'chair' as chairs)
);

year | desks | tables | chairs
------------------------------
2020 | | | 4
2021 | | 3 | 4
2023 | 1 | 2 |
(3 rows)
```

2 changes: 1 addition & 1 deletion docs/docs/sql/string.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ strings of any length.
Trailing spaces are removed when converting a character value to one
of the other string types. Note that trailing spaces are semantically
significant in character varying and text values, and when using
pattern matching, that is LIKE and regular expressions.
pattern matching (e.g., LIKE and regular expressions).

## String constants (literals)

Expand Down
1 change: 1 addition & 0 deletions docs/docs/sql/types.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ The compiler supports the following SQL data types:
- `INTERVAL`, a SQL interval. Two types of intervals are supported:
long intervals (comprising years and months), and short intervals,
comprising days, hours, minutes, seconds.
- `TIME`, the time of the day, with a precision of nanoseconds.
- `TIMESTAMP`, a SQL timestamp without a timezone. A timestamp
represents a value containing a date and a time, with a precision up
to a millisecond.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -229,21 +229,25 @@ private void compileInternal(String statements, boolean many, @Nullable String c
} catch (SqlParseException e) {
this.messages.reportError(e);
if (this.options.optimizerOptions.throwOnError) {
System.err.println(this.messages.toString());
throw new RuntimeException(e);
}
} catch (CalciteContextException e) {
this.messages.reportError(e);
if (this.options.optimizerOptions.throwOnError) {
System.err.println(this.messages.toString());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be logging these instead of reporting to stderr?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually just for debugging. The 'throwOnError' flag is only designed for development.
The effect of this is to print in the IDE console some much nicer errors in addition to a stack trace:

(no input file): Error parsing SQL
(no input file):23:17: error: Error parsing SQL: Encountered "," at line 23, column 17.
Was expecting:
    "IN" ...
    
   23|        FOR name, age) IN (('John', 30) AS c1, ('Mike', 40) AS c2)
                      ^^

throw new RuntimeException(e);
}
} catch (BaseCompilerException e) {
this.messages.reportError(e);
if (this.options.optimizerOptions.throwOnError) {
System.err.println(this.messages.toString());
throw new RuntimeException(e);
}
} catch (Throwable e) {
this.messages.reportError(e);
if (this.options.optimizerOptions.throwOnError) {
System.err.println(this.messages.toString());
throw new RuntimeException(e);
}
}
Expand Down
Loading