Skip to content

[SQL] Implementation for user-defined functions#1129

Merged
mihaibudiu merged 6 commits into
mainfrom
udf
Dec 14, 2023
Merged

[SQL] Implementation for user-defined functions#1129
mihaibudiu merged 6 commits into
mainfrom
udf

Conversation

@mihaibudiu
Copy link
Copy Markdown
Contributor

Is this a user-visible change (yes/no): yes

Signed-off-by: Mihai Budiu <mbudiu@feldera.com>
Signed-off-by: Mihai Budiu <mbudiu@feldera.com>
Signed-off-by: Mihai Budiu <mbudiu@feldera.com>
Signed-off-by: Mihai Budiu <mbudiu@feldera.com>
Signed-off-by: Mihai Budiu <mbudiu@feldera.com>
@mihaibudiu mihaibudiu added the User-facing For PRs that lead to Feldera-user visible changes label Dec 12, 2023
@mihaibudiu
Copy link
Copy Markdown
Contributor Author

This solution is not 100% because it doesn't allow people to specify additional Rust libraries they can use in their code.
For that we will need to make some tweaks, like supplying a directory with a Cargo.toml and a rust file file instead of a rust file with the --udf command-line option.

this.udf = new HashMap<>();
}

static class RlikeFunction extends SqlFunction {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The static classes here were moved from CalciteCompiler.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we be attributing them?

@mihaibudiu
Copy link
Copy Markdown
Contributor Author

@ryzhyr please review only the rust api and the documentation

Copy link
Copy Markdown
Contributor

@ryzhyk ryzhyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great stuff.

We will need to discuss with @lalithsuresh how to best integrate this with the API.

I think we shouldn't publish doc changes until then, since there is now way for users to provide those Rust files to Feldera or to invoke the compiler directly.

Comment thread docs/sql/grammar.md
: type [NOT NULL]

createFunctionStatement
: CREATE FUNCTION name '(' [ columnDecl [, columnDecl ]* ] ')' RETURNS generalType
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are arguments immutable, so these are pure functions?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In SQL everything is immutable

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see other dialects have additional attributes like the language type etc, but those look like they'd be easy to add over time.

Comment thread docs/sql/udf.md
standard Rust types that the compiler uses to implement SQL datatypes.
The next section explains what these types are.

Notice the function implemented has an all-capitals name (which is not
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems repetitive. Didn't you explain this above?

Comment thread docs/sql/udf.md Outdated
a standard convention for Rust), dictated by the default SQL
capitalization rules.

There is no type casting or type inference performed for the function
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a limitation of the current implementation or is this standard for SQL?

Comment thread docs/sql/udf.md Outdated
The SQL family of `INTERVAL` types translates to one of two Rust
types: `ShortInterval` (representing intervals from days to seconds),
and `LongInterval` (representing intervals from years to months).
(Our dialect of SQL does allow mixing the two kinds of intervals in a
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you mean does not?

Comment thread docs/sql/udf.md Outdated

In the Rust implementation the function always has to return the type
`Result<T, Box<dyn std::error::Error>>`, where `T` is the Rust
equivalent expected return type of the SQL function. The Rust
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OF THE expected?

Comment thread docs/sql/udf.md
equivalent expected return type of the SQL function. The Rust
function should return an `Err` only when the function fails at
runtime; in this case the returned error can use the source position
information to indicate where the error has originated in the code.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens to the program when the function returns an error?

Comment thread docs/sql/udf.md Outdated

The first argument passed to the Rust function is always `pos:
&SourcePositionRange`. This argument indicates where in the source
position is placed the code that generated the call to this
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo?

}
}

pub fn st_distance_geopoint_geopoint(left: GeoPoint, right: GeoPoint) -> F64 {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to pass these arguments by value?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Three more functions are generated by a macro, with signatures like
st_distance_geopointN_geopoint(left: Option<GeoPoint>, right: GeoPoint) -> Option<F64>. The signature of the other functions is related to this one. They all also call this function.

@@ -0,0 +1,9 @@
//! Crate which just re-exports all the types that are used to implement
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a module, not a crate. But more importantly I'm not sure it's useful. Anything that needs to be exported should be exported from lib.rs using pub use ....

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I want is people who use this stuff to only write one use line.
If there's a simple way to do this, great.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I don't see how this file helps with this.

@mihaibudiu
Copy link
Copy Markdown
Contributor Author

I think we shouldn't publish doc changes until then, since there is now way for users to provide those Rust files to Feldera or to invoke the compiler directly.

There are multiple layers of documentation, and I do need to document the grammar and the compiler flags. I think this is fine. if we postpone documentation we forget about it.

Signed-off-by: Mihai Budiu <mbudiu@feldera.com>
@mihaibudiu
Copy link
Copy Markdown
Contributor Author

We should move this discussion to #1134

@mihaibudiu mihaibudiu added the SQL compiler Related to the SQL compiler label Dec 14, 2023
Comment thread docs/sql/grammar.md
: type [NOT NULL]

createFunctionStatement
: CREATE FUNCTION name '(' [ columnDecl [, columnDecl ]* ] ')' RETURNS generalType
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see other dialects have additional attributes like the language type etc, but those look like they'd be easy to add over time.

this.udf = new HashMap<>();
}

static class RlikeFunction extends SqlFunction {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we be attributing them?

@mihaibudiu mihaibudiu merged commit e28181d into main Dec 14, 2023
@mihaibudiu mihaibudiu deleted the udf branch December 14, 2023 01:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

SQL compiler Related to the SQL compiler User-facing For PRs that lead to Feldera-user visible changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants