You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: python/felderize/spark/data/skills/spark_skills.md
+32-5Lines changed: 32 additions & 5 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -255,7 +255,32 @@ Results are always `VARIANT` — cast to get a concrete type: `CAST(v['age'] AS
255
255
|`json_array_length(s)`|`CARDINALITY(CAST(PARSE_JSON(s) AS VARIANT ARRAY))`||
256
256
257
257
Notes:
258
-
- Parse once, reuse: `SELECT PARSE_JSON(col) AS v, CAST(v['key'] AS VARCHAR) AS k` — lateral column aliases supported.
258
+
-**Parse once, reuse — for simple SELECT (no GROUP BY).** When extracting multiple fields from the same JSON column without aggregation, call `PARSE_JSON` once as a lateral alias:
259
+
260
+
```sql
261
+
-- Simple SELECT: parse once, reuse alias
262
+
SELECT
263
+
PARSE_JSON(payload) AS v,
264
+
CAST(v['user_id'] ASVARCHAR) AS user_id,
265
+
CAST(v['amount'] AS DOUBLE) AS amount,
266
+
CAST(v['currency'] ASVARCHAR) AS currency
267
+
FROM raw_events;
268
+
```
269
+
270
+
-**With GROUP BY: use a CTE to pre-parse.** The lateral alias pattern breaks with GROUP BY because `payload` must be in the GROUP BY or an aggregate. Pre-parse in a CTE instead. The CTE goes *inside*`CREATE VIEW ... AS`, not before it:
271
+
272
+
```sql
273
+
CREATEVIEWsummaryAS
274
+
WITH parsed AS (
275
+
SELECT*, PARSE_JSON(payload) AS v FROM raw_events
276
+
)
277
+
SELECT
278
+
CAST(v['user_id'] ASVARCHAR) AS user_id,
279
+
COUNT(*) AS cnt
280
+
FROM parsed
281
+
GROUP BY CAST(v['user_id'] ASVARCHAR);
282
+
```
283
+
259
284
- For performance-critical paths, `CREATE TYPE` + `jsonstring_as_<type>` is more efficient than `PARSE_JSON` + bracket access.
260
285
261
286
#### Math functions
@@ -590,10 +615,12 @@ SELECT id, UPPER(name) AS name, SOUNDEX(name) AS sound FROM users
590
615
591
616
#### Bitwise (scalar)
592
617
593
-
| Function | Notes |
594
-
|----------|-------|
595
-
|`bit_and(a, b)`, `bit_or(a, b)`, `bit_xor(a, b)`| Scalar 2-arg form not supported; only aggregate `BIT_AND/OR/XOR(col)` exists |
596
-
|`shiftleft`, `shiftright`, `shiftrightunsigned`| No bitwise shift syntax in Feldera |
618
+
| Spark | Feldera | Notes |
619
+
|-------|---------|-------|
620
+
|`bit_and(a, b)`|`a & b`| Use bitwise operator |
621
+
|`bit_or(a, b)`|`a \| b`| Use bitwise operator |
622
+
|`bit_xor(a, b)`|`a ^ b`| Use bitwise operator |
623
+
|`shiftleft`, `shiftright`, `shiftrightunsigned`| — | No bitwise shift syntax in Feldera |
0 commit comments