[](https://gitter.im/locationtech/rasterframes?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
@@ -6,7 +6,7 @@ RasterFrames® brings together Earth-observation (EO) data access, cloud computi
RasterFrames provides a DataFrame-centric view over arbitrary raster data, enabling spatiotemporal queries, map algebra raster operations, and compatibility with the ecosystem of Spark ML algorithms. By using DataFrames as the core cognitive and compute data model, it is able to deliver these features in a form that is both accessible to general analysts and scalable along with the rapidly growing data footprint.
-
+
Please see the [Getting Started](http://rasterframes.io/getting-started.html) section of the Users' Manual to start using RasterFrames.
@@ -17,7 +17,6 @@ Please see the [Getting Started](http://rasterframes.io/getting-started.html) se
* [Gitter Channel](https://gitter.im/locationtech/rasterframes)
* [Submit an Issue](https://github.com/locationtech/rasterframes/issues)
-
## Contributing
Community contributions are always welcome. To get started, please review our [contribution guidelines](https://github.com/locationtech/rasterframes/blob/develop/CONTRIBUTING.md), [code of conduct](https://github.com/locationtech/rasterframes/blob/develop/CODE_OF_CONDUCT.md), and reach out to us on [gitter](https://gitter.im/locationtech/rasterframes) so the community can help you get started!
@@ -62,6 +61,11 @@ Additional, Python specific build instructions may be found at [pyrasterframes/sr
## Copyright and License
-RasterFrames is released under the Apache 2.0 License, copyright Astraea, Inc. 2017-2019.
+RasterFrames is released under the commercial-friendly Apache 2.0 License, copyright Astraea, Inc. 2017-2021.
+
+## Commercial Support
+
+As the sponsors and developers of RasterFrames, [Astraea, Inc.](https://astraea.earth/) is uniquely positioned to expand its capabilities. If you need additional functionality or just some architectural guidance to get your project off to the right start, we can provide a full range of [consulting and development services](https://astraea.earth/services/) around RasterFrames. We can be reached at [info@astraea.io](mailto:info@astraea.io).
+
diff --git a/RELEASE.md b/RELEASE.md
new file mode 100644
index 000000000..99337c8c8
--- /dev/null
+++ b/RELEASE.md
@@ -0,0 +1,23 @@
+# RasterFrames Release Process
+
+1. Make sure `release-notes.md` is updated.
+2. Use `git flow release start x.y.z` to create release branch.
+3. Manually edit `version.sbt` and `version.py` to set value to `x.y.z` and commit changes.
+4. Do `docker login` if necessary.
+5. `sbt` shell commands:
+ a. `clean`
+ b. `test it:test`
+ c. `makeSite`
+ d. `publishSigned` (LocationTech credentials required)
+ e. `sonatypeReleaseAll`. It can take a while, but should eventually show up [here](https://search.maven.org/search?q=g:org.locationtech.rasterframes).
+ f. `docs/ghpagesPushSite`
+ g. `rf-notebook/publish`
+6. `cd pyrasterframes/target/python/dist`
+7. `python3 -m twine upload pyrasterframes-x.y.z-py2.py3-none-any.whl`
+8. Commit any changes that were necessary.
+9. `git flow release finish x.y.z`. Make sure to push tags, develop and master
+ branches.
+10. On `develop`, update `version.sbt` and `version.py` to next development
+ version (`x.y.(z+1)-SNAPSHOT` and `x.y.(z+1).dev0`). Commit and push.
+11. In GitHub, create a new release with the created tag. Copy relevant
+ section of release notes into the description.
diff --git a/bench/build.sbt b/bench/build.sbt
index 36eb61323..e01ea663d 100644
--- a/bench/build.sbt
+++ b/bench/build.sbt
@@ -11,7 +11,7 @@ libraryDependencies ++= Seq(
jmhIterations := Some(5)
jmhWarmupIterations := Some(8)
jmhTimeUnit := None
-javaOptions in Jmh := Seq("-Xmx4g")
+Jmh / javaOptions := Seq("-Xmx4g")
// To enable profiling:
// jmhExtraOptions := Some("-prof jmh.extras.JFR")
diff --git a/bench/src/main/scala/org/locationtech/rasterframes/bench/CatalystSerializerBench.scala b/bench/src/main/scala/org/locationtech/rasterframes/bench/CatalystSerializerBench.scala
deleted file mode 100644
index 12a6b0486..000000000
--- a/bench/src/main/scala/org/locationtech/rasterframes/bench/CatalystSerializerBench.scala
+++ /dev/null
@@ -1,92 +0,0 @@
-/*
- * This software is licensed under the Apache 2 license, quoted below.
- *
- * Copyright 2019 Astraea, Inc.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"); you may not
- * use this file except in compliance with the License. You may obtain a copy of
- * the License at
- *
- * [http://www.apache.org/licenses/LICENSE-2.0]
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
- * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
- * License for the specific language governing permissions and limitations under
- * the License.
- *
- * SPDX-License-Identifier: Apache-2.0
- *
- */
-
-package org.locationtech.rasterframes.bench
-
-import java.util.concurrent.TimeUnit
-
-import geotrellis.proj4.{CRS, LatLng, Sinusoidal}
-import org.apache.spark.sql.Row
-import org.apache.spark.sql.catalyst.InternalRow
-import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
-import org.locationtech.rasterframes.encoders.{CatalystSerializer, StandardEncoders}
-import org.openjdk.jmh.annotations._
-
-@BenchmarkMode(Array(Mode.AverageTime))
-@State(Scope.Benchmark)
-@OutputTimeUnit(TimeUnit.MICROSECONDS)
-class CatalystSerializerBench extends SparkEnv {
-
- val serde = CatalystSerializer[CRS]
-
- val epsg: CRS = LatLng
- val epsgEnc: Row = serde.toRow(epsg)
- val proj4: CRS = Sinusoidal
- val proj4Enc: Row = serde.toRow(proj4)
-
- var crsEnc: ExpressionEncoder[CRS] = _
-
- @Setup(Level.Trial)
- def setupData(): Unit = {
- crsEnc = StandardEncoders.crsEncoder.resolveAndBind()
- }
-
- @Benchmark
- def encodeEpsg(): Row = {
- serde.toRow(epsg)
- }
-
- @Benchmark
- def encodeProj4(): Row = {
- serde.toRow(proj4)
- }
-
- @Benchmark
- def decodeEpsg(): CRS = {
- serde.fromRow(epsgEnc)
- }
-
- @Benchmark
- def decodeProj4(): CRS = {
- serde.fromRow(proj4Enc)
- }
-
- @Benchmark
- def exprEncodeEpsg(): InternalRow = {
- crsEnc.toRow(epsg)
- }
-
- @Benchmark
- def exprEncodeProj4(): InternalRow = {
- crsEnc.toRow(proj4)
- }
-
-// @Benchmark
-// def exprDecodeEpsg(): CRS = {
-//
-// }
-//
-// @Benchmark
-// def exprDecodeProj4(): CRS = {
-//
-// }
-
-}
diff --git a/bench/src/main/scala/org/locationtech/rasterframes/bench/CellTypeBench.scala b/bench/src/main/scala/org/locationtech/rasterframes/bench/CellTypeBench.scala
index dfc88f855..3a4d9f3f1 100644
--- a/bench/src/main/scala/org/locationtech/rasterframes/bench/CellTypeBench.scala
+++ b/bench/src/main/scala/org/locationtech/rasterframes/bench/CellTypeBench.scala
@@ -21,10 +21,9 @@
package org.locationtech.rasterframes.bench
import java.util.concurrent.TimeUnit
-
import geotrellis.raster.{CellType, DoubleUserDefinedNoDataCellType, IntUserDefinedNoDataCellType}
import org.apache.spark.sql.catalyst.InternalRow
-import org.locationtech.rasterframes.encoders.CatalystSerializer._
+import org.locationtech.rasterframes.encoders.StandardEncoders
import org.openjdk.jmh.annotations._
@BenchmarkMode(Array(Mode.AverageTime))
@@ -37,16 +36,12 @@ class CellTypeBench {
def setupData(): Unit = {
ct = IntUserDefinedNoDataCellType(scala.util.Random.nextInt())
val o: CellType = DoubleUserDefinedNoDataCellType(scala.util.Random.nextDouble())
- row = o.toInternalRow
+ row = StandardEncoders.cellTypeEncoder.createSerializer()(o)
}
@Benchmark
- def fromRow(): CellType = {
- row.to[CellType]
- }
+ def fromRow(): CellType = StandardEncoders.cellTypeEncoder.createDeserializer()(row)
@Benchmark
- def intoRow(): InternalRow = {
- ct.toInternalRow
- }
+ def intoRow(): InternalRow = StandardEncoders.cellTypeEncoder.createSerializer()(ct)
}
diff --git a/bench/src/main/scala/org/locationtech/rasterframes/bench/RasterRefBench.scala b/bench/src/main/scala/org/locationtech/rasterframes/bench/RasterRefBench.scala
index 448fab9c3..c7e36d985 100644
--- a/bench/src/main/scala/org/locationtech/rasterframes/bench/RasterRefBench.scala
+++ b/bench/src/main/scala/org/locationtech/rasterframes/bench/RasterRefBench.scala
@@ -28,8 +28,7 @@ import org.apache.spark.sql._
import org.locationtech.rasterframes._
import org.locationtech.rasterframes.expressions.generators.RasterSourceToRasterRefs
import org.locationtech.rasterframes.expressions.transformers.RasterRefToTile
-import org.locationtech.rasterframes.model.TileDimensions
-import org.locationtech.rasterframes.ref.RasterSource
+import org.locationtech.rasterframes.ref.RFRasterSource
import org.openjdk.jmh.annotations._
@BenchmarkMode(Array(Mode.AverageTime))
@@ -43,11 +42,11 @@ class RasterRefBench extends SparkEnv with LazyLogging {
@Setup(Level.Trial)
def setupData(): Unit = {
- val r1 = RasterSource(remoteCOGSingleband1)
- val r2 = RasterSource(remoteCOGSingleband2)
+ val r1 = RFRasterSource(remoteCOGSingleband1)
+ val r2 = RFRasterSource(remoteCOGSingleband2)
singleDF = Seq((r1, r2)).toDF("B1", "B2")
- .select(RasterRefToTile(RasterSourceToRasterRefs(Some(TileDimensions(r1.dimensions)), Seq(0), $"B1", $"B2")))
+ .select(RasterRefToTile(RasterSourceToRasterRefs(Some(r1.dimensions), Seq(0), $"B1", $"B2")))
expandedDF = Seq((r1, r2)).toDF("B1", "B2")
.select(RasterRefToTile(RasterSourceToRasterRefs($"B1", $"B2")))
diff --git a/bench/src/main/scala/org/locationtech/rasterframes/bench/TileCellScanBench.scala b/bench/src/main/scala/org/locationtech/rasterframes/bench/TileCellScanBench.scala
index 350ac811a..737e0c9b2 100644
--- a/bench/src/main/scala/org/locationtech/rasterframes/bench/TileCellScanBench.scala
+++ b/bench/src/main/scala/org/locationtech/rasterframes/bench/TileCellScanBench.scala
@@ -23,9 +23,9 @@ package org.locationtech.rasterframes.bench
import java.util.concurrent.TimeUnit
+import geotrellis.raster.Dimensions
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.rf.TileUDT
-import org.locationtech.rasterframes.tiles.InternalRowTile
import org.openjdk.jmh.annotations._
@BenchmarkMode(Array(Mode.AverageTime))
@@ -56,20 +56,9 @@ class TileCellScanBench extends SparkEnv {
@Benchmark
def deserializeRead(): Double = {
val tile = TileType.deserialize(tileRow)
- val (cols, rows) = tile.dimensions
- tile.getDouble(cols - 1, rows - 1) +
- tile.getDouble(cols/2, rows/2) +
- tile.getDouble(0, 0)
- }
-
- @Benchmark
- def internalRowRead(): Double = {
- val tile = new InternalRowTile(tileRow)
- val cols = tile.cols
- val rows = tile.rows
+ val Dimensions(cols, rows) = tile.dimensions
tile.getDouble(cols - 1, rows - 1) +
tile.getDouble(cols/2, rows/2) +
tile.getDouble(0, 0)
}
}
-
diff --git a/bench/src/main/scala/org/locationtech/rasterframes/bench/TileEncodeBench.scala b/bench/src/main/scala/org/locationtech/rasterframes/bench/TileEncodeBench.scala
index 20e255b06..d49027206 100644
--- a/bench/src/main/scala/org/locationtech/rasterframes/bench/TileEncodeBench.scala
+++ b/bench/src/main/scala/org/locationtech/rasterframes/bench/TileEncodeBench.scala
@@ -24,13 +24,11 @@ package org.locationtech.rasterframes.bench
import java.net.URI
import java.util.concurrent.TimeUnit
-import org.locationtech.rasterframes.ref.RasterRef.RasterRefTile
-import org.locationtech.rasterframes.ref.RasterRef
import geotrellis.raster.Tile
import geotrellis.vector.Extent
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
-import org.locationtech.rasterframes.ref.{RasterRef, RasterSource}
+import org.locationtech.rasterframes.ref.{RasterRef, RFRasterSource}
import org.openjdk.jmh.annotations._
@BenchmarkMode(Array(Mode.AverageTime))
@@ -53,24 +51,23 @@ class TileEncodeBench extends SparkEnv {
@Setup(Level.Trial)
def setupData(): Unit = {
cellTypeName match {
- case "rasterRef" ⇒
+ case "rasterRef" =>
val baseCOG = "https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/149/039/LC08_L1TP_149039_20170411_20170415_01_T1/LC08_L1TP_149039_20170411_20170415_01_T1_B1.TIF"
val extent = Extent(253785.0, 3235185.0, 485115.0, 3471015.0)
- tile = RasterRefTile(RasterRef(RasterSource(URI.create(baseCOG)), 0, Some(extent), None))
- case _ ⇒
+ tile = RasterRef(RFRasterSource(URI.create(baseCOG)), 0, Some(extent), None)
+ case _ =>
tile = randomTile(tileSize, tileSize, cellTypeName)
}
}
@Benchmark
def encode(): InternalRow = {
- tileEncoder.toRow(tile)
+    tileEncoder.createSerializer().apply(tile)
}
@Benchmark
def roundTrip(): Tile = {
- val row = tileEncoder.toRow(tile)
- boundEncoder.fromRow(row)
+ val row = tileEncoder.createSerializer().apply(tile)
+ boundEncoder.createDeserializer().apply(row)
}
}
-
diff --git a/bench/src/main/scala/org/locationtech/rasterframes/bench/package.scala b/bench/src/main/scala/org/locationtech/rasterframes/bench/package.scala
index 65d8ab88f..8296cad37 100644
--- a/bench/src/main/scala/org/locationtech/rasterframes/bench/package.scala
+++ b/bench/src/main/scala/org/locationtech/rasterframes/bench/package.scala
@@ -37,10 +37,10 @@ package object bench {
val cellType = CellType.fromName(cellTypeName)
val tile = ArrayTile.alloc(cellType, cols, rows)
if(cellType.isFloatingPoint) {
- tile.mapDouble(_ ⇒ rnd.nextGaussian())
+ tile.mapDouble(_ => rnd.nextGaussian())
}
else {
- tile.map(_ ⇒ {
+ tile.map(_ => {
var c = NODATA
do {
c = rnd.nextInt(255)
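The `map` body above redraws until the sampled value is not the NODATA sentinel; the same rejection-sampling idea as a hedged Python sketch (the sentinel value here is for illustration — GeoTrellis uses `Int.MinValue` as the `Int` NODATA constant):

```python
import random

NODATA = -2147483648  # illustrative sentinel (Int.MinValue)

def random_cell(rng: random.Random, upper: int = 255) -> int:
    # Redraw until the value is not the NODATA sentinel,
    # mirroring the do/while loop in the Scala benchmark helper.
    c = NODATA
    while c == NODATA:
        c = rng.randrange(upper)
    return c

print(random_cell(random.Random(42)))
```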
diff --git a/build.sbt b/build.sbt
index f941ea060..52c544e8a 100644
--- a/build.sbt
+++ b/build.sbt
@@ -19,6 +19,15 @@
*
*/
+// Leave me and my custom keys alone!
+Global / lintUnusedKeysOnLoad := false
+ThisBuild / versionScheme := Some("semver-spec")
+ThisBuild / dynverVTagPrefix := false
+ThisBuild / dynverSonatypeSnapshots := true
+ThisBuild / publishMavenStyle := true
+ThisBuild / Test / publishArtifact := false
+
+
addCommandAlias("makeSite", "docs/makeSite")
addCommandAlias("previewSite", "docs/previewSite")
addCommandAlias("ghpagesPushSite", "docs/ghpagesPushSite")
@@ -28,19 +37,17 @@ addCommandAlias("console", "datasource/console")
lazy val IntegrationTest = config("it") extend Test
lazy val root = project
- .in(file("."))
.withId("RasterFrames")
- .aggregate(core, datasource, pyrasterframes, experimental)
- .enablePlugins(RFReleasePlugin)
+ .aggregate(core, datasource)
.settings(
- publish / skip := true,
- clean := clean.dependsOn(`rf-notebook`/clean).value
- )
+ publish / skip := true)
lazy val `rf-notebook` = project
.dependsOn(pyrasterframes)
+ .disablePlugins(CiReleasePlugin)
.enablePlugins(RFAssemblyPlugin, DockerPlugin)
- .settings(publish / skip := true)
+ .settings(
+ publish / skip := true)
lazy val core = project
.enablePlugins(BuildInfoPlugin)
@@ -50,43 +57,61 @@ lazy val core = project
.settings(
moduleName := "rasterframes",
libraryDependencies ++= Seq(
+ `slf4j-api`,
shapeless,
+ circe("core").value,
+ circe("generic").value,
+ circe("parser").value,
+ circe("generic-extras").value,
+ frameless excludeAll ExclusionRule(organization = "com.github.mpilquist"),
`jts-core`,
+ `spray-json`,
geomesa("z3").value,
geomesa("spark-jts").value,
- `geotrellis-contrib-vlm`,
- `geotrellis-contrib-gdal`,
spark("core").value % Provided,
spark("mllib").value % Provided,
spark("sql").value % Provided,
- geotrellis("spark").value,
- geotrellis("raster").value,
- geotrellis("s3").value,
+ // TODO: scala-uri brings an outdated simulacrum dep
+ // Fix it in GT
+ geotrellis("spark").value excludeAll ExclusionRule(organization = "com.github.mpilquist"),
+ geotrellis("raster").value excludeAll ExclusionRule(organization = "com.github.mpilquist"),
+ geotrellis("s3").value excludeAll ExclusionRule(organization = "com.github.mpilquist"),
geotrellis("spark-testkit").value % Test excludeAll (
ExclusionRule(organization = "org.scalastic"),
- ExclusionRule(organization = "org.scalatest")
+ ExclusionRule(organization = "org.scalatest"),
+ ExclusionRule(organization = "com.github.mpilquist")
),
scaffeine,
- scalatest
+ sparktestingbase().value % Test excludeAll ExclusionRule("org.scala-lang.modules", "scala-xml_2.12"),
+ `scala-logging`
),
+ libraryDependencies ++= {
+ val gv = rfGeoTrellisVersion.value
+ if (gv.startsWith("3")) Seq[ModuleID](
+ geotrellis("gdal").value excludeAll ExclusionRule(organization = "com.github.mpilquist"),
+ geotrellis("s3-spark").value excludeAll ExclusionRule(organization = "com.github.mpilquist")
+ )
+ else Seq.empty[ModuleID]
+ },
buildInfoKeys ++= Seq[BuildInfoKey](
- moduleName, version, scalaVersion, sbtVersion, rfGeoTrellisVersion, rfGeoMesaVersion, rfSparkVersion
+ version, scalaVersion, rfGeoTrellisVersion, rfGeoMesaVersion, rfSparkVersion
),
buildInfoPackage := "org.locationtech.rasterframes",
buildInfoObject := "RFBuildInfo",
buildInfoOptions := Seq(
BuildInfoOption.ToMap,
- BuildInfoOption.BuildTime,
BuildInfoOption.ToJson
)
)
lazy val pyrasterframes = project
- .dependsOn(core, datasource, experimental)
+ .dependsOn(core, datasource)
+ .disablePlugins(CiReleasePlugin)
.enablePlugins(RFAssemblyPlugin, PythonBuildPlugin)
.settings(
+ publish / skip := true,
libraryDependencies ++= Seq(
- geotrellis("s3").value,
+ geotrellis("s3").value excludeAll ExclusionRule(organization = "com.github.mpilquist"),
spark("core").value % Provided,
spark("mllib").value % Provided,
spark("sql").value % Provided
@@ -100,16 +125,26 @@ lazy val datasource = project
.settings(
moduleName := "rasterframes-datasource",
libraryDependencies ++= Seq(
- geotrellis("s3").value,
+ compilerPlugin("org.scalamacros" % "paradise" % "2.1.1" cross CrossVersion.full),
+ compilerPlugin("org.typelevel" % "kind-projector" % "0.13.2" cross CrossVersion.full),
+ sttpCatsCe2,
+ stac4s,
+ framelessRefined excludeAll ExclusionRule(organization = "com.github.mpilquist"),
+ geotrellis("s3").value excludeAll ExclusionRule(organization = "com.github.mpilquist"),
spark("core").value % Provided,
spark("mllib").value % Provided,
- spark("sql").value % Provided
+ spark("sql").value % Provided,
+ `better-files`
),
- initialCommands in console := (initialCommands in console).value +
+ Compile / console / scalacOptions ~= { _.filterNot(Set("-Ywarn-unused-import", "-Ywarn-unused:imports")) },
+ Test / console / scalacOptions ~= { _.filterNot(Set("-Ywarn-unused-import", "-Ywarn-unused:imports")) },
+ console / initialCommands := (console / initialCommands).value +
"""
|import org.locationtech.rasterframes.datasource.geotrellis._
|import org.locationtech.rasterframes.datasource.geotiff._
- |""".stripMargin
+ |""".stripMargin,
+ IntegrationTest / fork := true,
+ IntegrationTest / javaOptions := Seq("-Xmx3g -XX:+UseG1GC")
)
lazy val experimental = project
@@ -120,35 +155,36 @@ lazy val experimental = project
.settings(
moduleName := "rasterframes-experimental",
libraryDependencies ++= Seq(
- geotrellis("s3").value,
+ geotrellis("s3").value excludeAll ExclusionRule(organization = "com.github.mpilquist"),
spark("core").value % Provided,
spark("mllib").value % Provided,
spark("sql").value % Provided
),
- fork in IntegrationTest := true,
- javaOptions in IntegrationTest := Seq("-Xmx2G"),
- parallelExecution in IntegrationTest := false
+ IntegrationTest / fork := true,
+ IntegrationTest / javaOptions := (datasource / IntegrationTest / javaOptions).value
)
lazy val docs = project
.dependsOn(core, datasource, pyrasterframes)
+ .disablePlugins(CiReleasePlugin)
.enablePlugins(SiteScaladocPlugin, ParadoxPlugin, ParadoxMaterialThemePlugin, GhpagesPlugin, ScalaUnidocPlugin)
.settings(
- apiURL := Some(url("http://rasterframes.io/latest/api")),
+ publish / skip := true,
+ apiURL := Some(url("https://rasterframes.io/latest/api")),
autoAPIMappings := true,
ghpagesNoJekyll := true,
ScalaUnidoc / siteSubdirName := "latest/api",
paradox / siteSubdirName := ".",
paradoxProperties ++= Map(
"version" -> version.value,
- "scaladoc.org.apache.spark.sql.rf" -> "http://rasterframes.io/latest",
+ "scaladoc.org.apache.spark.sql.rf" -> "https://rasterframes.io/latest",
"github.base_url" -> ""
),
paradoxNavigationExpandDepth := Some(3),
Compile / paradoxMaterialTheme ~= { _
.withRepository(uri("https://github.com/locationtech/rasterframes"))
.withCustomStylesheet("assets/custom.css")
- .withCopyright("""© 2017-2019 Astraea, Inc. All rights reserved.""")
+ .withCopyright("""© 2017-2021 Astraea, Inc. All rights reserved.""")
.withLogo("assets/images/RF-R.svg")
.withFavicon("assets/images/RasterFrames_32x32.ico")
.withColor("blue-grey", "light-blue")
@@ -168,9 +204,7 @@ lazy val docs = project
addMappingsToSiteDir(Compile / paradox / mappings, paradox / siteSubdirName)
)
-//ParadoxMaterialThemePlugin.paradoxMaterialThemeSettings(Paradox)
-
lazy val bench = project
+ .disablePlugins(CiReleasePlugin)
.dependsOn(core % "compile->test")
.settings(publish / skip := true)
-
diff --git a/build/circleci/Dockerfile b/build/circleci/Dockerfile
deleted file mode 100644
index a2356f7b6..000000000
--- a/build/circleci/Dockerfile
+++ /dev/null
@@ -1,81 +0,0 @@
-FROM circleci/openjdk:8-jdk
-
-ENV OPENJPEG_VERSION 2.3.1
-ENV GDAL_VERSION 2.4.1
-ENV JAVA_HOME /usr/lib/jvm/java-8-openjdk-amd64/
-
-# most of these libraries required for
-# python-pip pandoc && pip install setuptools => required for pyrasterframes testing
-RUN sudo apt-get update && \
- sudo apt remove \
- python python-minimal python2.7 python2.7-minimal \
- libpython-stdlib libpython2.7 libpython2.7-minimal libpython2.7-stdlib \
- && sudo apt-get install -y \
- pandoc \
- wget \
- gcc g++ build-essential \
- libreadline-gplv2-dev libncursesw5-dev libssl-dev libsqlite3-dev tk-dev libgdbm-dev libc6-dev libbz2-dev \
- libcurl4-gnutls-dev \
- libproj-dev \
- libgeos-dev \
- libhdf4-alt-dev \
- bash-completion \
- cmake \
- imagemagick \
- libpng-dev \
- libffi-dev \
- && sudo apt autoremove \
- && sudo apt-get clean all
-# && sudo update-alternatives --install /usr/bin/python python /usr/bin/python3 1
-# todo s
-
-RUN cd /tmp && \
- wget https://www.python.org/ftp/python/3.7.4/Python-3.7.4.tgz && \
- tar xzf Python-3.7.4.tgz && \
- cd Python-3.7.4 && \
- ./configure --with-ensurepip=install --prefix=/usr/local --enable-optimization && \
- make && \
- sudo make altinstall && \
- rm -rf Python-3.7.4*
-
-RUN sudo ln -s /usr/local/bin/python3.7 /usr/local/bin/python && \
- sudo curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && \
- sudo python get-pip.py && \
- sudo pip3 install setuptools ipython==6.2.1
-
-# install OpenJPEG
-RUN cd /tmp && \
- wget https://github.com/uclouvain/openjpeg/archive/v${OPENJPEG_VERSION}.tar.gz && \
- tar -xf v${OPENJPEG_VERSION}.tar.gz && \
- cd openjpeg-${OPENJPEG_VERSION}/ && \
- mkdir build && \
- cd build && \
- cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr/local/ && \
- make -j && \
- sudo make install && \
- cd /tmp && rm -Rf v${OPENJPEG_VERSION}.tar.gz openjpeg*
-
-# Compile and install GDAL with Java bindings
-RUN cd /tmp && \
- wget http://download.osgeo.org/gdal/${GDAL_VERSION}/gdal-${GDAL_VERSION}.tar.gz && \
- tar -xf gdal-${GDAL_VERSION}.tar.gz && \
- cd gdal-${GDAL_VERSION} && \
- ./configure \
- --with-curl \
- --with-hdf4 \
- --with-geos \
- --with-geotiff=internal \
- --with-hide-internal-symbols \
- --with-libtiff=internal \
- --with-libz=internal \
- --with-mrf \
- --with-openjpeg \
- --with-threads \
- --without-jp2mrsid \
- --without-netcdf \
- --without-ecw \
- && \
- make -j 8 && \
- sudo make install && \
- sudo ldconfig && \
- cd /tmp && sudo rm -Rf gdal*
diff --git a/build/circleci/Makefile b/build/circleci/Makefile
deleted file mode 100644
index 57cef6b1f..000000000
--- a/build/circleci/Makefile
+++ /dev/null
@@ -1,2 +0,0 @@
-all:
- docker build -t "s22s/rasterframes-circleci:latest" .
diff --git a/build/circleci/README.md b/build/circleci/README.md
deleted file mode 100644
index 6a507cc5f..000000000
--- a/build/circleci/README.md
+++ /dev/null
@@ -1,6 +0,0 @@
-# CircleCI Dockerfile Build file
-
-```bash
-make
-docker push s22s/rasterframes-circleci:latest
-```
diff --git a/core/src/it/resources/log4j.properties b/core/src/it/resources/log4j.properties
index 1135e4b34..94c1d1b92 100644
--- a/core/src/it/resources/log4j.properties
+++ b/core/src/it/resources/log4j.properties
@@ -40,6 +40,8 @@ log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
log4j.logger.org.locationtech.rasterframes=WARN
log4j.logger.org.locationtech.rasterframes.ref=WARN
log4j.logger.org.apache.parquet.hadoop.ParquetRecordReader=OFF
+log4j.logger.geotrellis.spark=INFO
+log4j.logger.geotrellis.raster.gdal=ERROR
# SPARK-9183: Settings to avoid annoying messages when looking up nonexistent UDFs in SparkSQL with Hive support
log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=FATAL
diff --git a/core/src/it/scala/org/locationtech/rasterframes/ref/RasterRefIT.scala b/core/src/it/scala/org/locationtech/rasterframes/ref/RasterRefIT.scala
index 88b5b8617..2e098c008 100644
--- a/core/src/it/scala/org/locationtech/rasterframes/ref/RasterRefIT.scala
+++ b/core/src/it/scala/org/locationtech/rasterframes/ref/RasterRefIT.scala
@@ -30,18 +30,17 @@ import org.locationtech.rasterframes.expressions.aggregates.TileRasterizerAggreg
class RasterRefIT extends TestEnvironment {
describe("practical subregion reads") {
- ignore("should construct a natural color composite") {
+ it("should construct a natural color composite") {
import spark.implicits._
- def scene(idx: Int) = URI.create(s"https://landsat-pds.s3.us-west-2.amazonaws.com" +
- s"/c1/L8/176/039/LC08_L1TP_176039_20190703_20190718_01_T1/LC08_L1TP_176039_20190703_20190718_01_T1_B$idx.TIF")
+ def scene(idx: Int) = TestData.remoteCOGSingleBand(idx)
- val redScene = RasterSource(scene(4))
+ val redScene = RFRasterSource(scene(4))
// [west, south, east, north]
val area = Extent(31.115, 29.963, 31.148, 29.99).reproject(LatLng, redScene.crs)
val red = RasterRef(redScene, 0, Some(area), None)
- val green = RasterRef(RasterSource(scene(3)), 0, Some(area), None)
- val blue = RasterRef(RasterSource(scene(2)), 0, Some(area), None)
+ val green = RasterRef(RFRasterSource(scene(3)), 0, Some(area), None)
+ val blue = RasterRef(RFRasterSource(scene(2)), 0, Some(area), None)
val rf = Seq((red, green, blue)).toDF("red", "green", "blue")
val df = rf.select(
@@ -55,11 +54,11 @@ class RasterRefIT extends TestEnvironment {
stats.get.dataCells shouldBe > (1000L)
}
- //import geotrellis.raster.io.geotiff.{GeoTiffOptions, MultibandGeoTiff, Tiled}
- //import geotrellis.raster.io.geotiff.compression.{DeflateCompression, NoCompression}
- //import geotrellis.raster.io.geotiff.tags.codes.ColorSpace
- //val tiffOptions = GeoTiffOptions(Tiled, DeflateCompression, ColorSpace.RGB)
- //MultibandGeoTiff(raster, raster.crs, tiffOptions).write("target/composite.tif")
+ import geotrellis.raster.io.geotiff.compression.DeflateCompression
+ import geotrellis.raster.io.geotiff.tags.codes.ColorSpace
+ import geotrellis.raster.io.geotiff.{GeoTiffOptions, MultibandGeoTiff, Tiled}
+ val tiffOptions = GeoTiffOptions(Tiled, DeflateCompression, ColorSpace.RGB)
+ MultibandGeoTiff(raster.raster, raster.crs, tiffOptions).write("target/composite.tif")
}
}
}
\ No newline at end of file
diff --git a/core/src/it/scala/org/locationtech/rasterframes/ref/RasterSourceIT.scala b/core/src/it/scala/org/locationtech/rasterframes/ref/RasterSourceIT.scala
index ae8b0b1d4..824bb4094 100644
--- a/core/src/it/scala/org/locationtech/rasterframes/ref/RasterSourceIT.scala
+++ b/core/src/it/scala/org/locationtech/rasterframes/ref/RasterSourceIT.scala
@@ -44,10 +44,10 @@ class RasterSourceIT extends TestEnvironment with TestData {
val bURI = new URI(
"https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/016/034/LC08_L1TP_016034_20181003_20181003_01_RT/LC08_L1TP_016034_20181003_20181003_01_RT_B2.TIF")
val red = time("read B4") {
- RasterSource(rURI).readAll()
+ RFRasterSource(rURI).readAll()
}
val blue = time("read B2") {
- RasterSource(bURI).readAll()
+ RFRasterSource(bURI).readAll()
}
time("test empty") {
red should not be empty
@@ -69,47 +69,47 @@ class RasterSourceIT extends TestEnvironment with TestData {
it("should read JPEG2000 scene") {
- RasterSource(localSentinel).readAll().flatMap(_.tile.statisticsDouble).size should be(64)
+ RFRasterSource(localSentinel).readAll().flatMap(_.tile.statisticsDouble).size should be(64)
}
it("should read small MRF scene with one band converted from MODIS HDF") {
val (expectedTileCount, _) = expectedTileCountAndBands(2400, 2400)
- RasterSource(modisConvertedMrfPath).readAll().flatMap(_.tile.statisticsDouble).size should be (expectedTileCount)
+ RFRasterSource(modisConvertedMrfPath).readAll().flatMap(_.tile.statisticsDouble).size should be (expectedTileCount)
}
it("should read remote HTTP MRF scene") {
val (expectedTileCount, bands) = expectedTileCountAndBands(6257, 7584, 4)
- RasterSource(remoteHttpMrfPath).readAll(bands = bands).flatMap(_.tile.statisticsDouble).size should be (expectedTileCount)
+ RFRasterSource(remoteHttpMrfPath).readAll(bands = bands).flatMap(_.tile.statisticsDouble).size should be (expectedTileCount)
}
it("should read remote S3 MRF scene") {
val (expectedTileCount, bands) = expectedTileCountAndBands(6257, 7584, 4)
- RasterSource(remoteS3MrfPath).readAll(bands = bands).flatMap(_.tile.statisticsDouble).size should be (expectedTileCount)
+ RFRasterSource(remoteS3MrfPath).readAll(bands = bands).flatMap(_.tile.statisticsDouble).size should be (expectedTileCount)
}
}
} else {
describe("GDAL missing error support") {
it("should throw exception reading JPEG2000 scene") {
intercept[IllegalArgumentException] {
- RasterSource(localSentinel)
+ RFRasterSource(localSentinel)
}
}
it("should throw exception reading MRF scene with one band converted from MODIS HDF") {
intercept[IllegalArgumentException] {
- RasterSource(modisConvertedMrfPath)
+ RFRasterSource(modisConvertedMrfPath)
}
}
it("should throw exception reading remote HTTP MRF scene") {
intercept[IllegalArgumentException] {
- RasterSource(remoteHttpMrfPath)
+ RFRasterSource(remoteHttpMrfPath)
}
}
it("should throw exception reading remote S3 MRF scene") {
intercept[IllegalArgumentException] {
- RasterSource(remoteS3MrfPath)
+ RFRasterSource(remoteS3MrfPath)
}
}
}
@@ -117,7 +117,7 @@ class RasterSourceIT extends TestEnvironment with TestData {
private def expectedTileCountAndBands(x:Int, y:Int, bandCount:Int = 1) = {
val imageDimensions = Seq(x.toDouble, y.toDouble)
- val tilesPerBand = imageDimensions.map(x ⇒ ceil(x / NOMINAL_TILE_SIZE)).product
+ val tilesPerBand = imageDimensions.map(x => ceil(x / NOMINAL_TILE_SIZE)).product
val bands = Range(0, bandCount)
val expectedTileCount = tilesPerBand * bands.length
(expectedTileCount, bands)
diff --git a/core/src/main/resources/application.conf b/core/src/main/resources/application.conf
new file mode 100644
index 000000000..3565f4b83
--- /dev/null
+++ b/core/src/main/resources/application.conf
@@ -0,0 +1,19 @@
+geotrellis.raster.gdal {
+ options {
+ // See https://trac.osgeo.org/gdal/wiki/ConfigOptions for options
+ //CPL_DEBUG = "OFF"
+ AWS_REQUEST_PAYER = "requester"
+ GDAL_DISABLE_READDIR_ON_OPEN = "YES"
+ CPL_VSIL_CURL_ALLOWED_EXTENSIONS = ".tif,.tiff,.jp2,.mrf,.idx,.lrc,.mrf.aux.xml,.vrt"
+ GDAL_CACHEMAX = 512
+ GDAL_PAM_ENABLED = "NO"
+ CPL_VSIL_CURL_CHUNK_SIZE = 1000000
+ GDAL_HTTP_MAX_RETRY = 10
+ GDAL_HTTP_RETRY_DELAY = 2
+ }
+ // set this to `false` if CPL_DEBUG is `ON`
+ useExceptions = true
+ // See https://github.com/locationtech/geotrellis/issues/3184#issuecomment-592553807
+ acceptable-datasets = ["SOURCE", "WARPED"]
+ number-of-attempts = 2147483647
+}
\ No newline at end of file
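The two HTTP retry options added above bound how long a flaky range request can stall a tile read; as a rough illustration (a fixed inter-attempt delay is assumed here — GDAL's actual backoff policy may differ), the worst-case extra wait works out to:

```python
def worst_case_retry_wait(max_retry: int, retry_delay_s: int) -> int:
    # GDAL_HTTP_MAX_RETRY retries, each preceded by a
    # GDAL_HTTP_RETRY_DELAY pause (no backoff assumed).
    return max_retry * retry_delay_s

print(worst_case_retry_wait(10, 2))  # → 20 seconds per failing request
```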
diff --git a/core/src/main/resources/reference.conf b/core/src/main/resources/reference.conf
index bcdca6aa3..8cc0e4292 100644
--- a/core/src/main/resources/reference.conf
+++ b/core/src/main/resources/reference.conf
@@ -1,23 +1,9 @@
rasterframes {
nominal-tile-size = 256
- prefer-gdal = true
- showable-tiles = true
+ prefer-gdal = false
+ showable-tiles = false
showable-max-cells = 20
max-truncate-row-element-length = 40
raster-source-cache-timeout = 120 seconds
+ jp2-gdal-thread-lock = false
}
-
-vlm.gdal {
- options {
- // See https://trac.osgeo.org/gdal/wiki/ConfigOptions for options
- //CPL_DEBUG = "OFF"
- AWS_REQUEST_PAYER = "requester"
- GDAL_DISABLE_READDIR_ON_OPEN = "YES"
- CPL_VSIL_CURL_ALLOWED_EXTENSIONS = ".tif,.tiff,.jp2,.mrf,.idx,.lrc,.mrf.aux.xml,.vrt"
- GDAL_CACHEMAX = 512
- GDAL_PAM_ENABLED = "NO"
- CPL_VSIL_CURL_CHUNK_SIZE = 1000000
- }
- // set this to `false` if CPL_DEBUG is `ON`
- useExceptions = true
-}
\ No newline at end of file
diff --git a/core/src/main/scala/org/apache/spark/sql/rf/CrsUDT.scala b/core/src/main/scala/org/apache/spark/sql/rf/CrsUDT.scala
new file mode 100644
index 000000000..74b9941c0
--- /dev/null
+++ b/core/src/main/scala/org/apache/spark/sql/rf/CrsUDT.scala
@@ -0,0 +1,63 @@
+/*
+ * This software is licensed under the Apache 2 license, quoted below.
+ *
+ * Copyright 2021 Azavea, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License"); you may not
+ * use this file except in compliance with the License. You may obtain a copy of
+ * the License at
+ *
+ * [http://www.apache.org/licenses/LICENSE-2.0]
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ *
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ */
+
+package org.apache.spark.sql.rf
+import geotrellis.proj4.CRS
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+import org.locationtech.rasterframes.model.LazyCRS
+import org.apache.spark.sql.catalyst.InternalRow
+
+
+@SQLUserDefinedType(udt = classOf[CrsUDT])
+class CrsUDT extends UserDefinedType[CRS] {
+ override def typeName: String = CrsUDT.typeName
+
+ override def pyUDT: String = "pyrasterframes.rf_types.CrsUDT"
+
+ def userClass: Class[CRS] = classOf[CRS]
+
+ def sqlType: DataType = StringType
+
+ override def serialize(obj: CRS): UTF8String =
+ Option(obj)
+ .map { crs => UTF8String.fromString(crs.toProj4String) }
+ .orNull
+
+ override def deserialize(datum: Any): CRS =
+ Option(datum)
+ .collect {
+ case ir: InternalRow => LazyCRS(ir.getString(0))
+ case s: UTF8String => LazyCRS(s.toString)
+ }
+ .orNull
+
+ override def acceptsType(dataType: DataType): Boolean = dataType match {
+ case _: CrsUDT => true
+ case _ => super.acceptsType(dataType)
+ }
+}
+
+case object CrsUDT {
+ UDTRegistration.register(classOf[CRS].getName, classOf[CrsUDT].getName)
+
+ final val typeName: String = "crs"
+}
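The new `CrsUDT` stores a CRS as a single string column. A toy sketch of that round-trip, with `FakeCrs` standing in for `geotrellis.proj4.CRS` (all names here are illustrative, not the real API):

```scala
// A CRS is serialized as its proj4 string and rebuilt lazily on read,
// which is what CrsUDT does via UTF8String and LazyCRS.
final case class FakeCrs(proj4: String) { def toProj4String: String = proj4 }

object CrsRoundTrip {
  def serialize(crs: FakeCrs): String = crs.toProj4String
  def deserialize(s: String): FakeCrs = FakeCrs(s) // the real code defers parsing via LazyCRS
}
```
Backing the UDT with a plain `StringType` (instead of a struct) keeps CRS columns human-readable and cheap to compare.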
diff --git a/core/src/main/scala/org/apache/spark/sql/rf/FilterTranslator.scala b/core/src/main/scala/org/apache/spark/sql/rf/FilterTranslator.scala
index 6433ef8d3..d7a183796 100644
--- a/core/src/main/scala/org/apache/spark/sql/rf/FilterTranslator.scala
+++ b/core/src/main/scala/org/apache/spark/sql/rf/FilterTranslator.scala
@@ -20,7 +20,6 @@ package org.apache.spark.sql.rf
import java.sql.{Date, Timestamp}
import org.locationtech.rasterframes.expressions.SpatialRelation.{Contains, Intersects}
-import org.locationtech.rasterframes.rules._
import org.apache.spark.sql.catalyst.CatalystTypeConverters.{convertToScala, createToScalaConverter}
import org.apache.spark.sql.catalyst.expressions
import org.apache.spark.sql.catalyst.expressions.{Attribute, EmptyRow, Expression, Literal}
@@ -30,9 +29,11 @@ import org.apache.spark.sql.sources.Filter
import org.apache.spark.sql.types.{DateType, StringType, TimestampType}
import org.apache.spark.unsafe.types.UTF8String
import org.locationtech.geomesa.spark.jts.rules.GeometryLiteral
-import org.locationtech.rasterframes.rules.{SpatialFilters, TemporalFilters}
+import org.locationtech.rasterframes.rules.TemporalFilters
/**
+ * TODO: fix it, how to implement these filters as ScalaUDFs?
+ * Why do we need them?
* This is a copy of [[org.apache.spark.sql.execution.datasources.DataSourceStrategy.translateFilter]], modified to add our spatial predicates.
*
* @since 1/11/18
@@ -46,55 +47,61 @@ object FilterTranslator {
*/
def translateFilter(predicate: Expression): Option[Filter] = {
predicate match {
- case Intersects(a: Attribute, Literal(geom, udt: AbstractGeometryUDT[_])) ⇒
- Some(SpatialFilters.Intersects(a.name, udt.deserialize(geom)))
+ case Intersects(a: Attribute, Literal(geom, udt: AbstractGeometryUDT[_])) =>
+ // Some(SpatialFilters.Intersects(a.name, udt.deserialize(geom)))
+ ???
- case Contains(a: Attribute, Literal(geom, udt: AbstractGeometryUDT[_])) ⇒
- Some(SpatialFilters.Contains(a.name, udt.deserialize(geom)))
+ case Contains(a: Attribute, Literal(geom, udt: AbstractGeometryUDT[_])) =>
+ // Some(SpatialFilters.Contains(a.name, udt.deserialize(geom)))
+ ???
- case Intersects(a: Attribute, GeometryLiteral(_, geom)) ⇒
- Some(SpatialFilters.Intersects(a.name, geom))
+ case Intersects(a: Attribute, GeometryLiteral(_, geom)) =>
+ // Some(SpatialFilters.Intersects(a.name, geom))
+ ???
- case Contains(a: Attribute, GeometryLiteral(_, geom)) ⇒
- Some(SpatialFilters.Contains(a.name, geom))
+ case Contains(a: Attribute, GeometryLiteral(_, geom)) =>
+ // Some(SpatialFilters.Contains(a.name, geom))
+ ???
case expressions.And(
expressions.GreaterThanOrEqual(a: Attribute, Literal(start, TimestampType)),
expressions.LessThanOrEqual(b: Attribute, Literal(end, TimestampType))
- ) if a.name == b.name ⇒
+ ) if a.name == b.name =>
val toScala = createToScalaConverter(TimestampType)(_: Any).asInstanceOf[Timestamp]
- Some(TemporalFilters.BetweenTimes(a.name, toScala(start), toScala(end)))
+ // Some(TemporalFilters.BetweenTimes(a.name, toScala(start), toScala(end)))
+ ???
case expressions.And(
expressions.GreaterThanOrEqual(a: Attribute, Literal(start, DateType)),
expressions.LessThanOrEqual(b: Attribute, Literal(end, DateType))
- ) if a.name == b.name ⇒
+ ) if a.name == b.name =>
val toScala = createToScalaConverter(DateType)(_: Any).asInstanceOf[Date]
- Some(TemporalFilters.BetweenDates(a.name, toScala(start), toScala(end)))
+ // Some(TemporalFilters.BetweenDates(a.name, toScala(start), toScala(end)))
+ ???
// TODO: Need to figure out how to generalize over capturing right-hand pairs
case expressions.And(expressions.And(left,
expressions.GreaterThanOrEqual(a: Attribute, Literal(start, TimestampType))),
expressions.LessThanOrEqual(b: Attribute, Literal(end, TimestampType))
- ) if a.name == b.name ⇒
+ ) if a.name == b.name =>
val toScala = createToScalaConverter(TimestampType)(_: Any).asInstanceOf[Timestamp]
for {
- leftFilter ← translateFilter(left)
+ leftFilter <- translateFilter(left)
rightFilter = TemporalFilters.BetweenTimes(a.name, toScala(start), toScala(end))
- } yield sources.And(leftFilter, rightFilter)
+ } yield sources.And(leftFilter, ???)
// TODO: Ditto as above
case expressions.And(expressions.And(left,
expressions.GreaterThanOrEqual(a: Attribute, Literal(start, DateType))),
expressions.LessThanOrEqual(b: Attribute, Literal(end, DateType))
- ) if a.name == b.name ⇒
+ ) if a.name == b.name =>
val toScala = createToScalaConverter(DateType)(_: Any).asInstanceOf[Date]
for {
- leftFilter ← translateFilter(left)
+ leftFilter <- translateFilter(left)
rightFilter = TemporalFilters.BetweenDates(a.name, toScala(start), toScala(end))
- } yield sources.And(leftFilter, rightFilter)
+ } yield sources.And(leftFilter, ???)
case expressions.EqualTo(a: Attribute, Literal(v, t)) =>
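The timestamp/date patterns in `translateFilter` collapse an `And` of `>=` and `<=` on the same column into a single range filter. A simplified stand-in for that rewrite, using a plain ADT rather than the Spark expression types:

```scala
sealed trait Pred
final case class Gte(col: String, v: Long) extends Pred
final case class Lte(col: String, v: Long) extends Pred
final case class AndP(l: Pred, r: Pred) extends Pred
final case class Between(col: String, lo: Long, hi: Long) extends Pred

object Collapse {
  // Mirrors the `if a.name == b.name` guard above: only bounds on the
  // same column collapse into one Between filter.
  def collapse(p: Pred): Pred = p match {
    case AndP(Gte(a, lo), Lte(b, hi)) if a == b => Between(a, lo, hi)
    case other                                  => other
  }
}
```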
diff --git a/core/src/main/scala/org/apache/spark/sql/rf/QuinaryExpression.scala b/core/src/main/scala/org/apache/spark/sql/rf/QuinaryExpression.scala
new file mode 100644
index 000000000..2f4ce827b
--- /dev/null
+++ b/core/src/main/scala/org/apache/spark/sql/rf/QuinaryExpression.scala
@@ -0,0 +1,111 @@
+package org.apache.spark.sql.rf
+
+import org.apache.spark.sql.catalyst.expressions.codegen.Block._
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions.Expression
+import org.apache.spark.sql.catalyst.expressions.codegen.{CodeGenerator, CodegenContext, ExprCode, FalseLiteral}
+
+/**
+ * An expression with five inputs and one output. By default the output evaluates to null if any input evaluates to null.
+ */
+abstract class QuinaryExpression extends Expression {
+
+ override def foldable: Boolean = children.forall(_.foldable)
+
+ override def nullable: Boolean = children.exists(_.nullable)
+
+ /**
+   * Default evaluation behavior, following the default nullability of QuinaryExpression.
+   * Subclasses of QuinaryExpression that override nullable should probably also override this.
+ */
+ override def eval(input: InternalRow): Any = {
+ val exprs = children
+ val value1 = exprs(0).eval(input)
+ if (value1 != null) {
+ val value2 = exprs(1).eval(input)
+ if (value2 != null) {
+ val value3 = exprs(2).eval(input)
+ if (value3 != null) {
+ val value4 = exprs(3).eval(input)
+ if (value4 != null) {
+ val value5 = exprs(4).eval(input)
+ if (value5 != null) {
+ return nullSafeEval(value1, value2, value3, value4, value5)
+ }
+ }
+ }
+ }
+ }
+ null
+ }
+
+ /**
+   * Called by the default [[eval]] implementation. Subclasses of QuinaryExpression that keep
+   * the default nullability can override this method to save null-check code. Subclasses that
+   * need full control of the evaluation process should override [[eval]] instead.
+ */
+ protected def nullSafeEval(input1: Any, input2: Any, input3: Any, input4: Any, input5: Any): Any =
+ sys.error(s"QuinaryExpressions must override either eval or nullSafeEval")
+
+ /**
+   * Shorthand for generating quinary evaluation code.
+   * If any of the sub-expressions is null, the result of this computation
+   * is assumed to be null.
+ *
+ * @param f accepts five variable names and returns Java code to compute the output.
+ */
+ protected def defineCodeGen(ctx: CodegenContext, ev: ExprCode, f: (String, String, String, String, String) => String): ExprCode = {
+ nullSafeCodeGen(ctx, ev, (eval1, eval2, eval3, eval4, eval5) => {
+ s"${ev.value} = ${f(eval1, eval2, eval3, eval4, eval5)};"
+ })
+ }
+
+ /**
+   * Shorthand for generating quinary evaluation code.
+   * If any of the sub-expressions is null, the result of this computation
+   * is assumed to be null.
+ *
+ * @param f function that accepts the 5 non-null evaluation result names of children
+ * and returns Java code to compute the output.
+ */
+ protected def nullSafeCodeGen(ctx: CodegenContext, ev: ExprCode, f: (String, String, String, String, String) => String): ExprCode = {
+ val firstGen = children(0).genCode(ctx)
+ val secondGen = children(1).genCode(ctx)
+    val thirdGen = children(2).genCode(ctx)
+    val fourthGen = children(3).genCode(ctx)
+    val fifthGen = children(4).genCode(ctx)
+    val resultCode = f(firstGen.value, secondGen.value, thirdGen.value, fourthGen.value, fifthGen.value)
+
+    if (nullable) {
+      val nullSafeEval =
+        firstGen.code + ctx.nullSafeExec(children(0).nullable, firstGen.isNull) {
+          secondGen.code + ctx.nullSafeExec(children(1).nullable, secondGen.isNull) {
+            thirdGen.code + ctx.nullSafeExec(children(2).nullable, thirdGen.isNull) {
+              fourthGen.code + ctx.nullSafeExec(children(3).nullable, fourthGen.isNull) {
+                fifthGen.code + ctx.nullSafeExec(children(4).nullable, fifthGen.isNull) {
+                  s"""
+                    ${ev.isNull} = false; // resultCode could change nullability.
+                    $resultCode
+                  """
+                }
+              }
+            }
+          }
+        }
+
+      ev.copy(code = code"""
+        boolean ${ev.isNull} = true;
+        ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+        $nullSafeEval""")
+    } else {
+      ev.copy(code = code"""
+        ${firstGen.code}
+        ${secondGen.code}
+        ${thirdGen.code}
+ ${fourthGen.code}
+ ${fifthGen.code}
+ ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+ $resultCode""", isNull = FalseLiteral)
+ }
+ }
+}
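The null-propagation contract that `eval` implements above can be summarized with `Option`s: the result is null as soon as any of the five inputs is. A sketch of the same semantics:

```scala
object QuinaryEvalSketch {
  // Returns None if any input is None; otherwise applies f to all five values,
  // mirroring the nested null checks in QuinaryExpression.eval.
  def eval5[A, B](a: Option[A], b: Option[A], c: Option[A], d: Option[A], e: Option[A])
                 (f: (A, A, A, A, A) => B): Option[B] =
    for (v1 <- a; v2 <- b; v3 <- c; v4 <- d; v5 <- e) yield f(v1, v2, v3, v4, v5)
}
```
The hand-rolled nested `if`s in the real class avoid allocating `Option`s on the hot path, but the observable behavior is the same.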
diff --git a/core/src/main/scala/org/apache/spark/sql/rf/RasterSourceUDT.scala b/core/src/main/scala/org/apache/spark/sql/rf/RasterSourceUDT.scala
index 51d204b58..4715609b2 100644
--- a/core/src/main/scala/org/apache/spark/sql/rf/RasterSourceUDT.scala
+++ b/core/src/main/scala/org/apache/spark/sql/rf/RasterSourceUDT.scala
@@ -21,69 +21,58 @@
package org.apache.spark.sql.rf
-import java.nio.ByteBuffer
-
-import org.locationtech.rasterframes.encoders.CatalystSerializer._
import org.apache.spark.sql.catalyst.InternalRow
-import org.apache.spark.sql.types.{DataType, UDTRegistration, UserDefinedType, _}
-import org.locationtech.rasterframes.encoders.CatalystSerializer
-import org.locationtech.rasterframes.ref.RasterSource
+import org.apache.spark.sql.types._
+import org.locationtech.rasterframes.ref.RFRasterSource
import org.locationtech.rasterframes.util.KryoSupport
+import java.nio.ByteBuffer
+
/**
* Catalyst representation of a RasterSource.
*
* @since 9/5/18
*/
+// TODO: remove it
@SQLUserDefinedType(udt = classOf[RasterSourceUDT])
-class RasterSourceUDT extends UserDefinedType[RasterSource] {
- import RasterSourceUDT._
- override def typeName = "rf_rastersource"
+class RasterSourceUDT extends UserDefinedType[RFRasterSource] {
+ override def typeName = "rastersource"
override def pyUDT: String = "pyrasterframes.rf_types.RasterSourceUDT"
- def userClass: Class[RasterSource] = classOf[RasterSource]
+ def userClass: Class[RFRasterSource] = classOf[RFRasterSource]
- override def sqlType: DataType = schemaOf[RasterSource]
+ def sqlType: DataType = StructType(Seq(
+ StructField("raster_source_kryo", BinaryType, false)
+ ))
- override def serialize(obj: RasterSource): InternalRow =
+ def serialize(obj: RFRasterSource): InternalRow =
Option(obj)
- .map(_.toInternalRow)
+ .map { rs => InternalRow(KryoSupport.serialize(rs).array()) }
.orNull
- override def deserialize(datum: Any): RasterSource =
+ def deserialize(datum: Any): RFRasterSource =
Option(datum)
.collect {
- case ir: InternalRow ⇒ ir.to[RasterSource]
+ case ir: InternalRow =>
+ val bytes = ir.getBinary(0)
+ KryoSupport.deserialize[RFRasterSource](ByteBuffer.wrap(bytes))
+ case bytes: Array[Byte] =>
+ KryoSupport.deserialize[RFRasterSource](ByteBuffer.wrap(bytes))
+
}
.orNull
-
- private[sql] override def acceptsType(dataType: DataType) = dataType match {
- case _: RasterSourceUDT ⇒ true
- case _ ⇒ super.acceptsType(dataType)
+ private[sql] override def acceptsType(dataType: DataType): Boolean = dataType match {
+ case _: RasterSourceUDT => true
+ case _ => super.acceptsType(dataType)
}
}
object RasterSourceUDT {
- UDTRegistration.register(classOf[RasterSource].getName, classOf[RasterSourceUDT].getName)
+ UDTRegistration.register(classOf[RFRasterSource].getName, classOf[RasterSourceUDT].getName)
/** Deserialize a byte array, also used inside the Python API */
- def from(byteArray: Array[Byte]): RasterSource = CatalystSerializer.CatalystIO.rowIO.create(byteArray).to[RasterSource]
-
- implicit val rasterSourceSerializer: CatalystSerializer[RasterSource] = new CatalystSerializer[RasterSource] {
-
- override val schema: StructType = StructType(Seq(
- StructField("raster_source_kryo", BinaryType, false)
- ))
-
- override def to[R](t: RasterSource, io: CatalystIO[R]): R = {
- val buf = KryoSupport.serialize(t)
- io.create(buf.array())
- }
-
- override def from[R](row: R, io: CatalystIO[R]): RasterSource = {
- KryoSupport.deserialize[RasterSource](ByteBuffer.wrap(io.getByteArray(row, 0)))
- }
- }
+ def from(byteArray: Array[Byte]): RFRasterSource =
+ KryoSupport.deserialize[RFRasterSource](ByteBuffer.wrap(byteArray))
}
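The reworked `RasterSourceUDT` stores the whole `RFRasterSource` as one Kryo-serialized byte array in a single `BinaryType` field. A rough sketch of that bytes-in, bytes-out pattern, with plain Java serialization of a `String` standing in for Kryo and the raster source (illustrative only):

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object BytesRoundTrip {
  // Serialize a value to the kind of opaque byte array a BinaryType column holds.
  def toBytes(value: String): Array[Byte] = {
    val bos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(bos)
    try oos.writeObject(value) finally oos.close()
    bos.toByteArray
  }

  // Rebuild the value from those bytes, as deserialize does via KryoSupport.
  def fromBytes(bytes: Array[Byte]): String = {
    val ois = new ObjectInputStream(new ByteArrayInputStream(bytes))
    try ois.readObject().asInstanceOf[String] finally ois.close()
  }
}
```
Treating the source as an opaque blob sidesteps the per-field `CatalystSerializer` schema the old code maintained, at the cost of the column being uninspectable in SQL.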
diff --git a/core/src/main/scala/org/apache/spark/sql/rf/TileUDT.scala b/core/src/main/scala/org/apache/spark/sql/rf/TileUDT.scala
index e72930ad3..4c8fa341e 100644
--- a/core/src/main/scala/org/apache/spark/sql/rf/TileUDT.scala
+++ b/core/src/main/scala/org/apache/spark/sql/rf/TileUDT.scala
@@ -20,15 +20,18 @@
*/
package org.apache.spark.sql.rf
-import geotrellis.raster._
+
+import geotrellis.raster.{ArrayTile, BufferTile, CellType, ConstantTile, GridBounds, Tile}
import org.apache.spark.sql.catalyst.InternalRow
-import org.apache.spark.sql.types.{DataType, _}
-import org.locationtech.rasterframes.encoders.CatalystSerializer
-import org.locationtech.rasterframes.encoders.CatalystSerializer._
-import org.locationtech.rasterframes.model.{Cells, TileDataContext}
-import org.locationtech.rasterframes.ref.RasterRef.RasterRefTile
-import org.locationtech.rasterframes.tiles.InternalRowTile
+import org.apache.spark.sql.execution.datasources.parquet.ParquetReadSupport
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+import org.locationtech.rasterframes._
+import org.locationtech.rasterframes.encoders.syntax._
+import org.locationtech.rasterframes.ref.RasterRef
+import org.locationtech.rasterframes.tiles.{ProjectedRasterTile, ShowableTile}
+import scala.util.Try
/**
* UDT for singleband tiles.
@@ -37,66 +40,99 @@ import org.locationtech.rasterframes.tiles.InternalRowTile
*/
@SQLUserDefinedType(udt = classOf[TileUDT])
class TileUDT extends UserDefinedType[Tile] {
- import TileUDT._
override def typeName = TileUDT.typeName
override def pyUDT: String = "pyrasterframes.rf_types.TileUDT"
def userClass: Class[Tile] = classOf[Tile]
- def sqlType: StructType = schemaOf[Tile]
-
- override def serialize(obj: Tile): InternalRow =
- Option(obj)
- .map(_.toInternalRow)
- .orNull
+ def sqlType: StructType = StructType(Seq(
+ StructField("cellType", StringType, false),
+ StructField("cols", IntegerType, false),
+ StructField("rows", IntegerType, false),
+ StructField("cells", BinaryType, true),
+ StructField("gridBounds", gridBoundsEncoder[Int].schema, true),
+    // make it Parquet-compliant; only expanded UDTs can appear in a UDT schema
+ StructField("ref", ParquetReadSupport.expandUDT(RasterRef.rasterRefEncoder.schema), true)
+ ))
+
+ def serialize(obj: Tile): InternalRow = {
+ if (obj == null) return null
+ obj match {
+      // TODO: review the match cases here
+ case ref: RasterRef =>
+ val ct = UTF8String.fromString(ref.cellType.toString())
+ InternalRow(ct, ref.cols, ref.rows, null, null, ref.toInternalRow)
+ case ProjectedRasterTile(ref: RasterRef, _, _) =>
+ val ct = UTF8String.fromString(ref.cellType.toString())
+ InternalRow(ct, ref.cols, ref.rows, null, null, ref.toInternalRow)
+ case prt: ProjectedRasterTile =>
+ val tile = prt.tile
+ val ct = UTF8String.fromString(tile.cellType.toString())
+ InternalRow(ct, tile.cols, tile.rows, tile.toBytes(), null, null)
+ case bt: BufferTile =>
+ val tile = bt.sourceTile.toArrayTile()
+ val ct = UTF8String.fromString(tile.cellType.toString())
+ InternalRow(ct, tile.cols, tile.rows, tile.toBytes(), bt.gridBounds.toInternalRow, null)
+ case const: ConstantTile =>
+ // Must expand constant tiles so they can be interpreted properly in catalyst and Python.
+ val tile = const.toArrayTile()
+ val ct = UTF8String.fromString(tile.cellType.toString())
+ InternalRow(ct, tile.cols, tile.rows, tile.toBytes(), null, null)
+ case tile =>
+ val ct = UTF8String.fromString(tile.cellType.toString())
+ InternalRow(ct, tile.cols, tile.rows, tile.toBytes(), null, null)
+ }
+ }
- override def deserialize(datum: Any): Tile =
- Option(datum)
- .collect {
- case ir: InternalRow ⇒ ir.to[Tile]
+ def deserialize(datum: Any): Tile = {
+ if (datum == null) return null
+ val row = datum.asInstanceOf[InternalRow]
+
+ /** TODO: a compatible encoder for the ProjectedRasterTile? */
+ val tile: Tile =
+ if (!row.isNullAt(5)) {
+ Try {
+ val ir = row.getStruct(5, 5)
+ val ref = ir.as[RasterRef]
+ ref
+ }/*.orElse {
+ Try(
+ ProjectedRasterTile
+ .projectedRasterTileEncoder
+ .resolveAndBind()
+ .createDeserializer()(row)
+ .tile
+ )
+ }*/.get
+ } else if(!row.isNullAt(4)) {
+ val ct = CellType.fromName(row.getString(0))
+ val cols = row.getInt(1)
+ val rows = row.getInt(2)
+ val bytes = row.getBinary(3)
+ val gridBounds = row.getStruct(4, 5).as[GridBounds[Int]]
+ BufferTile(ArrayTile.fromBytes(bytes, ct, cols, rows), gridBounds)
+ } else {
+ val ct = CellType.fromName(row.getString(0))
+ val cols = row.getInt(1)
+ val rows = row.getInt(2)
+ val bytes = row.getBinary(3)
+ ArrayTile.fromBytes(bytes, ct, cols, rows)
}
- .map {
- case realIRT: InternalRowTile ⇒ realIRT.realizedTile
- case other ⇒ other
- }
- .orNull
+
+ if (TileUDT.showableTiles) new ShowableTile(tile) else tile
+ }
override def acceptsType(dataType: DataType): Boolean = dataType match {
- case _: TileUDT ⇒ true
- case _ ⇒ super.acceptsType(dataType)
+ case _: TileUDT => true
+ case _ => super.acceptsType(dataType)
}
}
-case object TileUDT {
+case object TileUDT {
+ private val showableTiles = org.locationtech.rasterframes.rfConfig.getBoolean("showable-tiles")
+
UDTRegistration.register(classOf[Tile].getName, classOf[TileUDT].getName)
final val typeName: String = "tile"
-
- implicit def tileSerializer: CatalystSerializer[Tile] = new CatalystSerializer[Tile] {
-
- override val schema: StructType = StructType(Seq(
- StructField("cell_context", schemaOf[TileDataContext], true),
- StructField("cell_data", schemaOf[Cells], false)
- ))
-
- override def to[R](t: Tile, io: CatalystIO[R]): R = io.create(
- t match {
- case _: RasterRefTile => null
- case o => io.to(TileDataContext(o))
- },
- io.to(Cells(t))
- )
-
- override def from[R](row: R, io: CatalystIO[R]): Tile = {
- val cells = io.get[Cells](row, 1)
-
- row match {
- case ir: InternalRow if !cells.isRef ⇒ new InternalRowTile(ir)
- case _ ⇒
- val ctx = io.get[TileDataContext](row, 0)
- cells.toTile(ctx)
- }
- }
- }
}
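The branch order in the new `TileUDT.deserialize` matters: a non-null `ref` struct (column 5) wins, then a non-null `gridBounds` (column 4) selects a `BufferTile`, and otherwise a plain `ArrayTile` is rebuilt from the cell bytes. A toy stand-in for that decision, with a simplified row and hypothetical field names:

```scala
// Simplified stand-in for the InternalRow layout in TileUDT.sqlType.
final case class TileRow(ref: Option[String], gridBounds: Option[String])

object TileKind {
  def kindOf(row: TileRow): String =
    if (row.ref.isDefined) "raster-ref"              // !row.isNullAt(5) above
    else if (row.gridBounds.isDefined) "buffer-tile" // !row.isNullAt(4) above
    else "array-tile"
}
```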
diff --git a/core/src/main/scala/org/apache/spark/sql/rf/VersionShims.scala b/core/src/main/scala/org/apache/spark/sql/rf/VersionShims.scala
index 81418d466..511a4dc49 100644
--- a/core/src/main/scala/org/apache/spark/sql/rf/VersionShims.scala
+++ b/core/src/main/scala/org/apache/spark/sql/rf/VersionShims.scala
@@ -1,21 +1,17 @@
package org.apache.spark.sql.rf
import java.lang.reflect.Constructor
-
-import org.apache.spark.sql.catalyst.FunctionIdentifier
-import org.apache.spark.sql.catalyst.analysis.FunctionRegistry
-import org.apache.spark.sql.catalyst.analysis.FunctionRegistry.FunctionBuilder
+import org.apache.spark.sql.catalyst.analysis.{FunctionRegistry, FunctionRegistryBase}
+import org.apache.spark.sql.catalyst.analysis.FunctionRegistry.{FUNC_ALIAS, FunctionBuilder}
import org.apache.spark.sql.catalyst.catalog.CatalogTable
import org.apache.spark.sql.catalyst.expressions.objects.{Invoke, InvokeLike}
-import org.apache.spark.sql.catalyst.expressions.{AttributeReference, Expression, ExpressionDescription, ExpressionInfo}
+import org.apache.spark.sql.catalyst.expressions.{AttributeReference, Expression, ExpressionInfo}
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.execution.datasources.LogicalRelation
import org.apache.spark.sql.sources.BaseRelation
import org.apache.spark.sql.types.DataType
-import org.apache.spark.sql.{AnalysisException, DataFrame, Dataset, SQLContext}
import scala.reflect._
-import scala.util.{Failure, Success, Try}
/**
* Collection of Spark version compatibility adapters.
@@ -23,28 +19,11 @@ import scala.util.{Failure, Success, Try}
* @since 2/13/18
*/
object VersionShims {
- def readJson(sqlContext: SQLContext, rows: Dataset[String]): DataFrame = {
- // NB: Will get a deprecation warning for Spark 2.2.x
- sqlContext.read.json(rows.rdd) // <-- deprecation warning expected
- }
-
def updateRelation(lr: LogicalRelation, base: BaseRelation): LogicalPlan = {
val lrClazz = classOf[LogicalRelation]
val ctor = lrClazz.getConstructors.head.asInstanceOf[Constructor[LogicalRelation]]
ctor.getParameterTypes.length match {
- // In Spark 2.1.0 the signature looks like this:
- //
- // case class LogicalRelation(
- // relation: BaseRelation,
- // expectedOutputAttributes: Option[Seq[Attribute]] = None,
- // catalogTable: Option[CatalogTable] = None)
- // extends LeafNode with MultiInstanceRelation
- // In Spark 2.2.0 it's like this:
- // case class LogicalRelation(
- // relation: BaseRelation,
- // output: Seq[AttributeReference],
- // catalogTable: Option[CatalogTable])
- case 3 ⇒
+ case 3 =>
val arg2: Seq[AttributeReference] = lr.output
val arg3: Option[CatalogTable] = lr.catalogTable
if(ctor.getParameterTypes()(1).isAssignableFrom(classOf[Option[_]])) {
@@ -54,21 +33,13 @@ object VersionShims {
ctor.newInstance(base, arg2, arg3)
}
- // In Spark 2.3.0 this signature is this:
- //
- // case class LogicalRelation(
- // relation: BaseRelation,
- // output: Seq[AttributeReference],
- // catalogTable: Option[CatalogTable],
- // override val isStreaming: Boolean)
- // extends LeafNode with MultiInstanceRelation {
- case 4 ⇒
+ case 4 =>
val arg2: Seq[AttributeReference] = lr.output
val arg3: Option[CatalogTable] = lr.catalogTable
val arg4 = lrClazz.getMethod("isStreaming").invoke(lr)
ctor.newInstance(base, arg2, arg3, arg4)
- case _ ⇒
+ case _ =>
throw new NotImplementedError("LogicalRelation constructor has unexpected shape")
}
}
@@ -80,29 +51,12 @@ object VersionShims {
val ctor = classOf[Invoke].getConstructors.head
val TRUE = Boolean.box(true)
ctor.getParameterTypes.length match {
- // In Spark 2.1.0 the signature looks like this:
- //
- // case class Invoke(
- // targetObject: Expression,
- // functionName: String,
- // dataType: DataType,
- // arguments: Seq[Expression] = Nil,
- // propagateNull: Boolean = true) extends InvokeLike
- case 5 ⇒
+ case 5 =>
ctor.newInstance(targetObject, functionName, dataType, Nil, TRUE).asInstanceOf[InvokeLike]
- // In spark 2.2.0 the signature looks like this:
- //
- // case class Invoke(
- // targetObject: Expression,
- // functionName: String,
- // dataType: DataType,
- // arguments: Seq[Expression] = Nil,
- // propagateNull: Boolean = true,
- // returnNullable : Boolean = true) extends InvokeLike
- case 6 ⇒
+ case 6 =>
ctor.newInstance(targetObject, functionName, dataType, Nil, TRUE, TRUE).asInstanceOf[InvokeLike]
- case _ ⇒
+ case _ =>
throw new NotImplementedError("Invoke constructor has unexpected shape")
}
}
@@ -113,8 +67,8 @@ object VersionShims {
// Spark 2.3 introduced a new way of specifying Functions
val spark23FI = "org.apache.spark.sql.catalyst.FunctionIdentifier"
registry.getClass.getDeclaredMethods
- .filter(m ⇒ m.getName == "registerFunction" && m.getParameterCount == 2)
- .foreach { m ⇒
+ .filter(m => m.getName == "registerFunction" && m.getParameterCount == 2)
+ .foreach { m =>
val firstParam = m.getParameterTypes()(0)
if(firstParam == classOf[String])
m.invoke(registry, name, builder)
@@ -130,68 +84,18 @@ object VersionShims {
}
}
- // Much of the code herein is copied from org.apache.spark.sql.catalyst.analysis.FunctionRegistry
- def registerExpression[T <: Expression: ClassTag](name: String): Unit = {
- val clazz = classTag[T].runtimeClass
-
- def expressionInfo: ExpressionInfo = {
- val df = clazz.getAnnotation(classOf[ExpressionDescription])
- if (df != null) {
- if (df.extended().isEmpty) {
- new ExpressionInfo(clazz.getCanonicalName, null, name, df.usage(), df.arguments(), df.examples(), df.note(), df.since())
- } else {
- // This exists for the backward compatibility with old `ExpressionDescription`s defining
- // the extended description in `extended()`.
- new ExpressionInfo(clazz.getCanonicalName, null, name, df.usage(), df.extended())
- }
- } else {
- new ExpressionInfo(clazz.getCanonicalName, name)
- }
+ def registerExpression[T <: Expression : ClassTag](
+ name: String,
+ setAlias: Boolean = false,
+ since: Option[String] = None
+ ): (String, (ExpressionInfo, FunctionBuilder)) = {
+ val (expressionInfo, builder) = FunctionRegistryBase.build[T](name, since)
+ val newBuilder = (expressions: Seq[Expression]) => {
+ val expr = builder(expressions)
+ if (setAlias) expr.setTagValue(FUNC_ALIAS, name)
+ expr
}
- def findBuilder: FunctionBuilder = {
- val constructors = clazz.getConstructors
- // See if we can find a constructor that accepts Seq[Expression]
- val varargCtor = constructors.find(_.getParameterTypes.toSeq == Seq(classOf[Seq[_]]))
- val builder = (expressions: Seq[Expression]) => {
- if (varargCtor.isDefined) {
- // If there is an apply method that accepts Seq[Expression], use that one.
- Try(varargCtor.get.newInstance(expressions).asInstanceOf[Expression]) match {
- case Success(e) => e
- case Failure(e) =>
- // the exception is an invocation exception. To get a meaningful message, we need the
- // cause.
- throw new AnalysisException(e.getCause.getMessage)
- }
- } else {
- // Otherwise, find a constructor method that matches the number of arguments, and use that.
- val params = Seq.fill(expressions.size)(classOf[Expression])
- val f = constructors.find(_.getParameterTypes.toSeq == params).getOrElse {
- val validParametersCount = constructors
- .filter(_.getParameterTypes.forall(_ == classOf[Expression]))
- .map(_.getParameterCount).distinct.sorted
- val expectedNumberOfParameters = if (validParametersCount.length == 1) {
- validParametersCount.head.toString
- } else {
- validParametersCount.init.mkString("one of ", ", ", " and ") +
- validParametersCount.last
- }
- throw new AnalysisException(s"Invalid number of arguments for function ${clazz.getSimpleName}. " +
- s"Expected: $expectedNumberOfParameters; Found: ${params.length}")
- }
- Try(f.newInstance(expressions : _*).asInstanceOf[Expression]) match {
- case Success(e) => e
- case Failure(e) =>
- // the exception is an invocation exception. To get a meaningful message, we need the
- // cause.
- throw new AnalysisException(e.getCause.getMessage)
- }
- }
- }
-
- builder
- }
-
- registry.registerFunction(FunctionIdentifier(name), expressionInfo, findBuilder)
+ (name, (expressionInfo, newBuilder))
}
}
}
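The new `registerExpression` returns a ready-made `(name, (info, builder))` pair instead of mutating a registry in place, so call sites can assemble the function map themselves. A simplified sketch of that shape (the `ExpressionInfo` half is elided and the builder type is a stand-in):

```scala
object RegistrySketch {
  type Builder = Seq[Int] => Int

  // Stand-in for (name, (ExpressionInfo, FunctionBuilder)).
  def entry(name: String, builder: Builder): (String, Builder) = name -> builder

  // Call sites fold the returned pairs into a lookup table.
  val registry: Map[String, Builder] = Map(entry("sum", _.sum), entry("max", _.max))
}
```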
diff --git a/core/src/main/scala/org/apache/spark/sql/rf/package.scala b/core/src/main/scala/org/apache/spark/sql/rf/package.scala
index 4035b60c4..bc062899c 100644
--- a/core/src/main/scala/org/apache/spark/sql/rf/package.scala
+++ b/core/src/main/scala/org/apache/spark/sql/rf/package.scala
@@ -43,6 +43,7 @@ package object rf {
// which is where the registration actually happens. The ordering matters!
RasterSourceUDT
TileUDT
+ CrsUDT
}
def registry(sqlContext: SQLContext): FunctionRegistry = {
@@ -55,7 +56,7 @@ package object rf {
/** Lookup the registered Catalyst UDT for the given Scala type. */
def udtOf[T >: Null: TypeTag]: UserDefinedType[T] =
- UDTRegistration.getUDTFor(typeTag[T].tpe.toString).map(_.newInstance().asInstanceOf[UserDefinedType[T]])
+ UDTRegistration.getUDTFor(typeTag[T].tpe.toString).map(_.getDeclaredConstructor().newInstance().asInstanceOf[UserDefinedType[T]])
.getOrElse(throw new IllegalArgumentException(typeTag[T].tpe + " doesn't have a corresponding UDT"))
/** Creates a Catalyst expression for flattening the fields in a struct into columns. */
@@ -65,7 +66,6 @@ package object rf {
implicit class WithPPrint[T](enc: ExpressionEncoder[T]) {
def pprint(): Unit = {
println(enc.getClass.getSimpleName + "{")
- println("\tflat=" + enc.flat)
println("\tschema=" + enc.schema)
println("\tserializers=" + enc.serializer)
println("\tnamedExpressions=" + enc.namedExpressions)
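The `udtOf` change above swaps `Class#newInstance` (deprecated since Java 9) for `getDeclaredConstructor().newInstance()`, its supported replacement. A minimal illustration of the reflective call, using a JDK class rather than the actual UDT lookup:

```scala
object ReflectSketch {
  // Equivalent to the deprecated clazz.newInstance(): invoke the public
  // no-argument constructor via reflection.
  def instantiate[T](clazz: Class[T]): T =
    clazz.getDeclaredConstructor().newInstance()
}
```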
diff --git a/core/src/main/scala/org/locationtech/rasterframes/PairRDDConverter.scala b/core/src/main/scala/org/locationtech/rasterframes/PairRDDConverter.scala
index 658c0d65d..14a754ec5 100644
--- a/core/src/main/scala/org/locationtech/rasterframes/PairRDDConverter.scala
+++ b/core/src/main/scala/org/locationtech/rasterframes/PairRDDConverter.scala
@@ -23,7 +23,7 @@ package org.locationtech.rasterframes
import org.locationtech.rasterframes.util._
import geotrellis.raster.{MultibandTile, Tile, TileFeature}
-import geotrellis.spark.{SpaceTimeKey, SpatialKey}
+import geotrellis.layer._
import org.apache.spark.rdd.RDD
import org.apache.spark.sql._
import org.apache.spark.sql.rf.TileUDT
@@ -80,14 +80,14 @@ object PairRDDConverter {
def toDataFrame(rdd: RDD[(SpaceTimeKey, Tile)])(implicit spark: SparkSession): DataFrame = {
import spark.implicits._
- rdd.map{ case (k, v) ⇒ (k.spatialKey, k.temporalKey, v)}.toDF(schema.fields.map(_.name): _*)
+ rdd.map{ case (k, v) => (k.spatialKey, k.temporalKey, v)}.toDF(schema.fields.map(_.name): _*)
}
}
/** Enables conversion of `RDD[(SpatialKey, TileFeature[Tile, D])]` to DataFrame. */
implicit def spatialTileFeatureConverter[D: Encoder] = new PairRDDConverter[SpatialKey, TileFeature[Tile, D]] {
implicit val featureEncoder = implicitly[Encoder[D]]
- implicit val rowEncoder = Encoders.tuple(spatialKeyEncoder, singlebandTileEncoder, featureEncoder)
+ implicit val rowEncoder = Encoders.tuple(spatialKeyEncoder, tileEncoder, featureEncoder)
val schema: StructType = {
val base = spatialTileConverter.schema
@@ -96,14 +96,14 @@ object PairRDDConverter {
def toDataFrame(rdd: RDD[(SpatialKey, TileFeature[Tile, D])])(implicit spark: SparkSession): DataFrame = {
import spark.implicits._
- rdd.map{ case (k, v) ⇒ (k, v.tile, v.data)}.toDF(schema.fields.map(_.name): _*)
+ rdd.map{ case (k, v) => (k, v.tile, v.data)}.toDF(schema.fields.map(_.name): _*)
}
}
/** Enables conversion of `RDD[(SpaceTimeKey, TileFeature[Tile, D])]` to DataFrame. */
implicit def spaceTimeTileFeatureConverter[D: Encoder] = new PairRDDConverter[SpaceTimeKey, TileFeature[Tile, D]] {
implicit val featureEncoder = implicitly[Encoder[D]]
- implicit val rowEncoder = Encoders.tuple(spatialKeyEncoder, temporalKeyEncoder, singlebandTileEncoder, featureEncoder)
+ implicit val rowEncoder = Encoders.tuple(spatialKeyEncoder, temporalKeyEncoder, tileEncoder, featureEncoder)
val schema: StructType = {
val base = spaceTimeTileConverter.schema
@@ -112,7 +112,7 @@ object PairRDDConverter {
def toDataFrame(rdd: RDD[(SpaceTimeKey, TileFeature[Tile, D])])(implicit spark: SparkSession): DataFrame = {
import spark.implicits._
- val tupRDD = rdd.map { case (k, v) ⇒ (k.spatialKey, k.temporalKey, v.tile, v.data) }
+ val tupRDD = rdd.map { case (k, v) => (k.spatialKey, k.temporalKey, v.tile, v.data) }
rddToDatasetHolder(tupRDD)
tupRDD.toDF(schema.fields.map(_.name): _*)
@@ -126,7 +126,7 @@ object PairRDDConverter {
val basename = TILE_COLUMN.columnName
- val tiles = for(i ← 1 to bands) yield {
+ val tiles = for(i <- 1 to bands) yield {
val name = if(bands <= 1) basename else s"${basename}_$i"
StructField(name , serializableTileUDT, nullable = false)
}
@@ -136,20 +136,20 @@ object PairRDDConverter {
def toDataFrame(rdd: RDD[(SpatialKey, MultibandTile)])(implicit spark: SparkSession): DataFrame = {
spark.createDataFrame(
- rdd.map { case (k, v) ⇒ Row(Row(k.col, k.row) +: v.bands: _*) },
+ rdd.map { case (k, v) => Row(Row(k.col, k.row) +: v.bands: _*) },
schema
)
}
}
/** Enables conversion of `RDD[(SpaceTimeKey, MultibandTile)]` to DataFrame. */
- def forSpaceTimeMultiband(bands: Int) = new PairRDDConverter[SpaceTimeKey, MultibandTile] {
+ def forSpaceTimeMultiband(bands: Int): PairRDDConverter[SpaceTimeKey, MultibandTile] = new PairRDDConverter[SpaceTimeKey, MultibandTile] {
val schema: StructType = {
val base = spaceTimeTileConverter.schema
val basename = TILE_COLUMN.columnName
- val tiles = for(i ← 1 to bands) yield {
+ val tiles = for(i <- 1 to bands) yield {
StructField(s"${basename}_$i" , serializableTileUDT, nullable = false)
}
@@ -158,7 +158,7 @@ object PairRDDConverter {
def toDataFrame(rdd: RDD[(SpaceTimeKey, MultibandTile)])(implicit spark: SparkSession): DataFrame = {
spark.createDataFrame(
- rdd.map { case (k, v) ⇒ Row(Seq(Row(k.spatialKey.col, k.spatialKey.row), Row(k.temporalKey)) ++ v.bands: _*) },
+ rdd.map { case (k, v) => Row(Seq(Row(k.spatialKey.col, k.spatialKey.row), Row(k.temporalKey)) ++ v.bands: _*) },
schema
)
}
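The `PairRDDConverter` changes above are mechanical (Unicode arrows to ASCII, `geotrellis.layer` imports, renamed tile encoders), but the converter's job may be easier to see from the call side. A usage sketch, assuming a local `SparkSession` and the `spatialTileConverter` instance referenced in the hunks (the toy tile data is illustrative only):

```scala
import geotrellis.layer.SpatialKey
import geotrellis.raster.{ArrayTile, Tile}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}

implicit val spark: SparkSession =
  SparkSession.builder().master("local[*]").appName("converter-sketch").getOrCreate()

// A toy keyed RDD: one 2x2 tile at grid position (0, 0).
val rdd: RDD[(SpatialKey, Tile)] =
  spark.sparkContext.parallelize(Seq(
    SpatialKey(0, 0) -> (ArrayTile(Array(1, 2, 3, 4), 2, 2): Tile)
  ))

// The converter supplies both the schema and the key-flattening row
// mapping shown in the diff, yielding (spatial_key, tile) columns.
val df: DataFrame = PairRDDConverter.spatialTileConverter.toDataFrame(rdd)
df.printSchema()
```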
diff --git a/core/src/main/scala/org/locationtech/rasterframes/RasterFunctions.scala b/core/src/main/scala/org/locationtech/rasterframes/RasterFunctions.scala
index 213f0f77d..accca888d 100644
--- a/core/src/main/scala/org/locationtech/rasterframes/RasterFunctions.scala
+++ b/core/src/main/scala/org/locationtech/rasterframes/RasterFunctions.scala
@@ -20,422 +20,10 @@
*/
package org.locationtech.rasterframes
-import geotrellis.proj4.CRS
-import geotrellis.raster.mapalgebra.local.LocalTileBinaryOp
-import geotrellis.raster.render.ColorRamp
-import geotrellis.raster.{CellType, Tile}
-import geotrellis.vector.Extent
-import org.apache.spark.annotation.Experimental
-import org.apache.spark.sql.functions.{lit, udf}
-import org.apache.spark.sql.{Column, TypedColumn}
-import org.locationtech.jts.geom.Geometry
-import org.locationtech.rasterframes.expressions.TileAssembler
-import org.locationtech.rasterframes.expressions.accessors._
-import org.locationtech.rasterframes.expressions.aggregates._
-import org.locationtech.rasterframes.expressions.generators._
-import org.locationtech.rasterframes.expressions.localops._
-import org.locationtech.rasterframes.expressions.tilestats._
-import org.locationtech.rasterframes.expressions.transformers.RenderPNG.{RenderCompositePNG, RenderColorRampPNG}
-import org.locationtech.rasterframes.expressions.transformers._
-import org.locationtech.rasterframes.model.TileDimensions
-import org.locationtech.rasterframes.stats._
-import org.locationtech.rasterframes.{functions => F}
+import org.locationtech.rasterframes.functions._
/**
- * UDFs for working with Tiles in Spark DataFrames.
- *
+ * Mix-in for UDFs for working with Tiles in Spark DataFrames.
* @since 4/3/17
*/
-trait RasterFunctions {
- import util._
-
- // format: off
- /** Query the number of (cols, rows) in a Tile. */
- def rf_dimensions(col: Column): TypedColumn[Any, TileDimensions] = GetDimensions(col)
-
- /** Extracts the bounding box of a geometry as an Extent */
- def st_extent(col: Column): TypedColumn[Any, Extent] = GeometryToExtent(col)
-
- /** Extracts the bounding box from a RasterSource or ProjectedRasterTile */
- def rf_extent(col: Column): TypedColumn[Any, Extent] = GetExtent(col)
-
- /** Extracts the CRS from a RasterSource or ProjectedRasterTile */
- def rf_crs(col: Column): TypedColumn[Any, CRS] = GetCRS(col)
-
- /** Extracts the tile from a ProjectedRasterTile, or passes through a Tile. */
- def rf_tile(col: Column): TypedColumn[Any, Tile] = RealizeTile(col)
-
- /** Flattens Tile into a double array. */
- def rf_tile_to_array_double(col: Column): TypedColumn[Any, Array[Double]] =
- TileToArrayDouble(col)
-
- /** Flattens Tile into an integer array. */
- def rf_tile_to_array_int(col: Column): TypedColumn[Any, Array[Double]] =
- TileToArrayDouble(col)
-
- @Experimental
- /** Convert array in `arrayCol` into a Tile of dimensions `cols` and `rows`*/
- def rf_array_to_tile(arrayCol: Column, cols: Int, rows: Int): TypedColumn[Any, Tile] = withTypedAlias("rf_array_to_tile")(
- udf[Tile, AnyRef](F.arrayToTile(cols, rows)).apply(arrayCol).as[Tile]
- )
-
- /** Create a Tile from a column of cell data with location indexes and preform cell conversion. */
- def rf_assemble_tile(columnIndex: Column, rowIndex: Column, cellData: Column, tileCols: Int, tileRows: Int, ct: CellType): TypedColumn[Any, Tile] =
- rf_convert_cell_type(TileAssembler(columnIndex, rowIndex, cellData, lit(tileCols), lit(tileRows)), ct).as(cellData.columnName).as[Tile](singlebandTileEncoder)
-
- /** Create a Tile from a column of cell data with location indexes and perform cell conversion. */
- def rf_assemble_tile(columnIndex: Column, rowIndex: Column, cellData: Column, tileCols: Int, tileRows: Int): TypedColumn[Any, Tile] =
- TileAssembler(columnIndex, rowIndex, cellData, lit(tileCols), lit(tileRows))
-
- /** Create a Tile from a column of cell data with location indexes. */
- def rf_assemble_tile(columnIndex: Column, rowIndex: Column, cellData: Column, tileCols: Column, tileRows: Column): TypedColumn[Any, Tile] =
- TileAssembler(columnIndex, rowIndex, cellData, tileCols, tileRows)
-
- /** Extract the Tile's cell type */
- def rf_cell_type(col: Column): TypedColumn[Any, CellType] = GetCellType(col)
-
- /** Change the Tile's cell type */
- def rf_convert_cell_type(col: Column, cellType: CellType): Column = SetCellType(col, cellType)
-
- /** Change the Tile's cell type */
- def rf_convert_cell_type(col: Column, cellTypeName: String): Column = SetCellType(col, cellTypeName)
-
- /** Change the Tile's cell type */
- def rf_convert_cell_type(col: Column, cellType: Column): Column = SetCellType(col, cellType)
-
- /** Change the interpretation of the Tile's cell values according to specified CellType */
- def rf_interpret_cell_type_as(col: Column, cellType: CellType): Column = InterpretAs(col, cellType)
-
- /** Change the interpretation of the Tile's cell values according to specified CellType */
- def rf_interpret_cell_type_as(col: Column, cellTypeName: String): Column = InterpretAs(col, cellTypeName)
-
- /** Change the interpretation of the Tile's cell values according to specified CellType */
- def rf_interpret_cell_type_as(col: Column, cellType: Column): Column = InterpretAs(col, cellType)
-
- /** Resample tile to different size based on scalar factor or tile whose dimension to match. Scalar less
- * than one will downsample tile; greater than one will upsample. Uses nearest-neighbor. */
- def rf_resample[T: Numeric](tileCol: Column, factorValue: T) = Resample(tileCol, factorValue)
-
- /** Resample tile to different size based on scalar factor or tile whose dimension to match. Scalar less
- * than one will downsample tile; greater than one will upsample. Uses nearest-neighbor. */
- def rf_resample(tileCol: Column, factorCol: Column) = Resample(tileCol, factorCol)
-
- /** Convert a bounding box structure to a Geometry type. Intented to support multiple schemas. */
- def st_geometry(extent: Column): TypedColumn[Any, Geometry] = ExtentToGeometry(extent)
-
- /** Extract the extent of a RasterSource or ProjectedRasterTile as a Geometry type. */
- def rf_geometry(raster: Column): TypedColumn[Any, Geometry] = GetGeometry(raster)
-
- /** Assign a `NoData` value to the tile column. */
- def rf_with_no_data(col: Column, nodata: Double): Column = SetNoDataValue(col, nodata)
-
- /** Assign a `NoData` value to the tile column. */
- def rf_with_no_data(col: Column, nodata: Int): Column = SetNoDataValue(col, nodata)
-
- /** Assign a `NoData` value to the tile column. */
- def rf_with_no_data(col: Column, nodata: Column): Column = SetNoDataValue(col, nodata)
-
- /** Compute the full column aggregate floating point histogram. */
- def rf_agg_approx_histogram(col: Column): TypedColumn[Any, CellHistogram] = HistogramAggregate(col)
-
- /** Compute the full column aggregate floating point statistics. */
- def rf_agg_stats(col: Column): TypedColumn[Any, CellStatistics] = CellStatsAggregate(col)
-
- /** Computes the column aggregate mean. */
- def rf_agg_mean(col: Column) = CellMeanAggregate(col)
-
- /** Computes the number of non-NoData cells in a column. */
- def rf_agg_data_cells(col: Column): TypedColumn[Any, Long] = CellCountAggregate.DataCells(col)
-
- /** Computes the number of NoData cells in a column. */
- def rf_agg_no_data_cells(col: Column): TypedColumn[Any, Long] = CellCountAggregate.NoDataCells(col)
-
- /** Compute the Tile-wise mean */
- def rf_tile_mean(col: Column): TypedColumn[Any, Double] =
- TileMean(col)
-
- /** Compute the Tile-wise sum */
- def rf_tile_sum(col: Column): TypedColumn[Any, Double] =
- Sum(col)
-
- /** Compute the minimum cell value in tile. */
- def rf_tile_min(col: Column): TypedColumn[Any, Double] =
- TileMin(col)
-
- /** Compute the maximum cell value in tile. */
- def rf_tile_max(col: Column): TypedColumn[Any, Double] =
- TileMax(col)
-
- /** Compute TileHistogram of Tile values. */
- def rf_tile_histogram(col: Column): TypedColumn[Any, CellHistogram] =
- TileHistogram(col)
-
- /** Compute statistics of Tile values. */
- def rf_tile_stats(col: Column): TypedColumn[Any, CellStatistics] =
- TileStats(col)
-
- /** Counts the number of non-NoData cells per Tile. */
- def rf_data_cells(tile: Column): TypedColumn[Any, Long] =
- DataCells(tile)
-
- /** Counts the number of NoData cells per Tile. */
- def rf_no_data_cells(tile: Column): TypedColumn[Any, Long] =
- NoDataCells(tile)
-
- /** Returns true if all cells in the tile are NoData.*/
- def rf_is_no_data_tile(tile: Column): TypedColumn[Any, Boolean] =
- IsNoDataTile(tile)
-
- /** Returns true if any cells in the tile are true (non-zero and not NoData). */
- def rf_exists(tile: Column): TypedColumn[Any, Boolean] = Exists(tile)
-
- /** Returns true if all cells in the tile are true (non-zero and not NoData). */
- def rf_for_all(tile: Column): TypedColumn[Any, Boolean] = ForAll(tile)
-
- /** Compute cell-local aggregate descriptive statistics for a column of Tiles. */
- def rf_agg_local_stats(col: Column) =
- LocalStatsAggregate(col)
-
- /** Compute the cell-wise/local max operation between Tiles in a column. */
- def rf_agg_local_max(col: Column): TypedColumn[Any, Tile] = LocalTileOpAggregate.LocalMaxUDAF(col)
-
- /** Compute the cellwise/local min operation between Tiles in a column. */
- def rf_agg_local_min(col: Column): TypedColumn[Any, Tile] = LocalTileOpAggregate.LocalMinUDAF(col)
-
- /** Compute the cellwise/local mean operation between Tiles in a column. */
- def rf_agg_local_mean(col: Column): TypedColumn[Any, Tile] = LocalMeanAggregate(col)
-
- /** Compute the cellwise/local count of non-NoData cells for all Tiles in a column. */
- def rf_agg_local_data_cells(col: Column): TypedColumn[Any, Tile] = LocalCountAggregate.LocalDataCellsUDAF(col)
-
- /** Compute the cellwise/local count of NoData cells for all Tiles in a column. */
- def rf_agg_local_no_data_cells(col: Column): TypedColumn[Any, Tile] = LocalCountAggregate.LocalNoDataCellsUDAF(col)
-
- /** Cellwise addition between two Tiles or Tile and scalar column. */
- def rf_local_add(left: Column, right: Column): Column = Add(left, right)
-
- /** Cellwise addition of a scalar value to a tile. */
- def rf_local_add[T: Numeric](tileCol: Column, value: T): Column = Add(tileCol, value)
-
- /** Cellwise subtraction between two Tiles. */
- def rf_local_subtract(left: Column, right: Column): Column = Subtract(left, right)
-
- /** Cellwise subtraction of a scalar value from a tile. */
- def rf_local_subtract[T: Numeric](tileCol: Column, value: T): Column = Subtract(tileCol, value)
-
- /** Cellwise multiplication between two Tiles. */
- def rf_local_multiply(left: Column, right: Column): Column = Multiply(left, right)
-
- /** Cellwise multiplication of a tile by a scalar value. */
- def rf_local_multiply[T: Numeric](tileCol: Column, value: T): Column = Multiply(tileCol, value)
-
- /** Cellwise division between two Tiles. */
- def rf_local_divide(left: Column, right: Column): Column = Divide(left, right)
-
- /** Cellwise division of a tile by a scalar value. */
- def rf_local_divide[T: Numeric](tileCol: Column, value: T): Column = Divide(tileCol, value)
-
- /** Perform an arbitrary GeoTrellis `LocalTileBinaryOp` between two Tile columns. */
- def rf_local_algebra(op: LocalTileBinaryOp, left: Column, right: Column): TypedColumn[Any, Tile] =
- withTypedAlias(opName(op), left, right)(udf[Tile, Tile, Tile](op.apply).apply(left, right))
-
- /** Compute the normalized difference of two tile columns */
- def rf_normalized_difference(left: Column, right: Column) =
- NormalizedDifference(left, right)
-
- /** Constructor for tile column with a single cell value. */
- def rf_make_constant_tile(value: Number, cols: Int, rows: Int, cellType: CellType): TypedColumn[Any, Tile] =
- rf_make_constant_tile(value, cols, rows, cellType.name)
-
- /** Constructor for tile column with a single cell value. */
- def rf_make_constant_tile(value: Number, cols: Int, rows: Int, cellTypeName: String): TypedColumn[Any, Tile] = {
- val constTile = udf(() => F.makeConstantTile(value, cols, rows, cellTypeName))
- withTypedAlias(s"rf_make_constant_tile($value, $cols, $rows, $cellTypeName)")(constTile.apply())
- }
-
- /** Create a column constant tiles of zero */
- def rf_make_zeros_tile(cols: Int, rows: Int, cellType: CellType): TypedColumn[Any, Tile] =
- rf_make_zeros_tile(cols, rows, cellType.name)
-
- /** Create a column constant tiles of zero */
- def rf_make_zeros_tile(cols: Int, rows: Int, cellTypeName: String): TypedColumn[Any, Tile] = {
- import org.apache.spark.sql.rf.TileUDT.tileSerializer
- val constTile = encoders.serialized_literal(F.tileZeros(cols, rows, cellTypeName))
- withTypedAlias(s"rf_make_zeros_tile($cols, $rows, $cellTypeName)")(constTile)
- }
-
- /** Creates a column of tiles containing all ones */
- def rf_make_ones_tile(cols: Int, rows: Int, cellType: CellType): TypedColumn[Any, Tile] =
- rf_make_ones_tile(cols, rows, cellType.name)
-
- /** Creates a column of tiles containing all ones */
- def rf_make_ones_tile(cols: Int, rows: Int, cellTypeName: String): TypedColumn[Any, Tile] = {
- import org.apache.spark.sql.rf.TileUDT.tileSerializer
- val constTile = encoders.serialized_literal(F.tileOnes(cols, rows, cellTypeName))
- withTypedAlias(s"rf_make_ones_tile($cols, $rows, $cellTypeName)")(constTile)
- }
-
- /** Where the rf_mask tile contains NODATA, replace values in the source tile with NODATA */
- def rf_mask(sourceTile: Column, maskTile: Column): TypedColumn[Any, Tile] =
- Mask.MaskByDefined(sourceTile, maskTile)
-
- /** Where the `maskTile` equals `maskValue`, replace values in the source tile with `NoData` */
- def rf_mask_by_value(sourceTile: Column, maskTile: Column, maskValue: Column): TypedColumn[Any, Tile] =
- Mask.MaskByValue(sourceTile, maskTile, maskValue)
-
- /** Where the `maskTile` does **not** contain `NoData`, replace values in the source tile with `NoData` */
- def rf_inverse_mask(sourceTile: Column, maskTile: Column): TypedColumn[Any, Tile] =
- Mask.InverseMaskByDefined(sourceTile, maskTile)
-
- /** Where the `maskTile` does **not** equal `maskValue`, replace values in the source tile with `NoData` */
- def rf_inverse_mask_by_value(sourceTile: Column, maskTile: Column, maskValue: Column): TypedColumn[Any, Tile] =
- Mask.InverseMaskByValue(sourceTile, maskTile, maskValue)
-
- /** Create a tile where cells in the grid defined by cols, rows, and bounds are filled with the given value. */
- def rf_rasterize(geometry: Column, bounds: Column, value: Column, cols: Int, rows: Int): TypedColumn[Any, Tile] =
- withTypedAlias("rf_rasterize", geometry)(
- udf(F.rasterize(_: Geometry, _: Geometry, _: Int, cols, rows)).apply(geometry, bounds, value)
- )
-
- def rf_rasterize(geometry: Column, bounds: Column, value: Column, cols: Column, rows: Column): TypedColumn[Any, Tile] =
- withTypedAlias("rf_rasterize", geometry)(
- udf(F.rasterize).apply(geometry, bounds, value, cols, rows)
- )
-
- /** Reproject a column of geometry from one CRS to another.
- * @param sourceGeom Geometry column to reproject
- * @param srcCRS Native CRS of `sourceGeom` as a literal
- * @param dstCRSCol Destination CRS as a column
- */
- def st_reproject(sourceGeom: Column, srcCRS: CRS, dstCRSCol: Column): TypedColumn[Any, Geometry] =
- ReprojectGeometry(sourceGeom, srcCRS, dstCRSCol)
-
- /** Reproject a column of geometry from one CRS to another.
- * @param sourceGeom Geometry column to reproject
- * @param srcCRSCol Native CRS of `sourceGeom` as a column
- * @param dstCRS Destination CRS as a literal
- */
- def st_reproject(sourceGeom: Column, srcCRSCol: Column, dstCRS: CRS): TypedColumn[Any, Geometry] =
- ReprojectGeometry(sourceGeom, srcCRSCol, dstCRS)
-
- /** Reproject a column of geometry from one CRS to another.
- * @param sourceGeom Geometry column to reproject
- * @param srcCRS Native CRS of `sourceGeom` as a literal
- * @param dstCRS Destination CRS as a literal
- */
- def st_reproject(sourceGeom: Column, srcCRS: CRS, dstCRS: CRS): TypedColumn[Any, Geometry] =
- ReprojectGeometry(sourceGeom, srcCRS, dstCRS)
-
- /** Reproject a column of geometry from one CRS to another.
- * @param sourceGeom Geometry column to reproject
- * @param srcCRSCol Native CRS of `sourceGeom` as a column
- * @param dstCRSCol Destination CRS as a column
- */
- def st_reproject(sourceGeom: Column, srcCRSCol: Column, dstCRSCol: Column): TypedColumn[Any, Geometry] =
- ReprojectGeometry(sourceGeom, srcCRSCol, dstCRSCol)
-
- /** Render Tile as ASCII string, for debugging purposes. */
- def rf_render_ascii(tile: Column): TypedColumn[Any, String] =
- DebugRender.RenderAscii(tile)
-
- /** Render Tile cell values as numeric values, for debugging purposes. */
- def rf_render_matrix(tile: Column): TypedColumn[Any, String] =
- DebugRender.RenderMatrix(tile)
-
- /** Converts tiles in a column into PNG encoded byte array, using given ColorRamp to assign values to colors. */
- def rf_render_png(tile: Column, colors: ColorRamp): TypedColumn[Any, Array[Byte]] =
- RenderColorRampPNG(tile, colors)
-
- /** Converts columns of tiles representing RGB channels into a PNG encoded byte array. */
- def rf_render_png(red: Column, green: Column, blue: Column): TypedColumn[Any, Array[Byte]] =
- RenderCompositePNG(red, green, blue)
-
- /** Converts columns of tiles representing RGB channels into a single RGB packaged tile. */
- def rf_rgb_composite(red: Column, green: Column, blue: Column): Column =
- RGBComposite(red, green, blue)
-
- /** Cellwise less than value comparison between two tiles. */
- def rf_local_less(left: Column, right: Column): Column = Less(left, right)
-
- /** Cellwise less than value comparison between a tile and a scalar. */
- def rf_local_less[T: Numeric](tileCol: Column, value: T): Column = Less(tileCol, value)
-
- /** Cellwise less than or equal to value comparison between a tile and a scalar. */
- def rf_local_less_equal(left: Column, right: Column): Column = LessEqual(left, right)
-
- /** Cellwise less than or equal to value comparison between a tile and a scalar. */
- def rf_local_less_equal[T: Numeric](tileCol: Column, value: T): Column = LessEqual(tileCol, value)
-
- /** Cellwise greater than value comparison between two tiles. */
- def rf_local_greater(left: Column, right: Column): Column = Greater(left, right)
-
- /** Cellwise greater than value comparison between a tile and a scalar. */
- def rf_local_greater[T: Numeric](tileCol: Column, value: T): Column = Greater(tileCol, value)
- /** Cellwise greater than or equal to value comparison between two tiles. */
- def rf_local_greater_equal(left: Column, right: Column): Column = GreaterEqual(left, right)
-
- /** Cellwise greater than or equal to value comparison between a tile and a scalar. */
- def rf_local_greater_equal[T: Numeric](tileCol: Column, value: T): Column = GreaterEqual(tileCol, value)
-
- /** Cellwise equal to value comparison between two tiles. */
- def rf_local_equal(left: Column, right: Column): Column = Equal(left, right)
-
- /** Cellwise equal to value comparison between a tile and a scalar. */
- def rf_local_equal[T: Numeric](tileCol: Column, value: T): Column = Equal(tileCol, value)
-
- /** Cellwise inequality comparison between two tiles. */
- def rf_local_unequal(left: Column, right: Column): Column = Unequal(left, right)
-
- /** Cellwise inequality comparison between a tile and a scalar. */
- def rf_local_unequal[T: Numeric](tileCol: Column, value: T): Column = Unequal(tileCol, value)
-
- /** Return a tile with ones where the input is NoData, otherwise zero */
- def rf_local_no_data(tileCol: Column): Column = Undefined(tileCol)
-
- /** Return a tile with zeros where the input is NoData, otherwise one*/
- def rf_local_data(tileCol: Column): Column = Defined(tileCol)
-
- /** Round cell values to nearest integer without chaning cell type. */
- def rf_round(tileCol: Column): Column = Round(tileCol)
-
- /** Compute the absolute value of each cell. */
- def rf_abs(tileCol: Column): Column = Abs(tileCol)
-
- /** Take natural logarithm of cell values. */
- def rf_log(tileCol: Column): Column = Log(tileCol)
-
- /** Take base 10 logarithm of cell values. */
- def rf_log10(tileCol: Column): Column = Log10(tileCol)
-
- /** Take base 2 logarithm of cell values. */
- def rf_log2(tileCol: Column): Column = Log2(tileCol)
-
- /** Natural logarithm of one plus cell values. */
- def rf_log1p(tileCol: Column): Column = Log1p(tileCol)
-
- /** Exponential of cell values */
- def rf_exp(tileCol: Column): Column = Exp(tileCol)
-
- /** Ten to the power of cell values */
- def rf_exp10(tileCol: Column): Column = Exp10(tileCol)
-
- /** Two to the power of cell values */
- def rf_exp2(tileCol: Column): Column = Exp2(tileCol)
-
- /** Exponential of cell values, less one*/
- def rf_expm1(tileCol: Column): Column = ExpM1(tileCol)
-
- /** Return the incoming tile untouched. */
- def rf_identity(tileCol: Column): Column = Identity(tileCol)
-
- /** Create a row for each cell in Tile. */
- def rf_explode_tiles(cols: Column*): Column = rf_explode_tiles_sample(1.0, None, cols: _*)
-
- /** Create a row for each cell in Tile with random sampling and optional seed. */
- def rf_explode_tiles_sample(sampleFraction: Double, seed: Option[Long], cols: Column*): Column =
- ExplodeTiles(sampleFraction, seed, cols)
-
- /** Create a row for each cell in Tile with random sampling (no seed). */
- def rf_explode_tiles_sample(sampleFraction: Double, cols: Column*): Column =
- ExplodeTiles(sampleFraction, None, cols)
-}
+trait RasterFunctions extends TileFunctions with LocalFunctions with SpatialFunctions with AggregateFunctions with FocalFunctions
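The refactor above replaces the monolithic `RasterFunctions` body with a mix-in of per-concern traits (`TileFunctions`, `LocalFunctions`, and so on), so existing call sites should compile unchanged. A sketch of typical usage after the change, assuming the standard RasterFrames session setup and an illustrative file path:

```scala
import org.apache.spark.sql.SparkSession
import org.locationtech.rasterframes._  // RasterFunctions mix-in comes into scope here

val spark = SparkSession.builder()
  .master("local[*]")
  .getOrCreate()
  .withRasterFrames
import spark.implicits._

// Call sites are unaffected by the trait split: rf_dimensions and
// rf_tile_mean now resolve via TileFunctions rather than one large trait.
val df = spark.read.raster.load("example.tif")  // path is illustrative only
df.select(rf_dimensions($"proj_raster"), rf_tile_mean($"proj_raster")).show()
```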
diff --git a/core/src/main/scala/org/locationtech/rasterframes/StandardColumns.scala b/core/src/main/scala/org/locationtech/rasterframes/StandardColumns.scala
index 2e82ab356..cd4e9580a 100644
--- a/core/src/main/scala/org/locationtech/rasterframes/StandardColumns.scala
+++ b/core/src/main/scala/org/locationtech/rasterframes/StandardColumns.scala
@@ -25,12 +25,12 @@ import java.sql.Timestamp
import geotrellis.proj4.CRS
import geotrellis.raster.Tile
-import geotrellis.spark.{SpatialKey, TemporalKey}
+import geotrellis.layer._
import geotrellis.vector.{Extent, ProjectedExtent}
import org.apache.spark.sql.functions.col
import org.locationtech.jts.geom.{Point => jtsPoint, Polygon => jtsPolygon}
-import org.locationtech.rasterframes.encoders.StandardEncoders.PrimitiveEncoders._
import org.locationtech.rasterframes.tiles.ProjectedRasterTile
+import org.locationtech.rasterframes.encoders.SparkBasicEncoders._
/**
* Constants identifying column in most RasterFrames.
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/CatalystSerializer.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/CatalystSerializer.scala
deleted file mode 100644
index 831411557..000000000
--- a/core/src/main/scala/org/locationtech/rasterframes/encoders/CatalystSerializer.scala
+++ /dev/null
@@ -1,162 +0,0 @@
-/*
- * This software is licensed under the Apache 2 license, quoted below.
- *
- * Copyright 2018 Astraea, Inc.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"); you may not
- * use this file except in compliance with the License. You may obtain a copy of
- * the License at
- *
- * [http://www.apache.org/licenses/LICENSE-2.0]
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
- * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
- * License for the specific language governing permissions and limitations under
- * the License.
- *
- * SPDX-License-Identifier: Apache-2.0
- *
- */
-
-package org.locationtech.rasterframes.encoders
-
-import CatalystSerializer.CatalystIO
-import org.apache.spark.sql.Row
-import org.apache.spark.sql.catalyst.InternalRow
-import org.apache.spark.sql.catalyst.util.ArrayData
-import org.apache.spark.sql.types._
-import org.apache.spark.unsafe.types.UTF8String
-
-/**
- * Typeclass for converting to/from JVM object to catalyst encoding. The reason this exists is that
- * instantiating and binding `ExpressionEncoder[T]` is *very* expensive, and not suitable for
- * operations internal to an `Expression`.
- *
- * @since 10/19/18
- */
-trait CatalystSerializer[T] extends Serializable {
- def schema: StructType
- protected def to[R](t: T, io: CatalystIO[R]): R
- protected def from[R](t: R, io: CatalystIO[R]): T
-
- final def toRow(t: T): Row = to(t, CatalystIO[Row])
- final def fromRow(row: Row): T = from(row, CatalystIO[Row])
-
- final def toInternalRow(t: T): InternalRow = to(t, CatalystIO[InternalRow])
- final def fromInternalRow(row: InternalRow): T = from(row, CatalystIO[InternalRow])
-}
-
-object CatalystSerializer extends StandardSerializers {
- def apply[T: CatalystSerializer]: CatalystSerializer[T] = implicitly
-
- def schemaOf[T: CatalystSerializer]: StructType = apply[T].schema
-
- /**
- * For some reason `Row` and `InternalRow` share no common base type. Instead of using
- * structural types (which use reflection), this typeclass is used to normalize access
- * to the underlying storage construct.
- *
- * @tparam R row storage type
- */
- trait CatalystIO[R] extends Serializable {
- def create(values: Any*): R
- def to[T: CatalystSerializer](t: T): R = CatalystSerializer[T].to(t, this)
- def toSeq[T: CatalystSerializer](t: Seq[T]): AnyRef
- def get[T >: Null: CatalystSerializer](d: R, ordinal: Int): T
- def getSeq[T >: Null: CatalystSerializer](d: R, ordinal: Int): Seq[T]
- def isNullAt(d: R, ordinal: Int): Boolean
- def getBoolean(d: R, ordinal: Int): Boolean
- def getByte(d: R, ordinal: Int): Byte
- def getShort(d: R, ordinal: Int): Short
- def getInt(d: R, ordinal: Int): Int
- def getLong(d: R, ordinal: Int): Long
- def getFloat(d: R, ordinal: Int): Float
- def getDouble(d: R, ordinal: Int): Double
- def getString(d: R, ordinal: Int): String
- def getByteArray(d: R, ordinal: Int): Array[Byte]
- def encode(str: String): AnyRef
- }
-
- object CatalystIO {
- def apply[R: CatalystIO]: CatalystIO[R] = implicitly
-
- trait AbstractRowEncoder[R <: Row] extends CatalystIO[R] {
- override def isNullAt(d: R, ordinal: Int): Boolean = d.isNullAt(ordinal)
- override def getBoolean(d: R, ordinal: Int): Boolean = d.getBoolean(ordinal)
- override def getByte(d: R, ordinal: Int): Byte = d.getByte(ordinal)
- override def getShort(d: R, ordinal: Int): Short = d.getShort(ordinal)
- override def getInt(d: R, ordinal: Int): Int = d.getInt(ordinal)
- override def getLong(d: R, ordinal: Int): Long = d.getLong(ordinal)
- override def getFloat(d: R, ordinal: Int): Float = d.getFloat(ordinal)
- override def getDouble(d: R, ordinal: Int): Double = d.getDouble(ordinal)
- override def getString(d: R, ordinal: Int): String = d.getString(ordinal)
- override def getByteArray(d: R, ordinal: Int): Array[Byte] =
- d.get(ordinal).asInstanceOf[Array[Byte]]
- override def get[T >: Null: CatalystSerializer](d: R, ordinal: Int): T = {
- d.getAs[Any](ordinal) match {
- case r: Row => r.to[T]
- case o => o.asInstanceOf[T]
- }
- }
- override def toSeq[T: CatalystSerializer](t: Seq[T]): AnyRef = t.map(_.toRow)
- override def getSeq[T >: Null: CatalystSerializer](d: R, ordinal: Int): Seq[T] =
- d.getSeq[Row](ordinal).map(_.to[T])
- override def encode(str: String): String = str
- }
-
- implicit val rowIO: CatalystIO[Row] = new AbstractRowEncoder[Row] {
- override def create(values: Any*): Row = Row(values: _*)
- }
-
- implicit val internalRowIO: CatalystIO[InternalRow] = new CatalystIO[InternalRow] {
- override def isNullAt(d: InternalRow, ordinal: Int): Boolean = d.isNullAt(ordinal)
- override def getBoolean(d: InternalRow, ordinal: Int): Boolean = d.getBoolean(ordinal)
- override def getByte(d: InternalRow, ordinal: Int): Byte = d.getByte(ordinal)
- override def getShort(d: InternalRow, ordinal: Int): Short = d.getShort(ordinal)
- override def getInt(d: InternalRow, ordinal: Int): Int = d.getInt(ordinal)
- override def getLong(d: InternalRow, ordinal: Int): Long = d.getLong(ordinal)
- override def getFloat(d: InternalRow, ordinal: Int): Float = d.getFloat(ordinal)
- override def getDouble(d: InternalRow, ordinal: Int): Double = d.getDouble(ordinal)
- override def getString(d: InternalRow, ordinal: Int): String = d.getString(ordinal)
- override def getByteArray(d: InternalRow, ordinal: Int): Array[Byte] = d.getBinary(ordinal)
- override def get[T >: Null: CatalystSerializer](d: InternalRow, ordinal: Int): T = {
- val ser = CatalystSerializer[T]
- val struct = d.getStruct(ordinal, ser.schema.size)
- struct.to[T]
- }
- override def create(values: Any*): InternalRow = InternalRow(values: _*)
- override def toSeq[T: CatalystSerializer](t: Seq[T]): ArrayData =
- ArrayData.toArrayData(t.map(_.toInternalRow).toArray)
-
- override def getSeq[T >: Null: CatalystSerializer](d: InternalRow, ordinal: Int): Seq[T] = {
- val ad = d.getArray(ordinal)
- val result = Array.ofDim[Any](ad.numElements()).asInstanceOf[Array[T]]
- ad.foreach(
- CatalystSerializer[T].schema,
- (i, v) => result(i) = v.asInstanceOf[InternalRow].to[T]
- )
- result.toSeq
- }
- override def encode(str: String): UTF8String = UTF8String.fromString(str)
- }
- }
-
- implicit class WithToRow[T: CatalystSerializer](t: T) {
- def toInternalRow: InternalRow = if (t == null) null else CatalystSerializer[T].toInternalRow(t)
- def toRow: Row = if (t == null) null else CatalystSerializer[T].toRow(t)
- }
-
- implicit class WithFromInternalRow(val r: InternalRow) extends AnyVal {
- def to[T >: Null: CatalystSerializer]: T = if (r == null) null else CatalystSerializer[T].fromInternalRow(r)
- }
-
- implicit class WithFromRow(val r: Row) extends AnyVal {
- def to[T >: Null: CatalystSerializer]: T = if (r == null) null else CatalystSerializer[T].fromRow(r)
- }
-
- implicit class WithTypeConformity(val left: DataType) extends AnyVal {
- def conformsTo[T >: Null: CatalystSerializer]: Boolean =
- org.apache.spark.sql.rf.WithTypeConformity(left).conformsTo(schemaOf[T])
- }
-}
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/CatalystSerializerEncoder.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/CatalystSerializerEncoder.scala
deleted file mode 100644
index 792b74165..000000000
--- a/core/src/main/scala/org/locationtech/rasterframes/encoders/CatalystSerializerEncoder.scala
+++ /dev/null
@@ -1,83 +0,0 @@
-/*
- * This software is licensed under the Apache 2 license, quoted below.
- *
- * Copyright 2019 Astraea, Inc.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"); you may not
- * use this file except in compliance with the License. You may obtain a copy of
- * the License at
- *
- * [http://www.apache.org/licenses/LICENSE-2.0]
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
- * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
- * License for the specific language governing permissions and limitations under
- * the License.
- *
- * SPDX-License-Identifier: Apache-2.0
- *
- */
-
-package org.locationtech.rasterframes.encoders
-
-import org.apache.spark.sql.catalyst.analysis.GetColumnByOrdinal
-import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
-import org.apache.spark.sql.catalyst.expressions._
-import org.apache.spark.sql.catalyst.expressions.codegen.{CodegenContext, ExprCode}
-import org.apache.spark.sql.catalyst.{InternalRow, ScalaReflection}
-import org.apache.spark.sql.types.{DataType, ObjectType, StructField, StructType}
-
-import scala.reflect.runtime.universe.TypeTag
-
-object CatalystSerializerEncoder {
-
- case class CatSerializeToRow[T](child: Expression, serde: CatalystSerializer[T])
- extends UnaryExpression {
- override def dataType: DataType = serde.schema
- override protected def nullSafeEval(input: Any): Any = {
- val value = input.asInstanceOf[T]
- serde.toInternalRow(value)
- }
- override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
- val cs = ctx.addReferenceObj("serde", serde, serde.getClass.getName)
- nullSafeCodeGen(ctx, ev, input => s"${ev.value} = $cs.toInternalRow($input);")
- }
- }
- case class CatDeserializeFromRow[T](child: Expression, serde: CatalystSerializer[T], outputType: DataType)
- extends UnaryExpression {
- override def dataType: DataType = outputType
-
- private def objType = outputType match {
- case ot: ObjectType => ot.cls.getName
- case o => s"java.lang.Object /* $o */" // not sure what to do here... hopefully shouldn't happen
- }
- override protected def nullSafeEval(input: Any): Any = {
- val row = input.asInstanceOf[InternalRow]
- serde.fromInternalRow(row)
- }
- override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
- val cs = ctx.addReferenceObj("serde", serde, classOf[CatalystSerializer[_]].getName)
- nullSafeCodeGen(ctx, ev, input => s"${ev.value} = ($objType) $cs.fromInternalRow($input);")
- }
- }
- def apply[T: TypeTag: CatalystSerializer](flat: Boolean = false): ExpressionEncoder[T] = {
- val serde = CatalystSerializer[T]
-
- val schema = if (flat)
- StructType(Seq(
- StructField("value", serde.schema, true)
- ))
- else serde.schema
-
- val parentType: DataType = ScalaReflection.dataTypeFor[T]
-
- val inputObject = BoundReference(0, parentType, nullable = true)
-
- val serializer = CatSerializeToRow(inputObject, serde)
-
- val deserializer: Expression = CatDeserializeFromRow(GetColumnByOrdinal(0, schema), serde, parentType)
-
- ExpressionEncoder(schema, flat = flat, Seq(serializer), deserializer, typeToClassTag[T])
- }
-}
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/CellTypeEncoder.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/CellTypeEncoder.scala
deleted file mode 100644
index ea01d4143..000000000
--- a/core/src/main/scala/org/locationtech/rasterframes/encoders/CellTypeEncoder.scala
+++ /dev/null
@@ -1,64 +0,0 @@
-/*
- * This software is licensed under the Apache 2 license, quoted below.
- *
- * Copyright 2017 Astraea, Inc.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"); you may not
- * use this file except in compliance with the License. You may obtain a copy of
- * the License at
- *
- * [http://www.apache.org/licenses/LICENSE-2.0]
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
- * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
- * License for the specific language governing permissions and limitations under
- * the License.
- *
- * SPDX-License-Identifier: Apache-2.0
- *
- */
-
-package org.locationtech.rasterframes.encoders
-
-import geotrellis.raster.{CellType, DataType}
-import org.apache.spark.sql.catalyst.ScalaReflection
-import org.apache.spark.sql.catalyst.analysis.GetColumnByOrdinal
-import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
-import org.apache.spark.sql.rf.VersionShims.InvokeSafely
-import org.apache.spark.sql.types.{ObjectType, StringType}
-import org.apache.spark.unsafe.types.UTF8String
-import CatalystSerializer._
-import scala.reflect.classTag
-
-/**
- * Custom encoder for GT [[CellType]]. It's necessary since [[CellType]] is a type alias of
- * a type intersection.
- * @since 7/21/17
- */
-object CellTypeEncoder {
- def apply(): ExpressionEncoder[CellType] = {
- // We can't use StringBackedEncoder due to `CellType` being a type alias,
- // and Spark doesn't like that.
- import org.apache.spark.sql.catalyst.expressions._
- import org.apache.spark.sql.catalyst.expressions.objects._
- val ctType = ScalaReflection.dataTypeFor[DataType]
- val schema = schemaOf[CellType]
- val inputObject = BoundReference(0, ctType, nullable = false)
-
- val intermediateType = ObjectType(classOf[String])
- val serializer: Expression =
- StaticInvoke(
- classOf[UTF8String],
- StringType,
- "fromString",
- InvokeSafely(inputObject, "name", intermediateType) :: Nil
- )
-
- val inputRow = GetColumnByOrdinal(0, schema)
- val deserializer: Expression =
- StaticInvoke(CellType.getClass, ctType, "fromName", InvokeSafely(inputRow, "toString", intermediateType) :: Nil)
-
- ExpressionEncoder[CellType](schema, flat = false, Seq(serializer), deserializer, classTag[CellType])
- }
-}
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/DelegatingSubfieldEncoder.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/DelegatingSubfieldEncoder.scala
deleted file mode 100644
index cf4c2e5ac..000000000
--- a/core/src/main/scala/org/locationtech/rasterframes/encoders/DelegatingSubfieldEncoder.scala
+++ /dev/null
@@ -1,74 +0,0 @@
-/*
- * This software is licensed under the Apache 2 license, quoted below.
- *
- * Copyright 2017 Astraea, Inc.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"); you may not
- * use this file except in compliance with the License. You may obtain a copy of
- * the License at
- *
- * [http://www.apache.org/licenses/LICENSE-2.0]
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
- * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
- * License for the specific language governing permissions and limitations under
- * the License.
- *
- * SPDX-License-Identifier: Apache-2.0
- *
- */
-
-package org.locationtech.rasterframes.encoders
-
-import org.apache.spark.sql.catalyst.ScalaReflection
-import org.apache.spark.sql.catalyst.analysis.{GetColumnByOrdinal, UnresolvedAttribute, UnresolvedExtractValue}
-import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
-import org.apache.spark.sql.catalyst.expressions.objects.NewInstance
-import org.apache.spark.sql.catalyst.expressions._
-import org.apache.spark.sql.types.{StructField, StructType}
-import org.apache.spark.sql.rf.VersionShims.InvokeSafely
-
-import scala.reflect.runtime.universe.TypeTag
-
-/**
- * Encoder builder for types composed of other fields with {{ExpressionEncoder}}s.
- *
- * @since 8/2/17
- */
-object DelegatingSubfieldEncoder {
- def apply[T: TypeTag](
- fieldEncoders: (String, ExpressionEncoder[_])*): ExpressionEncoder[T] = {
- val schema = StructType(fieldEncoders.map {
- case (name, encoder) ⇒
- StructField(name, encoder.schema, false)
- })
-
- val parentType = ScalaReflection.dataTypeFor[T]
-
- val inputObject = BoundReference(0, parentType, nullable = false)
- val serializer = CreateNamedStruct(fieldEncoders.flatMap {
- case (name, encoder) ⇒
- val enc = encoder.serializer.map(_.transform {
- case r: BoundReference if r != inputObject ⇒
- InvokeSafely(inputObject, name, r.dataType)
- })
- Literal(name) :: CreateStruct(enc) :: Nil
- })
-
- val fieldDeserializers = fieldEncoders.map(_._2).zipWithIndex.map {
- case (enc, index) ⇒
- val input = GetColumnByOrdinal(index, enc.schema)
- val deserialized = enc.deserializer.transformUp {
- case UnresolvedAttribute(nameParts) ⇒
- UnresolvedExtractValue(input, Literal(nameParts.head))
- case GetColumnByOrdinal(ordinal, _) ⇒ GetStructField(input, ordinal)
- }
- If(IsNull(input), Literal.create(null, deserialized.dataType), deserialized)
- }
-
- val deserializer: Expression = NewInstance(runtimeClass[T], fieldDeserializers, parentType, propagateNull = false)
-
- ExpressionEncoder(schema, flat = false, serializer.flatten, deserializer, typeToClassTag[T])
- }
-}
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/EnvelopeEncoder.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/EnvelopeEncoder.scala
deleted file mode 100644
index 50d66f3e0..000000000
--- a/core/src/main/scala/org/locationtech/rasterframes/encoders/EnvelopeEncoder.scala
+++ /dev/null
@@ -1,62 +0,0 @@
-/*
- * This software is licensed under the Apache 2 license, quoted below.
- *
- * Copyright 2019 Astraea, Inc.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"); you may not
- * use this file except in compliance with the License. You may obtain a copy of
- * the License at
- *
- * [http://www.apache.org/licenses/LICENSE-2.0]
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
- * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
- * License for the specific language governing permissions and limitations under
- * the License.
- *
- * SPDX-License-Identifier: Apache-2.0
- *
- */
-
-package org.locationtech.rasterframes.encoders
-
-import org.locationtech.jts.geom.Envelope
-import org.apache.spark.sql.catalyst.ScalaReflection
-import org.apache.spark.sql.catalyst.analysis.GetColumnByOrdinal
-import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
-import org.apache.spark.sql.catalyst.expressions.objects.NewInstance
-import org.apache.spark.sql.catalyst.expressions.{BoundReference, CreateNamedStruct, Literal}
-import org.apache.spark.sql.rf.VersionShims.InvokeSafely
-import org.apache.spark.sql.types._
-import CatalystSerializer._
-import scala.reflect.classTag
-
-/**
- * Spark DataSet codec for JTS Envelope.
- *
- * @since 2/22/18
- */
-object EnvelopeEncoder {
-
- val schema = schemaOf[Envelope]
-
- val dataType: DataType = ScalaReflection.dataTypeFor[Envelope]
-
- def apply(): ExpressionEncoder[Envelope] = {
- val inputObject = BoundReference(0, ObjectType(classOf[Envelope]), nullable = true)
-
- val invokers = schema.flatMap { f ⇒
- val getter = "get" + f.name.head.toUpper + f.name.tail
- Literal(f.name) :: InvokeSafely(inputObject, getter, DoubleType) :: Nil
- }
-
- val serializer = CreateNamedStruct(invokers)
- val deserializer = NewInstance(classOf[Envelope],
- (0 to 3).map(GetColumnByOrdinal(_, DoubleType)),
- dataType, false
- )
-
- new ExpressionEncoder[Envelope](schema, flat = false, serializer.flatten, deserializer, classTag[Envelope])
- }
-}
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/ManualTypedEncoder.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/ManualTypedEncoder.scala
new file mode 100644
index 000000000..d9fd8282b
--- /dev/null
+++ b/core/src/main/scala/org/locationtech/rasterframes/encoders/ManualTypedEncoder.scala
@@ -0,0 +1,91 @@
+package org.locationtech.rasterframes.encoders
+
+import frameless.{RecordEncoderField, TypedEncoder}
+import org.apache.spark.sql.FramelessInternals
+import org.apache.spark.sql.catalyst.expressions.objects.{Invoke, InvokeLike, NewInstance, StaticInvoke}
+import org.apache.spark.sql.catalyst.expressions.{CreateNamedStruct, Expression, GetStructField, If, IsNull, Literal}
+import org.apache.spark.sql.types.{DataType, Metadata, StructField, StructType}
+
+import scala.reflect.{ClassTag, classTag}
+
+/** Can be useful for non-Scala types and for complicated case classes with implicit parameters in the constructor. */
+object ManualTypedEncoder {
+ /** Invokes apply from the companion object. */
+ def staticInvoke[T: ClassTag](
+ fields: List[RecordEncoderField],
+ fieldNameModify: String => String = identity,
+ isNullable: Boolean = true
+ ): TypedEncoder[T] = apply[T](fields, { (classTag, newArgs, jvmRepr) => StaticInvoke(classTag.runtimeClass, jvmRepr, "apply", newArgs, propagateNull = true, returnNullable = false) }, fieldNameModify, isNullable)
+
+ /** Invokes object constructor. */
+ def newInstance[T: ClassTag](
+ fields: List[RecordEncoderField],
+ fieldNameModify: String => String = identity,
+ isNullable: Boolean = true
+ ): TypedEncoder[T] = apply[T](fields, { (classTag, newArgs, jvmRepr) => NewInstance(classTag.runtimeClass, newArgs, jvmRepr, propagateNull = true) }, fieldNameModify, isNullable)
+
+ def apply[T: ClassTag](
+ fields: List[RecordEncoderField],
+ newInstanceExpression: (ClassTag[T], Seq[Expression], DataType) => InvokeLike,
+ fieldNameModify: String => String = identity,
+ isNullable: Boolean = true
+ ): TypedEncoder[T] = make[T](fields, newInstanceExpression, fieldNameModify, isNullable, classTag[T])
+
+ private def make[T](
+ // the catalyst struct
+ fields: List[RecordEncoderField],
+ // newInstanceExpression for the fromCatalyst function
+ newInstanceExpression: (ClassTag[T], Seq[Expression], DataType) => InvokeLike,
+ // converts a field name into the name of its getter
+ fieldNameModify: String => String,
+ // is the codec nullable
+ isNullable: Boolean,
+ // ClassTag is required by the TypedEncoder constructor.
+ // It is passed explicitly to disambiguate between the ClassTag supplied implicitly
+ // as a function argument and the one expected by the TypedEncoder constructor.
+ ct: ClassTag[T]
+ ): TypedEncoder[T] = new TypedEncoder[T]()(ct) {
+ def nullable: Boolean = isNullable
+
+ def jvmRepr: DataType = FramelessInternals.objectTypeFor[T]
+
+ def catalystRepr: DataType = {
+ val structFields = fields.map { field =>
+ StructField(
+ name = field.name,
+ dataType = field.encoder.catalystRepr,
+ nullable = field.encoder.nullable,
+ metadata = Metadata.empty
+ )
+ }
+
+ StructType(structFields)
+ }
+
+ def fromCatalyst(path: Expression): Expression = {
+ val newArgs: Seq[Expression] = fields.map { field =>
+ field.encoder.fromCatalyst( GetStructField(path, field.ordinal, Some(field.name)) )
+ }
+ val newExpr = newInstanceExpression(classTag, newArgs, jvmRepr)
+
+ val nullExpr = Literal.create(null, jvmRepr)
+ If(IsNull(path), nullExpr, newExpr)
+ }
+
+ def toCatalyst(path: Expression): Expression = {
+ val nameExprs = fields.map { field => Literal(field.name) }
+
+ val valueExprs: Seq[Expression] = fields.map { field =>
+ val fieldPath = Invoke(path, fieldNameModify(field.name), field.encoder.jvmRepr, Nil)
+ field.encoder.toCatalyst(fieldPath)
+ }
+
+ // CreateNamedStruct expects an interleaved sequence of name/value expression pairs
+ val exprs = nameExprs.zip(valueExprs).flatMap { case (nameExpr, valueExpr) => nameExpr :: valueExpr :: Nil }
+
+ val createExpr = CreateNamedStruct(exprs)
+ val nullExpr = Literal.create(null, createExpr.dataType)
+ If(IsNull(path), nullExpr, createExpr)
+ }
+ }
+}
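As a usage sketch for the new `ManualTypedEncoder` (the `Interval` class below is illustrative, not part of the codebase), an encoder for a plain non-case class might be assembled like this, assuming frameless's `RecordEncoderField(ordinal, name, encoder)` constructor:

```scala
import frameless.{RecordEncoderField, TypedEncoder}
import org.locationtech.rasterframes.encoders.ManualTypedEncoder

// A plain (non-case) class with getters, as might come from a Java library.
class Interval(val start: Double, val end: Double)

// Encode it as a two-field struct; each field is read back via the getter
// of the same name (fieldNameModify is left as identity).
implicit val intervalEncoder: TypedEncoder[Interval] =
  ManualTypedEncoder.newInstance[Interval](
    fields = List(
      RecordEncoderField(0, "start", TypedEncoder[Double]),
      RecordEncoderField(1, "end", TypedEncoder[Double])
    )
  )
```

`newInstance` builds the deserializer from the class constructor; for types constructed via a companion `apply`, `staticInvoke` would be used instead.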
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/SerializersCache.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/SerializersCache.scala
new file mode 100644
index 000000000..02cfde90f
--- /dev/null
+++ b/core/src/main/scala/org/locationtech/rasterframes/encoders/SerializersCache.scala
@@ -0,0 +1,68 @@
+package org.locationtech.rasterframes.encoders
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.encoders.{ExpressionEncoder, RowEncoder}
+
+import scala.collection.mutable
+import scala.reflect.runtime.universe.TypeTag
+
+object SerializersCache {
+ /**
+ * Spark executes partitions on a blocking thread pool.
+ * Caching (de)serializers is worthwhile, since creating a serializer instance is
+ * expensive, but the cache must be local to each thread.
+ *
+ * When shared across threads, (de)serializers tend to corrupt data and/or fail at runtime.
+ * The alternatives would be global locks or a dedicated executor per (de)serializer.
+ */
+ private class ThreadLocalHashMap[K, V] extends ThreadLocal[mutable.HashMap[K, V]] {
+ override def initialValue(): mutable.HashMap[K, V] = mutable.HashMap.empty
+ }
+ private object ThreadLocalHashMap {
+ def empty[K, V]: ThreadLocalHashMap[K, V] = new ThreadLocalHashMap
+ }
+
+ /** SerializerSafe ensures that every serializer from the cache copies the produced InternalRow after application. */
+ case class SerializerSafe[T](underlying: ExpressionEncoder.Serializer[T]) {
+ def apply(t: T): InternalRow = underlying.apply(t).copy()
+ }
+
+ // T => InternalRow
+ private val cacheSerializer: ThreadLocalHashMap[TypeTag[_], SerializerSafe[_]] = ThreadLocalHashMap.empty
+ // Row with Schema T => InternalRow
+ private val cacheSerializerRow: ThreadLocalHashMap[TypeTag[_], SerializerSafe[Row]] = ThreadLocalHashMap.empty
+ // InternalRow => T
+ private val cacheDeserializer: ThreadLocalHashMap[TypeTag[_], ExpressionEncoder.Deserializer[_]] = ThreadLocalHashMap.empty
+ // InternalRow => Row with Schema T
+ private val cacheDeserializerRow: ThreadLocalHashMap[TypeTag[_], ExpressionEncoder.Deserializer[Row]] = ThreadLocalHashMap.empty
+
+ def serializer[T](implicit tag: TypeTag[T], encoder: ExpressionEncoder[T]): SerializerSafe[T] =
+ cacheSerializer.get.getOrElseUpdate(tag, SerializerSafe(encoder.createSerializer())).asInstanceOf[SerializerSafe[T]]
+
+ def rowSerializer[T](implicit tag: TypeTag[T], encoder: ExpressionEncoder[T]): SerializerSafe[Row] =
+ cacheSerializerRow.get.getOrElseUpdate(tag, SerializerSafe(RowEncoder(encoder.schema).createSerializer()))
+
+ def deserializer[T](implicit tag: TypeTag[T], encoder: ExpressionEncoder[T]): ExpressionEncoder.Deserializer[T] =
+ cacheDeserializer.get.getOrElseUpdate(tag, encoder.resolveAndBind().createDeserializer()).asInstanceOf[ExpressionEncoder.Deserializer[T]]
+
+ def rowDeserializer[T](implicit tag: TypeTag[T], encoder: ExpressionEncoder[T]): ExpressionEncoder.Deserializer[Row] =
+ cacheDeserializerRow.get.getOrElseUpdate(tag, RowEncoder(encoder.schema).resolveAndBind().createDeserializer())
+
+ /**
+ * https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-RowEncoder.html
+ * https://github.com/apache/spark/blob/93cec49212fe82816fcadf69f429cebaec60e058/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L75-L86
+ */
+ def rowDeserialize[T](implicit tag: TypeTag[T], encoder: ExpressionEncoder[T]): Row => T =
+ { row => deserializer[T](tag, encoder)(rowSerializer[T](tag, encoder)(row)) }
+
+ def rowSerialize[T](implicit tag: TypeTag[T], encoder: ExpressionEncoder[T]): T => Row =
+ { t => rowDeserializer[T](tag, encoder)(serializer[T](tag, encoder)(t)) }
+
+ def clean(): Unit = {
+ cacheSerializer.remove()
+ cacheSerializerRow.remove()
+ cacheDeserializer.remove()
+ cacheDeserializerRow.remove()
+ }
+}
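To illustrate the intended call pattern of `SerializersCache` (a sketch, assuming the implicit `Extent` encoder from `StandardEncoders` is in scope and a `TypeTag[Extent]` is derivable):

```scala
import geotrellis.vector.Extent
import org.locationtech.rasterframes.encoders.SerializersCache
import org.locationtech.rasterframes.encoders.StandardEncoders._

// The first call on a given thread creates and caches the (de)serializer;
// subsequent calls on the same thread reuse the cached instance.
val toRow   = SerializersCache.rowSerialize[Extent]
val fromRow = SerializersCache.rowDeserialize[Extent]

val extent = Extent(0.0, 0.0, 10.0, 10.0)
val row    = toRow(extent)
// fromRow(row) should round-trip back to the original extent.
```

Because the caches are thread-local, the same code is safe to call from `mapPartitions` closures without additional locking.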
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/SparkBasicEncoders.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/SparkBasicEncoders.scala
index e2830f7f1..b1257b6bd 100644
--- a/core/src/main/scala/org/locationtech/rasterframes/encoders/SparkBasicEncoders.scala
+++ b/core/src/main/scala/org/locationtech/rasterframes/encoders/SparkBasicEncoders.scala
@@ -28,11 +28,13 @@ import scala.reflect.runtime.universe._
/**
* Container for primitive Spark encoders, pulled into implicit scope.
+ * Be careful with these imports; they may conflict with spark.implicits._ when in the same scope.
*
* @since 12/28/17
*/
private[rasterframes] trait SparkBasicEncoders {
implicit def arrayEnc[T: TypeTag]: Encoder[Array[T]] = ExpressionEncoder()
+ implicit def seqEnc[T: TypeTag]: Encoder[Seq[T]] = ExpressionEncoder()
implicit val intEnc: Encoder[Int] = Encoders.scalaInt
implicit val longEnc: Encoder[Long] = Encoders.scalaLong
implicit val stringEnc: Encoder[String] = Encoders.STRING
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/StandardEncoders.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/StandardEncoders.scala
index 256da58d8..639b33cbd 100644
--- a/core/src/main/scala/org/locationtech/rasterframes/encoders/StandardEncoders.scala
+++ b/core/src/main/scala/org/locationtech/rasterframes/encoders/StandardEncoders.scala
@@ -21,58 +21,70 @@
package org.locationtech.rasterframes.encoders
-import java.net.URI
-import java.sql.Timestamp
-
import org.locationtech.rasterframes.stats.{CellHistogram, CellStatistics, LocalCellStatistics}
import org.locationtech.jts.geom.Envelope
import geotrellis.proj4.CRS
-import geotrellis.raster.{CellSize, CellType, Raster, Tile, TileLayout}
-import geotrellis.spark.tiling.LayoutDefinition
-import geotrellis.spark.{KeyBounds, SpaceTimeKey, SpatialKey, TemporalKey, TemporalProjectedExtent, TileLayerMetadata}
+import geotrellis.raster.{CellGrid, CellSize, CellType, Dimensions, GridBounds, Raster, Tile, TileLayout}
+import geotrellis.layer._
import geotrellis.vector.{Extent, ProjectedExtent}
-import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
+import org.apache.spark.sql.catalyst.util.QuantileSummaries
import org.locationtech.geomesa.spark.jts.encoders.SpatialEncoders
-import org.locationtech.rasterframes.model.{CellContext, Cells, TileContext, TileDataContext}
+import org.locationtech.rasterframes.model.{CellContext, LongExtent, TileContext, TileDataContext}
+import frameless.TypedEncoder
+import geotrellis.raster.mapalgebra.focal.{Kernel, Neighborhood, TargetCell}
+import org.locationtech.rasterframes.ref.RFRasterSource
+import org.locationtech.rasterframes.tiles.ProjectedRasterTile
+import java.net.URI
+import java.sql.Timestamp
+import scala.reflect.ClassTag
import scala.reflect.runtime.universe._
-/**
- * Implicit encoder definitions for RasterFrameLayer types.
- */
-trait StandardEncoders extends SpatialEncoders {
- object PrimitiveEncoders extends SparkBasicEncoders
+trait StandardEncoders extends SpatialEncoders with TypedEncoders {
def expressionEncoder[T: TypeTag]: ExpressionEncoder[T] = ExpressionEncoder()
- implicit def spatialKeyEncoder: ExpressionEncoder[SpatialKey] = ExpressionEncoder()
- implicit def temporalKeyEncoder: ExpressionEncoder[TemporalKey] = ExpressionEncoder()
- implicit def spaceTimeKeyEncoder: ExpressionEncoder[SpaceTimeKey] = ExpressionEncoder()
- implicit def layoutDefinitionEncoder: ExpressionEncoder[LayoutDefinition] = ExpressionEncoder()
- implicit def stkBoundsEncoder: ExpressionEncoder[KeyBounds[SpaceTimeKey]] = ExpressionEncoder()
- implicit def extentEncoder: ExpressionEncoder[Extent] = ExpressionEncoder[Extent]()
- implicit def singlebandTileEncoder: ExpressionEncoder[Tile] = ExpressionEncoder()
- implicit def rasterEncoder: ExpressionEncoder[Raster[Tile]] = ExpressionEncoder()
- implicit def tileLayerMetadataEncoder[K: TypeTag]: ExpressionEncoder[TileLayerMetadata[K]] = TileLayerMetadataEncoder()
- implicit def crsEncoder: ExpressionEncoder[CRS] = CRSEncoder()
- implicit def projectedExtentEncoder: ExpressionEncoder[ProjectedExtent] = ProjectedExtentEncoder()
- implicit def temporalProjectedExtentEncoder: ExpressionEncoder[TemporalProjectedExtent] = TemporalProjectedExtentEncoder()
- implicit def cellTypeEncoder: ExpressionEncoder[CellType] = CellTypeEncoder()
- implicit def cellSizeEncoder: ExpressionEncoder[CellSize] = ExpressionEncoder()
- implicit def uriEncoder: ExpressionEncoder[URI] = URIEncoder()
- implicit def envelopeEncoder: ExpressionEncoder[Envelope] = EnvelopeEncoder()
- implicit def timestampEncoder: ExpressionEncoder[Timestamp] = ExpressionEncoder()
- implicit def strMapEncoder: ExpressionEncoder[Map[String, String]] = ExpressionEncoder()
- implicit def cellStatsEncoder: ExpressionEncoder[CellStatistics] = ExpressionEncoder()
- implicit def cellHistEncoder: ExpressionEncoder[CellHistogram] = ExpressionEncoder()
- implicit def localCellStatsEncoder: ExpressionEncoder[LocalCellStatistics] = ExpressionEncoder()
- implicit def tilelayoutEncoder: ExpressionEncoder[TileLayout] = ExpressionEncoder()
- implicit def cellContextEncoder: ExpressionEncoder[CellContext] = CellContext.encoder
- implicit def cellsEncoder: ExpressionEncoder[Cells] = Cells.encoder
- implicit def tileContextEncoder: ExpressionEncoder[TileContext] = TileContext.encoder
- implicit def tileDataContextEncoder: ExpressionEncoder[TileDataContext] = TileDataContext.encoder
- implicit def extentTilePairEncoder: Encoder[(ProjectedExtent, Tile)] = Encoders.tuple(projectedExtentEncoder, singlebandTileEncoder)
+ implicit def optionalEncoder[T: TypedEncoder]: ExpressionEncoder[Option[T]] = typedExpressionEncoder[Option[T]]
+
+ implicit lazy val strMapEncoder: ExpressionEncoder[Map[String, String]] = ExpressionEncoder()
+ implicit lazy val projectedExtentEncoder: ExpressionEncoder[ProjectedExtent] = ExpressionEncoder()
+ implicit lazy val temporalProjectedExtentEncoder: ExpressionEncoder[TemporalProjectedExtent] = ExpressionEncoder()
+ implicit lazy val timestampEncoder: ExpressionEncoder[Timestamp] = ExpressionEncoder()
+ implicit lazy val cellStatsEncoder: ExpressionEncoder[CellStatistics] = ExpressionEncoder()
+ implicit lazy val cellHistEncoder: ExpressionEncoder[CellHistogram] = ExpressionEncoder()
+ implicit lazy val localCellStatsEncoder: ExpressionEncoder[LocalCellStatistics] = ExpressionEncoder()
+
+ implicit lazy val crsExpressionEncoder: ExpressionEncoder[CRS] = typedExpressionEncoder
+ implicit lazy val uriEncoder: ExpressionEncoder[URI] = typedExpressionEncoder[URI]
+ implicit lazy val neighborhoodEncoder: ExpressionEncoder[Neighborhood] = typedExpressionEncoder[Neighborhood]
+ implicit lazy val targetCellEncoder: ExpressionEncoder[TargetCell] = typedExpressionEncoder[TargetCell]
+ implicit lazy val kernelEncoder: ExpressionEncoder[Kernel] = typedExpressionEncoder[Kernel]
+ implicit lazy val quantileSummariesEncoder: ExpressionEncoder[QuantileSummaries] = typedExpressionEncoder[QuantileSummaries]
+ implicit lazy val envelopeEncoder: ExpressionEncoder[Envelope] = typedExpressionEncoder
+ implicit lazy val longExtentEncoder: ExpressionEncoder[LongExtent] = typedExpressionEncoder
+ implicit lazy val extentEncoder: ExpressionEncoder[Extent] = typedExpressionEncoder
+ implicit lazy val cellSizeEncoder: ExpressionEncoder[CellSize] = typedExpressionEncoder
+ implicit lazy val tileLayoutEncoder: ExpressionEncoder[TileLayout] = typedExpressionEncoder
+ implicit lazy val spatialKeyEncoder: ExpressionEncoder[SpatialKey] = typedExpressionEncoder
+ implicit lazy val temporalKeyEncoder: ExpressionEncoder[TemporalKey] = typedExpressionEncoder
+ implicit lazy val spaceTimeKeyEncoder: ExpressionEncoder[SpaceTimeKey] = typedExpressionEncoder
+ implicit def keyBoundsEncoder[K: TypedEncoder]: ExpressionEncoder[KeyBounds[K]] = typedExpressionEncoder[KeyBounds[K]]
+ implicit lazy val cellTypeEncoder: ExpressionEncoder[CellType] = typedExpressionEncoder[CellType]
+ implicit def dimensionsEncoder[N: Integral: TypedEncoder]: ExpressionEncoder[Dimensions[N]] = typedExpressionEncoder[Dimensions[N]]
+ implicit def gridBoundsEncoder[N: Integral: TypedEncoder]: ExpressionEncoder[GridBounds[N]] = typedExpressionEncoder[GridBounds[N]]
+ implicit lazy val layoutDefinitionEncoder: ExpressionEncoder[LayoutDefinition] = typedExpressionEncoder
+ implicit def tileLayerMetadataEncoder[K: TypedEncoder: ClassTag]: ExpressionEncoder[TileLayerMetadata[K]] = typedExpressionEncoder[TileLayerMetadata[K]]
+ implicit lazy val tileContextEncoder: ExpressionEncoder[TileContext] = typedExpressionEncoder
+ implicit lazy val tileDataContextEncoder: ExpressionEncoder[TileDataContext] = typedExpressionEncoder
+ implicit lazy val cellContextEncoder: ExpressionEncoder[CellContext] = typedExpressionEncoder
+
+ implicit lazy val tileEncoder: ExpressionEncoder[Tile] = typedExpressionEncoder
+ implicit def rasterEncoder[T <: CellGrid[Int]: TypedEncoder]: ExpressionEncoder[Raster[T]] = typedExpressionEncoder[Raster[T]]
+ // Intentionally not implicit, defined as implicit in the ProjectedRasterTile companion object
+ lazy val projectedRasterTileEncoder: ExpressionEncoder[ProjectedRasterTile] = typedExpressionEncoder
+ // Intentionally not implicit, defined as implicit in the RFRasterSource companion object
+ lazy val rfRasterSourceEncoder: ExpressionEncoder[RFRasterSource] = typedExpressionEncoder
}
object StandardEncoders extends StandardEncoders
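The reworked `StandardEncoders` derives most encoders from frameless `TypedEncoder`s, but at the use site they remain ordinary Spark `Encoder`s; a hypothetical session-level sketch:

```scala
import org.apache.spark.sql.SparkSession
import geotrellis.vector.Extent
import org.locationtech.rasterframes.encoders.StandardEncoders._

val spark = SparkSession.builder().master("local[*]").getOrCreate()

// extentEncoder is now frameless-derived, but Datasets of Extent
// are created the same way as before.
val ds = spark.createDataset(Seq(Extent(0.0, 0.0, 1.0, 1.0)))(extentEncoder)
ds.printSchema()
```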
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/StandardSerializers.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/StandardSerializers.scala
deleted file mode 100644
index 1983f8bb9..000000000
--- a/core/src/main/scala/org/locationtech/rasterframes/encoders/StandardSerializers.scala
+++ /dev/null
@@ -1,306 +0,0 @@
-/*
- * This software is licensed under the Apache 2 license, quoted below.
- *
- * Copyright 2019 Astraea, Inc.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"); you may not
- * use this file except in compliance with the License. You may obtain a copy of
- * the License at
- *
- * [http://www.apache.org/licenses/LICENSE-2.0]
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
- * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
- * License for the specific language governing permissions and limitations under
- * the License.
- *
- * SPDX-License-Identifier: Apache-2.0
- *
- */
-
-package org.locationtech.rasterframes.encoders
-
-import com.github.blemale.scaffeine.Scaffeine
-import geotrellis.proj4.CRS
-import geotrellis.raster._
-import geotrellis.spark._
-import geotrellis.spark.tiling.LayoutDefinition
-import geotrellis.vector._
-import org.apache.spark.sql.types._
-import org.locationtech.jts.geom.Envelope
-import org.locationtech.rasterframes.TileType
-import org.locationtech.rasterframes.encoders.CatalystSerializer.{CatalystIO, _}
-import org.locationtech.rasterframes.model.LazyCRS
-
-/** Collection of CatalystSerializers for third-party types. */
-trait StandardSerializers {
-
- implicit val envelopeSerializer: CatalystSerializer[Envelope] = new CatalystSerializer[Envelope] {
- override val schema: StructType = StructType(Seq(
- StructField("minX", DoubleType, false),
- StructField("maxX", DoubleType, false),
- StructField("minY", DoubleType, false),
- StructField("maxY", DoubleType, false)
- ))
-
- override protected def to[R](t: Envelope, io: CatalystIO[R]): R = io.create(
- t.getMinX, t.getMaxX, t.getMinY, t.getMaxX
- )
-
- override protected def from[R](t: R, io: CatalystIO[R]): Envelope = new Envelope(
- io.getDouble(t, 0), io.getDouble(t, 1), io.getDouble(t, 2), io.getDouble(t, 3)
- )
- }
-
- implicit val extentSerializer: CatalystSerializer[Extent] = new CatalystSerializer[Extent] {
- override val schema: StructType = StructType(Seq(
- StructField("xmin", DoubleType, false),
- StructField("ymin", DoubleType, false),
- StructField("xmax", DoubleType, false),
- StructField("ymax", DoubleType, false)
- ))
- override def to[R](t: Extent, io: CatalystIO[R]): R = io.create(
- t.xmin, t.ymin, t.xmax, t.ymax
- )
- override def from[R](row: R, io: CatalystIO[R]): Extent = Extent(
- io.getDouble(row, 0),
- io.getDouble(row, 1),
- io.getDouble(row, 2),
- io.getDouble(row, 3)
- )
- }
-
- implicit val gridBoundsSerializer: CatalystSerializer[GridBounds] = new CatalystSerializer[GridBounds] {
- override val schema: StructType = StructType(Seq(
- StructField("colMin", IntegerType, false),
- StructField("rowMin", IntegerType, false),
- StructField("colMax", IntegerType, false),
- StructField("rowMax", IntegerType, false)
- ))
-
- override protected def to[R](t: GridBounds, io: CatalystIO[R]): R = io.create(
- t.colMin, t.rowMin, t.colMax, t.rowMax
- )
-
- override protected def from[R](t: R, io: CatalystIO[R]): GridBounds = GridBounds(
- io.getInt(t, 0),
- io.getInt(t, 1),
- io.getInt(t, 2),
- io.getInt(t, 3)
- )
- }
-
- implicit val crsSerializer: CatalystSerializer[CRS] = new CatalystSerializer[CRS] {
- override val schema: StructType = StructType(Seq(
- StructField("crsProj4", StringType, false)
- ))
- override def to[R](t: CRS, io: CatalystIO[R]): R = io.create(
- io.encode(
- // Don't do this... it's 1000x slower to decode.
- //t.epsgCode.map(c => "EPSG:" + c).getOrElse(t.toProj4String)
- t.toProj4String
- )
- )
- override def from[R](row: R, io: CatalystIO[R]): CRS =
- LazyCRS(io.getString(row, 0))
- }
-
- implicit val cellTypeSerializer: CatalystSerializer[CellType] = new CatalystSerializer[CellType] {
- import StandardSerializers._
- override val schema: StructType = StructType(Seq(
- StructField("cellTypeName", StringType, false)
- ))
- override def to[R](t: CellType, io: CatalystIO[R]): R = io.create(
- io.encode(ct2sCache.get(t))
- )
- override def from[R](row: R, io: CatalystIO[R]): CellType =
- s2ctCache.get(io.getString(row, 0))
- }
-
- implicit val projectedExtentSerializer: CatalystSerializer[ProjectedExtent] = new CatalystSerializer[ProjectedExtent] {
- override val schema: StructType = StructType(Seq(
- StructField("extent", schemaOf[Extent], false),
- StructField("crs", schemaOf[CRS], false)
- ))
-
- override protected def to[R](t: ProjectedExtent, io: CatalystSerializer.CatalystIO[R]): R = io.create(
- io.to(t.extent),
- io.to(t.crs)
- )
-
- override protected def from[R](t: R, io: CatalystSerializer.CatalystIO[R]): ProjectedExtent = ProjectedExtent(
- io.get[Extent](t, 0),
- io.get[CRS](t, 1)
- )
- }
-
- implicit val spatialKeySerializer: CatalystSerializer[SpatialKey] = new CatalystSerializer[SpatialKey] {
- override val schema: StructType = StructType(Seq(
- StructField("col", IntegerType, false),
- StructField("row", IntegerType, false)
- ))
-
- override protected def to[R](t: SpatialKey, io: CatalystIO[R]): R = io.create(
- t.col,
- t.row
- )
-
- override protected def from[R](t: R, io: CatalystIO[R]): SpatialKey = SpatialKey(
- io.getInt(t, 0),
- io.getInt(t, 1)
- )
- }
-
- implicit val spacetimeKeySerializer: CatalystSerializer[SpaceTimeKey] = new CatalystSerializer[SpaceTimeKey] {
- override val schema: StructType = StructType(Seq(
- StructField("col", IntegerType, false),
- StructField("row", IntegerType, false),
- StructField("instant", LongType, false)
- ))
-
- override protected def to[R](t: SpaceTimeKey, io: CatalystIO[R]): R = io.create(
- t.col,
- t.row,
- t.instant
- )
-
- override protected def from[R](t: R, io: CatalystIO[R]): SpaceTimeKey = SpaceTimeKey(
- io.getInt(t, 0),
- io.getInt(t, 1),
- io.getLong(t, 2)
- )
- }
-
- implicit val cellSizeSerializer: CatalystSerializer[CellSize] = new CatalystSerializer[CellSize] {
- override val schema: StructType = StructType(Seq(
- StructField("width", DoubleType, false),
- StructField("height", DoubleType, false)
- ))
-
- override protected def to[R](t: CellSize, io: CatalystIO[R]): R = io.create(
- t.width,
- t.height
- )
-
- override protected def from[R](t: R, io: CatalystIO[R]): CellSize = CellSize(
- io.getDouble(t, 0),
- io.getDouble(t, 1)
- )
- }
-
- implicit val tileLayoutSerializer: CatalystSerializer[TileLayout] = new CatalystSerializer[TileLayout] {
- override val schema: StructType = StructType(Seq(
- StructField("layoutCols", IntegerType, false),
- StructField("layoutRows", IntegerType, false),
- StructField("tileCols", IntegerType, false),
- StructField("tileRows", IntegerType, false)
- ))
-
- override protected def to[R](t: TileLayout, io: CatalystIO[R]): R = io.create(
- t.layoutCols,
- t.layoutRows,
- t.tileCols,
- t.tileRows
- )
-
- override protected def from[R](t: R, io: CatalystIO[R]): TileLayout = TileLayout(
- io.getInt(t, 0),
- io.getInt(t, 1),
- io.getInt(t, 2),
- io.getInt(t, 3)
- )
- }
-
- implicit val layoutDefinitionSerializer = new CatalystSerializer[LayoutDefinition] {
- override val schema: StructType = StructType(Seq(
- StructField("extent", schemaOf[Extent], true),
- StructField("tileLayout", schemaOf[TileLayout], true)
- ))
-
- override protected def to[R](t: LayoutDefinition, io: CatalystIO[R]): R = io.create(
- io.to(t.extent),
- io.to(t.tileLayout)
- )
-
- override protected def from[R](t: R, io: CatalystIO[R]): LayoutDefinition = LayoutDefinition(
- io.get[Extent](t, 0),
- io.get[TileLayout](t, 1)
- )
- }
-
- implicit def boundsSerializer[T >: Null: CatalystSerializer]: CatalystSerializer[KeyBounds[T]] = new CatalystSerializer[KeyBounds[T]] {
- override val schema: StructType = StructType(Seq(
- StructField("minKey", schemaOf[T], true),
- StructField("maxKey", schemaOf[T], true)
- ))
-
- override protected def to[R](t: KeyBounds[T], io: CatalystIO[R]): R = io.create(
- io.to(t.get.minKey),
- io.to(t.get.maxKey)
- )
-
- override protected def from[R](t: R, io: CatalystIO[R]): KeyBounds[T] = KeyBounds(
- io.get[T](t, 0),
- io.get[T](t, 1)
- )
- }
-
- def tileLayerMetadataSerializer[T >: Null: CatalystSerializer]: CatalystSerializer[TileLayerMetadata[T]] = new CatalystSerializer[TileLayerMetadata[T]] {
- override val schema: StructType = StructType(Seq(
- StructField("cellType", schemaOf[CellType], false),
- StructField("layout", schemaOf[LayoutDefinition], false),
- StructField("extent", schemaOf[Extent], false),
- StructField("crs", schemaOf[CRS], false),
- StructField("bounds", schemaOf[KeyBounds[T]], false)
- ))
-
- override protected def to[R](t: TileLayerMetadata[T], io: CatalystIO[R]): R = io.create(
- io.to(t.cellType),
- io.to(t.layout),
- io.to(t.extent),
- io.to(t.crs),
- io.to(t.bounds.head)
- )
-
- override protected def from[R](t: R, io: CatalystIO[R]): TileLayerMetadata[T] = TileLayerMetadata(
- io.get[CellType](t, 0),
- io.get[LayoutDefinition](t, 1),
- io.get[Extent](t, 2),
- io.get[CRS](t, 3),
- io.get[KeyBounds[T]](t, 4)
- )
- }
-
- implicit def rasterSerializer: CatalystSerializer[Raster[Tile]] = new CatalystSerializer[Raster[Tile]] {
- import org.apache.spark.sql.rf.TileUDT.tileSerializer
-
- override val schema: StructType = StructType(Seq(
- StructField("tile", TileType, false),
- StructField("extent", schemaOf[Extent], false)
- ))
-
- override protected def to[R](t: Raster[Tile], io: CatalystIO[R]): R = io.create(
- io.to(t.tile),
- io.to(t.extent)
- )
-
- override protected def from[R](t: R, io: CatalystIO[R]): Raster[Tile] = Raster(
- io.get[Tile](t, 0),
- io.get[Extent](t, 1)
- )
- }
-
- implicit val spatialKeyTLMSerializer = tileLayerMetadataSerializer[SpatialKey]
- implicit val spaceTimeKeyTLMSerializer = tileLayerMetadataSerializer[SpaceTimeKey]
-
-}
-
-object StandardSerializers {
- private val s2ctCache = Scaffeine().build[String, CellType](
- (s: String) => CellType.fromName(s)
- )
- private val ct2sCache = Scaffeine().build[CellType, String](
- (ct: CellType) => ct.toString()
- )
-}
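The hand-written `CatalystSerializer` instances deleted above (e.g. `cellTypeSerializer`) are superseded in this PR by frameless `Injection`-based encoders in the new `TypedEncoders` trait. A hedged sketch of the replacement pattern, assuming frameless and GeoTrellis are on the classpath:

```scala
// Sketch of the Injection pattern that replaces cellTypeSerializer.
// An Injection[A, B] is an invertible pair of functions; frameless uses
// it to encode A via B's existing encoder (here, CellType via String).
import frameless.{Injection, TypedEncoder}
import geotrellis.raster.CellType

implicit val cellTypeInjection: Injection[CellType, String] =
  Injection(_.toString, CellType.fromName)

implicit val cellTypeTypedEncoder: TypedEncoder[CellType] =
  TypedEncoder.usingInjection[CellType, String]
```

This trades the explicit `to`/`from` row plumbing for a single pair of conversion functions, letting frameless derive the Catalyst schema.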
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/StringBackedEncoder.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/StringBackedEncoder.scala
deleted file mode 100644
index 2ec265ccc..000000000
--- a/core/src/main/scala/org/locationtech/rasterframes/encoders/StringBackedEncoder.scala
+++ /dev/null
@@ -1,70 +0,0 @@
-/*
- * This software is licensed under the Apache 2 license, quoted below.
- *
- * Copyright 2018 Astraea, Inc.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"); you may not
- * use this file except in compliance with the License. You may obtain a copy of
- * the License at
- *
- * [http://www.apache.org/licenses/LICENSE-2.0]
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
- * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
- * License for the specific language governing permissions and limitations under
- * the License.
- *
- * SPDX-License-Identifier: Apache-2.0
- *
- */
-
-package org.locationtech.rasterframes.encoders
-
-import org.apache.spark.sql.catalyst.ScalaReflection
-import org.apache.spark.sql.catalyst.analysis.GetColumnByOrdinal
-import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
-import org.apache.spark.sql.catalyst.expressions.objects.StaticInvoke
-import org.apache.spark.sql.catalyst.expressions.{BoundReference, Expression}
-import org.apache.spark.sql.types._
-import org.apache.spark.unsafe.types.UTF8String
-import org.apache.spark.sql.rf.VersionShims.InvokeSafely
-
-import scala.reflect.runtime.universe._
-
-/**
- * Generalized operations for creating an encoder when the type can be represented as a Catalyst string.
- *
- * @since 1/16/18
- */
-object StringBackedEncoder {
- def apply[T: TypeTag](
- fieldName: String,
- toStringMethod: String,
- fromStringStatic: (Class[_], String)): ExpressionEncoder[T] = {
-
- val sparkType = ScalaReflection.dataTypeFor[T]
- val schema = StructType(Seq(StructField(fieldName, StringType, false)))
- val inputObject = BoundReference(0, sparkType, nullable = false)
-
- val intermediateType = ObjectType(classOf[String])
- val serializer: Expression =
- StaticInvoke(
- classOf[UTF8String],
- StringType,
- "fromString",
- InvokeSafely(inputObject, toStringMethod, intermediateType) :: Nil
- )
-
- val inputRow = GetColumnByOrdinal(0, schema)
- val deserializer: Expression =
- StaticInvoke(
- fromStringStatic._1,
- sparkType,
- fromStringStatic._2,
- InvokeSafely(inputRow, "toString", intermediateType) :: Nil
- )
-
- ExpressionEncoder[T](schema, flat = false, Seq(serializer), deserializer, typeToClassTag[T])
- }
-}
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/TemporalProjectedExtentEncoder.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/TemporalProjectedExtentEncoder.scala
deleted file mode 100644
index f69f7f160..000000000
--- a/core/src/main/scala/org/locationtech/rasterframes/encoders/TemporalProjectedExtentEncoder.scala
+++ /dev/null
@@ -1,43 +0,0 @@
-/*
- * This software is licensed under the Apache 2 license, quoted below.
- *
- * Copyright 2017 Astraea, Inc.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"); you may not
- * use this file except in compliance with the License. You may obtain a copy of
- * the License at
- *
- * [http://www.apache.org/licenses/LICENSE-2.0]
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
- * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
- * License for the specific language governing permissions and limitations under
- * the License.
- *
- * SPDX-License-Identifier: Apache-2.0
- *
- */
-
-package org.locationtech.rasterframes.encoders
-
-import org.locationtech.rasterframes._
-import geotrellis.spark.TemporalProjectedExtent
-import org.apache.spark.sql.Encoders
-import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
-
-/**
- * Custom encoder for `TemporalProjectedExtent`. Necessary because `geotrellis.proj4.CRS` within
- * `ProjectedExtent` isn't a case class, and `ZonedDateTime` doesn't have a natural encoder.
- *
- * @since 8/2/17
- */
-object TemporalProjectedExtentEncoder {
- def apply(): ExpressionEncoder[TemporalProjectedExtent] = {
- DelegatingSubfieldEncoder(
- "extent" -> extentEncoder,
- "crs" -> crsEncoder,
- "instant" -> Encoders.scalaLong.asInstanceOf[ExpressionEncoder[Long]]
- )
- }
-}
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/TileLayerMetadataEncoder.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/TileLayerMetadataEncoder.scala
deleted file mode 100644
index 2f59ea451..000000000
--- a/core/src/main/scala/org/locationtech/rasterframes/encoders/TileLayerMetadataEncoder.scala
+++ /dev/null
@@ -1,50 +0,0 @@
-/*
- * This software is licensed under the Apache 2 license, quoted below.
- *
- * Copyright 2017 Astraea, Inc.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"); you may not
- * use this file except in compliance with the License. You may obtain a copy of
- * the License at
- *
- * [http://www.apache.org/licenses/LICENSE-2.0]
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
- * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
- * License for the specific language governing permissions and limitations under
- * the License.
- *
- * SPDX-License-Identifier: Apache-2.0
- *
- */
-
-package org.locationtech.rasterframes.encoders
-
-import geotrellis.spark.{KeyBounds, TileLayerMetadata}
-import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
-
-import scala.reflect.runtime.universe._
-
-/**
- * Specialized encoder for [[TileLayerMetadata]], necessary to be able to delegate to the
- * specialized cell type and crs encoders.
- *
- * @since 7/21/17
- */
-object TileLayerMetadataEncoder {
- import org.locationtech.rasterframes._
-
- private def fieldEncoders = Seq[(String, ExpressionEncoder[_])](
- "cellType" -> cellTypeEncoder,
- "layout" -> layoutDefinitionEncoder,
- "extent" -> extentEncoder,
- "crs" -> crsEncoder
- )
-
- def apply[K: TypeTag](): ExpressionEncoder[TileLayerMetadata[K]] = {
- val boundsEncoder = ExpressionEncoder[KeyBounds[K]]()
- val fEncoders = fieldEncoders :+ ("bounds" -> boundsEncoder)
- DelegatingSubfieldEncoder(fEncoders: _*)
- }
-}
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/TypedEncoders.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/TypedEncoders.scala
new file mode 100644
index 000000000..524ca4c17
--- /dev/null
+++ b/core/src/main/scala/org/locationtech/rasterframes/encoders/TypedEncoders.scala
@@ -0,0 +1,112 @@
+package org.locationtech.rasterframes.encoders
+
+import frameless._
+import geotrellis.layer.{KeyBounds, LayoutDefinition, TileLayerMetadata}
+import geotrellis.proj4.CRS
+import geotrellis.raster.mapalgebra.focal.{Kernel, Neighborhood, TargetCell}
+import geotrellis.raster.{CellGrid, CellType, Dimensions, GridBounds, Raster, Tile}
+import geotrellis.vector.Extent
+import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
+import org.apache.spark.sql.catalyst.util.QuantileSummaries
+import org.apache.spark.sql.rf.{CrsUDT, RasterSourceUDT, TileUDT}
+import org.locationtech.jts.geom.Envelope
+import org.locationtech.rasterframes.ref.RFRasterSource
+import org.locationtech.rasterframes.tiles.ProjectedRasterTile
+import org.locationtech.rasterframes.util.{FocalNeighborhood, FocalTargetCell, KryoSupport}
+
+import java.net.URI
+import java.nio.ByteBuffer
+import scala.reflect.ClassTag
+
+trait TypedEncoders {
+ def typedExpressionEncoder[T: TypedEncoder]: ExpressionEncoder[T] = TypedExpressionEncoder[T].asInstanceOf[ExpressionEncoder[T]]
+
+ implicit val crsUDT = new CrsUDT
+ implicit val tileUDT = new TileUDT
+ implicit val rasterSourceUDT = new RasterSourceUDT
+
+ implicit val crsTypedEncoder: TypedEncoder[CRS] = TypedEncoder.usingUserDefinedType[CRS]
+
+ implicit val cellTypeInjection: Injection[CellType, String] = Injection(_.toString, CellType.fromName)
+ implicit val cellTypeTypedEncoder: TypedEncoder[CellType] = TypedEncoder.usingInjection[CellType, String]
+
+ implicit val quantileSummariesInjection: Injection[QuantileSummaries, Array[Byte]] =
+ Injection(KryoSupport.serialize(_).array(), array => KryoSupport.deserialize[QuantileSummaries](ByteBuffer.wrap(array)))
+
+ implicit val quantileSummariesTypedEncoder: TypedEncoder[QuantileSummaries] = TypedEncoder.usingInjection
+
+ implicit val uriInjection: Injection[URI, String] = Injection(_.toString, new URI(_))
+ implicit val uriTypedEncoder: TypedEncoder[URI] = TypedEncoder.usingInjection
+
+ implicit val neighborhoodInjection: Injection[Neighborhood, String] = Injection(FocalNeighborhood(_), FocalNeighborhood.fromString(_).get)
+ implicit val neighborhoodTypedEncoder: TypedEncoder[Neighborhood] = TypedEncoder.usingInjection
+
+ implicit val targetCellInjection: Injection[TargetCell, String] = Injection(FocalTargetCell(_), FocalTargetCell.fromString)
+ implicit val targetCellTypedEncoder: TypedEncoder[TargetCell] = TypedEncoder.usingInjection
+
+ implicit val envelopeTypedEncoder: TypedEncoder[Envelope] =
+ ManualTypedEncoder.newInstance[Envelope](
+ fields = List(
+ RecordEncoderField(0, "minX", TypedEncoder[Double]),
+ RecordEncoderField(1, "maxX", TypedEncoder[Double]),
+ RecordEncoderField(2, "minY", TypedEncoder[Double]),
+ RecordEncoderField(3, "maxY", TypedEncoder[Double])
+ ),
+ fieldNameModify = { fieldName => s"get${fieldName.capitalize}" }
+ )
+
+ implicit def dimensionsTypedEncoder[N: Integral: TypedEncoder]: TypedEncoder[Dimensions[N]] =
+ ManualTypedEncoder.staticInvoke[Dimensions[N]](
+ fields = List(
+ RecordEncoderField(0, "cols", TypedEncoder[N]),
+ RecordEncoderField(1, "rows", TypedEncoder[N])
+ )
+ )
+
+ /**
+ * @note
+ * Frameless cannot derive an encoder for GridBounds because it lacks a constructor from (int, int, int, int).
+ * Defining an Injection is not suitable because an Injection is used when deriving encoder fields but is not itself an encoder.
+ * Additionally, an Injection to Tuple4[Int, Int, Int, Int] would not have the correct field names.
+ */
+ implicit def gridBoundsTypedEncoder[N: Integral: TypedEncoder]: TypedEncoder[GridBounds[N]] =
+ ManualTypedEncoder.staticInvoke[GridBounds[N]](
+ fields = List(
+ RecordEncoderField(0, "colMin", TypedEncoder[N]),
+ RecordEncoderField(1, "rowMin", TypedEncoder[N]),
+ RecordEncoderField(2, "colMax", TypedEncoder[N]),
+ RecordEncoderField(3, "rowMax", TypedEncoder[N])
+ )
+ )
+
+ implicit def tileLayerMetadataTypedEncoder[K: TypedEncoder: ClassTag]: TypedEncoder[TileLayerMetadata[K]] =
+ ManualTypedEncoder.staticInvoke[TileLayerMetadata[K]](
+ fields = List(
+ RecordEncoderField(0, "cellType", TypedEncoder[CellType]),
+ RecordEncoderField(1, "layout", TypedEncoder[LayoutDefinition]),
+ RecordEncoderField(2, "extent", TypedEncoder[Extent]),
+ RecordEncoderField(3, "crs", TypedEncoder[CRS]),
+ RecordEncoderField(4, "bounds", TypedEncoder[KeyBounds[K]])
+ )
+ )
+
+ implicit val tileTypedEncoder: TypedEncoder[Tile] = TypedEncoder.usingUserDefinedType[Tile]
+ implicit def rasterTileTypedEncoder[T <: CellGrid[Int]: TypedEncoder]: TypedEncoder[Raster[T]] = TypedEncoder.usingDerivation
+
+ // Derivation is done through frameless to trigger RasterSourceUDT load
+ implicit val rfRasterSourceTypedEncoder: TypedEncoder[RFRasterSource] = TypedEncoder.usingUserDefinedType[RFRasterSource]
+
+ implicit val kernelTypedEncoder: TypedEncoder[Kernel] = TypedEncoder.usingDerivation
+
+ // Derivation is done through frameless to trigger the TileUDT and CrsUDT load
+ implicit val projectedRasterTileTypedEncoder: TypedEncoder[ProjectedRasterTile] =
+ ManualTypedEncoder.newInstance[ProjectedRasterTile](
+ fields = List(
+ RecordEncoderField(0, "tile", TypedEncoder[Tile]),
+ RecordEncoderField(1, "extent", TypedEncoder[Extent]),
+ RecordEncoderField(2, "crs", TypedEncoder[CRS])
+ )
+ )
+}
+
+object TypedEncoders extends TypedEncoders
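A hedged usage sketch of the encoders introduced above (the setup is assumed, not taken from this PR; it requires a live SparkSession with the RasterFrames core module on the classpath):

```scala
import org.apache.spark.sql.SparkSession
import geotrellis.vector.Extent
import org.locationtech.rasterframes.encoders._

val spark = SparkSession.builder().master("local[*]").getOrCreate()

// Extent is a case class, so productTypedToExpressionEncoder applies and
// the Dataset is built from the frameless-derived encoder rather than
// Spark's default ScalaReflection derivation.
val ds = spark.createDataset(Seq(Extent(0.0, 0.0, 1.0, 1.0)))
ds.printSchema() // inspect the derived xmin/ymin/xmax/ymax schema
```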
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/URIEncoder.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/URIEncoder.scala
deleted file mode 100644
index bbbcf25ea..000000000
--- a/core/src/main/scala/org/locationtech/rasterframes/encoders/URIEncoder.scala
+++ /dev/null
@@ -1,38 +0,0 @@
-/*
- * This software is licensed under the Apache 2 license, quoted below.
- *
- * Copyright 2018 Astraea, Inc.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"); you may not
- * use this file except in compliance with the License. You may obtain a copy of
- * the License at
- *
- * [http://www.apache.org/licenses/LICENSE-2.0]
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
- * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
- * License for the specific language governing permissions and limitations under
- * the License.
- *
- * SPDX-License-Identifier: Apache-2.0
- *
- */
-
-package org.locationtech.rasterframes.encoders
-
-import java.net.URI
-
-import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
-
-/**
- * Custom Encoder for allowing friction-free use of URIs in DataFrames.
- *
- * @since 1/16/18
- */
-object URIEncoder {
- def apply(): ExpressionEncoder[URI] =
- StringBackedEncoder[URI]("uri", "toASCIIString", (URIEncoder.getClass, "fromString"))
- // Not sure why this delegate is necessary, but doGenCode fails without it.
- def fromString(str: String): URI = URI.create(str)
-}
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/package.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/package.scala
index 8cb5a6f85..6851a56f6 100644
--- a/core/src/main/scala/org/locationtech/rasterframes/encoders/package.scala
+++ b/core/src/main/scala/org/locationtech/rasterframes/encoders/package.scala
@@ -21,12 +21,17 @@
package org.locationtech.rasterframes
-import org.apache.spark.sql.rf._
+import org.locationtech.rasterframes.encoders.syntax._
+
import org.apache.spark.sql.Column
+import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.catalyst.expressions.Literal
import scala.reflect.ClassTag
-import scala.reflect.runtime.universe.{Literal => _, _}
+import scala.reflect.runtime.universe._
+import frameless.TypedEncoder
+import org.apache.spark.sql.types.{DataType, StructType}
+import org.apache.spark.sql.rf.WithTypeConformity
/**
* Module utilities
@@ -34,6 +39,17 @@ import scala.reflect.runtime.universe.{Literal => _, _}
* @since 9/25/17
*/
package object encoders {
+ /** High-priority product encoder derivation. Without it, Spark's default encoder derivation would be used. */
+ implicit def productTypedToExpressionEncoder[T <: Product: TypedEncoder]: ExpressionEncoder[T] = TypedEncoders.typedExpressionEncoder
+
+ implicit class WithTypeConformityToEncoder(val left: DataType) extends AnyVal {
+ def conformsToSchema(schema: StructType): Boolean =
+ WithTypeConformity(left).conformsTo(schema)
+
+ def conformsToDataType(dataType: DataType): Boolean =
+ WithTypeConformity(left).conformsTo(dataType)
+ }
+
private[rasterframes] def runtimeClass[T: TypeTag]: Class[T] =
typeTag[T].mirror.runtimeClass(typeTag[T].tpe).asInstanceOf[Class[T]]
@@ -41,18 +57,26 @@ package object encoders {
ClassTag[T](typeTag[T].mirror.runtimeClass(typeTag[T].tpe))
}
- /** Constructs a catalyst literal expression from anything with a serializer. */
- def SerializedLiteral[T >: Null: CatalystSerializer](t: T): Literal = {
- val ser = CatalystSerializer[T]
- val schema = ser.schema match {
- case s if s.conformsTo(TileType.sqlType) => TileType
- case s if s.conformsTo(RasterSourceType.sqlType) => RasterSourceType
+ /** Constructs a Catalyst literal expression from anything with a serializer.
+ * Using this serializer avoids the lit() function, which defers to ScalaReflection to derive an encoder.
+ * Therefore, this should be used when the literal value cannot be handled by Spark's ScalaReflection.
+ */
+ def SerializedLiteral[T >: Null](t: T)(implicit tag: TypeTag[T], enc: ExpressionEncoder[T]): Literal = {
+ val schema = enc.schema match {
+ case s if s.conformsTo(tileUDT.sqlType) => tileUDT
+ case s if s.conformsTo(rasterSourceUDT.sqlType) => rasterSourceUDT
case s => s
}
- Literal.create(ser.toInternalRow(t), schema)
+ // We must convert to a Literal right here; otherwise ScalaReflection takes over.
+ val ir = t.toInternalRow
+ Literal.create(ir, schema)
}
- /** Constructs a Dataframe literal column from anything with a serializer. */
- def serialized_literal[T >: Null: CatalystSerializer](t: T): Column =
+ /**
+ * Constructs a DataFrame literal column from anything with a serializer.
+ * TODO: review its usage.
+ */
+ def serialized_literal[T >: Null: ExpressionEncoder: TypeTag](t: T): Column =
new Column(SerializedLiteral(t))
+
}
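A hedged sketch of how the reworked `serialized_literal` might be called (assumes the implicit `Tile` encoder from `StandardEncoders` is in scope; the `ArrayTile` value is illustrative):

```scala
import geotrellis.raster.{ArrayTile, Tile}
import org.locationtech.rasterframes.encoders.StandardEncoders._
import org.locationtech.rasterframes.encoders.serialized_literal

// Build a Column literal for a Tile, a type lit() cannot handle because
// ScalaReflection has no derivation for it; the encoder's schema (the
// TileUDT) is used for the Catalyst Literal instead.
val tile: Tile = ArrayTile(Array(1, 2, 3, 4), 2, 2)
val tileLit = serialized_literal(tile)
```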
diff --git a/core/src/main/scala/org/locationtech/rasterframes/encoders/syntax/package.scala b/core/src/main/scala/org/locationtech/rasterframes/encoders/syntax/package.scala
new file mode 100644
index 000000000..eb4ea931c
--- /dev/null
+++ b/core/src/main/scala/org/locationtech/rasterframes/encoders/syntax/package.scala
@@ -0,0 +1,35 @@
+package org.locationtech.rasterframes.encoders
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
+
+import scala.reflect.runtime.universe.TypeTag
+
+package object syntax {
+ implicit class CachedExpressionOps[T](val self: T) extends AnyVal {
+ def toInternalRow(implicit tag: TypeTag[T], encoder: ExpressionEncoder[T]): InternalRow = {
+ val toRow = SerializersCache.serializer[T]
+ toRow(self)
+ }
+
+ def toRow(implicit tag: TypeTag[T], encoder: ExpressionEncoder[T]): Row = {
+ val toRow = SerializersCache.rowSerialize[T]
+ toRow(self)
+ }
+ }
+
+ implicit class CachedExpressionRowOps(val self: Row) extends AnyVal {
+ def as[T](implicit tag: TypeTag[T], encoder: ExpressionEncoder[T]): T = {
+ val fromRow = SerializersCache.rowDeserialize[T]
+ fromRow(self)
+ }
+ }
+
+ implicit class CachedInternalRowOps(val self: InternalRow) extends AnyVal {
+ def as[T](implicit tag: TypeTag[T], encoder: ExpressionEncoder[T]): T = {
+ val fromRow = SerializersCache.deserializer[T]
+ fromRow(self)
+ }
+ }
+}
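A hedged sketch of the syntax ops defined above (assumes an implicit `ExpressionEncoder` for the type, e.g. via `StandardEncoders`):

```scala
import geotrellis.vector.Extent
import org.locationtech.rasterframes.encoders.StandardEncoders._
import org.locationtech.rasterframes.encoders.syntax._

// Round-trip a value through Catalyst's InternalRow using the
// SerializersCache-backed serializer/deserializer pair.
val extent = Extent(0.0, 0.0, 10.0, 10.0)
val ir = extent.toInternalRow   // serialize via the cached serializer
val back = ir.as[Extent]        // deserialize back to an Extent
```

Caching the compiled serializers avoids re-resolving the encoder on every call, which is the motivation for routing these ops through `SerializersCache`.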
diff --git a/core/src/main/scala/org/locationtech/rasterframes/expressions/BinaryLocalRasterOp.scala b/core/src/main/scala/org/locationtech/rasterframes/expressions/BinaryRasterFunction.scala
similarity index 89%
rename from core/src/main/scala/org/locationtech/rasterframes/expressions/BinaryLocalRasterOp.scala
rename to core/src/main/scala/org/locationtech/rasterframes/expressions/BinaryRasterFunction.scala
index 9994fdef1..425e6c4e7 100644
--- a/core/src/main/scala/org/locationtech/rasterframes/expressions/BinaryLocalRasterOp.scala
+++ b/core/src/main/scala/org/locationtech/rasterframes/expressions/BinaryRasterFunction.scala
@@ -26,18 +26,15 @@ import geotrellis.raster.Tile
import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess}
import org.apache.spark.sql.catalyst.expressions.BinaryExpression
-import org.apache.spark.sql.rf.TileUDT
import org.apache.spark.sql.types.DataType
-import org.locationtech.rasterframes.encoders.CatalystSerializer._
import org.locationtech.rasterframes.expressions.DynamicExtractors._
import org.slf4j.LoggerFactory
/** Operation combining two tiles or a tile and a scalar into a new tile. */
-trait BinaryLocalRasterOp extends BinaryExpression {
+trait BinaryRasterFunction extends BinaryExpression with RasterResult {
@transient protected lazy val logger = Logger(LoggerFactory.getLogger(getClass.getName))
-
override def dataType: DataType = left.dataType
override def checkInputDataTypes(): TypeCheckResult = {
@@ -51,7 +48,6 @@ trait BinaryLocalRasterOp extends BinaryExpression {
}
override protected def nullSafeEval(input1: Any, input2: Any): Any = {
- implicit val tileSer = TileUDT.tileSerializer
val (leftTile, leftCtx) = tileExtractor(left.dataType)(row(input1))
val result = tileOrNumberExtractor(right.dataType)(input2) match {
case TileArg(rightTile, rightCtx) =>
@@ -63,15 +59,12 @@ trait BinaryLocalRasterOp extends BinaryExpression {
if(leftCtx.isDefined && rightCtx.isDefined && leftCtx != rightCtx)
logger.warn(s"Both '${left}' and '${right}' provided an extent and CRS, but they are different. Left-hand side will be used.")
+ // TODO: extract BufferTile here to preserve the buffer
op(leftTile, rightTile)
case DoubleArg(d) => op(fpTile(leftTile), d)
case IntegerArg(i) => op(leftTile, i)
}
-
- leftCtx match {
- case Some(ctx) => ctx.toProjectRasterTile(result).toInternalRow
- case None => result.toInternalRow
- }
+ toInternalRow(result, leftCtx)
}
@@ -79,4 +72,3 @@ trait BinaryLocalRasterOp extends BinaryExpression {
protected def op(left: Tile, right: Double): Tile
protected def op(left: Tile, right: Int): Tile
}
-
diff --git a/core/src/main/scala/org/locationtech/rasterframes/expressions/BinaryRasterOp.scala b/core/src/main/scala/org/locationtech/rasterframes/expressions/BinaryRasterOp.scala
index 2c33eae12..26e5138aa 100644
--- a/core/src/main/scala/org/locationtech/rasterframes/expressions/BinaryRasterOp.scala
+++ b/core/src/main/scala/org/locationtech/rasterframes/expressions/BinaryRasterOp.scala
@@ -26,17 +26,15 @@ import geotrellis.raster.Tile
import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess}
import org.apache.spark.sql.catalyst.expressions.BinaryExpression
-import org.apache.spark.sql.rf.TileUDT
import org.apache.spark.sql.types.DataType
-import org.locationtech.rasterframes.encoders.CatalystSerializer._
import org.locationtech.rasterframes.expressions.DynamicExtractors.tileExtractor
import org.slf4j.LoggerFactory
/** Operation combining two tiles into a new tile. */
-trait BinaryRasterOp extends BinaryExpression {
+trait BinaryRasterOp extends BinaryExpression with RasterResult {
@transient protected lazy val logger = Logger(LoggerFactory.getLogger(getClass.getName))
- override def dataType: DataType = left.dataType
+ def dataType: DataType = left.dataType
override def checkInputDataTypes(): TypeCheckResult = {
if (!tileExtractor.isDefinedAt(left.dataType)) {
@@ -51,7 +49,6 @@ trait BinaryRasterOp extends BinaryExpression {
protected def op(left: Tile, right: Tile): Tile
override protected def nullSafeEval(input1: Any, input2: Any): Any = {
- implicit val tileSer = TileUDT.tileSerializer
val (leftTile, leftCtx) = tileExtractor(left.dataType)(row(input1))
val (rightTile, rightCtx) = tileExtractor(right.dataType)(row(input2))
@@ -65,9 +62,6 @@ trait BinaryRasterOp extends BinaryExpression {
val result = op(leftTile, rightTile)
- leftCtx match {
- case Some(ctx) => ctx.toProjectRasterTile(result).toInternalRow
- case None => result.toInternalRow
- }
+ toInternalRow(result, leftCtx)
}
}
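In `BinaryRasterOp` above, `checkInputDataTypes` guards evaluation by asking whether `tileExtractor` is defined at each child's `DataType`. A minimal stdlib-only sketch of that type-check pattern — the `TypeCheckResult` stand-ins and the symmetric right-hand check are assumptions here, since the hunk shows only the left-hand branch:

```scala
// Simplified stand-ins mirroring the shape of Catalyst's TypeCheckResult.
sealed trait TypeCheckResult
case object TypeCheckSuccess extends TypeCheckResult
final case class TypeCheckFailure(message: String) extends TypeCheckResult

// An extractor is a partial function keyed on the child's (here stringly-typed) data type.
val tileExtractor: PartialFunction[String, Array[Int] => Array[Int]] = {
  case "tile" => identity
}

// The expression type-checks only if both children are extractable as tiles.
def checkInputDataTypes(leftType: String, rightType: String): TypeCheckResult =
  if (!tileExtractor.isDefinedAt(leftType))
    TypeCheckFailure(s"Expected tile-like input on left, got $leftType")
  else if (!tileExtractor.isDefinedAt(rightType))
    TypeCheckFailure(s"Expected tile-like input on right, got $rightType")
  else TypeCheckSuccess

println(checkInputDataTypes("tile", "tile"))
println(checkInputDataTypes("tile", "string"))
```

Checking `isDefinedAt` up front is what lets the same partial functions serve both as type checks at analysis time and as deserializers at evaluation time.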
diff --git a/core/src/main/scala/org/locationtech/rasterframes/expressions/DynamicExtractors.scala b/core/src/main/scala/org/locationtech/rasterframes/expressions/DynamicExtractors.scala
index 834c3aac1..efc71a01c 100644
--- a/core/src/main/scala/org/locationtech/rasterframes/expressions/DynamicExtractors.scala
+++ b/core/src/main/scala/org/locationtech/rasterframes/expressions/DynamicExtractors.scala
@@ -22,76 +22,191 @@
package org.locationtech.rasterframes.expressions
import geotrellis.proj4.CRS
-import geotrellis.raster.{CellGrid, Tile}
+import geotrellis.raster.{CellGrid, Neighborhood, Raster, TargetCell, Tile}
+import geotrellis.vector.Extent
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.util.ArrayData
+import org.apache.spark.sql.jts.JTSTypes
import org.apache.spark.sql.rf.{RasterSourceUDT, TileUDT}
import org.apache.spark.sql.types._
import org.apache.spark.unsafe.types.UTF8String
-import org.locationtech.rasterframes.encoders.CatalystSerializer._
-import org.locationtech.rasterframes.model.{LazyCRS, TileContext}
-import org.locationtech.rasterframes.ref.{ProjectedRasterLike, RasterRef, RasterSource}
+import org.locationtech.jts.geom.{Envelope, Point}
+import org.locationtech.rasterframes._
+import org.locationtech.rasterframes.encoders._
+import org.locationtech.rasterframes.encoders.syntax._
+import org.locationtech.rasterframes.model.{LazyCRS, LongExtent, TileContext}
+import org.locationtech.rasterframes.ref.{ProjectedRasterLike, RasterRef}
import org.locationtech.rasterframes.tiles.ProjectedRasterTile
+import org.apache.spark.sql.rf.CrsUDT
+import org.locationtech.rasterframes.util.{FocalNeighborhood, FocalTargetCell}
private[rasterframes]
object DynamicExtractors {
/** Partial function for pulling a tile and its context from an input row. */
lazy val tileExtractor: PartialFunction[DataType, InternalRow => (Tile, Option[TileContext])] = {
case _: TileUDT =>
- (row: InternalRow) =>
- (row.to[Tile](TileUDT.tileSerializer), None)
- case t if t.conformsTo[ProjectedRasterTile] =>
+ (row: InternalRow) => (tileUDT.deserialize(row), None)
+ case t if t.conformsToSchema(ProjectedRasterTile.projectedRasterTileEncoder.schema) =>
(row: InternalRow) => {
- val prt = row.to[ProjectedRasterTile]
+ val prt = row.as[ProjectedRasterTile]
(prt, Some(TileContext(prt)))
}
}
lazy val rasterRefExtractor: PartialFunction[DataType, InternalRow => RasterRef] = {
- case t if t.conformsTo[RasterRef] =>
- (row: InternalRow) => row.to[RasterRef]
+ case t if t.conformsToSchema(RasterRef.rasterRefEncoder.schema) =>
+ (row: InternalRow) => row.as[RasterRef]
}
lazy val tileableExtractor: PartialFunction[DataType, InternalRow => Tile] =
tileExtractor.andThen(_.andThen(_._1)).orElse(rasterRefExtractor.andThen(_.andThen(_.tile)))
+ lazy val internalRowTileExtractor: PartialFunction[DataType, InternalRow => (Tile, Option[TileContext])] = {
+ case _: TileUDT => (row: Any) => (new TileUDT().deserialize(row), None)
+ case t if t.conformsToSchema(rasterEncoder[Tile].schema) =>
+      (row: InternalRow) => (row.as[Raster[Tile]].tile, None)
+ case t if t.conformsToSchema(ProjectedRasterTile.projectedRasterTileEncoder.schema) =>
+ (row: InternalRow) => {
+ val prt = row.as[ProjectedRasterTile]
+ (prt, Some(TileContext(prt)))
+ }
+ }
+
lazy val rowTileExtractor: PartialFunction[DataType, Row => (Tile, Option[TileContext])] = {
- case _: TileUDT =>
- (row: Row) => (row.to[Tile](TileUDT.tileSerializer), None)
- case t if t.conformsTo[ProjectedRasterTile] =>
+ case _: TileUDT => (row: Row) => (row.as[Tile], None)
+ case t if t.conformsToSchema(rasterEncoder[Tile].schema) => (row: Row) => (row.as[Raster[Tile]].tile, None)
+ case t if t.conformsToSchema(ProjectedRasterTile.projectedRasterTileEncoder.schema) =>
(row: Row) => {
- val prt = row.to[ProjectedRasterTile]
+ val prt = row.as[ProjectedRasterTile]
(prt, Some(TileContext(prt)))
}
}
  /** Partial function for pulling a ProjectedRasterLike from an input row. */
- lazy val projectedRasterLikeExtractor: PartialFunction[DataType, InternalRow ⇒ ProjectedRasterLike] = {
- case _: RasterSourceUDT ⇒
- (row: InternalRow) => row.to[RasterSource](RasterSourceUDT.rasterSourceSerializer)
- case t if t.conformsTo[ProjectedRasterTile] =>
- (row: InternalRow) => row.to[ProjectedRasterTile]
- case t if t.conformsTo[RasterRef] =>
- (row: InternalRow) => row.to[RasterRef]
+ lazy val projectedRasterLikeExtractor: PartialFunction[DataType, Any => ProjectedRasterLike] = {
+ case _: RasterSourceUDT =>
+ (input: Any) =>
+ val row = input.asInstanceOf[InternalRow]
+ rasterSourceUDT.deserialize(row)
+ case t if t.conformsToSchema(ProjectedRasterTile.projectedRasterTileEncoder.schema) =>
+ (input: Any) => input.asInstanceOf[InternalRow].as[ProjectedRasterTile]
+ case t if t.conformsToSchema(RasterRef.rasterRefEncoder.schema) =>
+ (row: Any) => row.asInstanceOf[InternalRow].as[RasterRef]
}
/** Partial function for pulling a CellGrid from an input row. */
- lazy val gridExtractor: PartialFunction[DataType, InternalRow ⇒ CellGrid] = {
+ lazy val gridExtractor: PartialFunction[DataType, InternalRow => CellGrid[Int]] = {
case _: TileUDT =>
- (row: InternalRow) => row.to[Tile](TileUDT.tileSerializer)
- case _: RasterSourceUDT =>
- (row: InternalRow) => row.to[RasterSource](RasterSourceUDT.rasterSourceSerializer)
- case t if t.conformsTo[RasterRef] ⇒
- (row: InternalRow) => row.to[RasterRef]
- case t if t.conformsTo[ProjectedRasterTile] =>
- (row: InternalRow) => row.to[ProjectedRasterTile]
+      // TODO EAC: is there a way to extract grid from TileUDT without reading the cells with an expression?
+ (row: InternalRow) => tileUDT.deserialize(row)
+ case _: RasterSourceUDT => (row: InternalRow) => rasterSourceUDT.deserialize(row)
+ case t if t.conformsToSchema(RasterRef.rasterRefEncoder.schema) =>
+ (row: InternalRow) => row.as[RasterRef]
+ case t if t.conformsToSchema(ProjectedRasterTile.projectedRasterTileEncoder.schema) =>
+ (row: InternalRow) => row.as[ProjectedRasterTile]
+ }
+
+ lazy val intArrayExtractor: PartialFunction[DataType, ArrayData => Array[Int]] = {
+ case ArrayType(t, true) =>
+ throw new IllegalArgumentException(s"Can't turn array of $t to arraySparkSession - in-memory
\n", - " \n", - "SparkContext
\n", - "\n", - " \n", - "\n", - "v2.3.4local[*]pyspark-shell| \n", - " | eod_collection_display_name | \n", - "eod_collection_family | \n", - "eod_collection_family_display_name | \n", - "eod_grid_id | \n", - "created | \n", - "datetime | \n", - "eo_cloud_cover | \n", - "eo_constellation | \n", - "eo_epsg | \n", - "eo_gsd | \n", - "... | \n", - "B2 | \n", - "BQA | \n", - "B4 | \n", - "B1 | \n", - "B8 | \n", - "B11 | \n", - "collection | \n", - "geometry | \n", - "id | \n", - "target | \n", - "
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | \n", - "Landsat 8 | \n", - "landsat8 | \n", - "Landsat 8 | \n", - "WRS2-030027 | \n", - "2019-08-19T20:54:33.413548Z | \n", - "2018-07-17T17:15:57.1536740Z | \n", - "1.49 | \n", - "landsat-8 | \n", - "32614 | \n", - "30.0 | \n", - "... | \n", - "https://landsat-pds.s3.us-west-2.amazonaws.com... | \n", - "https://landsat-pds.s3.us-west-2.amazonaws.com... | \n", - "https://landsat-pds.s3.us-west-2.amazonaws.com... | \n", - "https://landsat-pds.s3.us-west-2.amazonaws.com... | \n", - "https://landsat-pds.s3.us-west-2.amazonaws.com... | \n", - "https://landsat-pds.s3.us-west-2.amazonaws.com... | \n", - "landsat8_l1tp | \n", - "(POLYGON ((-98.62404379679178 46.4012557977134... | \n", - "LC08_L1TP_030027_20180717_20180730_01_T1_L1TP | \n", - "file:///tmp/scene_30_27_target_utm.tif | \n", - "
1 rows × 33 columns
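The `DynamicExtractors` changes above keep the same combinator style throughout: each extractor is a `PartialFunction` keyed on `DataType`, and `tileableExtractor` is built purely by composition — `tileExtractor.andThen(_.andThen(_._1))` to drop the context, then `orElse` to fall back to dereferencing a `RasterRef`. A toy, stdlib-only sketch of that composition, with `DT` and `String` payloads standing in for `DataType` and tiles:

```scala
// Toy stand-ins: only the combinator shapes match the real code.
sealed trait DT
case object TileDT extends DT
case object RefDT  extends DT
case object IntDT  extends DT

// Yields a value plus optional context, like tileExtractor.
val tileExtractor: PartialFunction[DT, String => (String, Option[String])] = {
  case TileDT => s => (s"tile($s)", None)
}
// Fallback that dereferences a reference, like rasterRefExtractor's .tile.
val refExtractor: PartialFunction[DT, String => String] = {
  case RefDT => s => s"tile(deref($s))"
}

// Same composition as tileableExtractor: strip the context with andThen,
// then chain the fallback with orElse.
val tileableExtractor: PartialFunction[DT, String => String] =
  tileExtractor.andThen(_.andThen(_._1)).orElse(refExtractor)

println(tileableExtractor(TileDT)("t0"))      // tile(t0)
println(tileableExtractor(RefDT)("r0"))       // tile(deref(r0))
println(tileableExtractor.isDefinedAt(IntDT)) // false
```

Because `andThen` and `orElse` both return partial functions, the composed extractor still answers `isDefinedAt` correctly, which is exactly what `checkInputDataTypes` relies on.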
\n", - "| rf_crs(proj_raster) | rf_extent(proj_raster) | rf_aspect(proj_raster) | rf_slope(proj_raster, 1) | rf_hillshade(proj_raster, 315, 45, 1) |
|---|---|---|---|---|
| utm-CS | {240929.2154, 4398599.0319, 256289.2154, 4401599.0319} | |||
| utm-CS | {210209.2154, 4432319.0319, 225569.2154, 4447679.0319} | |||
| utm-CS | {256289.2154, 4416959.0319, 271649.2154, 4432319.0319} | |||
| utm-CS | {271649.2154, 4509119.0319, 287009.2154, 4524479.0319} | |||
| utm-CS | {333089.2154, 4398599.0319, 341969.2154, 4401599.0319} |
| rf_local_add(proj_raster, 3) |
|---|
| proj_raster_path | footprint |
|---|---|
| https://modis-pds.s3.amazonaws.com/MCD43A4.006/11/08/2019059/MCD43A4.A2019059.h11v08.006.2019072203257_B02.TIF | POLYGON ((-70.85954815687087 8.933333332533772, -71.07986282542622 9.999999999104968, -69.99674110618135 9.999999999104968, -69.77978361352781 8.933333332533772, -70.85954815687087 8.933333332533772)) |
| https://modis-pds.s3.amazonaws.com/MCD43A4.006/11/08/2019059/MCD43A4.A2019059.h11v08.006.2019072203257_B02.TIF | POLYGON ((-69.77978361352781 8.933333332533772, -69.99674110618135 9.999999999104968, -68.91361938693649 9.999999999104968, -68.70001907018472 8.933333332533772, -69.77978361352781 8.933333332533772)) |
| https://modis-pds.s3.amazonaws.com/MCD43A4.006/11/08/2019059/MCD43A4.A2019059.h11v08.006.2019072203257_B02.TIF | POLYGON ((-68.70001907018474 8.933333332533772, -68.9136193869365 9.999999999104968, -67.83049766769162 9.999999999104968, -67.62025452684165 8.933333332533772, -68.70001907018474 8.933333332533772)) |
| https://modis-pds.s3.amazonaws.com/MCD43A4.006/11/08/2019059/MCD43A4.A2019059.h11v08.006.2019072203257_B02.TIF | POLYGON ((-67.62025452684165 8.933333332533772, -67.83049766769162 9.999999999104968, -66.74737594844675 9.999999999104968, -66.54048998349857 8.933333332533772, -67.62025452684165 8.933333332533772)) |
| https://modis-pds.s3.amazonaws.com/MCD43A4.006/11/08/2019059/MCD43A4.A2019059.h11v08.006.2019072203257_B02.TIF | POLYGON ((-66.54048998349859 8.933333332533772, -66.74737594844676 9.999999999104968, -65.66425422920187 9.999999999104968, -65.4607254401555 8.933333332533772, -66.54048998349859 8.933333332533772)) |
| id | collection | geometry |
|---|---|---|
| LC08_L1TP_232093_20210716_20210717_01_T1 | landsat-8-l1-c1 | POLYGON ((-74.64766964714028 -46.3435154... |
| LC08_L1TP_232092_20210716_20210717_01_T1 | landsat-8-l1-c1 | POLYGON ((-74.07682865409966 -44.9166888... |
| LC08_L1TP_232091_20210716_20210717_01_T1 | landsat-8-l1-c1 | POLYGON ((-73.54155930424828 -43.4885910... |
| LC08_L1TP_232090_20210716_20210717_01_T1 | landsat-8-l1-c1 | POLYGON ((-73.02667875381594 -42.0589406... |
| LC08_L1TP_232089_20210716_20210717_01_T1 | landsat-8-l1-c1 | POLYGON ((-72.67424121162182 -40.6804236... |
| rf_crs(band) | rf_extent(band) | rf_aspect(band, all) | rf_slope(band, 1, all) | rf_hillshade(band, 315, 45, 1, all) |
|---|---|---|---|---|
| utm-CS | {488445.0, -5335365.0, 503805.0, -5320005.0} | |||
| utm-CS | {657405.0, -5335365.0, 672765.0, -5320005.0} | |||
| utm-CS | {688125.0, -5335365.0, 703485.0, -5320005.0} | |||
| utm-CS | {642045.0, -5197125.0, 657405.0, -5181765.0} | |||
| utm-CS | {549885.0, -5366085.0, 565245.0, -5350725.0} |
| rf_crs(band) | rf_extent(band) | rf_aspect(band, data) | rf_slope(band, 1, data) | rf_hillshade(band, 315, 45, 1, data) |
|---|---|---|---|---|
| utm-CS | {488445.0, -5335365.0, 503805.0, -5320005.0} | |||
| utm-CS | {657405.0, -5335365.0, 672765.0, -5320005.0} | |||
| utm-CS | {688125.0, -5335365.0, 703485.0, -5320005.0} | |||
| utm-CS | {642045.0, -5197125.0, 657405.0, -5181765.0} | |||
| utm-CS | {549885.0, -5366085.0, 565245.0, -5350725.0} |
| proj_raster_path | tile | crs | ext |
|---|---|---|---|
| https://modis-pds.s3.amazonaws.com/MCD43A4.006/31/11/2017158/MCD43A4.A2017158.h31v11.006.2017171203421_B01.TIF | [int16ud32767, (256,256), [1225,1244,1247,1222,1189,1216,1206,1185,1132,1040,...,1575,1489,1281,1189,1202,1145,1171,1189,1297,1382]] | [+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs ] | [1.4455356755667E7, -2342509.0947640934, 1.4573964811098093E7, -2223901.039333] |
| https://modis-pds.s3.amazonaws.com/MCD43A4.006/31/11/2017158/MCD43A4.A2017158.h31v11.006.2017171203421_B01.TIF | [int16ud32767, (256,256), [1140,1227,1147,1106,1026,994,1047,1020,1174,1348,...,1793,1743,1685,1688,1706,1727,1766,1689,1561,1515]] | [+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs ] | [1.4573964811098093E7, -2342509.0947640934, 1.4692572866529187E7, -2223901.039333] |
| https://modis-pds.s3.amazonaws.com/MCD43A4.006/31/11/2017158/MCD43A4.A2017158.h31v11.006.2017171203421_B01.TIF | [int16ud32767, (256,256), [1546,1445,1329,1539,1653,1576,1533,1603,1610,1584,...,1399,1434,1330,1429,1470,1451,1422,1407,1369,1310]] | [+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs ] | [1.4692572866529185E7, -2342509.0947640934, 1.4811180921960281E7, -2223901.039333] |
| https://modis-pds.s3.amazonaws.com/MCD43A4.006/31/11/2017158/MCD43A4.A2017158.h31v11.006.2017171203421_B01.TIF | [int16ud32767, (256,256), [1765,1675,1704,1674,1665,1685,1551,1556,1576,1626,...,1814,1768,1771,1812,1825,1773,1737,1728,1734,1684]] | [+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs ] | [1.481118092196028E7, -2342509.0947640934, 1.4929788977391373E7, -2223901.039333] |
| https://modis-pds.s3.amazonaws.com/MCD43A4.006/31/11/2017158/MCD43A4.A2017158.h31v11.006.2017171203421_B01.TIF | [int16ud32767, (256,256), [1171,1272,1306,1294,1202,1065,998,971,976,1188,...,1455,1481,1458,1469,1449,1392,1227,1085,1102,1091]] | [+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs ] | [1.4929788977391373E7, -2342509.0947640934, 1.5048397032822467E7, -2223901.039333] |
| | proj_raster_path | tile | crs | ext |
|---|---|---|---|---|
| 0 | https://modis-pds.s3.amazonaws.com/MCD43A4.006/31/11/2017158/MCD43A4.A2017158.h31v11.006.2017171203421_B01.TIF | | (+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs ,) | (14455356.755667, -2342509.0947640934, 14573964.811098093, -2223901.039333) |
| 1 | https://modis-pds.s3.amazonaws.com/MCD43A4.006/31/11/2017158/MCD43A4.A2017158.h31v11.006.2017171203421_B01.TIF | | (+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs ,) | (14573964.811098093, -2342509.0947640934, 14692572.866529187, -2223901.039333) |
| 2 | https://modis-pds.s3.amazonaws.com/MCD43A4.006/31/11/2017158/MCD43A4.A2017158.h31v11.006.2017171203421_B01.TIF | | (+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs ,) | (14692572.866529185, -2342509.0947640934, 14811180.921960281, -2223901.039333) |
| 3 | https://modis-pds.s3.amazonaws.com/MCD43A4.006/31/11/2017158/MCD43A4.A2017158.h31v11.006.2017171203421_B01.TIF | | (+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs ,) | (14811180.92196028, -2342509.0947640934, 14929788.977391373, -2223901.039333) |
| | iata | airport | city | state | country | lat | long | cnt |
|---|---|---|---|---|---|---|---|---|
| 0 | ORD | Chicago O'Hare International | Chicago | IL | USA | 41.979595 | -87.904464 | 25129 |
| 1 | ATL | William B Hartsfield-Atlanta Intl | Atlanta | GA | USA | 33.640444 | -84.426944 | 21925 |
| 2 | DFW | Dallas-Fort Worth International | Dallas-Fort Worth | TX | USA | 32.895951 | -97.037200 | 20662 |
| 3 | PHX | Phoenix Sky Harbor International | Phoenix | AZ | USA | 33.434167 | -112.008056 | 17290 |
| 4 | DEN | Denver Intl | Denver | CO | USA | 39.858408 | -104.667002 | 13781 |
| 5 | IAH | George Bush Intercontinental | Houston | TX | USA | 29.980472 | -95.339722 | 13223 |
| 6 | SFO | San Francisco International | San Francisco | CA | USA | 37.619002 | -122.374843 | 12016 |
| 7 | LAX | Los Angeles International | Los Angeles | CA | USA | 33.942536 | -118.408074 | 11797 |
| 8 | MCO | Orlando International | Orlando | FL | USA | 28.428889 | -81.316028 | 10536 |
| 9 | CLT | Charlotte/Douglas International | Charlotte | NC | USA | 35.214011 | -80.943126 | 10490 |
| 9 | \n", - "CLT | \n", - "Charlotte/Douglas International | \n", - "Charlotte | \n", - "NC | \n", - "USA | \n", - "35.214011 | \n", - "-80.943126 | \n", - "10490 | \n", - "