Merged
3 changes: 3 additions & 0 deletions README.md
@@ -20,6 +20,9 @@ This will also run the Scala unit tests.
To run the Python unit tests, run the `run-tests.sh` script from the `python/` directory.
You will need to set `SPARK_HOME` to your local Spark installation directory.

## Releasing a new version
Please see the guide in `dev/release_guide.md`.

## Spark version compatibility

This project is compatible with Spark 2.4+. However, significant speed improvements have been
6 changes: 0 additions & 6 deletions dev/build-docs-in-docker.sh

This file was deleted.

23 changes: 0 additions & 23 deletions dev/build-docs.sh

This file was deleted.

141 changes: 0 additions & 141 deletions dev/release.py

This file was deleted.

54 changes: 54 additions & 0 deletions dev/release_guide.md
@@ -0,0 +1,54 @@
# Guide for releasing a new GraphFrames version

## How to build the GraphFrames package?

To build a GraphFrames package for release, run the following commands from the repository root:

```
cd graphframe_repo

# Build GraphFrames against Scala 2.12.12
build/sbt ++2.12.12 clean spDist

# Build GraphFrames against Scala 2.13.8
build/sbt ++2.13.8 clean spDist
```

Each command generates a zip file at the following path:
```
target/graphframes-{graphframe-version}-spark{spark-version}-s_{scala_version}.zip
```
This zip file is the GraphFrames package to publish. It contains the JAR file and the POM file.
Note that the Python module files are included in the JAR file.
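
As a quick sanity check on the naming pattern above, the sketch below assembles the expected zip path from its three parts. The function name `release_zip_path` and the version values are illustrative assumptions; they are not part of the build or the repository.

```shell
# Hypothetical helper (not part of the repository): print the expected zip path
# for a given GraphFrames version, Spark version, and Scala version.
release_zip_path() {
  gf_version="$1"
  spark_version="$2"
  scala_version="$3"
  printf 'target/graphframes-%s-spark%s-s_%s.zip\n' \
    "$gf_version" "$spark_version" "$scala_version"
}

# Example with made-up version numbers; substitute the values from your build.
release_zip_path 1.2.3 3.3 2.12
```

Comparing the printed path with the file that actually appears under `target/` is a cheap way to confirm which Scala build a given zip came from before uploading it.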

## How to publish the GraphFrames package?

To publish the GraphFrames package, you need the "admin" role on the https://github.com/graphframes/graphframes project.

Log in to the https://spark-packages.org/package/graphframes/graphframes website,
then upload the zip file generated in the build section above.

## How to publish the GraphFrames docs?

The GraphFrames docs are hosted at https://graphframes.github.io/graphframes/. To publish them,
build the doc content, then push it to the `gh-pages` branch of the https://github.com/graphframes/graphframes project.

Before building the docs, install Jekyll; see `docs/README.md` for details.

The following commands build and publish the docs:
```
cd graphframe_repo

cd ./docs
SKIP_SCALADOC=0 PRODUCTION=1 jekyll build
# Return to the repository root so the docs/_site path used below resolves
cd ..

git fetch upstream gh-pages:gh-pages
git checkout gh-pages

# The doc content is under docs/_site directory
git add -f docs/_site

git commit -m "doc update for version xx"
git push upstream gh-pages
```
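
Before switching branches, it may be worth confirming that the Jekyll build actually produced output. The sketch below is one possible guard; the helper name `site_is_built` is hypothetical, and the default path simply mirrors the `docs/_site` directory mentioned in the commands above.

```shell
# Hypothetical helper (not part of the repository): succeed only if the given
# site directory exists and contains at least one entry.
site_is_built() {
  site_dir="${1:-docs/_site}"
  [ -d "$site_dir" ] && [ -n "$(ls -A "$site_dir" 2>/dev/null)" ]
}

# Usage, from the repository root after `jekyll build`:
#   site_is_built docs/_site || { echo "docs/_site is missing or empty" >&2; exit 1; }
```

Running the guard before `git checkout gh-pages` avoids committing an empty or stale site when the build step failed silently.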

9 changes: 8 additions & 1 deletion src/main/scala/org/graphframes/GraphFrame.scala
@@ -636,7 +636,14 @@ object GraphFrame extends Serializable with Logging {
}
}

/** Column name for vertex IDs in [[GraphFrame.vertices]] */
/**
* Column name for vertex IDs in [[GraphFrame.vertices]].
* Note that GraphFrame assigns a unique long ID to each vertex.
* If the vertex ID type is one of byte / int / long / short,
* GraphFrame casts the original IDs to long and uses them as the unique long IDs;
* otherwise GraphFrame generates the unique long IDs with the Spark function
* ``monotonically_increasing_id``, which is less performant.
*/
val ID: String = "id"

/**