graphframes · WeichenXu123 · Jul 17, 2024
diff --git a/README.md b/README.md
@@ -20,6 +20,9 @@ This will also run the Scala unit tests.
 To run the Python unit tests, run the `run-tests.sh` script from the `python/` directory.
 You will need to set `SPARK_HOME` to your local Spark installation directory.
 
+## Releasing a new version
+See the release guide in `dev/release_guide.md`.
+
 ## Spark version compatibility
 
 This project is compatible with Spark 2.4+.  However, significant speed improvements have been

diff --git a/dev/build-docs-in-docker.sh b/dev/build-docs-in-docker.sh
diff --git a/dev/build-docs.sh b/dev/build-docs.sh
diff --git a/dev/release.py b/dev/release.py
diff --git a/dev/release_guide.md b/dev/release_guide.md
@@ -0,0 +1,54 @@
+# Guide for releasing a new GraphFrames version
+
+## How to build the GraphFrames package
+
+To build a GraphFrames package for release, run the following commands:
+
+```
+cd graphframe_repo
+
+# build GraphFrames against Scala 2.12.12
+build/sbt ++2.12.12 clean spDist
+
+# build GraphFrames against Scala 2.13.8
+build/sbt ++2.13.8 clean spDist
+```
+
+Each command above generates a zip file at the following path:
+```
+target/graphframes-{graphframe-version}-spark{spark-version}-s_{scala_version}.zip
+```
+The zip file is the GraphFrames package to publish. It contains a JAR file and a POM file.
+Note that the Python module files are included in the JAR file.
+
+## How to publish the GraphFrames package
+
+To publish the GraphFrames package, you need the "admin" role in the https://github.com/graphframes/graphframes project.
+
+Log in to the https://spark-packages.org/package/graphframes/graphframes website,
+then upload the zip file generated by the build instructions above.
+
+## How to publish the GraphFrames docs
+
+The GraphFrames docs are hosted at https://graphframes.github.io/graphframes/. To publish them,
+build the doc content, then push it to the gh-pages branch of the https://github.com/graphframes/graphframes project.
+
+Before building the docs, install Jekyll; see `docs/README.md` for details.
+
+The following commands build and publish the docs:
+```
+cd graphframe_repo
+
+cd ./docs
+SKIP_SCALADOC=0 PRODUCTION=1 jekyll build
+
+git fetch upstream gh-pages:gh-pages
+git checkout gh-pages
+
+# The doc content is under the docs/_site directory; the current directory is already docs/
+git add -f _site
+
+git commit -m "doc update for version xx"
+git push upstream gh-pages
+```
+
diff --git a/src/main/scala/org/graphframes/GraphFrame.scala b/src/main/scala/org/graphframes/GraphFrame.scala
@@ -636,7 +636,14 @@ object GraphFrame extends Serializable with Logging {
     }
   }
 
-  /** Column name for vertex IDs in [[GraphFrame.vertices]] */
+  /**
+   * Column name for vertex IDs in [[GraphFrame.vertices]].
+   * Note that GraphFrame assigns a unique long ID to each vertex.
+   * If the vertex ID type is byte, short, int, or long,
+   * GraphFrame casts the original ID to long and uses it as the unique long ID;
+   * otherwise GraphFrame generates the unique long ID with the Spark function
+   * `monotonically_increasing_id`, which is less performant.
+   */
   val ID: String = "id"
 
   /**