Using the new Python package requires all Connect-related dependencies to be installed, even if you are not using Spark Connect.
% pip install pyspark graphframes-py
% pyspark
Python 3.11.13 (main, Jun 3 2025, 18:38:25) [Clang 16.0.0 (clang-1600.0.26.6)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
WARNING: Using incubator modules: jdk.incubator.vector
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
25/07/17 13:57:28 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 4.0.0
      /_/
Using Python version 3.11.13 (main, Jun 3 2025 18:38:25)
Spark context available as 'sc' (master = local[*], app id = local-1752775048865).
SparkSession available as 'spark'.
>>> from graphframes import GraphFrame
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../test-venv/lib/python3.11/site-packages/graphframes/__init__.py", line 1, in <module>
    from .graphframe import GraphFrame
  File ".../test-venv/lib/python3.11/site-packages/graphframes/graphframe.py", line 39, in <module>
    from graphframes.connect.graphframe_client import GraphFrameConnect
  File ".../test-venv/lib/python3.11/site-packages/graphframes/connect/graphframe_client.py", line 4, in <module>
    from pyspark.sql.connect import proto
  File ".../test-venv/lib/python3.11/site-packages/pyspark/sql/connect/proto/__init__.py", line 18, in <module>
    from pyspark.sql.connect.proto.base_pb2_grpc import *
  File ".../test-venv/lib/python3.11/site-packages/pyspark/sql/connect/proto/base_pb2_grpc.py", line 19, in <module>
    import grpc
ModuleNotFoundError: No module named 'grpc'
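The traceback shows that graphframe.py imports the Connect client at module load time, which is what pulls in grpc. One possible way to avoid this, offered only as a sketch and not as the project's actual fix, is to make that import optional. `optional_import` below is a hypothetical helper that is not part of graphframes or pyspark; the module path and class name in the usage part are taken from the traceback above.

```python
import importlib


def optional_import(module_name, attr):
    """Return `attr` from `module_name`, or None if the import fails.

    Hypothetical helper for illustration only. ModuleNotFoundError is a
    subclass of ImportError, so a missing transitive dependency such as
    grpc is caught here as well.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attr)


if __name__ == "__main__":
    # Assumption: names copied from the traceback; the result is None when
    # graphframes or any Spark Connect dependency is not installed.
    GraphFrameConnect = optional_import(
        "graphframes.connect.graphframe_client", "GraphFrameConnect"
    )
    print("Connect client available:", GraphFrameConnect is not None)
```

Code that needs Spark Connect would then check for `None` and raise a clear error at the point of use. Catching `ImportError` this broadly can also hide genuine import bugs inside the optional module, so a real fix might prefer to defer the import into the code path that actually uses Connect.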