PySparkRuntimeError: [JAVA_GATEWAY_EXITED] Java gateway process exited before sending its port number

Can anyone help me with what I am doing incorrectly? This app runs fine on my local computer, but I'm having a hard time deploying it.
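For context, here is the part of data/pyapp.py that triggers the error, as shown in the traceback below (the imports are my shorthand reconstruction so the snippet is self-contained; the numbered lines 12-18 are verbatim from the traceback):

```python
import nltk
from pyspark.sql import SparkSession

# NLTK data downloads succeed in the deploy log below
nltk.download('punkt')
nltk.download('stopwords')

# Create Spark session -- this is the line that fails on deploy (pyapp.py:15)
spark = SparkSession.builder.appName("LegalSearch").getOrCreate()

# Preprocessed data are stored at relative paths in the data folder
path_to_flat_words = "./data/flat_words.parquet"
```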
Error:

```
PySparkRuntimeError: [JAVA_GATEWAY_EXITED] Java gateway process exited before sending its port number.
```

Deployment log:

```
[19:50:20] πŸ–₯ Provisioning machine...
[19:50:29] πŸŽ› Preparing system...
[19:49:49] πŸš€ Starting up repository: 'searchengine', branch: 'main', main module: 'data/pyapp.py'
[19:49:49] πŸ™ Cloning repository...
[19:49:52] πŸ™ Cloning into '/mount/src/searchengine'...
Warning: Permanently added the ED25519 host key for IP address '140.82.116.3' to the list of known hosts.
[19:49:52] πŸ™ Cloned repository!
[19:49:52] πŸ™ Pulling code changes from Github...
[19:49:53] πŸ“¦ Processing dependencies...

──────────────────────────────────────── uv ───────────────────────────────────────────

Using uv pip install.
Resolved 51 packages in 3.84s
Downloaded 51 packages in 18.45s
Installed 51 packages in 185ms
 + altair==5.3.0
 + attrs==23.2.0
 + blinker==1.7.0
 + cachetools==5.3.3
 + certifi==2024.2.2
 + charset-normalizer==3.3.2
 + click==8.1.7
 + gitdb==4.0.11
 + gitpython==3.1.43
 + idna==3.7
 + importlib-metadata==6.11.0
 + jinja2==3.1.3
 + joblib==1.4.0
 + jsonschema==4.21.1
 + jsonschema-specifications==2023.12.1
 + markdown-it-py==3.0.0
 + markupsafe==2.1.5
 + mdurl==0.1.2
 + nltk==3.8.1
 + numpy==1.26.4
 + packaging==23.2
 + pandas==2.2.2
 + pillow==10.3.0
 + protobuf==4.25.3
 + py4j==0.10.9.7
 + pyarrow==16.0.0
 + pydeck==0.8.0
 + pygments==2.17.2
 + pyspark==3.5.1
 + python-dateutil==2.9.0.post0
 + pytz==2024.1
 + referencing==0.34.0
 + regex==2024.4.16
 + requests==2.31.0
 + rich==13.7.1
 + rpds-py==0.18.0
 + six==1.16.0
 + smmap==5.0.1
 + streamlit==1.29.0
 + tenacity==8.2.3
 + toml==0.10.2
 + toolz==0.12.1
 + tornado==6.4
 + tqdm==4.66.2
 + typing-extensions==4.11.0
 + tzdata==2024.1
 + tzlocal==5.2
 + urllib3==2.2.1
 + validators==0.28.1
 + watchdog==4.0.0
 + zipp==3.18.1
Checking if Streamlit is installed
Found Streamlit version 1.29.0 in the environment

────────────────────────────────────────────────────────────────────────────────────────
[19:50:21] 🐍 Python dependencies were installed from /mount/src/searchengine/data/requirements.txt using uv.
[19:50:21] πŸ“¦ WARN: More than one requirements file detected in the repository. Available options: uv /mount/src/searchengine/data/requirements.txt, uv /mount/src/searchengine/requirements.txt. Used: uv with /mount/src/searchengine/data/requirements.txt
Check if streamlit is installed
Streamlit is already installed
[19:50:22] πŸ“¦ Processed dependencies!
[19:50:33] β›“ Spinning up manager process...
[nltk_data] Downloading package punkt to /home/appuser/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package stopwords to
[nltk_data]     /home/appuser/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
/home/adminuser/venv/lib/python3.11/site-packages/pyspark/bin/load-spark-env.sh: line 68: ps: command not found
JAVA_HOME is not set
────────────────────── Traceback (most recent call last) ───────────────────────
  /home/adminuser/venv/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py:534 in _run_script

  /mount/src/searchengine/data/pyapp.py:15 in <module>

     12 nltk.download('stopwords')
     13
     14 # Create Spark session
  ❱  15 spark = SparkSession.builder.appName("LegalSearch").getOrCreate()
     16
     17 # Assuming preprocessed data are stored in relative paths in the data
     18 path_to_flat_words = "./data/flat_words.parquet"

  /home/adminuser/venv/lib/python3.11/site-packages/pyspark/sql/session.py:497 in getOrCreate

     494 β”‚   β”‚   β”‚   β”‚   β”‚   for key, value in self._options.items():
     495 β”‚   β”‚   β”‚   β”‚   β”‚   β”‚   sparkConf.set(key, value)
     496 β”‚   β”‚   β”‚   β”‚   β”‚   # This SparkContext may be an existing one.
  ❱  497 β”‚   β”‚   β”‚   β”‚   β”‚   sc = SparkContext.getOrCreate(sparkConf)
     498 β”‚   β”‚   β”‚   β”‚   β”‚   # Do not update `SparkConf` for existing `SparkCo
     499 β”‚   β”‚   β”‚   β”‚   β”‚   # by all sessions.
     500 β”‚   β”‚   β”‚   β”‚   β”‚   session = SparkSession(sc, options=self._options)

  /home/adminuser/venv/lib/python3.11/site-packages/pyspark/context.py:515 in getOrCreate

     512 β”‚   β”‚   """
     513 β”‚   β”‚   with SparkContext._lock:
     514 β”‚   β”‚   β”‚   if SparkContext._active_spark_context is None:
  ❱  515 β”‚   β”‚   β”‚   β”‚   SparkContext(conf=conf or SparkConf())
     516 β”‚   β”‚   β”‚   assert SparkContext._active_spark_context is not None
     517 β”‚   β”‚   β”‚   return SparkContext._active_spark_context
     518

  /home/adminuser/venv/lib/python3.11/site-packages/pyspark/context.py:201 in __init__

     198 β”‚   β”‚   β”‚   β”‚   " is not allowed as it is a security risk."
     199 β”‚   β”‚   β”‚   )
     200 β”‚   β”‚
  ❱  201 β”‚   β”‚   SparkContext._ensure_initialized(self, gateway=gateway, conf=
     202 β”‚   β”‚   try:
     203 β”‚   β”‚   β”‚   self._do_init(
     204 β”‚   β”‚   β”‚   β”‚   master,

  /home/adminuser/venv/lib/python3.11/site-packages/pyspark/context.py:436 in _ensure_initialized

     433 β”‚   β”‚   """
     434 β”‚   β”‚   with SparkContext._lock:
     435 β”‚   β”‚   β”‚   if not SparkContext._gateway:
  ❱  436 β”‚   β”‚   β”‚   β”‚   SparkContext._gateway = gateway or launch_gateway(con
     437 β”‚   β”‚   β”‚   β”‚   SparkContext._jvm = SparkContext._gateway.jvm
     438 β”‚   β”‚   β”‚
     439 β”‚   β”‚   β”‚   if instance:

  /home/adminuser/venv/lib/python3.11/site-packages/pyspark/java_gateway.py:107 in launch_gateway

    104 β”‚   β”‚   β”‚   β”‚   time.sleep(0.1)
    105 β”‚   β”‚   β”‚
    106 β”‚   β”‚   β”‚   if not os.path.isfile(conn_info_file):
  ❱ 107 β”‚   β”‚   β”‚   β”‚   raise PySparkRuntimeError(
    108 β”‚   β”‚   β”‚   β”‚   β”‚   error_class="JAVA_GATEWAY_EXITED",
    109 β”‚   β”‚   β”‚   β”‚   β”‚   message_parameters={},
    110 β”‚   β”‚   β”‚   β”‚   β”‚   )
────────────────────────────────────────────────────────────────────────────────

PySparkRuntimeError: [JAVA_GATEWAY_EXITED] Java gateway process exited before sending its port number.
```

GitHub link to the project: https://github.com/abh2050/searchengine/tree/main/data
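From the log, the line that stands out to me is `JAVA_HOME is not set`. PySpark starts a JVM through the Py4J gateway, so if the deployment machine has no Java runtime, the gateway process dies before it can report its port, which matches the error above. Assuming this is Streamlit Community Cloud, which (as far as I understand) installs apt packages listed in a `packages.txt` file at the repository root, would adding a Java runtime there be the right fix? A minimal sketch of the `packages.txt` I have in mind:

```
default-jre
```

If that is not the right mechanism, any pointer to how to get Java onto the deployment machine would be appreciated.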