Java requirement

Hello,
I am building a Streamlit app to extract and download tables from PDFs using the tabula library.
Here is the code:

import tempfile
import streamlit as st
import tabula as tb 
import pandas as pd

uploaded_file2 = st.sidebar.file_uploader("Upload PDF", type=["pdf"], label_visibility="visible")

if uploaded_file2 :

    tables = tb.read_pdf(uploaded_file2, pages='all')

    colpi1_t, colpi2_t, colpi3_t = st.columns([40,20,40])

    with colpi2_t:
        tabe_idx = st.number_input(label=f"{len(tables)} Tables", value=1, min_value=1,
            max_value=len(tables), step=1, label_visibility="visible")

    temp = tempfile.NamedTemporaryFile(delete=True)
    temp_filename = temp.name + '.xlsx'

    with pd.ExcelWriter(temp_filename) as writer:
        for i in range(len(tables)):
            tables[i].to_excel(writer, sheet_name=f"table{i}")

    with open(temp_filename, "rb") as f:
        binary_data = f.read()

    excel_namefile = f"Tables_{uploaded_file2.name[:-4]}.xlsx"

    st.download_button(
        label="Download",
        data=binary_data,
        file_name=excel_namefile)

    st.dataframe(tables[tabe_idx-1].style.format(na_rep='No Data', precision=1), use_container_width=True)

This app runs correctly locally, but when hosted, I get this error: tabula.errors.JavaNotFoundError: java command is not found from this Python process. Please ensure Java is installed and PATH is set for java

Can anyone help me fix this?

Thank you all.

Then you need to install Java in your cloud environment. Use packages.txt to install dependencies other than Python packages. I think the package you need is default-jre-headless.

Hello, Thank you for your answer.
Can you give me an example? I am new to IT development and I don't know much about cloud environments.

Create the file packages.txt, write default-jre-headless in it, that's all. There is an example in the link I posted, just with two other packages that you don't need.
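
If you want to confirm that the hosted environment can actually see Java before calling tabula, a small check like the one below might help. This is only a sketch: the helper name java_available is made up for illustration, and it only tests whether a java executable is on the PATH of the Python process, which is the condition the JavaNotFoundError message describes.

```python
import shutil


def java_available(command: str = "java") -> bool:
    """Return True if `command` can be found on PATH for this Python process.

    Hypothetical helper for illustration; it is not part of tabula or Streamlit.
    """
    return shutil.which(command) is not None


if __name__ == "__main__":
    if java_available():
        print("java found at:", shutil.which("java"))
    else:
        print("java not found on PATH; tabula will not be able to run")
```

In the app, you could call java_available() before tb.read_pdf and show a message with st.error when it returns False, so that a missing Java install shows up as a readable message instead of a traceback.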

Thank you so much, it works!