Getting an error when referencing an uploaded file more than once

When I read the uploaded file once and store the result in a variable, there are no issues. But when I reference the uploaded file a second time to store the result in another variable, I get the following error:

Here is my code:

    @st.cache()
    def read_pdf(file, coordinates, page):
        df = tabula.read_pdf(file, area=coordinates, pages=page)
        return df

    pg2_df = read_pdf(uploaded_file, (11.858, 174.267, 151.088, 565.947), '2')[0]
    df2 = read_pdf(uploaded_file, (204.638, 176.333, 756.968, 556.538), '1')[0]

As you can see, I pass the uploaded file to read_pdf twice, once for each variable, and that is when the error occurs.


@Marisa_Smith

Hi @Mohd_Saad,

It’s very hard to determine what is causing this without the full code or knowing whether the file location you passed is correct. The error seems to indicate that the file is either empty or doesn’t exist.

I have actually never used tabula so I’m not sure about its syntax in your code example. Also, you reference uploaded_file but haven’t included this in your Minimum Working Example (MWE).

Can you please provide a more detailed code snippet, or ideally a link to a GitHub repo where you’re developing this code to better understand your issue?

Please follow the community guidelines on writing posts that help others understand your question. That makes it easier for you to get an answer.

Happy Streamlit-ing!
Marisa

The first time you read the file, it is consumed. So, the second time you try to read it, there is no data in the file object.
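
To illustrate the point, here is a minimal sketch using a plain io.BytesIO buffer as a stand-in for the uploaded file. This assumes the object returned by Streamlit's file uploader behaves like an in-memory binary file; the byte string is made up for the example.

```python
import io

# Stand-in for an uploaded file: an in-memory binary buffer (example bytes only).
buffer = io.BytesIO(b"%PDF-1.4 example bytes")

first_read = buffer.read()    # reads everything and leaves the position at the end
second_read = buffer.read()   # nothing left after the current position

buffer.seek(0)                # move the position back to the start
third_read = buffer.read()    # the full contents are available again

print(len(first_read), len(second_read), len(third_read))
```

The data itself is not deleted by the first read. Only the read position has moved to the end, which is why a second reader sees an empty file.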

How best to fix this is difficult to advise on, given how little code you have included.

But, just for fun, try adding the line

    uploaded_file.seek(0, 0)

in between your two calls to the read_pdf function
(i.e., add a new line before df2 = ... and paste in the line I’ve written).
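
If the rewind does help, one way to avoid forgetting it might be a small wrapper that seeks to the start before every read. This is only a sketch: read_from_start is a hypothetical helper name, and the lambda reader below stands in for whatever function actually parses the file.

```python
import io

def read_from_start(file_obj, reader):
    """Rewind file_obj to its beginning, then hand it to reader."""
    file_obj.seek(0)  # same effect as seek(0, 0): offset 0 from the start
    return reader(file_obj)

# Demo with a stand-in buffer and a trivial reader (both are assumptions).
fake_upload = io.BytesIO(b"page data")
first = read_from_start(fake_upload, lambda f: f.read())
second = read_from_start(fake_upload, lambda f: f.read())

print(first == second)
```

With the code from the question, the reader could presumably be something like lambda f: tabula.read_pdf(f, area=coordinates, pages=page), though I have not tested that combination.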

It will be interesting to hear if that helps, or even works!
