I’m running into two problems. I’m working on a file uploader, and during the build I get the error “‘NoneType’ object has no attribute ‘seek’”, even though my code works and I get the output I am looking for.
The second error appears when I deploy the app: File “/home/appuser/venv/lib/python3.7/site-packages/streamlit/script_runner.py”, line 354, in _run_script exec(code, module.__dict__)
and
File “/app/293pending_cases/pending_cases.py”, line 4, in <module>
import docx2txt
However, I have removed the docx2txt import and I am still getting this error. When it does clear, it’s followed by import pyPDF not found… I am not sure where I am going wrong. Any advice is greatly appreciated.
I can tell you that I have successfully deployed apps with both docx2txt and pyPDF, so it is not a fundamental problem with the libraries. Most likely it is a missing from X import Y statement. If you post the import code and the full error message, we can probably help more.
This is in the build…
AttributeError: ‘NoneType’ object has no attribute ‘seek’
Traceback:
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/streamlit/script_runner.py", line 354, in _run_script
    exec(code, module.__dict__)
  File "/Users/hector/codeup-data-science/293rd_pending_cases/pending_cases.py", line 29, in <module>
    pdf_raw_text = read_pdf(pdf_file)
  File "/Users/hector/codeup-data-science/293rd_pending_cases/pending_cases.py", line 8, in read_pdf
    pdfReader = PdfFileReader(file) #reads pdf
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/PyPDF2/pdf.py", line 1084, in __init__
    self.read(stream)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/PyPDF2/pdf.py", line 1689, in read
    stream.seek(-1, 2)
and this is when I deploy…
Well, I am now getting a “Mismatched workspace name and repository owner” error when I go to deploy… this thing is frustrating.
I hope this helps…
import streamlit as st
import pandas as pd
from PyPDF2 import PdfFileReader
import re
import gspread

def read_pdf(file):
    pdfReader = PdfFileReader(file)  # reads the PDF
    count = pdfReader.numPages  # counts the number of pages
    content = " "  # placeholder for the PDF content
    for i in range(count):  # loop to extract text from all pages
        page = pdfReader.getPage(i)  # gets page i
        content += page.extractText()  # appends the text of that page
    return content
Here is the PyPDF error I was talking about…
File "/app/293rd_pending_cases/pending_cases.py", line 3, in <module>
from PyPDF2 import PdfFileReader
ModuleNotFoundError: No module named 'PyPDF2'
You need to do import PyPDF2 before the from statement.
The mismatched workspace/repository issue has to do with your GitHub configuration; it sounds like an issue with who owns the repository.
Unfortunately, adding the import ahead of the from statement didn’t work. I was in the wrong workspace when trying to deploy…
Hi @Hector_Rodriguez_Jr, welcome to the Streamlit community!
ModuleNotFoundError: No module named ‘PyPDF2’
The issue is that your repository does not contain a requirements file with your Python dependencies. As such, Streamlit Cloud has not installed packages like PyPDF2, docx2txt, gspread, etc., that your app uses.
Read our documentation on App dependencies and a knowledge base article on the ModuleNotFoundError.
You have the option of manually creating a requirements.txt file and including a Python package on each line. Take care to use the package name as it appears on PyPI, e.g. scikit-learn, not sklearn.
Alternatively, you can automate the creation of a requirements.txt using pipreqs. Run:
pipreqs /path/to/293rd_pending_cases/
It will create a requirements.txt file for you that you can upload to GitHub.
When I run the above command upon cloning your repo, it creates a requirements file with the following entries:
gspread==5.1.1
streamlit==1.4.0
docx2txt==0.8
df2gspread==1.0.4
pandas==1.2.5
PyPDF2==1.26.0
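Before pushing, it can help to double-check that every package the app needs is actually listed in the file. The following is only a minimal sketch and is not part of pipreqs or Streamlit: the helper names (normalize, listed_packages, missing_requirements) are made up for illustration, and the check just compares PyPI names after the usual lowercase/hyphen normalization.

```python
import re


def normalize(name):
    """Normalize a PyPI project name: lowercase, runs of '-', '_', '.' become '-'."""
    return re.sub(r"[-_.]+", "-", name.strip()).lower()


def listed_packages(requirements_text):
    """Collect the normalized project names found in requirements.txt content."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The project name is everything before the first specifier, extra or marker.
        name = re.split(r"[<>=!~;\[ ]", line, maxsplit=1)[0]
        names.add(normalize(name))
    return names


def missing_requirements(requirements_text, needed):
    """Return the entries of `needed` that the requirements text does not list."""
    listed = listed_packages(requirements_text)
    return [name for name in needed if normalize(name) not in listed]


requirements = "gspread==5.1.1\nstreamlit==1.4.0\npandas==1.2.5\n"
print(missing_requirements(requirements, ["streamlit", "pandas", "gspread", "PyPDF2"]))
# ['PyPDF2']
```

The list of needed packages has to use PyPI names (scikit-learn rather than sklearn), since the sketch does not try to map import names to package names.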
Hope this helps!
Best,
Snehan
I had read something to this effect, but wasn’t sure about the requirements.txt. I am going to try this. Thank you!
Dude you are the MAN!!!
Thank you!
I may have to do that…thanks
I have another question. How do I get rid of this error?
  File "/home/appuser/venv/lib/python3.7/site-packages/streamlit/script_runner.py", line 354, in _run_script
    exec(code, module.__dict__)
  File "/app/293rd_pending_cases/pending_cases.py", line 33, in <module>
    self.read(stream)
  File "/home/appuser/venv/lib/python3.7/site-packages/PyPDF2/pdf.py", line 1689, in read
    stream.seek(-1, 2)
I think it means PyPDF is trying to access a non-existent file object.
It was. I found it… I have looked at this code over and over and it was sitting right in front of my face!
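For anyone landing here with the same traceback: st.file_uploader returns None until the user has actually picked a file, so on the first run of the script PdfFileReader can end up receiving None and failing at stream.seek. Below is a minimal sketch of a guard, assuming that was the cause here (the thread does not say exactly what the fix was). The helper name extract_text_if_uploaded is hypothetical, and the Streamlit wiring is shown only in comments.

```python
import io


def extract_text_if_uploaded(uploaded_file, extract):
    """Run `extract` on the upload, or return None when nothing was uploaded yet."""
    if uploaded_file is None:
        return None
    return extract(uploaded_file)


# In the Streamlit script this might be wired up roughly like so:
#   pdf_file = st.file_uploader("Upload a PDF", type=["pdf"])
#   pdf_raw_text = extract_text_if_uploaded(pdf_file, read_pdf)
#   if pdf_raw_text is None:
#       st.info("Upload a PDF to get started.")

# Stand-in demonstration that needs neither Streamlit nor PyPDF2:
fake_upload = io.BytesIO(b"hello")
print(extract_text_if_uploaded(None, lambda f: f.read()))         # None
print(extract_text_if_uploaded(fake_upload, lambda f: f.read()))  # b'hello'
```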
Next problem: I am getting a “FileNotFoundError: [Errno 2] No such file or directory:” error for my JSON file.
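The thread does not show how this one was resolved. One common cause, offered only as a guess, is that a relative path gets resolved against the working directory rather than the folder containing the script (the deploy tracebacks show the app living in a subfolder, /app/293rd_pending_cases/). If the JSON file sits in the repository next to the script, building the path from the script's own location avoids that. The helper name path_beside and the file name credentials.json are hypothetical.

```python
from pathlib import Path


def path_beside(script_path, filename):
    """Return the path of `filename` in the same folder as `script_path`."""
    return Path(script_path).parent / filename


# In the app itself, pass __file__ as the script path:
#   json_path = path_beside(__file__, "credentials.json")
print(path_beside("/app/293rd_pending_cases/pending_cases.py", "credentials.json"))
```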