I am trying to extract text from a PDF by converting it into an image, with the PDF as the input file.
In Streamlit, I import the PDF and call the function like this:
uploaded_file = st.file_uploader('Import PDF from local', type='pdf')
if uploaded_file is not None:
    text = pic_to_text(uploaded_file)
    st.success(text)
I am getting the error: expected str, bytes or os.PathLike object, not UploadedFile
I do not want to extract the pages or text from the PDF beforehand; I want to pass the PDF directly as input to my function.
Can you post a link to your code? You're passing your uploaded file into a function that I cannot see. I imagine that somewhere in that function you're passing the uploaded file directly into something that expects a 'str', 'bytes' or a path.
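For what it's worth, here is a minimal sketch of the two usual workarounds, assuming the call inside pic_to_text wants either raw bytes or a filesystem path. Streamlit documents UploadedFile as a subclass of BytesIO, so io.BytesIO is used as a stand-in below. The helper names upload_to_bytes and upload_to_temp_path are made up for illustration and are not part of Streamlit.

```python
import io
import os
import tempfile


def upload_to_bytes(uploaded_file):
    """Return the full contents of a BytesIO-like upload as bytes."""
    # getvalue() returns everything regardless of the current read position.
    return uploaded_file.getvalue()


def upload_to_temp_path(uploaded_file, suffix=".pdf"):
    """Write the upload to a temporary file and return its path.

    The caller is responsible for deleting the file afterwards.
    """
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(uploaded_file.getvalue())
        return tmp.name


# Stand-in for the object returned by st.file_uploader (assumption: BytesIO-like).
fake_upload = io.BytesIO(b"%PDF-1.4 example bytes")
pdf_bytes = upload_to_bytes(fake_upload)      # for APIs that accept bytes
pdf_path = upload_to_temp_path(fake_upload)   # for APIs that insist on a path
```

Which of the two you need depends on what the failing call accepts. If it only takes a path, the temporary file route is the fallback, and the file should be removed with os.remove once you are done with it.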
Check out this discussion where someone has a similar issue (docs typo already fixed):
def pic_to_text(infile):
    """Detects text in an image file
    ARGS
    infile: path to image file
    RETURNS
    String of text detected in image
    """
    # Instantiates a client
    client = vision.ImageAnnotatorClient()
    # Opens the input image file
    content = infile.tobytes()
    image = vision.Image(content=content)
    # For dense text, use document_text_detection
    # For less dense text, use text_detection
    response = client.text_detection(image=image, image_context={"language_hints": ["en"]})
    text = response.text_annotations[0].description
    # print("Detected text: {}".format(text))
    return text
"""Translates text to a given language using a glossary
ARGS
text: String of text to translate
source_language_code: language of input text
target_language_code: language of output text
project_id: GCP project id
glossary_name: name you gave your project's glossary
resource when you created it
RETURNS
String of translated text
"""
# Instantiates a client
client = translate.TranslationServiceClient()
# Designates the data center location that you want to use
location = "us-central1"
project_id = "testprojectincloud"
parent = f"projects/{project_id}/locations/{location}"
result = client.translate_text(
    request={
        "parent": parent,
        "contents": [text],
        "mime_type": "text/plain",  # mime types: text/plain, text/html
        "source_language_code": source_language_code,
        "target_language_code": target_language_code,
    }
)
# Extract translated text from API response
return result.translations
In the code above I pass in the PDF file, convert the PDF to an image held in memory, convert that image into bytes, and pass the bytes to the Google Cloud API to extract the text from the image.
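A rough sketch of that pipeline, assuming the PDF-to-image step uses pdf2image (the thread does not say which library is actually used) and yields PIL images. One thing to watch: PIL's Image.tobytes() returns raw pixel data rather than an encoded image file, so an API that expects an image file would probably not be able to decode it. The sketch saves each page as PNG into an in-memory buffer instead. Both helper names are hypothetical.

```python
import io


def image_to_png_bytes(image):
    """Encode an image object as PNG and return the encoded bytes.

    `image` only needs a PIL-style save(fp, format=...) method.
    """
    buffer = io.BytesIO()
    image.save(buffer, format="PNG")
    return buffer.getvalue()


def pdf_bytes_to_png_pages(pdf_bytes):
    """Render each PDF page to PNG bytes.

    Assumes pdf2image and its poppler dependency are installed.
    """
    # Imported lazily so image_to_png_bytes works without pdf2image.
    from pdf2image import convert_from_bytes

    pages = convert_from_bytes(pdf_bytes)  # list of PIL images, one per page
    return [image_to_png_bytes(page) for page in pages]
```

Each element of the returned list could then be passed as content to vision.Image(content=...), as in the code above, with the PDF bytes coming from uploaded_file.getvalue().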