Hi all!
I’m building a Streamlit app that allows users to upload a PDF and interactively extract specific fields (like name, email, etc.). I use PyMuPDF (fitz) for reading and locating text, and PIL to render a zoomed image of the page with rectangles highlighting fields.
Here’s what works so far:
- I extract text fields using regex.
- I find the bounding box of labels using page.search_for("Field Label").
- I render the PDF page as an image and draw rectangles around the label positions.
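
To make the setup concrete, here is a minimal sketch of those three steps, simplified for illustration. The function names (`extract_email`, `pdf_rect_to_pixels`, `render_with_highlights`), the default zoom of 2.0, and the deliberately loose email regex are placeholders I made up for this post, not anything PyMuPDF or Streamlit provides.

```python
import re

# Deliberately loose pattern, for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def extract_email(text):
    """Return the first email-like string in text, or None."""
    match = EMAIL_RE.search(text)
    return match.group(0) if match else None


def pdf_rect_to_pixels(rect, zoom):
    """Scale an (x0, y0, x1, y1) rect from PDF points to pixels of a page rendered at `zoom`."""
    x0, y0, x1, y1 = rect
    return (x0 * zoom, y0 * zoom, x1 * zoom, y1 * zoom)


def render_with_highlights(pdf_path, label, zoom=2.0, page_number=0):
    """Render one page as a PIL image and outline every occurrence of `label`."""
    # Imported here so the pure helpers above work without these packages.
    import fitz  # PyMuPDF
    from PIL import Image, ImageDraw

    doc = fitz.open(pdf_path)
    page = doc[page_number]
    # Bounding boxes in PDF points, kept as plain tuples.
    rects = [(r.x0, r.y0, r.x1, r.y1) for r in page.search_for(label)]
    pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))
    img = Image.frombytes("RGB", (pix.width, pix.height), pix.samples)
    doc.close()

    draw = ImageDraw.Draw(img)
    for rect in rects:
        draw.rectangle(pdf_rect_to_pixels(rect, zoom), outline="red", width=2)
    return img, rects
```

In the app I would then show the result with something like `st.image(img)`. The rectangles come back in PDF points so they can be reused later for text extraction.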
Is there a way in Streamlit to let users click directly on the canvas or image and extract the text inside the rectangle they clicked?
- Is this possible using st.canvas, streamlit-drawable-canvas, or another method?
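
The part I think I can already handle is mapping a click back to the PDF, so here is a rough sketch of that, assuming some component hands me the click position as pixel coordinates on the rendered image. The helper names and the `display_scale` parameter are hypothetical. Getting the click coordinates in the first place is the piece I'm missing.

```python
def pixel_to_pdf_point(x_px, y_px, zoom, display_scale=1.0):
    """Convert a click in image pixels back to PDF points.

    `zoom` is the factor used when rendering the page. `display_scale` is an
    assumed extra factor for the case where the image is shown at a different
    size than it was rendered (1.0 means it is shown at native size).
    """
    factor = zoom * display_scale
    return (x_px / factor, y_px / factor)


def find_clicked_rect(point, rects):
    """Return the first (x0, y0, x1, y1) rect containing `point`, or None."""
    px, py = point
    for rect in rects:
        x0, y0, x1, y1 = rect
        if x0 <= px <= x1 and y0 <= py <= y1:
            return rect
    return None


def text_in_rect(page, rect):
    """Extract the text inside `rect` (PDF points) from a PyMuPDF page."""
    import fitz  # PyMuPDF, imported here so the helpers above stay dependency-free

    return page.get_text("text", clip=fitz.Rect(*rect)).strip()
```

With a click at pixel (60, 90) on a page rendered at zoom 2.0, `pixel_to_pdf_point(60, 90, 2.0)` gives the point (30.0, 45.0) in PDF space, which `find_clicked_rect` can test against the rectangles from `page.search_for`. Since the label rectangles only cover the label itself, the clip rect would probably need to be widened to include the value next to it.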
If anyone has done something similar or has ideas, I’d really appreciate the input. Happy to share more code if needed!
Thanks in advance!