Streamlit Highlight Text in PDF


I have a Streamlit App where a PDF is rendered. The function to display the PDF on a specific page looks like this:

import base64
import streamlit as st

def displayPDF(file, page):
    # Opening file from file path and encoding its bytes as base64 text
    with open(file, "rb") as f:
        base64_pdf = base64.b64encode(f.read()).decode('utf-8')

    # Embedding PDF in HTML
    pdf_display = F'<iframe src="data:application/pdf;base64,{base64_pdf}#page={page}" width="100%" height="300" type="application/pdf"></iframe>'

    # Displaying File
    st.markdown(pdf_display, unsafe_allow_html=True)

Now I want to also highlight or mark some text in the rendered PDF. How can I do this?

I already tried changing the iframe code to this:
pdf_display = F'<iframe src="data:application/pdf;base64,{base64_pdf}#page={page}&#search=%22Einleitung%22" width="100%" height="300" type="application/pdf"></iframe>'

So I added "&#search=" but this did not work. I also don't want to highlight only one word; I would like to highlight a whole chunk of text in the document.

Any ideas how to make it work?

Here is an idea using PyMuPDF: render the page as an image and mark each text match with a rectangle.


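import streamlit as st
import fitz  # PyMuPDF

with st.sidebar:
    original_doc = st.file_uploader(
        "Upload PDF", accept_multiple_files=False, type="pdf"
    )
    text_lookup = st.text_input("Look for", max_chars=50)

if original_doc:
    # Open the uploaded file from memory instead of from a path
    with fitz.open(stream=original_doc.getvalue(), filetype="pdf") as doc:
        page_number = st.sidebar.number_input(
            "Page number", min_value=1, max_value=doc.page_count, value=1, step=1
        )
        # PyMuPDF pages are 0-based, the widget is 1-based
        page = doc.load_page(page_number - 1)

        if text_lookup:
            # One rectangle per match on this page
            areas = page.search_for(text_lookup)

            for area in areas:
                # Draw a rectangle annotation around the match
                page.add_rect_annot(area)

            # Render the annotated page to PNG bytes and show it
            pix = page.get_pixmap(dpi=120).tobytes()
            st.image(pix, use_column_width=True)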


Nice solution, thanks!
However, it would be nice to be able to scroll through the document, i.e. to keep it as a PDF rather than a picture.

Thank you for this! I was trying to achieve the same with 'streamlit_pdf_viewer' but settled on your solution in the end.
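
Regarding the scrolling request above, one possible variation is to write the highlights into the PDF itself and embed the result in the same kind of iframe as in the question. This is only a minimal sketch: `highlight_pdf` and `pdf_iframe` are hypothetical helper names, and it assumes PyMuPDF's `add_highlight_annot` and `Document.tobytes`, and that the browser's built-in PDF viewer actually draws highlight annotations, which may not hold everywhere.

```python
import base64


def highlight_pdf(pdf_bytes: bytes, page_index: int, needle: str) -> bytes:
    """Return a copy of the PDF with matches of `needle` on one page highlighted."""
    # PyMuPDF is imported here so that pdf_iframe() below works without it.
    import fitz

    with fitz.open(stream=pdf_bytes, filetype="pdf") as doc:
        page = doc.load_page(page_index)  # 0-based page index
        # quads=True returns quadrilaterals instead of rectangles
        quads = page.search_for(needle, quads=True)
        if quads:
            # A single highlight annotation can cover a list of quads
            page.add_highlight_annot(quads)
        # Serialize the annotated document back to bytes
        return doc.tobytes()


def pdf_iframe(pdf_bytes: bytes, page: int, height: int = 600) -> str:
    """Build the base64 iframe from the question, from bytes instead of a path."""
    b64 = base64.b64encode(pdf_bytes).decode("utf-8")
    return (
        f'<iframe src="data:application/pdf;base64,{b64}#page={page}" '
        f'width="100%" height="{height}" type="application/pdf"></iframe>'
    )


# Possible use inside the app above (names taken from that snippet):
#   marked = highlight_pdf(original_doc.getvalue(), page_number - 1, text_lookup)
#   st.markdown(pdf_iframe(marked, page_number), unsafe_allow_html=True)
```

Since the text lookup accepts a phrase and not just a single word, this might also cover the "whole chunk" case from the question, as long as the phrase occurs on the page exactly as typed.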

