Is there a way to run an initialization function?

Hi,

I have a piece of code that lets a user select a file to upload, and then it carries on processing.

Is there a way to remove this step, so that I load the file myself when the application starts, through some sort of initialization function?

This is for a Streamlit page. When the user refreshes the page, I’m assuming the app.py file gets re-run? If so, the ‘initialization function’ would be re-run too, unless there’s a way to stop this. This is the code I currently run:

def main():
    if source == "PDF":
        pdffile = st.file_uploader("Upload here:", type="pdf", accept_multiple_files=True)
        if st.button("Spinner"):
            with st.spinner("Processing.."):
                raw_text = get_pdf_text(pdffile)

I need this to run only once, or to remove the user’s file selection altogether. I’d like to load one or more files up front so the rest of the code can work with them.

Use st.cache_resource.

Could you give an example of how I’d use this in my case, please?

Just decorate your initialization function. It will then run only the first time it is called; later calls with the same arguments return the cached result.

@st.cache_resource
def initialization_function():
    # do your stuff here
    return something

Hello @chai86,

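Here’s how you might structure your app to automatically process files from a specific directory at startup. Note that st.session_state is scoped to a single browser session, so it survives reruns but a page refresh starts a new session and runs the initialization again: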

import os

import streamlit as st

def get_pdf_text(files):
    # Placeholder: replace with your real extraction logic.
    # `files` is a list of file paths, so open each file here if needed.
    return "Processed text from PDF"

def main():
    files_directory = 'predefined_files'

    # Run the initialization only once per session
    if 'raw_text' not in st.session_state:
        # Collect the paths of all PDF files in the directory
        pdffiles = [
            os.path.join(files_directory, f)
            for f in os.listdir(files_directory)
            if f.endswith('.pdf')
        ]
        # Keep the result in session state so it is still available after a rerun
        st.session_state.raw_text = get_pdf_text(pdffiles)

    st.write(st.session_state.raw_text)

if __name__ == "__main__":
    main()

Hope this helps!

Kind Regards,
Sahir Maharaj
Data Scientist | AI Engineer


