Using HuggingFace with Streamlit

Hello, I’ve noticed that most Streamlit tutorials involving LLMs use OpenAI’s APIs. I was wondering if there are any tutorials on how to use Streamlit with HuggingFace models?

I am currently basing my code on the snippet below, except that I am using meta-llama/Meta-Llama-3-8B-Instruct:

import streamlit as st
from transformers import pipeline

def main():
    st.title("Hugging Face Model Demo")

    # Create an input text box
    input_text = st.text_input("Enter your text", "")

    # Load the model (this line runs again on every script rerun)
    model = pipeline("sentiment-analysis")

    # Create a button to trigger model inference
    if st.button("Analyze"):
        # Perform inference using the loaded model
        result = model(input_text)
        st.write("Prediction:", result[0]['label'], "| Score:", result[0]['score'])

if __name__ == "__main__":
    main()

Is it normal for the main() function to be called on every user input? This is causing the model to be re-downloaded each time.
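
For reference, here is a minimal sketch of what I think the fix might look like, using Streamlit's st.cache_resource decorator so the pipeline is loaded once per process instead of on every rerun (untested, and the load_model helper name is just mine):

import streamlit as st
from transformers import pipeline

@st.cache_resource
def load_model():
    # Cached by Streamlit: runs once, then the same object is reused across reruns
    return pipeline("sentiment-analysis")

def main():
    st.title("Hugging Face Model Demo")
    input_text = st.text_input("Enter your text", "")
    model = load_model()  # returns the cached pipeline after the first call
    if st.button("Analyze"):
        result = model(input_text)
        st.write("Prediction:", result[0]['label'], "| Score:", result[0]['score'])

if __name__ == "__main__":
    main()

Is this the recommended pattern, or is there something better suited to large models like Llama 3?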

In addition, is there a way to enable streaming for HuggingFace models?
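
From the transformers docs, it looks like TextIteratorStreamer could be paired with st.write_stream, since the streamer is an iterator of decoded text chunks. Here is a rough sketch of what I have in mind (untested; it assumes accelerate is installed for device_map="auto" and that I have access to the gated Llama 3 weights):

import threading
import streamlit as st
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"

@st.cache_resource
def load_llm():
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    return tokenizer, model

def main():
    st.title("Streaming Demo")
    prompt = st.text_input("Enter your prompt", "")
    if st.button("Generate") and prompt:
        tokenizer, model = load_llm()
        # Yields decoded text chunks as generate() produces tokens
        streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        # generate() blocks until it finishes, so run it in a background thread
        # and consume the streamer on the main thread
        thread = threading.Thread(
            target=model.generate,
            kwargs=dict(**inputs, streamer=streamer, max_new_tokens=256),
        )
        thread.start()
        st.write_stream(streamer)  # renders the chunks incrementally
        thread.join()

if __name__ == "__main__":
    main()

Does st.write_stream work with an iterator like this, or is there a more idiomatic Streamlit way to consume the streamer?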