Hello, I’ve noticed that most of the Streamlit tutorials involving LLMs use OpenAI’s APIs. Are there any tutorials on how to use Streamlit with HuggingFace models?
I am currently basing my code on the snippet below, except that I am using meta-llama/Meta-Llama-3-8B-Instruct:
import streamlit as st
from transformers import pipeline

def main():
    st.title("Hugging Face Model Demo")

    # Create an input text box
    input_text = st.text_input("Enter your text", "")

    # Load the model (note: this line runs on every rerun of the script)
    model = pipeline("sentiment-analysis")

    # Create a button to trigger model inference
    if st.button("Analyze"):
        # Perform inference using the loaded model
        result = model(input_text)
        st.write("Prediction:", result[0]["label"], "| Score:", result[0]["score"])

if __name__ == "__main__":
    main()
I was wondering: is it normal for the main() function to be re-run on every user input? This is causing the model to be reloaded (and potentially re-downloaded) for every interaction.
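For context, here is a minimal sketch of what I’m considering to work around that, using st.cache_resource so the pipeline is only constructed once per process (load_model is just a helper name I made up). Is this the recommended pattern?

import streamlit as st
from transformers import pipeline

@st.cache_resource  # keep one pipeline per process instead of rebuilding it on every rerun
def load_model():
    return pipeline("sentiment-analysis")

def main():
    st.title("Hugging Face Model Demo")
    input_text = st.text_input("Enter your text", "")
    model = load_model()  # returns the cached pipeline on subsequent reruns
    if st.button("Analyze"):
        result = model(input_text)
        st.write("Prediction:", result[0]["label"], "| Score:", result[0]["score"])

if __name__ == "__main__":
    main()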
In addition, is there a way to enable token-by-token streaming of generated text with HuggingFace models?
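From skimming the docs, I also put together the sketch below combining transformers’ TextIteratorStreamer with st.write_stream. The generate() call runs in a background thread while the streamer is consumed as an iterator of text chunks (the prompt handling is simplified, and max_new_tokens=256 is an arbitrary value I picked). Would something like this be the intended approach?

import streamlit as st
from threading import Thread
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"

@st.cache_resource  # load the tokenizer and model once per process
def load_model():
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    return tokenizer, model

def main():
    st.title("Streaming Demo")
    prompt = st.text_input("Enter your prompt", "")
    if st.button("Generate") and prompt:
        tokenizer, model = load_model()
        # TextIteratorStreamer yields decoded text chunks as generate() produces tokens
        streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
        inputs = tokenizer(prompt, return_tensors="pt")
        # run generation in a background thread so this script can consume the streamer
        Thread(
            target=model.generate,
            kwargs=dict(inputs, streamer=streamer, max_new_tokens=256),
        ).start()
        st.write_stream(streamer)  # renders chunks incrementally as they arrive

if __name__ == "__main__":
    main()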