The value returned by file_uploader() is a file-like object that lives in memory, in contrast to a physical file on disk.
To simulate it, we can read a physical file and wrap its contents in the same kind of object that file_uploader() returns.
For a binary file such as a PDF:
import io
# uploaded_file = st.file_uploader("Choose a file")
file_path = 'C:/streamlit/todo_app/assets/todo_guide.pdf'
with open(file_path, 'rb') as f:  # pdf file is binary, use rb
    bytes_data = f.read()
uploaded_file = io.BytesIO(bytes_data) # this one
To get the bytes data back, call getvalue():
bytes_data = uploaded_file.getvalue()
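One detail that is easy to trip over with the in-memory object: read() moves the cursor, while getvalue() returns the whole buffer no matter where the cursor is. Here is a minimal sketch of the difference. The bytes are made-up stand-in content, not a real PDF.

```python
import io

# Stand-in bytes; in practice these would come from open(file_path, 'rb').read()
bytes_data = b"%PDF-1.4 sample content"
uploaded_file = io.BytesIO(bytes_data)

first_read = uploaded_file.read()    # reads everything and leaves the cursor at the end
second_read = uploaded_file.read()   # nothing left to read, so this is empty
uploaded_file.seek(0)                # rewind to the start
after_seek = uploaded_file.read()    # full contents again
whole = uploaded_file.getvalue()     # full contents, independent of the cursor
```

So if some library has already read from the object, either call seek(0) before reading again or use getvalue().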
For a text file such as a Markdown file:
import io
# uploaded_file = st.file_uploader("Choose a file")
file_path = 'C:/streamlit/todo_app/assets/todo_guide.md'
with open(file_path, 'r', encoding='utf-8') as f:  # md is text, use r
    text_data = f.read()
uploaded_file = io.StringIO(text_data) # this one
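A caveat on the text case: as far as I know, file_uploader always hands back a bytes-based object (the Streamlit docs describe UploadedFile as a subclass of BytesIO), even when the upload is a text file. So with a real upload you would decode the bytes first and then wrap them in StringIO if you need a text stream. A small sketch of that, assuming the file is UTF-8 encoded and using made-up sample content:

```python
import io

# Stand-in for what the uploader would return for a markdown file
raw = "# Todo guide\n\n- item one\n".encode("utf-8")
uploaded_file = io.BytesIO(raw)

# Decode the bytes to text (assumes UTF-8), then wrap in a text stream
text_data = uploaded_file.getvalue().decode("utf-8")
string_io = io.StringIO(text_data)
first_line = string_io.readline()
```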
To get the filename from the file path:
from pathlib import Path
file_path = 'C:/streamlit/todo_app/assets/todo_guide.md'
pathlib_path = Path(file_path)
filename = pathlib_path.name
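A plain io.BytesIO has no filename attached, while the object from file_uploader exposes attributes such as name, type and size. If the code under test reads those, one option is a small subclass that carries them along. This is only a sketch: FakeUploadedFile and fake_upload are hypothetical names of mine, not part of Streamlit, and the temporary file is there just to make the example self-contained.

```python
import io
import mimetypes
import tempfile
from pathlib import Path


class FakeUploadedFile(io.BytesIO):
    """Hypothetical stand-in for an uploaded file, not Streamlit's own class."""

    def __init__(self, data: bytes, name: str):
        super().__init__(data)
        self.name = name
        self.size = len(data)
        # Guess the MIME type from the extension, with a generic fallback
        self.type = mimetypes.guess_type(name)[0] or "application/octet-stream"


def fake_upload(file_path):
    """Read a file from disk and wrap it in a FakeUploadedFile."""
    path = Path(file_path)
    return FakeUploadedFile(path.read_bytes(), path.name)


# Demo with a throwaway file so the snippet runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "todo_guide.md"
    sample.write_bytes(b"# Todo guide\n")
    uploaded_file = fake_upload(sample)
```

After this, uploaded_file.name gives the filename and uploaded_file.getvalue() gives the bytes, same as in the snippets above.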
To view a PDF file in Streamlit:
from streamlit_pdf_viewer import pdf_viewer
# https://pypi.org/project/streamlit-pdf-viewer/
# pdf, binary file
pdf_path = 'F:/Downloads/mm-bradley-terry-1079120141.pdf'
with open(pdf_path, 'rb') as pdf_ref:
    bytes_data = pdf_ref.read()
pdf_viewer(input=bytes_data, width=700)
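If the bytes come from a user upload rather than a known file, it can help to do a quick sanity check before handing them to the viewer. PDF files start with the marker %PDF-, so a rough check could look like the sketch below. looks_like_pdf is a hypothetical helper of mine and it only inspects the header, so it does not validate the whole file.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Rough check: PDF files begin with the bytes %PDF-."""
    return data[:5] == b"%PDF-"


# Made-up sample inputs for illustration
pdf_like = looks_like_pdf(b"%PDF-1.7\n...")
not_pdf = looks_like_pdf(b"# just a markdown file\n")
```

You could then call pdf_viewer only when the check passes and show a warning otherwise.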
To view a .docx file (as created by Microsoft Word, Google Docs, or LibreOffice) in Streamlit, you can use the mammoth library to convert it to HTML:
import streamlit as st
import mammoth
# ms docx, binary file
docx_path = 'F:/Downloads/xgboost.docx'
with open(docx_path, 'rb') as docx_ref:
    result = mammoth.convert_to_html(docx_ref)
html = result.value
st.markdown(html, unsafe_allow_html=True)
To view a Markdown file in Streamlit:
import streamlit as st
# markdown, text file
md_path = 'F:/Downloads/stocks.md'
with open(md_path, 'r', encoding='utf-8') as md_ref:
    text_data = md_ref.read()
st.markdown(text_data)
Reference
File uploader