Summary
I would like to read a txt file uploaded through st.file_uploader into a dask bag (db), but I cannot because I get an error about the line delimiter:
TypeError: 'linedelimiter' is an invalid keyword argument for StringIO()
Steps to reproduce
Code snippet:
import dask.bag as db
import io
import streamlit as st
uploaded_file = st.file_uploader("Upload file")
file = db.read_text(io.StringIO(uploaded_file.getvalue().decode("windows-1252"),linedelimiter='\n'))
Expected behavior:
Read the uploaded file in a dask bag.
I am using dask because my file size is 2 GB.
Actual behavior:
EXCEPTION: Traceback (most recent call last):
File "C:\Users\andres\data_fuente_streamlit.py", line 172, in main
file = db.read_text(io.StringIO(uploaded_file.getvalue().decode("windows-1252"),linedelimiter='\n'))
TypeError: 'linedelimiter' is an invalid keyword argument for StringIO()
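For what it's worth, the same TypeError seems to come from io.StringIO on its own, without Streamlit or dask involved. Below is a minimal sketch, assuming a small hypothetical byte string (`raw`) in place of uploaded_file.getvalue(); the helper name `reproduce` is also only for illustration.

```python
import io

# Hypothetical stand-in for uploaded_file.getvalue(): a few bytes of text
# encoded the same way as the real file.
raw = "first line\nsecond line\n".encode("windows-1252")


def reproduce(data: bytes) -> str:
    """Return the TypeError message raised by StringIO, or '' if none is raised."""
    try:
        # Same call shape as in the snippet: linedelimiter is inside StringIO(...)
        io.StringIO(data.decode("windows-1252"), linedelimiter="\n")
    except TypeError as exc:
        return str(exc)
    return ""


message = reproduce(raw)
print(message)
```

This suggests the keyword is being handed to StringIO rather than to db.read_text, since in my snippet it sits inside the StringIO(...) parentheses.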
Debug info
- Streamlit version: 1.16.0
- Python version: 3.9.12
- Using PipEnv
- OS version: Windows 10
- Browser version: Microsoft Edge Version 108.0.1462.54
Requirements file
altair==4.2.0
altgraph==0.17.3
asttokens==2.2.1
attrs==22.2.0
backcall==0.2.0
blinker==1.5
cachetools==5.2.0
certifi==2022.12.7
charset-normalizer==2.1.1
click==8.1.3
cloudpickle==2.2.0
colorama==0.4.6
comm==0.1.2
commonmark==0.9.1
cx-Oracle==8.3.0
dask==2022.12.0
debugpy==1.6.4
decorator==5.1.1
entrypoints==0.4
et-xmlfile==1.1.0
executing==1.2.0
fsspec==2022.11.0
future==0.18.2
gitdb==4.0.10
GitPython==3.1.30
idna==3.4
importlib-metadata==5.2.0
ipykernel==6.20.0
ipython==8.7.0
jedi==0.18.2
Jinja2==3.1.2
jsonschema==4.17.3
jupyter_client==7.4.8
jupyter_core==5.1.1
locket==1.0.0
MarkupSafe==2.1.1
matplotlib-inline==0.1.6
nest-asyncio==1.5.6
numpy==1.23.5
openpyxl==3.0.10
packaging==22.0
pandas==1.5.2
parso==0.8.3
partd==1.3.0
pefile==2022.5.30
pickleshare==0.7.5
Pillow==9.3.0
platformdirs==2.6.0
prompt-toolkit==3.0.36
protobuf==3.20.3
psutil==5.9.4
pure-eval==0.2.2
pyarrow==10.0.1
pydeck==0.8.0
Pygments==2.13.0
pyinstaller==5.7.0
pyinstaller-hooks-contrib==2022.14
Pympler==1.0.1
PyQt5==5.15.7
PyQt5-Qt5==5.15.2
PyQt5-sip==12.11.0
pyrsistent==0.19.3
python-dateutil==2.8.2
pytz==2022.6
pytz-deprecation-shim==0.1.0.post0
pywin32==305
pywin32-ctypes==0.2.0
PyYAML==6.0
pyzmq==24.0.1
requests==2.28.1
rich==12.6.0
semver==2.13.0
six==1.16.0
smmap==5.0.0
stack-data==0.6.2
streamlit==1.16.0
toml==0.10.2
toolz==0.12.0
tornado==6.2
traitlets==5.8.0
typing_extensions==4.4.0
tzdata==2022.7
tzlocal==4.2
urllib3==1.26.13
validators==0.20.0
watchdog==2.2.0
wcwidth==0.2.5
zipp==3.11.0
Additional information
This is just part of my code; the full code compares SQL tables with the content of the txt file. Some of the packages in the requirements file are not used in the Streamlit app, but I decided to include them because they are in my virtual environment.