I'm trying to understand how I can load my dataset with up to 1 million rows.
The dataset is relatively small. I have 2 samples for testing: one with 260k rows and one with 100k rows.
The 100k-row file loads fine. The 260k-row file won't: halfway through, it still shows as loading but nothing happens, and the page becomes a bit unresponsive.
I'm getting these errors in my terminal:
Traceback (most recent call last):
  File "/home/evo/koala/lib/python3.11/site-packages/tornado/websocket.py", line 1089, in wrapper
    raise WebSocketClosedError()
tornado.websocket.WebSocketClosedError
Two of my columns are mostly empty (99.9% empty), but I need them.
I'm really new to Python, Streamlit, and pandas.
4 columns are split into more columns by pandas, so in total I have 42 columns including the index column. All rows are needed (there are no unneeded rows in my file), and I can't reduce the number of columns because of the way I need the data analyzed. Only more columns will be added, so the data will end up with up to 60 columns, although the actual CSV file has 27 columns.
The 260k-row CSV file is only around 45 MB; the 100k-row one is under 18 MB.
Or is there a way to load only a few columns at first, while keeping all of them available when I need them?
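To illustrate what I mean by loading only a few columns, here is a minimal sketch using the `usecols` parameter of `pandas.read_csv`. The column names and the inline CSV text are made up for illustration and are not my real data; I don't know whether this is the right approach for Streamlit.

```python
import io

import pandas as pd

# Made-up stand-in for the real CSV file (hypothetical column names).
csv_text = "id,name,code,note\n1,a,x-1,\n2,b,y-2,\n3,c,z-3,hello\n"

# Load only the columns needed right now.
preview = pd.read_csv(io.StringIO(csv_text), usecols=["id", "code"])

# Load the full set of columns later, when the others are needed.
full = pd.read_csv(io.StringIO(csv_text))

print(list(preview.columns))
print(list(full.columns))
```

With a real file, the `io.StringIO(...)` argument would be replaced by the uploaded file or its path.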
The way I process the data is this: once the file is uploaded, specific columns are split into multiple columns so that I can take a deeper look at repetitive patterns.
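Roughly, the splitting step looks like the sketch below. This is a simplified example, not my actual code: the `code` column, the `-` delimiter, and the new column names are hypothetical.

```python
import pandas as pd

# Tiny made-up frame standing in for the uploaded CSV.
df = pd.DataFrame({"id": [1, 2, 3], "code": ["x-1", "y-2", "z-3"]})

# Split one text column into several new columns.
parts = df["code"].str.split("-", expand=True)
parts.columns = ["code_prefix", "code_number"]

# Keep the original columns and append the new ones.
df = pd.concat([df, parts], axis=1)

print(df)
```

Each split like this adds columns, which is how 27 CSV columns grow to 42 in my case.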
I'm using the latest Streamlit version and Python 3.11 in a venv on Arch Linux (in case that matters or is helpful).
Thank you and have a great day.