How to read *.csv files with different separators with st.file_uploader?

Summary

Hi all,
I want to read *.csv files with st.file_uploader that have different separators like “,” or “;”. They are not mixed in one file, e.g. I read a file today that has “,” as separator, then tomorrow one that has “;” as separator.

Steps to reproduce

Code snippet:

Normally I would use Sniffer

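import csv
import pandas as pd

def get_delimiter(file_path, num_bytes=4096):
    # Sniff the delimiter from the first num_bytes characters of the file.
    sniffer = csv.Sniffer()
    with open(file_path, "r", newline="") as f:
        data = f.read(num_bytes)
    return sniffer.sniff(data).delimiter

# file_path is the path to a *.csv file on disk.
delimiter = get_delimiter(file_path)
df = pd.read_csv(file_path, sep=delimiter)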

The sniffer does not work with st.file_uploader, because st.file_uploader reads the bytes into RAM, so there is no file path that can be accessed.

Expected behavior:

I would like the separator of the *.csv file to be recognized automatically and passed to read_csv.

sniffer.sniff() needs a text sample, and you can read bytes from the UploadedFile object and decode them. Nothing here requires a file path.
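A minimal sketch of that idea, assuming the upload is UTF-8 encoded and only “,” or “;” are expected. The helper name sniff_delimiter and its parameters are made up for illustration, and io.BytesIO stands in for the object that st.file_uploader returns, so the snippet runs without Streamlit:

```python
import csv
import io


def sniff_delimiter(uploaded_file, num_bytes=4096, encoding="utf-8"):
    """Guess the delimiter of an uploaded CSV from its first num_bytes bytes."""
    # Read a sample of raw bytes, then rewind so the file can be parsed later.
    sample = uploaded_file.read(num_bytes)
    uploaded_file.seek(0)
    # csv.Sniffer works on text, so decode the bytes first. errors="ignore"
    # drops a multi-byte character that may be cut off at the end of the sample.
    text = sample.decode(encoding, errors="ignore")
    # Restrict the candidates to the separators expected here (an assumption).
    return csv.Sniffer().sniff(text, delimiters=",;").delimiter


# io.BytesIO is a stand-in for the uploaded file object.
fake_upload = io.BytesIO(b"name;age;city\nAnn;31;Berlin\nBob;45;Paris\n")
delimiter = sniff_delimiter(fake_upload)
```

In an app, the object returned by st.file_uploader would take the place of fake_upload, and the result could then be passed on as pd.read_csv(uploaded_file, sep=delimiter). Note that csv.Sniffer raises csv.Error when it cannot decide, so a fallback separator may be worth adding.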

OTOH, pandas.read_csv() can sniff the delimiter for you, take a look at the docs.
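A short sketch of the pandas route, again with io.BytesIO standing in for the uploaded file and made-up sample data. As far as I know, the detection is only done by the Python parsing engine, so engine="python" is passed explicitly here to avoid the fallback warning:

```python
import io

import pandas as pd

# io.BytesIO is a stand-in for the object returned by st.file_uploader.
fake_upload = io.BytesIO(b"name;age\nAnn;31\nBob;45\n")

# sep=None asks pandas to detect the separator itself. engine="python" is
# named explicitly because the default C engine cannot do this detection.
df = pd.read_csv(fake_upload, sep=None, engine="python")
```

The Python engine is slower than the C engine, which should only matter for large uploads.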


Thanks! I was not aware that pandas.read_csv(sep=None) activates the pandas csv sniffer, which detects at least common separators like “,” or “;”.
