'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

I have a UTF-16 encoded CSV that I am trying to upload to Streamlit.

I am getting an error on line 18, where I already specify UTF-16 encoding, so I'm not sure what else I can do. Here is the line:
with open(file1, newline='', encoding='utf-16') as csvfile:

Any ideas? I can open the same file in a notebook, so I suppose this is a Streamlit-related problem.

Are you saying you have some file1 = st.file_uploader("Choose a file") and are using

with open(file1, newline='', encoding='utf-16') as csvfile:?

If so, note that the file uploader widget explicitly returns the file as a BytesIO-like object (an UploadedFile, which subclasses BytesIO). The first argument to open(...) is expected to be a file path or name, not a file object.
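To illustrate the difference, here is a minimal sketch using only the standard library. The io.BytesIO object is a stand-in I made up for the uploaded file, and the sample CSV text is invented. It also shows one plausible source of the error in the title: a UTF-16 little-endian file starts with the byte-order mark 0xFF 0xFE, which is not valid UTF-8.

```python
import codecs
import io

# Hypothetical stand-in for the uploaded file: UTF-16 LE bytes with a BOM.
sample_bytes = codecs.BOM_UTF16_LE + "a,b\n1,2\n".encode("utf-16-le")
fake_upload = io.BytesIO(sample_bytes)

# open() wants a path (str, bytes or os.PathLike), not a file object.
try:
    open(fake_upload, newline="", encoding="utf-16")
    open_error = None
except TypeError as exc:
    open_error = exc

# Decoding the same bytes as UTF-8 fails on the very first byte (the BOM).
try:
    sample_bytes.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

print(type(open_error).__name__)
print(decode_error)
```

So anything that reads those bytes with the default UTF-8 codec would be expected to fail at position 0, regardless of the encoding passed to open().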

Making an adjustment to the example shown in the documentation, try:

import streamlit as st
from io import StringIO
import pandas as pd

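file = st.file_uploader("Choose a file")

# file_uploader returns None until the user has uploaded something
if file is not None:
    # Read the raw bytes and decode them as UTF-16 before handing them to pandas
    bytes_data = file.getvalue()
    string_data = StringIO(bytes_data.decode("utf-16"))
    data = pd.read_csv(string_data)

    # A bare expression is rendered by Streamlit "magic"; st.write(data) does the same explicitly
    data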

Note that something like pd.read_csv accepts a “str, path object or file-like object”, so you have some flexibility. I’m not sure what you were ultimately going to do inside your with open(...) block, so feel free to clarify further if needed.
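If the goal was to use the csv module inside that with block (a guess on my part), one option is to wrap the binary object in a text layer instead of calling open(). This is only a sketch: the io.BytesIO object stands in for the uploaded file, and the sample rows are invented.

```python
import codecs
import csv
import io

# Hypothetical stand-in for the object returned by st.file_uploader.
raw = codecs.BOM_UTF16_LE + "name,score\nAda,90\nLinus,85\n".encode("utf-16-le")
fake_upload = io.BytesIO(raw)

# TextIOWrapper decodes the binary stream as UTF-16 (the BOM is consumed);
# newline="" leaves line-ending handling to the csv module.
text_stream = io.TextIOWrapper(fake_upload, encoding="utf-16", newline="")
rows = list(csv.reader(text_stream))

print(rows)
```

In a Streamlit app the same wrapper could be applied to the uploaded file object directly, since it behaves like a BytesIO.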
