'utf-8' codec decode error when working with dictionary containing Chinese characters

Summary

I am making a language learning app. I have a list of dictionaries containing Chinese characters that I’m trying to use in Streamlit. Unfortunately, the following error occurs when I run streamlit run [app name]: 'utf-8' codec can't decode byte 0xd5 in position 344: invalid continuation byte

Steps to reproduce

Code snippet:

import streamlit as st

st.title('KYLanguageApp')

st.write('This is a language app!')

data = [
    {
        "chinese": "这家便利店引入了自动结账系统,提供更快捷的购物体验。",
        "english": "This convenience store has introduced an automatic checkout system, providing a faster shopping experience.",
        "pronunciation": "Zhè jiā biàn lì diàn yǐn rù le zì dòng jié zhàng xì tǒng, tí gōng gèng kuài jié de gòu wù tǐ yàn."
    },
    {
        "chinese": "便利店采用了智能库存管理系统,实现了库存的精确控制。",
        "english": "The convenience store has adopted an intelligent inventory management system, achieving precise control of inventory.",
        "pronunciation": "Biàn lì diàn cǎi yòng le zhì néng kù cún guǎn lǐ xì tǒng, shí xiàn le kù cún de jīng què kòng zhì."
    },
    {
        "chinese": "这家便利店还设有咖啡吧台,供顾客品尝各种咖啡饮品。",
        "english": "This convenience store also has a coffee counter for customers to taste various coffee beverages.",
        "pronunciation": "Zhè jiā biàn lì diàn hái shè yǒu kā fēi bā tái, gòng gù kè pǐn cháng gè zhǒng kā fēi yǐn pǐn."
    }
]

Expected behavior:

I expected it to work. There is no error when I put this dictionary in a Jupyter notebook and run it.

Actual behavior:

Unfortunately the following error occurs when I do streamlit run [app name]: 'utf-8' codec can't decode byte 0xd5 in position 344: invalid continuation byte

Any help is greatly appreciated!

I can run that code without issues. Make sure the file is utf-8-encoded.
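
To illustrate where a byte like 0xd5 can come from, here is a minimal sketch that encodes the first Chinese character of the snippet in a legacy encoding and then tries to decode it as UTF-8. GBK is only an assumption here; the thread does not say which encoding the editor actually used.

```python
# Assumption: the text was saved in a legacy Chinese encoding such as GBK.
# The thread does not confirm which encoding was really in use.
text = "这"  # first character of the first "chinese" value

utf8_bytes = text.encode("utf-8")  # three bytes in UTF-8
gbk_bytes = text.encode("gbk")     # two bytes in GBK

print(utf8_bytes.hex(), gbk_bytes.hex())

# Decoding the GBK bytes as UTF-8 fails, because they do not form
# a valid UTF-8 sequence.
try:
    gbk_bytes.decode("utf-8")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)
```

If the message printed here resembles the one in the question, the text was probably saved in something other than UTF-8.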


Sorry, to clarify, what file do you mean? The Python file? I’m not reading the dictionary from any file. It’s just within the Python file. Could you teach me how to encode the Python file in UTF-8? I tried looking it up but didn’t find anything. Thanks so much!


EDIT: Never mind, I solved it, thanks so much! In Visual Studio, I did File → Save As → Save with Encoding and changed the encoding to UTF-8.
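
For anyone who would rather do this without an editor, the same fix can be sketched in Python: check whether the script decodes as UTF-8 and, if not, rewrite it from its current encoding. The helper names `is_valid_utf8` and `reencode_to_utf8`, the file name `app.py`, and the GBK source encoding are all hypothetical choices for this example, not something taken from the thread.

```python
import tempfile
from pathlib import Path


def is_valid_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def reencode_to_utf8(path, source_encoding):
    """Rewrite the file in place as UTF-8, assuming it is in source_encoding."""
    raw = Path(path).read_bytes()
    text = raw.decode(source_encoding)  # raises if the guess is wrong
    Path(path).write_bytes(text.encode("utf-8"))


# Demonstration on a throwaway file, so no real script is touched.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "app.py"  # hypothetical script name
    # Assumption: the editor saved the file as GBK.
    script.write_bytes('data = "这家便利店"\n'.encode("gbk"))

    before = is_valid_utf8(script)
    reencode_to_utf8(script, "gbk")
    after = is_valid_utf8(script)
    restored = script.read_text(encoding="utf-8")

print(before, after)
```

Keep a backup before running something like this on a real file: decoding with the wrong source encoding can silently produce garbled text instead of raising an error.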

