I have no issues reading the CSV file content into a DataFrame with Polars, DuckDB, or Pandas.
With DataFusion, however, I am having trouble parsing the file. My code is as follows:
import streamlit as st
uploaded_file = st.file_uploader("Choose a file")
Method 1:
from datafusion import SessionContext
ctx = SessionContext()
df_pl = ctx.read_csv(uploaded_file).to_polars()
Method 2:
from datafusion import SessionContext
ctx = SessionContext()
df_pl = ctx.read_csv(uploaded_file.getvalue()).to_polars()
Method 3:
from io import BytesIO
from datafusion import SessionContext
ctx = SessionContext()
df_pl = ctx.read_csv(BytesIO(uploaded_file.getvalue())).to_polars()
Method 4:
from datafusion import SessionContext
ctx = SessionContext()
ctx.register_view('tbl', uploaded_file)
df_pl = ctx.table('tbl').to_polars()
Method 5:
from datafusion import SessionContext
ctx = SessionContext()
ctx.register_view('tbl', uploaded_file.getvalue())
df_pl = ctx.table('tbl').to_polars()
Method 6:
from datafusion import SessionContext
ctx = SessionContext()
df = ctx.read_csv(uploaded_file.getvalue())
ctx.register_view('tbl', df)
df_pl = ctx.sql('SELECT * FROM tbl').to_polars()
None of these methods parses my CSV file successfully. Is there a workaround?
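One workaround that might be worth trying, sketched under the assumption that ctx.read_csv expects a filesystem path rather than bytes or a file-like object: write the uploaded bytes to a temporary file and pass that path to DataFusion. The helper names bytes_to_temp_csv and read_uploaded_csv are made up for illustration and are not part of DataFusion or Streamlit.

```python
import os
import tempfile


def bytes_to_temp_csv(data: bytes) -> str:
    """Write raw CSV bytes to a temporary file and return its path."""
    # delete=False so the file still exists after the handle is closed
    with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
        tmp.write(data)
        return tmp.name


def read_uploaded_csv(data: bytes):
    """Load CSV bytes through DataFusion by way of a temporary file."""
    # Imported lazily; requires the datafusion and polars packages.
    from datafusion import SessionContext

    path = bytes_to_temp_csv(data)
    try:
        ctx = SessionContext()
        # Assumption: read_csv is given a path string, not bytes or BytesIO.
        # to_polars() collects the result while the file still exists.
        return ctx.read_csv(path).to_polars()
    finally:
        # Clean up the temporary file once the data has been collected.
        os.remove(path)
```

In the Streamlit app this would be called as df_pl = read_uploaded_csv(uploaded_file.getvalue()), after checking that uploaded_file is not None.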