🗞️ Weekly Roundup: Arrow DataFrames serialization, chef transformers, data quality detectors and more!

Welcome to the Weekly Roundup! :wave:

07/19/21 - 07/25/21
Issue - V62
Current Release: 0.85.1

Every week we’ll share the community-driven apps and articles that were sent to us during the previous week, as well as any major announcements from Streamlit. If there is anything we missed, feel free to add it in a comment or @ us on Twitter!

  • :newspaper: = articles
  • :tv: = videos
  • :balloon: = apps

:new: Streamlit Updates

  • Check out the Streamlit sharing badge for your GitHub repos:
    Open in Streamlit
  • :star_struck: Check out our newly re-designed blog site
  • :eyes: In case you missed it - watch @kmcgrady chat and answer questions about Streamlit on dataroots’ latest Tour de Tools :balloon:
  • :bow_and_arrow::bow_and_arrow: With the latest release, Streamlit now uses Apache Arrow for data serialization :bow_and_arrow::bow_and_arrow: This improves performance AND let us delete over 1k lines of code :partying_face: Read more here

Featured Content

Tutorials/Introduction

Science and Tech

NLP and Language

Data Visualization

CV and Images

Sports

Finance and Business

Geography and Society

General

:sunglasses: Cool happenings and other things

:footprints: Follow us


Cool stuff.

I read the blog, and my understanding is that Streamlit will convert dataframes into Arrow dataframes internally. Am I right?

I even tried using pyarrow inside a Streamlit app, but it throws me out.

Nice!

Hey @oltipreka !

Yep, you’re correct, the conversion is done internally. :balloon:

Regarding pyarrow, would you mind making a post under Using Streamlit with your code so we can take a closer look?

Thanks!


Hi,

Is there a way to change the data serialization method from Arrow to ‘legacy’, other than changing it in the config.toml file?

All config options can also be passed as command-line arguments or environment variables.

As environment variable:

export STREAMLIT_GLOBAL_DATA_FRAME_SERIALIZATION=legacy

As CLI argument:

streamlit run myapp.py --global.dataFrameSerialization=legacy

Also, it would be great to hear what kinds of issues you’re experiencing with Arrow, because we’d love to fix them so you don’t have to use legacy at all!

Hi,
Thank you for the prompt reply.

I was facing the error below:
ArrowInvalid: ("Could not convert '4+' with type str: tried to convert to int", 'Conversion failed for column Current Risk Rating with type object')
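
An error like this usually means the column mixes integers with text such as '4+', so Arrow cannot settle on a single type for it. Below is a minimal sketch of how one might find the offending values and make the column uniformly text. The helper name `non_integer_values` and the sample data are made up for illustration and are not part of Streamlit or pyarrow.

```python
def non_integer_values(values):
    """Return the distinct values in `values` that cannot be parsed as int."""
    offenders = []
    for value in values:
        if value is None:
            continue  # skip missing entries; they are not what this check looks for
        try:
            int(value)
        except (TypeError, ValueError):
            if value not in offenders:
                offenders.append(value)
    return offenders


# Made-up sample standing in for the "Current Risk Rating" column.
ratings = [1, 3, "4+", 2, "4+"]

# Lists the entries that are not integers.
offending = non_integer_values(ratings)
print(offending)

# One possible workaround: store every rating as text so the column has one type.
ratings_as_text = [str(value) for value in ratings]
print(ratings_as_text)
```

With a pandas dataframe (here a hypothetical `df`), the same workaround would be `df["Current Risk Rating"] = df["Current Risk Rating"].astype(str)` before handing the dataframe to Streamlit. Whether text is the right type for the column depends on how the ratings are used later.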