Data engineering best practices

robmarkcole · May 26, 2020, 2:23pm

Starting this thread to discuss data engineering best practices incorporating Streamlit. @andfanilo has experience using Prefect, and @randyzwitch has some ideas on best practices.

randyzwitch · May 27, 2020, 1:02pm

It’s definitely an interesting question, and I’ll be interested to see how it evolves: what are some good ways to split responsibility between Streamlit as a presentation layer and the heavier lifting done in data engineering tools?

theimposingdwarf · August 3, 2020, 6:40pm

Just thinking out loud here…

I wonder if you could save cached computations (a trained model, for example) to disk, so the client can reuse the same completed computations as the researcher.

Otherwise, the researcher needs a Streamlit app to perform modeling computations Jupyter-notebook style, and then has to take the model and build a separate client application to demonstrate it on the less powerful machines that other stakeholders may be using.

Maybe application state could be used to let the researcher work in the app in a development capacity: the app runs the heavy computation locally and writes models to the disk/git directory. The client mode of the app could then run on a less powerful computer or a low-power web server. After all, getting a result from a trained model can be 1000x+ less computationally intensive than training it.

You could always rebuild the model from the development tab/screen, but otherwise the model is already prepared, which reduces latency for those who want to toy with it.
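To make the development/client split concrete, here is a minimal sketch, not a Streamlit API. The `get_model` helper, the `"development"`/`"client"` mode names, and the `model.json` file are all hypothetical, and a tiny line fit stands in for the expensive training step. In a real app the mode might come from a sidebar selector, but that part is left out here.

```python
import json
import tempfile
from pathlib import Path


def train_model(xs, ys):
    """Stand-in for an expensive training step: a least-squares line fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def get_model(model_path, mode, xs=None, ys=None):
    """Train and save in development mode; only load from disk in client mode."""
    if mode == "development":
        model = train_model(xs, ys)
        model_path.write_text(json.dumps(model))
        return model
    # Client mode never trains: it reads what the researcher already saved.
    return json.loads(model_path.read_text())


def predict(model, x):
    """Cheap inference step that a low-power machine can run."""
    return model["slope"] * x + model["intercept"]


# A temporary directory stands in for the shared disk/git directory.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.json"
    dev_model = get_model(path, "development", xs=[0, 1, 2, 3], ys=[1, 3, 5, 7])
    client_model = get_model(path, "client")
    prediction = predict(client_model, 10)
```

The client branch only ever touches the saved file, so the machine serving stakeholders never pays the training cost.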

randyzwitch · August 3, 2020, 6:51pm

I’ve seen some people use pickle to make models available to users.
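For reference, a pickle round trip with Python's standard `pickle` module might look like the sketch below. The `model` dict and the `model.pkl` filename are placeholders for whatever object a training step actually produces. One caveat from the Python documentation: unpickling can execute arbitrary code, so only load pickle files you trust.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical "model": a plain dict standing in for a trained object.
# Pickle preserves Python types such as tuples and sets on the way back.
model = {"weights": (0.5, -1.25), "labels": {"cat", "dog"}}

with tempfile.TemporaryDirectory() as tmp:
    model_file = Path(tmp) / "model.pkl"

    # Researcher side: serialize the trained object to disk.
    with model_file.open("wb") as f:
        pickle.dump(model, f)

    # Client side: load the object without repeating the training step.
    with model_file.open("rb") as f:
        loaded = pickle.load(f)
```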
