Sensitive User Data, Data Storage and GDPR

Hi!

I’m building a Streamlit app that lets you explore your chat data from multiple online sources (e.g. WhatsApp and Facebook). But now I’m wondering whether this might conflict with the European GDPR regulation, as individuals would be handling data that belongs not only to them, but also to all the people they’ve written with.

My questions are:
Is there a way for me to use Streamlit, without ever storing my user’s data anywhere online? Will my users’ data be safe if they use tools running on Streamlit on their local data?
I’ve asked Streamlit whether they are GDPR compliant, but I’m not sure whether that covers everything. Do the Streamlit GDPR privacy policies cover themselves, me, or my users’ rights?

Note: The data my users will upload is in the form of a CSV :blush:

I hope someone can help me out!

Hey @Sebastian_S_Engen, welcome to the Streamlit community!

This is one of those hard questions to answer… Since you’ve mentioned contacting us already, it’s our belief that we are GDPR compliant. Of course, lawyers get paid to argue everything :laughing:

Is there a way for me to use Streamlit, without ever storing my user’s data anywhere online?

Streamlit itself never takes possession of the data. At its lowest level, Streamlit Cloud (currently) runs on a public cloud, so we never take physical possession of that data in the sense that we don’t own that hardware. From a Streamlit open-source library perspective, when you use st.file_uploader, the data is held in Python as a BytesIO object, which lives in RAM.

So to the extent that your code doesn’t save the CSV file anywhere, it will only persist in RAM until it’s overwritten by another session or the container is shut down.
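As a rough illustration of keeping an upload purely in memory, here is a minimal sketch. It assumes the uploaded object behaves like a BytesIO (as described above for st.file_uploader) and that the CSV has a column named "sender"; the column name and the function name `count_messages_per_sender` are hypothetical, chosen only for this example. Nothing in it writes to disk.

```python
import csv
import io


def count_messages_per_sender(uploaded):
    """Count rows per sender from an in-memory CSV, without writing to disk.

    `uploaded` is any BytesIO-like object exposing getvalue().
    The "sender" column name is a hypothetical assumption for this sketch.
    """
    # Decode the raw bytes held in RAM into text; no temporary file is created.
    text = io.StringIO(uploaded.getvalue().decode("utf-8"))
    counts = {}
    for row in csv.DictReader(text):
        sender = row["sender"]
        counts[sender] = counts.get(sender, 0) + 1
    return counts


# Assumed usage inside a Streamlit app (not executed here):
#   import streamlit as st
#   uploaded = st.file_uploader("Upload chat CSV", type="csv")
#   if uploaded is not None:
#       st.write(count_messages_per_sender(uploaded))
```

The point of the design is simply that the function never calls open() or otherwise persists the bytes, so the data lives only as long as the Python objects that reference it.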

We believe that satisfies GDPR, as the (very large, global) cloud service we use should be abiding by GDPR, and we don’t save containers in any manner (i.e., every time you change your code, the container is rebuilt and the repo is pulled from GitHub).

Do the Streamlit GDPR privacy policies cover themselves, me, or my users’ rights?

This is where, unfortunately, you’ll need your own legal representation. I’m not Streamlit’s lawyer, but I can generally say that what we believe as a company (via our legal representatives) doesn’t mean that you couldn’t be liable. It’s a matter of how your legal jurisdiction decides to interpret the written law and the specific case, should it arise.

Best,
Randy

Ahhh, I see!
Thank you so much for the swift reply, Randy!
Just two follow-up questions:

  1. When you say that the data is stored in RAM, is that RAM locally on the user’s computer or RAM on a cloud server?
  2. Which GDPR-compliant cloud service does Streamlit use? And how would I find out whether it meets my specific needs? :blush:

Once again thank you for the SUPER swift reply! And I hope to hear more from you soon!
Best,
Sebastian

RAM inside the container, running on (for now) Google Kubernetes Engine. The only reason I was intentionally opaque about where things run is that we don’t consider our cloud provider a “public feature” per se, and it can always change in the future.

Best,
Randy
