Update github csv with new input entries

Summary

Below code is working completely fine and updating my excel file available in local system. But, when I deployed this on streamlit using github and uploaded code and ā€œBOOKINGS.csvā€ file, below code is not updating file ā€œBOOKINGS.csvā€ stored in github. Do I need to change df.to_write(file.csv) to something else? Please help.

Steps to reproduce

Code snippet:

import streamlit as st
import pandas as pd

df = pd.read_csv("BOOKINGS.csv")

    with st.form(key = "Contact Form",clear_on_submit=True):
            
        fullName = st.text_input(label = 'Full Name',placeholder="Please enter your full name")
    
        email = st.text_input(label = 'Email Address',placeholder="Please enter your email address")

        submit_res = st.form_submit_button(label = "Submit")
        
        if submit_res == True:
                new_data = {"fullName" : fullName,"email" : email}
                df = df.append(new_data,ignore_index=True)
                df.to_csv("BOOKINGS.csv",index=False)

Yes, of course not.
Thatā€™s not how it works, the Streamlit Cloud App doesnā€™t run ā€œdirectlyā€ on the GitHub code, but on a file system created by git clone. You would have to push the file back to GitHub with git commit and git push e.g. with the GitPython library. However, I wouldnā€™t even try this, because it will lead to conflicts if more than one user is using the streamlit app. This is the wrong approach.

Your example would be a typical use-case for an (external) database.

So, should I create. a sql data base instead of excel file?

Yes, although it does not necessarily have to be an SQL database. But be aware, that a database doesnā€™t run neither on GitHub nor on Streamlit Cloud directly, you have to use another external service (DBaaS) for that. Another lightweight option would be to use Google Sheets, which also has an API.
Other possibilities that come to mind off the top of my head:

  • Firebase
  • AWS DynamoDB or other DBaaS from AWS (RDS etc.)
  • DBaaS from various cloud providers (Google Cloud, Azure, MongoDB and many more)
  • The list is certainly not exhaustiveā€¦

Some of them may have a free tier.
Here are a few examples:

Databases (DBaaS)

And of course, you can also install and run a database server on your own serverā€¦

Thanks for you help. I am public google sheet to update values. I tried to update the cell values but unable to do it.

Please help

code

To access public spreadsheet

@st.cache_data
def load_data(sheets_url):
csv_url = sheets_url.replace(ā€œ/edit#gid=ā€, ā€œ/export?format=csv&gid=ā€)
return pd.read_csv(csv_url)

df = load_data(st.secrets[ā€œpublic_gsheets_urlā€])
df = pd.DataFrame(df)
print(df)

Results

    fullName  phone_number              email

0 Puneet Gupta 4.370000e+12 xxxxxxx@gmail.com
1 Swapnil Nagar 4.370000e+11 yy97@gmail.com

Code

new_data = {ā€œfullNameā€ : abc,ā€œphone_numberā€ : 1234567890,ā€œemailā€ : zzz@gmail.com}

df.update_cell(len(df)+1,1,new_data[ā€˜fullNameā€™][0])
df.update_cell(len(df)+1,2,new_data[ā€˜phone_numberā€™][1])
df.update_cell(len(df)+1,3,new_data[ā€˜emailā€™][2])

Error

AttributeError: ā€˜DataFrameā€™ object has no attribute ā€˜update_cellā€™

Thanks

As itā€™s not possible to directly use the csv file to commit to github repo i went with the google sheets api appending from my streamlit app.

First set your secrets.toml by following the procedure in this link Connect Streamlit to a private Google Sheet - Streamlit Docs

# .streamlit/secrets.toml

private_gsheets_url = "https://docs.google.com/spreadsheets/d/12345/edit?usp=sharing"

[gcp_service_account]
type = "service_account"
project_id = "xxx"
private_key_id = "xxx"
private_key = "xxx"
client_email = "xxx"
client_id = "xxx"
auth_uri = "https://accounts.google.com/o/oauth2/auth"
token_uri = "https://oauth2.googleapis.com/token"
auth_provider_x509_cert_url = "https://www.googleapis.com/oauth2/v1/certs"
client_x509_cert_url = "xxx"

Below code can be used to insert into to google sheets using itā€™s api in your streamlit app.py


import streamlit as st
from google.oauth2 import service_account
import gspread
import pandas as pd

# Create a connection object.
credentials = service_account.Credentials.from_service_account_info(
    st.secrets["gcp_service_account"],
    scopes=[
        "https://www.googleapis.com/auth/spreadsheets",
    ],
)
gc = gspread.authorize(credentials)

# Get the Google Sheet by URL.
sheet_url = st.secrets["private_gsheets_url"]
sheet = gc.open_by_url(sheet_url)

# Function to find the last filled row in the worksheet.
def find_last_filled_row(worksheet):
    return len(worksheet.get_all_values()) + 1

# Function to insert data into the Google Sheet after the last filled row.
def insert_data_into_sheet(dataframe):
    worksheet = sheet.get_worksheet(0)  # Replace 0 with the index of your desired worksheet
    values = dataframe.values.tolist()

    # Find the last filled row
    last_filled_row = find_last_filled_row(worksheet)

    # Insert the data after the last filled row
    worksheet.insert_rows(values, last_filled_row)

# Your DataFrame with data to be inserted
df = pd.DataFrame(results, columns=['query', 'batch_index', 'index_of_audio_output_tensor', 'audio_file_name', 'similarity_score_by_model', 'user_relevance_score'])

# Call the function to insert data into the Google Sheet
insert_data_into_sheet(df)
2 Likes