Examples of creating custom st.column_config for st.dataframe?

Are there examples of how to create custom st.column_config for st.dataframe?

I have a Pandas DataFrame with a column of SMILES strings for molecules (for example, “CCCNC”), and I’d like to display that column not as strings, but as molecular images instead. There are libraries that convert SMILES strings to images using JavaScript: SmilesDrawer and RDKit.js.

For SmilesDrawer, it looks like it might be enough to do the following:

  • replace SMILES strings (“CCCNC” for example) with HTML <canvas data-smiles="CCCNC"></canvas>,
  • import SmilesDrawer with <script src="https://unpkg.com/smiles-drawer@1.0.10/dist/smiles-drawer.min.js"></script> ,
  • and call function SmilesDrawer.apply();

But I’m not sure how to put this together in Streamlit, or whether it is possible to do so at all. Any advice, or pointers where to start?

Hi @alex1

  1. If you want to integrate JS into a dataframe, you may want to look at the streamlit-aggrid library. (I haven’t tried it for your kind of problem, though.)
  2. Another way to do this is:
    a. have a dataframe (tdf) with 2 columns (1: your SMILES strings, say SString, and 2: the image file name, say SmileImg), which you pull into st.data_editor
    b. pre-save the images to disk in the application folder, each named after its SMILES string (e.g. CCNC.png)
    c. the code for the columns can be:
# Map each column name to its display configuration
cfg = {}
cfg['SString'] = st.column_config.TextColumn(label='SString', width="medium")
cfg['SmileImg'] = st.column_config.ImageColumn(label='SmileImg')
# tdf is the 2-column dataframe described in step a
derows = st.data_editor(tdf, height=248, hide_index=True, column_config=cfg, key="myde")
  3. Some people have used external libraries (python - How to generate a graph from a SMILES molecule representation? - Stack Overflow).

Hope this helps in some way…

Cheers

Hi @Shawn_Pereira ,

Thanks for the suggestions! I think I can even skip saving files and encode the images as data URI strings (<img src="data:image/png;base64...), then try to display those using st.column_config.ImageColumn, as you suggested.
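A minimal sketch of the data URI idea, using only the standard library. The function name png_bytes_to_data_uri is hypothetical, and the PNG signature bytes below only stand in for a real image:

```python
import base64


def png_bytes_to_data_uri(png_bytes: bytes) -> str:
    """Wrap raw PNG bytes in a base64 data URI (hypothetical helper)."""
    # b64encode returns a single line of ASCII with no embedded newlines.
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"


# Placeholder input: the 8-byte PNG signature, not a complete image.
uri = png_bytes_to_data_uri(b"\x89PNG\r\n\x1a\n")
```

The resulting string can be stored in a dataframe column like any other text value.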

I wanted to explore JavaScript options to offload image generation from the server to the browser, and to reduce network traffic of the app. But it sounds like it might be quite a big undertaking, especially if it requires switching to streamlit-aggrid.
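For anyone who wants to try the browser-side route without a custom grid, here is a rough sketch that assembles the three SmilesDrawer steps from my first post into one HTML snippet. It renders the molecules outside the dataframe, not inside its cells. The helper name smiles_to_html is hypothetical, and passing the result to st.components.v1.html is an assumption I have not tested with SmilesDrawer:

```python
import html

SMILES_DRAWER_URL = "https://unpkg.com/smiles-drawer@1.0.10/dist/smiles-drawer.min.js"


def smiles_to_html(smiles_list):
    """Build an HTML snippet with one canvas per SMILES string (hypothetical helper)."""
    canvases = "\n".join(
        f'<canvas data-smiles="{html.escape(smi, quote=True)}"></canvas>'
        for smi in smiles_list
    )
    # Load the library, emit the canvases, then ask SmilesDrawer to draw them.
    return (
        f'<script src="{SMILES_DRAWER_URL}"></script>\n'
        f"{canvases}\n"
        "<script>SmilesDrawer.apply();</script>"
    )


page = smiles_to_html(["CCCNC", "CCC"])

# In a Streamlit app (assumption, untested here):
# import streamlit.components.v1 as components
# components.html(page, height=600)
```

This keeps image generation in the browser, but it does not solve the original problem of drawing inside st.dataframe cells.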

Hi @alex1

You can also use st.cache_data to cache the images so that subsequent runs would run faster.

More info here in the Docs:

Thanks @dataprofessor and @Shawn_Pereira, I now have a working version with server-based image generation:

import base64
import io

import pandas as pd
import rdkit
import rdkit.Chem
import rdkit.Chem.Draw
import streamlit as st


@st.cache_data
def smi_to_png(smi: str) -> str:
    """Returns molecular image as data URI."""
    mol = rdkit.Chem.MolFromSmiles(smi)
    pil_image = rdkit.Chem.Draw.MolToImage(mol)

    with io.BytesIO() as buffer:
        pil_image.save(buffer, "png")
        # b64encode yields one unbroken line; encodebytes inserts a newline every 76 characters
        data = base64.b64encode(buffer.getvalue()).decode("ascii")

    return f"data:image/png;base64,{data}"


df = pd.DataFrame({"smiles": ["CCCNC", "CCC", "CCCCC"]})

df["img"] = df["smiles"].apply(smi_to_png)

st.dataframe(df, column_config={"img": st.column_config.ImageColumn()})

That’s super awesome! Thanks for sharing the implementation.
