Inconsistent visualization after filtering, bug?

Melody_Duplaix · November 27, 2023, 1:27pm

I’m reaching out to seek assistance regarding a specific issue in my Streamlit application.
My application is local for the time being.

Problem Description: Upon selecting a specific type (“revendeur”) and then reverting to “Tous,” I’ve observed a discrepancy in the displayed data. Some products appear in the chart without any evolution data, although they were not present before selecting the specific type.

Detailed Problem Statement:

I’ve carefully examined the data filtering logic and ensured correct implementation.
The issue seems to arise specifically when transitioning from a filtered type back to displaying all data (“Tous”).
Products without evolution data are unexpectedly included in the chart after this transition.
If I insert an st.write() of the filtered dataframe right before the graph is created, the unexpected products do not appear in the dataframe.
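
The st.write() check only covers the dataframe. A possible next step is to compare it with what the figure object itself holds, to see whether the extra products are already in the figure or only show up once the chart is rendered. This is a minimal sketch, assuming the figure exposes its traces as `fig.data` with a `y` attribute, as Plotly figures do. `extra_products_in_figure` is a hypothetical helper name, and a stand-in object replaces the real figure so the snippet runs on its own:

```python
from types import SimpleNamespace

import pandas as pd


def extra_products_in_figure(fig, df, column="Nom du Produit"):
    """Return the names found on the figure's y axis but absent from df[column]."""
    plotted = {name for trace in fig.data for name in trace.y}
    expected = set(df[column])
    return sorted(plotted - expected)


# Stand-in for the Plotly figure: one bar trace with product names on y.
# In the app, pass the real `fig` returned by the graph function instead.
fake_fig = SimpleNamespace(data=[SimpleNamespace(y=["A", "B", "C"])])
df_plotted = pd.DataFrame({"Nom du Produit": ["A", "B"], "2023-T3": [12.5, -4.0]})

extras = extra_products_in_figure(fake_fig, df_plotted)
print(extras)  # ['C']
```

If this returns an empty list in the app while the chart still shows extra bars, the dataframe and the figure agree, and the discrepancy would have to come from the rendering step rather than from the filtering logic.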

Request for Insights: I’m seeking insights from the community to understand whether this might be indicative of a Streamlit bug or if there are aspects of my implementation that I may have overlooked.

Code Highlights: For your reference, I’ve included relevant portions of my Python code below:

My function:


import pandas as pd
import plotly.express as px


def graph_articles_plus_grande_evolution_trimestre(selected_type_vente):
    """Build the bar chart of the articles with the largest evolution per quarter.

    Args:
        selected_type_vente (str): selected sale type

    Returns:
        plotly figure: bar chart
        str: name of the last quarter in the dataframe
    """
    df_trimestre = pd.read_csv("data/vente_par_articles.csv")
    # Filter on the selected sale type
    if selected_type_vente != "Tous":
        df_trimestre = df_trimestre[df_trimestre["Type de vente"] == selected_type_vente]
    # Group by to avoid duplicated values
    df_filtre_trimestre = df_trimestre.groupby(['Nom du Produit', 'trimestre'])['Montant total HT'].sum().reset_index()
    # Sort by product and quarter to guarantee the correct order
    df_filtre_trimestre = df_filtre_trimestre.sort_values(by=['Nom du Produit', 'trimestre'])
    # Compute the percentage of evolution
    df_filtre_trimestre['Pourcentage Evolution'] = df_filtre_trimestre.groupby('Nom du Produit')['Montant total HT'].pct_change() * 100
    # Pivot the rows into columns
    df_pivot_trimestre = df_filtre_trimestre.pivot(index='Nom du Produit', columns='trimestre', values='Pourcentage Evolution').reset_index()
    # Add a column for the chart
    # (note: [1:-1] skips the product name and the last quarter column)
    df_pivot_trimestre['Evolution'] = df_pivot_trimestre.apply(lambda row: [0 if pd.isnull(value) else value for value in row.values[1:-1]], axis=1)
    # Replace NaN values with 0
    df_pivot_trimestre.fillna(0, inplace=True)
    # Find the name of the last quarter's column
    dernier_trimestre = df_pivot_trimestre.columns[-2]
    # Exclude zero values
    df_trie_trimestre = df_pivot_trimestre[(df_pivot_trimestre[dernier_trimestre] != 0) & (df_pivot_trimestre[dernier_trimestre].notna())]
    # Sort the DataFrame by the last quarter's column
    df_trie_trimestre = df_trie_trimestre.sort_values(by=dernier_trimestre, ascending=False).head(15)
    # Create the bar chart with Plotly Express
    fig = px.bar(df_trie_trimestre, y='Nom du Produit', x=dernier_trimestre,
                template='plotly_white')
    # Order the bars so that the largest ones are at the top
    fig.update_layout(hovermode=False, dragmode=False, barmode='stack', yaxis={'categoryorder': 'total ascending'})
    # Color the bars according to their value
    fig.update_traces(marker_color=['rgb(69, 197, 127)' if val >= 0 else 'red' for val in df_trie_trimestre[dernier_trimestre]])
    # Add a text label for each non-zero value
    for index, row in df_trie_trimestre.iterrows():
        value = row[dernier_trimestre]
        fig.add_annotation(
            x=value,
            y=row['Nom du Produit'],
            text=f"{value:.2f}%",
            showarrow=False,
            font=dict(size=12, color='rgb(69, 197, 127)' if value >= 0 else 'red'),
            xshift=30 if value > 0 else -30)
    return fig, dernier_trimestre
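
To make the "products without evolution data" concrete, here is a minimal sketch of the `pct_change` and pivot steps on made-up figures (the product names, quarters, and amounts are invented for illustration). A product seen in only one quarter gets `NaN`, which `fillna(0)` turns into 0, so the `!= 0` filter is what is expected to keep it out of the chart:

```python
import pandas as pd

# Made-up sales: product "A" sells in both quarters, "B" only in the last one.
sales = pd.DataFrame({
    "Nom du Produit": ["A", "A", "B"],
    "trimestre": ["2023-T2", "2023-T3", "2023-T3"],
    "Montant total HT": [100.0, 150.0, 80.0],
})

sales = sales.sort_values(by=["Nom du Produit", "trimestre"])
# The first row of each product has no previous quarter, so pct_change gives NaN.
sales["Pourcentage Evolution"] = (
    sales.groupby("Nom du Produit")["Montant total HT"].pct_change() * 100
)

pivot = sales.pivot(
    index="Nom du Produit", columns="trimestre", values="Pourcentage Evolution"
).fillna(0)

# Keep only products whose last quarter shows a non-zero evolution.
last_quarter = pivot.columns[-1]
kept = pivot[pivot[last_quarter] != 0].index.tolist()
print(kept)  # ['A']
```

Product "B" is dropped here because it has no earlier quarter to compare against, which is the kind of product that shows up unexpectedly in the chart described above.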

The page:

import streamlit as st
import pandas as pd
from bibliotheque.lib import *

config_pages_menu("wide")
formatage_de_la_page("style.css")
df = pd.read_csv("data/vente_par_articles.csv")

# Components located in the filter sidebar
st.sidebar.header("Menu")
type_vente_liste = ["Tous"] + sorted(df['Type de vente'].unique().tolist())
selected_type_vente = st.sidebar.selectbox("Choisir un type de vente", type_vente_liste)

# TITLE
st.title("Evolution des montant de vente par clients par trimestres")

col1, col2 = st.columns(2)
graph_trimestre, trimestre = graph_articles_plus_grande_evolution_trimestre(selected_type_vente)
col1.write(f"<h4>Pourcentage d'évolution du montant de vente par articles en {trimestre}</h4>", unsafe_allow_html=True)
col1.plotly_chart(graph_trimestre, use_container_width=True)
graph_annee, annee = graph_articles_plus_grande_evolution_annee(selected_type_vente)
col2.write(f"<h4>Pourcentage d'évolution du montant de vente par articles en {annee}</h4>", unsafe_allow_html=True)
col2.plotly_chart(graph_annee, use_container_width=True)


df_style, nombre_lignes = calcul_dataframe_evolution_vente_par_articles(selected_type_vente)
column_config = {
        "Evolution": st.column_config.LineChartColumn(
            "Evolution par trimestre",
            width="medium",
            help="L'évolution du chiffre d'affaires par trimestre",
            y_min=df["Montant total HT"].min(),
            y_max=df["Montant total HT"].max(),
        ),
    }
st.dataframe(df_style.format(precision=2), height=36 * nombre_lignes, column_config=column_config, hide_index=True)

To illustrate my problem, here’s what it looks like with a diagram:

Additional Context: This is not the first time I’ve encountered this problem, but in previous instances, I attributed it to issues in my code.

Seeking Community Wisdom: If anyone has encountered a similar behavior or has suggestions on how to troubleshoot this discrepancy effectively, I would greatly appreciate your guidance. Your expertise is invaluable in resolving this issue.

Thank you for your time and consideration

dataprofessor · November 28, 2023, 6:36am

Hi @Melody_Duplaix

I’d suggest not reusing the same variable name after performing a data transformation. Several variables in the function are reassigned, overwriting a previously defined value. Please check whether giving every intermediate result its own unique name resolves the issue.
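
As a minimal sketch of that suggestion, each step can return a fresh dataframe under its own name, so a filtered result never overwrites the unfiltered one. The helper name `filter_by_sale_type` and the sample rows (including the "direct" sale type) are made up for illustration:

```python
import pandas as pd


def filter_by_sale_type(df, selected_type):
    """Return a new, filtered frame; the input frame is never modified."""
    if selected_type == "Tous":
        return df.copy()
    return df[df["Type de vente"] == selected_type].copy()


df_all = pd.DataFrame({
    "Nom du Produit": ["A", "B", "C"],
    "Type de vente": ["revendeur", "direct", "revendeur"],
})

df_resellers = filter_by_sale_type(df_all, "revendeur")
df_everything = filter_by_sale_type(df_all, "Tous")

print(len(df_all), len(df_resellers), len(df_everything))  # 3 2 3
```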

Hope this helps!

Melody_Duplaix · November 28, 2023, 7:35am

Thank you, but even after being careful not to reuse or reassign to an already existing variable, I still have the same problem.

def graph_articles_plus_grande_evolution_trimestre(selected_type_vente):
    """Build the bar chart of the articles with the largest evolution per quarter.

    Args:
        selected_type_vente (str): selected sale type

    Returns:
        plotly figure: bar chart
        str: name of the last quarter in the dataframe
    """
    df_trimestre = pd.read_csv("data/vente_par_articles.csv")
    # Filter on the selected sale type
    if selected_type_vente != "Tous":
        df_filtre_type_vente = df_trimestre[df_trimestre["Type de vente"] == selected_type_vente]
    else:
        df_filtre_type_vente = df_trimestre.copy()
    # Group by to avoid duplicated values
    df_filtre_trimestre = df_filtre_type_vente.groupby(['Nom du Produit', 'trimestre'])['Montant total HT'].sum().reset_index()
    # Sort by product and quarter to guarantee the correct order
    df_filtre_trimestre_trie = df_filtre_trimestre.sort_values(by=['Nom du Produit', 'trimestre'])
    # Compute the percentage of evolution
    df_filtre_trimestre_trie['Pourcentage Evolution'] = df_filtre_trimestre_trie.groupby('Nom du Produit')['Montant total HT'].pct_change() * 100
    # Pivot the rows into columns
    df_pivot_trimestre = df_filtre_trimestre_trie.pivot(index='Nom du Produit', columns='trimestre', values='Pourcentage Evolution').reset_index()
    # Add a column for the chart
    # (note: [1:-1] skips the product name and the last quarter column)
    df_pivot_trimestre['Evolution'] = df_pivot_trimestre.apply(lambda row: [0 if pd.isnull(value) else value for value in row.values[1:-1]], axis=1)
    # Replace NaN values with 0
    df_pivot_trimestre.fillna(0, inplace=True)
    # Find the name of the last quarter's column
    dernier_trimestre = df_pivot_trimestre.columns[-2]
    # Exclude zero values
    df_trie_trimestre = df_pivot_trimestre[(df_pivot_trimestre[dernier_trimestre] != 0) & (df_pivot_trimestre[dernier_trimestre].notna())]
    # Sort the DataFrame by the last quarter's column
    df_trie_trimestre_sort = df_trie_trimestre.sort_values(by=dernier_trimestre, ascending=False).head(15)
    # Create the bar chart with Plotly Express
    fig = px.bar(df_trie_trimestre_sort, y='Nom du Produit', x=dernier_trimestre,
                template='plotly_white')
    # Order the bars so that the largest ones are at the top
    fig.update_layout(hovermode=False, dragmode=False, barmode='stack', yaxis={'categoryorder': 'total ascending'})
    # Color the bars according to their value
    fig.update_traces(marker_color=['rgb(69, 197, 127)' if val >= 0 else 'red' for val in df_trie_trimestre_sort[dernier_trimestre]])
    # Add a text label for each non-zero value
    for index, row in df_trie_trimestre_sort.iterrows():
        value = row[dernier_trimestre]
        fig.add_annotation(
            x=value,
            y=row['Nom du Produit'],
            text=f"{value:.2f}%",
            showarrow=False,
            font=dict(size=12, color='rgb(69, 197, 127)' if value >= 0 else 'red'),
            xshift=30 if value > 0 else -30)
    return fig, dernier_trimestre
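
One way to check whether the data step keeps any state between selections is to isolate it from the chart and call it in the same order as the clicks. The sketch below is a reduced version of the function above under stated assumptions: `last_quarter_evolution` is a hypothetical name, it takes the dataframe as an argument instead of reading the CSV, it leaves out the chart entirely, and the sample rows (including the "direct" sale type) are invented:

```python
import pandas as pd


def last_quarter_evolution(df, selected_type):
    """Data-only part of the graph function: evolution (%) for the last quarter."""
    if selected_type != "Tous":
        df = df[df["Type de vente"] == selected_type]
    totals = (
        df.groupby(["Nom du Produit", "trimestre"])["Montant total HT"]
        .sum()
        .reset_index()
        .sort_values(by=["Nom du Produit", "trimestre"])
    )
    totals["Pourcentage Evolution"] = (
        totals.groupby("Nom du Produit")["Montant total HT"].pct_change() * 100
    )
    pivot = totals.pivot(
        index="Nom du Produit", columns="trimestre", values="Pourcentage Evolution"
    ).fillna(0)
    last = pivot.columns[-1]
    # Keep only the products with a non-zero evolution in the last quarter.
    return pivot.loc[pivot[last] != 0, last].sort_values(ascending=False)


sales = pd.DataFrame({
    "Nom du Produit": ["A", "A", "B", "B", "C"],
    "Type de vente": ["revendeur", "revendeur", "direct", "direct", "direct"],
    "trimestre": ["2023-T2", "2023-T3", "2023-T2", "2023-T3", "2023-T3"],
    "Montant total HT": [100.0, 150.0, 200.0, 100.0, 80.0],
})

before = last_quarter_evolution(sales, "Tous")
last_quarter_evolution(sales, "revendeur")  # simulate picking a type in between
after = last_quarter_evolution(sales, "Tous")

print(before.equals(after))  # True
print(before.index.tolist())  # ['A', 'B']
```

If the same check passes on the real CSV, the filtering logic returns identical data for "Tous" before and after a selection, which would point toward the chart rendering rather than the pandas code.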

system · May 26, 2024, 7:35am

This topic was automatically closed 180 days after the last reply. New replies are no longer allowed.
