Streamlit and pandas Profiling

ninjayoga · June 6, 2022, 10:18am

Hi,
I am trying to create a pandas-profiling app using Streamlit.
The input file (.csv) is 6.5 MB and has 80 columns.
When I run the app with a few selected columns, it generates the report, but when I run it with all columns as input I get the error below:

ValueError: Maximum allowed size exceeded

Traceback:
File "C:\Users\username\anaconda3\lib\site-packages\streamlit\scriptrunner\script_runner.py", line 475, in run_script
exec(code, module.__dict__)
File "C:\Users\username\applied AI files\REUSEABLES\app_excel_option.py", line 106, in <module>
st_profile_report(profile)
File "C:\Users\username\AppData\Roaming\Python\Python39\site-packages\streamlit_pandas_profiling\__init__.py", line 54, in st_profile_report
_render_component(html=report.to_html(), height=height, key=key, default=None)
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\profile_report.py", line 368, in to_html
return self.html
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\profile_report.py", line 185, in html
self._html = self._render_html()
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\profile_report.py", line 287, in _render_html
report = self.report
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\profile_report.py", line 179, in report
self.report = get_report_structure(self.config, self.description_set)
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\profile_report.py", line 161, in description_set
self.description_set = describe_df(
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\model\describe.py", line 71, in describe
series_description = get_series_descriptions(
File "C:\Users\username\anaconda3\lib\site-packages\multimethod\__init__.py", line 312, in __call__
return func(*args, **kwargs)
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\model\pandas\summary_pandas.py", line 92, in pandas_get_series_descriptions
for i, (column, description) in enumerate(
File "C:\Users\username\anaconda3\lib\multiprocessing\pool.py", line 870, in next
raise value
File "C:\Users\username\anaconda3\lib\multiprocessing\pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\model\pandas\summary_pandas.py", line 72, in multiprocess_1d
return column, describe_1d(config, series, summarizer, typeset)
File "C:\Users\username\anaconda3\lib\site-packages\multimethod\__init__.py", line 312, in __call__
return func(*args, **kwargs)
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\model\pandas\summary_pandas.py", line 50, in pandas_describe_1d
return summarizer.summarize(config, series, dtype=vtype)
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\model\summarizer.py", line 37, in summarize
_, _, summary = self.handle(str(dtype), config, series, {"type": str(dtype)})
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\model\handler.py", line 62, in handle
return op(*args)
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\model\handler.py", line 21, in func2
return f(*res)
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\model\handler.py", line 21, in func2
return f(*res)
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\model\handler.py", line 21, in func2
return f(*res)
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\model\handler.py", line 17, in func2
res = g(*x)
File "C:\Users\username\anaconda3\lib\site-packages\multimethod\__init__.py", line 312, in __call__
return func(*args, **kwargs)
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\model\summary_algorithms.py", line 65, in inner
return fn(config, series, summary)
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\model\summary_algorithms.py", line 82, in inner
return fn(config, series, summary)
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\model\pandas\describe_numeric_pandas.py", line 114, in pandas_describe_numeric_1d
stats["chi_squared"] = chi_square(finite_values)
File "C:\Users\username\anaconda3\lib\site-packages\pandas_profiling\model\summary_algorithms.py", line 52, in chi_square
histogram, _ = np.histogram(values, bins="auto")
File "<__array_function__ internals>", line 5, in histogram
File "C:\Users\username\anaconda3\lib\site-packages\numpy\lib\histograms.py", line 792, in histogram
bin_edges, uniform_bins = _get_bin_edges(a, bins, range, weights)
File "C:\Users\username\anaconda3\lib\site-packages\numpy\lib\histograms.py", line 446, in _get_bin_edges
bin_edges = np.linspace(
File "<__array_function__ internals>", line 5, in linspace
File "C:\Users\username\anaconda3\lib\site-packages\numpy\core\function_base.py", line 135, in linspace
y = _nx.arange(0, num, dtype=dt).reshape((-1,) + (1,) * ndim(delta))

Is there a workaround so that the app can handle 80+ columns as input?

Thanks,
Nidhi
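
A note on the traceback above: the failure happens inside np.histogram(values, bins="auto"), where np.linspace is asked for more bin edges than an array can hold. One plausible cause (an assumption, not confirmed by the post) is a numeric column whose values sit in a narrow band apart from a few extreme outliers, so the automatically chosen bin width is tiny relative to the full range. The sketch below uses only the standard library to approximate the bin count that the "auto" rule would request for each column, so suspect columns can be spotted before profiling. The names estimated_auto_bins and flag_columns and the max_bins threshold are hypothetical helpers for illustration, not part of pandas-profiling or NumPy.

```python
import math
import statistics


def estimated_auto_bins(values):
    """Approximate how many bins the "auto" histogram rule would request."""
    data = sorted(float(v) for v in values)
    n = len(data)
    if n < 2:
        return 1
    span = data[-1] - data[0]
    if span == 0:
        return 1
    # Quartiles with linear interpolation between data points.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    # Freedman-Diaconis width shrinks with the interquartile range.
    fd_width = 2.0 * (q3 - q1) / n ** (1.0 / 3.0)
    # Sturges width depends only on the range and the sample size.
    sturges_width = span / (math.log2(n) + 1.0)
    # Use the narrower width, falling back to Sturges when the IQR is zero.
    width = min(fd_width, sturges_width) if fd_width > 0 else sturges_width
    ratio = span / width
    return math.ceil(ratio) if math.isfinite(ratio) else math.inf


def flag_columns(columns, max_bins=1_000_000):
    """Return {column name: estimated bins} for columns above max_bins."""
    flagged = {}
    for name, values in columns.items():
        bins = estimated_auto_bins(values)
        if bins > max_bins:
            flagged[name] = bins
    return flagged


# Made-up data: one evenly spread column, one with a single huge outlier.
example_columns = {
    "evenly_spread": list(range(1000)),
    "one_outlier": [i * 1e-6 for i in range(1000)] + [1e12],
}
suspects = flag_columns(example_columns)
print(sorted(suspects))
```

With a pandas DataFrame, the same check could be run over each numeric column after dropping missing and infinite values. Any column it flags is a candidate for clipping outliers or for being left out of the report, which would also fit the observation that a subset of columns profiles without error.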

system · June 6, 2023, 10:18am

This topic was automatically closed 365 days after the last reply. New replies are no longer allowed.
