Summary
When using the pandas merge function within a callback function, the dataframe is not updated correctly. However, the pandas drop function works as expected.
Note that although I have turned on st.cache, the same behavior occurs when the cache decorator is removed as well.
Steps to reproduce
Code snippet:
import streamlit as st
import pandas as pd

@st.cache(allow_output_mutation=True)
def read_df():
    df = pd.DataFrame({
        'col1': [1, 2],
        'col2': ['A', 'B']
    })
    return df

df = read_df()

def do_something():
    global df
    df_new = pd.DataFrame({
        'col1': [1, 2],
        'col3': ["X", "Y"]
    })
    df.drop(['col2'], axis=1, inplace=True)
    df = df.merge(df_new, on="col1")

st.button("Do Something", on_click=do_something, args=())
download_csv = df.to_csv().encode('utf-8')
st.download_button('Download', data=download_csv, file_name='download_csv.csv', mime='text/csv')
Steps to reproduce behavior
- click on "Do Something" button
- click on "Download" button
Expected behavior:
I would expect the downloaded CSV to contain:
   col1 col3
0     1    X
1     2    Y
Actual behavior:
However, I get the following output instead:
   col1
0     1
1     2
Debug info
- Streamlit version: 1.16.0
- Python version: 3.8.15
- Using Conda: Yes
- OS version: Windows 11
- Browser version: Edge v108.0.1462.54
Answer
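Streamlit reruns the whole script on every interaction, so after the callback finishes, df = read_df() rebinds df to the cached dataframe. The in-place drop mutated that cached object and therefore survives the rerun, while merge returns a new dataframe that is discarded. I would store the dataframe in a session_state variable and retrieve it from there. This way you know that you are always working with the most up-to-date values.
st.session_state['df'] = df
- sets the 'df' session state variable to the current df
st.session_state['df'] = df1
- updates the session state variable with the merged df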
Here is an example:
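import streamlit as st
import pandas as pd

@st.experimental_memo
def read_df():
    # Cached loader: the body only runs on a cache miss.
    df = pd.DataFrame({
        'col1': [1, 2],
        'col2': ['A', 'B']
    })
    return df

df = read_df()

# Initialise the session state once per session. Doing this outside the
# cached function means a cache hit cannot skip the assignment.
if 'df' not in st.session_state:
    st.session_state['df'] = df

def do_something():
    df1 = st.session_state['df']
    df_new = pd.DataFrame({
        'col1': [1, 2],
        'col3': ["X", "Y"]
    })
    df1.drop(['col2'], axis=1, inplace=True)
    df1 = df1.merge(df_new, on="col1")
    # Store the merged dataframe so that it survives the rerun.
    st.session_state['df'] = df1

st.button("Do Something", on_click=do_something, args=())

# Always read the current dataframe back from the session state.
df = st.session_state['df']
download_csv = df.to_csv().encode('utf-8')
st.download_button('Download', data=download_csv, file_name='download_csv.csv', mime='text/csv')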
Output file:
file_name = 'download_csv.csv'
   col1 col3
0     1    X
1     2    Y
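The table above is a rendered view of the file. As a rough check that needs no Streamlit, the sketch below builds the merged dataframe directly and inspects the bytes that would be passed to the download button. The variable name csv_lines is only illustrative.

```python
import pandas as pd

# The dataframe as it should look after the merge.
df = pd.DataFrame({"col1": [1, 2], "col3": ["X", "Y"]})

# Same conversion as in the app: CSV text encoded to UTF-8 bytes.
download_csv = df.to_csv().encode("utf-8")

# Split into lines to inspect the content independent of the line terminator.
csv_lines = download_csv.decode("utf-8").splitlines()
print(csv_lines)
```

The first field of every line is the dataframe index, which to_csv() writes by default; passing index=False would omit it.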
Note:
@st.experimental_memo
- caches the return value of read_df(), so the df is only loaded once.
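To see why drop appeared to work in the original code while merge did not, here is a minimal sketch without Streamlit. It assumes a plain dict (_cache, a hypothetical stand-in) in place of st.cache with allow_output_mutation=True, and it models the rerun simply by calling read_df() a second time.

```python
import pandas as pd

# Hypothetical stand-in for a mutable cache: always hands back the same object.
_cache = {}

def read_df():
    if "df" not in _cache:
        _cache["df"] = pd.DataFrame({"col1": [1, 2], "col2": ["A", "B"]})
    return _cache["df"]

def simulate_callback_then_rerun():
    df = read_df()
    df_new = pd.DataFrame({"col1": [1, 2], "col3": ["X", "Y"]})

    # In-place drop mutates the cached object itself.
    df.drop(["col2"], axis=1, inplace=True)
    # merge returns a NEW dataframe; only the name df is rebound.
    df = df.merge(df_new, on="col1")
    merged_columns = list(df.columns)

    # "Rerun": the script fetches the dataframe from the cache again,
    # which discards the merged dataframe.
    df = read_df()
    return merged_columns, list(df.columns)

merged_columns, rerun_columns = simulate_callback_then_rerun()
print(merged_columns)  # columns right after the merge
print(rerun_columns)   # columns seen after the simulated rerun
```

After the simulated rerun only col1 is left: the drop persisted because it changed the cached object, while the merged result was never stored anywhere that survives the rerun. Writing it to st.session_state closes exactly that gap.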