This is my code so far, written in PyCharm, for my Streamlit data app:
import pandas as pd
import plotly.express as px
import streamlit as st

st.set_page_config(page_title='Matching Application Number',
                   layout='wide')

# Load the data
df = pd.read_csv('Analysis_1.csv')

# Sidebar filter: every file type is selected by default
st.sidebar.header("Filter Data:")
MeetingFileType = st.sidebar.multiselect(
    "Select File Type:",
    options=df['MEETING_FILE_TYPE'].unique(),
    default=df['MEETING_FILE_TYPE'].unique()
)

# Keep only the rows whose file type is currently selected
df_selection = df.query(
    'MEETING_FILE_TYPE == @MeetingFileType'
)

st.dataframe(df_selection)
Running this code in Streamlit produces the following result:
Application_ID  MEETING_FILE_TYPE
BBC#:1010       1
NBA#:1111       2
BRC#:1212       1
SAC#:1412       4
QRD#:1912       2
BBA#:1092       4
However, I would like to filter the data so that only the Application_ID rows with MEETING_FILE_TYPE 1 and 2 are returned.
This is the result I am looking for:
Filter Data:    Application_ID  MEETING_FILE_TYPE
Select type:    BBC#:1010       1
1 2             NBA#:1111       2
                BRC#:1212       1
                QRD#:1912       2
CodePudding user response:
The .isin() method is useful for creating a Boolean mask with which to filter the rows of your DataFrame. For filtering categorical columns, as in your example, it's the simplest way to go. See the pandas documentation for Series.isin.
select_type = [1,2]
df = df[df['MEETING_FILE_TYPE'].isin(select_type)]
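Applied to the sample rows from the question, the same filter can be sketched as follows. The DataFrame is rebuilt by hand here instead of being read from Analysis_1.csv, and filter_by_file_type is a hypothetical helper name introduced only for illustration. The sketch also assumes the column holds integers; if the CSV stores the types as strings, the list would need to be ['1', '2'].

```python
import pandas as pd

# Sample rows copied from the question (stand-in for Analysis_1.csv)
df = pd.DataFrame({
    "Application_ID": ["BBC#:1010", "NBA#:1111", "BRC#:1212",
                       "SAC#:1412", "QRD#:1912", "BBA#:1092"],
    "MEETING_FILE_TYPE": [1, 2, 1, 4, 2, 4],
})


def filter_by_file_type(frame, selected):
    """Hypothetical helper: keep rows whose MEETING_FILE_TYPE is in `selected`."""
    mask = frame["MEETING_FILE_TYPE"].isin(selected)  # Boolean Series, one value per row
    return frame[mask]


# Keep only file types 1 and 2
df_selection = filter_by_file_type(df, [1, 2])
print(df_selection)
```

In the Streamlit app, one option would be to pass default=[1, 2] to st.sidebar.multiselect and hand the list it returns to this helper, so that types 1 and 2 are preselected while the user can still change the selection.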
It's not necessary in this example, but pairing .isin() with the ~ operator, which inverts the Boolean mask, often comes in handy when you want to exclude values instead of keeping them.
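As a rough illustration of the inverted mask, the sketch below drops file types 1 and 2 and keeps everything else. It again uses hand-built sample rows from the question; the names excluded_types and df_other are made up for this example.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "Application_ID": ["BBC#:1010", "NBA#:1111", "BRC#:1212",
                       "SAC#:1412", "QRD#:1912", "BBA#:1092"],
    "MEETING_FILE_TYPE": [1, 2, 1, 4, 2, 4],
})

excluded_types = [1, 2]

# True where the row's file type is one of the excluded values
mask = df["MEETING_FILE_TYPE"].isin(excluded_types)

# ~ flips every True/False in the mask, so only the other rows remain
df_other = df[~mask]
print(df_other)
```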