Filter Data with Pandas in PyCharm

Time:04-18

This is my code so far, written in PyCharm, for my Streamlit data app:

import pandas as pd
import plotly.express as px
import streamlit as st

st.set_page_config(page_title='Matching Application Number', 
                   layout='wide')
df = pd.read_csv('Analysis_1.csv')

st.sidebar.header("Filter Data:")
MeetingFileType = st.sidebar.multiselect(
    "Select File Type:",
    options=df['MEETING_FILE_TYPE'].unique(),
    default=df['MEETING_FILE_TYPE'].unique()
)

df_selection = df.query(
    'MEETING_FILE_TYPE == @MeetingFileType'
)

st.dataframe(df_selection)

Running this code in Streamlit gives the following result:

Application_ID    MEETING_FILE_TYPE
BBC#:1010         1     
NBA#:1111         2
BRC#:1212         1
SAC#:1412         4
QRD#:1912         2
BBA#:1092         4

However, I would like to filter the data and return only the Application_ID rows where MEETING_FILE_TYPE is 1 or 2.

This is the result I am looking for:

Filter Data:               Application_ID     MEETING_FILE_TYPE
select type:               BBC#:1010          1 
1 2                        NBA#:1111          2
                           BRC#:1212          1
                           QRD#:1912          2

CodePudding user response:

The .isin() method is useful for building a boolean mask (a Series of True/False values) with which to filter the rows of your DataFrame. For filtering on a categorical column, as in your example, it's the simplest way to go. See the pandas documentation for Series.isin.

select_type = [1, 2]  # file types to keep
# keep only the rows whose MEETING_FILE_TYPE appears in select_type
df = df[df['MEETING_FILE_TYPE'].isin(select_type)]
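
To see what happens step by step, here is a minimal, self-contained sketch that rebuilds the table from the question by hand instead of reading Analysis_1.csv. It assumes MEETING_FILE_TYPE is stored as integers; if your CSV loads it as text, the list would need to be ['1', '2'] instead. The names mask and df_selection are only illustrative.

```python
import pandas as pd

# Sample data copied from the question's output table
# (assumption: the file types are integers, not strings).
df = pd.DataFrame({
    "Application_ID": ["BBC#:1010", "NBA#:1111", "BRC#:1212",
                       "SAC#:1412", "QRD#:1912", "BBA#:1092"],
    "MEETING_FILE_TYPE": [1, 2, 1, 4, 2, 4],
})

select_type = [1, 2]  # file types to keep

# .isin() returns a boolean Series: True where the value is in select_type.
mask = df["MEETING_FILE_TYPE"].isin(select_type)

# Indexing the DataFrame with the mask keeps only the rows marked True.
df_selection = df[mask]
print(df_selection)
```

In your Streamlit app, the list passed to .isin() could be the value returned by st.sidebar.multiselect, for example with default=[1, 2] so that only those two types are preselected.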

It's not necessary in this example, but pairing .isin() with the ~ operator, which inverts each value in the boolean mask, often comes in handy for excluding rows instead.
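
As a rough sketch of the ~ and .isin() pairing, the snippet below uses the same hand-built sample data as the question's table (again assuming integer file types) and keeps every row whose type is not 1 or 2. The name df_excluded is just for illustration.

```python
import pandas as pd

# Sample data copied from the question's output table
# (assumption: the file types are integers, not strings).
df = pd.DataFrame({
    "Application_ID": ["BBC#:1010", "NBA#:1111", "BRC#:1212",
                       "SAC#:1412", "QRD#:1912", "BBA#:1092"],
    "MEETING_FILE_TYPE": [1, 2, 1, 4, 2, 4],
})

select_type = [1, 2]

# ~ flips True/False in the mask, so this keeps rows NOT in select_type.
df_excluded = df[~df["MEETING_FILE_TYPE"].isin(select_type)]
print(df_excluded)
```

Note that ~ has to be applied to the boolean Series produced by .isin(), not to the list of values.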
