Python Dash: Return subset of a data frame using drop-down menu

I have a data frame containing some commodities trading deals. My objective is to display a subset of this data frame using a Dash drop-down menu. Below is the code I used to display the whole data frame:

from dash import Dash, dash_table
import pandas as pd

parquet_file = r'/home/maanan/sevencommodities/random_deals.parq'

df = pd.read_parquet(parquet_file, engine='auto')

app = Dash(__name__)

app.layout = dash_table.DataTable(df.to_dict('records'), [{"name": i, "id": i} for i in df.columns])

if __name__ == '__main__':
    app.run_server(debug=False, port=5950)

which renders the whole data frame as a single table.

However, I do not want to show everything. For example, the column book contains the following levels:

bk = pd.Categorical(df['book'])
print(bk.categories)

Index(['Book_1', 'Book_2', 'Book_3', 'Book_4', 'Book_5', 'Book_6', 'Book_7'], dtype='object')

and I only want to show the rows for one level, namely the level the user chooses.

I am really new to Python Dash, and I would really appreciate it if someone could give me a hint as to how to solve this problem. Thank you in advance.

EDIT

Below is the output from df.info():

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100000 entries, 0 to 99999
Data columns (total 15 columns):
 #   Column               Non-Null Count   Dtype                        
---  ------               --------------   -----                        
 0   deal_id              100000 non-null  int64                        
 1   book                 100000 non-null  object                       
 2   counterparty         100000 non-null  object                       
 3   commodity_name       100000 non-null  object                       
 4   commodity_code       100000 non-null  object                       
 5   executed_date        100000 non-null  datetime64[ns, Europe/Prague]
 6   first_delivery_date  100000 non-null  datetime64[ns, Europe/Prague]
 7   last_delivery_date   100000 non-null  datetime64[ns, Europe/Prague]
 8   last_trading_date    100000 non-null  datetime64[ns, Europe/Prague]
 9   volume               100000 non-null  int64                        
 10  buy_sell             100000 non-null  object                       
 11  trading_unit         100000 non-null  object                       
 12  tenor                100000 non-null  object                       
 13  delivery_window      100000 non-null  object                       
 14  strategy             3847 non-null    object                       
 dtypes: datetime64[ns, Europe/Prague](4), int64(2), object(9)
 memory usage: 11.4  MB

CodePudding user response:

I recommend isin, in case the user wants to select multiple values:

user_chosen_book = ['Book_2','Book_1']
out = df[df['book'].isin(user_chosen_book)]
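If the selection comes from a multi-select drop-down, the chosen value typically arrives as a list, and it can be None or empty when nothing is selected. A minimal sketch of a helper that covers those cases follows. The small data frame and the name filter_books are made up for illustration and are not part of the original code.

```python
import pandas as pd

# Hypothetical stand-in for the parquet data; only the 'book' column matters here.
df = pd.DataFrame({
    "deal_id": [1, 2, 3, 4],
    "book": ["Book_1", "Book_2", "Book_3", "Book_2"],
})


def filter_books(frame, chosen_books):
    """Return rows whose 'book' is in chosen_books; return all rows if nothing is chosen."""
    if not chosen_books:  # covers both None and an empty list
        return frame
    return frame[frame["book"].isin(chosen_books)]


subset = filter_books(df, ["Book_2", "Book_1"])
print(subset)
```

Falling back to the full frame on an empty selection is only one possible choice; returning an empty frame instead may suit a table of 100000 rows better.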

CodePudding user response:

You could have the user specify a user_chosen_book and then filter the DataFrame to only contain records with that book.

df = pd.read_parquet(parquet_file, engine='auto')
user_chosen_book = 'Book_2'
df = df[df['book'] == user_chosen_book]

Now you have a subset of your original df, ready to load into your application.