I have a data frame containing some commodities trading deals. My objective is to display a subset of this data frame, chosen through a Dash dropdown menu. Below is the code I used to display the whole data frame:
from dash import Dash, dash_table
import pandas as pd
parquet_file = r'/home/maanan/sevencommodities/random_deals.parq'
df = pd.read_parquet(parquet_file, engine='auto')
app = Dash(__name__)
app.layout = dash_table.DataTable(df.to_dict('records'), [{"name": i, "id": i} for i in df.columns])
if __name__ == '__main__':
    app.run_server(debug=False, port=5950)
which returns the following table:
However, I do not want to show everything. For example, the variable book contains the following levels:
bk = pd.Categorical(df['book'])
print(bk.categories)
Index(['Book_1', 'Book_2', 'Book_3', 'Book_4', 'Book_5', 'Book_6', 'Book_7'], dtype='object')
and I only want to show the rows of the data frame for one level, namely the level that the user chooses.
I am really new to Python Dash, and I would really appreciate it if someone could give me a hint as to how to solve this problem. Thank you in advance.
EDIT
Below is the output from df.info():
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100000 entries, 0 to 99999
Data columns (total 15 columns):
 #   Column               Non-Null Count   Dtype
---  ------               --------------   -----
 0   deal_id              100000 non-null  int64
 1   book                 100000 non-null  object
 2   counterparty         100000 non-null  object
 3   commodity_name       100000 non-null  object
 4   commodity_code       100000 non-null  object
 5   executed_date        100000 non-null  datetime64[ns, Europe/Prague]
 6   first_delivery_date  100000 non-null  datetime64[ns, Europe/Prague]
 7   last_delivery_date   100000 non-null  datetime64[ns, Europe/Prague]
 8   last_trading_date    100000 non-null  datetime64[ns, Europe/Prague]
 9   volume               100000 non-null  int64
 10  buy_sell             100000 non-null  object
 11  trading_unit         100000 non-null  object
 12  tenor                100000 non-null  object
 13  delivery_window      100000 non-null  object
 14  strategy             3847 non-null    object
dtypes: datetime64[ns, Europe/Prague](4), int64(2), object(9)
memory usage: 11.4 MB
CodePudding user response:
I would recommend isin, in case the user wants to select several books at once:
user_chosen_book = ['Book_2','Book_1']
out = df[df['book'].isin(user_chosen_book)]
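A dropdown can hand back nothing, a single string, or a list of strings depending on how it is configured, so it may help to wrap the isin filter in a small helper. The following is only a sketch: filter_books is a hypothetical name, and the toy data frame stands in for the real deals table.

```python
import pandas as pd

# Toy stand-in for the real deals table; the values are made up for illustration.
deals = pd.DataFrame({
    "deal_id": [1, 2, 3, 4],
    "book": ["Book_1", "Book_2", "Book_3", "Book_2"],
    "volume": [10, 20, 30, 40],
})

def filter_books(df, chosen_books):
    """Return the rows whose 'book' is among chosen_books (hypothetical helper)."""
    if not chosen_books:
        # None or an empty list: nothing selected yet, so show everything.
        return df
    if isinstance(chosen_books, str):
        # A single-select dropdown passes a plain string rather than a list.
        chosen_books = [chosen_books]
    return df[df["book"].isin(chosen_books)]

subset = filter_books(deals, ["Book_2", "Book_1"])
print(subset)
```

Returning the full table when nothing is selected is a design choice; returning an empty frame instead would be equally reasonable.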
CodePudding user response:
You could have the user specify a user_chosen_book and then filter the DataFrame to only contain records with that book.
df = pd.read_parquet(parquet_file, engine='auto')
user_chosen_book = 'Book_2'
df = df[df['book'] == user_chosen_book]
Now you have a subset of your original df, ready to load into your application.