I have a DataFrame that looks like this:
df:
date price region currency
2022-01-01 1 NorthAmerica UnitedStatesDollar
2022-01-01 1.28 NorthAmerica CanadianDollar
2022-01-01 1.42 Oceania AustralianDollar
2022-01-02 1 NorthAmerica UnitedStatesDollar
2022-01-02 1.29 NorthAmerica CanadianDollar
2022-01-02 1.41 Oceania AustralianDollar
I have a dictionary as follows:
currency_dict = [
{
'symbol': 'USD',
'region': 'NorthAmerica',
'currency': 'UnitedStatesDollar',
},
{
'symbol': 'CAD',
'region': 'NorthAmerica',
'currency': 'CanadianDollar',
},
{
'symbol': 'AUD',
'region': 'Oceania',
'currency': 'AustralianDollar',
},
];
I want to use the dictionary to get the following DataFrame:
df:
date price symbol
2022-01-01 1 USD
2022-01-01 1.28 CAD
2022-01-01 1.42 AUD
2022-01-02 1 USD
2022-01-02 1.29 CAD
2022-01-02 1.41 AUD
I have tried df.replace(currency_dict, inplace=True)
but I am not sure how that would work with replacing two columns.
Thank you in advance.
EDIT: This is a list of dictionaries! Thanks for all the comments
CodePudding user response:
Another example:
df_currency = pd.DataFrame(currency_dict) # currency_dict actually a list of dict!!
result = pd.merge(df, df_currency)[['date', 'price', 'symbol']]
References:
https://realpython.com/pandas-merge-join-and-concat/
CodePudding user response:
Use a simple .merge()
(pandas.DataFrame.merge()): works like an SQL-join operation and even on multiple columns as keys
your example (I am way too lazy to reproduce it entirely)
import pandas as pd
df = pd.DataFrame(
[{'price': 1, 'region': 'NorthAmerica', 'currency': 'UnitedStatesDollar'},
{'price': 1.28, 'region': 'NorthAmerica', 'currency': 'CanadianDollar'},
{'price': 1.42, 'region': 'Oceania', 'currency': 'AustralianDollar'},
{'price': 1, 'region': 'NorthAmerica', 'currency': 'UnitedStatesDollar'},
])
currency_dict = [
{
'symbol': 'USD',
'region': 'NorthAmerica',
'currency': 'UnitedStatesDollar',
},
{
'symbol': 'CAD',
'region': 'NorthAmerica',
'currency': 'CanadianDollar',
},
{
'symbol': 'AUD',
'region': 'Oceania',
'currency': 'AustralianDollar',
},
]
merge
df.merge(pd.DataFrame(currency_dict), on=['currency', 'region'])
output
price | region | currency | symbol | |
---|---|---|---|---|
0 | 1.00 | NorthAmerica | UnitedStatesDollar | USD |
1 | 1.00 | NorthAmerica | UnitedStatesDollar | USD |
2 | 1.28 | NorthAmerica | CanadianDollar | CAD |
3 | 1.42 | Oceania | AustralianDollar | AUD |
CodePudding user response:
Try:
df2 = pd.DataFrame(currency_dict)
df["symbol"] = df["currency"].map(df2.set_index("currency")["symbol"])
print(df[["date", "price", "symbol"]])
Prints:
date price symbol
0 2022-01-01 1.00 USD
1 2022-01-01 1.28 CAD
2 2022-01-01 1.42 AUD
3 2022-01-02 1.00 USD
4 2022-01-02 1.29 CAD
5 2022-01-02 1.41 AUD
CodePudding user response:
Here is how to do it with replace. As @topsail mentioned in the comment your dictionary is actually a list of dictionaries so the first step is to construct a dictionary that will do the job. The we do replace on the currency column. As a last step we rename the column.
conv_dict={}
#consturcting dict to use with replace
for item in currency_dict:
conv_dict[item["currency"]]=item["symbol"]
df["currency"]=df["currency"].replace(conv_dict)
df=df.rename({"currency":"symbol"})
df=df.loc[:,["date","price","currency"]]
CodePudding user response:
pandas.apply is always a good idea when you try to perform a function on a whole column
def symbol(currency):
for i in range(3):
if currency_dict[i]['currency'] == currency:
return currency_dict[i]['symbol']
df['symbol'] = df.currency.apply(symbol)