Home > Software design >  Python: replace columns using dictionary
Python: replace columns using dictionary

Time:06-13

I have a DataFrame that looks like this:

df:
date        price region        currency
2022-01-01  1     NorthAmerica  UnitedStatesDollar
2022-01-01  1.28  NorthAmerica  CanadianDollar
2022-01-01  1.42  Oceania       AustralianDollar
2022-01-02  1     NorthAmerica  UnitedStatesDollar
2022-01-02  1.29  NorthAmerica  CanadianDollar
2022-01-02  1.41  Oceania       AustralianDollar

I have a dictionary as follows:

currency_dict = [
  {
    'symbol': 'USD',
    'region': 'NorthAmerica',
    'currency': 'UnitedStatesDollar',
  },
  {
    'symbol': 'CAD',
    'region': 'NorthAmerica',
    'currency': 'CanadianDollar',
  },
  {
    'symbol': 'AUD',
    'region': 'Oceania',
    'currency': 'AustralianDollar',
  },
];

I want to use the dictionary to get the following DataFrame:

df:
date        price  symbol
2022-01-01  1      USD
2022-01-01  1.28   CAD
2022-01-01  1.42   AUD
2022-01-02  1      USD
2022-01-02  1.29   CAD
2022-01-02  1.41   AUD

I have tried df.replace(currency_dict, inplace=True) but I am not sure how that would work with replacing two columns.

Thank you in advance.

EDIT: This is a list of dictionaries! Thanks for all the comments

CodePudding user response:

Another example:

df_currency = pd.DataFrame(currency_dict) # currency_dict actually a list of dict!!
result = pd.merge(df, df_currency)[['date', 'price', 'symbol']]

References:

https://realpython.com/pandas-merge-join-and-concat/

https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html#database-style-dataframe-or-named-series-joining-merging

CodePudding user response:

Use a simple .merge() (pandas.DataFrame.merge()): works like an SQL-join operation and even on multiple columns as keys

your example (I am way too lazy to reproduce it entirely)

import pandas as pd

df = pd.DataFrame(
[{'price': 1, 'region': 'NorthAmerica', 'currency': 'UnitedStatesDollar'},
{'price': 1.28, 'region': 'NorthAmerica', 'currency': 'CanadianDollar'},
{'price': 1.42, 'region': 'Oceania', 'currency': 'AustralianDollar'},
{'price': 1, 'region': 'NorthAmerica', 'currency': 'UnitedStatesDollar'},
])

currency_dict = [
  {
    'symbol': 'USD',
    'region': 'NorthAmerica',
    'currency': 'UnitedStatesDollar',
  },
  {
    'symbol': 'CAD',
    'region': 'NorthAmerica',
    'currency': 'CanadianDollar',
  },
  {
    'symbol': 'AUD',
    'region': 'Oceania',
    'currency': 'AustralianDollar',
  },
]

merge

df.merge(pd.DataFrame(currency_dict), on=['currency', 'region'])

output

price region currency symbol
0 1.00 NorthAmerica UnitedStatesDollar USD
1 1.00 NorthAmerica UnitedStatesDollar USD
2 1.28 NorthAmerica CanadianDollar CAD
3 1.42 Oceania AustralianDollar AUD

CodePudding user response:

Try:

df2 = pd.DataFrame(currency_dict)
df["symbol"] = df["currency"].map(df2.set_index("currency")["symbol"])

print(df[["date", "price", "symbol"]])

Prints:

         date  price symbol
0  2022-01-01   1.00    USD
1  2022-01-01   1.28    CAD
2  2022-01-01   1.42    AUD
3  2022-01-02   1.00    USD
4  2022-01-02   1.29    CAD
5  2022-01-02   1.41    AUD

CodePudding user response:

Here is how to do it with replace. As @topsail mentioned in the comment your dictionary is actually a list of dictionaries so the first step is to construct a dictionary that will do the job. The we do replace on the currency column. As a last step we rename the column.

conv_dict={}
#consturcting dict to use with replace
for item in currency_dict:
    conv_dict[item["currency"]]=item["symbol"]
    
    
df["currency"]=df["currency"].replace(conv_dict)
df=df.rename({"currency":"symbol"})

df=df.loc[:,["date","price","currency"]]

CodePudding user response:

pandas.apply is always a good idea when you try to perform a function on a whole column

def symbol(currency):
    for i in range(3):
        if currency_dict[i]['currency'] == currency:
            return currency_dict[i]['symbol']

df['symbol'] = df.currency.apply(symbol)
  • Related