Map each word inside a dataframe to a dictionary

Time:08-21

I have a dictionary

a=mapping.set_index('String')
dict_y = a['Mapping'].to_dict()
dict_y

{'mp3': 'sound',
 'player': 'device',
 'horses': 'horse',
 'laptop': 'electronic device',
 'hard disk': 'storage'}

I want to replace each word in a dataframe row using this dictionary. Please see the sample dataframes below.

Original Dataframe

Item Code  Item Description
1          64 GB sound device
2          15 inch laptop

Required Dataframe

Item Code  Item Description
1          64 GB mp3 player
2          15 inch electronic device

This is the code I have developed so far, but I don't know how to move forward:

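def testing(text):
    # The function must take the cell text as an argument; apply passes it in.
    test_dic = dict_y
    words = text.split(" ")

    new_text = []
    for word in words:
        # Replace the word when it is a dictionary key, otherwise keep it.
        # Splitting on spaces cannot match multi-word keys such as 'hard disk'.
        if word in test_dic:
            new_text.append(test_dic[word])
        else:
            new_text.append(word)
    return " ".join(new_text)

# Pass the function itself (no parentheses) so apply calls it once per row.
df_test['ITEM DESCRIPTION'] = df_test['ITEM DESCRIPTION'].apply(testing)
df_test['ITEM DESCRIPTION']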

CodePudding user response:

Note: you want to replace both the keys and the values of the dictionary wherever they appear in the pandas rows.

You can search the rows for the keys and replace them with the values, or vice versa.

But you want to match keys and values in the same rows and replace each with its counterpart from the dict, which I think may not be possible.
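
For the two sample rows in the question, though, a single pass over a combined lookup might come close. The sketch below is not part of the original answer. It assumes that no string is both a key and a value in the dictionary, and the names two_way, pattern and swap_terms are hypothetical. Matching is case-sensitive.

```python
import re
import pandas as pd

# Dictionary from the question.
dict_y = {
    "mp3": "sound",
    "player": "device",
    "horses": "horse",
    "laptop": "electronic device",
    "hard disk": "storage",
}

# Hypothetical two-way table: each key maps to its value and each value back
# to its key. Assumes no string appears as both a key and a value.
two_way = {**{v: k for k, v in dict_y.items()}, **dict_y}

# Try longer phrases first so 'electronic device' is preferred over 'device'.
alternatives = sorted(two_way, key=len, reverse=True)
pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(w) for w in alternatives) + r")\b"
)

def swap_terms(text):
    # A single pass: each match is replaced once and the replacement text
    # is never scanned again, so a swapped word is not swapped back.
    return pattern.sub(lambda m: two_way[m.group(0)], text)

df_test = pd.DataFrame(
    {
        "Item Code": [1, 2],
        "Item Description": ["64 GB sound device", "15 inch laptop"],
    }
)
df_test["Item Description"] = df_test["Item Description"].apply(swap_terms)
print(df_test)
```

Because the table works in both directions, running swap_terms a second time on its own output swaps the terms back, so this is only suitable when each row is processed exactly once.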

Instead, you can keep the words you want to look for as the dict keys, check for them in the pandas rows, and replace them with their dict values, as below.

Use .replace with regex=True

Ex:

import pandas as pd

dic = {"quick brown fox": "fox", "lazy dog": "dog", "u": "you"}
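# Update as per comment: wrap each key in \b word boundaries so that a short
# key such as "u" only matches as a whole word, not inside "quick" or "jumps".
dic = {r"\b{}\b".format(k): v for k, v in dic.items()}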

df = pd.DataFrame({"Text": ["The quick brown fox jumps over the lazy dog"]})
df["Text"] = df["Text"].replace(dic, regex=True)
print(df)
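
As a rough illustration, the same approach can be applied to the dictionary and sample rows from the question. This is a sketch, not a definitive solution: the use of re.escape and the variable name patterns are additions that do not appear in the original posts, and the column name follows the sample table rather than the asker's code.

```python
import re
import pandas as pd

# Dictionary from the question.
dict_y = {
    "mp3": "sound",
    "player": "device",
    "horses": "horse",
    "laptop": "electronic device",
    "hard disk": "storage",
}

# Escape each key and wrap it in word boundaries so that only whole words
# or phrases match. Multi-word keys such as 'hard disk' work as well.
patterns = {r"\b{}\b".format(re.escape(k)): v for k, v in dict_y.items()}

# Sample rows from the question.
df_test = pd.DataFrame(
    {
        "Item Code": [1, 2],
        "Item Description": ["64 GB sound device", "15 inch laptop"],
    }
)
df_test["Item Description"] = df_test["Item Description"].replace(
    patterns, regex=True
)
print(df_test)
```

Only the second row changes here, to "15 inch electronic device". The first row contains dictionary values rather than keys, so a key-to-value replacement leaves it as it is.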

Reference: Pandas replace part of string with values from dictionary
