Creating a new column by mapping a dictionary of strings instead of numbers in pandas

I think there is already a question about this, but I was not able to find it. I read these questions, but they did not help in my case because they return the wrong result.

I have the following df:

import pandas as pd

mod_unmod = {"mod": ["yt", "fb", "ig"],
             "unmod": ["tik tok"]}
df_dict = {"media_type": ["yt", "fb", "ig", "tik tok", "yt", "fb", "ig", "tik tok"],
           "budget": [1, 2, 3, 4, 5, 6, 1, 2]}

df = pd.DataFrame(df_dict)


  media_type  budget
0         yt       1
1         fb       2
2         ig       3
3    tik tok       4
4         yt       5
5         fb       6
6         ig       1
7    tik tok       2

Expected Output

I want to get this based on the values inside the dict mod_unmod

  media_type  budget  model
0         yt       1    mod
1         fb       2    mod
2         ig       3    mod
3    tik tok       4  unmod
4         yt       5    mod
5         fb       6    mod
6         ig       1    mod
7    tik tok       2  unmod

I tried this:

df["model"] = df["media_type"].map(mod_unmod)

but it returns this:

  media_type  budget  model
0         yt       1    NaN
1         fb       2    NaN
2         ig       3    NaN
3    tik tok       4    NaN
4         yt       5    NaN
5         fb       6    NaN
6         ig       1    NaN
7    tik tok       2    NaN

What is wrong? (I think it is because map only works with numbers.)
How can I get my desired output?
Please let me know if this question is a duplicate :)
In this example the mapping is based on only two possible labels, "mod" and "unmod", but what if I had a third one? For example, in the future I want to create a column based on these three labels: "to test", "testing", "untested".

CodePudding user response：

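The problem is that your mapping is the wrong way around. map looks up each value of media_type as a key in the dict, and the keys of your dict are "mod" and "unmod", so nothing matches and you get NaN. It has nothing to do with strings versus numbers. Your mod_unmod dict should look something like this: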

mod_unmod = {'yt':'mod','fb':'mod','ig':'mod','tik tok':'unmod'}

That should give the desired output.

You could use something like this to go from your version to the desired version of the dictionary.

new_dict = {v: key for key, val in mod_unmod.items() for v in val}
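To check this end to end, here is a minimal sketch that rebuilds the DataFrame from the question, shows that the original dict matches nothing, and then maps with the inverted dict. The name media_to_label is only illustrative, it is not from the question.

```python
import pandas as pd

mod_unmod = {"mod": ["yt", "fb", "ig"], "unmod": ["tik tok"]}
df = pd.DataFrame({
    "media_type": ["yt", "fb", "ig", "tik tok", "yt", "fb", "ig", "tik tok"],
    "budget": [1, 2, 3, 4, 5, 6, 1, 2],
})

# Original dict: its keys are the labels, so no media_type value is found.
all_missing = df["media_type"].map(mod_unmod).isna().all()
print(all_missing)

# Inverted dict: one entry per media type, pointing at its label.
media_to_label = {media: label
                  for label, medias in mod_unmod.items()
                  for media in medias}

df["model"] = df["media_type"].map(media_to_label)
print(df)
```

The inversion assumes every media type appears under exactly one label. If a value were listed under two labels, the later one would silently win.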

CodePudding user response：

Your dictionary maps each label ('mod', 'unmod') to a list of media types, but map needs the opposite: a dictionary from each media type to its label. So reverse the dictionary and then map.

df['model'] = df['media_type'].map({v: k for k in mod_unmod for v in mod_unmod[k]})
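The same approach should carry over to the three labels mentioned in the question, since the comprehension does not care how many keys the dict has. Below is a rough sketch. The item names ("a" to "e"), the column names and the "unknown" fallback are made up for illustration, only the three labels come from the question.

```python
import pandas as pd

# Hypothetical grouping: only the three labels come from the question.
status_groups = {
    "to test": ["a", "b"],
    "testing": ["c"],
    "untested": ["d"],
}
items = pd.DataFrame({"item": ["a", "c", "d", "b", "e"]})

# Reverse the dict: item -> label, regardless of how many labels there are.
item_to_status = {item: status
                  for status, group in status_groups.items()
                  for item in group}

# "e" is in no group, so map() returns NaN for it; fillna sets a fallback.
items["status"] = items["item"].map(item_to_status).fillna("unknown")
print(items)
```

The fillna step is optional. Leaving it out keeps NaN for values that are in none of the lists, which can be handy for spotting entries missing from the dict.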