Cross dataframe with dictionary

I have the following dictionaries inside variables:

sk_channel_types = {"facebooknotification": 2,
                    "facebookmessenger": 9,
                    "onsitenotification": 3,
                    "pushnotification": 6,
                    "pushnotificationmessage": 6,
                    "lightbox": 4,
                    "onsitemessage": 7,
                    "mailmessage": 1}

sk_story_types = {"welcome": 7,
                  "rescue": 13,
                  "frequency": 4,
                  "abandoncart": 6,
                  "pricedrop": 16,
                  "manual": 5,
                  "searchbykeyword": 30,
                  "sazonality": 31,
                  "bestdayforpurchase": 28,
                  "pricechange": 32,
                  "availability": 33,
                  "toptrending": 1,
                  "toptrendingbycluster": 2,
                  "toptrendingwithpricelimit": 3,
                  "frequencyview": 4,
                  "manualnotification": 5,
                  "trending": 9,
                  "toptrendingbykeyword": 9}

And this is my current Spark DataFrame:

ID	StoryType	Type	StoryId
abcdefghijklmnopqrst	AbandonCart	MailMessage	56465465456456456465
lçdkçlskdçlsdkçlskdç	ManualNotification	MailMessage	60983099380938390833
uahuahuahauhauahuaha	ManualNotification	MailMessage	49438093890484984949
sklçskçlskdkcnopeieo	ManualNotification	MailMessage	93084098409840984098
2d5fe941380938098948	ManualNotification	MailMessage	49809380398094894844
9883jkjd3eu0dj0j3930	ManualNotification	MailMessage	636f50c9380938093893

I need to replace the StoryType and Type columns with their respective numbers, as per the variables, like this:

ID	StoryType	Type	StoryId
abcdefghijklmnopqrst	6	1	56465465456456456465
lçdkçlskdçlsdkçlskdç	5	1	60983099380938390833
uahuahuahauhauahuaha	5	1	49438093890484984949
sklçskçlskdkcnopeieo	5	1	93084098409840984098
2d5fe941380938098948	5	1	49809380398094894844
9883jkjd3eu0dj0j3930	5	1	636f50c9380938093893

How can I do this? Can I use a CASE expression with lower()? I'm new to PySpark.

CodePudding user response：

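Since the dictionaries are small, an efficient approach is to turn each one into a small DataFrame, broadcast it, and join it to your DataFrame. Note that the dictionary keys are all lowercase while the column values are mixed case (for example AbandonCart), so lowercase the column before matching.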
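
A minimal sketch of that idea, assuming you already have a SparkSession called spark and the DataFrame from the question in df. The helper names mapping_rows and replace_with_codes and the temporary columns _key and _code are made up for illustration; they are not part of PySpark.

```python
def mapping_rows(mapping):
    """Turn a {name: code} dict into (name, code) tuples with lowercase names."""
    return [(name.lower(), code) for name, code in mapping.items()]


def replace_with_codes(spark, df, column, mapping):
    """Replace the strings in `column` with their numeric codes via a broadcast join.

    Hypothetical helper: `spark` is an existing SparkSession, `df` a DataFrame
    that contains `column`, and `mapping` one of the dicts from the question.
    """
    # Imported here so mapping_rows can be used without Spark installed.
    from pyspark.sql import functions as F

    # Small lookup DataFrame built from the dictionary.
    lookup = spark.createDataFrame(mapping_rows(mapping), ["_key", "_code"])

    # Broadcast hint on the small side; compare case-insensitively because
    # the dict keys are lowercase and the column values are mixed case.
    joined = df.join(
        F.broadcast(lookup),
        F.lower(F.col(column)) == F.col("_key"),
        "left",  # values missing from the dict become null instead of being dropped
    )

    # Overwrite the original column with the code, then drop the helper columns.
    return joined.withColumn(column, F.col("_code")).drop("_key", "_code")
```

You would then call it once per column, for example df = replace_with_codes(spark, df, "StoryType", sk_story_types) followed by df = replace_with_codes(spark, df, "Type", sk_channel_types). The left join is a deliberate choice here: a value that is not in the dictionary ends up as null, so you can spot it, rather than silently losing the row.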