Home > Net >  Replace text in url in pandas dataframe
Replace text in url in pandas dataframe

Time:01-05

In my dataframe i have a links with utm parameters:

utm_content=keys_{gbid}|cid|{campaign_id}|aid|{keyword}|{phrase_id}|src&utm_term={keyword}

Also in dataframe i have sevral columns with id - CampaignId, AdGroupId, Keyword, Keyword ID

And I need to replace the values in curly brackets in the link with the values from these columns

For exmaple i need to replace {campaign_id} with values from CampaignId colums. And do this for each value in the link

The result should be like this -

utm_content=keys_3745473327|cid|31757442|aid|CRM|38372916231|src&utm_term=CRM

CodePudding user response:

You can try this:

import pandas as pd
import numpy as np

# create some sample data
df = pd.DataFrame(columns=['CampaignId', 'AdGroupId', 'Keyword', 'Keyword ID'],
                  data=np.random.randint(low=0, high=100, size=(10, 4)))

df['url'] = 'utm_content=keys_{gbid}|cid|{campaign_id}|aid|{keyword}|{phrase_id}|src&utm_term={keyword}'
df

Output:

    CampaignId  AdGroupId   Keyword Keyword ID  url
0   21  13  26  41  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...
1   28  9   19  3   utm_content=keys_{gbid}|cid|{campaign_id}|aid|...
2   11  17  37  43  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...
3   25  13  17  54  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...
4   32  19  17  48  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...
5   26  92  80  90  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...
6   25  17  1   54  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...
7   81  7   68  85  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...
8   75  55  37  56  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...
9   14  53  34  84  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...

And then write a custom function to replace your variables with f string and apply it to the dataframe creating a new column (You can also replace with the url column if you want):

def fill_link(CampaignId, AdGroupId, Keyword, KeywordID, url):
    campaign_id = CampaignId
    keyword = Keyword
    gbid = AdGroupId
    phrase_id = KeywordID
    return eval("f'"   f"{url}"   "'")
    
df['url_filled'] = df.apply(lambda row: fill_link(row['CampaignId'], row['AdGroupId'], row['Keyword'], row['Keyword ID'], row['url']), axis=1)

df

    CampaignId  AdGroupId   Keyword Keyword ID  url url_filled
0   21  13  26  41  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...   utm_content=keys_13|cid|21|aid|26|41|src&utm_t...
1   28  9   19  3   utm_content=keys_{gbid}|cid|{campaign_id}|aid|...   utm_content=keys_9|cid|28|aid|19|3|src&utm_ter...
2   11  17  37  43  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...   utm_content=keys_17|cid|11|aid|37|43|src&utm_t...
3   25  13  17  54  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...   utm_content=keys_13|cid|25|aid|17|54|src&utm_t...
4   32  19  17  48  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...   utm_content=keys_19|cid|32|aid|17|48|src&utm_t...
5   26  92  80  90  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...   utm_content=keys_92|cid|26|aid|80|90|src&utm_t...
6   25  17  1   54  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...   utm_content=keys_17|cid|25|aid|1|54|src&utm_te...
7   81  7   68  85  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...   utm_content=keys_7|cid|81|aid|68|85|src&utm_te...
8   75  55  37  56  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...   utm_content=keys_55|cid|75|aid|37|56|src&utm_t...
9   14  53  34  84  utm_content=keys_{gbid}|cid|{campaign_id}|aid|...   utm_content=keys_53|cid|14|aid|34|84|src&utm_t...

I am not sure if the name of your variables are correctly assigned as they are not exactly the same. But it shouldn't be a problem for you to replace them as you wish.

  • Related