Pandas: a Pythonic way to create a hyperlink from a value stored in another column of the dataframe

Time:09-28

I have the following toy dataset df:

import pandas as pd

data = {
        'id' : [1, 2, 3],
        'name' : ['John Smith', 'Sally Jones', 'William Lee']
       }

df = pd.DataFrame(data)
df

    id  name
0   1   John Smith
1   2   Sally Jones
2   3   William Lee

My ultimate goal is to add a column that represents a Google search of the value in the name column.

I do this using:

def create_hyperlink(search_string):
    return f'https://www.google.com/search?q={search_string}'

df['google_search'] = df['name'].apply(create_hyperlink)
df

    id   name           google_search
0   1    John Smith     https://www.google.com/search?q=John Smith
1   2    Sally Jones    https://www.google.com/search?q=Sally Jones
2   3    William Lee    https://www.google.com/search?q=William Lee

Unfortunately, the newly created google_search column contains a malformed URL. The URL should have a "+" between the first name and the last name instead of a space.

The google_search column should return the following:

https://www.google.com/search?q=John+Smith

It's possible to do this using split() and join().

df['foo'] = df['name'].str.split()
df['foo']

0     [John, Smith]
1    [Sally, Jones]
2    [William, Lee]
Name: foo, dtype: object

Now, joining them:

df['bar'] = ['+'.join(map(str, l)) for l in df['foo']]
df

   id         name                                google_search             foo          bar
0   1   John Smith   https://www.google.com/search?q=John Smith   [John, Smith]   John+Smith
1   2  Sally Jones  https://www.google.com/search?q=Sally Jones  [Sally, Jones]  Sally+Jones
2   3  William Lee  https://www.google.com/search?q=William Lee  [William, Lee]  William+Lee

Lastly, creating the updated google_search column:

df['google_search'] = df['bar'].apply(create_hyperlink)
df

Is there a more elegant, streamlined, Pythonic way to do this?

Thanks!

CodePudding user response:

Rather than reinventing the wheel and modifying the string manually, use the standard library function built for this. quote_plus from urllib.parse encodes spaces as "+" and percent-encodes other reserved characters:

from urllib.parse import quote_plus


def create_hyperlink(search_string):
    return f"https://www.google.com/search?q={quote_plus(search_string)}"
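As a quick check of what this does beyond swapping spaces, here is a minimal sketch using only the standard library. The third name is a made-up example (not from the question) chosen to show that characters such as "&" and "'" are escaped as well.

```python
from urllib.parse import quote_plus


def create_hyperlink(search_string):
    # quote_plus turns spaces into "+" and percent-encodes reserved characters
    return f"https://www.google.com/search?q={quote_plus(search_string)}"


# "O'Neil & Sons" is a hypothetical value, added only to show the escaping
names = ['John Smith', 'Sally Jones', "O'Neil & Sons"]
links = [create_hyperlink(name) for name in names]

for link in links:
    print(link)
# https://www.google.com/search?q=John+Smith
# https://www.google.com/search?q=Sally+Jones
# https://www.google.com/search?q=O%27Neil+%26+Sons
```

The function drops into the original code unchanged: df['google_search'] = df['name'].apply(create_hyperlink).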

CodePudding user response:

Use Series.str.replace:

df['google_search'] = 'https://www.google.com/search?q=' + \
    df.name.str.replace(' ', '+', regex=False)

print(df)

   id         name                                google_search
0   1   John Smith   https://www.google.com/search?q=John+Smith
1   2  Sally Jones  https://www.google.com/search?q=Sally+Jones
2   3  William Lee  https://www.google.com/search?q=William+Lee
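Note that str.replace only handles spaces. If the names might contain other reserved characters, one possible variation is to keep the string concatenation but map quote_plus over the column. This is a sketch rather than part of either answer; the base_url variable and the 'AT&T Inc' row are assumptions added for illustration.

```python
from urllib.parse import quote_plus

import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3],
    # 'AT&T Inc' is a hypothetical value, included to show "&" being escaped
    'name': ['John Smith', 'Sally Jones', 'AT&T Inc'],
})

base_url = 'https://www.google.com/search?q='  # illustrative name

# Encode each name, then prepend the base URL element-wise
df['google_search'] = base_url + df['name'].map(quote_plus)

print(df['google_search'].tolist())
```

For the third row this yields a query of AT%26T+Inc, where a plain space replacement would leave the "&" in place and cut the query short.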