Home > Enterprise >  Pandas df how to change specific columns to remove list from a single-element list
Pandas df how to change specific columns to remove list from a single-element list

Time:10-19

I have a pandas data frame with 5 columns:

    ID             Title         Sponsor name       Date       Comparator 

0   [NT0235]     [bla bla bla]  [sponsor name1]  2021-08-13    [comparator1]
1   [NT0236]     [bla bla bla]  [sponsor name2]  2021-08-12    [comparator2]
2   [NT0237]     [bla bla bla]  [sponsor name3]  2021-08-11    [comparator3]
3   [NT0238]     [bla bla bla]  [sponsor name4]  2021-08-10    [comparator4]
4   [NT0239]     [bla bla bla]  [sponsor name5]  2021-08-09    [comparator5]

I need to remove the lists from columns: ID, Title, Sponsor name and Comparator since they are all single element lists (in case they are not, the elements can be concatenated using a space as a delimeter). How do I do this? Thanks!

This is the df:

import pandas as pd
    
data = {'ID': [['NT0235'],['NT0236'],['NT0237'],['NT0238'],['NT0239']],'Title': [['bla bla bla'] , ['bla bla bla'] , ['bla bla bla'], ['bla bla bla'] , ['bla bla bla']] , 'Sponsor name': [['sponsor name1'], ['sponsor name2'], ['sponsor name3'], ['sponsor name4'], ['sponsor name5']], 'Date': ['2021-08-13','2021-08-12','2021-08-11','2021-08-10','2021-08-09'], 'Comparator': [['comparator1'], ['comparator2'], ['comparator3'], ['comparator4'], ['comparator5']]
}
    
df = pd.DataFrame(data)

What I need is this df:

    ID             Title         Sponsor name       Date       Comparator 

0   NT0235     bla bla bla     sponsor name1     2021-08-13    comparator1
1   NT0236     bla bla bla     sponsor name2     2021-08-12    comparator2
2   NT0237     bla bla bla     sponsor name3     2021-08-11    comparator3
3   NT0238     bla bla bla     sponsor name4     2021-08-10    comparator4
4   NT0239     bla bla bla     sponsor name5     2021-08-09    comparator5

CodePudding user response:

Use:

df[['ID', 'Title', 'Sponsor name', 'Comparator']] = df[['ID', 'Title', 'Sponsor name', 'Comparator']].apply(lambda x: x.explode())

OUTPUT:

       ID        Title   Sponsor name        Date   Comparator
0  NT0235  bla bla bla  sponsor name1  2021-08-13  comparator1
1  NT0236  bla bla bla  sponsor name2  2021-08-12  comparator2
2  NT0237  bla bla bla  sponsor name3  2021-08-11  comparator3
3  NT0238  bla bla bla  sponsor name4  2021-08-10  comparator4
4  NT0239  bla bla bla  sponsor name5  2021-08-09  comparator5

CodePudding user response:

For this simple case, you can use DataFrame.apply() to apply pd.Series.explode on each column, as follows:

df = df.apply(pd.Series.explode)

Result:

print(df)


       ID        Title   Sponsor name        date   comparator
0  NT0235  bla bla bla  sponsor name1  2021-08-13  comparator1
1  NT0236  bla bla bla  sponsor name2  2021-08-12  comparator2
2  NT0237  bla bla bla  sponsor name3  2021-08-11  comparator3
3  NT0238  bla bla bla  sponsor name4  2021-08-10  comparator4
4  NT0239  bla bla bla  sponsor name5  2021-08-09  comparator5

If you want to apply only on the selected columns, you can use:

df[['ID', 'Title', 'Sponsor name', 'Comparator']] = df[['ID', 'Title', 'Sponsor name', 'Comparator']].apply(pd.Series.explode)

CodePudding user response:

You can join values by spaces if there are lists (and strings inside) - it working for one element and also multiple elements lists:

df = df.applymap(lambda x: ' '.join(x) if isinstance(x, list) else x)

If possible some numeric values:

df = df.applymap(lambda x: ' '.join(map(str, x)) if isinstance(x, list) else x)
  • Related