I have a pandas data frame with 5 columns:
ID Title Sponsor name Date Comparator
0 [NT0235] [bla bla bla] [sponsor name1] 2021-08-13 [comparator1]
1 [NT0236] [bla bla bla] [sponsor name2] 2021-08-12 [comparator2]
2 [NT0237] [bla bla bla] [sponsor name3] 2021-08-11 [comparator3]
3 [NT0238] [bla bla bla] [sponsor name4] 2021-08-10 [comparator4]
4 [NT0239] [bla bla bla] [sponsor name5] 2021-08-09 [comparator5]
I need to remove the lists from columns: ID, Title, Sponsor name and Comparator since they are all single element lists (in case they are not, the elements can be concatenated using a space as a delimeter). How do I do this? Thanks!
This is the df:
import pandas as pd
data = {'ID': [['NT0235'],['NT0236'],['NT0237'],['NT0238'],['NT0239']],'Title': [['bla bla bla'] , ['bla bla bla'] , ['bla bla bla'], ['bla bla bla'] , ['bla bla bla']] , 'Sponsor name': [['sponsor name1'], ['sponsor name2'], ['sponsor name3'], ['sponsor name4'], ['sponsor name5']], 'Date': ['2021-08-13','2021-08-12','2021-08-11','2021-08-10','2021-08-09'], 'Comparator': [['comparator1'], ['comparator2'], ['comparator3'], ['comparator4'], ['comparator5']]
}
df = pd.DataFrame(data)
What I need is this df:
ID Title Sponsor name Date Comparator
0 NT0235 bla bla bla sponsor name1 2021-08-13 comparator1
1 NT0236 bla bla bla sponsor name2 2021-08-12 comparator2
2 NT0237 bla bla bla sponsor name3 2021-08-11 comparator3
3 NT0238 bla bla bla sponsor name4 2021-08-10 comparator4
4 NT0239 bla bla bla sponsor name5 2021-08-09 comparator5
CodePudding user response:
Use:
df[['ID', 'Title', 'Sponsor name', 'Comparator']] = df[['ID', 'Title', 'Sponsor name', 'Comparator']].apply(lambda x: x.explode())
OUTPUT:
ID Title Sponsor name Date Comparator
0 NT0235 bla bla bla sponsor name1 2021-08-13 comparator1
1 NT0236 bla bla bla sponsor name2 2021-08-12 comparator2
2 NT0237 bla bla bla sponsor name3 2021-08-11 comparator3
3 NT0238 bla bla bla sponsor name4 2021-08-10 comparator4
4 NT0239 bla bla bla sponsor name5 2021-08-09 comparator5
CodePudding user response:
For this simple case, you can use DataFrame.apply()
to apply pd.Series.explode
on each column, as follows:
df = df.apply(pd.Series.explode)
Result:
print(df)
ID Title Sponsor name date comparator
0 NT0235 bla bla bla sponsor name1 2021-08-13 comparator1
1 NT0236 bla bla bla sponsor name2 2021-08-12 comparator2
2 NT0237 bla bla bla sponsor name3 2021-08-11 comparator3
3 NT0238 bla bla bla sponsor name4 2021-08-10 comparator4
4 NT0239 bla bla bla sponsor name5 2021-08-09 comparator5
If you want to apply only on the selected columns, you can use:
df[['ID', 'Title', 'Sponsor name', 'Comparator']] = df[['ID', 'Title', 'Sponsor name', 'Comparator']].apply(pd.Series.explode)
CodePudding user response:
You can join values by spaces if there are list
s (and strings inside) - it working for one element and also multiple elements lists:
df = df.applymap(lambda x: ' '.join(x) if isinstance(x, list) else x)
If possible some numeric values:
df = df.applymap(lambda x: ' '.join(map(str, x)) if isinstance(x, list) else x)