I have a DataFrame:
import pandas as pd
df = pd.DataFrame({
'day': ['11', '12'],
'City': ['[(Mumbai, 1),(Bangalore, 2)]', '[(Pune, 3),(Mumbai, 4),(Delh, 5)]']
})
  day                               City
0  11       [(Mumbai, 1),(Bangalore, 2)]
1  12  [(Pune, 3),(Mumbai, 4),(Delh, 5)]
I want to explode the City column, but when I do, nothing changes:
df2 = df.explode('City')
The output I want:
  day            City
0  11     (Mumbai, 1)
1  11  (Bangalore, 2)
2  12       (Pune, 3)
3  12     (Mumbai, 4)
4  12       (Delh, 5)
CodePudding user response:
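You can't explode strings: explode only expands list-like values and leaves scalars, including strings, unchanged. You need to find a way to convert the strings to lists first.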
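A quick check, as a minimal sketch using the DataFrame from the question, suggests why nothing changes: every cell in City is a str, so explode has nothing to expand.

```python
import pandas as pd

df = pd.DataFrame({
    'day': ['11', '12'],
    'City': ['[(Mumbai, 1),(Bangalore, 2)]', '[(Pune, 3),(Mumbai, 4),(Delh, 5)]'],
})

# Each cell is a plain string that merely looks like a list
cell_types = {type(value).__name__ for value in df['City']}
print(cell_types)

# explode leaves scalar values untouched, so the row count stays the same
exploded = df.explode('City')
print(len(df), len(exploded))
```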
Assuming the city names contain only letters (or spaces), you could use a regex to add the quotes and then convert to lists with ast.literal_eval:
from ast import literal_eval
df['City'] = (df['City']
.str.replace(r'([a-zA-Z ]+),', r'"\1",', regex=True)
.apply(literal_eval)
)
df2 = df.explode('City', ignore_index=True)
output:
  day            City
0  11     (Mumbai, 1)
1  11  (Bangalore, 2)
2  12       (Pune, 3)
3  12     (Mumbai, 4)
4  12       (Delh, 5)
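If the city names might contain characters other than letters and spaces, one alternative is to extract the pairs directly with a regex instead of going through literal_eval. This is only a sketch: it assumes every entry has the form (name, integer), and the names PAIR and parse_pairs are hypothetical helpers, not part of pandas.

```python
import re
import pandas as pd

df = pd.DataFrame({
    'day': ['11', '12'],
    'City': ['[(Mumbai, 1),(Bangalore, 2)]', '[(Pune, 3),(Mumbai, 4),(Delh, 5)]'],
})

# Hypothetical pattern: "(", a name without commas or parentheses, ",", an integer, ")"
PAIR = re.compile(r'\(\s*([^,()]+?)\s*,\s*(-?\d+)\s*\)')

def parse_pairs(cell):
    """Turn one string cell into a list of (name, number) tuples."""
    return [(name, int(number)) for name, number in PAIR.findall(cell)]

df['City'] = df['City'].apply(parse_pairs)
df2 = df.explode('City', ignore_index=True)
print(df2)
```

Unlike the quoting approach, this yields the number as an int without evaluating the string, but it silently skips any entry that does not match the assumed shape.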
CodePudding user response:
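You have to process the string and convert it to a list before exploding:
import re
import pandas as pd
df = pd.DataFrame({
    'day': ['11', '12'],
    'City': ['[(Mumbai, 1),(Bangalore, 2)]', '[(Pune, 3),(Mumbai, 4),(Delh, 5)]']
})
# Mark the boundary between tuples with "-" (this breaks if a city name contains a hyphen)
df['City'] = [re.sub(r"\),\(", ")-(", x) for x in df['City']]
# Strip the square brackets and parentheses
df['City'] = [re.sub(r"\[|\]|\(|\)", "", x) for x in df['City']]
# Split on the marker to get one list per row
df['City'] = [x.split("-") for x in df['City']]
df2 = df.explode('City').reset_index(drop=True)
Output (each City value is now a plain string, not a tuple):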
  day          City
0  11     Mumbai, 1
1  11  Bangalore, 2
2  12       Pune, 3
3  12     Mumbai, 4
4  12       Delh, 5
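Because this approach leaves strings such as "Mumbai, 1" rather than tuples, a possible follow-up is to split each value into a name and a number. The sketch below rebuilds the exploded frame by hand as a stand-in, and the column name count is an assumption chosen for illustration.

```python
import pandas as pd

# Stand-in for the exploded result shown above
df2 = pd.DataFrame({
    'day': ['11', '11', '12', '12', '12'],
    'City': ['Mumbai, 1', 'Bangalore, 2', 'Pune, 3', 'Mumbai, 4', 'Delh, 5'],
})

# Split on the last ", " so the name and the number land in separate columns
parts = df2['City'].str.rsplit(', ', n=1, expand=True)
df2['City'] = parts[0]
df2['count'] = parts[1].astype(int)  # 'count' is a hypothetical column name
print(df2)
```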