I have multiple variable products in my CSV. Suppose I have a product titled "Car model145", and this product has three different prices and colors. I want to expand the price and color values into separate rows, repeating the title on each row. Here is my data frame:
          title                   price             color          image
0  Car model145  2,54.00,852.00,2532.00  black,white,blue  car iamge url
# three different prices
I also have a problem with the price column: how do I remove the first comma, after the 2, so that I can split the price values properly? I also don't want to expand the image column. The result should look like this:
          title    price  color          image
0  Car model145   254.00  black  car iamge url
1  Car model145   852.00  white
2  Car model145  2532.00   blue
CodePudding user response:
Something confusing is the extra price (2,). Do you have this for all prices? You first need to get rid of it. Then you can simply apply str.split and explode:
(df.assign(price=df['price'].str.replace(',', '', n=1))  # remove only the first comma
   .apply(lambda s: s.str.split(',').explode())  # split each column on commas, one row per value
   .assign(image=lambda d: d['image'].mask(d['image'].duplicated(), ''))  # blank repeated image values
   .reset_index(drop=True)
 # .to_csv('filename.csv')  # uncomment to save output as csv
)
output:
          title    price  color          image
0  Car model145   254.00  black  car iamge url
1  Car model145   852.00  white
2  Car model145  2532.00   blue
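For reference, here is a self-contained sketch of a slightly different route to the same result. It is not the code above: it assumes pandas 1.3 or later, where DataFrame.explode accepts a list of columns, and it only splits price and color, leaving title and image untouched. The sample frame is rebuilt from the question, and the helper name expand_variants is made up for illustration. It also assumes every row has the same number of prices and colors, otherwise explode raises an error.

```python
import pandas as pd

# Sample data copied from the question (one product, three variants).
df = pd.DataFrame({
    "title": ["Car model145"],
    "price": ["2,54.00,852.00,2532.00"],
    "color": ["black,white,blue"],
    "image": ["car iamge url"],
})


def expand_variants(frame):
    """Hypothetical helper: one output row per price/color pair."""
    out = frame.assign(
        # Drop only the first comma ("2,54.00" -> "254.00"), then split into a list.
        price=frame["price"].str.replace(",", "", n=1, regex=False).str.split(","),
        color=frame["color"].str.split(","),
    )
    # Explode both list columns together; title and image are repeated.
    out = out.explode(["price", "color"])
    # Rows that share an original index belong to the same product;
    # remember which one comes first before the index is reset.
    first_row = ~out.index.duplicated()
    out = out.reset_index(drop=True)
    # Keep the image only on the first row of each product.
    out["image"] = out["image"].where(first_row, "")
    return out


result = expand_variants(df)
print(result)
```

The price values stay strings here; if numbers are needed, result["price"].astype(float) can be applied afterwards.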