Creating new pandas columns from substrings in a list

Time:05-14

I have data loaded from a CSV, in a column called 'Features', which is of this form:

0      [Shops: Close by, Passing trade: Yes]
1      [Lift: Yes, No of Bedrooms: 1, Bedroom 1 Dims:...
2      [Lift: Yes, No of Bedrooms: 2, Bedroom 1 Dims:...
3      [No of Bedrooms: 4, Bedroom 1 Dims: 4.80 x 5.0...
4      [Finish: Excellent, Airconditioning: Yes, Shop...
...

and would like to create a new pandas column for the number of bedrooms:

0      [N/A]
1      [1]
2      [2]
3      [4]
4      [N/A]
...

I have tried something like this in Python:

csvname['No of Bedrooms'] = [s for s in csvname['Features'] if 'No of Bedrooms' in s]

This did not work. Is there an easy way of doing this? Any help would be greatly appreciated.

CodePudding user response:

You can try .str.extract

csvname['No of Bedrooms'] = csvname['Features'].astype(str).str.extract(r'No of Bedrooms: (\d+)')
print(csvname)

                                            Features No of Bedrooms
0              [Shops: Close by, Passing trade: Yes]            NaN
1  [Lift: Yes, No of Bedrooms: 1, Bedroom 1 Dims:...              1
2  [Lift: Yes, No of Bedrooms: 2, Bedroom 1 Dims:...              2
3  [No of Bedrooms: 4, Bedroom 1 Dims: 4.80 x 5.0...              4
4  [Finish: Excellent, Airconditioning: Yes, Shop...            NaN
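
For anyone who wants to try this end to end, here is a minimal, self-contained sketch. The sample rows are hypothetical stand-ins, since the real 'Features' values are truncated in the question, and the conversion to a nullable integer column is an optional extra that the question did not ask for. It also shows why the list comprehension in the question likely fails: filtering yields fewer items than there are rows, so the result cannot be assigned as a column.

```python
import pandas as pd

# Hypothetical sample rows; the real 'Features' values are truncated in the question.
csvname = pd.DataFrame({
    "Features": [
        "[Shops: Close by, Passing trade: Yes]",
        "[Lift: Yes, No of Bedrooms: 1]",
        "[Lift: Yes, No of Bedrooms: 2]",
        "[No of Bedrooms: 4]",
        "[Finish: Excellent, Airconditioning: Yes]",
    ]
})

# The comprehension from the question filters rows instead of mapping them,
# so it returns 3 items for a 5-row frame and cannot be assigned as a column.
filtered = [s for s in csvname["Features"] if "No of Bedrooms" in s]

# expand=False returns a Series when the pattern has a single capture group.
bedrooms = (
    csvname["Features"]
    .astype(str)
    .str.extract(r"No of Bedrooms: (\d+)", expand=False)
)

# Optional: convert the extracted digits to numbers. Rows without a match
# stay missing, and the nullable Int64 dtype keeps the rest as integers.
csvname["No of Bedrooms"] = pd.to_numeric(bedrooms).astype("Int64")

print(csvname)
```

If you prefer the literal 'N/A' shown in the question over a missing value, calling .fillna('N/A') on the extracted strings (before any numeric conversion) should do it.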