Can someone explain this please?

I have the following data:

0    Ground out of 2
1         1 out of 3
2         1 out of 3
Name: Floor, dtype: object

I want to modify this data so that I can create two columns named first floor and max floor.

Looking at the first item as an example:

0    Ground out of 2

the first floor would be 0 and max floor would be 2 etc...

This is the code I have written to extract the first floor items:

floor_location = []
lower_floors = ['Ground', 'Basement']

for data in df.Floor:
  for char in lower_floors:
     if char in data:
        floor_location.append('0')
  else:
     floor_location.append(data[:2])

When I do this, I get the following output:

['0', 'Gr', '1 ', '1 ']

I am expecting

['0', '1 ', '1 ']

Can someone explain where I am going wrong?

Thanks in advance.

CodePudding user response：

Your loop is written in the wrong order.
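To make that concrete, here is a small sketch that reproduces the reported output. It assumes the else lines up with the inner for (which is what the output in the question suggests), and it uses a plain list named floors as a stand-in for df.Floor so it runs without pandas:

```python
# Plain list standing in for df.Floor (an assumption for this sketch).
floors = ["Ground out of 2", "1 out of 3", "1 out of 3"]
lower_floors = ["Ground", "Basement"]

floor_location = []
for data in floors:
    for char in lower_floors:
        if char in data:
            floor_location.append("0")
    else:
        # for/else: this branch runs whenever the inner loop finishes
        # without hitting `break`, so it runs for every row,
        # including the "Ground" row that already appended "0".
        floor_location.append(data[:2])

print(floor_location)  # ['0', 'Gr', '1 ', '1 ']
```

The extra 'Gr' comes from the else running after the inner loop, not instead of the if.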

But anyway, don't use a loop; use vectorized string extraction and fillna instead:

df['Floor'].str.extract(r'^(\d+)', expand=False).fillna(0).astype(int)

Or for more flexibility (Ground -> 0 ; Basement -> -1…):

(df['Floor'].str.extract(r'^(\w+)', expand=False)
            .replace({'Ground': 0, 'Basement': -1})
            .astype(int)
)

output:

0    0
1    1
2    1
Name: Floor, dtype: int64

As list:

df['Floor'].str.extract(r'^(\d+)', expand=False).fillna(0).astype(int).tolist()

output: [0, 1, 1]
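Since the question asks for two columns, the same extraction idea can be stretched to pull out both values in one pass. This is only a sketch: it assumes every cell follows the "<floor> out of <max>" pattern, the column names first_floor and max_floor are illustrative, and the Basement -> -1 mapping is borrowed from the snippet above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Floor": ["Ground out of 2", "1 out of 3", "1 out of 3"]})

# Two capture groups: the leading token and the number after "out of".
parts = df["Floor"].str.extract(r"^(\w+)\s+out of\s+(\d+)$")

# Map the named floors to numeric strings first, then convert once.
df["first_floor"] = (
    parts[0].replace({"Ground": "0", "Basement": "-1"}).astype(int)
)
df["max_floor"] = parts[1].astype(int)

print(df[["first_floor", "max_floor"]])
```

Rows that do not match the pattern would come back as NaN and make astype(int) fail, so real data may need a fillna or a check before the conversion.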

CodePudding user response：

First of all, the indentation of the else branch is wrong. It should be:

floor_location = []
lower_floors = ['Ground', 'Basement']

for data in df.Floor:
  for char in lower_floors:
     if char in data:
        floor_location.append('0')
     else:
        floor_location.append(data[:2])

And second, because you are looping over the Floor column, data is a single cell (a string), not a row. So data[:2] just takes the first two characters of that string. This is why you see Gr.
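One caveat with the loop above: with the else attached to the if inside the inner loop, something is appended once per keyword, so a cell like "1 out of 3" would be added twice and the Ground row would still pick up 'Gr' when it is compared against 'Basement'. A sketch of one way to get exactly one entry per row is to decide with any() before appending. It uses a plain list named floors as a stand-in for df.Floor, and takes the first whitespace-separated token instead of data[:2], which is my own substitution:

```python
# Plain list standing in for df.Floor (an assumption for this sketch).
floors = ["Ground out of 2", "1 out of 3", "1 out of 3"]
lower_floors = ["Ground", "Basement"]

floor_location = []
for data in floors:
    # Check all keywords first, then append exactly once per row.
    if any(name in data for name in lower_floors):
        floor_location.append("0")
    else:
        # First token rather than data[:2]: no trailing space,
        # and floors with two or more digits stay intact.
        floor_location.append(data.split()[0])

print(floor_location)  # ['0', '1', '1']
```

The result differs from the expected output in the question only by the trailing space, which data[:2] would have kept.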