Loop through decimal that is object type

               Adress  Rooms   m2  Price   Floor
196      Skanstes 29a      5  325   2800   24/24
12          Ausekļa 4      5  195   2660     7/7
7       Antonijas 17A      3   86   2200     6/6
31        Blaumaņa 16      4  136   1800     4/6
186  Rūpniecības 21k2      5  160   1700     7/7
233        Vesetas 24      4  133   1700   10/10
187    Rūpniecības 34      5  157   1600     3/6
91     Elizabetes 31а      8  203   1600     1/5
35         Blaumaņa 9      3   90   1600     3/5
60             Cēsu 9      3  133   1550     6/7

I have a data set on which I want to test the theory that the higher the floor, the more expensive the rent.

Adress    object
Rooms      int64
m2         int64
Price      int64
Floor     object
dtype: object

To be honest, I am stuck and not even sure how to start. Is there a way to loop through the Floor values and compare the first number to the second? For example, if 24 == 24, the row goes into a new category 'Top Floor'. I would like to create 'Mid Floor' and 'Ground Floor' categories as well.

I got this far:

df_sorted = df.sort_values("Price", ascending=False)
print(df_sorted.head(10))
for e in df_sorted['Floor']:
    parts = e.split('/')
    print(parts)

But the second part is not working:

if parts[0]==parts[-1]:
    return "Top Floor" if parts[0]=="1":
    return "Bottom Floor" else: "Mid Floor"

CodePudding user response：

If the floor is stored as a string, you can use the following function:

def split_floors(floor):
    # "4/6" means floor 4 of a 6-storey building; split once and reuse
    parts = floor.split('/')
    if parts[0] == '1':
        return 'Bottom'
    if parts[0] == parts[1]:
        return 'Top Floor'
    return 'Mid Floor'
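
To attach the labels to the data, the function can be applied to the Floor column. A minimal sketch, assuming pandas is installed: the three floor strings are copied from the table in the question, and FloorCategory is a made-up column name used only for illustration. The function is repeated so the snippet runs on its own.

```python
import pandas as pd

def split_floors(floor):
    # Same logic as the function above
    parts = floor.split('/')
    if parts[0] == '1':
        return 'Bottom'
    if parts[0] == parts[1]:
        return 'Top Floor'
    return 'Mid Floor'

# Three Floor values copied from the question's table
df = pd.DataFrame({"Floor": ["24/24", "4/6", "1/5"]})

# FloorCategory is a hypothetical column name chosen for this sketch
df["FloorCategory"] = df["Floor"].apply(split_floors)
print(df)
```

Because the '1' check comes first, a single-storey building ("1/1") is labelled 'Bottom' rather than 'Top Floor'.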

CodePudding user response：

First solution: use three categories, as suggested in the question, then group by category and compare the mean price of each group:

def floor_to_categories(floor_str):
    # "4/6" -> floor 4 of a 6-storey building
    num1, num2 = floor_str.split("/")
    if num1 == num2:
        return "Top"
    if num1 == "1":
        return "Bottom"
    return "Middle"

# Label every row, then compare the mean price per category
df["FloorCategories"] = df["Floor"].apply(floor_to_categories)
df.groupby("FloorCategories")["Price"].mean()
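
As a rough worked example, the same steps can be run on the ten rows printed in the question. This is only a sketch, assuming pandas is installed; the frame below holds just the Price and Floor columns, and the extra "count" column is my addition to show how many listings stand behind each mean.

```python
import pandas as pd

# Price and Floor columns of the ten rows shown in the question
df = pd.DataFrame({
    "Price": [2800, 2660, 2200, 1800, 1700, 1700, 1600, 1600, 1600, 1550],
    "Floor": ["24/24", "7/7", "6/6", "4/6", "7/7",
              "10/10", "3/6", "1/5", "3/5", "6/7"],
})

def floor_to_categories(floor_str):
    num1, num2 = floor_str.split("/")
    if num1 == num2:
        return "Top"
    if num1 == "1":
        return "Bottom"
    return "Middle"

df["FloorCategories"] = df["Floor"].apply(floor_to_categories)

# Mean price next to the number of listings behind each mean
summary = df.groupby("FloorCategories")["Price"].agg(["mean", "count"])
print(summary)
```

On these ten rows only one listing falls into "Bottom", so its mean is a single price; the comparison only becomes meaningful on the full data set.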

Second solution, continuous instead of discrete: convert the floor into a float between 0 and 1 (the floor number divided by the total number of floors), then compute the Pearson correlation between the price and the new float:

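def floor_to_float(floor_str):
    # "4/6" -> 4 / 6: position of the flat relative to the building height
    num1, num2 = (float(num) for num in floor_str.split("/"))
    return num1 / num2

df["FloorFloat"] = df["Floor"].apply(floor_to_float)
# DataFrame.corr() computes the Pearson correlation by default
df[["Price", "FloorFloat"]].corr()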
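
The same ratio can also be computed without a Python-level function by splitting the column with pandas string methods. This is an alternative sketch, not the approach used above, and it assumes pandas is installed and that every Floor value has the form "n/m". The data are the Price and Floor columns of the ten rows shown in the question.

```python
import pandas as pd

# Price and Floor columns of the ten rows shown in the question
df = pd.DataFrame({
    "Price": [2800, 2660, 2200, 1800, 1700, 1700, 1600, 1600, 1600, 1550],
    "Floor": ["24/24", "7/7", "6/6", "4/6", "7/7",
              "10/10", "3/6", "1/5", "3/5", "6/7"],
})

# Split "4/6" into two columns (labelled 0 and 1), then convert both to float
parts = df["Floor"].str.split("/", expand=True).astype(float)
df["FloorFloat"] = parts[0] / parts[1]

# Series.corr uses the Pearson method by default
r = df["Price"].corr(df["FloorFloat"])
print(round(r, 3))
```

The result is a single number between -1 and 1; a positive value means that, in this sample, flats higher up in their building tend to have higher prices.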