     Adress             Rooms   m2  Price  Floor
196  Skanstes 29a           5  325   2800  24/24
12   Ausekļa 4              5  195   2660    7/7
7    Antonijas 17A          3   86   2200    6/6
31   Blaumaņa 16            4  136   1800    4/6
186  Rūpniecības 21k2       5  160   1700    7/7
233  Vesetas 24             4  133   1700  10/10
187  Rūpniecības 34         5  157   1600    3/6
91   Elizabetes 31а         8  203   1600    1/5
35   Blaumaņa 9             3   90   1600    3/5
60   Cēsu 9                 3  133   1550    6/7
I have a data set on which I want to test a theory: the higher the floor, the higher the property's rental price. The column dtypes are:
Adress object
Rooms int64
m2 int64
Price int64
Floor object
dtype: object
To be honest, I am stuck and not even sure how to start. Is there a way to loop through the Floor values and compare the first number to the second? For example, if 24 == 24, the row goes into a new category 'Top Floor'. I would also like to create 'mid-floor' and 'ground floor' categories.
I got this far:
df_sorted = df.sort_values("Price", ascending=False)
print(df_sorted.head(10))
for e in df_sorted['Floor']:
    parts = e.split('/')
    print(parts)
But the second part is not working:
if parts[0]==parts[-1]:
    return "Top Floor"
if parts[0]=="1":
    return "Bottom Floor"
else:
    "Mid Floor"
CodePudding user response:
If the floor is stored as a string you can use the following function:
def split_floors(floor):
    if floor.split('/')[0] == '1':
        return 'Bottom'
    if floor.split('/')[0] == floor.split('/')[1]:
        return 'Top Floor'
    else:
        return 'Mid Floor'
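To put the result into the data frame, the function can be passed to `apply` on the Floor column. This is a minimal sketch: the three-row `df` is rebuilt from rows in the question, and the column name `FloorCategory` is made up for illustration.

```python
import pandas as pd

def split_floors(floor):
    # Same logic as the function above: the bottom floor is checked first
    if floor.split('/')[0] == '1':
        return 'Bottom'
    if floor.split('/')[0] == floor.split('/')[1]:
        return 'Top Floor'
    else:
        return 'Mid Floor'

# Three rows taken from the question's table (Price and Floor columns only)
df = pd.DataFrame({
    "Price": [2800, 1800, 1600],
    "Floor": ["24/24", "4/6", "1/5"],
})

# "FloorCategory" is a hypothetical column name
df["FloorCategory"] = df["Floor"].apply(split_floors)
print(df)
```

Note that because the bottom-floor check comes first, a one-storey building ("1/1") would be labelled 'Bottom' rather than 'Top Floor'.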
CodePudding user response:
First solution, using three categories as suggested in the question, then grouping by category and comparing the mean price of each group as a simple check:
def floor_to_categories(floor_str):
    num1, num2 = floor_str.split("/")
    if num1 == num2: return "Top"
    elif num1 == "1": return "Bottom"
    return "Middle"

df["FloorCategories"] = df.Floor.apply(floor_to_categories)
df.groupby("FloorCategories").Price.mean()
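As a worked example, here is the same grouping run on the ten rows shown in the question. It is only a sketch: those rows are the ten most expensive listings, so the means below say little about the full data set.

```python
import pandas as pd

def floor_to_categories(floor_str):
    num1, num2 = floor_str.split("/")
    if num1 == num2:
        return "Top"
    elif num1 == "1":
        return "Bottom"
    return "Middle"

# Price and Floor columns of the ten rows posted in the question
df = pd.DataFrame({
    "Price": [2800, 2660, 2200, 1800, 1700, 1700, 1600, 1600, 1600, 1550],
    "Floor": ["24/24", "7/7", "6/6", "4/6", "7/7",
              "10/10", "3/6", "1/5", "3/5", "6/7"],
})

df["FloorCategories"] = df.Floor.apply(floor_to_categories)
means = df.groupby("FloorCategories").Price.mean()
print(means)
# Bottom    1600.0
# Middle    1637.5
# Top       2212.0
```

Adding `.agg(["mean", "count"])` instead of `.mean()` would also show how many listings fall into each group, which matters here because "Bottom" contains a single row.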
Second solution, continuous instead of discrete: convert the floor into a float between 0 and 1 (the floor number divided by the total number of floors), then compute the Pearson correlation between the price and the new floor value:
def floor_to_float(floor_str):
    num1, num2 = [float(num) for num in floor_str.split("/")]
    return num1 / num2

df["FloorFloat"] = df.Floor.apply(floor_to_float)
df[["Price", "FloorFloat"]].corr()
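The same ratio can also be computed without a Python-level function, using the pandas string methods. This is an alternative sketch, not part of the answer above; it assumes every Floor value has exactly the form "n/m", and the small `df` is rebuilt from rows in the question.

```python
import pandas as pd

# A few rows from the question's table (Price and Floor columns only)
df = pd.DataFrame({
    "Price": [2800, 1800, 1600, 1550],
    "Floor": ["24/24", "4/6", "1/5", "6/7"],
})

# Split "n/m" into two integer columns labelled 0 and 1
parts = df["Floor"].str.split("/", expand=True).astype(int)

# Relative height of the flat within its building, in (0, 1]
df["FloorFloat"] = parts[0] / parts[1]

# Series.corr uses the Pearson method by default
corr = df["Price"].corr(df["FloorFloat"])
print(df)
print(corr)
```

A coefficient near +1 would support the theory, a value near 0 would suggest no linear relationship, and with only a handful of rows the number should not be over-interpreted.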