I want to generate combinations for the following 3 lists such that beds >= baths >= cars, and the combinations are then projected into 3 dataframe columns. Could anyone suggest a method to achieve this? Thanks!
beds = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
baths = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
cars = [0, 1, 2, 3, 4, 5, 6, 7, 8]
Example
nbed nbath ncar
0 1.0 1.0 0
1 1.0 1.0 1
2 2.0 1.0 0
3 2.0 2.0 0
4 2.0 1.0 1
5 2.0 2.0 1
6 2.0 1.0 2
7 2.0 2.0 2
8 3.0 1.0 0
9 3.0 2.0 0
10 3.0 3.0 0
11 3.0 1.0 1
12 3.0 2.0 1
13 3.0 3.0 1
14 3.0 1.0 2
15 3.0 2.0 2
16 3.0 3.0 2
17 3.0 1.0 3
18 3.0 2.0 3
19 3.0 3.0 3
CodePudding user response:
You can do a simple join and then query:
df1,df2,df3 = [pd.DataFrame({name:lst} ) for lst,name in zip([beds,baths,cars], ['bed', 'bath', 'beyond'])]
df_123 = df1.join(df2, how='cross').join(df3, how='cross')
df_123.query("bed >= bath >= beyond").reset_index(drop=True)
# bed bath beyond
#0 1.0 1.0 0
#1 1.0 1.0 1
#2 2.0 1.0 0
#3 2.0 1.0 1
#4 2.0 2.0 0
#.. ... ... ...
#151 8.0 8.0 4
#152 8.0 8.0 5
#153 8.0 8.0 6
#154 8.0 8.0 7
#155 8.0 8.0 8
CodePudding user response:
One option is with conditional_join from pyjanitor and it is usually more efficient than a cartesian join for such inequality (non-equi) joins:
# pip install pyjanitor
import janitor
import pandas as pd
# convert them to integer type,
# to make all data types uniform
beds = pd.DataFrame({'beds': beds}, dtype = int)
cars = pd.Series(cars, name = 'cars', dtype = int)
baths = pd.Series(baths, name = 'baths', dtype = int)
(beds
.conditional_join(
baths,
("beds", "baths", ">="))
.conditional_join(
cars,
("baths", "cars", ">="))
)
beds baths cars
0 1 1 0
1 1 1 1
2 2 1 0
3 2 1 1
4 2 2 0
.. ... ... ...
151 8 8 4
152 8 8 5
153 8 8 6
154 8 8 7
155 8 8 8
[156 rows x 3 columns]