Fill dataframe with multiple ranges from nested for loops

I'm doing some calculations for building boxes (yeah, boxes to put stuff in). What I do is take as input the box dimensions, wall thickness, lid thickness, and other parameters and do the math to get my materials breakdown.

I started out with a function, then took it a step further and replaced the function with dataframe math, so I could calculate several boxes at once. And then I got more and more ideas, and at the end here I am trying to calculate all possible boxes in a certain range of dimensions and material combinations.

The problem I have is I'm trying to fill my dataframe with all the necessary input values. For this I'm using nested for loops.

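import pandas as pd
df = pd.DataFrame(columns=['width', 'length', 'height'])
for width in range(50, 150, 5):
    for length in range(50, 150, 5):
        for height in range(50, 150, 5):
            # append one row per combination (df.loc is just one way to do it)
            df.loc[len(df)] = [width, length, height]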

As the ranges get bigger, the dataframe gets huge, and I end up waiting hours for the for loops to finish and write my input csv, just to do 30 seconds of df processing and get my results (there are a few more nested for loops than the ones shown).

The question: is a nested for loop the best way to fill in the data in a case like this, where you have to sweep a full range and generate all combinations of several variables? Or is there a more efficient way to fill the dataframe that doesn't take so long?

CodePudding user response：

I think you may be able to get away with doing a cross join on a dataframe with your dimensions.

import pandas as pd

df = pd.DataFrame(list(range(50, 150, 5)), columns=['dim'])
# this creates a dataframe with a single column called dim that holds the range
df.merge(df, how='cross')
# this gives the Cartesian product, that is, all combinations of the range in the
# first column and the range in the second column. df has 20 rows, and after the
# merge the result has 400 rows (how='cross' needs pandas 1.2 or newer)
df.merge(df, how='cross').merge(df, how='cross')
# this takes the Cartesian product of that result with df once more, which gives
# 8000 rows (20 x 20 x 20)

Result:

   dim_x  dim_y  dim
0     50     50   50
1     50     50   55
2     50     50   60
3     50     50   65
4     50     50   70
...

In your example, it looks like length, width, and height all sweep the same range, but if the ranges differ, just make three different starting dataframes and cross merge them.
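
For the case where each dimension has its own range, a minimal sketch might look like the following. The three ranges and the variable names (widths, lengths, heights, boxes) are made up for illustration and are not from the question. Because each starting dataframe has a distinct column name, the merged result needs no _x/_y suffixes.

```python
import pandas as pd

# Hypothetical ranges, chosen only for illustration.
widths = pd.DataFrame({"width": range(50, 150, 5)})     # 20 values
lengths = pd.DataFrame({"length": range(60, 200, 10)})  # 14 values
heights = pd.DataFrame({"height": range(30, 90, 15)})   # 4 values

# Chain two cross merges to get every width/length/height combination.
# how="cross" needs pandas 1.2 or newer.
boxes = widths.merge(lengths, how="cross").merge(heights, how="cross")

print(boxes.shape)  # 20 * 14 * 4 rows, 3 columns
```

The row count is the product of the three range lengths, so it is worth checking that number before adding more dimensions.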

CodePudding user response：

You can use itertools.product.

import pandas as pd
import numpy as np
import itertools

df = pd.DataFrame(itertools.product(np.arange(50,150,5), 
                                    np.arange(50,150,5),
                                    np.arange(50,150,5)),
                  columns = ['width', 'length', 'height']
                 )
print(df)

Output:

      width  length  height
0        50      50      50
1        50      50      55
2        50      50      60
3        50      50      65
4        50      50      70
...     ...     ...     ...
7995    145     145     125
7996    145     145     130
7997    145     145     135
7998    145     145     140
7999    145     145     145

[8000 rows x 3 columns]

Explanation:

>>> list(itertools.product(np.arange(1,3), np.arange(1,3),np.arange(1,3)))

[(1, 1, 1),
 (1, 1, 2),
 (1, 2, 1),
 (1, 2, 2),
 (2, 1, 1),
 (2, 1, 2),
 (2, 2, 1),
 (2, 2, 2)]
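
Since the question mentions more nested loops than the three shown (wall thickness, material combinations, and so on), the same idea can probably be driven from a dict so that adding a variable is one extra line. This is only a sketch: the parameter names and values below (wall_thickness, material, and their lists) are assumptions for illustration, not taken from the question.

```python
import itertools
import pandas as pd

# Hypothetical parameter grid; names and values are illustrative only.
params = {
    "width": range(50, 150, 5),
    "length": range(50, 150, 5),
    "height": range(50, 150, 5),
    "wall_thickness": [3, 6, 9],
    "material": ["plywood", "mdf"],
}

# product(*values) yields one tuple per combination, in the same order as
# nested for loops written in the order of the dict keys (last key varies fastest).
rows = list(itertools.product(*params.values()))
grid = pd.DataFrame(rows, columns=list(params))

print(len(grid))  # 20 * 20 * 20 * 3 * 2 combinations
```

The dataframe is built once from the full list of tuples instead of growing row by row, and that is where most of the time saved over appending inside the loops should come from.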