Fill dataframe with multiple range from nested for loops

Time:07-08

I'm doing some calculations for building boxes (yeah, boxes to put stuff in). What I do is take as input the box dimensions, wall thickness, lid thickness, and other parameters and do the math to get my materials breakdown.

I started out with a function, then took it a step further and replaced the function with dataframe math, so I could calculate several boxes at once. And then I got more and more ideas, and at the end here I am trying to calculate all possible boxes in a certain range of dimensions and material combinations.

The problem I have is I'm trying to fill my dataframe with all the necessary input values. For this I'm using nested for loops.

for width in range(50, 150, 5):
    for length in range(50, 150, 5):
        for height in range(50, 150, 5):
            ...  # append one row (width, length, height) to the dataframe

And as the ranges get bigger, the dataframe gets huge. And in the end I have to spend hours waiting for the for loop to complete and get my input csv, in order to do 30 seconds of df processing and get my results (there's a few more for loops nested than the ones shown).

The question: is a nested for loop the best way to fill data in a case like this, where you have to sweep a full range and generate combinations of several variables? Or is there a more efficient way to fill the dataframe that doesn't take so long?

CodePudding user response:

I think you may be able to get away with doing a cross join on a dataframe with your dimensions.

import pandas as pd

df = pd.DataFrame(list(range(50, 150, 5)), columns=['dim'])
# this creates a dataframe with a single column called dim that is just the range
df.merge(df, how='cross')
# this gives the Cartesian product (that is, all combinations) of the range in the
# first column and the range in the second column. df has 20 rows and after the merge
# this has 400 rows
df.merge(df, how='cross').merge(df, how='cross')
# this takes the Cartesian product and then crosses it with the range once more. this
# results in 8000 rows, which is 20x20x20

Result:

   dim_x  dim_y  dim
0     50     50   50
1     50     50   55
2     50     50   60
3     50     50   65
4     50     50   70
...

In your example, length, width, and height all sweep the same range, but if the ranges differ, just make three different starting dataframes and merge those.
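
If the three dimensions do sweep different ranges, the cross join could look roughly like the sketch below. The ranges here are made up for illustration, and the column names width, length and height are assumed from the question. Note that how='cross' needs pandas 1.2 or later.

```python
import pandas as pd

# Hypothetical ranges, one single-column dataframe per dimension
widths = pd.DataFrame({'width': range(50, 150, 5)})     # 20 values
lengths = pd.DataFrame({'length': range(40, 100, 10)})  # 6 values
heights = pd.DataFrame({'height': range(20, 60, 20)})   # 2 values

# Chain the cross joins; distinct column names mean no _x/_y suffixes
boxes = widths.merge(lengths, how='cross').merge(heights, how='cross')
print(boxes.shape)  # (240, 3)
```

Because each starting dataframe has its own column name, the result comes out already labelled and ready for the dataframe math.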

CodePudding user response:

You can use itertools.product.

import pandas as pd
import numpy as np
import itertools

df = pd.DataFrame(itertools.product(np.arange(50,150,5), 
                                    np.arange(50,150,5),
                                    np.arange(50,150,5)),
                  columns = ['width', 'length', 'height']
                 )
print(df)

Output:

      width  length  height
0        50      50      50
1        50      50      55
2        50      50      60
3        50      50      65
4        50      50      70
...     ...     ...     ...
7995    145     145     125
7996    145     145     130
7997    145     145     135
7998    145     145     140
7999    145     145     145

[8000 rows x 3 columns]

Explanation:

>>> list(itertools.product(np.arange(1,3), np.arange(1,3),np.arange(1,3)))

[(1, 1, 1),
 (1, 1, 2),
 (1, 2, 1),
 (1, 2, 2),
 (2, 1, 1),
 (2, 1, 2),
 (2, 2, 1),
 (2, 2, 2)]
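
The same idea should extend to the extra nested loops mentioned in the question. A minimal sketch, assuming made-up parameter names and values (wall_thickness and material are placeholders, not the asker's actual inputs): keep every swept parameter in a dict and unpack it into itertools.product, so adding a parameter is one more dict entry rather than one more level of nesting.

```python
import itertools
import pandas as pd

# Hypothetical parameter sweep; names and values are placeholders
params = {
    'width': range(50, 150, 5),
    'length': range(50, 150, 5),
    'height': range(50, 150, 5),
    'wall_thickness': [3, 6],
    'material': ['plywood', 'mdf'],
}

# product() yields one tuple per combination, in the dict's key order
df = pd.DataFrame(itertools.product(*params.values()), columns=list(params))
print(len(df))  # 20 * 20 * 20 * 2 * 2 = 32000
```

The dataframe is built in a single call instead of growing row by row, and appending inside the loop is the likely cause of the long run time described in the question.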