Pandas: Generate a DataFrame with every iteration for 3 independent variables

Say I have 3 independent variables.

x = np.arange(30, 481, 30) 
y = np.arange(1.5, 3.1, 0.1) 
z = np.arange(0.2, 1.0, 0.05) 

assert len(x) == len(y) == len(z) # length is 16

Each array has a length of 16. Is it possible to generate a pandas DataFrame that shows every possible combination (16 * 16 * 16 = 4096) of these independent variables (ideally in constant time)? I.e. the output would start as follows:

And so on, until we have 4096 rows: one row for each combination of a value of x, a value of y, and a value of z.
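A side note on the setup above: the NumPy documentation suggests np.linspace over np.arange when the step is not an integer, because floating-point rounding can change how many elements arange returns. A minimal sketch, assuming the intended endpoints are 3.0 for y and 0.95 for z (the variable n is introduced here only for illustration):

```python
import numpy as np

n = 16  # number of points per variable (assumed from the question)

x = np.arange(30, 481, 30)       # integer step, so arange is safe here
y = np.linspace(1.5, 3.0, n)     # 1.5, 1.6, ..., 3.0
z = np.linspace(0.2, 0.95, n)    # 0.2, 0.25, ..., 0.95

# The element count is now fixed by n instead of by float rounding.
assert len(x) == len(y) == len(z) == n
```

With linspace the length is stated explicitly, so the assert cannot fail because of rounding in the step arithmetic.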

CodePudding user response：

Try itertools.product:

import pandas as pd
import numpy as np
from itertools import product

x = np.arange(30, 481, 30) 
y = np.arange(1.5, 3.1, 0.1) 
z = np.arange(0.2, 1.0, 0.05)

df = pd.DataFrame(product(x,y,z), columns=['x', 'y', 'z'])
print(df)

Result:

        x    y     z
0      30  1.5  0.20
1      30  1.5  0.25
2      30  1.5  0.30
3      30  1.5  0.35
4      30  1.5  0.40
...   ...  ...   ...
4091  480  3.0  0.75
4092  480  3.0  0.80
4093  480  3.0  0.85
4094  480  3.0  0.90
4095  480  3.0  0.95

[4096 rows x 3 columns]
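If you would rather stay inside pandas, a similar result can probably be had with pd.MultiIndex.from_product followed by to_frame. This is only a sketch of an alternative, not a claim that it is faster; the arrays are the ones from the question.

```python
import numpy as np
import pandas as pd

x = np.arange(30, 481, 30)
y = np.arange(1.5, 3.1, 0.1)
z = np.arange(0.2, 1.0, 0.05)

# Build the Cartesian product as a MultiIndex, then turn its levels
# into ordinary columns with a fresh 0..n-1 index.
index = pd.MultiIndex.from_product([x, y, z], names=["x", "y", "z"])
df = index.to_frame(index=False)

print(df.shape)
```

The row order should match the product() version: x varies slowest and z varies fastest.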

CodePudding user response：

Just iterate over your arrays and build a list of all rows. This list can be converted directly into a DataFrame.

len(pd.DataFrame([[xi, yi, zi ] for xi in x for yi in y for zi in z], columns=['x', 'y', 'z']))

This then gives you 4096.
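The same rows can also be produced without a Python-level loop. The following is a sketch using np.meshgrid, assuming the arrays from the question; it is an alternative technique, not what the answer above does.

```python
import numpy as np
import pandas as pd

x = np.arange(30, 481, 30)
y = np.arange(1.5, 3.1, 0.1)
z = np.arange(0.2, 1.0, 0.05)

# indexing="ij" keeps x as the slowest-varying axis and z as the fastest,
# which matches the order of the nested list comprehension above.
xx, yy, zz = np.meshgrid(x, y, z, indexing="ij")

# Flatten each 3-D grid into one column of the result.
df = pd.DataFrame({"x": xx.ravel(), "y": yy.ravel(), "z": zz.ravel()})

print(len(df))
```

Each grid has shape (len(x), len(y), len(z)), so ravel() yields one entry per combination.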

CodePudding user response：

This works for me, although it is not the most elegant.

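import numpy as np
import pandas as pd

x = np.arange(30, 481, 30)
y = np.arange(1.5, 3.1, 0.1)
z = np.arange(0.2, 1.0, 0.05)

assert len(x) == len(y) == len(z) # length is 16

# DataFrame.append was removed in pandas 2.0 (and it copied the whole frame
# on every call), so collect the rows in a list and build the DataFrame once.
rows = []
for i in x:
    for j in y:
        for k in z:
            rows.append({'x': i, 'y': j, 'z': k})

df = pd.DataFrame(rows, columns=['x', 'y', 'z'])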

CodePudding user response：

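# Cross join x with y, then cross join the result with z.
# how="cross" requires pandas 1.2 or newer.
data = pd.merge(pd.Series(x, name="x"), pd.Series(y, name="y"), how="cross")
df = pd.merge(data, pd.Series(z, name="z"), how="cross")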