Say I have 3 independent variables.
x = np.arange(30, 481, 30)
y = np.arange(1.5, 3.1, 0.1)
z = np.arange(0.2, 1.0, 0.05)
assert len(x) == len(y) == len(z) # length is 16
Each array has a length of 16. Is it possible to generate a pandas DataFrame that contains every possible combination (16 * 16 * 16 = 4096) of these independent variables (ideally in constant time)? I.e. the output would start as follows:
And so on, until we have 4096 rows: one for each value of x, for each value of y, for each value of z.
CodePudding user response:
Try itertools.product:
import pandas as pd
import numpy as np
from itertools import product
x = np.arange(30, 481, 30)
y = np.arange(1.5, 3.1, 0.1)
z = np.arange(0.2, 1.0, 0.05)
df = pd.DataFrame(product(x,y,z), columns=['x', 'y', 'z'])
print(df)
Result:
        x    y     z
0      30  1.5  0.20
1      30  1.5  0.25
2      30  1.5  0.30
3      30  1.5  0.35
4      30  1.5  0.40
...   ...  ...   ...
4091  480  3.0  0.75
4092  480  3.0  0.80
4093  480  3.0  0.85
4094  480  3.0  0.90
4095  480  3.0  0.95

[4096 rows x 3 columns]
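If the variables should carry their own names, the same idea can be wrapped in a small function. This is only a sketch: `cartesian_frame` is a hypothetical helper name, not part of pandas or itertools.

```python
from itertools import product

import numpy as np
import pandas as pd


def cartesian_frame(**variables):
    """Hypothetical helper: one row per combination of the given 1-D arrays."""
    names = list(variables)
    # product() varies the last variable fastest, so z changes on every row
    rows = list(product(*variables.values()))
    return pd.DataFrame(rows, columns=names)


grid = cartesian_frame(
    x=np.arange(30, 481, 30),
    y=np.arange(1.5, 3.1, 0.1),
    z=np.arange(0.2, 1.0, 0.05),
)
print(grid.shape)  # (4096, 3)
```

Keyword arguments keep their order, so the columns come out as x, y, z.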
CodePudding user response:
Just iterate over your lists and build a list of all rows, which can be converted directly into a DataFrame:
len(pd.DataFrame([[xi, yi, zi ] for xi in x for yi in y for zi in z], columns=['x', 'y', 'z']))
then gives you
4096
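As a quick sanity check, the nested comprehension should visit the combinations in the same order as itertools.product. A minimal sketch, assuming the same x, y and z as in the question (the names `df_loop` and `df_prod` are just illustrative):

```python
from itertools import product

import numpy as np
import pandas as pd

x = np.arange(30, 481, 30)
y = np.arange(1.5, 3.1, 0.1)
z = np.arange(0.2, 1.0, 0.05)

# Nested comprehension: the innermost "for" (z) changes fastest
rows = [[xi, yi, zi] for xi in x for yi in y for zi in z]
df_loop = pd.DataFrame(rows, columns=['x', 'y', 'z'])

# Same combinations generated by itertools.product
df_prod = pd.DataFrame(list(product(x, y, z)), columns=['x', 'y', 'z'])

# Compare the underlying values row by row
same = np.array_equal(df_loop.to_numpy(), df_prod.to_numpy())
print(len(df_loop), same)
```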
CodePudding user response:
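This works for me, though it is not the most elegant.
import numpy as np
import pandas as pd

x = np.arange(30, 481, 30)
y = np.arange(1.5, 3.1, 0.1)
z = np.arange(0.2, 1.0, 0.05)
assert len(x) == len(y) == len(z)  # length is 16

# DataFrame.append was removed in pandas 2.0 (and copied the frame on every call),
# so collect the rows in a list and build the DataFrame once at the end.
rows = []
for i in x:
    for j in y:
        for k in z:
            rows.append({'x': i, 'y': j, 'z': k})
df = pd.DataFrame(rows, columns=['x', 'y', 'z'])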
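The nested loops above can also be expressed without any Python-level iteration. The following is a sketch of a different technique, np.meshgrid, rather than the approach in this answer; `indexing="ij"` is chosen so that the row order matches the loops (x slowest, z fastest).

```python
import numpy as np
import pandas as pd

x = np.arange(30, 481, 30)
y = np.arange(1.5, 3.1, 0.1)
z = np.arange(0.2, 1.0, 0.05)

# Each output array has shape (16, 16, 16); "ij" keeps the axes in x, y, z order
xx, yy, zz = np.meshgrid(x, y, z, indexing="ij")

# ravel() flattens in C order, so the last axis (z) varies fastest
df = pd.DataFrame({"x": xx.ravel(), "y": yy.ravel(), "z": zz.ravel()})
print(df.shape)  # (4096, 3)
```

No approach can be constant time, since all 4096 rows have to be written, but this keeps the work inside NumPy.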
CodePudding user response:
# how="cross" (pandas 1.2+) pairs every row of the left input with every row of the right
data = pd.merge(pd.Series(x, name="x"), pd.Series(y, name="y"), how="cross")
df = pd.merge(data, pd.Series(z, name="z"), how="cross")
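For more than three variables, the chained cross merges can be folded with functools.reduce. A minimal sketch, assuming pandas 1.2 or later (where `how="cross"` is available); the names `frames` and `grid` are illustrative only.

```python
from functools import reduce

import numpy as np
import pandas as pd

x = np.arange(30, 481, 30)
y = np.arange(1.5, 3.1, 0.1)
z = np.arange(0.2, 1.0, 0.05)

# One single-column DataFrame per variable
frames = [pd.DataFrame({"x": x}), pd.DataFrame({"y": y}), pd.DataFrame({"z": z})]

# Cross-merge the frames pairwise, left to right
grid = reduce(lambda left, right: left.merge(right, how="cross"), frames)
print(grid.shape)  # (4096, 3)
```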