I am new to Python and trying to store results from a for-loop into dataframe columns (Python 3). Let's say I have the following data:
time=[1,2,3,4]
i=[1,2,3,4,5]
j=[10,20,30,40,50]
k=[100,200,300,400,500]
data_ijk=list(zip(i,j,k))
[(1, 10, 100), (2, 20, 200), (3, 30, 300), (4, 40, 400), (5, 50, 500)]
I want to loop through the data_ijk to solve the following equation:
eq=(i+j+k)*t
and put the results of the equation into a dataframe with column labels
col_labels=["A","B","C","D","E"]
with one column for each set of (i, j, k) and one row for each time (t).
The expected output should look something like this:
t | A | B | C | D | E |
---|---|---|---|---|---|
1 | 111 | 222 | 333 | 444 | 555 |
2 | 222 | 444 | 666 | 888 | 1110 |
3 | 333 | 666 | 999 | 1332 | 1665 |
4 | 444 | 888 | 1332 | 1776 | 2220 |
I know I need to define a dataframe
df = pd.DataFrame(columns=col_labels)
and I assume I should create a nested for loop over data_ijk and time, but I have no idea how to put the results into different columns:
for i,j,k in data_ijk:
    for t in time:
        ...???
        print((i+j+k)*t)
CodePudding user response:
Try:
import pandas as pd
df = pd.DataFrame(data=[[sum(x)*t for x in zip(i,j,k)] for t in time],
                  columns=list("ABCDE"),
                  index=time)
>>> df
     A    B     C     D     E
1  111  222   333   444   555
2  222  444   666   888  1110
3  333  666   999  1332  1665
4  444  888  1332  1776  2220
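If you would rather keep the explicit loops from the question, the same table can be built one column at a time. This is only a minimal sketch, assuming the lists from the question; the names `columns`, `label`, `a`, `b` and `c` are made up for illustration and do not come from the original code.

```python
import pandas as pd

time = [1, 2, 3, 4]
i = [1, 2, 3, 4, 5]
j = [10, 20, 30, 40, 50]
k = [100, 200, 300, 400, 500]
col_labels = ["A", "B", "C", "D", "E"]

# Hypothetical helper dict: one entry per column label.
columns = {}

# Outer loop: each (i, j, k) triple is paired with one column label.
for label, (a, b, c) in zip(col_labels, zip(i, j, k)):
    values = []
    # Inner loop: one row per time value.
    for t in time:
        values.append((a + b + c) * t)
    columns[label] = values

# A dict of equal-length lists becomes one dataframe column per key.
df = pd.DataFrame(columns, index=time)
df.index.name = "t"
print(df)
```

Collecting the values in a plain dict and building the dataframe once at the end tends to be simpler and faster than assigning into an empty dataframe cell by cell.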
CodePudding user response:
Try:
import pandas as pd
# In case the column names are actual words rather than single letters
col_labels=["A","B","C","D","E"]
time=[1,2,3,4]
data = dict(i=[1,2,3,4,5],
            j=[10,20,30,40,50],
            k=[100,200,300,400,500])
df = pd.DataFrame(data)
cons_sum_ijk = df.apply(sum, axis=1).to_list()
df = pd.DataFrame(columns=col_labels, data=[cons_sum_ijk]* len(time) )
df['t'] = time
df[col_labels] = df.apply(lambda x: x[col_labels]*x['t'], axis=1)
df = df[['t'] + col_labels]
>>> df
   t    A    B     C     D     E
0  1  111  222   333   444   555
1  2  222  444   666   888  1110
2  3  333  666   999  1332  1665
3  4  444  888  1332  1776  2220
CodePudding user response:
Using NumPy's einsum:
import numpy as np
import pandas as pd
time = [1, 2, 3, 4]
i = [1, 2, 3, 4, 5]
j = [10, 20, 30, 40, 50]
k = [100, 200, 300, 400, 500]
df = pd.DataFrame(np.einsum('a,bc->ac', time, np.array([i, j, k])), columns=[*'ABCDE'])
df
     A    B     C     D     E
0  111  222   333   444   555
1  222  444   666   888  1110
2  333  666   999  1332  1665
3  444  888  1332  1776  2220
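In case the subscript string is hard to read: in 'a,bc->ac' the index b (the i/j/k axis) does not appear in the output, so it is summed over, and what remains is every time value multiplied by every column total. A rough equivalent, as a sketch only, is a column-wise sum followed by np.outer; the names `totals`, `table` and `einsum_table` are just illustrative.

```python
import numpy as np
import pandas as pd

time = [1, 2, 3, 4]
i = [1, 2, 3, 4, 5]
j = [10, 20, 30, 40, 50]
k = [100, 200, 300, 400, 500]

# Sum i, j and k elementwise: one total per output column.
totals = np.sum([i, j, k], axis=0)

# Outer product: row r, column c holds time[r] * totals[c].
table = np.outer(time, totals)

# Same computation with einsum, for comparison.
einsum_table = np.einsum('a,bc->ac', np.asarray(time), np.array([i, j, k]))
assert np.array_equal(table, einsum_table)

df = pd.DataFrame(table, columns=list("ABCDE"), index=time)
print(df)
```

Passing index=time here labels the rows with the time values instead of the default 0 to 3.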