I have two lists of numpy arrays that look like this:
a_b = [array([0.23078484, 0.23076418]),
array([0.3478484, 0.72076418]),
array([1.42590463, 1.42562456])]
c_d = [array([0.23276474, 0.23276488]),
array([0.3498484, 0.72086418]),
array([1.43590464, 1.44562477])]
and I want to generate a CSV file that looks like the following:
Source A B C
a_b 0.23078484 0.3478484 1.42590463
a_b 0.23076418 0.72076418 1.42562456
c_d 0.23276474 0.3498484 1.43590464
c_d 0.23276488 0.72086418 1.44562477
This is what I have tried so far:
df = pd.DataFrame({'a_b': a_b , 'c_d': c_d}, columns = ['A', 'B','C'])
df.to_csv('test.csv', index=False)
But it raises this error:
raise ValueError("All arrays must be of the same length")
ValueError: All arrays must be of the same length
Answer:
You could use np.hstack and .T to convert the lists of arrays into a single numpy array, convert that to a DataFrame, then save it as a CSV file:
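import numpy as np
import pandas as pd

out = (pd.DataFrame(np.hstack((a_b, c_d)).T,
                    # one label per row: 'a_b' for its rows, then 'c_d'
                    index=['a_b']*len(a_b[0]) + ['c_d']*len(c_d[0]),
                    columns=[*'ABC'])
       .reset_index()
       .rename(columns={'index':'Source'}))
out.to_csv('file.csv', index=False)  # index=False omits the 0..3 row numbers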
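To see why hstack followed by .T gives the right layout, it can help to look at the intermediate shapes. This is only an illustrative sketch using the sample data from the question; the names stacked and rows are introduced here for the illustration and are not part of the answer's code.

```python
import numpy as np

# Sample data copied from the question
a_b = [np.array([0.23078484, 0.23076418]),
       np.array([0.3478484, 0.72076418]),
       np.array([1.42590463, 1.42562456])]
c_d = [np.array([0.23276474, 0.23276488]),
       np.array([0.3498484, 0.72086418]),
       np.array([1.43590464, 1.44562477])]

# Each list is treated as a (3, 2) array: one row per future column A, B, C
stacked = np.hstack((a_b, c_d))  # (3, 4): a_b values and c_d values side by side
rows = stacked.T                 # (4, 3): one row per CSV line, columns A, B, C

print(stacked.shape)
print(rows.shape)
print(rows[0])  # first a_b row: the first element of each a_b array
```

The first two rows of rows come from a_b and the last two from c_d, which is why the index labels are built as 'a_b' repeated len(a_b[0]) times followed by 'c_d' repeated len(c_d[0]) times.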
Another way, similar to yours, is to use the from_dict constructor with the orient parameter. Exploding the columns then gives the desired outcome:
out = (pd.DataFrame.from_dict({'a_b': a_b , 'c_d': c_d},
                              orient='index',
                              columns = ['A', 'B','C'])
       # exploding several columns at once needs pandas 1.3 or newer
       .explode(['A','B','C'])
       .reset_index()
       .rename(columns={'index':'Source'}))
Output:
Source A B C
0 a_b 0.230785 0.347848 1.425905
1 a_b 0.230764 0.720764 1.425625
2 c_d 0.232765 0.349848 1.435905
3 c_d 0.232765 0.720864 1.445625
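The printed DataFrame shows a 0..3 row index that the requested CSV does not have. As a rough check, the sketch below writes the frame with index=False to an in-memory buffer (a stand-in for a real file, chosen here only so the example leaves nothing on disk) and reads it back. The names buffer, csv_text and round_trip are introduced for this illustration.

```python
import io

import numpy as np
import pandas as pd

# Sample data copied from the question
a_b = [np.array([0.23078484, 0.23076418]),
       np.array([0.3478484, 0.72076418]),
       np.array([1.42590463, 1.42562456])]
c_d = [np.array([0.23276474, 0.23276488]),
       np.array([0.3498484, 0.72086418]),
       np.array([1.43590464, 1.44562477])]

# Build the frame with the hstack approach
out = (pd.DataFrame(np.hstack((a_b, c_d)).T,
                    index=['a_b'] * len(a_b[0]) + ['c_d'] * len(c_d[0]),
                    columns=list('ABC'))
       .reset_index()
       .rename(columns={'index': 'Source'}))

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
out.to_csv(buffer, index=False)  # index=False leaves out the row numbers
csv_text = buffer.getvalue()
print(csv_text)

# Read the text back to confirm the layout survived the round trip
round_trip = pd.read_csv(io.StringIO(csv_text))
print(round_trip.shape)
```

Note that to_csv writes comma-separated values by default; passing sep=' ' would give the space-separated layout shown in the question, if that is what is actually needed.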