I am trying to analyze mechanical computation results using Python and export them to Excel automatically. My raw data consists of various types of computation (each stored in a different text file). For each type, I have various load cases (loading conditions). For each load case, I output the results are various datapoints. Finally, at each datapoint, I need to store one vector result (with 3 components). So basically, the general structure is:
computation --> load --> datapoint --> component = value
After searching for similar questions, for instance:
- python: combine multiple files into a matrix with 1 and 0
- Construct pandas DataFrame from items in nested dictionary
- Python: how to export a dictionary of nested lists to Excel
and many others, it looks like the pandas library of Python seems the most appropriate, especially the dataframe and its ability to generate pivot_table or crosstab.
So far, I created a list of dictionary similar to this:
dd=[]
files = ["computation1.out","computation2.out"]
for files in files:
filename=file.split(".")[0]
with open(file,"r") as f:
for row in f:
...
#process file and extract
# loadnumber
# datapointnb
# component number icomp
# value
# Creating a new Python dictionary
mydict = {'Computation': filename,
'Load Case': RefValues[iref],
'Datapoint': EntValues[ient],
'Compo': icomp,
'Value': float(row)}
# Append it to the list
dd.append(mydict)
print(json.dumps(dd, sort_keys=True, indent=4))
df = pd.DataFrame.from_dict(dd)
print(df)
df.to_excel("myoutout.xlsx", sheet_name='Data', float_format="%.4f", merge_cells=False)
This generates something like this:
[
{
"Component": 1,
"Computation": "computation1",
"LoadCase": 1.0,
"Datapoint": 57,
"Value": 0.0004761905
},
{
"Component": 2,
"Computation": "computation1",
"LoadCase": 1.0,
"Datapoint": 57,
"Value": -9.333414e-18
},
{
"Component": 3,
"Computation": "computation1",
"LoadCase": 1.0,
"Datapoint": 57,
"Value": 0.0
},
{
"Component": 1,
"Computation": "computation1",
"LoadCase": 1.0,
"Datapoint": 84,
"Value": 0.0009523809
},
...
]
Computation LoadCase Datapoint Component Value
0 computation1 1.0 57 1 4.761905e-04
1 computation1 1.0 57 2 -9.333414e-18
2 computation1 1.0 57 3 0.000000e 00
3 computation1 1.0 84 1 9.523809e-04
...
The conversion from the list of dictionaries to a dataframe and the export to Excel went well but I can't seem to create automatically pivot table or restructure the data in lines and columns:
|computation1 |computation2...
|-----------------------------------------------|---
|loadcase1 |loadcase2 |...|loadcase1...
|compo1 compo2 compo3 |compo1 compo2 compo3 |...|compo1...
-----------------------------------------------------------|---
datapoint1 |val1 val2 val3 |val4 val5 val6 |...|...
... |... |...
datapoint3 |val7 ... |...
-----------------------------------------------------------|---
I have two questions:
- Would this data structure (list of dictionaries) be the most efficient once the amount of data gets bigger and if not, what would you recommend? I also investigated nested dictionaries but seemed more complicated to export to Excel.
- How can I restructure my dataframe to export to Excel using the appropriate format?
Thank you for you help, Christophe
CodePudding user response:
Finally, converting my list of dictionaries to pandas.DataFrame
and using the appropriate arguments to pandas.pivot_table
and exporting to Excel using pandas.pivot_table.to_excel
did the trick.
Full solution was those lines:
df = pandas.DataFrame.from_dict(dd)
mypivot=pandas.pivot_table(df, index=EntTypeLabel, columns=['Computation',RefTypeLabel,'Compo'], values='Value')
print_dbg(1,mypivot)
myExcelfile="samres.xlsx"
mypivot.to_excel(myExcelfile, sheet_name='Data', float_format="%.4f", merge_cells=False)
This finally generated the desired output:
Computation COMPUTATION1 COMPUTATION2 ...
Load Case Load Case-1.00 Load Case-1.00 ...
Compo Comp-1 Comp-2 Comp-1 Comp-2 ...
Node % ...
5559 1.13 -1.08 0.80 -0.63 ...
5561 0.77 -1.17 0.54 -0.68 ...
17789 1.00 -1.17 0.66 -0.62 ...
17791 0.69 -1.16 0.49 -0.60 ...
56690 0.81 0.03 0.57 0.44 ...
56812 1.01 -0.04 0.70 0.33 ...
Thanks to @MattDMo for the suggestion to use to_excel.