I'm trying to combine the summary-statistics outputs of two related datasets with pandas. The two outputs look like this:
PoweredUp
            min  max      mean  median       var       std
magic      -1.0  1.0  0.282669     0.8  0.659919  0.812354
magnitude   1.0  1.0  1.000000     1.0  0.000000  0.000000
power       0.0  0.0  0.000000     0.0  0.000000  0.000000
PoweredDown
            min  max      mean  median       var       std
magic      -1.0  1.0  0.473780     1.0  0.586732  0.765984
magnitude   1.0  1.0  1.000000     1.0  0.000000  0.000000
power       1.0  2.0  1.152439     1.0  0.129994  0.360547
I want to create an output that has these variables in a single dataframe. I'm not sure of the best way to approach it; perhaps prefixing PoweredUp and PoweredDown to the columns for magic, magnitude and power, then transposing and joining the dataframes?
CodePudding user response:
You can do pretty much what you want. Use pandas.concat:
pd.concat({'PoweredUp': PoweredUp, 'PoweredDown': PoweredDown})
output:
                        min  max      mean  median       var       std
PoweredUp   magic      -1.0  1.0  0.282669     0.8  0.659919  0.812354
            magnitude   1.0  1.0  1.000000     1.0  0.000000  0.000000
            power       0.0  0.0  0.000000     0.0  0.000000  0.000000
PoweredDown magic      -1.0  1.0  0.473780     1.0  0.586732  0.765984
            magnitude   1.0  1.0  1.000000     1.0  0.000000  0.000000
            power       1.0  2.0  1.152439     1.0  0.129994  0.360547
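For anyone who wants to try the stacked form end to end, here is a minimal sketch that rebuilds the two frames from the values in the question and also labels the two index levels through the `names` argument of `pd.concat`. The level names 'dataset' and 'variable' and the result name `stacked` are illustrative choices, not something from the question.

```python
import pandas as pd

stats = ['min', 'max', 'mean', 'median', 'var', 'std']
rows = ['magic', 'magnitude', 'power']

# Values copied from the two tables in the question.
PoweredUp = pd.DataFrame(
    [[-1.0, 1.0, 0.282669, 0.8, 0.659919, 0.812354],
     [1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]],
    index=rows, columns=stats)
PoweredDown = pd.DataFrame(
    [[-1.0, 1.0, 0.473780, 1.0, 0.586732, 0.765984],
     [1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
     [1.0, 2.0, 1.152439, 1.0, 0.129994, 0.360547]],
    index=rows, columns=stats)

# The dict keys become the outer index level; `names` labels both levels
# ('dataset' and 'variable' are assumed, illustrative names).
stacked = pd.concat({'PoweredUp': PoweredUp, 'PoweredDown': PoweredDown},
                    names=['dataset', 'variable'])

# Pull out one dataset's rows, or a single value, via the two-level index.
print(stacked.loc['PoweredDown'])
print(stacked.loc[('PoweredDown', 'power'), 'mean'])
```

Naming the levels is optional, but it tends to make later `reset_index` or `groupby` calls easier to read.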
or:
pd.concat({'PoweredUp': PoweredUp, 'PoweredDown': PoweredDown}, axis=1)
output:
           PoweredUp                                        PoweredDown
            min  max      mean  median       var       std   min  max      mean  median       var       std
magic      -1.0  1.0  0.282669     0.8  0.659919  0.812354  -1.0  1.0  0.473780     1.0  0.586732  0.765984
magnitude   1.0  1.0  1.000000     1.0  0.000000  0.000000   1.0  1.0  1.000000     1.0  0.000000  0.000000
power       0.0  0.0  0.000000     0.0  0.000000  0.000000   1.0  2.0  1.152439     1.0  0.129994  0.360547
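With `axis=1` the columns form a two-level MultiIndex (dataset, statistic). The sketch below, which rebuilds the frames from the question's values, shows a few ways that layout might be used: selecting one dataset's block, taking one statistic across both datasets, and flattening the column names into prefixed strings as the question suggested. The names `wide`, `means` and `flat` are illustrative.

```python
import pandas as pd

stats = ['min', 'max', 'mean', 'median', 'var', 'std']
rows = ['magic', 'magnitude', 'power']

# Values copied from the two tables in the question.
PoweredUp = pd.DataFrame(
    [[-1.0, 1.0, 0.282669, 0.8, 0.659919, 0.812354],
     [1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]],
    index=rows, columns=stats)
PoweredDown = pd.DataFrame(
    [[-1.0, 1.0, 0.473780, 1.0, 0.586732, 0.765984],
     [1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
     [1.0, 2.0, 1.152439, 1.0, 0.129994, 0.360547]],
    index=rows, columns=stats)

wide = pd.concat({'PoweredUp': PoweredUp, 'PoweredDown': PoweredDown}, axis=1)

# Select the block of columns belonging to one dataset.
print(wide['PoweredUp'])

# Take one statistic across both datasets (cross-section on the inner column level).
means = wide.xs('mean', axis=1, level=1)
print(means)

# Flatten the two column levels into single names such as 'PoweredUp_mean'.
flat = wide.copy()
flat.columns = ['_'.join(col) for col in flat.columns]
print(flat.columns.tolist())
```

Keeping the MultiIndex is handy for comparisons between datasets; the flattened names are easier to export to CSV or a database.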
Or, with prefixes/suffixes:
pd.concat([PoweredUp.add_suffix('_up'), PoweredDown.add_suffix('_down')], axis=1)
output:
           min_up  max_up   mean_up  median_up    var_up    std_up  min_down  max_down  mean_down  median_down  var_down  std_down
magic        -1.0     1.0  0.282669        0.8  0.659919  0.812354      -1.0       1.0   0.473780          1.0  0.586732  0.765984
magnitude     1.0     1.0  1.000000        1.0  0.000000  0.000000       1.0       1.0   1.000000          1.0  0.000000  0.000000
power         0.0     0.0  0.000000        0.0  0.000000  0.000000       1.0       2.0   1.152439          1.0  0.129994  0.360547
CodePudding user response:
One way to do it (among several, depending on the output details you prefer) is to add a new column named 'dataset' to each frame and concatenate them:
import pandas as pd
PoweredUp = [
[-1.0, 1.0, 0.282669, 0.8, 0.659919, 0.812354],
[1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
]
PoweredDown = [
[-1.0, 1.0, 0.473780, 1.0, 0.586732, 0.76598],
[1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
[1.0, 2.0, 1.152439, 1.0, 0.129994, 0.360547]
]
outputNames = ['magic', 'magnitude', 'power']
colNames = ['dataset', 'output', 'min', 'max', 'mean', 'median', 'var', 'std']
# colNames[2:] holds the six statistic names; 'dataset' is inserted separately below.
upDf = pd.DataFrame(PoweredUp, index=outputNames, columns=colNames[2:])
upDf.insert(0, 'dataset', 'PoweredUp')
downDf = pd.DataFrame(PoweredDown, index=outputNames, columns=colNames[2:])
downDf.insert(0, 'dataset', 'PoweredDown')
# The frames get their own names so the raw lists stay available for reuse.
df = pd.concat([upDf, downDf])
print(df)
Output:
               dataset   min  max      mean  median       var       std
magic        PoweredUp  -1.0  1.0  0.282669     0.8  0.659919  0.812354
magnitude    PoweredUp   1.0  1.0  1.000000     1.0  0.000000  0.000000
power        PoweredUp   0.0  0.0  0.000000     0.0  0.000000  0.000000
magic      PoweredDown  -1.0  1.0  0.473780     1.0  0.586732  0.765980
magnitude  PoweredDown   1.0  1.0  1.000000     1.0  0.000000  0.000000
power      PoweredDown   1.0  2.0  1.152439     1.0  0.129994  0.360547
A second way is to build both a 'dataset' column and an 'output' column and use a numerical index. This starts from the raw lists PoweredUp and PoweredDown as originally defined (plain lists of rows, not DataFrames):
PoweredUp = pd.DataFrame([['PoweredUp', output] + row for output, row in zip(outputNames, PoweredUp)], columns=colNames)
PoweredDown = pd.DataFrame([['PoweredDown', output] + row for output, row in zip(outputNames, PoweredDown)], columns=colNames)
df = pd.concat([PoweredUp, PoweredDown], ignore_index=True)
print(df)
Output:
       dataset     output   min  max      mean  median       var       std
0    PoweredUp      magic  -1.0  1.0  0.282669     0.8  0.659919  0.812354
1    PoweredUp  magnitude   1.0  1.0  1.000000     1.0  0.000000  0.000000
2    PoweredUp      power   0.0  0.0  0.000000     0.0  0.000000  0.000000
3  PoweredDown      magic  -1.0  1.0  0.473780     1.0  0.586732  0.765980
4  PoweredDown  magnitude   1.0  1.0  1.000000     1.0  0.000000  0.000000
5  PoweredDown      power   1.0  2.0  1.152439     1.0  0.129994  0.360547
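If the statistics already live in DataFrames with the variable names in the index (as in the question), the same long layout can probably be reached without rebuilding rows by hand. The sketch below is one possible route, assuming the frames are shaped like the question's tables: concatenate with dict keys, name the two index levels, then call `reset_index`. The name `long_df` is illustrative.

```python
import pandas as pd

stats = ['min', 'max', 'mean', 'median', 'var', 'std']
rows = ['magic', 'magnitude', 'power']

# Values copied from the two tables in the question.
PoweredUp = pd.DataFrame(
    [[-1.0, 1.0, 0.282669, 0.8, 0.659919, 0.812354],
     [1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]],
    index=rows, columns=stats)
PoweredDown = pd.DataFrame(
    [[-1.0, 1.0, 0.473780, 1.0, 0.586732, 0.765984],
     [1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
     [1.0, 2.0, 1.152439, 1.0, 0.129994, 0.360547]],
    index=rows, columns=stats)

# Name the index levels so reset_index() turns them into the
# 'dataset' and 'output' columns with a fresh 0..5 integer index.
long_df = (
    pd.concat({'PoweredUp': PoweredUp, 'PoweredDown': PoweredDown},
              names=['dataset', 'output'])
    .reset_index()
)
print(long_df)
```

This long layout suits filtering and grouping, for example `long_df[long_df['output'] == 'power']`.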