How to combine mean and standard deviation columns of pandas into a single column

Time:04-20

I have a pandas data frame in which one column holds the mean values and a second column holds the standard deviations. Each row of the data frame represents one sample case, for which we have a mean and a standard deviation.

I want to create a new column where I can save mean and standard deviation together in the following format:

mean( - StD)

And then I want to export this as a csv file.

So the file will be like

Sample_1, Mean( - StD)

Sample_2, Mean( - StD)

and so on. I do not know how to combine the two columns of the data frame to produce something like this. Can someone point me in the right direction?

CodePudding user response:

If both columns are strings, you can concatenate them directly:

df["mergedCol"] = df["Mean"] + "(" + df["StD"] + ")"

If either column is not a string column, convert it first using astype(str).
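As a minimal sketch of that conversion, assuming both columns hold floats (the sample values below are made up for illustration; the column names follow the snippet above):

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer above.
df = pd.DataFrame({"Mean": [1.01, 1.02], "StD": [0.11, 0.12]})

# Convert each numeric column to strings, then concatenate element-wise.
df["mergedCol"] = df["Mean"].astype(str) + "(" + df["StD"].astype(str) + ")"

print(df["mergedCol"].tolist())
```

Without the astype(str) calls, adding a string to a float column raises a TypeError, which is why the conversion has to come first.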

CodePudding user response:

As @WindCheck stated, you can combine the columns. The code below assumes the columns hold numeric values. I reindexed the df to start at 1 and created a name for each sample, then wrote only the sample name and the combined column to the CSV file.

import pandas as pd
import numpy as np

cols = ['mean', 'std']
data = [[1.01, 0.11],
        [1.02, 0.12],
        [1.03, 0.13]]
df = pd.DataFrame(data, columns=cols)
# reindex df starting at 1
df.index = np.arange(1, len(df) + 1)

# create sample names
df['sample_name'] = 'Sample_' + df.index.astype('str')
# combine mean and std columns.
# Use .astype('str') as shown below if columns are values
# if they are strings, remove the .astype('str')
df['mean_std'] = df['mean'].astype('str') + ' ( - ' + df['std'].astype('str') + ')'

print(df)

# select columns you want written to CSV
cols_for_csv = ['sample_name', 'mean_std']
# don't include the index or a header row, so the file matches the CSV output you are looking for.
df.to_csv('mean_std.csv', encoding='utf-8', index=False, header=False, columns=cols_for_csv)

df looks like:

   mean   std sample_name        mean_std
1  1.01  0.11    Sample_1  1.01 ( - 0.11)
2  1.02  0.12    Sample_2  1.02 ( - 0.12)
3  1.03  0.13    Sample_3  1.03 ( - 0.13)

CSV file looks like:

Sample_1,1.01 ( - 0.11)
Sample_2,1.02 ( - 0.12)
Sample_3,1.03 ( - 0.13)
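One caveat with astype('str') is that it keeps however many digits each float happens to have, so rows can end up with different precision. If a fixed number of decimals is wanted, a possible variation is to format each value explicitly. This is only a sketch: the two-decimal precision and the sample values are assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical data with uneven precision.
df = pd.DataFrame({"mean": [1.0, 1.02345], "std": [0.1, 0.12]})

# Format every value to two decimals before concatenating.
fmt = "{:.2f}".format
df["mean_std"] = df["mean"].map(fmt) + " ( - " + df["std"].map(fmt) + ")"

print(df["mean_std"].tolist())
```

The resulting column is a plain string column, so it can be passed to to_csv in the same way as above.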