How to access the nominal values and uncertainties in a Pandas DataFrame?

Time:10-23

I am using the uncertainties module together with Pandas. At present, I can write the dataframe to a spreadsheet with each value and its uncertainty combined in a single cell. My main objective is to write each uncertainty in a column adjacent to its value. How can I access the nominal values or the uncertainties within a dataframe? A MWE is given below.

Present output

A B
63.2+/-0.9 75.4+/-0.9
41.94+/-0.05 53.12+/-0.21
4.1+/-0.4 89.51+/-0.32
28.2+/-0.5 10.6+/-0.6
25.8+/-0.9 39.03+/-0.08
27.26+/-0.09 44.61+/-0.35
25.04+/-0.13 37.7+/-0.6
2.4+/-0.5 50.0+/-0.8
0.92+/-0.21 3.1+/-0.5
57.69+/-0.34 21.8+/-0.8

Desired output

A +/- B +/-
63.2 0.9 75.4 0.9
41.94 0.05 53.12 0.21
4.1 0.4 89.51 0.32
28.2 0.5 10.6 0.6
25.8 0.9 39.03 0.08
27.26 0.09 44.61 0.35
25.04 0.13 37.7 0.6
2.4 0.5 50 0.8
0.92 0.21 3.1 0.5
57.69 0.34 21.8 0.8

MWE

from uncertainties import unumpy
import pandas as pd
import numpy as np


A_n = 100 * np.random.rand(10)
A_s = np.random.rand(10)

B_n = 100 * np.random.rand(10)
B_s = np.random.rand(10)

AB = pd.DataFrame({'A':unumpy.uarray(A_n, A_s), 'B': unumpy.uarray(B_n, B_s)})


AB_writer = pd.ExcelWriter('A.xlsx', engine = 'xlsxwriter', options={'strings_to_numbers': True})
AB.to_excel(AB_writer, sheet_name = 'Data', index=False, na_rep='nan')
AB_writer.close()

Update

I forgot to mention that AB is not created as shown in the MWE; it is the result of previous calculations that are not included here. I created AB this way only for the sake of the MWE. In short, I won't have access to the nominal values and uncertainties of A and B as separate arrays.

CodePudding user response:

You can map the column(s) to get the outcome you're looking for. The following code maps the A column (make sure not to assign two columns to the same column key '+/-'):

AB[['A', '+/-']] = AB.A.apply(lambda x: str(x).split('+/-')).to_list()

CodePudding user response:

Just split them into different columns:

Au = unumpy.uarray(A_n, A_s)
Bu = unumpy.uarray(B_n, B_s)
AB = pd.DataFrame({'A': unumpy.nominal_values(Au), 'A +/-': unumpy.std_devs(Au), 'B': unumpy.nominal_values(Bu), 'B +/-': unumpy.std_devs(Bu)})

CodePudding user response:

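You can use str.split() to split each column into one column of nominal values and one column of uncertainties. The cells are converted to strings first, because the .str methods only act on strings:

for col in AB:     # or `for col in AB[['A', 'B']]` if you only want to process columns `A` and `B`
    # turn each value into text such as '63.2+/-0.9', then split on the literal '+/-'
    AB[[col, f'{col} +/-']] = AB[col].astype(str).str.split(r'\+/-', expand=True)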

# sort the columns to put the related columns together
AB = AB.sort_index(axis=1)    

It is not recommended to have two columns with the same label in one dataframe. Here, we name each +/- column after its source column in order to distinguish them.

Here, we also use .sort_index() to sort the column names to put related columns adjacent to each other.

Result:

print(AB)

       A  A +/-      B  B +/-
0   63.2    0.9   75.4    0.9
1  41.94   0.05  53.12   0.21
2    4.1    0.4  89.51   0.32
3   28.2    0.5   10.6    0.6
4   25.8    0.9  39.03   0.08
5  27.26   0.09  44.61   0.35
6  25.04   0.13   37.7    0.6
7    2.4    0.5   50.0    0.8
8   0.92   0.21    3.1    0.5
9  57.69   0.34   21.8    0.8
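
One caveat with the string-splitting approach is that the new columns hold text, not numbers. A minimal sketch of converting them back with pd.to_numeric follows, assuming the cells are already strings of the form '63.2+/-0.9' (the two rows are taken from the question):

```python
import pandas as pd

# Assumed input: cells already stored as text such as '63.2+/-0.9'.
AB = pd.DataFrame({
    'A': ['63.2+/-0.9', '41.94+/-0.05'],
    'B': ['75.4+/-0.9', '53.12+/-0.21'],
})

for col in list(AB.columns):
    # split on the literal '+/-' (the plus sign is escaped because the pattern is a regex)
    AB[[col, f'{col} +/-']] = AB[col].str.split(r'\+/-', expand=True)

# put related columns next to each other, then convert every column from text to numbers
AB = AB.sort_index(axis=1).apply(pd.to_numeric)

print(AB.dtypes)
```

After the conversion the columns are numeric, so a spreadsheet writer can store them as numbers without relying on an option such as strings_to_numbers.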