Filtering on dataframe of calculated values

Time:09-27

I wrote a script that does some simple modeling calculations from a set of input variables. I want to combine the calculated values into a dataframe that I can then filter on. The problem is that once the calculated variables are in the dataframe, filtering on them returns nothing (e.g., filtering on F_sil_remaining in the code below). I have included a snippet of the script below to reproduce the issue. I suspect it has something to do with how the calculated data are stored or how I combined them into a dataframe, but I am not sure (I am still new to Python). Any help is greatly appreciated.

# Setup

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.font_manager import FontProperties
import seaborn as sns

%matplotlib inline 
%config InlineBackend.figure_format = 'svg'

# Input parameters

Ba_sil_i = 0.66

P_Cpx = 1
P_Pl = 0
P_Opx = 0
P_Ol = 0
P_Mt = 0
P_Ilm = 0
P_Ap = 0
P_Chr = 0
P_Maj_gn = 0
P_Bt = 0

D_Ba_Cpx = 0.0085   # Bedard et al. (2009)
D_Ba_Pl = 0.149     # Bedard et al. (2009)
D_Ba_Opx = 0.0268   # Bedard et al. (2009)
D_Ba_Ol = 0.0033    # Bedard et al. (2009)
D_Ba_Mt = 0.001     # Bedard et al. (2009)
D_Ba_Ilm = 0.0242   # Bedard et al. (2009)
D_Ba_Ap = 0.06      # Bedard et al. (2009)
D_Ba_Chr = 0.0      # no value given; float() evaluates to 0.0
D_Ba_Maj_gn = 0.00184285714285714
D_Ba_Bt = 10        # Villemant et al. (1981) --> alkaline basalt

# Calculations

D_Ba_bulk = (P_Cpx * D_Ba_Cpx) + (P_Pl * D_Ba_Pl) + (P_Opx * D_Ba_Opx) + (P_Ol * D_Ba_Ol) + (P_Mt * D_Ba_Mt) + (P_Ilm * D_Ba_Ilm) + (P_Ap * D_Ba_Ap) + (P_Chr * D_Ba_Chr) + (P_Maj_gn * D_Ba_Maj_gn) + (P_Bt * D_Ba_Bt)

F_sil_remaining = np.arange(0.1, 1.0 + 0.01, 0.01)
Ba_sil_f = Ba_sil_i * (F_sil_remaining ** (D_Ba_bulk - 1))

# Data combined in a dataframe

combined_data = pd.DataFrame()
combined_data["F_sil_remaining"] = F_sil_remaining
combined_data["Ba_sil_f"] = Ba_sil_f

# Filter on column F_sil_remaining

combined_data_subset = combined_data[combined_data.F_sil_remaining.isin([0.2])]

CodePudding user response:

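This looks like a floating-point rounding issue. np.arange with a non-integer step produces values that are only approximately multiples of 0.01, so the column holds 0.19999999999999996 rather than exactly 0.2, and isin([0.2]) matches nothing. You can see this by printing the column: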

print(combined_data['F_sil_remaining'].to_list())

Output

[0.1, 0.11, 0.12, 0.13, 0.13999999999999999, 0.14999999999999997, 0.15999999999999998, 0.16999999999999998, 0.17999999999999997, 0.18999999999999995, 0.19999999999999996, 0.20999999999999996,...,0.9099999999999996, 0.9199999999999996, 0.9299999999999996, 0.9399999999999996, 0.9499999999999995, 0.9599999999999995, 0.9699999999999995, 0.9799999999999995, 0.9899999999999995, 0.9999999999999996]
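The mismatch can also be checked directly rather than read off the printed list. The snippet below is a minimal sketch that rebuilds the same array as in the question and looks up the element nearest to 0.2; the names idx, nearest and exact_match are introduced here only for illustration.

```python
import numpy as np

# Same grid as in the question
F_sil_remaining = np.arange(0.1, 1.0 + 0.01, 0.01)

# Locate the element that is closest to the target value 0.2
idx = int(np.argmin(np.abs(F_sil_remaining - 0.2)))
nearest = F_sil_remaining[idx]

# Exact comparison, which is what isin([0.2]) does under the hood
exact_match = bool(np.isin(0.2, F_sil_remaining))

print(repr(nearest))   # stored value nearest to 0.2
print(exact_match)     # whether 0.2 appears exactly in the array
print(nearest - 0.2)   # size of the discrepancy
```

The difference is tiny, but any exact comparison (==, isin) treats the two numbers as different.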

You could fix that by rounding the values with np.round before storing them, which makes the filter match, although that might not be the ideal solution:

combined_data["F_sil_remaining"] = np.round(F_sil_remaining,2)
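If you would rather not modify the stored values, two alternatives may be worth considering. This is only a sketch under the assumption that the grid is always in steps of 0.01: the first option filters with a tolerance via np.isclose, the second builds the grid from integers so that each value is the closest float to its decimal. The names mask, subset_tol, grid, exact_data and subset_exact are made up for this example.

```python
import numpy as np
import pandas as pd

# Option 1: keep the original grid and compare with a tolerance
F_sil_remaining = np.arange(0.1, 1.0 + 0.01, 0.01)
combined_data = pd.DataFrame({"F_sil_remaining": F_sil_remaining})

mask = np.isclose(combined_data["F_sil_remaining"], 0.2)  # default rtol/atol
subset_tol = combined_data[mask]

# Option 2: build the grid from integers, then scale once
# (20 / 100 is the same float as the literal 0.2)
grid = np.arange(10, 101) / 100
exact_data = pd.DataFrame({"F_sil_remaining": grid})
subset_exact = exact_data[exact_data["F_sil_remaining"].isin([0.2])]

print(subset_tol)
print(subset_exact)
```

np.isclose is the more general choice when the values come out of a calculation; the integer-based grid only helps when you control how the column is generated.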