Find the number of decimal places, including trailing zeros, in a pandas column

Time:11-30

I have a pandas DataFrame that looks like this:

Customer ID  Customer Name   Price_Val    
5015            AXN           17.12            
5015            AXN           2.08             
5015            AXN           3.453             
7315            BXN           2.22             
7315            BXN           8.46567             
3283            CXN           88               
3283            CXN           0.4600             
3283            CXN           6.46   
3283            CXN                          

I want to create a column named dec_value that holds the number of decimal places in the corresponding Price_Val value.

For example, the dec_value column should look like this:

Customer ID  Customer Name   Price_Val  dec_value   
5015            AXN           17.12         2   
5015            AXN           2.08          2   
5015            AXN           3.453         3    
7315            BXN           2.22          2   
7315            BXN           8.46567       5
3283            CXN           88            0   
3283            CXN           0.4600        4     
3283            CXN           6.46          2
3283            CXN                         0 

I am using the code below to do this.

i = 0
for value in df1['Price_Val']:
    if value == '':
        df1.loc[i, "dec_value"] = 0
    else:
        # Position of '.' in the reversed string equals the number of digits after it
        k = str(value)[::-1].find('.')
        # find() returns -1 when there is no decimal point
        df1.loc[i, "dec_value"] = 0 if k == -1 else k
    i = i + 1

What's the most efficient way to do this?

CodePudding user response:

Convert the column to strings, split on the dot, strip trailing zeros from the fractional part, and count the remaining characters:

df['Price_Val'].fillna('').apply(lambda x: len(str(x).split('.')[-1].rstrip('0')))

or

df['dec_value'] = (df['Price_Val'].fillna('').astype(str)
                                  .str.split('.').str[-1]
                                  .str.rstrip('0').str.len()
                   )

output:

   Customer ID Customer Name  Price_Val  dec_value
0         5015           AXN   17.12000          2
1         5015           AXN    2.08000          2
2         5015           AXN    3.45300          3
3         7315           BXN    2.22000          2
4         7315           BXN    8.46567          5
5         3283           CXN   88.00000          0
6         3283           CXN    0.46000          2
7         3283           CXN    6.46000          2
8         3283           CXN        NaN          0
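
Note that the output above reports 2 for 0.4600: a float column cannot retain trailing zeros, so they are gone before any counting happens. If the trailing zeros matter, as the question title suggests, one option is to keep Price_Val as strings (for example by reading the file with dtype=str) and skip the rstrip step. A minimal sketch under that assumption; the sample frame is made up to mirror the question, and the regex is just one way to grab the fractional part:

```python
import pandas as pd

# Hypothetical sample: Price_Val kept as strings so trailing zeros survive.
df = pd.DataFrame({
    "Customer ID": [5015, 7315, 3283, 3283, 3283],
    "Price_Val": ["17.12", "8.46567", "88", "0.4600", None],
})

# Capture the digits after the dot; rows without a dot (or missing) give NaN.
frac = df["Price_Val"].fillna("").str.extract(r"\.(\d*)$", expand=False)

# Length of the captured digits, with 0 where there was no decimal part.
df["dec_value"] = frac.str.len().fillna(0).astype(int)

print(df["dec_value"].tolist())  # [2, 5, 0, 4, 0]
```

Here 0.4600 counts as 4 decimal places, matching the expected output in the question.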

Alternatively, using a regex:

df['dec_value'] = (df['Price_Val'].fillna('').astype(str)
                                  .str.extract(r'\.(\d*[1-9])', expand=False)
                                  .str.len().fillna(0).astype(int)
                  )

Timing of the alternatives (90k rows):

# apply
50.5 ms ± 913 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

# regex
83.9 ms ± 323 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

# str pipeline
115 ms ± 2.39 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
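
To reproduce a comparison like this, the three approaches can be wrapped in functions, checked against each other, and timed with timeit. This is only a rough sketch: the function names and the test data (the question's values repeated to 90k rows) are assumptions, and the absolute numbers will differ by machine and pandas version.

```python
import timeit

import pandas as pd

# Hypothetical data: the question's values repeated to 90k rows.
base = [17.12, 2.08, 3.453, 2.22, 8.46567, 88, 0.46, 6.46, None]
df = pd.DataFrame({"Price_Val": base * 10_000})


def with_apply(s):
    return s.fillna("").apply(lambda x: len(str(x).split(".")[-1].rstrip("0")))


def with_str_pipeline(s):
    return s.fillna("").astype(str).str.split(".").str[-1].str.rstrip("0").str.len()


def with_regex(s):
    out = s.fillna("").astype(str).str.extract(r"\.(\d*[1-9])", expand=False)
    return out.str.len().fillna(0).astype(int)


s = df["Price_Val"]

# All three approaches should agree before their speed is compared.
expected = with_apply(s).tolist()
assert with_str_pipeline(s).tolist() == expected
assert with_regex(s).tolist() == expected

for fn in (with_apply, with_regex, with_str_pipeline):
    seconds = timeit.timeit(lambda: fn(s), number=5) / 5
    print(f"{fn.__name__}: {seconds * 1000:.1f} ms per call")
```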