Convert Dataframe column values to common decimal place

Time:11-12

I have a pandas DataFrame that looks like this:

CustId CustName  Price_1  Priceqty1  Price_2  Priceqty2  Price_3  Priceqty3  Price_4   Price_5
5015      Axn     315.12      1        374       1        126.32       3     167.8765     
5015      Axn     75.36       3      190.19      7        33.16        1     190.19      88
5015      Axn     123         5                  4        18.07        2      0.073      12
7315      Bxy     12.0        4      22.345      3        77.89        1     345.0       3344
7315      Bxy     987.90      7      34.06       4        90.09        3     876.34      908.76
3283      Cxz     123.34      8      55.78       7        12           9     878.09      98.456
3283      Cxz     178.90      7       88         8        0.09         0                 987.56

I have five price columns whose values have different numbers of decimal places. To make the data easier for end users to read, I need to convert all price column values to a common number of decimal places.

The conversion must follow two rules:

1. The conversion is done separately for each customer ID.
2. The common number of decimal places is determined by the price value with the most decimal places.

Take CustId 5015 as an example:

CustId CustName  Price_1  Priceqty1  Price_2  Priceqty2  Price_3  Priceqty3  Price_4   Price_5
5015      Axn     315.12      1        374       1        126.32       3     167.8765     
5015      Axn     75.36       3      190.19      7        33.16        1     190.19      88
5015      Axn     123         5                  4        18.07        2      0.073      12

The first row of Price_4 has the value 167.8765, which has 4 decimal places. Across all price columns for CustId 5015, 4 is the largest number of decimal places, so every price value for CustId 5015 needs to be converted to 4 decimal places.
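To make the rule concrete, here is a minimal sketch in plain Python that counts the decimal places of each price for CustId 5015 and takes the largest. The helper name `decimal_places` is hypothetical, and the values are written as strings on the assumption that they are read as text, so the digits are counted exactly as displayed.

```python
from decimal import Decimal

def decimal_places(text):
    """Return the number of digits after the decimal point in a numeric string."""
    exponent = Decimal(text).as_tuple().exponent
    return max(0, -exponent)

# Price values of CustId 5015 from the table above (blanks left out).
prices_5015 = ["315.12", "374", "126.32", "167.8765",
               "75.36", "190.19", "33.16", "190.19", "88",
               "123", "18.07", "0.073", "12"]

common_places = max(decimal_places(p) for p in prices_5015)
print(common_places)  # 4, driven by 167.8765
```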

After conversion, this is how it should look:

CustId CustName  Price_1  Priceqty1  Price_2   Priceqty2  Price_3  Priceqty3  Price_4   Price_5
5015      Axn     315.1200      1    374.0000      1      126.3200     3     167.8765     
5015      Axn     75.3600       3    190.1900      7      33.1600      1     190.1900    88
5015      Axn     123.0000      5                  4      18.0700      2     0.0730      12

Blank values should stay blank.

I need to do this for every CustId, and there are more than 800 different customer IDs. What is the most efficient way to do this?

CodePudding user response:

You could start by computing the integer and decimal part length of the numbers and take the max:

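lengths = (df.filter(like='Price_')
             .stack().dropna().astype(str)   # drop blanks before converting to text
             .str.split('.', expand=True)    # column 0: integer part, column 1: decimal part
             .fillna('')                     # values without a '.' have no decimal part
             .apply(lambda c: c.str.len())
             .max()
            )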

output:

0    4  # this is the integer part
1    4  # this is the decimal part

Then, if you do not want to change the data but only print the dataframe:

integer, decimal = lengths.values
total = integer + decimal + 1  # +1 for the decimal point
custom_format = '{:%s.%sf}' % (total,decimal)
pd.options.display.float_format = custom_format.format
print(df.fillna(''))

output:

   CustId CustName   Price_1  Priceqty1   Price_2  Priceqty2   Price_3  Priceqty3   Price_4   Price_5
0    5015      Axn  315.1200          1  374.0000     1.0000  126.3200     3.0000  167.8765          
1    5015      Axn   75.3600          3  190.1900     7.0000   33.1600     1.0000  190.1900   88.0000
2    5015      Axn  123.0000          5    4.0000    18.0700    2.0000     0.0730   12.0000          
3    7315      Bxy   12.0000          4   22.3450     3.0000   77.8900     1.0000  345.0000 3344.0000
4    7315      Bxy  987.9000          7   34.0600     4.0000   90.0900     3.0000  876.3400  908.7600
5    3283      Cxz  123.3400          8   55.7800     7.0000   12.0000     9.0000  878.0900   98.4560
6    3283      Cxz  178.9000          7   88.0000     8.0000    0.0900     0.0000  987.5600          

CodePudding user response:

df = df.fillna('')
m = df.filter(regex='Price_').astype(str).applymap(lambda x: len(x.split('.')[1]) if x else 0).max().max()
pd.options.display.float_format = ('{:,.' + str(m) + 'f}').format # based on https://stackoverflow.com/a/20937592/15035314
print(df)

Prints:

   CustId CustName  Price_1  Priceqty1  ...  Price_3  Priceqty3  Price_4    Price_5
0    5015      Axn 315.1200          1  ... 126.3200          3 167.8764           
1    5015      Axn  75.3600          3  ...  33.1600          1 190.1900    88.0000
2    5015      Axn 123.0000          5  ...  18.0700          2   0.0730    12.0000
3    7315      Bxy  12.0000          4  ...  77.8900          1 345.0000 3,344.0000
4    7315      Bxy 987.9000          7  ...  90.0900          3 876.3400   908.7600
5    3283      Cxz 123.3400          8  ...  12.0000          9 878.0900    98.4560
6    3283      Cxz 178.9000          7  ...   0.0900          0            987.5600

[7 rows x 10 columns]
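
Both snippets above derive one precision for the whole DataFrame, whereas the question asks for one per CustId. The following is a minimal sketch of that per-customer rule, not a tested production solution. It assumes the price columns are float columns whose names start with `Price_`, that blanks are NaN, and that all values are finite. The helper names `decimal_places` and `format_prices` and the small sample frame are made up for illustration.

```python
from decimal import Decimal

import numpy as np
import pandas as pd

def decimal_places(value):
    """Count the digits after the decimal point of a float; NaN counts as 0."""
    if pd.isna(value):
        return 0
    # normalize() strips trailing zeros, so 12.0 counts as 0 places
    exponent = Decimal(str(float(value))).normalize().as_tuple().exponent
    return max(0, -exponent)

def format_prices(df, id_col="CustId", prefix="Price_"):
    """Return a copy whose price columns are strings padded per customer."""
    price_cols = [c for c in df.columns if c.startswith(prefix)]
    # Decimal places of every price cell
    places = df[price_cols].apply(lambda col: col.map(decimal_places))
    # Largest count per row, then per customer, broadcast back to each row
    per_customer = places.max(axis=1).groupby(df[id_col]).transform("max")
    out = df.copy()
    for col in price_cols:
        out[col] = [
            "" if pd.isna(v) else f"{v:.{int(p)}f}"  # blanks stay blank
            for v, p in zip(df[col], per_customer)
        ]
    return out

# Small sample in the shape of the question's data (assumed, abbreviated)
df = pd.DataFrame({
    "CustId":  [5015, 5015, 5015, 7315, 7315],
    "Price_1": [315.12, 75.36, 123.0, 12.0, 987.90],
    "Price_2": [374.0, 190.19, np.nan, 22.345, 34.06],
    "Price_4": [167.8765, 190.19, 0.073, 345.0, 876.34],
})
result = format_prices(df)
print(result)
```

With this sample, the rows for CustId 5015 come out with four decimals and those for 7315 with three. The price columns are returned as strings, since a float column cannot carry a different number of trailing zeros per row. The quantity columns (Priceqty1 and so on) are untouched because their names do not start with `Price_`.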