Convert DataFrame column values to a common decimal place

I have a pandas DataFrame that looks like this:

CustId CustName  Price_1  Priceqty1  Price_2  Priceqty2  Price_3  Priceqty3  Price_4   Price_5
5015      Axn     315.12      1        374       1        126.32       3     167.8765     
5015      Axn     75.36       3      190.19      7        33.16        1     190.19      88
5015      Axn     123         5                  4        18.07        2      0.073      12
7315      Bxy     12.0        4      22.345      3        77.89        1     345.0       3344
7315      Bxy     987.90      7      34.06       4        90.09        3     876.34      908.76
3283      Cxz     123.34      8      55.78       7        12           9     878.09      98.456
3283      Cxz     178.90      7       88         8        0.09         0                 987.56

I have 5 price columns whose values have different numbers of decimal places. To make the data easier for the end user to read, I need to convert all price column values to a common number of decimal places.

The conversion must follow two rules:
1. It is done separately for each customer ID.
2. The common number of decimal places is the largest number of decimal places found in any price value for that customer.

Take CustId 5015 as an example:

CustId CustName  Price_1  Priceqty1  Price_2  Priceqty2  Price_3  Priceqty3  Price_4   Price_5
5015      Axn     315.12      1        374       1        126.32       3     167.8765     
5015      Axn     75.36       3      190.19      7        33.16        1     190.19      88
5015      Axn     123         5                  4        18.07        2      0.073      12

In the first row, Price_4 has the value 167.8765, which has 4 decimal places. Across all price columns for CustId 5015, 4 is the largest number of decimal places, so every price value for CustId 5015 must be converted to 4 decimal places.

After conversion, it should look like this:

CustId CustName  Price_1  Priceqty1  Price_2   Priceqty2  Price_3  Priceqty3  Price_4   Price_5
5015      Axn     315.1200      1    374.0000      1      126.3200     3     167.8765     
5015      Axn     75.3600       3    190.1900      7      33.1600      1     190.1900    88
5015      Axn     123.0000      5                  4      18.0700      2     0.0730      12

Blank values should stay blank.

I need to do this for every CustId, and there are more than 800 different customer IDs. What is the most efficient way to do this?

CodePudding user response：

You could start by computing the lengths of the integer and decimal parts of the numbers and taking the maximum of each:

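lengths = (df.filter(like='Price_')
             .stack().dropna().astype(str)            # drop blanks first; after astype(str) NaN would be the string 'nan'
             .str.split('.', expand=True).fillna('')  # a missing decimal part counts as length 0
             .apply(lambda c: c.str.len())
             .max()
            )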

output:

0    4  # this is the integer part
1    4  # this is the decimal part

Then, if you do not want to change the data but only print the dataframe:

integer, decimal = lengths.values
total = integer + decimal + 1  # +1 for the decimal point
custom_format = '{:%s.%sf}' % (total,decimal)
pd.options.display.float_format = custom_format.format
print(df.fillna(''))

output:

   CustId CustName   Price_1  Priceqty1   Price_2  Priceqty2   Price_3  Priceqty3   Price_4   Price_5
0    5015      Axn  315.1200          1  374.0000     1.0000  126.3200     3.0000  167.8765          
1    5015      Axn   75.3600          3  190.1900     7.0000   33.1600     1.0000  190.1900   88.0000
2    5015      Axn  123.0000          5    4.0000    18.0700    2.0000     0.0730   12.0000          
3    7315      Bxy   12.0000          4   22.3450     3.0000   77.8900     1.0000  345.0000 3344.0000
4    7315      Bxy  987.9000          7   34.0600     4.0000   90.0900     3.0000  876.3400  908.7600
5    3283      Cxz  123.3400          8   55.7800     7.0000   12.0000     9.0000  878.0900   98.4560
6    3283      Cxz  178.9000          7   88.0000     8.0000    0.0900     0.0000  987.5600
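
Note that the lengths above are computed over the whole frame, while the question asks for one value per CustId. A minimal sketch of a per-customer count follows. It assumes the prices are available as strings (for example, read with dtype=str) so that the decimal places as typed are preserved; the sample frame and the helper name decimal_places are made up for illustration.

```python
import pandas as pd

# Hypothetical sample; prices are kept as strings so that "12.0" and "12"
# remain distinguishable. Blank cells are represented as empty strings.
df = pd.DataFrame({
    "CustId": [5015, 5015, 7315],
    "Price_1": ["315.12", "75.36", "12.0"],
    "Price_4": ["167.8765", "", "345.0"],
})

def decimal_places(value: str) -> int:
    """Number of characters after the decimal point (0 if none or blank)."""
    _, _, frac = value.partition(".")
    return len(frac)

price_cols = df.filter(like="Price_").columns

# Decimal-place count per cell, then the max per row, then the max per customer.
places = df[price_cols].apply(lambda col: col.map(decimal_places))
max_places = places.max(axis=1).groupby(df["CustId"]).max()
print(max_places)
```

The result is a Series indexed by CustId, which could then be used to choose a format for each customer's rows.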

CodePudding user response：

df = df.fillna('')
m = df.filter(regex='Price_').astype(str).applymap(lambda x: len(x.split('.')[1]) if x else 0).max().max()
pd.options.display.float_format = ('{:,.' + str(m) + 'f}').format  # based on https://stackoverflow.com/a/20937592/15035314
print(df)

Prints:

   CustId CustName  Price_1  Priceqty1  ...  Price_3  Priceqty3  Price_4    Price_5
0    5015      Axn 315.1200          1  ... 126.3200          3 167.8764           
1    5015      Axn  75.3600          3  ...  33.1600          1 190.1900    88.0000
2    5015      Axn 123.0000          5  ...  18.0700          2   0.0730    12.0000
3    7315      Bxy  12.0000          4  ...  77.8900          1 345.0000 3,344.0000
4    7315      Bxy 987.9000          7  ...  90.0900          3 876.3400   908.7600
5    3283      Cxz 123.3400          8  ...  12.0000          9 878.0900    98.4560
6    3283      Cxz 178.9000          7  ...   0.0900          0            987.5600

[7 rows x 10 columns]
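
Both answers above set a single display format for the whole frame. If the number of decimal places really has to differ per CustId, one possible approach is to turn the price columns into formatted strings, group by group. The following is only a sketch under some assumptions: the price columns are floats with NaN for blanks, the values have ordinary magnitudes (no scientific notation), and the sample frame and the helper name n_decimals are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with float prices; NaN stands for a blank cell.
df = pd.DataFrame({
    "CustId": [5015, 5015, 5015, 7315],
    "Price_1": [315.12, 75.36, 123.0, 12.0],
    "Price_2": [374.0, 190.19, np.nan, 22.345],
    "Price_4": [167.8765, 190.19, 0.073, 345.0],
})
price_cols = list(df.filter(like="Price_").columns)

def n_decimals(x) -> int:
    """Decimal places in the shortest repr of a float (0 for NaN or whole numbers)."""
    if pd.isna(x):
        return 0
    _, _, frac = repr(float(x)).partition(".")
    return 0 if frac in ("", "0") else len(frac)

# Decimal places per cell, max per row, then broadcast the max per customer
# back onto every row of that customer.
cell_places = df[price_cols].apply(lambda col: col.map(n_decimals))
row_places = cell_places.max(axis=1).groupby(df["CustId"]).transform("max")

# Format each price with its customer's number of decimals; blanks stay blank.
out = df.copy()
for col in price_cols:
    out[col] = [
        "" if pd.isna(v) else f"{v:.{p}f}"
        for v, p in zip(df[col], row_places)
    ]
print(out)
```

The price columns in out are strings, so this is suited to presentation or export rather than further arithmetic. Because the data is float, a value typed as 12.0 counts as having no decimal places here; reading the prices as strings would be needed to preserve that distinction.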