Pandas universal data structure conversion

Time:07-30

Hello friends, I've run into a problem. As input I get a price column from a CSV or XLSX file. The data contains a mix of str, int and float values, and some of the strings use a comma instead of a period as the decimal separator. I need the output as float. When I try to replace the commas, my data is overwritten with zeros. What should I do?

import pandas as pd

df = pd.DataFrame({'price': ['21', 22.0, '23', 24.0, 25, 
                             '26,0', 27, '28', 29, 30, 31, 32, 33, 
                              34, 35.0, 36.0, 37],})

def data_structure_change(df):    
    """1.Change commas to points
    2.All empty values are assigned zero.
    3.Change data type to float
    4.Change data type to integer"""
    
    try:
        df['price'] = df['price'].str.replace(',', '.',)
    except AttributeError:
        pass
    finally:
        df.price = df.price.fillna(0)
        df.price = df.price.astype('float')
    return df

Answer:
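
The zeros come from how the .str accessor treats a mixed-type column: it only processes elements that are strings and returns NaN for everything else, without raising AttributeError, so the except branch never runs and fillna(0) then replaces the numbers with 0. Here is a minimal sketch of that behaviour, using a shortened copy of the column from the question (the Series `s` and the variable names are only for illustration):

```python
import pandas as pd

# A shortened, mixed-type version of the 'price' column from the question
s = pd.Series(['21', 22.0, '26,0', 27], dtype=object)

# .str.replace only touches str elements; non-strings come back as NaN
replaced = s.str.replace(',', '.', regex=False)
print(replaced.tolist())

# fillna(0) then turns those NaN values into zeros before the float cast
zeroed = replaced.fillna(0).astype(float)
print(zeroed.tolist())
```

The numeric entries (22.0 and 27) are the ones that end up as 0.0, which matches the symptom described in the question.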

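I moved the conversion into a helper function, conv_type(), and call it on every value with the apply method. It checks the type of each value and converts it accordingly. Note that apply calls the function once per element in Python, so it is not vectorised, but for a column like this it is simple and reliable.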

This would work:

import pandas as pd
df = pd.DataFrame({'price': ['21', 22.0, '23', 24.0, 25, 
                             '26,0', 27, '28', 29, 30, 31, 32, 33, 
                              34, 35.0, 36.0, 37],})

def data_structure_change(df: pd.DataFrame):    

    # apply the type conversion
    df['price'] = df['price'].apply(conv_type)
    return df


def conv_type(x):
    """Convert a single value to float.

    Strings may use a comma as the decimal separator;
    ints and floats are converted directly."""
    if isinstance(x, str):
        # swap the decimal comma for a point before converting
        return float(x.replace(',', '.'))

    if isinstance(x, (int, float)):
        return float(x)

    # any other type (e.g. None) is returned unchanged
    return x

x = data_structure_change(df)
print(x)

result:

    price
0    21.0
1    22.0
2    23.0
3    24.0
4    25.0
5    26.0
6    27.0
7    28.0
8    29.0
9    30.0
10   31.0
11   32.0
12   33.0
13   34.0
14   35.0
15   36.0
16   37.0
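
If the column is large, a vectorised variant may be worth trying. The sketch below is one possible approach, not the only one: it casts every value to str first so that .str.replace sees all elements, then lets pd.to_numeric do the parsing. The helper name to_float_column and the None entry in the sample data are assumptions added for illustration.

```python
import pandas as pd

# Shortened sample; the None entry is an assumed "empty" value
df = pd.DataFrame({'price': ['21', 22.0, '26,0', 27, None]})

def to_float_column(col: pd.Series) -> pd.Series:
    """Hypothetical helper: convert a mixed str/int/float column to float."""
    # Cast everything to str first so .str.replace sees every element
    as_text = col.astype(str).str.replace(',', '.', regex=False)
    # Missing or unparseable entries end up as NaN, which is then set to 0.0
    return pd.to_numeric(as_text, errors='coerce').fillna(0.0)

df['price'] = to_float_column(df['price'])
print(df['price'].tolist())
```

Be aware that errors='coerce' silently turns any text it cannot parse into NaN (and here into 0.0), so it is worth checking the input for unexpected values if zeros would be misleading.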