Home > other >  How to parse pandas column in the loop if it's a dictionary
How to parse pandas column in the loop if it's a dictionary

Time:06-06

I have the DataFrame, one of the column contains dictionary in rows in format:

                                                                                                rates
0         {'time': '2022-06-05T19:25:57.3000000Z', 'asset_id_quote': '0X', 'rate': 73571.98764837519}
1       {'time': '2022-06-05T19:25:57.3000000Z', 'asset_id_quote': '18C', 'rate': 11607635.869234081}
2     {'time': '2022-06-05T19:25:57.3000000Z', 'asset_id_quote': '1EARTH', 'rate': 4162937.862828232}
3       {'time': '2022-06-05T19:25:57.3000000Z', 'asset_id_quote': '1ECO', 'rate': 7337.697146675354}
4      {'time': '2022-06-05T19:25:57.3000000Z', 'asset_id_quote': '1INCH', 'rate': 34645.31024605586}
...                                                                                               ...
4917     {'time': '2022-06-05T19:25:07.1000000Z', 'asset_id_quote': 'SEK', 'rate': 291937.0479618742}
4918    {'time': '2022-06-05T19:25:07.1000000Z', 'asset_id_quote': 'SGD', 'rate': 41146.239317767526}
4919    {'time': '2022-06-05T19:25:07.1000000Z', 'asset_id_quote': 'THB', 'rate': 1026548.5675256335}
4920    {'time': '2022-06-05T19:25:07.1000000Z', 'asset_id_quote': 'TRY', 'rate': 495117.01581884566}
4921    {'time': '2022-06-05T19:25:07.1000000Z', 'asset_id_quote': 'ZAR', 'rate': 463779.32985313266}

How I can parse it in the loop in order to delete letters T and Z from this and change format to yyyy-mm-dd hh:mm:ss?

I tried like:

def change_colomns_format(self, colomn_in_df1):
        df = self.getting_response_and_df()
        for i in colomn_in_df1: 
            for x in i:
                x = [x[:19].replace("T", " ") for x in i]
                print (i)
        return df

But of course it's wrong.

CodePudding user response:

use Regex to remove T and Z here is an example

df['time'].str.replace(r'[T|Z]',' ', regex=True)

results in

time              2022-06-05 19:25:57.3000000 

or

df['time'].astype('datetime64')

results in

time             2022-06-05 19:25:57.300

or

df['time'].astype('datetime64').dt.strftime('%Y-%m-%d %H:%M:%S')

results in

time              2022-06-05 19:25:57

CodePudding user response:

You can use pd.json_normalize to create a dataframe from rates column then fix format and finally use to_dict to recreate dictionaries:

df['rates'] = (pd.json_normalize(df['rates'])
                 .assign(time=lambda x: x['time'].str[:-1].str.replace('T', ' '))
                 .to_dict('records'))
print(df)

# Output
                                                                                               rates
0         {'time': '2022-06-05 19:25:57.3000000', 'asset_id_quote': '0X', 'rate': 73571.98764837519}
1       {'time': '2022-06-05 19:25:57.3000000', 'asset_id_quote': '18C', 'rate': 11607635.869234081}
2     {'time': '2022-06-05 19:25:57.3000000', 'asset_id_quote': '1EARTH', 'rate': 4162937.862828232}
3       {'time': '2022-06-05 19:25:57.3000000', 'asset_id_quote': '1ECO', 'rate': 7337.697146675354}
4      {'time': '2022-06-05 19:25:57.3000000', 'asset_id_quote': '1INCH', 'rate': 34645.31024605586}
4917     {'time': '2022-06-05 19:25:07.1000000', 'asset_id_quote': 'SEK', 'rate': 291937.0479618742}
4918    {'time': '2022-06-05 19:25:07.1000000', 'asset_id_quote': 'SGD', 'rate': 41146.239317767526}
4919    {'time': '2022-06-05 19:25:07.1000000', 'asset_id_quote': 'THB', 'rate': 1026548.5675256335}
4920    {'time': '2022-06-05 19:25:07.1000000', 'asset_id_quote': 'TRY', 'rate': 495117.01581884566}
4921    {'time': '2022-06-05 19:25:07.1000000', 'asset_id_quote': 'ZAR', 'rate': 463779.32985313266}
  • Related