I have the DataFrame, one of the column contains dictionary in rows in format:
rates
0 {'time': '2022-06-05T19:25:57.3000000Z', 'asset_id_quote': '0X', 'rate': 73571.98764837519}
1 {'time': '2022-06-05T19:25:57.3000000Z', 'asset_id_quote': '18C', 'rate': 11607635.869234081}
2 {'time': '2022-06-05T19:25:57.3000000Z', 'asset_id_quote': '1EARTH', 'rate': 4162937.862828232}
3 {'time': '2022-06-05T19:25:57.3000000Z', 'asset_id_quote': '1ECO', 'rate': 7337.697146675354}
4 {'time': '2022-06-05T19:25:57.3000000Z', 'asset_id_quote': '1INCH', 'rate': 34645.31024605586}
... ...
4917 {'time': '2022-06-05T19:25:07.1000000Z', 'asset_id_quote': 'SEK', 'rate': 291937.0479618742}
4918 {'time': '2022-06-05T19:25:07.1000000Z', 'asset_id_quote': 'SGD', 'rate': 41146.239317767526}
4919 {'time': '2022-06-05T19:25:07.1000000Z', 'asset_id_quote': 'THB', 'rate': 1026548.5675256335}
4920 {'time': '2022-06-05T19:25:07.1000000Z', 'asset_id_quote': 'TRY', 'rate': 495117.01581884566}
4921 {'time': '2022-06-05T19:25:07.1000000Z', 'asset_id_quote': 'ZAR', 'rate': 463779.32985313266}
How I can parse it in the loop in order to delete letters T
and Z
from this and change format to yyyy-mm-dd hh:mm:ss
?
I tried like:
def change_colomns_format(self, colomn_in_df1):
df = self.getting_response_and_df()
for i in colomn_in_df1:
for x in i:
x = [x[:19].replace("T", " ") for x in i]
print (i)
return df
But of course it's wrong.
CodePudding user response:
use Regex to remove T and Z here is an example
df['time'].str.replace(r'[T|Z]',' ', regex=True)
results in
time 2022-06-05 19:25:57.3000000
or
df['time'].astype('datetime64')
results in
time 2022-06-05 19:25:57.300
or
df['time'].astype('datetime64').dt.strftime('%Y-%m-%d %H:%M:%S')
results in
time 2022-06-05 19:25:57
CodePudding user response:
You can use pd.json_normalize
to create a dataframe from rates
column then fix format and finally use to_dict
to recreate dictionaries:
df['rates'] = (pd.json_normalize(df['rates'])
.assign(time=lambda x: x['time'].str[:-1].str.replace('T', ' '))
.to_dict('records'))
print(df)
# Output
rates
0 {'time': '2022-06-05 19:25:57.3000000', 'asset_id_quote': '0X', 'rate': 73571.98764837519}
1 {'time': '2022-06-05 19:25:57.3000000', 'asset_id_quote': '18C', 'rate': 11607635.869234081}
2 {'time': '2022-06-05 19:25:57.3000000', 'asset_id_quote': '1EARTH', 'rate': 4162937.862828232}
3 {'time': '2022-06-05 19:25:57.3000000', 'asset_id_quote': '1ECO', 'rate': 7337.697146675354}
4 {'time': '2022-06-05 19:25:57.3000000', 'asset_id_quote': '1INCH', 'rate': 34645.31024605586}
4917 {'time': '2022-06-05 19:25:07.1000000', 'asset_id_quote': 'SEK', 'rate': 291937.0479618742}
4918 {'time': '2022-06-05 19:25:07.1000000', 'asset_id_quote': 'SGD', 'rate': 41146.239317767526}
4919 {'time': '2022-06-05 19:25:07.1000000', 'asset_id_quote': 'THB', 'rate': 1026548.5675256335}
4920 {'time': '2022-06-05 19:25:07.1000000', 'asset_id_quote': 'TRY', 'rate': 495117.01581884566}
4921 {'time': '2022-06-05 19:25:07.1000000', 'asset_id_quote': 'ZAR', 'rate': 463779.32985313266}