split array from csv into columns

Time:12-15

I have this CSV data:

number,event_date,event_timestamp,event_name,event_params
0,20220315,1668314165054758,eventTracking,"[{'key': 'test0', 'value': {'string_value': None, 'int_value': 1665374225, 'float_value': None, 'double_value': None}}
 {'key': 'test1', 'value': {'string_value': None, 'int_value': 0, 'float_value': None, 'double_value': None}}
 {'key': 'test2', 'value': {'string_value': 'http:\test.com', 'int_value': None, 'float_value': None, 'double_value': None}}
 {'key': 'test3', 'value': {'string_value': '[email protected]', 'int_value': None, 'float_value': None, 'double_value': None}}
 {'key': 'test4', 'value': {'string_value': None, 'int_value': 5, 'float_value': None, 'double_value': None}}]"

I want to get this:

number,event_date,event_timestamp,event_name1,event_params1
0,20220315,1668314165054758,test0,None
0,20220315,1668314165054758,test1,None
0,20220315,1668314165054758,test2,http:\test.com
0,20220315,1668314165054758,test3,[email protected]
0,20220315,1668314165054758,test4,None

Can you please help? Thank you.

CodePudding user response:

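import ast
import re

import pandas as pd

df = pd.read_csv("check.csv")
print(df)
# Take the list of dicts from the relevant column (first row only)
dc = df.event_params.iloc[0]
print(dc)

# The list is actually a string, so we need to parse it.
# Before parsing, fix the string: add a comma after each "}}" (the dicts
# are separated by newlines, not commas) and double the backslashes so
# that 'http:\test.com' is not read as containing a tab character.
r = re.sub("}}", "}},", dc.replace("\\", "\\\\"))
# ast.literal_eval only accepts Python literals, so it is safer than eval
ev = ast.literal_eval(r)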

# number, event_date and event_timestamp are taken straight from the row
number = df.number.iloc[0]
event_date = df.event_date.iloc[0]
event_timestamp = df.event_timestamp.iloc[0]

# Now loop over the list of dicts and collect the needed values
event_name1 = []
event_params1 = []

for i in ev:
    event_name1.append(i['key'])
    event_params1.append(i['value']['string_value'])


# Now create the final DataFrame and insert all the values.
# The list columns go in first so the scalar values are broadcast to every row.
df_final = pd.DataFrame()

df_final['event_name1'] = event_name1
df_final['event_params1'] = event_params1
df_final['number'] = number
df_final['event_date'] = event_date
df_final['event_timestamp'] = event_timestamp
df_final = df_final[['number', 'event_date', 'event_timestamp', 'event_name1',
                     'event_params1']]

print(df_final)

# Save as csv (index=False keeps the row index out of the file)
df_final.to_csv("This_is_what_you_need.csv", index=False)

And the result is:

[Image: Result]
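
If the file has more than one row, the same idea can probably be applied to every row instead of only `iloc[0]`. Below is a minimal sketch, assuming every `event_params` cell has the same shape as in the question (dicts separated by newlines, at least one entry per row). The helper names `parse_event_params` and `explode_event_params` are made up for illustration and are not part of pandas.

```python
import ast
import re

import pandas as pd


def parse_event_params(raw: str) -> list:
    """Turn the newline-separated dict dump into a real list of dicts."""
    # Double the backslashes so 'http:\test.com' keeps its backslash,
    # then add the commas that are missing between the dicts.
    fixed = re.sub(r"\}\}", "}},", raw.replace("\\", "\\\\"))
    # literal_eval only evaluates Python literals (no arbitrary code).
    return ast.literal_eval(fixed)


def explode_event_params(df: pd.DataFrame) -> pd.DataFrame:
    """Return one output row per entry in each row's event_params."""
    out = df.copy()
    out["event_params"] = out["event_params"].map(parse_event_params)
    # explode() repeats the other columns once per list element.
    out = out.explode("event_params", ignore_index=True)
    out["event_name1"] = out["event_params"].map(lambda d: d["key"])
    out["event_params1"] = out["event_params"].map(
        lambda d: d["value"]["string_value"]
    )
    return out[["number", "event_date", "event_timestamp",
                "event_name1", "event_params1"]]
```

With the file from the question this could be called as `explode_event_params(pd.read_csv("check.csv")).to_csv("out.csv", index=False)`, where `out.csv` is just a placeholder name. Using `explode` keeps `number`, `event_date` and `event_timestamp` attached to each parameter without hardcoding them.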