input file.csv
['NE,PORT,EVENT,TIME,VALUE',
'NODE,13,MAX,2021-08-30 09:15:00 01:00 DST,-10.9',
'NODE,13,MIN,2021-08-30 09:15:00 01:00 DST,-11.0',
'NODE,13,CUR,2021-08-30 09:15:00 01:00 DST,-10.9',
'NODE,13,MAX,2021-08-30 10:30:00 01:00 DST,-12.9',
'NODE,13,MIN,2021-08-30 10:30:00 01:00 DST,-10.0',
'NODE,13,CUR,2021-08-30 10:30:00 01:00 DST,-12.9']
python code:
intext=open('file.csv', 'r')
check=intext.readlines()
for lista in check:
lista_split=lista.split(",")
lista_split.extend(['MAX','MIN','CUR'])
lista_index=[0,1,2,3,4]
lista_index.extend([5,6,7])
contents=list(lista_split[i] for i in lista_index)
if contents[2]==('MAX'):
contents[5] = contents[4])
elif contents[2]==('MIN'):
contents[6] = contents[4])
elif contents[2]==('CUR'):
contents[7] = contents[4])
contents.remove(contents[2])
contents.remove(contents[4])
print(contents)
step1 move EVENT as columns and the corresponding value, step2 clean columns (remove EVENT and VALUE), done!
['NE', 'PORT', 'TIME', 'MAX', 'MIN', 'CUR']
['NODE', '13', '2021-08-30 09:15:00 01:00 DST', '-10.9', 'MIN', 'CUR']
['NODE', '13', '2021-08-30 09:15:00 01:00 DST', 'MAX', '-11.0', 'CUR']
['NODE', '13', '2021-08-30 09:15:00 01:00 DST', 'MAX', 'MIN', '-10.9']
['NODE', '13', '2021-08-30 10:30:00 01:00 DST', '-12.9', 'MIN', 'CUR']
['NODE', '13', '2021-08-30 10:30:00 01:00 DST', 'MAX', '-10.0', 'CUR']
['NODE', '13', '2021-08-30 10:30:00 01:00 DST', 'MAX', 'MIN', '-12.9']
target:
['NE', 'PORT', 'TIME', 'MAX', 'MIN', 'CUR']
['NODE', '13', '2021-08-30 09:15:00 01:00 DST', '-10.9', '-11.0', '-10.9']
['NODE', '13', '2021-08-30 10:30:00 01:00 DST', '-12.9', '-10.0', '-12.9']
CodePudding user response:
One way to do this is by creating a pivot table.
csv = ['NE,PORT,EVENT,TIME,VALUE',
'NODE,13,MAX,2021-08-30 09:15:00 01:00 DST,-10.9',
'NODE,13,MIN,2021-08-30 09:15:00 01:00 DST,-11.0',
'NODE,13,CUR,2021-08-30 09:15:00 01:00 DST,-10.9',
'NODE,13,MAX,2021-08-30 10:30:00 01:00 DST,-12.9',
'NODE,13,MIN,2021-08-30 10:30:00 01:00 DST,-10.0',
'NODE,13,CUR,2021-08-30 10:30:00 01:00 DST,-12.9']
csv = [x.split(",") for x in csv]
df = pd.DataFrame(csv[1:], columns=csv[0])
df["VALUE"] = df["VALUE"].astype(float)
df = df.pivot_table("VALUE", ["NE", "PORT", "TIME"], "EVENT")
df = df.reset_index().rename_axis(None, axis=1)
CodePudding user response:
i'll correct just your first step. your logic was good, but there was a lot of confusion due to too many lists
for lista in check:
lista=lista.split(",")
lista.extend(['MAX','MIN','CUR'])
if lista[2]==('MAX'):
lista[5] = lista[4]
elif lista[2]==('MIN'):
lista[6] = lista[4]
elif lista[2]==('CUR'):
lista[7] = lista[4]
lista.remove(lista[2])
lista.remove(lista[3]) #you removed an index before so it's not 4 anymore
print(lista)