My code seems to output the list I want. However, when I write it to CSV, the .csv file does not contain the same result. I suspect something is wrong at the end of my code. Could anyone please shed some light? Thanks in advance.
import pandas as pd
df = pd.read_csv('microRuleSet-row.csv')
deduplicated_list = list()
for index, row in df.iterrows():
    for item in row:
        if item not in deduplicated_list:
            deduplicated_list.append(item)
print(deduplicated_list)
df.to_csv('microRuleSet-row-noDupes.csv', index=False)
CodePudding user response:
I have not used pandas before, but it looks like you are writing the original DataFrame (the one loaded from microRuleSet-row.csv) back to CSV. You have to export deduplicated_list instead. If the goal is that each row must have no duplicated items, the code below does that. Note that the first (header) row of the output is now numbered 0 to 5. This can be changed back to the original headings by adding placeholders for the extra empty CSV cells.
import pandas as pd
df = pd.read_csv('microRuleSet-row.csv')
no_duplicates_list = []
for index, row in df.iterrows():
    new_row = []
    for item in row:
        if item not in new_row:
            new_row.append(item)
    no_duplicates_list.append(new_row)
print(no_duplicates_list)
df2 = pd.DataFrame(no_duplicates_list)
df2.to_csv('microRuleSet-row-noDupes.csv', index=False)
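To keep the original headings instead of the numbered ones, one option is to pad each shortened row back to the original width and reuse the input's column names. The following is only a rough sketch under some assumptions: the helper name `dedupe_rows`, the empty-string placeholder, and the small in-memory sample (with made-up column names `rule1` to `rule3`) are illustrative and not taken from the original file.

```python
import pandas as pd


def dedupe_rows(df: pd.DataFrame, placeholder="") -> pd.DataFrame:
    """Remove repeated values within each row, keeping first occurrences.

    Hypothetical helper: rows are padded with `placeholder` so that every
    row keeps the original number of columns and the original headings.
    """
    width = len(df.columns)
    rows = []
    # name=None yields plain tuples of the row values, without the index.
    for row in df.itertuples(index=False, name=None):
        new_row = []
        for item in row:
            if item not in new_row:
                new_row.append(item)
        # Fill the cells freed by removed duplicates with the placeholder.
        new_row += [placeholder] * (width - len(new_row))
        rows.append(new_row)
    return pd.DataFrame(rows, columns=df.columns)


# Made-up sample data standing in for microRuleSet-row.csv.
sample = pd.DataFrame(
    [["a", "b", "a"], ["c", "c", "c"]],
    columns=["rule1", "rule2", "rule3"],
)
result = dedupe_rows(sample)
print(result)
```

With the real file, the same helper could presumably be applied to the DataFrame returned by `pd.read_csv('microRuleSet-row.csv')`, followed by `result.to_csv('microRuleSet-row-noDupes.csv', index=False)`. Note that the empty-string placeholder turns numeric columns into mixed-type columns, which is fine for writing to CSV but may matter if the DataFrame is processed further.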