Correction of the recorded data in the CSV file after aggregation

Time:02-19

To aggregate the data, I use the code:

import pandas
df = pandas.read_csv("./input_file.csv", delimiter=";", low_memory=False)
df.head()
count_severity = df.groupby("B")["A"].unique()
has_multiple_elements = count_severity.apply(lambda x: len(x)>1)
result = count_severity[has_multiple_elements]
result.to_csv("./output_file.csv", sep=";", line_terminator=None)

In the output file I get the data in the following form:

(screenshot of the output file; image not available)

Could you please tell me how to remove the square brackets [] and the quotation marks "" (both highlighted in red), and how to replace the spaces (highlighted in yellow) with, for example, a comma? I assume the quotes appear because of a line break \n, i.e. in the second line there is a \n between the digits 14 and 25. I tried passing the parameter line_terminator=None, but that did not help.

CodePudding user response:

If you are not tied to CSV and can use another format, pickle would be convenient here, because it stores the arrays in each cell as objects rather than as their text representation:

# A pickle file is binary, so a .pkl extension is less misleading than .csv
result.to_pickle("./output_file.pkl")
df2 = pandas.read_pickle("./output_file.pkl")
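To illustrate the round trip, here is a minimal sketch. The sample frame and the temporary file path are made up for the example; in the question the object being saved would be `result`.

```python
import os
import tempfile

import pandas as pd

# Made-up sample data standing in for the real input file
df = pd.DataFrame({"A": [1, 2, 3], "B": ["x", "x", "y"]})
result = df.groupby("B")["A"].unique()

# Hypothetical output path in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "output_file.pkl")
result.to_pickle(path)
restored = pd.read_pickle(path)

# Each cell comes back as a NumPy array, not as a string like "[1 2]"
first = restored.loc["x"]
```

The downside is that a pickle file is not human-readable and cannot be opened in a spreadsheet, so this only helps if the file is read back by Python.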

CodePudding user response:

So far, nothing comes to mind, except for this solution:

with open("./output_file.csv", "r") as output_file:
    text = output_file.read()
# replace("[]", "") only matches the literal pair "[]", so strip each character separately
text = text.replace("[", "").replace("]", "").replace('"', "")
with open("./output_file_new.csv", "w") as output_file_new:
    output_file_new.write(text)

So it comes down to a few replace calls on the file contents ...

I would appreciate an expert opinion on this solution, so please leave a comment. Thanks.
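One possible alternative that avoids post-processing the file is to turn each array into a plain string before calling to_csv. The sketch below assumes made-up sample data and a comma as the separator inside a cell (as suggested in the question); it writes to an in-memory buffer only to keep the example self-contained.

```python
import io

import pandas as pd

# Made-up sample data standing in for input_file.csv
df = pd.DataFrame({
    "A": [1, 2, 2, 3, 4],
    "B": ["x", "x", "x", "y", "z"],
})

unique_a = df.groupby("B")["A"].unique()
result = unique_a[unique_a.apply(len) > 1]

# Join the values of each NumPy array into one comma-separated string,
# so to_csv never sees the array's "[1 2]" text representation
joined = result.apply(lambda values: ",".join(map(str, values)))

buffer = io.StringIO()
joined.to_csv(buffer, sep=";")
csv_text = buffer.getvalue()
```

Because the joined string contains neither line breaks nor the ";" delimiter, the CSV writer has no reason to wrap the field in quotes. With a real file, `joined.to_csv("./output_file.csv", sep=";")` would be the equivalent call.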
