I am having trouble reading a CSV file into a pandas DataFrame. The parser splits on commas that are enclosed in quotes instead of keeping the quoted text together as one field.
I have tried different options for quotechar, but none made any difference.
import csv
import pandas
df = pandas.read_csv('test_quote.csv', header=None, sep=',', quotechar='"', quoting=csv.QUOTE_MINIMAL, encoding='ascii', engine='python')
print(df)
Code output:
$ python3 test_quote.py
0 1 2 3 4 5 6
0 201571 2080 "December 2 2022" "November 1 - November 30 2022" 487.29
1 345741 5377 "December 3 2022" "November 1 - November 30 2022" 729.35
2 995349 3672 "December 2 2022" "November 1 - November 30 2022" 937.33
3 475601 3672 "December 2 2022" "November 1 - November 30 2022" 790.17
4 228548 3672 "December 7 2022" "November 1 - November 30 2022" 682.38
Expected output:
$ python3 test_quote.py
0 1 2 3 4
0 201571 2080 "December 2, 2022" "November 1 - November 30, 2022" 487.29
1 345741 5377 "December 3, 2022" "November 1 - November 30, 2022" 729.35
2 995349 3672 "December 2 , 2022" "November 1 - November 30 , 2022" 937.33
3 475601 3672 "December 2 , 2022" "November 1 - November 30 , 2022" 790.17
4 228548 3672 "December 7, 2022" "November 1 - November 30, 2022" 682.38
Input file (test_quote.csv):
201571, 2080, "December 2, 2022", "November 1 - November 30, 2022", 487.29
345741, 5377, "December 3, 2022", "November 1 - November 30, 2022", 729.35
995349, 3672, "December 2 , 2022", "November 1 - November 30 , 2022", 937.33
475601, 3672, "December 2 , 2022", "November 1 - November 30 , 2022", 790.17
228548, 3672, "December 7, 2022", "November 1 - November 30, 2022", 682.38
CodePudding user response:
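The spaces after the commas are causing the issue. A quote character only opens a quoted field when it is the first character of that field; with a leading space, each quote is read as ordinary data and the commas inside the quotes still split the field. Pass skipinitialspace=True so the parser skips the spaces after each delimiter. Note that most of the parameters you passed are already the defaults, so they can be dropped.
import pandas
df = pandas.read_csv('test_quote.csv', header=None, skipinitialspace=True)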
print(df)
Output:
0 1 2 3 4
0 201571 2080 December 2, 2022 November 1 - November 30, 2022 487.29
1 345741 5377 December 3, 2022 November 1 - November 30, 2022 729.35
2 995349 3672 December 2 , 2022 November 1 - November 30 , 2022 937.33
3 475601 3672 December 2 , 2022 November 1 - November 30 , 2022 790.17
4 228548 3672 December 7, 2022 November 1 - November 30, 2022 682.38
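To see the difference without touching the file, here is a minimal sketch that parses two rows copied from the question through io.StringIO, once with the defaults and once with skipinitialspace=True. The variable names (sample, broken, fixed) are made up for illustration, and the default C engine is assumed.

```python
import io

import pandas as pd

# Two rows copied from test_quote.csv; note the space after every comma.
sample = (
    '201571, 2080, "December 2, 2022", "November 1 - November 30, 2022", 487.29\n'
    '995349, 3672, "December 2 , 2022", "November 1 - November 30 , 2022", 937.33\n'
)

# Defaults: each quoted field starts with a space, so the quote is not seen
# as an opening quote and the commas inside it still act as delimiters.
broken = pd.read_csv(io.StringIO(sample), header=None)

# skipinitialspace=True drops the spaces after each delimiter, so the quote
# becomes the first character of the field and quoting works as intended.
fixed = pd.read_csv(io.StringIO(sample), header=None, skipinitialspace=True)

print(broken.shape[1])   # number of columns with the defaults
print(fixed.shape[1])    # number of columns with skipinitialspace=True
print(fixed.iloc[0, 2])  # first date field, quotes stripped, comma kept
```

The quotes are removed from the parsed values, which is normal CSV behaviour; the commas inside the date fields are preserved.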