I'm using Twint to create a .csv file with ten results. But whenever I try to load it into a pandas dataframe, I get an error. Can someone help me to understand what is going on?
Traceback (most recent call last):
File "k:\Documents\Visual Studio Code\Twitter Project\exploratory stage.py", line 4, in <module>
scrapedData = pd.read_csv('demo.csv')
File "K:\Programs\Python\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "K:\Programs\Python\lib\site-packages\pandas\io\parsers\readers.py", line 586, in read_csv
return _read(filepath_or_buffer, kwds)
File "K:\Programs\Python\lib\site-packages\pandas\io\parsers\readers.py", line 488, in _read
return parser.read(nrows)
File "K:\Programs\Python\lib\site-packages\pandas\io\parsers\readers.py", line 1047, in read
index, columns, col_dict = self._engine.read(nrows)
File "K:\Programs\Python\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py", line 223, in read
chunks = self._reader.read_low_memory(nrows)
File "pandas\_libs\parsers.pyx", line 801, in pandas._libs.parsers.TextReader.read_low_memory
File "pandas\_libs\parsers.pyx", line 857, in pandas._libs.parsers.TextReader._read_rows
File "pandas\_libs\parsers.pyx", line 843, in pandas._libs.parsers.TextReader._tokenize_rows
File "pandas\_libs\parsers.pyx", line 1925, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 3, saw 3
CodePudding user response:
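Whenever you ask a pandas question, you should, if possible, include a few lines of your data so that people can help you more efficiently.
The error states that line 3 of your file contains 3 fields where pandas expects only 1. The expected count comes from the first line, which pandas treats as the header.
This can happen if your CSV is formatted incorrectly. The solution, in your case, is to fix the format or to skip the malformed lines by setting error_bad_lines=False (deprecated since pandas 1.3 in favor of on_bad_lines="skip", and removed in pandas 2.0).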
This example throws the same error:
from io import StringIO
import pandas as pd
data = """name
brad
susi,tina,ellen
peter
"""
pd.read_csv(StringIO(data))
Output:
ParserError: Error tokenizing data. C error: Expected 1 fields in line 3, saw 3
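To find out which lines are irregular before deciding how to fix the file, one option is to count the fields on each line with the standard csv module. This is only a sketch: the helper name field_counts is made up for illustration, and it assumes the file is comma-separated. For a real file, pass open("demo.csv", newline="") instead of the StringIO object.

```python
import csv
from io import StringIO

# Same sample data as in the failing example above.
data = """name
brad
susi,tina,ellen
peter
"""


def field_counts(handle, delimiter=","):
    """Return (line_number, field_count) pairs for every row (hypothetical helper)."""
    reader = csv.reader(handle, delimiter=delimiter)
    return [(reader.line_num, len(row)) for row in reader]


counts = field_counts(StringIO(data))
expected = counts[0][1]  # the header row defines the expected width
bad_lines = [(n, c) for n, c in counts if c != expected]
print(bad_lines)  # [(3, 3)]: line 3 has 3 fields instead of 1
```

If nearly every line shows up as irregular, the header or the delimiter is the more likely culprit, rather than a few bad rows.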
Solution
Fix the CSV file, or set error_bad_lines=False, which skips faulty lines:
df = pd.read_csv(StringIO(data), error_bad_lines=False)
print(df)
Output (note that the row susi,tina,ellen is missing):
name
0 brad
1 peter
b'Skipping line 3: expected 1 fields, saw 3\n'
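On pandas 2.0 and later, error_bad_lines is no longer accepted. A minimal sketch of the same approach using on_bad_lines, assuming pandas 1.3 or newer (the version that introduced this parameter):

```python
from io import StringIO

import pandas as pd

data = """name
brad
susi,tina,ellen
peter
"""

# on_bad_lines="skip" drops malformed rows silently;
# on_bad_lines="warn" drops them too, but emits a warning for each one.
df = pd.read_csv(StringIO(data), on_bad_lines="skip")
print(df)
```

Either way the malformed rows are lost, so this is best used only after checking that the skipped lines really are garbage and not a sign of a wrong delimiter.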
CodePudding user response:
When the CSV file is in use by another program or application, the OS may "lock" the file until the operation is finished.
Make sure that when you create a .csv file, you tell the OS to close the file and end the operation.
This is an example of opening and closing a file:
f = open("test.csv", "w")
f.write("test")
f.close()
Without the f.close(), the file stays "locked" by the OS and can't be accessed by another program or process.
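A with block is a common way to make sure the file is closed even if the write fails partway. The sketch below uses a made-up file name and made-up rows for illustration, and writes to the system temp directory so that it does not touch any real data:

```python
import csv
import os
import tempfile

# Hypothetical path and rows, for illustration only.
path = os.path.join(tempfile.gettempdir(), "test.csv")
rows = [["name"], ["brad"], ["peter"]]

# The file is closed automatically when the with block ends,
# so no explicit f.close() is needed.
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

print(f.closed)  # True

# The file can now be reopened and read by this or another process.
with open(path, newline="") as f:
    read_back = list(csv.reader(f))
```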