I have 100 CSV files in one folder. I want to concatenate those CSV files into a single dataframe.
I used the following code:
import os
import pandas as pd
data_suntracker = [f for f in os.listdir(".") if f.endswith('.csv')]
df = pd.concat(map(pd.read_csv, data_suntracker))
The output:
runfile('C:/Users/vasil/.spyder-py3/autosave/dokimastiko_sun4.py', wdir='C:/Users/vasil/.spyder-py3/autosave')
Traceback (most recent call last):
File "C:\Program Files\Spyder\pkgs\spyder_kernels\py3compat.py", line 356, in compat_exec
exec(code, globals, locals)
File "c:\users\vasil\.spyder-py3\autosave\dokimastiko_sun4.py", line 5, in <module>
df = pd.concat(map(pd.read_csv, data_suntracker))
File "C:\Program Files\Spyder\pkgs\pandas\util\_decorators.py", line 331, in wrapper
return func(*args, **kwargs)
File "C:\Program Files\Spyder\pkgs\pandas\core\reshape\concat.py", line 368, in concat
op = _Concatenator(
File "C:\Program Files\Spyder\pkgs\pandas\core\reshape\concat.py", line 422, in __init__
objs = list(objs)
File "C:\Program Files\Spyder\pkgs\pandas\util\_decorators.py", line 211, in wrapper
return func(*args, **kwargs)
File "C:\Program Files\Spyder\pkgs\pandas\util\_decorators.py", line 331, in wrapper
return func(*args, **kwargs)
File "C:\Program Files\Spyder\pkgs\pandas\io\parsers\readers.py", line 950, in read_csv
return _read(filepath_or_buffer, kwds)
File "C:\Program Files\Spyder\pkgs\pandas\io\parsers\readers.py", line 611, in _read
return parser.read(nrows)
File "C:\Program Files\Spyder\pkgs\pandas\io\parsers\readers.py", line 1778, in read
) = self._engine.read( # type: ignore[attr-defined]
File "C:\Program Files\Spyder\pkgs\pandas\io\parsers\c_parser_wrapper.py", line 230, in read
chunks = self._reader.read_low_memory(nrows)
File "pandas\_libs\parsers.pyx", line 808, in pandas._libs.parsers.TextReader.read_low_memory
File "pandas\_libs\parsers.pyx", line 866, in pandas._libs.parsers.TextReader._read_rows
File "pandas\_libs\parsers.pyx", line 852, in pandas._libs.parsers.TextReader._tokenize_rows
File "pandas\_libs\parsers.pyx", line 1973, in pandas._libs.parsers.raise_parser_error
ParserError: Error tokenizing data. C error: Expected 1 fields in line 5, saw 3
Also, in the variable pane at the top right of Spyder, I only see a list of the 100 CSV file names (strings), not dataframes. How can I fix my code so that it creates one dataframe containing all the data from these files? All the files have the same columns.
CodePudding user response:
Try this code. It collects the file paths with glob, reads each file into its own dataframe, and concatenates them at the end:
import pandas as pd
import glob
import os
path = 'your path to files'
# no leading slash in the pattern, otherwise os.path.join discards path
all_files = glob.glob(os.path.join(path, "*.csv"))
temp_list = []
for filename in all_files:
    temp_df = pd.read_csv(filename, index_col=None, header=0)
    temp_list.append(temp_df)
single_df = pd.concat(temp_list, axis=0, ignore_index=True)
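If the ParserError from the question still shows up, it may help to find out which files pandas cannot parse before concatenating anything. A minimal sketch: the helper name find_unreadable_csvs is hypothetical, and it only assumes that the files sit in one folder and end in .csv.

```python
import glob
import os

import pandas as pd


def find_unreadable_csvs(folder):
    """Return (filename, error message) pairs for CSV files pandas cannot parse.

    Hypothetical helper, for illustration only.
    """
    failures = []
    for filename in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        try:
            # Default settings, same as the read_csv call in the question
            pd.read_csv(filename)
        except pd.errors.ParserError as exc:
            failures.append((filename, str(exc)))
    return failures
```

Calling find_unreadable_csvs(path) and opening one of the reported files in a text editor should show whether it has extra lines above the header or uses a separator other than a comma.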
CodePudding user response:
Alternatively, using pathlib:
import pandas as pd
from pathlib import Path
path = "path/to/files/"
df = pd.concat((pd.read_csv(x) for x in Path(path).glob("*.csv")), ignore_index=True)
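Note that the error in the question ("Expected 1 fields in line 5, saw 3") suggests the first four lines of at least one file contain a single field while line 5 contains three. That often happens when a file starts with a few description lines above the real header, or when the separator is not a comma, but this is a guess, since the file contents are not shown. A minimal sketch under that assumption: the function name concat_csv_folder and the values passed for skiprows and sep are hypothetical and need to be adjusted to the actual files.

```python
from pathlib import Path

import pandas as pd


def concat_csv_folder(folder, skiprows=0, sep=","):
    """Read every .csv file in folder and stack them into one dataframe.

    skiprows and sep are passed through to pd.read_csv; the right values
    depend on the files and are assumptions here.
    """
    files = sorted(Path(folder).glob("*.csv"))
    if not files:
        raise FileNotFoundError(f"no .csv files found in {folder}")
    frames = (pd.read_csv(f, skiprows=skiprows, sep=sep) for f in files)
    return pd.concat(frames, ignore_index=True)
```

For example, if every file has four description lines above the header row, df = concat_csv_folder("path/to/files/", skiprows=4) would skip them in each file before concatenating.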