I am reading a CSV file in dask, and while reading I want to use "usecols" the way we do in pandas.
Currently, in dask, I am using: df = dd.read_csv('myfiles.csv')
I want to do the same thing we can do in pandas: df = pd.read_csv('myfiles.csv', usecols=["date", "loc", "x"])
CodePudding user response:
Have you tried:
df = dd.read_csv('myfiles.csv',names=["date", "loc", "x"])
Here is the definition of names from the pandas.read_csv documentation:
names : array-like, optional
List of column names to use. If the file contains a header row, then you should explicitly pass header=0
to override the column names. Duplicates in this list are not allowed.
Note that names only assigns column labels; it does not select a subset of columns.
dask.dataframe.read_csv forwards extra keyword arguments to pandas.read_csv(), so even
df = dd.read_csv('myfiles.csv',usecols=["date", "loc", "x"])
will work for you.