When reading from CSV files I can pass the dtypes directly:

ddf = dd.read_csv("data/csvs/*.part", dtype=better_dtypes)

Is there an easy equivalent way to convert all columns of a dask DataFrame (converted from a pandas DataFrame) using a dictionary? I have a dictionary as follows:
better_dtypes = {
"id1": "string[pyarrow]",
"id2": "string[pyarrow]",
"id3": "string[pyarrow]",
"id4": "int64",
"id5": "int64",
"id6": "int64",
"v1": "int64",
"v2": "int64",
"v3": "float64",
}
and would like to convert the pandas/dask DataFrame dtypes all at once to the dtypes suggested in the dictionary. This is what I tried:

ddf = ddf.astype(better_dtypes).dtypes
CodePudding user response:
Not sure if I understand the question correctly, but the conversion of dtypes can be done using .astype (as you wrote), except you would want to remove .dtypes from the assignment:
# this will store the converted ddf
ddf = ddf.astype(better_dtypes)
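
For completeness, here is a minimal sketch of the full round trip from a pandas DataFrame to a dask DataFrame with the converted dtypes. The sample data, the reduced dtype dictionary, and the npartitions value are assumptions for illustration only:

import pandas as pd
import dask.dataframe as dd

# small illustrative pandas DataFrame with a subset of the columns from the question
pdf = pd.DataFrame({
    "id1": ["a", "b", "c"],
    "id4": [1, 2, 3],
    "v3": [0.1, 0.2, 0.3],
})

# subset of the dtype mapping from the question
better_dtypes = {
    "id1": "string[pyarrow]",
    "id4": "int64",
    "v3": "float64",
}

# convert the pandas DataFrame to a dask DataFrame (npartitions chosen arbitrarily)
ddf = dd.from_pandas(pdf, npartitions=1)

# apply the dtype mapping to all listed columns at once
ddf = ddf.astype(better_dtypes)

# .dtypes is only for inspecting the result; it should not be part of the assignment
print(ddf.dtypes)

Note that the "string[pyarrow]" dtype requires pyarrow to be installed, and that .dtypes works on the lazy dask DataFrame without triggering a compute.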