numpy to spark error: TypeError: Can not infer schema for type: <class 'numpy.float64'>

Time:12-21

While trying to convert a numpy array into a Spark DataFrame, I get the error Can not infer schema for type: <class 'numpy.float64'>. The same thing happens with numpy.int64 arrays.

Example:

df = spark.createDataFrame(numpy.arange(10.))

TypeError: Can not infer schema for type: <class 'numpy.float64'>
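The cause is that Spark's schema inference recognizes native Python types (float, int, str, ...) but not numpy scalar types, and iterating a 1-D numpy array yields numpy.float64 elements rather than Python floats. A minimal sketch of the difference, which can be checked without Spark:

```python
import numpy

arr = numpy.arange(10.)

# Indexing or iterating a numpy array yields numpy scalars,
# which Spark's schema inference does not recognize.
print(type(arr[0]))            # <class 'numpy.float64'>

# .tolist() converts the whole array to native Python floats,
# which Spark can map to a DoubleType column.
native = arr.tolist()
print(type(native[0]))         # <class 'float'>
```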

CodePudding user response:

A quick conversion to a pandas DataFrame works nicely, because Spark knows how to map pandas dtypes to a schema:

import pandas
import numpy
df = spark.createDataFrame(pandas.DataFrame(numpy.arange(10.)))
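As a side benefit, the pandas column name carries over to the Spark DataFrame; otherwise the column is named "0". A sketch (the column name "value" here is just an illustrative choice):

```python
import numpy
import pandas

# Naming the pandas column gives the Spark column a readable
# name instead of the default "0".
pdf = pandas.DataFrame({"value": numpy.arange(10.)})
print(list(pdf.columns))       # ['value']
# df = spark.createDataFrame(pdf)  # Spark column will be "value"
```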

CodePudding user response:

Or without using pandas: cast each numpy scalar to a native Python float and wrap it in a one-element tuple, since createDataFrame expects each row to be a tuple (or Row or dict), not a bare scalar:

df = spark.createDataFrame([(float(i),) for i in numpy.arange(10.)])
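The row construction itself can be checked without a Spark session; numpy's .tolist() performs the same numpy-to-native conversion for the whole array in one call, so both spellings produce identical rows:

```python
import numpy

arr = numpy.arange(3.)

# Each field is cast to a native Python float, each row is a tuple.
rows_cast = [(float(i),) for i in arr]

# .tolist() converts the whole array to native floats up front,
# so this builds the same rows.
rows_tolist = [(x,) for x in arr.tolist()]

print(rows_cast)                    # [(0.0,), (1.0,), (2.0,)]
print(rows_cast == rows_tolist)     # True
```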