The usual code to get the shape is:
print((sparkdf.count(), len(sparkdf.columns)))
Since my system runs entirely on HDFS, pandas is not allowed. The output I need is:
|-------|-------|
|row |columns|
|-------|-------|
|1500 | 22 |
|-------|-------|
CodePudding user response:
Just use spark.createDataFrame and pass the values as a list of tuples:
spark.createDataFrame([(sparkdf.count(), len(sparkdf.columns))], schema=['rows', 'columns'])
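If you need this in more than one place, the one-liner above can be wrapped in a small helper. This is only a sketch: the function name shape_as_dataframe is made up for illustration, and it assumes you already have a SparkSession (spark) and a DataFrame (sparkdf) as in the question.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only, not needed at runtime
    from pyspark.sql import DataFrame, SparkSession


def shape_as_dataframe(spark: "SparkSession", df: "DataFrame") -> "DataFrame":
    """Return a one-row DataFrame holding df's row and column counts.

    Hypothetical helper, not part of the PySpark API.
    """
    # count() is an action and triggers a Spark job over the data
    n_rows = df.count()
    # columns is read from the schema, so no job is needed here
    n_cols = len(df.columns)
    # a single tuple becomes a single row; schema supplies the column names
    return spark.createDataFrame([(n_rows, n_cols)], schema=["rows", "columns"])
```

Calling shape_as_dataframe(spark, sparkdf).show() should then print the one-row table without converting anything to pandas. Note that count() scans the data, so on a large DataFrame it may be worth caching the result rather than calling it repeatedly.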