I would like to know how to drop rows that match a condition using the pandas API on Spark (pyspark.pandas).
This is the pandas version:
indexNames = dfObj[ (dfObj['Age'] >= 30) & (dfObj['Age'] <= 40) ].index
dfObj.drop(indexNames, inplace=True)
But I would like to do that using PySpark Pandas API.
Could you please help me?
Thanks a lot
CodePudding user response:
You should start by following this guide:
An example looks like this:
import pyspark.pandas as ps
psdf = ps.range(10)
pdf = psdf.to_pandas()
pdf.values
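From there you can work with pdf as an ordinary pandas DataFrame. Note that to_pandas() collects all the data onto the driver, so it only suits data that fits in memory.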
CodePudding user response:
Thanks dude. I found the solution:
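# indexNames is the index of the matching rows, selected as in the question
array = indexNames.to_numpy()
# drop returns a new DataFrame, so assign the result back
dfObj = dfObj.drop(index=array)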
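As an alternative to collecting the index labels, the rows can also be filtered out with an inverted boolean mask. Below is a minimal sketch of that idea; the helper name drop_age_range and the sample data are made up for illustration. It is demonstrated with plain pandas so it runs without a Spark session, on the assumption that the same expression behaves the same way on a pyspark.pandas DataFrame (created via `import pyspark.pandas as ps` and `ps.DataFrame(...)`).

```python
import pandas as pd


def drop_age_range(df, low=30, high=40):
    """Return a new DataFrame without rows whose 'Age' lies in [low, high].

    Hypothetical helper: it uses only boolean indexing, which both pandas
    and pyspark.pandas DataFrames are assumed to support in the same way.
    """
    in_range = (df["Age"] >= low) & (df["Age"] <= high)
    # ~ inverts the mask, so only rows outside the range are kept
    return df[~in_range]


# Illustrative sample data (not from the original question)
dfObj = pd.DataFrame(
    {"Name": ["a", "b", "c", "d", "e"], "Age": [25, 30, 35, 40, 45]}
)

result = drop_age_range(dfObj)
print(result)
```

This avoids pulling the index into a NumPy array first, which may matter when many rows match the condition.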