Drop rows with conditions in PySpark Pandas API

I would like to know how to drop rows that match a condition using the PySpark Pandas API.

This is the pandas version:

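# Index labels of the rows whose Age is between 30 and 40 (inclusive)
indexNames = dfObj[(dfObj['Age'] >= 30) & (dfObj['Age'] <= 40)].index
# Drop those rows from dfObj in place
dfObj.drop(indexNames, inplace=True)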

But I would like to do the same using the PySpark Pandas API.

Could you please help me?

Thanks a lot

CodePudding user response：

You should follow this guide initially:

An example will look like this:


import pyspark.pandas as ps

psdf = ps.range(10)
pdf = psdf.to_pandas()
pdf.values

From there, pdf is a regular pandas DataFrame, so you can work with it however you like.

CodePudding user response：

Thanks dude. I found the solution:

array = indexNames.to_numpy()
dfObj = dfObj.drop(index=array)
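For comparison, here is a minimal sketch of another way to get the same result: instead of collecting index labels and dropping them, keep only the rows that fall outside the range by using a boolean mask. The helper name drop_age_range and the sample data are hypothetical, made up for illustration. The demo runs on plain pandas so it works without a Spark session; boolean masking with &, ~ and df[mask] is also supported by pyspark.pandas, so the same function should accept a pandas-on-Spark DataFrame, though that is not exercised here.

```python
import pandas as pd


def drop_age_range(df, low=30, high=40):
    """Return df without the rows whose 'Age' lies in [low, high].

    Hypothetical helper for illustration; it only relies on boolean
    masking, which both pandas and pyspark.pandas DataFrames support.
    """
    in_range = (df["Age"] >= low) & (df["Age"] <= high)
    # ~ inverts the mask, so only rows outside the range are kept
    return df[~in_range]


# Made-up sample data; with Spark available, the frame could instead be
# built with pyspark.pandas.DataFrame(...) (assumption, not run here).
dfObj = pd.DataFrame(
    {
        "Name": ["Ann", "Bob", "Cid", "Dee", "Eve"],
        "Age": [25, 30, 35, 40, 45],
    }
)

result = drop_age_range(dfObj)
print(result["Age"].tolist())  # [25, 45]
```

This returns a new DataFrame rather than modifying dfObj in place, which matches the dfObj = dfObj.drop(...) style of the solution above.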