Groupby dataframe and filter in pyspark

Time:09-24

Below is my input Spark dataframe. Can someone help me get the desired dataframe, or at least suggest an approach?

unique-id status
1 OAOS-STP
1 OAOS-nonSTP
1 manual
2 OAOS-nonSTP
2 manual
3 OAOS-STP
3 OAOS-nonSTP
4 OAOS-STP
4 manual

The output Dataframe I am expecting:

unique-id status
1 OAOS-STP
2 OAOS-nonSTP
3 OAOS-STP
4 OAOS-STP

The order of precedence is OAOS-STP > OAOS-nonSTP > manual, so each unique-id should keep only its highest-precedence status. Thanks in advance.

CodePudding user response:

You can map the status of each row to an integer representing its precedence using a dictionary and an output