We currently handle the following data frames
import pandas as pd
import io
csv_data = '''
ID,age,Year
100,75,2020
100,76,2021
200,64,2020
200,65,2021
200,66,2022
300,69,2020
300,70,2021
300,71,2022
300,72,2023
'''
df = pd.read_csv(io.StringIO(csv_data))
df = df.set_index(['ID', 'age'])
df
Year
ID age
100 75 2020
76 2021
200 64 2020
65 2021
66 2022
300 69 2020
70 2021
71 2022
72 2023
The ID in this data frame represents the same person, and Year means the year of the visit.
This will be multi-indexed based on ID and age. At this time, is it possible to display from the ID with more age elements?
The ideal display is shown below.
Year
ID age
300 69 2020
70 2021
71 2022
72 2023
200 64 2020
65 2021
66 2022
100 75 2020
76 2021
CodePudding user response:
You can sort based on two columns and set which column ascending and which not ascending like below:
>>> print(df.sort_values(['ID', 'age'], ascending=[False, True]))
Year
ID age
300 69 2020
70 2021
71 2022
72 2023
200 64 2020
65 2021
66 2022
100 75 2020
76 2021
CodePudding user response:
You could sort these values like this:
df = df.sort_values(['ID', 'age'],
ascending = [False, True])
CodePudding user response:
You can first compute the length of each ID
group using groupby
and size
, and then sort the ID
values based on the length:
s = df.groupby('ID').size()
df.sort_values('ID', key=lambda i:s[i], ascending=False)
Output:
Year
ID age
300 69 2020
70 2021
71 2022
72 2023
200 64 2020
65 2021
66 2022
100 75 2020
76 2021
CodePudding user response:
pandas sort_index()
is useful for sorting dataframe by index values.
df.sort_index(ascending=[False, True])