Home > database >  Rearrange rows while sorting one column and leaving another column unchanged in python dataframe
Rearrange rows while sorting one column and leaving another column unchanged in python dataframe

Time:07-14

I have the following dataset:

Name Year  Date Value
x    year1 date1 v1
x    year1 date2 v2
x    year1 date3 v3
x    year2 date1 v4
x    year2 date2 v5
x    year2 date3 v6
z    year1 date1 v7
z    year1 date2 v8
z    year1 date3 v9
z    year2 date1 v10
z    year2 date2 v11
z    year2 date3 v12
y    year1 date1 v13
y    year1 date2 v14
y    year1 date3 v15
y    year2 date1 v16
y    year2 date2 v17
y    year2 date3 v18

I would like the following dataset output:

Name Year  Date Value
x    year1 date1 v1
x    year2 date1 v4
x    year1 date2 v2
x    year2 date2 v5
x    year1 date3 v3
x    year2 date3 v6
z    year1 date1 v7
z    year2 date1 v10
z    year1 date2 v8
z    year2 date2 v11
z    year1 date3 v9
z    year2 date3 v12
y    year1 date1 v13
y    year2 date1 v16
y    year1 date2 v14
y    year2 date2 v17
y    year1 date3 v15
y    year2 date3 v18

I tried the following code but my 'Name' column is being sorted to 'x,y,z' as well. I want the 'Name' column order to stay as 'x,z,y': df.sort_values(['Name', 'Date'])

CodePudding user response:

IIUC, let's create a new categorical dtype with set order:

namedtype = pd.CategoricalDtype([*'xzy'], ordered=True)
df['Name'] = df['Name'].astype(namedtype)
df.sort_values(['Name', 'Date', 'Year'])

Output:

   Name   Year   Date Value
0     x  year1  date1    v1
3     x  year2  date1    v4
1     x  year1  date2    v2
4     x  year2  date2    v5
2     x  year1  date3    v3
5     x  year2  date3    v6
6     z  year1  date1    v7
9     z  year2  date1   v10
7     z  year1  date2    v8
10    z  year2  date2   v11
8     z  year1  date3    v9
11    z  year2  date3   v12
12    y  year1  date1   v13
15    y  year2  date1   v16
13    y  year1  date2   v14
16    y  year2  date2   v17
14    y  year1  date3   v15
17    y  year2  date3   v18

CodePudding user response:

datar is a reimagination of pandas APIs.

It's very easy to implement using datar:

>>> from datar.all import f, tribble, arrange, match
>>> df = tribble(
... f.Name, f.Year,  f.Date, f.Value,
... "x",    "year1", "date1", "v1",
... "x",    "year1", "date2", "v2",
... "x",    "year1", "date3", "v3",
... "x",    "year2", "date1", "v4",
... "x",    "year2", "date2", "v5",
... "x",    "year2", "date3", "v6",
... "z",    "year1", "date1", "v7",
... "z",    "year1", "date2", "v8",
... "z",    "year1", "date3", "v9",
... "z",    "year2", "date1", "v10",
... "z",    "year2", "date2", "v11",
... "z",    "year2", "date3", "v12",
... "y",    "year1", "date1", "v13",
... "y",    "year1", "date2", "v14",
... "y",    "year1", "date3", "v15",
... "y",    "year2", "date1", "v16",
... "y",    "year2", "date2", "v17",
... "y",    "year2", "date3", "v18",
... )
>>> df >> arrange(match(f.Name, f.Name), f.Date, f.Year)
       Name     Year     Date    Value
   <object> <object> <object> <object>
0         x    year1    date1       v1
3         x    year2    date1       v4
1         x    year1    date2       v2
4         x    year2    date2       v5
2         x    year1    date3       v3
5         x    year2    date3       v6
6         z    year1    date1       v7
9         z    year2    date1      v10
7         z    year1    date2       v8
10        z    year2    date2      v11
8         z    year1    date3       v9
11        z    year2    date3      v12
12        y    year1    date1      v13
15        y    year2    date1      v16
13        y    year1    date2      v14
16        y    year2    date2      v17
14        y    year1    date3      v15
17        y    year2    date3      v18
  • Related