I have a dataset that I want to rearrange so that it is more consistent and so that means and frequencies are easier to calculate.
Take the following example: a dataset containing the latest shopping expenditures of different models:
Observation | Model | Date | Clothing | Price in $ | Store |
---|---|---|---|---|---|
# 1 | Amy | 14 / 01 | Top | 60 | X |
# 2 | Amy | 17 / 03 | SKIRT | 35 | X |
# 3 | Amy | 05 / 05 | Skirt | 40 | X |
# 4 | Amy | 05 / 05 | Blouse | 70 | P |
# 5 | Claudia | 17 / 02 | BLOUSE | 40 | B |
# 6 | Claudia | 17 / 02 | Jeans | 90 | L |
# 7 | Claudia | 21 / 04 | Jacket | 120 | L |
# 8 | Claudia | 22 / 04 | TOP | 30 | X |
# 9 | Estella | 05 / 05 | NA | 95 | L |
# 10 | Estella | 07 / 06 | Skirt | 40 | X |
# 11 | Estella | 08 / 07 | Dress | 150 | H |
# 12 | Estella | 04 / 08 | Hat | 15 | X |
As you can see, some clothing items are the same but are written differently (this is on purpose). I want to rearrange this dataset so that the models stay in exactly the same order, but the clothing within each model is sorted alphabetically with missing values at the end (e.g. blouse, dress, hat, jacket, jeans, skirt, NA), regardless of how the word is capitalised. In other words, I want to re-order Clothing within each Model.
I don't have many ideas about what code to use for this, so I cannot provide an attempt.
CodePudding user response:
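You can sort only the Clothing column and assign it back to df$Clothing. Note that this reorders that column alone: the other columns stay where they are, so each item no longer lines up with its original Date, Price and Store.
df$Clothing <- sort(df$Clothing, na.last = TRUE)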
Observation Model Date Clothing Price in $ Store
1 # 1 Amy 14 / 01 Blouse 60 X
2 # 2 Amy 17 / 03 BLOUSE 35 X
3 # 3 Amy 05 / 05 Dress 40 X
4 # 4 Amy 05 / 05 Hat 70 P
5 # 5 Claudia 17 / 02 Jacket 40 B
6 # 6 Claudia 17 / 02 Jeans 90 L
7 # 7 Claudia 21 / 04 Skirt 120 L
8 # 8 Claudia 22 / 04 Skirt 30 X
9 # 9 Estella 05 / 05 SKIRT 95 L
10 # 10 Estella 07 / 06 Top 40 X
11 # 11 Estella 08 / 07 TOP 150 H
12 # 12 Estella 04 / 08 <NA> 15 X
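If whole rows should move together rather than the Clothing column alone, one option in base R is to build the row order with order() and index the data frame with it. The following is only a sketch under assumptions: the data frame is re-typed from the question's table with simplified column names (Observation as an integer, Price instead of "Price in $", Date and Store omitted), and model_rank and clothing_key are hypothetical helper names.

```r
# Data re-typed from the question's table (simplified columns, an assumption).
df <- data.frame(
  Observation = 1:12,
  Model    = rep(c("Amy", "Claudia", "Estella"), each = 4),
  Clothing = c("Top", "SKIRT", "Skirt", "Blouse",
               "BLOUSE", "Jeans", "Jacket", "TOP",
               NA, "Skirt", "Dress", "Hat"),
  Price    = c(60, 35, 40, 70, 40, 90, 120, 30, 95, 40, 150, 15),
  stringsAsFactors = FALSE
)

# Key 1: index of each model's first appearance, so models keep their order.
model_rank <- match(df$Model, unique(df$Model))
# Key 2: lower-cased clothing, so "SKIRT" and "Skirt" are treated alike.
clothing_key <- tolower(df$Clothing)

# order() returns row indices; na.last = TRUE puts missing clothing
# at the end of each model's block. Ties keep their original order.
sorted <- df[order(model_rank, clothing_key, na.last = TRUE), ]
rownames(sorted) <- NULL
print(sorted)
```

Because the whole row is indexed, each clothing item keeps its own price and observation number.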
UPDATE: It seems the OP wants to arrange Clothing within each Model; here is the code for that:
library(dplyr)
df %>% group_by(Model) %>% arrange(Clothing, .by_group = TRUE)
# A tibble: 12 × 6
# Groups: Model [3]
Observation Model Date Clothing `Price in $` Store
<chr> <chr> <chr> <chr> <int> <chr>
1 # 4 Amy 05 / 05 Blouse 70 P
2 # 3 Amy 05 / 05 Skirt 40 X
3 # 2 Amy 17 / 03 SKIRT 35 X
4 # 1 Amy 14 / 01 Top 60 X
5 # 5 Claudia 17 / 02 BLOUSE 40 B
6 # 7 Claudia 21 / 04 Jacket 120 L
7 # 6 Claudia 17 / 02 Jeans 90 L
8 # 8 Claudia 22 / 04 TOP 30 X
9 # 11 Estella 08 / 07 Dress 150 H
10 # 12 Estella 04 / 08 Hat 15 X
11 # 10 Estella 07 / 06 Skirt 40 X
12 # 9 Estella 05 / 05 NA 95 L
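Since the question also mentions computing means and frequencies, it may help to normalise the spelling first, so that "SKIRT" and "Skirt" fall into one category. A minimal base R sketch, assuming the clothing labels and prices from the question's table are available as two vectors (clothing, price and clothing_norm are hypothetical names):

```r
# Clothing labels and prices re-typed from the question's table (an assumption).
clothing <- c("Top", "SKIRT", "Skirt", "Blouse", "BLOUSE", "Jeans",
              "Jacket", "TOP", NA, "Skirt", "Dress", "Hat")
price <- c(60, 35, 40, 70, 40, 90, 120, 30, 95, 40, 150, 15)

# Normalise spelling so differently-cased labels count as one category.
clothing_norm <- tolower(clothing)

# Frequency per item; useNA = "ifany" keeps a separate count for missing labels.
freq <- table(clothing_norm, useNA = "ifany")

# Mean price per item; rows with a missing label are left out by tapply().
mean_price <- tapply(price, clothing_norm, mean)

print(freq)
print(mean_price)
```

The same lower-cased value can also serve as the sort key in the dplyr call, for example arrange(tolower(Clothing), .by_group = TRUE), if the ordering should not depend on capitalisation.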