I'm using the seqIplot function (TraMineR) to create a sorted sequence index plot (sortv = "from.end"). How can I get the data frame sorted in the same order? Thank you!
CodePudding user response:
It's a little bit unclear which data frame you are alluding to, but I assume you are talking about your sequence data.
The sorting is done in a two-step procedure:
- Create a sort index using order()
- Sort the data using this index
Below you find an example drawing on {TraMineR}'s example data set actcal.
> library(TraMineR)
>
> # actcal data set
> data(actcal)
>
> # We use only a sample of 10 cases
> set.seed(1)
> actcal <- actcal[sample(nrow(actcal),10),]
> actcal.seq <- seqdef(actcal,13:24)
[>] 4 distinct states appear in the data:
1 = A
2 = B
3 = C
4 = D
[>] state coding:
[alphabet] [label] [long label]
1 A A A
2 B B B
3 C C C
4 D D D
[>] 10 sequences in the data set
[>] min/max sequence length: 12/12
>
> # here the sorting happens:
> # 1) we create a sorting index with order
> # 2) we sort the data according to the index
> x <- as.data.frame(actcal.seq)
> sortvar <- do.call(order, x[,ncol(x):1])
> sorted.data <- actcal.seq[sortvar,]
>
> # bonus: change rownumber to index position
> # to ease comparison with seqplot output
> rownames(sorted.data) <- 1:nrow(sorted.data)
>
> #Inspect data
> #unsorted data
> actcal.seq
Sequence
3649 A-A-A-A-A-A-A-A-A-A-A-A
6274 D-D-D-D-D-D-D-D-D-D-D-D
4130 C-C-C-C-C-C-C-C-C-C-C-C
3236 B-B-B-B-B-B-B-B-B-B-B-B
2302 B-B-B-B-B-B-B-B-B-B-B-B
2172 A-A-A-A-A-A-A-A-A-A-A-A
5671 D-D-D-D-D-C-C-C-C-D-D-D
4039 B-B-B-B-B-B-B-B-B-B-B-B
153 D-D-D-D-D-D-D-D-D-D-D-D
5712 B-B-B-B-B-B-B-B-B-B-B-B
> #sorted data
> sorted.data
Sequence
1 A-A-A-A-A-A-A-A-A-A-A-A
2 A-A-A-A-A-A-A-A-A-A-A-A
3 B-B-B-B-B-B-B-B-B-B-B-B
4 B-B-B-B-B-B-B-B-B-B-B-B
5 B-B-B-B-B-B-B-B-B-B-B-B
6 B-B-B-B-B-B-B-B-B-B-B-B
7 C-C-C-C-C-C-C-C-C-C-C-C
8 D-D-D-D-D-C-C-C-C-D-D-D
9 D-D-D-D-D-D-D-D-D-D-D-D
10 D-D-D-D-D-D-D-D-D-D-D-D
>
> #Compare to plot output
> seqiplot(actcal.seq, sortv = "from.end")
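If the goal is to reorder the original data frame (covariates included) rather than only the sequence object, the same index can be applied to it directly. Below is a minimal base R sketch of the two steps, assuming a hypothetical toy data frame `df` whose state columns are `t1` to `t3`; these names and values are made up for illustration and are not part of actcal.

```r
# Hypothetical toy data: an id, a covariate, and three state columns
df <- data.frame(
  id  = c("a", "b", "c", "d"),
  age = c(31, 45, 28, 52),
  t1  = factor(c("B", "A", "B", "A"), levels = c("A", "B")),
  t2  = factor(c("B", "A", "A", "B"), levels = c("A", "B")),
  t3  = factor(c("A", "B", "A", "A"), levels = c("A", "B")),
  stringsAsFactors = FALSE
)
state.cols <- c("t1", "t2", "t3")

# Step 1: build the sort index, using the last state as the first key,
# then the second to last, and so on (same idea as x[, ncol(x):1] above)
idx <- do.call(order, df[, rev(state.cols)])

# Step 2: apply the index to the whole data frame, not just the states
df.sorted <- df[idx, ]
df.sorted
```

With the actcal example above, the analogous call would be actcal[sortvar, ], since the rows of actcal and actcal.seq are in the same order.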
CodePudding user response:
The TraMineRextras package has functions sorti and sortv that respectively return the sort index and the sort variable. To sort the data, you need the sort index returned by sorti.
Here is an example using the actcal data.
library(TraMineRextras)
# actcal data set
data(actcal)
# using only sequences 11 to 20
actcal.seq <- seqdef(actcal[11:20,13:24])
sort.index <- sorti(actcal.seq, start="beg")
actcal.seq[sort.index,]
# 2103 A-A-A-A-A-A-A-A-A-A-A-A
# 528 A-A-A-A-A-A-A-A-A-A-A-A
# 4866 B-B-B-B-B-B-B-B-B-B-B-B
# 5108 B-B-B-B-B-B-B-B-B-B-B-B
# 5386 B-B-B-B-B-B-B-B-B-B-B-B
# 3876 B-B-B-B-B-B-B-B-B-B-B-B
# 5238 B-B-B-B-B-B-B-B-B-B-B-C
# 3972 C-C-C-C-C-C-C-C-C-B-B-B
# 4977 C-C-C-C-C-C-C-C-C-C-C-C
# 6175 D-D-D-D-D-D-D-D-D-D-D-D
With start="beg", you get the order corresponding to the sortv="from.start" argument of the plot function, and with start="end" the order corresponding to "from.end".
You can similarly use sort.index with any table whose rows match the sequences, such as actcal[11:20,] in the example above.
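For the "from.end" order asked about in the question, a sketch along the same lines could look as follows. It assumes TraMineRextras is installed and that sorti returns a row index, as described above; the object names `sub`, `idx.end`, and `sub.sorted` are only illustrative.

```r
library(TraMineRextras)  # also attaches TraMineR

data(actcal)
sub <- actcal[11:20, ]           # data frame rows matching the sequences
sub.seq <- seqdef(sub[, 13:24])  # sequence object built from those rows

# start = "end" corresponds to sortv = "from.end" in the plot
idx.end <- sorti(sub.seq, start = "end")

# Apply the same index to the data frame and to the sequences
sub.sorted     <- sub[idx.end, ]
sub.seq.sorted <- sub.seq[idx.end, ]
```

Because both objects are indexed with the same vector, their rows stay aligned, so covariates in sub.sorted still belong to the sequence on the same row.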