I have a data frame with a date column like the following:
Decade |
---|
1770-1779 |
1780-1789 |
1770-1779 |
1820-1829 |
1770-1779 |
1790-1799 |
1800-1809 |
1810-1819 |
etc...
The desired output adds a continuous variable "Time", like this:
Decade | Time |
---|---|
1770-1779 | 1 |
1780-1789 | 2 |
1770-1779 | 1 |
1820-1829 | 6 |
1770-1779 | 1 |
1790-1799 | 3 |
1800-1809 | 4 |
1810-1819 | 5 |
etc...
Thank you very much.
CodePudding user response:
We can use base R with factor:
df$Time <- as.integer(factor(df$Decade, levels = sort(unique(df$Decade))))
-output
> df
Decade Time
1 1770-1779 1
2 1780-1789 2
3 1770-1779 1
4 1820-1829 6
5 1770-1779 1
6 1790-1799 3
7 1800-1809 4
8 1810-1819 5
Or another option is match
with(df, match(Decade, sort(unique(Decade))))
[1] 1 2 1 6 1 3 4 5
data
df <- structure(list(Decade = c("1770-1779", "1780-1789", "1770-1779",
"1820-1829", "1770-1779", "1790-1799", "1800-1809", "1810-1819"
)), class = "data.frame", row.names = c(NA, -8L))
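One caveat: factor and match both number the decades that actually occur in the data, so a decade with no rows would not leave a gap in Time. If Time should track calendar time instead, a minimal sketch is to compute it from the start year. This assumes every label has the form "YYYY-YYYY" and that decades are 10 years apart; start_year is just an illustrative helper name.

```r
# Same example data as above
df <- data.frame(Decade = c("1770-1779", "1780-1789", "1770-1779", "1820-1829",
                            "1770-1779", "1790-1799", "1800-1809", "1810-1819"))

# Take the first four characters of each label as the start year
# (assumes labels always look like "YYYY-YYYY")
start_year <- as.integer(substr(df$Decade, 1, 4))

# Count decades elapsed since the earliest one, starting at 1
df$Time <- (start_year - min(start_year)) %/% 10L + 1L
```

For the sample data this gives the same Time values as the factor approach, because no decade between 1770 and 1829 is missing.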
CodePudding user response:
The function you need is cur_group_id() from the package dplyr.
library(dplyr)
df %>% group_by(Decade) %>% mutate(Time = cur_group_id())
# A tibble: 8 × 2
# Groups: Decade [6]
Decade Time
<chr> <int>
1 1770-1779 1
2 1780-1789 2
3 1770-1779 1
4 1820-1829 6
5 1770-1779 1
6 1790-1799 3
7 1800-1809 4
8 1810-1819 5
Data
df <- read.table(header = TRUE, text = "
Decade
1770-1779
1780-1789
1770-1779
1820-1829
1770-1779
1790-1799
1800-1809
1810-1819")
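Note that the result above is still grouped by Decade (see the "# Groups" line in the output). If an ungrouped data frame is preferred, one possible alternative is dplyr's dense_rank(), which should give the same numbering without group_by(). This is a sketch of a different technique, not the cur_group_id() method itself:

```r
library(dplyr)

# Same example data as above
df <- data.frame(Decade = c("1770-1779", "1780-1789", "1770-1779", "1820-1829",
                            "1770-1779", "1790-1799", "1800-1809", "1810-1819"))

# dense_rank() ranks the sorted unique values with no gaps,
# so tied decades share one id and no grouping is left behind
df <- df %>% mutate(Time = dense_rank(Decade))
```

Alternatively, adding ungroup() at the end of the cur_group_id() pipeline drops the grouping as well.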