I want to reshape my longitudinal data.
How can I create a column from the number in the column names (e.g. gdpPercap_1952, gdpPercap_1957, etc.)?
I tried to split those column names into the number (the year) and the letters (gdpPercap), and then use the number to make a new column "year".
Could you tell me how to do that? Or is there another suitable way?
continent country gdpPercap_1952
1 Africa Algeria 2449.0082
2 Africa Angola 3520.6103
3 Africa Benin 1062.7522
4 Africa Botswana 851.2411
5 Africa Burkina Faso 543.2552
6 Africa Burundi 339.2965
gdpPercap_1957 gdpPercap_1962
1 3013.9760 2550.8169
2 3827.9405 4269.2767
3 959.6011 949.4991
4 918.2325 983.6540
5 617.1835 722.5120
6 379.5646 355.2032
gdpPercap_1967 gdpPercap_1972
1 3246.9918 4182.6638
2 5522.7764 5473.2880
3 1035.8314 1085.7969
CodePudding user response:
This can be done with pivot_longer
library(tidyr)
pivot_longer(df1, cols = starts_with('gdp'),
             names_to = c(".value", "year"), names_sep = "_")
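One thing to watch: because year is taken from the column names, it comes back as character. If you want it as a number, a minimal sketch along these lines should work, assuming a tidyr version that supports the names_transform argument (1.1.0 or later). The object names wide and long are only for illustration, and the values are copied from the question.

```r
library(tidyr)

# Small wide data frame in the same shape as the question's data
wide <- data.frame(
  continent = c("Africa", "Africa"),
  country = c("Algeria", "Angola"),
  gdpPercap_1952 = c(2449.0082, 3520.6103),
  gdpPercap_1957 = c(3013.9760, 3827.9405)
)

long <- pivot_longer(
  wide,
  cols = starts_with("gdp"),
  names_to = c(".value", "year"),   # part before "_" names the value column
  names_sep = "_",
  names_transform = list(year = as.integer)  # convert "1952" to 1952L
)

str(long$year)
```

Without names_transform you could also convert afterwards, e.g. with as.integer(long$year).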
CodePudding user response:
Or, instead of names_sep, we could use names_pattern:
library(dplyr)
library(tidyr)
df %>%
  pivot_longer(
    -c(continent, country),
    names_to = c(".value", "year"),
    names_pattern = "(.*)_(\\d+)"
  ) %>%
  data.frame()
continent country year gdpPercap
1 Africa Algeria 1952 2449.0082
2 Africa Algeria 1957 3013.9760
3 Africa Algeria 1962 2550.8169
4 Africa Angola 1952 3520.6103
5 Africa Angola 1957 3827.9405
6 Africa Angola 1962 4269.2767
7 Africa Benin 1952 1062.7522
8 Africa Benin 1957 959.6011
9 Africa Benin 1962 949.4991
10 Africa Botswana 1952 851.2411
11 Africa Botswana 1957 918.2325
12 Africa Botswana 1962 983.6540
13 Africa Burkina Faso 1952 543.2552
14 Africa Burkina Faso 1957 617.1835
15 Africa Burkina Faso 1962 722.5120
16 Africa Burundi 1952 339.2965
17 Africa Burundi 1957 379.5646
18 Africa Burundi 1962 355.2032
data:
structure(list(continent = c("Africa", "Africa", "Africa", "Africa",
"Africa", "Africa"), country = c("Algeria", "Angola", "Benin",
"Botswana", "Burkina Faso", "Burundi"), gdpPercap_1952 = c(2449.0082,
3520.6103, 1062.7522, 851.2411, 543.2552, 339.2965), gdpPercap_1957 = c(3013.976,
3827.9405, 959.6011, 918.2325, 617.1835, 379.5646), gdpPercap_1962 = c(2550.8169,
4269.2767, 949.4991, 983.654, 722.512, 355.2032)), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))
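Since the question also asks about other ways: if you would rather not load tidyr, base R's reshape() can do the same split on "_". This is only a sketch of an alternative, not what the answers above use. The names wide_df and long_base are made up for the example, and the values are copied from the question.

```r
# Small wide data frame in the same shape as the question's data
wide_df <- data.frame(
  continent = c("Africa", "Africa"),
  country = c("Algeria", "Angola"),
  gdpPercap_1952 = c(2449.0082, 3520.6103),
  gdpPercap_1957 = c(3013.9760, 3827.9405)
)

long_base <- reshape(
  wide_df,
  direction = "long",
  varying = grep("^gdpPercap_", names(wide_df)),  # the columns to stack
  sep = "_",                                      # split names into value name and time
  idvar = c("continent", "country"),
  timevar = "year"                                # name of the new time column
)

# reshape() builds long row names from the id and time values; drop them
rownames(long_base) <- NULL
long_base
```

Note that reshape() stacks the rows year by year rather than country by country, so sort afterwards (e.g. with order()) if you need the same row order as the pivot_longer output.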