The dataset (named Politics) that I am working on looks as follows:
[Image: current dataset of every country in the world]
However, my original dataset also contains the years 1997, 1999 and 2001. As the picture shows, the current dataset has no rows for these years for any country.
I would like to insert rows for the years 1997, 1999 and 2001 for every country in the current dataset, so that it looks something like this:
Country Year Politics
Afghanistan 1996 Value 1
Afghanistan 1997 empty value
Afghanistan 1998 Value 2
....
....
....
Albania 1996 Value 3
Albania 1997 empty value
etc
etc
Or is there perhaps another way? My original dataset looks as follows: [Image: original dataset]
In short, I want the current dataset to line up with the original dataset. Currently this is not possible, because the original dataset has the years 1997, 1999 and 2001 whereas the current dataset does not include them.
I hope that I have given a clear explanation of what I would like to see.
CodePudding user response:
If I understood correctly, you can use full_join() from dplyr, something like this:
dplyr::full_join(current_dataset, original_dataset, by = c("Country Name" = "Country", "Time" = "Time"))
After that, you can use select(column1, column2, ...) to keep or drop columns.
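As a minimal sketch of the join-then-select idea, here is a self-contained version. The two data frames, their column names (Country, Time, Politics, Other) and their values are made up for illustration, since the real datasets were only shared as images; adjust the by argument to your actual column names.

```r
library(dplyr)

# Hypothetical stand-in for the current dataset (even years only)
current_dataset <- data.frame(
  Country  = c("Afghanistan", "Afghanistan", "Albania", "Albania"),
  Time     = c(1996, 1998, 1996, 1998),
  Politics = c(0.1, 0.2, 0.3, 0.4)
)

# Hypothetical stand-in for the original dataset (every year)
original_dataset <- data.frame(
  Country = rep(c("Afghanistan", "Albania"), each = 3),
  Time    = rep(c(1996, 1997, 1998), times = 2),
  Other   = 1:6
)

# Keep every Country/Time pair found in either dataset,
# then keep only the columns of interest and sort the rows
joined <- full_join(current_dataset, original_dataset,
                    by = c("Country", "Time")) %>%
  select(Country, Time, Politics) %>%
  arrange(Country, Time)

joined
```

Rows that exist only in the original dataset (here, 1997) end up with NA in Politics.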
CodePudding user response:
You can use complete from tidyr. You can either specify explicitly the years you want for each country, or refer to the years in your second data.frame. The example below uses sample data based on your post (I recommend using dput to share data instead of an image).
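set.seed(123)
# Example data: two countries observed only in even years
df <- data.frame(
  Country = c(rep("Afghanistan", 4), rep("Albania", 4)),
  Time = c(1996, 1998, 2000, 2002, 1996, 1998, 2000, 2002),
  Politics = rnorm(n = 8)
)
library(tidyverse)
# Add a row for every year from 1996 to 2002 per country;
# Politics is NA in the newly added rows
df %>%
  complete(Time = 1996:2002, nesting(Country)) %>%
  arrange(Country, Time)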
Output
Time Country Politics
<dbl> <chr> <dbl>
1 1996 Afghanistan -0.560
2 1997 Afghanistan NA
3 1998 Afghanistan -0.230
4 1999 Afghanistan NA
5 2000 Afghanistan 1.56
6 2001 Afghanistan NA
7 2002 Afghanistan 0.0705
8 1996 Albania 0.129
9 1997 Albania NA
10 1998 Albania 1.72
11 1999 Albania NA
12 2000 Albania 0.461
13 2001 Albania NA
14 2002 Albania -1.27
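The other option mentioned above, taking the years from your second data.frame instead of typing them out, could look roughly like this. The data frame named original and its columns are assumptions made for the sketch, not your actual data.

```r
library(dplyr)
library(tidyr)

# Example data with only even years (values are made up)
df <- data.frame(
  Country  = rep(c("Afghanistan", "Albania"), each = 4),
  Time     = rep(c(1996, 1998, 2000, 2002), times = 2),
  Politics = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8)
)

# Hypothetical second dataset that contains every year
original <- data.frame(
  Country = rep(c("Afghanistan", "Albania"), each = 7),
  Time    = rep(1996:2002, times = 2)
)

# Expand df to every combination of its countries and the
# years found in the second dataset; new rows get NA in Politics
completed <- df %>%
  complete(Country, Time = unique(original$Time)) %>%
  arrange(Country, Time)

completed
```

This way the set of years follows whatever is present in the second dataset, so it does not need updating if that dataset changes.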