Creating a variable of initial values from a base variable in a panel data structure in R

I'm trying to create a new variable in R that holds the initial value of another variable (Crime) within each group (country), where "initial" means the first observed year for that group (a panel data setting). My current data looks like this:

country	year	Crime
Albania	2016	2.7369478
Albania	2017	2.0109779
Argentina	2002	9.474084
Argentina	2003	7.7898825
Argentina	2004	6.0739941

And I want it to look like this:

country	year	Crime	Initial_Crime
Albania	2016	2.7369478	2.7369478
Albania	2017	2.0109779	2.7369478
Argentina	2002	9.474084	9.474084
Argentina	2003	7.7898825	9.474084
Argentina	2004	6.0739941	9.474084

I saw that ddply could do this, but the problem is that it is no longer supported by the latest R updates.

Thank you in advance.

CodePudding user response：

Maybe arrange by year, then, after grouping by country, set Initial_Crime to the first Crime in each group.

library(tidyverse)

df %>%
  arrange(year) %>%
  group_by(country) %>%
  mutate(Initial_Crime = first(Crime))

Output

  country    year Crime Initial_Crime
  <chr>     <int> <dbl>         <dbl>
1 Argentina  2002  9.47          9.47
2 Argentina  2003  7.79          9.47
3 Argentina  2004  6.07          9.47
4 Albania    2016  2.74          2.74
5 Albania    2017  2.01          2.74
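
For a reproducible check, the sample from the question can be typed in by hand. The sketch below assumes the data frame is named df, as in the answer above, and swaps first() for Crime[which.min(year)] so the result does not depend on row order. That substitution and the name result are illustrative choices, not part of the original answer.

```r
library(dplyr)

# Sample data copied from the question
df <- data.frame(
  country = c("Albania", "Albania", "Argentina", "Argentina", "Argentina"),
  year    = c(2016L, 2017L, 2002L, 2003L, 2004L),
  Crime   = c(2.7369478, 2.0109779, 9.474084, 7.7898825, 6.0739941)
)

result <- df %>%
  group_by(country) %>%
  # Take Crime from the row with the earliest year in each country
  mutate(Initial_Crime = Crime[which.min(year)]) %>%
  # Drop the grouping so later operations are not applied per country
  ungroup()
```

Because no arrange() is involved, the rows keep the order they had in df.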

CodePudding user response：

library(data.table)

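# Takes Crime from the first row of each country, so the rows should already be sorted by year
setDT(data)[, Initial_Crime := .SD[1, Crime], by = country]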

     country year    Crime Initial_Crime
1:   Albania 2016 2.736948      2.736948
2:   Albania 2017 2.010978      2.736948
3: Argentina 2002 9.474084      9.474084
4: Argentina 2003 7.789883      9.474084
5: Argentina 2004 6.073994      9.474084
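
The one-liner above picks the first row of each group, so it relies on the table already being sorted by year. A minimal sketch of a variant that does not rely on row order is given below. It assumes the same column names as the question and builds the sample data with the rows deliberately shuffled.

```r
library(data.table)

# Sample data from the question, deliberately out of year order
data <- data.table(
  country = c("Argentina", "Albania", "Argentina", "Albania", "Argentina"),
  year    = c(2004L, 2017L, 2002L, 2016L, 2003L),
  Crime   = c(6.0739941, 2.0109779, 9.474084, 2.7369478, 7.7898825)
)

# For each country, pick Crime from the row with the smallest year
data[, Initial_Crime := Crime[which.min(year)], by = country]
```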

CodePudding user response：

A data.table solution

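setDT(df)

# Number the rows within each country, copy Crime on the first row,
# carry it forward to the remaining rows, then drop the helper column.
# Assumes the rows are sorted by country and year.
df[, x := 1:.N, by = country
   ][x == 1, Initial_Crime := Crime
     ][, Initial_Crime := nafill(Initial_Crime, type = "locf")
       ][, x := NULL
         ]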
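
If installing packages is a concern, the same idea can probably be expressed in base R alone. The sketch below is not from any of the answers above. It assumes a plain data frame named df with the columns from the question, sorts it by country and year, and uses ave() to repeat the first Crime value within each country.

```r
# Sample data copied from the question
df <- data.frame(
  country = c("Albania", "Albania", "Argentina", "Argentina", "Argentina"),
  year    = c(2016L, 2017L, 2002L, 2003L, 2004L),
  Crime   = c(2.7369478, 2.0109779, 9.474084, 7.7898825, 6.0739941)
)

# Sort so that the first row of each country is its earliest year
df <- df[order(df$country, df$year), ]

# ave() applies FUN within each country and returns a vector aligned with the rows
df$Initial_Crime <- ave(df$Crime, df$country, FUN = function(x) x[1])
```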