Home > database >  Add row based on condition in R
Add row based on condition in R

Time:03-11

I have a dataset that has information on power plants that are operating (currently functioning) or planned (will be opened in the future). However, for a few cases, there are power plants with the status "operational & planned". These power plants already exist but will expand at a given date. I need to correct this, by doubling the entries of this outlier, i.e., when a power plant is "operational & planned" I need to keep the current row (which I will rename to operational) and create a new identical row in which the status is planned. <br />

Simply put, what I need to do is, when there is an "operational & planned" power plant, I need to split it into operational and planned by creating a new row with the same data points. I am not being able to do this with ifelse. Could someone suggest an approach, please?



Example<br />

Firm   Status   Production<br />
A     Operating      1000<br />
B     Planned        1030<br />
C     Operating      1000<br />
D     Operating & Planned   1000<br />

Output<br />
Firm   Status   Production<br />
A     Operating      1000<br />
B     Planned        1030<br />
C     Operating      1000<br />
D     Operating   1000<br />
D     Planned   1000<br />

CodePudding user response:

You can use the package splitstackshape. You can use this code:

First your data:

df <- data.frame(Firm = c("A", "B", "C", "D"),
                 Status = c("Operating", "Planned", "Operating", " Operating & Planned"),
                 Production = c(1000, 1030, 1000, 1000))

Output:

  Firm               Status Production
1    A            Operating       1000
2    B              Planned       1030
3    C            Operating       1000
4    D  Operating & Planned       1000

Next use this code to split the row in two rows with same value:

library(splitstackshape)
df2 <- cSplit(df, "Status", sep = "&", direction = "long")
df2

Output:

   Firm    Status Production
1:    A Operating       1000
2:    B   Planned       1030
3:    C Operating       1000
4:    D Operating       1000
5:    D   Planned       1000

CodePudding user response:

You're looking for tidyr::separate_rows:

library(tidyr)
separate_rows(df, Status, sep = " & ")

Using @Quinten's dataset:

df <- data.frame(Firm = c("A", "B", "C", "D"),
                 Status = c("Operating", "Planned", "Operating", "Operating & Planned"),
                 Production = c(1000, 1030, 1000, 1000))
df %>% 
  separate_rows(Status, sep = " & ")

# A tibble: 5 x 3
  Firm  Status    Production
  <chr> <chr>          <dbl>
1 A     Operating       1000
2 B     Planned         1030
3 C     Operating       1000
4 D     Operating       1000
5 D     Planned         1000
  •  Tags:  
  • r
  • Related