I have a data frame with 2342 observations of 11 variables. The observations (in the column DATE) start on January 01, 2016 and run through May 31, 2022. I want to add more than 1000 rows so that the DATE column extends to 2025. I used this code:
delhi_ra[nrow(delhi_ra) + no. of rows,]
But this adds just 1 row to the data frame and gives me an error: can't assign to columns beyond the end with non-consecutive locations. Input has size Subscript __ contains non-consecutive location. So the question is: how do I add more than 1000 rows to the existing data frame? I am attaching the code I have written so far for reference.
df_delhi <- rcrea::measurements(country = 'IN', poll = rcrea::PM25, date_from = '2016-01-01', date_to = '2022-05-30', city = 'Delhi')
delhi_ra <- rcrea::utils.running_average(df_delhi,365)
delhi_ra$date <- as.Date(delhi_ra$date)
avg_2017 <- delhi_ra$value[733]
percent_decrease <- 30
NCAP_target <- avg_2017*(100-percent_decrease)/100
delhi_ra$avg_2017 <- NA
delhi_ra$avg_2017[733:2342] <- 78.49
delhi_ra[nrow(delhi_ra) + 946,]
# plot the data (ggplot2 for the plot, scales for breaks_extended)
library(ggplot2)
library(scales)
colors <- c("PM2.5 level" = "black", "NCAP Target by 2024" = "red")
ggplot(data = delhi_ra, aes(x = date)) +
  geom_line(aes(y = value, color = 'PM2.5 level')) +
  geom_line(aes(y = avg_2017, color = 'NCAP Target by 2024'), linetype = 'twodash') +
  rcrea::theme_crea() +
  labs(title = 'PM2.5 pollution levels in Delhi', x = 'Year', y = 'PM2.5', color = 'Legend') +
  scale_color_manual(values = colors) +
  scale_x_date(limits = as.Date(c('2017-01-01', '2025-01-01')), breaks = '1 year', date_labels = "%b %Y") +
  scale_y_continuous(expand = c(0, 0), limits = c(0, NA), breaks = breaks_extended(5))
Answer:
If I understand correctly, you're trying to add many new rows so that the column DATE can be extended.
One possible way of doing this is to create a new data frame with the dates you want to add and then join it to your data frame with full_join from dplyr. I use lubridate because it has a lot of handy functions for working with dates.
In the simple example below I'm only adding 4 rows, but you can add 1000 or more by changing the end date in the line that creates dat2.
library(lubridate)
library(dplyr)
# I first create some simulated data as an example
dat <- data.frame(DATE = as_date(ymd("2016-01-01"):ymd("2016-01-03")), value = c(7, 2, 8))
dat
#>         DATE value
#> 1 2016-01-01     7
#> 2 2016-01-02     2
#> 3 2016-01-03     8
# Create the empty data with the dates you want to add
dat2 <- data.frame(DATE = as_date(ymd("2016-01-04"):ymd("2016-01-07")))
# Join the two datasets; by = "DATE" names the join key explicitly
full_join(dat, dat2, by = "DATE")
#>         DATE value
#> 1 2016-01-01     7
#> 2 2016-01-02     2
#> 3 2016-01-03     8
#> 4 2016-01-04    NA
#> 5 2016-01-05    NA
#> 6 2016-01-06    NA
#> 7 2016-01-07    NA
Created on 2022-06-25 by the reprex package (v2.0.1)
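To scale the same idea up to the data in the question, here is a minimal sketch. It makes several assumptions: delhi_ra below is a simulated stand-in (the real one comes from rcrea), the end date 2025-01-01 is taken from the x-axis limit in the question's plot, and 78.49 is the target value hard-coded in the question. It uses base R's seq() to build the future dates, which is an alternative to the lubridate call above, not something the original code used.

```r
library(dplyr)

# Simulated stand-in for delhi_ra: one row per day, as in the question
delhi_ra <- data.frame(
  date = seq(as.Date("2016-01-01"), as.Date("2022-05-30"), by = "day")
)
delhi_ra$value <- 100 + 10 * sin(seq_len(nrow(delhi_ra)) / 58)  # made-up values

# Every day after the last observation, up to the assumed end date
end_date <- as.Date("2025-01-01")
future_dates <- seq(max(delhi_ra$date) + 1, end_date, by = "day")
future <- data.frame(date = future_dates)

# full_join keeps all rows from both sides; columns missing in `future` become NA
extended <- full_join(delhi_ra, future, by = "date")

# Fill the target column over the extended range so the dashed line reaches 2025
ncap_target <- 78.49  # value hard-coded in the question
extended$avg_2017 <- NA_real_
extended$avg_2017[733:nrow(extended)] <- ncap_target
```

Passing extended instead of delhi_ra to ggplot should then draw the target line out to the end date, while the measured series stops at the last real observation because its future values are NA.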