Filling NA rows using data from earlier dates using dplyr in R

Time:10-10

I have data:

set.seed(2021)
sales <- round(runif(672, 10, 50), 0)

I want to add the sales data to a data frame as a new column. My data frame looks like this:

library(lubridate)
library(tidyr)
gen_month <- function(first_datex){
  first_datex <- as.Date(first_datex)
  last_datex <- ceiling_date(first_datex, 'month') - 1
  expand_grid(datex = seq(first_datex, last_datex, by = 'day'), hourx = 0:23)
}
mydata <- gen_month("2021-03-01")

As a sample, I use March as mydata. We then combine mydata and sales.

set.seed(2021)
sales <- c(sales, rep(NA,72))
df <- data.frame(mydata, sales)

#tail(df)
#         datex hourx sales
#739 2021-03-31    18    NA
#740 2021-03-31    19    NA
#741 2021-03-31    20    NA
#742 2021-03-31    21    NA
#743 2021-03-31    22    NA
#744 2021-03-31    23    NA

However, because sales is shorter than mydata, I want to fill the NA rows at the end of March with the values from the start of sales. The output I hope for is:

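# sales may already be padded with NA at this point, so take only the 672 real values.
df <- data.frame(mydata, sales2 = c(sales[1:672], sales[1:72]))
# The sales2 values in head(df, 72) and tail(df, 72) should be the same.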

My questions are:

  1. How do we automate this process?
  2. If sales is longer than mydata, we need to truncate sales so that it fits mydata. Can a single solution cover both this case and my first question?

Many thanks.

CodePudding user response:

You can subset sales to the number of rows in mydata. This truncates sales when it is too long; when it is too short, the out-of-range positions become NA.

mydata$sales <- sales[1:nrow(mydata)]
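
To cover both cases in the question with one call, one option is base R's rep_len(), which recycles a vector from its start when it is too short and truncates it when it is too long. The following is a minimal sketch, not a definitive solution: it assumes the original, unpadded sales vector (672 values), builds a base-R stand-in for mydata instead of calling gen_month(), and uses fit_to_rows() as a hypothetical helper name.

```r
# Stand-in for mydata: every hour of March 2021 (31 days x 24 hours = 744 rows).
dates  <- seq(as.Date("2021-03-01"), as.Date("2021-03-31"), by = "day")
mydata <- data.frame(datex = rep(dates, each = 24),
                     hourx = rep(0:23, times = length(dates)))

# Original, unpadded sales vector from the question (672 values).
set.seed(2021)
sales <- round(runif(672, 10, 50), 0)

# fit_to_rows() is a hypothetical helper name, not part of any package.
# rep_len() recycles x from its start when x is shorter than n
# and truncates x when it is longer than n.
fit_to_rows <- function(x, n) rep_len(x, n)

mydata$sales <- fit_to_rows(sales, nrow(mydata))

# Compare the first and last 72 values; they should match after recycling.
identical(head(mydata$sales, 72), tail(mydata$sales, 72))
```

Because the recycling always restarts at the first element, this only matches the desired output when the missing rows should be filled from the beginning of sales, as in the question.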

CodePudding user response:

We may also use seq_len(), which stays correct even when mydata has zero rows:

mydata$sales <- sales[seq_len(nrow(mydata))]
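
Since the question asks about dplyr, the same idea can probably be written with mutate() and n(). This is a sketch under assumptions: the dplyr package is installed, mydata is rebuilt here as a base-R stand-in for gen_month("2021-03-01"), and sales_raw is an illustrative name for the original 672-value vector, chosen so that it cannot be confused with the new sales column.

```r
library(dplyr)

# Stand-in for mydata: every hour of March 2021 (744 rows).
dates  <- seq(as.Date("2021-03-01"), as.Date("2021-03-31"), by = "day")
mydata <- data.frame(datex = rep(dates, each = 24),
                     hourx = rep(0:23, times = length(dates)))

# Original, unpadded sales vector (illustrative name: sales_raw).
set.seed(2021)
sales_raw <- round(runif(672, 10, 50), 0)

# n() is the number of rows in the current (ungrouped) data frame;
# rep_len() recycles or truncates sales_raw to exactly that length.
df <- mydata %>%
  mutate(sales = rep_len(sales_raw, n()))

tail(df)
```

If mydata were grouped, n() would return the group size rather than the total row count, so this sketch assumes an ungrouped data frame.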