Home > Enterprise >  Create "Ascending ID" in a Data Frame
Create "Ascending ID" in a Data Frame

Time:04-06

I have this data in R:

column1 = c("A", "B", "C")

column2 = c("AA", "BB", "CC", "DD")

column3 = c("AAA")


na.pad <- function(x,len){
    x[1:len]
}

makePaddedDataFrame <- function(l,...){
    maxlen <- max(sapply(l,length))
    data.frame(lapply(l,na.pad,len=maxlen),...)
}


d = makePaddedDataFrame(list(x=column1,y=column2,z=column3))

     x  y    z
1    A AA  AAA
2    B BB <NA>
3    C CC <NA>
4 <NA> DD <NA>
  • I would like to give "Ascending ID's" to each element in this table such that NA's are not assigned an ID - for example:

enter image description here

In this above example:

  • The first row of Column X is assigned as id = 1, the second row of Column X is assigned as id = 2, the third row of Column X is assigned as id = 3, and the fourth row of Column X is skipped because there is an NA

  • Since there are no NA's in Column Y, the first row of Column Y is assigned as id = 4 (picks off from the previous row), the second row of Column Y is assigned as id = 5, the third row of Column Y is assigned as id = 6, and the fourth row of Column Y is assigned as id = 7

  • Since all rows in Column Z are NA except the first row, only the first row of Column Z is assigned as id = 8 and all other rows are skipped.

Thank you!

CodePudding user response:

Here is one option with replace - create a logical matrix of 'd' where there are non-NA elements (!is.na(d)), replace those elements, with sequence (sum(!is.na(d)) - returns the total count of non-NA, seq_len, gives the sequence for that count) and assign (<-) it to new columns by pasteing the '_id' on the existing column names

d[paste0(names(d), "_id")] <- replace(d, !is.na(d), seq_len(sum(!is.na(d))))

-output

> d
     x  y    z x_id y_id z_id
1    A AA  AAA    1    4    8
2    B BB <NA>    2    5 <NA>
3    C CC <NA>    3    6 <NA>
4 <NA> DD <NA> <NA>    7 <NA>

CodePudding user response:

You can try:

d[paste0(names(d), "_id")] <- cumsum(c(!is.na(d))) * match(!is.na(d), TRUE)

Which gives:

     x  y    z x_id y_id z_id
1    A AA  AAA    1    4    8
2    B BB <NA>    2    5   NA
3    C CC <NA>    3    6   NA
4 <NA> DD <NA>   NA    7   NA
  • Related