I have this data frame
df <- data.frame(c(1:5), c(6:10), rep(1, 5), c(11:15), rep(4, 5))
I want to find all columns whose values are all equal, e.g. all 1 or all 4.
After finding these columns, I would like to replace each of them with a column that has 1 in the first row and 0 in all the other rows.
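I used this to flag the columns (it returns TRUE for columns with more than one distinct value, so the constant columns are the FALSE ones):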
i1 <- sapply(df, function(x) length(unique(x)) >1)
I do not know how to replace those columns with the new ones.
Thank you
CodePudding user response:
Welcome to SO!
You can try this in base R; there are probably more elegant ways:
# get the columns which have the same value (all values in columns == to their mean)
vec <- sapply(df, function(x) all(x == mean(x)))
# split df in two: the columns to replace and the remaining columns
# (drop = FALSE keeps a data frame even if only one column is selected)
df_no <- df[, !vec, drop = FALSE]
df_yes <- df[, vec, drop = FALSE]
# replace with 1,0,0 ...
df_yes <- sapply(df_yes, function(x) c(1, rep(0, length(x) - 1)))
# put together
df <- cbind(df_no, df_yes)
# order alphabetically the output if needed
df[order(names(df))]
  a  b c  d e
1 1  6 1 11 1
2 2  7 0 12 0
3 3  8 0 13 0
4 4  9 0 14 0
5 5 10 0 15 0
with data:
df <- data.frame(a = c(1:5), b = c(6:10), c = rep(1, 5), d = c(11:15), e = rep(4, 5))
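If the split and cbind steps feel heavy, the constant columns can also be overwritten in place, which keeps the original column order without re-sorting. This is only a minimal sketch, assuming the same `df` as above; `is_const` and `indicator` are names made up for this example. It tests `length(unique(x)) == 1` instead of comparing against the mean, which sidesteps any floating-point rounding in the mean.

```r
df <- data.frame(a = 1:5, b = 6:10, c = rep(1, 5), d = 11:15, e = rep(4, 5))

# TRUE for columns that contain a single distinct value
is_const <- vapply(df, function(x) length(unique(x)) == 1, logical(1))

# 1 in the first row, 0 in every other row
indicator <- c(1, rep(0, nrow(df) - 1))

# overwrite only the constant columns, one list element per column
df[is_const] <- lapply(df[is_const], function(x) indicator)
df
```

Indexing with a logical vector inside single brackets treats the data frame as a list of columns, so the other columns are left untouched.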
CodePudding user response:
Here's an apply solution:
df <- setNames(
  data.frame(c(1:5), c(6:10), rep(1, 5), c(11:15), rep(4, 5)),
  letters[1:5])
df_new <- apply(df, 2, function(col)
  if (length(unique(col)) == 1) c(1, rep(0, length(col) - 1)) else col)
df_new
     a  b c  d e
[1,] 1  6 1 11 1
[2,] 2  7 0 12 0
[3,] 3  8 0 13 0
[4,] 4  9 0 14 0
[5,] 5 10 0 15 0
Note that the output is now a matrix, which may actually be a better fit if your data really is all the same type across all columns; however, if you want a data.frame, as.data.frame(df_new) will do that for you.
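If you would rather never leave the data.frame world, one possible variant is to swap apply for lapply and assign into `df_new[]`. This is a sketch of that swapped-in approach, not the code above; it assumes the same five-column `df`, and it processes each column separately, so columns of different types would not be coerced to a common type.

```r
df <- setNames(
  data.frame(c(1:5), c(6:10), rep(1, 5), c(11:15), rep(4, 5)),
  letters[1:5])

# start from a copy so the original df stays unchanged
df_new <- df

# assigning into df_new[] keeps the data.frame structure and column names
df_new[] <- lapply(df, function(col) {
  if (length(unique(col)) == 1) c(1, rep(0, length(col) - 1)) else col
})
df_new
```

The empty brackets on the left-hand side are what keep `df_new` a data.frame; a plain `df_new <- lapply(...)` would return a list instead.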