Remove duplicated values from a data frame

Time:09-12

I am working with a large dataset. My data looks like this:

df <- data.frame(
  Frame0 = c(22, 22, 22, 53, 76, 76, 76, 76, 89, 89, 1000, 1000, 1000, 1000, 1000),
  Frame1 = c(18, 18, 21, 21, 21, 46, 67, 67, 67, 67, 67, 103, 103, 1200, 1200)
)

df

   Frame0 Frame1
1      22     18
2      22     18
3      22     21
4      53     21
5      76     21
6      76     46
7      76     67
8      76     67
9      89     67
10     89     67
11   1000     67
12   1000    103
13   1000    103
14   1000   1200
15   1000   1200

My question is how to delete the duplicated values in each column to get this:

   Frame0 Frame1
1      22     18
2      53     21
3      76     46 
4      89     67
5    1000    103
6           1200

I've tried the unique and duplicated functions without success.

Thanks

Answer:

It seems that Frame0 and Frame1 do not need to be paired row by row. If so, a list is a better data structure than a data.frame, because its elements can have different lengths.

uni <- lapply(df, unique)
uni

# $Frame0
# [1]   22   53   76   89 1000
# 
# $Frame1
# [1]   18   21   46   67  103 1200
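
As a side note, a likely reason unique and duplicated gave nothing useful when called on the whole data frame is that their data.frame methods compare entire rows, not individual columns. A minimal sketch of the difference, reusing the df from the question (row_unique and col_unique are just illustrative names):

```r
df <- data.frame(
  Frame0 = c(22, 22, 22, 53, 76, 76, 76, 76, 89, 89, 1000, 1000, 1000, 1000, 1000),
  Frame1 = c(18, 18, 21, 21, 21, 46, 67, 67, 67, 67, 67, 103, 103, 1200, 1200)
)

# On a data frame, unique() only drops rows that repeat as a whole
# (both columns equal), so most rows survive.
row_unique <- unique(df)
nrow(row_unique)
# [1] 10

# Applying unique() to each column separately gives the per-column result.
col_unique <- lapply(df, unique)
lengths(col_unique)
# Frame0 Frame1
#      5      6
```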

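If you insist on a data.frame, you can pad the shorter vector with NA by indexing every element up to the longest length (indexing past the end of a vector returns NA):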

as.data.frame(lapply(uni, `[`, 1:max(lengths(uni))))

#   Frame0 Frame1
# 1     22     18
# 2     53     21
# 3     76     46
# 4     89     67
# 5   1000    103
# 6     NA   1200
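
An equivalent way to do the padding, in case the indexing trick looks too cryptic, is to set the length of each vector explicitly with `length<-`, which also fills the new positions with NA. This is only a sketch of an alternative; n and padded are illustrative names, not part of the original answer:

```r
df <- data.frame(
  Frame0 = c(22, 22, 22, 53, 76, 76, 76, 76, 89, 89, 1000, 1000, 1000, 1000, 1000),
  Frame1 = c(18, 18, 21, 21, 21, 46, 67, 67, 67, 67, 67, 103, 103, 1200, 1200)
)

# Unique values per column; the results have different lengths.
uni <- lapply(df, unique)

# Length of the longest vector, used as the common target length.
n <- max(lengths(uni))

# `length<-`(x, n) extends x to length n, padding with NA.
padded <- as.data.frame(lapply(uni, `length<-`, n))
padded
```

Both versions produce the same data.frame; which one reads better is mostly a matter of taste.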