Extract column names from another dataframe using a vector of column numbers using R

Time:07-16

I have an R dataframe df like so:

Index   Apple       Banana      Cherry      Grapes      Kiwi        Mango    Pineapple
1   0.66847562  0.495134685 0.820451181 0.070702057 0.683837693 0.388543228 0.571200597
2   0.184473453 0.508429653 0.396672132 0.743786979 0.661554403 0.436394826 0.377355932
3   0.301713951 0.179224237 0.50246674  0.170402365 0.479336322 0.124812993 0.371646433
4   0.745509908 0.337535336 0.800883347 0.717630725 0.985711555 0.234468872 0.16068873
5   0.476586474 0.100029462 0.613557843 0.567949925 0.619796691 0.553820422 0.705426475
6   0.856433091 0.495988882 0.314542875 0.9065113   0.118285352 0.95569566  0.665548558
7   0.110767972 0.171679955 0.545800594 0.919484825 0.415596164 0.628647556 0.483384835
8   0.094686487 0.546430332 0.153235316 0.980495149 0.144969949 0.114925394 0.255054177
9   0.485556303 0.487981036 0.503833458 0.842182793 0.333361918 0.305030779 0.161368591
10  0.109480492 0.102061878 0.294458866 0.485434308 0.379377818 0.559150491 0.675970203
11  0.439851776 0.836668881 0.358897587 0.24019111  0.481115144 0.500810573 0.695537232
12  0.309172232 0.008729781 0.197713284 0.080192772 0.374389407 0.332776965 0.70539989
13  0.043975567 0.552578684 0.598413321 0.177443546 0.433807567 0.324200317 0.244426492
14  0.362977832 0.965445082 0.831462419 0.237671771 0.36718815  0.218914879 0.5592952

Code to create table:

# Values are random, so they will differ from the table above on each run
df <- data.frame(Index = 1:14,
                 Apple = runif(14, min = 0, max = 1),
                 Banana = runif(14, min = 0, max = 1),
                 Cherry = runif(14, min = 0, max = 1),
                 Grapes = runif(14, min = 0, max = 1),
                 Kiwi = runif(14, min = 0, max = 1),
                 Mango = runif(14, min = 0, max = 1),
                 Pineapple = runif(14, min = 0, max = 1))

After some parsing of this data set, I extracted the row and column positions of interest into a second dataframe, doi:

row  colum
2     3
2     5
7     4
8     7
11    2
12    2
13    1

Code to create table:

doi<- data.frame(row=c(2,2,7,8,11,12,13),
                  colum=c(3,5,4,7,2,2,1))

For each row/column pair in doi, I want to look up the matching Index value and column name in df, so I use:

data.frame(Index=c(df[doi$row,1]), Fruits=c(colnames(df[doi$colum])))

However, a .1 is appended to the name of any column that occurs more than once. This is the result I am getting:

 Index  Fruits
1     2  Banana
2     2  Grapes
3     7  Cherry
4     8   Mango
5    11   Apple
6    12 Apple.1
7    13   Index

For index 12 why does it show Apple.1 instead of just Apple?

Answer:

The correct statement is:

data.frame(Index=c(df[doi$row,1]), Fruits=c(colnames(df)[doi$colum]))

It returns:

  Index Fruits
1     2 Banana
2     2 Grapes
3     7 Cherry
4     8  Mango
5    11  Apple
6    12  Apple
7    13  Index

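colnames(df)[doi$colum] indexes the vector of column names directly, so a position that appears twice simply returns the same name twice. colnames(df[doi$colum]), on the other hand, first builds a new data.frame from the selected columns. Because column 2 is selected twice, R makes the duplicated names unique in that new data.frame, hence the .1 suffix.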
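
The difference can be reproduced on a much smaller example. The sketch below uses a hypothetical three-column stand-in for df and a hypothetical index vector cols (neither is from the question); only the column names matter here.

```r
# Small stand-in data frame (hypothetical values, not the data from the question)
df <- data.frame(Index = 1:3,
                 Apple = c(0.1, 0.2, 0.3),
                 Banana = c(0.4, 0.5, 0.6))
cols <- c(2, 2, 1)  # column 2 is requested twice

# Subsetting the data frame first creates a new data frame, and the
# duplicated column name is made unique before colnames() sees it.
names_after_subset <- colnames(df[cols])
print(names_after_subset)   # "Apple" "Apple.1" "Index"

# Subsetting the character vector of names is ordinary vector indexing,
# so a repeated position just repeats the name.
names_from_vector <- colnames(df)[cols]
print(names_from_vector)    # "Apple" "Apple" "Index"
```

The same idea applies to names(df)[cols], since names() and colnames() return the same vector for a data frame.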
