Sorting dataframe with 1 column

Time:12-15

I have a data frame of names which has 1 column. I have tried multiple iterations of order() and have also converted it to a list and tried sort() in a few different ways, with no luck.

Below is dput() for reference:

> dput(names.ordered)
structure(list(Directors = c("Darabont, Frank", "Nolan, Christopher", 
"Lumet, Sidney", "Spielberg, Steven", "Jackson, Peter", "Tarantino, Quentin", 
"Leone, Sergio", "Fincher, David", "Zemeckis, Robert", "Kershner, Irvin", 
"Wachowski, Lana", "Scorsese, Martin", "Forman, Milos", "Kurosawa, Akira", 
"Demme, Jonathan", "Meirelles, Fernando", "Benigni, Roberto", 
"Capra, Frank", "Lucas, George", "Miyazaki, Hayao", "Besson, Luc", 
"Kobayashi, Masaki", "Polanski, Roman", "Cameron, James", "Singer, Bryan", 
"Hitchcock, Alfred", "Allers, Roger", "Chaplin, Charles", "Kaye, Tony", 
"Takahata, Isao", "Chazelle, Damien", "Scott, Ridley", "Nakache, Olivier", 
"Curtiz, Michael", "Tornatore, Giuseppe", "Kubrick, Stanley", 
"Wilder, Billy", "Stanton, Andrew", "Russo, Anthony", "Persichetti, Bob", 
"Chan-Wook, Park", "Phillips, Todd", "Shinkai, Makoto", "Unkrich, Lee", 
"Labaki, Nadine", "Petersen, Wolfgang", "Hirani, Rajkumar", "Lasseter, John", 
"Mendes, Sam", "Gibson, Mel", "Kail, Thomas", "Marquand, Richard", 
"Klimov, Elem", "Lang, Fritz", "Khan, Aamir", "Welles, Orson", 
"Vinterberg, Thomas", "Aronofsky, Darren", "Donen, Stanley", 
"Gondry, Michel", "Lean, David", "Tiwari, Nitesh", "Villeneuve, Denis", 
"Zeller, Florian", "Farhadi, Asghar", "Ray, Satyajit", "Ritchie, Guy", 
"Jeunet, Jean-Pierre", "Mulligan, Robert", "Docter, Pete", "Mann, Michael", 
"Hanson, Curtis", "McTiernan, John", "Gnanavel, T.J.", "Farrelly, Peter", 
"Hirschbiegel, Oliver", "Gilliam, Terry", "Eastwood, Clint", 
"Majidi, Majid", "Kramer, Stanley", "Sturges, John", "Huston, John", 
"Howard, Ron", "Coen, Ethan", "Carpenter, John", "Bergman, Ingmar", 
"McDonagh, Martin", "Pablos, Sergio", "Lynch, David", "Weir, Peter", 
"Reed, Carol", "McTeigue, James", "Boyle, Danny", "Coen, Joel", 
"O'Connor, Gavin", "Fleming, Victor", "Ozu, Yasujirô", "Kazan, Elia", 
"Irmak, Cagan", "Szifron, Damián", "Tarkovsky, Andrei", "Cimino, Michael", 
"Costa-Gavras, Costa-Gavras,", "Anderson, Wes", "Keaton, Buster", 
"Bruckman, Clyde", "Linklater, Richard", "Elliot, Adam", "Sheridan, Jim", 
"Abrahamson, Lenny", "Raghavan, Sriram", "Mangold, James", "McQueen, Steve", 
"Lubitsch, Ernst", "DeBlois, Dean", "Miller, George", "Wyler, William", 
"Yates, David", "Clouzot, Henri-Georges", "Reiner, Rob", "Kashyap, Anurag", 
"Rosenberg, Stuart", "Hallström, Lasse", "Kassovitz, Mathieu", 
"Truffaut, François", "Yamada, Naoko", "Stone, Oliver", "McCarthy, Tom", 
"Jones, Terry", "George, Terry", "Turgul, Yavuz", "Wong, Kar-Wai", 
"Penn, Sean", "Anno, Hideaki", "Pontecorvo, Gillo", "Fellini, Federico", 
"Wenders, Wim", "Kieslowski, Krzysztof", "Kumar, Ram", "Coppola, Francis Ford", 
"Joon Ho, Bong", "von Donnersmarck, Florian Henckel", "Van Sant, Gus", 
"De Sica, Vittorio", "Hill, George Roy", "De Palma, Brian", "Mankiewicz, Joseph L.", 
"Anderson, Paul Thomas", "del Toro, Guillermo", "Campanella, Juan José", 
"Shyamalan, M. Night", "Dreyer, Carl Theodor", "Avildsen, John G.", 
"Iñárritu, Alejandro G.")), row.names = c(NA, -154L), class = "data.frame")

A couple things I've already tried which returned errors or no results:

> names.ordered <- names.ordered[order(names.ordered$Directors)]
Error in `[.data.frame`(names.ordered, order(names.ordered$Directors)) : 
  undefined columns selected

> names.ordered <- names.ordered[order(1)] 

#after converting to list
> names.ordered <- sort(names.ordered)
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 
  'x' must be atomic

CodePudding user response:

You need to specify which column is to be ordered/sorted even if the data frame contains only one column.

If you want to preserve the original order of names.ordered, use order() to create an index:

idx <- order(names.ordered$Directors)
head(names.ordered)
#            Directors
# 1    Darabont, Frank
# 2 Nolan, Christopher
# 3      Lumet, Sidney
# 4  Spielberg, Steven
# 5     Jackson, Peter
# 6 Tarantino, Quentin
head(names.ordered[idx, ])
# [1] "Abrahamson, Lenny"     "Allers, Roger"         "Anderson, Paul Thomas" "Anderson, Wes"         "Anno, Hideaki"         "Aronofsky, Darren" 

If you want to rearrange names.ordered itself, use sort():

names.ordered$Directors <- sort(names.ordered$Directors)
head(names.ordered$Directors)
# [1] "Abrahamson, Lenny"     "Allers, Roger"         "Anderson, Paul Thomas" "Anderson, Wes"         "Anno, Hideaki"         "Aronofsky, Darren"    
tail(names.ordered$Directors)
# [1] "Wong, Kar-Wai"    "Wyler, William"   "Yamada, Naoko"    "Yates, David"     "Zeller, Florian"  "Zemeckis, Robert"
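
As a small self-contained sketch of how the two approaches relate: order() returns row positions that you can reuse, while sort() returns the sorted values themselves. The data frame df and its three names are made up for illustration and are not the asker's data.

```r
# Hypothetical three-row data frame, used only for illustration
df <- data.frame(Directors = c("Nolan, Christopher", "Allers, Roger", "Lumet, Sidney"),
                 stringsAsFactors = FALSE)

# order() returns the row positions that would sort the column
idx <- order(df$Directors)

# sort() returns the sorted values directly
sorted_values <- sort(df$Directors)

# Indexing the column with idx gives the same values as sort()
same_result <- identical(df$Directors[idx], sorted_values)
print(idx)
print(same_result)
```

Because idx is kept separately, df stays in its original order and the index can be applied again later.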

CodePudding user response:

I think your main problem is that you attempted to order the columns. The syntax to extract elements from a data frame is either x[i, j, ... , drop=TRUE] or x[j], where i denotes the rows and j the columns. Note the comma, which is always needed when referring to rows. Since you did not use a comma, R interprets your call as x[j] and tries to reorder the columns. So put order() before the comma to sort the rows.
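
To see the effect of the comma in isolation, here is a minimal sketch on a made-up three-row data frame (df and its contents are illustrative assumptions, not the asker's data). It also suggests why the order(1) attempt appeared to do nothing: order(1) is just 1, so it selects the first column unchanged.

```r
# Hypothetical one-column data frame, used only for illustration
df <- data.frame(Directors = c("Nolan, Christopher", "Allers, Roger", "Lumet, Sidney"),
                 stringsAsFactors = FALSE)
idx <- order(df$Directors)

# Without a comma, idx is read as column positions. df has one column,
# so asking for columns 2 and 3 raises an error, which is captured here.
no_comma <- tryCatch(df[idx], error = function(e) conditionMessage(e))

# order(1) evaluates to 1, so this just selects column 1 in its original order
first_column <- df[order(1)]

# With a comma, idx is read as row positions and the rows come back sorted
with_comma <- df[idx, ]

print(no_comma)
print(with_comma)
```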

In the order() call, pass the vector whose ordering should determine how the rows of the data frame are rearranged.

A minor complication is that the data frame has only one column, so the result would be coerced to the lowest possible dimension (i.e. a vector in this case). To avoid this, use the argument drop=FALSE.

names_ordered <- names.ordered[order(names.ordered$Directors), , drop=FALSE]

head(names_ordered)
#                 Directors
# 110     Abrahamson, Lenny
# 27          Allers, Roger
# 148 Anderson, Paul Thomas
# 104         Anderson, Wes
# 134         Anno, Hideaki
# 58      Aronofsky, Darren
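
If you want to check the effect of drop yourself, the following sketch compares the two results on a made-up three-row data frame (df is a hypothetical stand-in, not the asker's data).

```r
# Hypothetical one-column data frame, used only for illustration
df <- data.frame(Directors = c("Nolan, Christopher", "Allers, Roger", "Lumet, Sidney"),
                 stringsAsFactors = FALSE)
idx <- order(df$Directors)

as_vector <- df[idx, ]                # the single column is dropped to a character vector
as_frame  <- df[idx, , drop = FALSE]  # the result stays a one-column data frame

print(class(as_vector))
print(class(as_frame))

# The row names of the sorted data frame keep the original row positions
print(rownames(as_frame))
```

The row names are a quick way to confirm that the rows were reordered rather than the values overwritten.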