Remove multiple columns from a df

Time:10-22

I have a data frame and want to remove two columns from it by name. Another post on here suggests the following code should work (or be close to working). Would anyone be able to point me in the right direction?

My df column names

head(economics_df)

  `Series Name`  `Series Code` Country `Country Code` `1997` `1998` `1999` `2000` `2001` `2002` `2003` `2004` `2005` `2006` `2007` `2008` `2009` `2010` `2011` `2012`
  1 GDP (current … NY.GDP.MKTP.… Spain   ESP            5.900… 6.192… 6.349… 5.983… 6.278… 7.087… 9.074… 1.069… 1.153… 1.260…

code to remove unwanted columns

economics_df = economics_df %>% select(-c(`Series Code`, `Country Code`))

other ways tried

economics_df = economics_df[-c("`Series Code`", "`Country Code`")]

CodePudding user response:

Would this work? It's hard to know without the structure of the column names.

library(dplyr)

economics_df = economics_df %>% 
  dplyr::select(-c("`Series Code`", "`Country Code`"))

Or using base R:

economics_df = economics_df[, !names(economics_df) %in% c("`Series Code`", "`Country Code`")]

Output

  `Series Name` `Country` `1997`
1             1         1      1

Data

economics_df <- structure(list("`Series Name`" = 1, "`Series Code`" = 1, "`Country`" = 1, 
    "`Country Code`" = 1, "`1997`" = 1), class = "data.frame", row.names = c(NA, 
-1L))

#  `Series Name` `Series Code` `Country` `Country Code` `1997`
#1             1             1         1              1      1

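Or, if your column names are structured slightly differently (no literal backticks in the names), drop the extra quoting. In dplyr, remove the outer quotation marks and keep the backticked names:

economics_df = economics_df %>% 
  dplyr::select(-c(`Series Code`, `Country Code`))

In base R, use plain strings instead, because bare backticked names would be evaluated as objects:

economics_df = economics_df[, !names(economics_df) %in% c("Series Code", "Country Code")]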

Data

economics_df <- structure(list(`Series Name` = "1", `Series Code` = 1, Country = "1", 
    `Country Code` = 1, `1997` = 1), row.names = c(NA, -1L), class = "data.frame")
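
For the second case (names with spaces but no literal backticks), here is a minimal sketch that runs both approaches side by side. The cell values are placeholders I made up, and all_of() is my own substitution for the -c(...) form above, not something the asker used; it takes a character vector and errors if a name is missing, which helps catch typos.

```r
library(dplyr)

# Toy data frame with non-syntactic names (placeholder values, not the real data)
economics_df <- data.frame(
  `Series Name` = "GDP", `Series Code` = "X1", Country = "Spain",
  `Country Code` = "ESP", `1997` = 1,
  check.names = FALSE  # keep the names exactly as written
)

# Plain strings: no backticks needed inside a character vector
drop_cols <- c("Series Code", "Country Code")

# Base R: keep every column whose name is not in drop_cols
base_result <- economics_df[, !names(economics_df) %in% drop_cols, drop = FALSE]

# dplyr: all_of() selects by exact name and fails loudly on unknown names
dplyr_result <- economics_df %>% select(-all_of(drop_cols))

names(base_result)
names(dplyr_result)
```

Both calls should leave the same three columns. If names(economics_df) shows something different from what you typed into drop_cols, that mismatch is the first thing to check.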

CodePudding user response:

I assume you are looking for something like the code below. First, I loaded the tidyverse for tibbles (name repair) and column selection. Then I recreated one row of your data (I didn't make all of the columns because that is time-consuming). I then combined those variables into a tibble, which automatically retains the odd naming. Finally, I deselected the two columns by name with contains().

#### Load Library ####
library(tidyverse)

#### Create Variables ####
`Series Name` <- "Name"
`Series Code` <- 000
Country <- "Afghanistan" 
`Country Code` <- 311
`1997` <- 100 
`1998` <- 200 
`1999` <- 300

#### Turn into Tibble ####
df <- tibble(`Series Name`,`Series Code`,Country,
                 `Country Code`,`1997`,`1998`,`1999`)

#### Deselect Based on Name ####    
df %>% 
  select(-contains("Country Code"),
         -contains("Series Code"))

Which gives you this tibble back:

# A tibble: 1 × 5
  `Series Name` Country     `1997` `1998` `1999`
  <chr>         <chr>        <dbl>  <dbl>  <dbl>
1 Name          Afghanistan    100    200    300
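
One caveat worth sketching: contains() matches substrings (and ignores case by default), so it can drop more than intended if another column name includes the same text. The example below assumes a hypothetical extra column, `Country Code Alt`, which is not in the asker's data, and compares contains() with all_of(), which matches exact names only.

```r
library(dplyr)

# Same shape as the tibble above, plus one hypothetical column for illustration
df <- tibble(
  `Series Name` = "Name", `Series Code` = 0, Country = "Afghanistan",
  `Country Code` = 311, `Country Code Alt` = 1, `1997` = 100
)

# Substring match: also removes `Country Code Alt`
by_contains <- df %>%
  select(-contains("Country Code"), -contains("Series Code"))

# Exact match: removes only the two named columns
by_exact <- df %>%
  select(-all_of(c("Country Code", "Series Code")))

names(by_contains)
names(by_exact)
```

With the asker's actual columns the two approaches give the same result, so this only matters if similar names show up later.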