Home > Net >  Change Character Value Into Negative Number Using R
Change Character Value Into Negative Number Using R

Time:12-27

I have the following dataframe. All the columns are character columns, but except the 1st column, rest of the columns contain numeric values:

df <- data.frame(col1=c('', 'assets', 'loss', 'liability'),
                 col2=c("", "5,000", "(1,400)", "300"),
                 col3=c("", "4,500", "(1,100)", "500"))

df

col1       col2      col3
<chr>      <chr>    <chr>
        
assets     5,000    4,500
loss      (1,400)  (1,100)
liability   300      500 
                                                                  ​               ​        ​

The task is -- if a value is within () [for example, loss value], I need to make that value as a negative number.

I am trying the following script:

clean_columns <- function(dataframe){

  dataframe_new <- copy(dataframe)

  for (idx in 2:length(names(dataframe))) {
      
      # Check if text contains -- ( )
      if (grepl(is.na(dataframe_new[, idx]), "(", fixed = TRUE)) {
        
      # Remove characters -- () , -- and multiply -1
      dataframe_new[, idx] <- gsub("[(,)]", "", dataframe_new[, idx]) %>%
        as.numeric(dataframe_new[, idx])*(-1)
      
      } else 
      # Remove characters -- ,
      dataframe_new[, idx] <- gsub("[,]", "", dataframe_new[, idx]) %>%
        as.numeric(dataframe_new[, idx])
} 
  return(dataframe_new)
}

The function removes the characters correctly, but make the loss value NA.

clean_columns(df)

col1       col2     col3
<chr>      <dbl>    <dbl>
            NA       NA
assets     5000     4500
loss        NA       NA
liability  300      500

Desired Output

col1       col2     col3
<chr>      <dbl>    <dbl>
            NA       NA
assets     5000     4500
loss      -1400    -1100
liability   300      500

Any suggestions would be appreciated. Thanks!

CodePudding user response:

We loop across the columns having at least some digits, remove the ), replace the ( with - and parse it with parse_number

library(dplyr)
library(stringr)
 df %>% 
  mutate(across(-1, 
   ~ readr::parse_number(str_replace(str_remove(.x, "\\)"), fixed("("), "-") )))

-output

       col1  col2  col3
1              NA    NA
2    assets  5000  4500
3      loss -1400 -1100
4 liability   300   500

Or instead of removing the (, we can detect the ( to multiply by -1

df %>%
   mutate(across(-1, ~ c(1, -1)[1   str_detect(.x, fixed("("))] * 
         readr::parse_number(.x) ))

CodePudding user response:

Here is one without my favorite function parse_number():

library(dplyr)
library(stringr)

df %>% 
  mutate(across(-col1, ~str_replace_all(., "[[:punct:]]", ""))) %>% 
  type.convert(as.is = TRUE) %>% 
  mutate(across(-col1, ~ifelse(col1 == "loss", .*-1, .))) %>% 
  as_tibble()
  col1         col2  col3
  <chr>       <dbl> <dbl>
1 ""             NA    NA
2 "assets"     5000  4500
3 "loss"      -1400 -1100
4 "liability"   300   500
  • Related