I have the following dataframe in R. Is there a quick way to replace all of the NA values in the dataframe with a blank string in dplyr?
data <- structure(list(Bank = c("Primary Residential Mortgage Loans",
"FNB", "BWHK", "NED", "STD", "Average", "Non-Primary Residential Mortgage Loans",
"FNB", "BWHK", "NED", "STD", "Average"), `Q3 2021` = c(NA, 88.3,
66.5, 87.07, 97.97, 84.96, NA, 11.7, 33.5, 12.93, 2.03, 15.04
), `Q4 2021` = c(NA, 88.43, 66.71, 87.57, 98, 85.18, NA, 11.57,
33.29, 12.43, 2, 14.82), `Q1 2022` = c(NA, 88.67, 66.6, 89.25,
97.9, 85.61, NA, 11.33, 33.4, 10.75, 2.1, 14.39), `Q2 2022` = c(NA,
88.84, 66.67, 89.18, 97.91, 85.65, NA, 11.16, 33.33, 10.82, 2.09,
14.35), `Q3 2022` = c(NA, 89.09, 66.77, 88.96, 97.98, 85.7, NA,
10.91, 33.23, 11.04, 2.02, 14.3), `Q-o-Q % Point Change` = c(NA,
0.25, 0.1, -0.22, 0.07, 0.05, NA, -0.25, -0.1, 0.22, -0.07, -0.05
), `Y-o-Y % Point Change` = c(NA, 0.79, 0.27, 1.89, 0.01, 0.74,
NA, -0.79, -0.27, -1.89, -0.01, -0.74)), row.names = c(NA, -12L
), class = c("tbl_df", "tbl", "data.frame"))
CodePudding user response:
You can use tidyr::replace_na()
within dplyr::mutate(across())
. You’ll also have to coerce each column to character or replace_na()
will throw an error.
library(dplyr)
library(tidyr)
data %>%
mutate(across(.fns = ~ replace_na(as.character(.x), "")))
# A tibble: 12 × 8
Bank Q3 20…¹ Q4 20…² Q1 20…³ Q2 20…⁴ Q3 20…⁵ Q-o-Q…⁶ Y-o-Y…⁷
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 Primary Residential … "" "" "" "" "" "" ""
2 FNB "88.3" "88.43" "88.67" "88.84" "89.09" "0.25" "0.79"
3 BWHK "66.5" "66.71" "66.6" "66.67" "66.77" "0.1" "0.27"
4 NED "87.07" "87.57" "89.25" "89.18" "88.96" "-0.22" "1.89"
5 STD "97.97" "98" "97.9" "97.91" "97.98" "0.07" "0.01"
6 Average "84.96" "85.18" "85.61" "85.65" "85.7" "0.05" "0.74"
7 Non-Primary Resident… "" "" "" "" "" "" ""
8 FNB "11.7" "11.57" "11.33" "11.16" "10.91" "-0.25" "-0.79"
9 BWHK "33.5" "33.29" "33.4" "33.33" "33.23" "-0.1" "-0.27"
10 NED "12.93" "12.43" "10.75" "10.82" "11.04" "0.22" "-1.89"
11 STD "2.03" "2" "2.1" "2.09" "2.02" "-0.07" "-0.01"
12 Average "15.04" "14.82" "14.39" "14.35" "14.3" "-0.05" "-0.74"
# … with abbreviated variable names ¹`Q3 2021`, ²`Q4 2021`, ³`Q1 2022`,
# ⁴`Q2 2022`, ⁵`Q3 2022`, ⁶`Q-o-Q % Point Change`, ⁷`Y-o-Y % Point Change`
CodePudding user response:
Here is a base R
function:
First we have to transfor all columns to character type (we do it with lapply
). Then assign ""
to data:
na_blank <- function(df) {
df[] <- lapply(df[], as.character)
df[is.na(df)] <- ""
return(df)
}
na_blank(data)
Bank `Q3 2021` `Q4 2021` `Q1 2022` `Q2 2022` `Q3 2022`
<chr> <chr> <chr> <chr> <chr> <chr>
1 Primary Resident… "" "" "" "" ""
2 FNB "88.3" "88.43" "88.67" "88.84" "89.09"
3 BWHK "66.5" "66.71" "66.6" "66.67" "66.77"
4 NED "87.07" "87.57" "89.25" "89.18" "88.96"
5 STD "97.97" "98" "97.9" "97.91" "97.98"
6 Average "84.96" "85.18" "85.61" "85.65" "85.7"
7 Non-Primary Resi… "" "" "" "" ""
8 FNB "11.7" "11.57" "11.33" "11.16" "10.91"
9 BWHK "33.5" "33.29" "33.4" "33.33" "33.23"
10 NED "12.93" "12.43" "10.75" "10.82" "11.04"
11 STD "2.03" "2" "2.1" "2.09" "2.02"
12 Average "15.04" "14.82" "14.39" "14.35" "14.3"
# … with 2 more variables: `Q-o-Q % Point Change` <chr>,
# `Y-o-Y % Point Change` <chr>