How can I separate the currency symbols from the numbers and create a new column in which each symbol is replaced by its currency code (EUR, USD, JPY, etc.) in a data frame like this in R?
> a
prices
1 $100.00
2 200.00€
3 75.40¥
4 £51.98
5 154.00 EUR
6 59.00 USD
The desired output should look like this:
> a
prices currency
1 100.00 USD
2 200.00 EUR
3 75.40 JPY
4 51.98 GBP
5 154.00 EUR
6 59.00 USD
Thanks in advance!
CodePudding user response:
Using tidyverse functions. As far as I know there is no built-in converter, so I'm afraid you'll have to map the currency symbols to currency codes manually (you can do it with case_when).
library(tidyverse)
dat %>%
  mutate(value = parse_number(prices),
         # drop digits, decimal points and spaces, leaving only the symbol or code
         text = gsub("[0-9. ]+", "", prices),
         text = case_when(text == "$" ~ "USD",
                          text == "€" ~ "EUR",
                          text == "¥" ~ "JPY",  # ISO code for yen, as in the desired output
                          text == "£" ~ "GBP",
                          TRUE ~ text))         # already a code: keep it unchanged
output
      prices  value text
1    $100.00 100.00  USD
2    200.00€ 200.00  EUR
3     75.40¥  75.40  JPY
4     £51.98  51.98  GBP
5 154.00 EUR 154.00  EUR
6  59.00 USD  59.00  USD
data
dat <- read.table(header = TRUE, text = " prices
1 $100.00
2 200.00€
3 75.40¥
4 £51.98
5 '154.00 EUR'
6 '59.00 USD'")
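If the list of symbols grows, a named lookup vector can be easier to maintain than a long case_when. The following is only a base R sketch of that idea, not part of the answer above; the names symbol_to_code, label and result are made up for illustration, and the mapping covers just the four symbols from the question.

```r
# Hypothetical lookup table: currency symbol -> currency code (extend as needed)
symbol_to_code <- c("$" = "USD", "€" = "EUR", "¥" = "JPY", "£" = "GBP")

prices <- c("$100.00", "200.00€", "75.40¥", "£51.98", "154.00 EUR", "59.00 USD")

# Numeric part: remove everything that is not a digit or a decimal point
value <- as.numeric(gsub("[^0-9.]", "", prices))

# Currency part: remove digits, decimal points and spaces
label <- gsub("[0-9. ]+", "", prices)

# Translate known symbols; labels not in the table (e.g. "EUR") are kept as-is
known <- label %in% names(symbol_to_code)
currency <- unname(ifelse(known, symbol_to_code[label], label))

result <- data.frame(prices = value, currency = currency)
print(result)
```

Adding a new currency then only means adding one entry to symbol_to_code.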
CodePudding user response:
Building on Maël's answer, here is a stringr solution for the regex step. I also added a case for handling lower-case currency codes.
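library(tidyverse)
dat %>%
  mutate(numbers = str_extract(prices, "[0-9.]+"),   # numeric part, still a character column
         # remove the numeric part and trim the space left in e.g. "154.00 EUR"
         currency = str_trim(str_replace(prices, numbers, "")),
         currency = case_when(currency == "$" ~ "USD",
                              currency == "€" ~ "EUR",
                              currency == "¥" ~ "JPY",
                              currency == "£" ~ "GBP",
                              TRUE ~ toupper(currency)))  # e.g. "usd" becomes "USD"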
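To end up with exactly the two columns from the question (a numeric prices column plus currency), the same steps can be applied while overwriting prices at the end. This is a minimal sketch under a few assumptions: the data frame is rebuilt inline, the last row is lower-cased on purpose to exercise the toupper branch, and the name out is only illustrative.

```r
library(dplyr)
library(stringr)
library(readr)

# Example data; "59.00 usd" is lower case on purpose (an assumption for this demo)
dat <- data.frame(prices = c("$100.00", "200.00€", "75.40¥", "£51.98",
                             "154.00 EUR", "59.00 usd"))

out <- dat %>%
  mutate(
    # strip the numeric part, then surrounding whitespace
    currency = str_trim(str_remove(prices, "[0-9.]+")),
    currency = case_when(currency == "$" ~ "USD",
                         currency == "€" ~ "EUR",
                         currency == "¥" ~ "JPY",
                         currency == "£" ~ "GBP",
                         TRUE ~ toupper(currency)),
    # overwrite prices last, after currency has been derived from the text
    prices = parse_number(prices)
  )

print(out)
```

The order inside mutate matters here: currency is computed from the original text before prices is replaced by its numeric value.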