I am working on a practice problem that asks me to write a function that finds the digits that occur most often across the numbers in an array.
The example input is:
x = c(25, 2, 3, 57, 38, 41)
and the return value is 2, 3, 5, since the digits 2, 3 and 5 each occur 2 times, which is the most.
CodePudding user response:
One approach would look something like this, although I am sure there are more efficient approaches:
my_vector <- c(25, 2, 3, 57, 38, 41)
# function to count how many times each digit occurs
digit_occurrence <- function(vector) {
  # collapse the vector into a single string with no separators
  x <- paste(vector, sep = '', collapse = '')
  # create empty vector
  digit <- c()
  # loop over each digit 0-9 and store its number of occurrences
  for(i in as.character(0:9)) {
    digit[i] <- lengths(regmatches(x, gregexpr(i, x)))
  }
  digit
}
> digit_occurrence(my_vector)
0 1 2 3 4 5 6 7 8 9
0 1 2 2 1 2 0 1 1 0
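The function above returns the count for every digit rather than the most frequent digits themselves. A minimal sketch of that last step follows; `most_frequent_digits` is a hypothetical helper name, not part of the answer, and the counts are copied from the output above so the snippet runs on its own.

```r
# Hypothetical helper (not from the original answer): given a named vector
# of digit counts, return every digit whose count equals the maximum.
most_frequent_digits <- function(counts) {
  as.integer(names(counts)[counts == max(counts)])
}

# Counts copied from the digit_occurrence(my_vector) output above
counts <- c("0" = 0, "1" = 1, "2" = 2, "3" = 2, "4" = 1,
            "5" = 2, "6" = 0, "7" = 1, "8" = 1, "9" = 0)

most_frequent_digits(counts)
# [1] 2 3 5
```

Comparing against `max(counts)` keeps all tied digits, which is what the question asks for.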
CodePudding user response:
A solution using the table() function to get a data frame with the frequency of each digit (instead of counting with a for loop), then sorting that data frame by frequency and extracting the top n digits (three by default):
input_vector <- c(25, 2, 3, 57, 38, 41)
top_digits <- function(my_array, n = 3) {
  # `as.character` converts the numbers to strings,
  # `strsplit` splits each one into individual characters (e.g. "23" into "2" and "3")
  # and `unlist` "flattens" the result into a single character vector
  my_array_splitted <- unlist(strsplit(as.character(my_array), ""))
  # `table` creates a vector of frequencies
  # `as.data.frame` converts the vector into a DF with 2 columns: digits and frequencies
  df_digits <- as.data.frame(table(my_array_splitted))
  # Sorting the DF by frequency
  df_digits <- df_digits[order(df_digits$Freq, decreasing = TRUE),]
  # Extracting the first `n` elements of the digits column (which is now sorted) and converting back to integer
  # (we need the intermediate step as character because the column is originally a factor, and converting directly to integer is unsafe)
  as.integer(as.character(df_digits$my_array_splitted[1:n]))
}
CodePudding user response:
This could be another option for you:
fn <- function(x) {
  # First we convert each element to a character string and split it into
  # its individual digits. We then use do.call to apply c to every element
  # of the resulting list, which flattens the list into a single vector
  digits <- do.call(c, sapply(x, function(y) strsplit(as.character(y), "")))
  # Finally we calculate the frequencies and sort them in decreasing order
  most_freq <- sort(table(digits), decreasing = TRUE)
  most_freq
}
fn(x)
digits
2 3 5 1 4 7 8
2 2 2 1 1 1 1
CodePudding user response:
This approach is similar and uses table:
count = function(x) {
  # make a table of counts of all the digits
  tab = table(strsplit(paste(x, collapse=""), "")[[1]])
  # return the names of the digits whose count equals the maximum
  names(tab)[tab == max(tab)]
}
And a fun benchmark because it's Christmas:
x = sample(1:1000, 100000, replace=TRUE)
Unit: milliseconds
                expr       min        lq      mean    median        uq      max
               me(x)  46.63262  52.34020  57.33796  53.87266  58.91561 123.5481
             anou(x) 319.14199 351.43877 381.35371 374.78037 408.67354 490.3464
 digit_occurrence(x) 149.83663 151.61908 160.47220 156.88108 161.57646 245.5067
       top_digits(x)  42.40598  49.92426  55.87991  51.90813  56.61563 109.5608
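The call that produced these timings is not shown. The output layout resembles that of the microbenchmark package, so here is a rough sketch of how a comparable measurement could be run, assuming that package. `count_digits` is a hypothetical stand-in for one table-based function, and the seed and `times = 10` are arbitrary choices, so the numbers will not match the ones above.

```r
# Hypothetical stand-in for one of the table-based functions being timed
count_digits <- function(x) {
  tab <- table(strsplit(paste(x, collapse = ""), "")[[1]])
  names(tab)[tab == max(tab)]
}

set.seed(1)  # arbitrary seed, only so the sample is repeatable
x <- sample(1:1000, 100000, replace = TRUE)

if (requireNamespace("microbenchmark", quietly = TRUE)) {
  # Assumption: the original benchmark used microbenchmark
  print(microbenchmark::microbenchmark(count_digits(x), times = 10))
} else {
  # Base R fallback: a single coarse timing
  print(system.time(count_digits(x)))
}
```

Further expressions can be added to the `microbenchmark()` call, separated by commas, to compare several functions on the same input.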