I am referring the document here
https://developer.apple.com/documentation/swift/array/2298686-reduce
So I got code snippet from one article its about reduce
which gives unique value set from the array
func countUniques<T: Comparable>(_ array: Array<T>) -> Int {
let sorted = array.sorted()
let initial: (T?,Int) = (.none, 0)
let reduced = sorted.reduce(initial) {
return ($1, $0.0 == $1 ? $0.1 : $0.1 1)// Doubt here
}
return reduced.1
}
print(countUniques([1,1,1,1,1,1,2,2,2,2,3,3,3,4]))// prints 4
Reduce is generally used to sum up elements but here it's little tricky & I am not able to understand return ($1, $0.0 == $1 ? $0.1 : $0.1 1)
what this is actually doing ?
CodePudding user response:
Recall that reduce
works by repeated applying a function (that is the closure you pass in) to an accumulator (that is the initial
), for each element of the sequence:
var acc = initial
for element in array {
acc = f(acc, element) // f is the closure
}
Now knowing that, this reduction will make a lot more sense if you give meaningful names to $0.0
, $0.1
and $1
. Here I have renamed $0
to (prev, total)
, and $1
to curr
.
func countUniques<T: Comparable>(_ array: Array<T>) -> Int {
let sorted = array.sorted()
let initial: (T?,Int) = (.none, 0)
let (_, total) = sorted.reduce(initial) { acc, curr in
let (prev, total) = acc
return (curr, prev == curr ? total : total 1)
}
return total
}
prev
is used to keep track of the previous element that we have seen. Notice how we always return curr
as the first element of the pair. This is to say that the current element that we are seeing, becomes the "previous element" in the next iteration. Also note that it starts off .none
(nil
), as we have not seen anything yet.
If prev
is not curr
, we add 1 to total
, otherwise we use the same total
. In this way, we can count how many times we see a different element from the one just saw.
For a sorted array, "how many times we see a different element from the one just saw" is the same number as the number of unique elements, because the duplicates will all be grouped next to each other.
CodePudding user response:
initial
not having labels makes that code too impenetrable.
Fixing that and using the into
overload should make things more clear. But with either overload, $0
is the accumulating value, and $1
is the current iteration of the sequence being reduced.
extension Array where Element: Comparable {
var countOfUniques: Int {
sorted()
.reduce(into: (previous: Element?.none, count: 0)) {
if $0.previous != $1 {
$0.count = 1
}
$0.previous = $1
}
.count
}
}
If you write this with a more reusable case of reduce
, you can disregard $1
, but only if you name $0
.
import Algorithms
[1,1,1,1,1,1,2,2,2,2,3,3,3,4].uniqued().count
public extension Sequence {
/// - Complexity: O(n)
var count: Int {
reduce(0) { count, _ in count 1 }
}
}
CodePudding user response:
Debugging is a useful skill. If you don't understand, you can add breakpoints and see the values, conditional breakpoints with prints, etc, or directly print()
.
Since the reduce()
is "complex", let's explicit it:
Let's name the variables. It's a reduce(_:_)
, so we know what mean the first and second parameter. The first one is "initial" at start, and is modified during the course of the iteration, so let's call it partialResult
, and it will be the finalResult
on last iteration, giving its value to the return function, here let reduced =
. The second one is the element of the iteration:
let reduced = sorted.reduce(initial) { partialResult, currentIterationItem in
return (currentIterationItem, partialResult.0 == currentIterationItem ? partialResult.1 : partialResult.1 1)// Doubt here
}
Let's remove the ternary if now, with an explicit if/else:
let reduced = sorted.reduce(initial) { partialResult, currentIterationItem in
if partialResult.0 == currentIterationItem {
return (currentIterationItem, partialResult.1)
} else {
return (currentIterationItem, partialResult.1 1)
}
}
Let's use a temp variable which will holds the value returned:
let reduced = sorted.reduce(initial) { partialResult, currentIterationItem in
var returnValue: (T?,Int)
if partialResult.0 == currentIterationItem {
returnValue = (currentIterationItem, partialResult.1)
} else {
returnValue = (currentIterationItem, partialResult.1 1)
}
return returnValue
}
Now, let's add logs:
let reduced = sorted.reduce(initial) { partialResult, currentIterationItem in
print("Current Item to analyze: \(currentIterationItem)")
print("Partial Result: \(partialResult) before")
var returnValue: (T?,Int)
if partialResult.0 == currentIterationItem {
print("partialResult.0 == currentIterationItem")
returnValue = (currentIterationItem, partialResult.1)
} else {
print("partialResult.0 != currentIterationItem")
returnValue = (currentIterationItem, partialResult.1 1)
}
print("Return value: \(returnValue) which is partialResult at next iteration")
return returnValue
}
You can try it now, but your example is not the best one. Why? Because it's using an already sorted array, and "4" is also the value of the last item, it could bring confusion...
Instead, let's use an array of String
, single letters for better understanding (1 -> A, 2 -> B, 3 -> C, 4 -> D):
print(countUniques5(["A","A","A","A","A","A","B","B","B","B","C","C","C","D"]))// prints 4
Okay, now, what are the logs:
Current Item to analyze: A
Partial Result: (nil, 0) before
partialResult.0 != currentIterationItem
Return value: (Optional("A"), 1) which is partialResult at next iteration
Current Item to analyze: A
Partial Result: (Optional("A"), 1) before
partialResult.0 == currentIterationItem
Return value: (Optional("A"), 1) which is partialResult at next iteration
Current Item to analyze: A
Partial Result: (Optional("A"), 1) before
partialResult.0 == currentIterationItem
Return value: (Optional("A"), 1) which is partialResult at next iteration
Current Item to analyze: A
Partial Result: (Optional("A"), 1) before
partialResult.0 == currentIterationItem
Return value: (Optional("A"), 1) which is partialResult at next iteration
Current Item to analyze: A
Partial Result: (Optional("A"), 1) before
partialResult.0 == currentIterationItem
Return value: (Optional("A"), 1) which is partialResult at next iteration
Current Item to analyze: A
Partial Result: (Optional("A"), 1) before
partialResult.0 == currentIterationItem
Return value: (Optional("A"), 1) which is partialResult at next iteration
Current Item to analyze: B
Partial Result: (Optional("A"), 1) before
partialResult.0 != currentIterationItem
Return value: (Optional("B"), 2) which is partialResult at next iteration
Current Item to analyze: B
Partial Result: (Optional("B"), 2) before
partialResult.0 == currentIterationItem
Return value: (Optional("B"), 2) which is partialResult at next iteration
Current Item to analyze: B
Partial Result: (Optional("B"), 2) before
partialResult.0 == currentIterationItem
Return value: (Optional("B"), 2) which is partialResult at next iteration
Current Item to analyze: B
Partial Result: (Optional("B"), 2) before
partialResult.0 == currentIterationItem
Return value: (Optional("B"), 2) which is partialResult at next iteration
Current Item to analyze: C
Partial Result: (Optional("B"), 2) before
partialResult.0 != currentIterationItem
Return value: (Optional("C"), 3) which is partialResult at next iteration
Current Item to analyze: C
Partial Result: (Optional("C"), 3) before
partialResult.0 == currentIterationItem
Return value: (Optional("C"), 3) which is partialResult at next iteration
Current Item to analyze: C
Partial Result: (Optional("C"), 3) before
partialResult.0 == currentIterationItem
Return value: (Optional("C"), 3) which is partialResult at next iteration
Current Item to analyze: D
Partial Result: (Optional("C"), 3) before
partialResult.0 != currentIterationItem
Return value: (Optional("D"), 4) which is partialResult at next iteration
We know the goal of the method, and we know how reduce(_:_:)
works.
So what's happening?
We have a start (nil, 0)
. We then, test the current element against nil
, it's different, we'll have now (A, 1). Then, since it's sorted, we do the same for the remaining "A", it's the same, the partialResult
is still (A,1). When entering next letter "B", it goes into the else, so we keep the letter "B" into partialResult, and we increment the second tuple value. We know have (B, 2), etc.
So, the tuple meaning is: first value: Current item, second value: counter of different items seen.
It works, because array if sorted. If it wasn't, the output would be totally wrong.
In my opinion, it's not good (as they is no explanation) code. It's a little hard to read, understand what's happening. A Set
if possible would have been better, but it needs to be Hashable
compliant.
Usually, when you need to create a var
before reduce(_:_:)
, prefer reduce(into:_:)
.
In a nutshell: See intermediary values & decomplex code by breaking into multiple small steps...
CodePudding user response:
I would suggest a multi-step process to reverse-engineer this code:
Let us replace the closure‘s “shorthand argument names” (the
$0
and the$1
) with meaningful names as deduced from thereduce
documentation. Thereduce
function’s closure’s parameters are (a) the “result” from the previous iteration; and (b) the current element from the array for the current iteration of thereduce
closure:func countUniques<T: Comparable>(_ array: [T]) -> Int { let sorted = array.sorted() let initial: (T?, Int) = (.none, 0) let reduced = sorted.reduce(initial) { previousResult, item in return (item, previousResult.0 == element ? previousResult.1 : previousResult.1 1) // Doubt here } return reduced.1 }
Already, it is starting to be a little more clear. But let us continue our analysis.
If the ternary syntax (the
a ? b : c
) makes the meaning opaque, it is often useful to break it down into anif
-else
statement:func countUniques2<T: Comparable>(_ array: [T]) -> Int { let sorted = array.sorted() let initial: (T?, Int) = (.none, 0) let reduced = sorted.reduce(initial) { previousResult, item in if previousResult.0 == item { return (item, previousResult.1) } else { return (item, previousResult.1 1) } } return reduced.1 }
OK, so what is
(T?, Int)
and the subsequent.0
and.1
syntax? That is a tuple. It is a pair of values which is being used in three places, namely:- the
initial
value; - the first parameter of the closure,
previousResult
; and - the return value of the closure.
You are apparently familiar with
reduce
that is passing a single value from one iteration to another (e.g., a simple sum). But when you see a tuple in areduce
function closure, that simply means thatreduce
is passing two values from one iteration to the next. So what are these two tuple elements?In the code, we see
previousResult.0
(the first element of the tuple) andpreviousResult.1
(the second element of the tuple). So, what does that first element of the tuple represent?By looking at what the closure returns, we can infer that
previousResult.0
is simply the previous value in the array (notice that thereturn
returns the current item in the array as the first parameter).How about the second element of that tuple, the
previousResult.1
? Well, we can see if the previous value is the same and the current value in the array, it simply returnspreviousResult.1
, but if the values are different, it returnspreviousResult.1
plus1
. Clearly this is just the count of unique values encountered thus far. I.e., if the current array item is different than the previous item in the array, it increments the count.func countUniques3<T: Comparable>(_ array: [T]) -> Int { let sorted = array.sorted() let initial: (T?, Int) = (.none, 0) // first element of tuple is previous value in sorted array; second element is count of unique values encountered thus far let reduced = sorted.reduce(initial) { previousResult, item in let (previousValue, previousUniquesCount) = previousResult // decompose the tuple's two elements into meaningful names if previousValue == item { return (item, previousUniquesCount) } else { return (item, previousUniquesCount 1) } } return reduced.1 // the final result is the second element of the tuple returned by `reduce` }
- the
Now that we can start to see what the routine is doing, we could translate this into imperative routine, replacing the tuple with two separate variables to represent the tuple’s two elements:
func countUniques4<T: Comparable>(_ array: [T]) -> Int { let sorted = array.sorted() var previousValue: T? var count = 0 for item in sorted { if item != previousValue { count = 1 previousValue = item } } return count }
Or, even more concisely:
func countUniques5<T: Comparable>(_ array: [T]) -> Int { let sorted = array.sorted() var previousValue: T? var count = 0 for value in sorted where value != previousValue { count = 1 previousValue = value } return count }
Now, one can debate the merits of
reduce
with a tuple. It is a clever technique, but arguably makes the code harder to understand at a glance. Admittedly, once you have done it a few times, it becomes easier to interpret. I guess we are lucky that the author did not elect to make it even more “concise”:func countUniques6<T: Comparable>(_ array: [T]) -> Int { array.sorted().reduce((T?.none, 0)) { ($1, $0.0 == $1 ? $0.1 : $0.1 1) }.1 }
or
func countUniques7<T: Comparable>(_ array: [T]) -> Int { array.sorted().reduce(into: (T?.none, 0)) { if $0.0 != $1 { $0 = ($1, $0.1 1) } }.1 }
As you can see, you can write very compact expressions with functional code, shorthand argument names, tuples, etc. There can even be a perverse pleasure in doing so. But generally we strive for code that is easily understood and maintained (except for those rare cases where we have to sacrifice these features for the sake of performance, which is not the case here).
Speaking of performance, this whole routine is somewhat inefficient. Notably, the algorithm is predicated upon first sorting the array. That is slow. It is also an inefficient use of space.
For example, if the array element values were hashable, using a Set
would be more efficient and avoids the complexity of the above:
func countUniques8<T: Hashable>(_ array: [T]) -> Int {
return Set(array).count
}
There is no need to build a separate sorted array, sorting each and every element of the array, simply to determine how many unique values there are. You only need to know whether the value has been encountered before. A Set
accomplishes this efficiently.