I'm trying to understand this code block, I saw this in class and I still don't really get it.
I understand what and HOW a map works. It's a key-pair value object. In this case I just don't understand what is happening. I see that we have a char and int, but I don't understand how they interact in this instance.
class Solution {
public:
bool isIsomorphic(string s, string t) {
map<char, int> mapS;
map<char, int> mapT;
for(int i = 0; i < s.length(); i )
{
if(mapS[s[i]] != mapT[t[i]]) return false;
mapS[s[i]] = i 1;
mapT[t[i]] = i 1;
}
return true;
}
};
I tried printing out the results after each for, and I got 0 and 1 (not in the binary sense). I also know that we are taking a character at 'i' 1 and placing it at the maps index. What am I missing?
Thanks!
Sorry I'm still not used to posting good questions here.
I am trying to understand the solution first and foremost. So instead, like someone suggested, I'll go through it with my reasoning.
First, we initiate two maps (mapS and mapT).
Second, iterate over the length of string s. (I guess we can assume string t is the same length? That part isn't clear.)
We check if the character at s[i] is equal to t[i], and that also has to exist in the map. (this is where it falls apart for me)
I'm treating the line after that as looking ahead one and adding it to the map.
We then return if we don't have an issue.
Now, excuse me if im wrong, but according to the documentation shouldn't we comparing the keys here? Or am missing something entirely.
Also, someone mentioned if I understand isomorphic. I don't entirely. I have a general idea.
CodePudding user response:
First I'll go over the explanation you provided.
First, we initiate two maps (mapS and mapT).
Pedantically, we initialize two maps.
Second, iterate over the length of string s. (I guess we can assume string t is the same length? That part isn't clear.)
We can assume for the sake of this problem that t
is at least as long as s
.
We check if the character at s[i] is equal to t[i],
No. That would be s[i] == t[i]
.
and that also has to exist in the map. (this is where it falls apart for me)
No. mapS[s[i]]
is 0 if we never stored a value for s[i]
in mapS
.
Here's what the if
condition actually does:
It looks up
s[i]
inmapS
, getting back someint
—either theint
we most recently stored for keys[i]
, or 0 if we never stored anint
for keys[i]
.It looks up
t[i]
inmapT
, getting back someint
—either theint
we most recently stored for keyt[i]
, or 0 if we never stored anint
for keyt[i]
.If the two
int
s are different, it makes the function return early with the valuefalse
.
I'm treating the line after that as looking ahead one and adding it to the map.
It doesn't look ahead. Looking ahead would be something like s[i 1]
or t[i 1]
. It does store i 1
in both maps, but it doesn't “look ahead”.
We then return if we don't have an issue.
We return true
if we exit the loop body because i
reached the length of s
without ever returning early.
Now, excuse me if im wrong, but according to the documentation shouldn't we comparing the keys here?
What documentation?
Also, someone mentioned if I understand isomorphic. I don't entirely. I have a general idea.
It is not obvious what “isomorphic” means here, but we'll figure it out.
Now I'll walk through my analysis of this function.
One way to understand what a loop does is to start with simple input and walk through the loop “manually”, as many times as needed. (You start with a simple input so you don't have to walk through it too many times!)
Let's focus on the mapS
variable and see what the loop does to it. Since mapS
is indexed using elements from s
, we need to pick some string for s
. Let's use "abcab"
.
The program only updates mapS
in one statement:
mapS[s[i]] = i 1;
Now let's walk through the loop and see how mapS
evolves. We'll assume for now that the if
condition is always false.
i |
s[i] |
mapS before loop body |
mapS after loop body |
---|---|---|---|
0 | 'a' |
(empty) | ['a']=1 |
1 | 'b' |
['a']=1 |
['a']=1 ['b']=2 |
2 | 'c' |
['a']=1 ['b']=2 |
['a']=1 ['b']=2 ['c']=3 |
3 | 'a' |
['a']=1 ['b']=2 ['c']=3 |
['a']=4 ['b']=2 ['c']=3 |
4 | 'a' |
['a']=4 ['b']=2 ['c']=3 |
['a']=4 ['b']=5 ['c']=3 |
So at the start of the loop body, we can say: for every character c
in s.substr(0, i)
(the first i
characters of s
), mapS[c]
is one larger than the last index of c
in s.substr(0, i)
.
Note that s.substr(0, i)
is the empty string when i == 0
.
As I mentioned earlier, mapS
returns 0 for any key that isn't actually stored in the map. So let's think of 0 as meaning “s.substr(0, i)
doesn't contain c
”.
With that meaning for 0, we can say that, at the start of the loop body, for every c
, mapS[c]
encodes the last index of c
in s.substr(0, i)
as follows:
mapS[c] == 0
encodes the meaning “c
is not ins.substr(0, i)
”.mapS[c] == x
encodes the meaning “x - 1 < i
ands[x - 1] == c
ands.substr(0, i)
does not containc
at any larger index thanx - 1
”.
Since mapT
is updated exactly the same way as mapS
, except using characters from t
, we can say that it encodes the same meaning as mapS
, but relative to t
instead of s
.
Now we're ready to analyze the if
statement's condition: mapS[s[i]] != mapT[t[i]]
.
Remember that s.substr(0, i)
does not reach s[i]
; it stops just before s[i]
. That is, "abcab"[2]
is 'c'
, and "abcab".substr(0, 2)
is "ab"
; the substring does not end with 'c'
.
So, in the if
condition:
mapS[s[i]]
is the encoded last index ofs[i]
ins.substr(0, i)
. (Remember that the “encoded last index” may encode the meaning “no such index”.)mapT[t[i]]
is the encoded last index oft[i]
int.substr(0, i)
.
The if
statement thus makes the function return early, with value false
, if these encoded last indexes different.
Now suppose the loop goes all the way to i == s.length()
, with the if
condition never being true. Then the loop exits normally, and the program has proven that every time it saw some character c
in s
, and some corresponding character d
at the same index in t
, it had previously seen c
and d
at identical earlier (encoded) indexes in s
and t
.
It's a little bit of a leap to get from that to a more holistic understanding of isIsomorphic
, but here's what isIsomorphic(s, t)
means:
Is there a way to uniformly map each character c
that appears in s
to a unique character d
that appears in t
? And vice versa? If so, isIsomorphic(s, t)
returns true. Otherwise, it returns false.
Consider this example: isIsomorphic("abcab", "zyxzy")
. It returns true, because there is a uniform, bidirectional mapping between the characters that occur in s
and the characters that occur in t
:
s | t |
---|---|
a | z |
b | y |
c | x |
But isIsomorphic("abcab", "zyzzy")
is false. Both 'a'
and 'c'
in s
would have to map to 'z'
in t
—but 'z'
in t
can only map back to one character, not to both 'a'
and 'c'
.
This sort of uniform, unique, bidirectional mapping is called a bijection.
An isomorphism is always a bijection, but generally has additional requirements, depending on the kind of structures it's defined on. For example, in graph theory, you could define a bijection between the nodes of two graphs, but an isomorphism would also have to define a bijection between the edges of those graphs.
A better name for this function might be bijectionExists
, or samePattern
.
CodePudding user response:
Getting started with understanding the code first requires understanding the problem it is attempting to solve.
Two strings are isomorphic is every character string A can be mapped to every character in string B.
For example, ACAB
and XCXY
are isomorphic.
A -> X
C -> C
B -> Y
And another, ACAB
and XCZY
are not isomorphic.
A -> X, Z # 'A' cannot map to multiple characters
C -> C
B -> Y
So, how does the code go about figuring this out? We'll look at the parts of the function body.
std::map<char, int> mapS;
std::map<char, int> mapT;
Declare two maps, one for each string. They are both empty.
for(int i = 0; i < s.length(); i )
This loop only cares about the length of s
, and we've already hit a road block. If we wanted to check for a one-way mapping where s -> t
but t !-> s
, this function doesn't do that correctly. I'd go further and say that if the lengths of s
and t
are different, the function should immediately return false
. Because we are guaranteed to read out of bounds is t.length() < s.length()
.
Moving on:
{
if(mapS[s[i]] != mapT[t[i]]) return false;
mapS[s[i]] = i 1;
mapT[t[i]] = i 1;
}
We have to take these together. The if
check simply wants to know if the corresponding map elements have the same value. If they don't, we have a mismatch in our mapping and can immediately return false. If the values do match, we update both maps together. i 1
is used because of how operator[]
works with maps. If a key doesn't exist, attempting to index with that key will create an entry in the map, and initialize the int
value to 0
. Adding 1
is a cheap way to get around false positives.
Finally, if you make it through the loop without returning false
, it is assumed that the strings are isomorphic and you return true.
But this function is bogus. It can almost be said that it checks for "one-way" isomorphism, when t
is longer than s
, but if s
is longer, it can still return true and invoke undefined behavior while doing so.
The better solution, to me, is to store the actual character mapping. You also need to keep track of the unique characters in the second string. And the first check made should be for equal length.
#include <iostream>
#include <set>
#include <string>
#include <unordered_map>
class Solution {
public:
bool isIsomorphic(const std::string& s, const std::string& t) {
if (s.length() != t.length()) return false;
// The map stores the character mapping from s to t. Characters in s
// are the key, and the value is the corresponding character in t.
std::unordered_map<char, char> map;
// The set stores the **unique** characters of t. This allows us to know
// if we already have a mapping to a character in t, if it exists in
// the set.
std::set<char> trackedChars;
for (std::size_t i = 0; i < s.length(); i) {
char sChar = s[i];
char tChar = t[i];
// Have I mapped this character yet?
if (map.find(sChar) != map.end()) { // Yes
// Is the mapping still correct?
if (map[sChar] != tChar) { // Map mismatch found
return false;
}
} else { // No
// Has the corresponding character been mapped already?
if (trackedChars.find(tChar) != trackedChars.end()) {
// tChar is already mapped
return false;
} else { // tChar is also not yet mapped
map[sChar] = tChar;
trackedChars.insert(tChar);
}
}
}
// Making it out of the loop implies every character mapped correctly.
return true;
}
};
int main() {
std::string one{"ACAB"};
std::string two{"XCXY"};
Solution s;
std::cout << std::boolalpha << s.isIsomorphic(one, two) << '\n';
}
Output:
true
I don't want to pile on to what's been said, but for a class, this code structure is garbage. It should just be a free function, or maybe inserted into a namespace. But that's beyond the scope of this question.