Home > Software engineering >  How do I manually find a substring in a string? (Java)
How do I manually find a substring in a string? (Java)

Time:11-28

        public int lookFor(String s) {
    final int EXIST = 1;
    final int NOT_EXIST = -1; 
    int thisIndex = 0;
    int otherIndex = 0;
    char thisNext;
    char otherNext;
    

    if (s == null || s.length() == 0)
        return NOT_EXIST;
    
    for(; thisIndex < this.mainString.length() ; ) {
        
        thisNext = this.mainString.charAt(thisIndex);
        otherNext = s.charAt(otherIndex);
        
        if (thisNext == otherNext) {
            thisIndex  ;
            otherIndex  ;
        }
        
        else if (thisNext != otherNext)
            thisIndex  ;
        
        if (otherIndex == s.length()-1)
            return EXIST;       
    }
    return NOT_EXIST;
}

This is my attempt so far.

mainString = the main string I want to find the substring in. s = the substring.

So my idea was to get the first chars of both strings, see if they equal. if they don't, i'll get the second char of mainString, see if they equal (mainString second char to s first char). If they're not equal, i'll get the third char of mainString and so forth. Once they're equal, i'll get the next char of both strings and see if they both equal.

Basically the loops knows that mainString contains s when index of s equals to s length minus one (that means the loop looped all the way to the last char inc, of s, so s index == s length -1).

Is the logic I'm trying to work with incorrect? or I just executed it not good? i'll happy to get answers!

CodePudding user response:

Here's my naïve approach:

private final int EXIST = 1;
private final int NOT_EXIST = -1;

private int lookFor(String a, String b, int index) {
     for (int i = 0; i < b.length(); i  ) {
          if ((index   i) >= a.length()) return NOT_EXIST;
          if (a.charAt(index   i) != b.charAt(i)) return NOT_EXIST;
     }
     return EXIST;
}

public int lookFor(String a, String b) {
     char start = b.charAt(0);
     for (int i=0; i < a.length(); i  ) {
          if (a.charAt(i) == start) {
               if (lookFor(a, b, i) == EXIST) return EXIST;
          }
     }
     return NOT_EXIST;
}

Though, I'm not sure why you would do this when you could just do:

int ret = a.contains(b) ? EXIST : NOT_EXIST

However I wanted to actually answer your question.

Here's a slightly improved version that satisfies your "all in one method" requirement.

public static int lookFor(String a, String b) {
    // Fancy way of preventing errors when one of the strings is empty
    boolean az = a.length() == 0;
    boolean bz = b.length() == 0;
     if (az ^ bz) return NOT_EXIST;
     // Need this next line if you want to interpret two empty strings as containing eachother
     if (az && bz) return EXIST;
     char start = b.charAt(0);
     // This is known as a "label". Some say it's bad practice.
     outer:
     for (int i=0; i < a.length(); i  ) {
          if (a.charAt(i) == start) {
               // Instead of using two methods, we can condense it like so
               for (int q = 0; q < b.length(); q  ) {
                    if ((i   q) >= a.length()) continue outer;
                    if (a.charAt(i   q) != b.charAt(q)) continue outer;
               }
               return EXIST;
          }
     }
     return NOT_EXIST;
}

CodePudding user response:

To find a substring "by hand", you need a nested loop; i.e. a loop inside a loop.

  • The outer loop tries all of the possible start positions for the substring in the string.
  • For a given start position, the inner loop tests all of the characters of the string that you are looking for against the string you are searching in.

In the naive substring search algorithm, the outer loop steps starts at index zero, and increments the index by one until it gets to the end of the string being searched. This can be improved on:

  • Every non-null string "contains" the empty string. It may be worth treating this as a special case.

  • It is easy to see that the outer loop can usually stop before the final. If you are searching for a string of length (say) 3, then the outer loop can stop at 3 from the end. (Think about it ....)

  • There are some clever algorithms which allow the outer loop to skip over some indexes. If you are interested, start by Googling for "Boyer-Moore string search".

(Note: the looping could be replaced with / written using recursion, but it is still there.)


Your code doesn't have a nested loops. By my reading, it is only going to find a match if the string you are searching for is at the start of the string you are searching. That is not correct.

  • Related