Find matching topic strings with * and # wildcards in Java

Time:10-18

I have a topic-based publish-subscribe service and I'm working on topic matching. My current test tree is the following (each line represents a topic):

t1.t2
t1.t3
t1.t4
t1.t3.t5
t1.t4.t6

When using a wildcard, the results should be:

t1.# -> t1.t2, t1.t3, t1.t4, t1.t3.t5, t1.t4.t6
t1.* -> t1.t2, t1.t3, t1.t4

My current code:

public List<Topic> getMatchingTopics(Topic topic, List<Topic> currentTopics) {
    List<Topic> matchingTopics = new ArrayList<>();

    String topicString = topic.getName();

    for (Topic t : currentTopics) {
        String currentTopicString = t.getName();
        if (topicString.equals(currentTopicString)) {
            matchingTopics.add(t);
        } else {
            String[] topicStringSplit = topicString.split("\\.");

            String[] currentTopicStringSplit = currentTopicString.split("\\.");

            if (topicStringSplit.length == currentTopicStringSplit.length) {
                boolean match = true;

                for (int i = 0; i < topicStringSplit.length; i++) {

                    if (topicStringSplit[i].equals("*")) {
                        continue;
                    }
                    else if (topicStringSplit[i].equals("#")) {
                        continue;
                    } else if (!topicStringSplit[i].equals(currentTopicStringSplit[i])) {
                        // not a match
                        match = false;
                        break;
                    }

                }
                if (match) {
                    matchingTopics.add(t);
                }
            }
        }
    }

    return matchingTopics;
}

The asterisk wildcard (*) is working. However, I'm having issues with the hash wildcard (#), so I have left the logic the same for both for clarity.

Test method

@Test
public void testTopicMatch() {
    // t1.t2
    // t1.t3
    // t1.t4
    // t1.t3.t5
    // t1.t4.t6

    // t1.# -> t1.t2, t1.t3, t1.t4, t1.t3.t5, t1.t4.t6
    // t1.* -> t1.t2, t1.t3, t1.t4

    List<Topic> topics = new ArrayList<>();

    Topic t1t2 = new Topic();
    t1t2.setName("t1.t2");
    topics.add(t1t2);

    Topic t1t3 = new Topic();
    t1t3.setName("t1.t3");
    topics.add(t1t3);

    Topic t1t4 = new Topic();
    t1t4.setName("t1.t4");
    topics.add(t1t4);

    Topic t1t3t5 = new Topic();
    t1t3t5.setName("t1.t3.t5");
    topics.add(t1t3t5);

    Topic t1t4t6 = new Topic();
    t1t4t6.setName("t1.t4.t6");
    topics.add(t1t4t6);

    Topic topicHashtag = new Topic();
    topicHashtag.setName("t1.#");
    List<Topic> res = getMatchingTopics(topicHashtag, topics);

    for (Topic topic : res) {
        System.out.println(topic.getName());
    }


}

CodePudding user response:

One way to look at this is using regular expressions. Your specification might be translated as follows:

t1.* -> t1.t2, t1.t3, t1.t4    
Regex: (t1\.)(t[0-9])
Explanation: Any string (topic name) starting with t1. followed by exactly one t and one digit

t1.# -> t1.t2, t1.t3, t1.t4, t1.t3.t5, t1.t4.t6
Regex: (t1\.)(t[0-9])(\.(t[0-9]))?
Explanation: Any string (topic name) starting with t1., followed by exactly one t and one digit, optionally followed by a dot, a t and a digit

Of course, this is a very simple definition that could be generalised for other matching rules. For your matching requirement it should work, as the following test shows:

import org.junit.jupiter.api.Test;

import java.util.List;
import java.util.regex.Pattern;

import static java.util.stream.Collectors.toList;
import static org.junit.jupiter.api.Assertions.assertEquals;

public class RegexTest {

    final List<String> topicNames = List.of("t1.t2", "t1.t3", "t1.t4", "t1.t3.t5", "t1.t4.t6");

    @Test
    void regexForAsteriskSpecification(){
        final Pattern asteriskPattern = Pattern.compile("(t1\\.)(t[0-9])");
        // use the compiled pattern directly instead of recompiling it for every name
        List<String> collect = topicNames.stream().filter(n -> asteriskPattern.matcher(n).matches()).collect(toList());
        // expected value first, actual value second
        assertEquals(List.of("t1.t2", "t1.t3", "t1.t4"), collect);
    }

    @Test
    void regexForHashtagSpecification(){
        final Pattern hashtagPattern = Pattern.compile("(t1\\.)(t[0-9])(\\.(t[0-9]))?");
        // use the compiled pattern directly instead of recompiling it for every name
        List<String> collect = topicNames.stream().filter(n -> hashtagPattern.matcher(n).matches()).collect(toList());
        // expected value first, actual value second
        assertEquals(topicNames, collect);
    }
}
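
As a sketch of how the hard-coded patterns above might be generalised, the topic pattern itself can be translated into a regular expression segment by segment. The class name TopicPatternRegex and the helpers toRegex and match are hypothetical, and the sketch assumes that * stands for exactly one segment and # for one or more segments, which is what the expected results in the question suggest:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class TopicPatternRegex {

    // Hypothetical helper: translates a topic pattern into a regular expression.
    // Assumption: "*" matches exactly one segment, "#" matches one or more segments.
    static Pattern toRegex(String topicPattern) {
        StringBuilder regex = new StringBuilder();
        String[] segments = topicPattern.split("\\.");
        for (int i = 0; i < segments.length; i++) {
            if (i > 0) {
                regex.append("\\."); // literal dot between segments
            }
            switch (segments[i]) {
                case "*":
                    regex.append("[^.]+"); // one segment without dots
                    break;
                case "#":
                    regex.append("[^.]+(\\.[^.]+)*"); // one or more segments
                    break;
                default:
                    regex.append(Pattern.quote(segments[i])); // literal segment
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Hypothetical helper: returns the topic names that match the pattern.
    static List<String> match(String topicPattern, List<String> topicNames) {
        Pattern pattern = toRegex(topicPattern);
        List<String> result = new ArrayList<>();
        for (String name : topicNames) {
            if (pattern.matcher(name).matches()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> topicNames = List.of("t1.t2", "t1.t3", "t1.t4", "t1.t3.t5", "t1.t4.t6");
        System.out.println(match("t1.*", topicNames)); // [t1.t2, t1.t3, t1.t4]
        System.out.println(match("t1.#", topicNames)); // [t1.t2, t1.t3, t1.t4, t1.t3.t5, t1.t4.t6]
    }
}
```

Because literal segments go through Pattern.quote, topic names are not limited to a t followed by a digit. If # should also match zero segments, the translation of that case would need to change.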

CodePudding user response:

First of all, you should add a method that computes a topic's level, to be used for the * match:

class Topic {
    private String name;
    public String getName() {return name;}
    public void setName(String name) {this.name = name;}
    // added: computes the level as the number of dots (.) in the name
    public int level() {return (int) this.name.chars().filter(it -> it == '.').count();}
}
public List<Topic> getMatchingTopics(Topic topic, List<Topic> currentTopics) {
    List<Topic> matchingTopics = new ArrayList<>();

    String topicString = topic.getName();
    // case 1: the pattern ends with `#`, e.g. `topicString` = `"t1.#"`
    if (topicString.endsWith(".#")) {
        // drop the `#` character, so `topicPrefix` = `"t1."`
        String topicPrefix = topicString.substring(0, topicString.length() - 1);
        for (Topic currentTopic : currentTopics) {
            // check whether the topic name starts with `topicPrefix` (`"t1."`);
            // this matches any topic below "t1.", at any depth
            if (currentTopic.getName().startsWith(topicPrefix)) {
                matchingTopics.add(currentTopic);
            }
        }
    } else if (topicString.endsWith(".*")) {
        // case 2: drop the `*` character, so `topicPrefix` = `"t1."`
        String topicPrefix = topicString.substring(0, topicString.length() - 1);
        // the level is the number of dots, so `"t1.*"` has level `1`
        int topicLevel = topic.level();
        for (Topic currentTopic : currentTopics) {
            // check whether the topic name starts with `topicPrefix` (`"t1."`)
            if (currentTopic.getName().startsWith(topicPrefix)
                // AND whether the topic is at the same level:
                // `"t1.t2"`    has level 1 = match
                // `"t1.t2.t3"` has level 2 = no match
                && currentTopic.level() == topicLevel) {
                matchingTopics.add(currentTopic);
            }
        }
    } else {
        // no wildcard: exact match only
        for (Topic currentTopic : currentTopics) {
            if (currentTopic.getName().equals(topicString)) {
                matchingTopics.add(currentTopic);
            }
        }
    }

    return matchingTopics;
}
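
The prefix checks above only cover a wildcard in the last position. A minimal sketch of a segment-by-segment comparison that also handles wildcards in the middle of a pattern (for example t1.*.t5) could look like the following. The class name TopicMatcher and its methods are hypothetical, and the sketch assumes that * matches exactly one segment and # matches one or more segments. It also illustrates why the code in the question fails for #: comparing only topics with the same number of segments rules out t1.t3.t5 for t1.#.

```java
import java.util.ArrayList;
import java.util.List;

public class TopicMatcher {

    // Assumption: "*" matches exactly one segment, "#" matches one or more segments.
    static boolean matches(String pattern, String topicName) {
        return matches(pattern.split("\\."), 0, topicName.split("\\."), 0);
    }

    // Compares pattern segments from index p with topic segments from index t.
    private static boolean matches(String[] pattern, int p, String[] topic, int t) {
        if (p == pattern.length) {
            return t == topic.length; // both must be fully consumed
        }
        if (t == topic.length) {
            return false; // each remaining pattern segment needs a topic segment
        }
        if (pattern[p].equals("#")) {
            // "#" consumes one segment, then either ends or keeps consuming
            return matches(pattern, p + 1, topic, t + 1)
                    || matches(pattern, p, topic, t + 1);
        }
        if (pattern[p].equals("*") || pattern[p].equals(topic[t])) {
            return matches(pattern, p + 1, topic, t + 1);
        }
        return false;
    }

    // Returns the topic names that match the pattern, in their original order.
    static List<String> filter(String pattern, List<String> topicNames) {
        List<String> result = new ArrayList<>();
        for (String name : topicNames) {
            if (matches(pattern, name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> topicNames = List.of("t1.t2", "t1.t3", "t1.t4", "t1.t3.t5", "t1.t4.t6");
        System.out.println(filter("t1.*", topicNames)); // [t1.t2, t1.t3, t1.t4]
        System.out.println(filter("t1.#", topicNames)); // [t1.t2, t1.t3, t1.t4, t1.t3.t5, t1.t4.t6]
    }
}
```

The sketch works on plain strings to stay self-contained. In getMatchingTopics the same check could be applied to topic.getName() and currentTopic.getName().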