How to make unique string out of substrings?

Time:05-05

When I concatenate two strings (e.g. "q5q3q2q1" and "q5q4q3q2q1"), I get a string with duplicate substrings: q5, q3, q2 and q1 each appear twice.

The resulting string is "q5q3q2q1q5q4q3q2q1", and I need each substring (q[number]) to appear only once, i.e. "q5q4q3q2q1". A substring doesn't have to start with 'q', but I could set the restriction that it doesn't start with a digit. The number can also have multiple digits, like q11.

What could I use to get this string? A solution in Java would be ideal; otherwise, just the algorithm would be useful.

CodePudding user response:

You can split the concatenated string into groups and then use a set if the order of the groups doesn't matter, or a dictionary if it does.

a = "q5q3q2q1"
b = "q5q4q3q2q1"

# Concatenate strings
c = a + b
print(c)

# Create the groups
d = ["q" + item for item in c.split("q") if item != ""]
print(d)

# If order does not matter
print("".join(set(d)))

# If order does matter (dicts keep insertion order in Python 3.7+)
print("".join(dict.fromkeys(d)))
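
Since the question asks for Java, here is a minimal sketch of the same split-and-deduplicate idea. It assumes every group starts with one known prefix ("q" here); the class and method names are hypothetical. A LinkedHashSet plays the role of the dictionary, keeping the first occurrence of each group in order of appearance.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class SplitDedup {

    // Splits on a known prefix (assumed to be "q" in the example) and keeps
    // the first occurrence of each group, in order of appearance.
    static String dedupe(String concatenated, String prefix) {
        Set<String> seen = new LinkedHashSet<>();
        // Pattern.quote treats the prefix literally, not as a regex.
        for (String item : concatenated.split(Pattern.quote(prefix))) {
            if (!item.isEmpty()) { // skip the empty piece before the first prefix
                seen.add(prefix + item);
            }
        }
        return String.join("", seen);
    }

    public static void main(String[] args) {
        String c = "q5q3q2q1" + "q5q4q3q2q1";
        System.out.println(dedupe(c, "q")); // q5q3q2q1q4
    }
}
```

Note that keeping first occurrences gives "q5q3q2q1q4", which is not the "q5q4q3q2q1" from the question. Swapping in a HashSet would correspond to the "order does not matter" case.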

CodePudding user response:

Another solution, this one using a regular expression. Concatenate the strings and find all matches of the pattern ([^\d]+\d+). Then add the matches to a set to remove duplicates and join them:

import re

s1 = "q5q3q2q1"
s2 = "q5q4q3q2q1"

out = "".join(set(re.findall(r"([^\d]+\d+)", s1 + s2)))
print(out)

Prints (a set is unordered, so the order of the groups may differ):

q5q2q1q4q3
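
A rough Java sketch of the same regex idea, in case it helps. The class and method names are hypothetical. It uses the pattern [^\d]+\d+ to collect the groups, then shows two ordering policies: keeping the first occurrence of each group, or keeping the last one. The second is only one possible reading of the question, but it happens to reproduce the expected "q5q4q3q2q1".

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class RegexDedup {

    // One or more non-digits followed by one or more digits, e.g. "q5" or "q11".
    private static final Pattern GROUP = Pattern.compile("[^\\d]+\\d+");

    // Collects every group in order of appearance (the findall step).
    static List<String> groups(String s) {
        List<String> found = new ArrayList<>();
        Matcher m = GROUP.matcher(s);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // Keeps the first occurrence of each group.
    static String keepFirst(String s) {
        Set<String> unique = new LinkedHashSet<>(groups(s));
        return String.join("", unique);
    }

    // Keeps the last occurrence of each group: removing before re-adding
    // moves a repeated group to the end of the insertion order.
    static String keepLast(String s) {
        Set<String> unique = new LinkedHashSet<>();
        for (String g : groups(s)) {
            unique.remove(g);
            unique.add(g);
        }
        return String.join("", unique);
    }

    public static void main(String[] args) {
        String s = "q5q3q2q1" + "q5q4q3q2q1";
        System.out.println(keepFirst(s)); // q5q3q2q1q4
        System.out.println(keepLast(s));  // q5q4q3q2q1
    }
}
```

Unlike a plain set, LinkedHashSet iterates in insertion order, so the output is the same on every run.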

CodePudding user response:

A quick way of doing this in Java, as you asked in the question:

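        // Needs java.util.ArrayList, java.util.Arrays and java.util.List.
        String a = "q1q2q3";
        String b = "q1q2q3q4q5q11";
        // split("q") yields a leading empty string, which makes the joined result start with "q".
        List<String> l3 = new ArrayList<>(Arrays.asList(a.split("q")));
        List<String> l4 = new ArrayList<>(Arrays.asList(b.split("q")));
        l4.removeAll(l3); // keep only the groups of b that a does not already contain
        l3.addAll(l4);

        System.out.println(String.join("q", l3));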

Output:

q1q2q3q4q5q11

CodePudding user response:

This is a variation of @DanConstantinescu's solution in JS:

  • Start with the concatenated string.
  • Split string at the beginning of a substring composed of text followed by a number. This is implemented as a regex lookahead, so split returns the string portions as an array.
  • Build a set from this array. The constructor performs deduplication.
  • Turn the set into an array again.
  • Join the elements with the empty string.

While this code is not Java, it should be straightforward to port the idea to other (imperative or object-oriented) languages.

let s_concatenated = "q5q3q2q1" + "q5q4q3q2q1" + "q11a13b4q11"
  , s_dedup
  ;

s_dedup = 
    Array.from(
        new Set(s_concatenated
                   .split(/(?=[^\d]+\d+)/)  // Split into an array
            ) // Build a set, deduplicating
    ) // Turn the set into an array again
       .join('') // Concat the elements with the empty string.
    ;

console.log(`'${s_concatenated}' -> '${s_dedup}'.`);
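
A possible Java port of the steps above, as a sketch only. The class name is hypothetical. One deliberate change: the lookahead-only pattern from the JS code would also split a multi-character prefix such as "ab12" into "a" and "b12", so this version swaps in a split at the boundary between a digit and a non-digit, (?<=\d)(?=\D). A LinkedHashSet is used instead of a plain set so the first occurrence of each group keeps its position.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper class; the names are illustrative only.
class BoundarySplitDedup {

    // Zero-width split point: a digit on the left, a non-digit on the right.
    // This is a swapped-in variation of the lookahead-only pattern.
    private static final String BOUNDARY = "(?<=\\d)(?=\\D)";

    static String dedupe(String concatenated) {
        // Split into groups, then let the set drop repeated groups.
        Set<String> unique =
                new LinkedHashSet<>(Arrays.asList(concatenated.split(BOUNDARY)));
        // Join the remaining groups with the empty string.
        return String.join("", unique);
    }

    public static void main(String[] args) {
        String s = "q5q3q2q1" + "q5q4q3q2q1" + "q11a13b4q11";
        System.out.println(dedupe(s)); // q5q3q2q1q4q11a13b4
    }
}
```

For single-letter prefixes, both patterns produce the same groups; the boundary split only makes a difference once a prefix has more than one character.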
