How to extract string using regex with space before or after

Time:02-02

In the following examples, I want to extract "Mywebsite.xx". How can I do it?

Search Mywebsite.de  ----> Mywebsite.de
Mywebsite.de durchsuchen ----> Mywebsite.de
Search Mywebsite.co.uk ----> Mywebsite.co.uk
Mywebsite.co.uk something ----> Mywebsite.co.uk

I tried this but it's not working:

String mydata2 = "Mywebsite.de durchsuchen";
Matcher matcher = Pattern.compile("Mywebsite(.*?)").matcher(mydata2);
if (matcher.find())
{
    System.out.println(matcher.group(1));
}

CodePudding user response:

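You can use the pattern Mywebsite\.([a-z]+(?:\.[a-z]+)?), which captures a one- or two-part suffix such as "de" or "co.uk":

public static void extractDomain(String domain) {
    // Requires java.util.regex.Pattern and java.util.regex.Matcher.
    // Backslashes must be doubled inside a Java string literal.
    Pattern domainPattern = Pattern.compile("Mywebsite\\.([a-z]+(?:\\.[a-z]+)?)");
    Matcher match = domainPattern.matcher(domain);
    // find() must succeed before group() can be called.
    if (match.find()) {
        System.out.println("Mywebsite." + match.group(1));
    }
}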
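
For context, the attempt in the question prints an empty string because the lazy group (.*?) sits at the end of the pattern, so it is allowed to match nothing and does. The following is a minimal sketch of one way to check a pattern against all four inputs from the question. The class name DomainExtractor and the method extract are hypothetical names chosen for illustration, and the pattern assumes a lowercase suffix of at most two parts.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DomainExtractor {
    // "Mywebsite", a literal dot, then one or two lowercase labels ("de", "co.uk").
    private static final Pattern DOMAIN =
            Pattern.compile("Mywebsite\\.[a-z]+(?:\\.[a-z]+)?");

    // Returns the first match, or null when the input contains none.
    static String extract(String input) {
        Matcher m = DOMAIN.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "Search Mywebsite.de",
            "Mywebsite.de durchsuchen",
            "Search Mywebsite.co.uk",
            "Mywebsite.co.uk something"
        };
        for (String s : samples) {
            System.out.println(s + " ----> " + extract(s));
        }
    }
}
```

Returning null for "no match" keeps the sketch short; an Optional<String> would work equally well if you prefer to avoid null checks.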

CodePudding user response:

You can try the following pattern against an array of sample strings. The first four strings match; the last two do not.

String patternStr = "(\\s|^)mywebsite([.][a-z][a-z]){1,2}(\\s|$)";
Pattern pattern = Pattern.compile(patternStr, Pattern.CASE_INSENSITIVE);
String [] stringsToMatch = {
    "Mywebsite.co.uk xyz",
    "abc Mywebsite.co.uk",
    "abc Mywebsite.co.uk xyz",
    "Mywebsite.co.uk",
    "Mywebsite.co.uk.us",
    "Mywebsite"
};

for (String str : stringsToMatch) {
    Matcher matcher = pattern.matcher(str);
    System.out.println(str);
    if (matcher.find()) {
        System.out.println("    " + str.substring(matcher.start(), matcher.end()));
    }
    else {
        System.out.println("    No match");
    }
}
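
Note that (\\s|^) and (\\s|$) consume the surrounding whitespace, so the matched substring can include a leading or trailing space. A possible variation, sketched here as an assumption rather than as part of the answer above, swaps those groups for lookarounds so that only the domain itself is returned. The class name BoundaryMatch and the method firstMatch are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryMatch {
    // (?<!\S) : not preceded by a non-space character (start of input or whitespace)
    // (?!\S)  : not followed by a non-space character (end of input or whitespace)
    private static final Pattern PATTERN = Pattern.compile(
            "(?<!\\S)mywebsite(?:[.][a-z][a-z]){1,2}(?!\\S)",
            Pattern.CASE_INSENSITIVE);

    // Returns the first match without surrounding whitespace, or null if none.
    static String firstMatch(String input) {
        Matcher m = PATTERN.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println("[" + firstMatch("abc Mywebsite.co.uk xyz") + "]"); // [Mywebsite.co.uk]
        System.out.println("[" + firstMatch("Mywebsite.co.uk.us") + "]");      // [null]
    }
}
```

Lookarounds assert a condition without consuming characters, which is why the brackets in the output sit directly against the domain.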

CodePudding user response:

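To find domain names in a string, you can use a regex like this:

(?:http[s]?:\/\/)?(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+

The following method collects every match in the input into a list. Because the scheme is optional, the pattern matches any run of these characters, so ordinary words such as "Search" are returned along with the domain.

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static List<String> extractDomainNames(String input) {
    List<String> domainNames = new ArrayList<>();
    String domainNamePattern = "(?:http[s]?://)?(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+";
    Pattern pattern = Pattern.compile(domainNamePattern);
    Matcher matcher = pattern.matcher(input);
    // Each call to find() advances to the next match.
    while (matcher.find()) {
        domainNames.add(matcher.group());
    }
    return domainNames;
}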

CodePudding user response:

You could try this regex: Mywebsite\.[^\s]+

String input = "Mywebsite.de durchsuchen";
// Backslashes must be doubled inside a Java string literal.
Pattern regexPattern = Pattern.compile("Mywebsite\\.[^\\s]+");
Matcher regexMatcher = regexPattern.matcher(input);
while (regexMatcher.find()) {
    System.out.println(regexMatcher.group());
}
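
One caveat with matching "everything up to the next whitespace" is that trailing punctuation becomes part of the result. The sketch below illustrates this with an input that is not from the question; the class name AllMatches and the method findAll are hypothetical, and \S+ is used as the shorthand equivalent of [^\s]+.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AllMatches {
    // \S+ is equivalent to [^\s]+: one or more non-whitespace characters.
    private static final Pattern PATTERN = Pattern.compile("Mywebsite\\.\\S+");

    // Collects every match in the input, in order of appearance.
    static List<String> findAll(String input) {
        List<String> results = new ArrayList<>();
        Matcher m = PATTERN.matcher(input);
        while (m.find()) {
            results.add(m.group());
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(findAll("Mywebsite.de durchsuchen"));
        // The comma and exclamation mark are kept because they are not whitespace.
        System.out.println(findAll("Visit Mywebsite.co.uk, then Mywebsite.de!"));
    }
}
```

If the input can contain punctuation directly after the domain, a stricter character class such as [a-z.]+ may be a better fit than \S+.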
