How to remove a particular string value from a long string using Java?

Time:09-21

String actualResource = driver.getPageSource();

The value of actualResource is:

<html><head><meta http-equiv="Content-Type" content="text/html" charset="UTF-8"></meta><link rel="stylesheet" type="text/css" href="Html_4d1b82e4-c90b-48ce-8640-3ab33abc7850.css"></link><script language="javascript" src="script.js"></script><script language="javascript"></script><link rel="stylesheet" type="text/css" href="script.css"></link></head><body><p class="paragraph_class4 Title"><span class="paragraph_class4 Title text_class2"><span>Testing</span></span></p><p class="paragraph_class5"><span class="paragraph_class5 text_class2"><span>Generated Tue Sep 21 2021 03:01:46 GMT-0400 (EDT)</span></span></p><p class="paragraph_class6"><h3 class="paragraph_class6 text_class7 3"><span>Work Items</span></h3></p><p class="paragraph_class6"><span class="paragraph_class6 text_class120"><span>Fixed WIs/Total WIs: </span></span><span class="paragraph_class6 text_class121"><span>29</span></span></body></html>

I need to remove the text "Generated Tue Sep 21 2021 03:01:46 GMT-0400 (EDT)" from this long string. The timestamp is dynamic: it reflects the current time, so it changes on every run.

Please help me resolve this.

CodePudding user response:

I am no regex expert, but a simple regex that filters out

<span>Generated Tue Sep 21 2021 03:01:46 GMT-0400 (EDT)</span>

is:

public static void main(String[] args) {
        // [^<]* matches everything up to the closing tag, so no fixed time zone such as (EDT) is assumed.
        String pattern = "<span>Generated [^<]*</span>";
        String longText = "<html><head><meta http-equiv=\"Content-Type\" content=\"text/html\" charset=\"UTF-8\"></meta><link rel=\"stylesheet\" type=\"text/css\" href=\"Html_4d1b82e4-c90b-48ce-8640-3ab33abc7850.css\"></link><script language=\"javascript\" src=\"script.js\"></script><script language=\"javascript\"></script><link rel=\"stylesheet\" type=\"text/css\" href=\"script.css\"></link></head><body><p class=\"paragraph_class4 Title\"><span class=\"paragraph_class4 Title text_class2\"><span>Testing</span></span></p><p class=\"paragraph_class5\"><span class=\"paragraph_class5 text_class2\"><span>Generated Tue Sep 21 2021 03:01:46 GMT-0400 (EDT)</span></span></p><p class=\"paragraph_class6\"><h3 class=\"paragraph_class6 text_class7 3\"><span>Work Items</span></h3></p><p class=\"paragraph_class6\"><span class=\"paragraph_class6 text_class120\"><span>Fixed WIs/Total WIs: </span></span><span class=\"paragraph_class6 text_class121\"><span>29</span></span></body></html>";

        final String s = longText.replaceAll(pattern, "");
        System.out.println(s);
    }

The above also removes the <span></span> tags that enclose the text. If you want to keep the tags and remove only the text inside them, change the regex to:

String pattern = "Generated [^<]*";
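
For repeated use, the text-only variant can be wrapped in a small helper with a precompiled pattern. This is a minimal sketch, not a definitive implementation: the class name GeneratedStampRemover and the method removeGeneratedText are hypothetical, and it assumes the dynamic text always starts with "Generated " and never contains a '<' character.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative and not from the original post.
class GeneratedStampRemover {

    // "Generated " followed by everything up to (not including) the next '<'.
    // Assumption: the timestamp text itself never contains '<'.
    private static final Pattern GENERATED_TEXT = Pattern.compile("Generated [^<]*");

    // Returns the input with every "Generated ..." text run removed; tags stay intact.
    static String removeGeneratedText(String html) {
        return GENERATED_TEXT.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<p><span>Testing</span></p>"
                + "<p><span>Generated Tue Sep 21 2021 03:01:46 GMT-0400 (EDT)</span></p>";
        System.out.println(removeGeneratedText(html));
    }
}
```

Because the pattern stops at the next tag rather than at a literal time-zone suffix, it should keep working when the day, date, or zone abbreviation changes.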

CodePudding user response:

Try it like this.

First, escape the double quotes within the text.

// Shortened string, for example purposes only.
String longText = "<html><head><meta http-equiv=\"Content-Type\" content=\"text/html\" charset=\"UTF-8\"></meta><link rel=\"stylesheet\" type=\"text/css\">"; 

Then use String.replace(CharSequence target, CharSequence replacement), which replaces every literal occurrence of target (no regex is involved):

String newlongText = longText.replace("meta http-equiv", "");
// The result.
<html><head><="Content-Type" content="text/html" charset="UTF-8"></meta><link rel="stylesheet" type="text/css">
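
Since replace() needs the exact text to remove, it cannot be applied directly to a timestamp that changes on every run. One regex-free way around this is to locate the text with indexOf and cut it out with substring. The following is a rough sketch under stated assumptions: the class name GeneratedStampCutter and the method cutGeneratedText are hypothetical, the text is assumed to start with the fixed marker "Generated " and to end at the next '<', and only the first occurrence is removed.

```java
// Hypothetical helper; the names are illustrative and not from the original post.
class GeneratedStampCutter {

    // Removes the first text run that starts with "Generated " and ends before the next '<'.
    static String cutGeneratedText(String html) {
        String marker = "Generated ";
        int start = html.indexOf(marker);
        if (start < 0) {
            return html; // marker not present: nothing to remove
        }
        int end = html.indexOf('<', start);
        if (end < 0) {
            end = html.length(); // no following tag: cut to the end of the string
        }
        return html.substring(0, start) + html.substring(end);
    }

    public static void main(String[] args) {
        String html = "<span>Generated Tue Sep 21 2021 03:01:46 GMT-0400 (EDT)</span>";
        System.out.println(cutGeneratedText(html));
    }
}
```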