String actualResource = driver.getPageSource();
The value of actualResource is:
<html><head><meta http-equiv="Content-Type" content="text/html" charset="UTF-8"></meta><link rel="stylesheet" type="text/css" href="Html_4d1b82e4-c90b-48ce-8640-3ab33abc7850.css"></link><script language="javascript" src="script.js"></script><script language="javascript"></script><link rel="stylesheet" type="text/css" href="script.css"></link></head><body><p class="paragraph_class4 Title"><span class="paragraph_class4 Title text_class2"><span>Testing</span></span></p><p class="paragraph_class5"><span class="paragraph_class5 text_class2"><span>Generated Tue Sep 21 2021 03:01:46 GMT-0400 (EDT)</span></span></p><p class="paragraph_class6"><h3 class="paragraph_class6 text_class7 3"><span>Work Items</span></h3></p><p class="paragraph_class6"><span class="paragraph_class6 text_class120"><span>Fixed WIs/Total WIs: </span></span><span class="paragraph_class6 text_class121"><span>29</span></span></body></html>
I need to remove the text "Generated Tue Sep 21 2021 03:01:46 GMT-0400 (EDT)" from this long string. The timestamp is fully dynamic: it changes with the current time whenever the page is generated.
How can I do this?
CodePudding user response:
I am no regex expert, but a simple regex that filters out the
<span>Generated Tue Sep 21 2021 03:01:46 GMT-0400 (EDT)</span>
is:
public static void main(String[] args) {
// [^<]* stops at the next tag, so any timestamp and time zone (not only EDT) is matched.
String pattern = "<span>Generated [^<]*</span>";
String longText = "<html><head><meta http-equiv=\"Content-Type\" content=\"text/html\" charset=\"UTF-8\"></meta><link rel=\"stylesheet\" type=\"text/css\" href=\"Html_4d1b82e4-c90b-48ce-8640-3ab33abc7850.css\"></link><script language=\"javascript\" src=\"script.js\"></script><script language=\"javascript\"></script><link rel=\"stylesheet\" type=\"text/css\" href=\"script.css\"></link></head><body><p class=\"paragraph_class4 Title\"><span class=\"paragraph_class4 Title text_class2\"><span>Testing</span></span></p><p class=\"paragraph_class5\"><span class=\"paragraph_class5 text_class2\"><span>Generated Tue Sep 21 2021 03:01:46 GMT-0400 (EDT)</span></span></p><p class=\"paragraph_class6\"><h3 class=\"paragraph_class6 text_class7 3\"><span>Work Items</span></h3></p><p class=\"paragraph_class6\"><span class=\"paragraph_class6 text_class120\"><span>Fixed WIs/Total WIs: </span></span><span class=\"paragraph_class6 text_class121\"><span>29</span></span></body></html>";
final String s = longText.replaceAll(pattern, "");
System.out.println(s);
}
The above also removes the <span></span> tags that surround the text.
If you want to keep them (now empty) and remove only the text inside them, change the regex to:
String pattern = "Generated [^<]*";
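Because the question says the timestamp is fully dynamic, the time zone abbreviation may not always be EDT. The following is a minimal sketch, not the answerer's exact code, that removes everything from "Generated " up to the next tag. The class name TimestampStripper and the method stripGenerated are hypothetical names chosen for illustration, and the sketch assumes the timestamp text never contains a '<' character.

```java
import java.util.regex.Pattern;

class TimestampStripper {
    // "Generated " followed by any run of characters that are not '<'.
    // The match therefore ends right before the closing </span> tag.
    private static final Pattern GENERATED = Pattern.compile("Generated [^<]*");

    // Returns the HTML with the "Generated ..." text removed; the tags stay in place.
    static String stripGenerated(String html) {
        return GENERATED.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<p><span>Generated Tue Sep 21 2021 03:01:46 GMT-0400 (EDT)</span></p>"
                + "<p><span>Work Items</span></p>";
        System.out.println(stripGenerated(html));
    }
}
```

With Selenium, the method could be applied as stripGenerated(driver.getPageSource()) before comparing the page source against an expected value.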
CodePudding user response:
Try it like this.
First, escape the double quotes inside the string literal.
// For example:
String longText = "<html><head><meta http-equiv=\"Content-Type\" content=\"text/html\" charset=\"UTF-8\"></meta><link rel=\"stylesheet\" type=\"text/css\">";
Then use String.replace(CharSequence target, CharSequence replacement), which replaces literal text (no regex):
String newlongText = longText.replace("meta http-equiv", "");
// The result.
<html><head><="Content-Type" content="text/html" charset="UTF-8"></meta><link rel="stylesheet" type="text/css">
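Since replace(CharSequence, CharSequence) matches literal text only, it needs the exact timestamp string, which is not known in advance here. One possible way to still use it, shown as a rough sketch, is to locate the dynamic text first with indexOf and then pass it to replace. The names LiteralReplaceDemo and removeGenerated are hypothetical, and the sketch assumes the text always starts with "Generated " and ends at the next '<'.

```java
class LiteralReplaceDemo {
    // Finds the "Generated ..." text up to the next tag and removes it with a literal replace.
    static String removeGenerated(String html) {
        int start = html.indexOf("Generated ");
        if (start < 0) {
            return html; // marker not present, nothing to remove
        }
        int end = html.indexOf('<', start);
        if (end < 0) {
            return html; // no following tag, leave the input untouched
        }
        String dynamicText = html.substring(start, end);
        // replace() treats its arguments literally, so "(EDT)" needs no escaping.
        return html.replace(dynamicText, "");
    }

    public static void main(String[] args) {
        String html = "<p><span>Generated Tue Sep 21 2021 03:01:46 GMT-0400 (EDT)</span></p>";
        System.out.println(removeGenerated(html));
    }
}
```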