Purpose: this is just a POC (for now) to automatically and periodically look up CVE tags in the Maven repository.
I can access Maven just fine through a browser and through mvn, but I am unable to do the same from Java. What am I missing? I've tried URLConnection and HttpsURLConnection, with and without GET, Content-Type, User-Agent, and Accept. Every address I try returns a 403. The same code works fine on other websites such as "cve.mitre.org" or "nvd.nist.gov", but fails for "https://mvnrepository.com/artifact/log4j/apache-log4j-extras/1.2.17".
My URL is built dynamically: it starts with "https://mvnrepository.com/artifact/", then the group, name, and version are appended, which yields a valid address like "https://mvnrepository.com/artifact/log4j/apache-log4j-extras/1.2.17".
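For illustration, the URL construction described above could look roughly like the sketch below. The class name `ArtifactUrls` and the method `artifactUrl` are hypothetical names, not the code from my project, and the path-segment encoding is an extra precaution I am assuming is harmless for normal Maven coordinates.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the artifact page address from Maven coordinates.
final class ArtifactUrls {
    static final String BASE = "https://mvnrepository.com/artifact/";

    static String artifactUrl(String group, String name, String version) {
        return BASE + encode(group) + "/" + encode(name) + "/" + encode(version);
    }

    // URLEncoder targets form encoding, so a space becomes '+';
    // replace it with %20 to make the value safe inside a URL path.
    private static String encode(String segment) {
        return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(String[] args) {
        System.out.println(artifactUrl("log4j", "apache-log4j-extras", "1.2.17"));
    }
}
```

Typical coordinates contain only letters, digits, dots, and hyphens, so the encoding step normally leaves them unchanged.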
// Imports needed by the snippet below
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import javax.net.ssl.HttpsURLConnection;

// Proxy settings (host and port redacted)
System.setProperty("https.proxyHost", "xxxx");
System.setProperty("https.proxyPort", "xxxx");
String content = null;
try {
    URL obj = new URL(address);
    HttpsURLConnection con = (HttpsURLConnection) obj.openConnection();
    con.setRequestMethod("GET");
    con.setRequestProperty("Content-Type", "application/json");
    con.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36");
    con.setRequestProperty("Accept", "*/*");
    con.connect();
    // Read the body on success, otherwise the error stream (the 403 page ends up here)
    BufferedReader br;
    if (con.getResponseCode() < 300) {
        br = new BufferedReader(new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8));
    } else {
        br = new BufferedReader(new InputStreamReader(con.getErrorStream(), StandardCharsets.UTF_8));
    }
    final StringBuilder sb = new StringBuilder();
    String line;
    while ((line = br.readLine()) != null) {
        sb.append(line);
    }
    br.close();
    content = sb.toString();
} catch (IOException e) {
    e.printStackTrace();
}
CodePudding user response:
This website is protected by Cloudflare's anti-bot security.
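If you want to confirm that a 403 really comes from Cloudflare rather than from the site itself, you can look at the response headers. The sketch below is only a heuristic: it assumes that Cloudflare responses carry a `Server: cloudflare` header, or a `cf-mitigated` header on challenge pages. The class name `CloudflareCheck` and the method `looksLikeCloudflareBlock` are hypothetical names made up for this example.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: guesses whether an error response was produced by Cloudflare.
final class CloudflareCheck {
    static boolean looksLikeCloudflareBlock(int status, Map<String, List<String>> headers) {
        // Assumption: bot blocks and challenges show up as 403 or 503.
        if (status != 403 && status != 503) {
            return false;
        }
        for (Map.Entry<String, List<String>> entry : headers.entrySet()) {
            String name = entry.getKey(); // the status line is stored under a null key
            if (name == null || entry.getValue() == null) {
                continue;
            }
            if (name.equalsIgnoreCase("cf-mitigated")) {
                return true;
            }
            if (name.equalsIgnoreCase("Server")) {
                for (String value : entry.getValue()) {
                    if (value != null && value.toLowerCase(Locale.ROOT).contains("cloudflare")) {
                        return true;
                    }
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> sample = Map.of("Server", List.of("cloudflare"));
        System.out.println(looksLikeCloudflareBlock(403, sample));
    }
}
```

With the code from the question, this could be called as `looksLikeCloudflareBlock(con.getResponseCode(), con.getHeaderFields())`.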
How do you bypass Cloudflare bot protection?
It depends. Sometimes it is very difficult or even impossible. What you need to do is simulate a real user with a browser.
With the HtmlUnit browser you can get through in this particular case, provided your IP address has a good reputation. (I used my own IP address and made only one request.)
You need this Maven dependency:
<dependency>
    <groupId>net.sourceforge.htmlunit</groupId>
    <artifactId>htmlunit</artifactId>
    <version>2.57.0</version>
</dependency>
Here is a Java example:
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlAnchor;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import java.io.IOException;
import java.net.URL;
import java.util.List;

public class Maven {
    public static void main(String[] args) throws IOException {
        try (final WebClient webClient = new WebClient()) {
            // JavaScript is not needed to read the vulnerability links
            webClient.getOptions().setJavaScriptEnabled(false);
            URL target = new URL("https://mvnrepository.com/artifact/log4j/apache-log4j-extras/1.2.17");
            final HtmlPage page = webClient.getPage(target);
            // Select every anchor whose class attribute contains "vuln"
            List<HtmlAnchor> elements = page.getByXPath("//a[contains(@class, 'vuln')]");
            elements.forEach(element -> System.out.println(element.getTextContent()));
        }
    }
}
OUTPUT:
CVE-2022-23305
CVE-2022-23302
CVE-2021-4104
CVE-2019-17571
View 1 more ...
4 vulnerabilities
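Note that the XPath also matches the "View 1 more ..." and "4 vulnerabilities" links. If you only want the identifiers, you could filter the anchor texts with a regular expression. This is a minimal sketch: `CveFilter` and `extractCveIds` are hypothetical names, and the pattern assumes the usual `CVE-YYYY-NNNN` form with four or more digits at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: keeps only CVE identifiers from a list of link texts.
final class CveFilter {
    private static final Pattern CVE_ID = Pattern.compile("CVE-\\d{4}-\\d{4,}");

    static List<String> extractCveIds(List<String> texts) {
        List<String> ids = new ArrayList<>();
        for (String text : texts) {
            Matcher matcher = CVE_ID.matcher(text);
            while (matcher.find()) {
                String id = matcher.group();
                if (!ids.contains(id)) { // skip duplicates, keep first-seen order
                    ids.add(id);
                }
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<String> texts = List.of("CVE-2022-23305", "CVE-2022-23302", "CVE-2021-4104",
                "CVE-2019-17571", "View 1 more ...", "4 vulnerabilities");
        extractCveIds(texts).forEach(System.out::println);
    }
}
```

In the example above, the input list would come from calling `getTextContent()` on each `HtmlAnchor`.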
I hope I have been able to help you.