jsoup does not find value of span id or class

Time:01-30

I don't know how to describe my problem other than that jsoup seems to skip over the one value I need. I'm trying to grab the average engagement, likes, and comments on Instagram posts for a selected user, but let's just stick with engagement.

So far in my testing, it has skipped the values in both <span id=...> and <span class=...> elements.

I have two versions of my code, neither of which provides a helpful result. For reference, this is what I see when I inspect the element on the page: <span >4,300</span> == $0 (https://analisa.io/profile/officialrickastley)

General:

import java.io.IOException; // thrown by Jsoup.connect(...).get()
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

Code Ver 1.

String accountUsername = "officialrickastley";
String url = "https://analisa.io/profile/" + accountUsername;
Document doc = Jsoup.connect(url).userAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36").get();

Elements engagement = doc.getElementsByClass("js-summary-whole-engagement");
System.out.println(engagement);  

The above outputs: <span ><i ></i></span>. I believe the latter half is irrelevant and appears further down the page. But after the first half, where I would expect the numeric value, there is nothing.

Code Ver 2.

String accountUsername = "officialrickastley";
String url = "https://analisa.io/profile/" + accountUsername;
Document doc = Jsoup.connect(url).userAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36").get();

Elements engagement = doc.getElementsByClass("js-summary-whole-engagement");
System.out.println(engagement.text()); 

The above outputs nothing, not even a space.

I've also tried doc.select and quite a few other things like .value, but nothing addresses the issue I'm having. I have also seen people parse the HTML directly from within the class, but if that is a possible solution, I'm unsure how to make the connection to the website and then store the page to be parsed, since I want the data to update every day.

Any help or suggestions would be greatly appreciated, thanks!

CodePudding user response:

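getElementsByClass returns an Elements object, which is a list rather than an array, so it cannot be indexed with []. Select the first element and print its text:

System.out.println(engagement.first().text());

Note that first() returns null when nothing matched, so check for that before calling text(). Also, it's good practice to give lists and arrays plural names: Elements engagement -> Elements engagements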

CodePudding user response:

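You could try this (read comments). The span is most likely filled in by JavaScript after the page loads, and jsoup does not execute JavaScript, so the code below reads the same figures from the meta tags in the static HTML instead: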

try {
    String accountUsername = "officialrickastley";
    String url = "https://analisa.io/profile/" + accountUsername;
    Document doc = Jsoup.connect(url).userAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36").get();
        
    // Get Name and Analisa Handle
    String keyWords = doc.select("meta[name=\"keywords\"]").first().attr("content"); 
    String[] contParts = keyWords.split(",\\s");
    String name = contParts[0];
    String handle = contParts[1];

    // Get desired stats:
    keyWords = doc.select("meta[property=\"og:description\"]").first().attr("content");  
    contParts = keyWords.split(",\\s");
        
    // Keep only the leading value of each part (the text before the first whitespace)
    String engagmentRate = contParts[0].split("\\s+")[0];
    String avgLikes = contParts[1].split("\\s+")[0];
    String avgComments = contParts[2].split("\\s+")[0];
    String followers = contParts[3].split("\\s+")[0];
        
    System.out.println("Name:           " + name + " (" + handle + ")"); 
    System.out.println("Engagment Rate: " + engagmentRate);
    System.out.println("Avg Likes:      " + avgLikes);
    System.out.println("Avg Comments:   " + avgComments);
    System.out.println("Followers:      " + followers);
} catch (IOException ex) {
    // Handle exception whichever way you want, just don't leave it blank:
    System.err.println(ex);
}

The code above should output the following into the Console Window:

Name:           Rick Astley (@officialrickastley Analisa)
Engagment Rate: 2.44%
Avg Likes:      2.37
Avg Comments:   0.07
Followers:      176,125
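
The splitting step above indexes contParts[0] through contParts[3] directly, which throws if the meta content ever has fewer parts. As a rough sketch only, the same idea can be pulled into a small helper that tolerates missing or empty content. The class name DescriptionStats, both method names, and the sample string are hypothetical; the real og:description text on the site may be worded differently.

```java
import java.util.ArrayList;
import java.util.List;

class DescriptionStats {

    // Returns the first whitespace-delimited token of each part,
    // where parts are separated by a comma followed by whitespace.
    static List<String> leadingValues(String content) {
        List<String> values = new ArrayList<>();
        if (content == null || content.trim().isEmpty()) {
            return values; // nothing to parse
        }
        for (String part : content.split(",\\s+")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                values.add(trimmed.split("\\s+")[0]);
            }
        }
        return values;
    }

    // Converts a comma-grouped count such as "176,125" to a long.
    static long parseCount(String text) {
        return Long.parseLong(text.replace(",", ""));
    }

    public static void main(String[] args) {
        // Hypothetical sample text, assumed to resemble the meta tag content.
        String sample = "2.44% Engagement Rate, 2.37 Average Likes, "
                + "0.07 Average Comments, 176,125 Followers";
        List<String> values = leadingValues(sample);
        System.out.println(values);
        if (values.size() >= 4) {
            System.out.println(parseCount(values.get(3)));
        }
    }
}
```

Checking the size of the returned list before indexing avoids an ArrayIndexOutOfBoundsException if the page changes its description format. A comma inside a number such as 176,125 is not followed by whitespace, so it does not split the value.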