Web scraping table shows no results

Time:07-08

I want to scrape the main table from https://paperswithcode.com/sota/3d-object-detection-on-kitti-cars-moderate

However, I get no results with either .find_all() or .xpath():

import requests
from bs4 import BeautifulSoup

page = requests.get('https://paperswithcode.com/sota/3d-object-detection-on-kitti-cars-moderate')
soup = BeautifulSoup(page.text, 'html.parser')
soup.find_all('table')
# out: []

import requests
import lxml.html as lh

page = requests.get('https://paperswithcode.com/sota/3d-object-detection-on-kitti-cars-moderate')
doc = lh.fromstring(page.content)
doc.xpath('//*[@id="leaderboard"]/div[3]/table')
# out: []

What's the problem and how do I get the correct result?

CodePudding user response:

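The table is built in the browser by JavaScript, so the HTML that requests downloads contains no table element. The data itself is still in that HTML, embedded as JSON in a script block, so you can read it from there without running a browser.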

import requests
import json
from bs4 import BeautifulSoup


url = 'https://paperswithcode.com/sota/3d-object-detection-on-kitti-cars-moderate'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'lxml')  # the 'lxml' parser requires the lxml package
# The leaderboard rows are stored as JSON in <script id="evaluation-table-data">
table_json = json.loads(soup.find('script', {'id': 'evaluation-table-data'}).getText())
for row in table_json:
    print(row['rank'], row['method'], row['metrics']['AP'], row['paper']['title'], row['evaluation_date'][:4])

Output:

1 BtcDet 82.86% Behind the Curtain: Learning Occluded Shapes for 3D Object Detection 2021
2 SE-SSD 82.54% SE-SSD: Self-Ensembling Single-Stage Object Detector From Point Cloud 2020
3 SPG 82.13 % SPG: Unsupervised Domain Adaptation for 3D Object Detection via Semantic Point Generation 2021
4 VoTr-TSD (ours) 82.09% Voxel Transformer for 3D Object Detection 2021
5 Pyramid-PV 82.08% Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection 2021
6 PV-RCNN++ 81.88% PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection 2021
7 M3DeTR 81.73 M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers 2021
8 Voxel R-CNN 81.62% Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection 2020
9 PV-RCNN 81.43% PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection 2019
10 SVGA-Net 80.82 % SVGA-Net: Sparse Voxel-Graph Attention Network for 3D Object Detection from Point Clouds 2020
11 CIA-SSD 80.28% CIA-SSD: Confident IoU-Aware Single-Stage Object Detector From Point Cloud 2020
12 SA-SSD EBM 80.12% Accurate 3D Object Detection using Energy-Based Models 2020
13 PC-RGNN 79.9% PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection 2020
14 Joint 78.96% Joint 3D Instance Segmentation and Object Detection for Autonomous Driving 2020
15 STD 77.63% STD: Sparse-to-Dense 3D Object Detector for Point Cloud 2019
16 UberATG-MMF 76.75% Multi-Task Multi-Sensor Fusion for 3D Object Detection 2020
17 F-ConvNet 76.51% Frustum ConvNet: Sliding Frustums to Aggregate Local Point-Wise Features for Amodal 3D Object Detection 2019
18 PointRGCN 75.73% PointRGCN: Graph Convolution Networks for 3D Vehicles Detection Refinement 2019
19 PointRCNN 75.42% PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud 2018
20 PointPillars 74.99% PointPillars: Fast Encoders for Object Detection from Point Clouds 2018
21 PC-CNN-V2 73.80% A General Pipeline for 3D Detection of Vehicles 2018
22 RoarNet 73.04% RoarNet: A Robust 3D Object Detection based on RegiOn Approximation Refinement 2018
23 3D-FCT 72.79% 3D-FCT: Simultaneous 3D Object Detection and Tracking Using Feature Correlation 2021
24 IPOD 72.57% IPOD: Intensive Point-based Object Detector for Point Cloud 2018
25 AVOD + Feature Pyramid 71.88% Joint 3D Proposal Generation and Object Detection from View Aggregation 2017
26 Frustum PointNets  70.39% Frustum PointNets for 3D Object Detection from RGB-D Data 2017
27 VoxelNet 65.11% VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection 2017
28 PGD 11.77% Probabilistic and Geometric Depth: Detecting Objects in Perspective 2021
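
The AP values in the output above are not uniformly formatted ("82.86%", "82.13 %", "81.73"), so they need cleaning before they can be sorted or plotted as numbers. The following is a minimal sketch, assuming every value is a decimal number optionally followed by whitespace and a percent sign; parse_ap is a hypothetical helper name, not part of any library.

```python
import re

# A decimal number, optionally followed by whitespace and a percent sign.
_AP_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*%?\s*$")


def parse_ap(value):
    """Convert strings such as '82.86%', '82.13 %' or '81.73' to a float.

    Returns None when the value does not look like a number.
    """
    match = _AP_PATTERN.match(value)
    return float(match.group(1)) if match else None


samples = ["82.86%", "82.13 %", "81.73", "n/a"]
print([parse_ap(s) for s in samples])
# prints [82.86, 82.13, 81.73, None]
```

Inside the loop from the answer, this would be called as parse_ap(row['metrics']['AP']).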

CodePudding user response:

The table is rendered dynamically by JavaScript, so requests alone never sees it. In this case you can load the page with Selenium and then parse the rendered HTML with bs4 or pandas.

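import time
from io import StringIO

import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

# webdriver_manager downloads a chromedriver that matches the installed Chrome
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

driver.get('https://paperswithcode.com/sota/3d-object-detection-on-kitti-cars-moderate')
driver.maximize_window()
time.sleep(3)  # crude wait that gives JavaScript time to render the table
html = driver.page_source
driver.quit()

# Recent pandas versions expect a file-like object rather than a raw HTML string
df = pd.read_html(StringIO(html))[0]
print(df)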

Output:

    Rank                   Model       AP  ... Result  Year Tags
0      1                  BtcDet   82.86%  ...  Enter  2021  NaN
1      2                  SE-SSD   82.54%  ...  Enter  2020  NaN
2      3                     SPG  82.13 %  ...  Enter  2021  NaN
3      4         VoTr-TSD (ours)   82.09%  ...  Enter  2021  NaN
4      5              Pyramid-PV   82.08%  ...  Enter  2021  NaN
5      6               PV-RCNN++   81.88%  ...  Enter  2021  NaN
6      7                  M3DeTR    81.73  ...  Enter  2021  NaN
7      8             Voxel R-CNN   81.62%  ...  Enter  2020  NaN
8      9                 PV-RCNN   81.43%  ...  Enter  2019  NaN
9     10                SVGA-Net  80.82 %  ...  Enter  2020  NaN
10    11                 CIA-SSD   80.28%  ...  Enter  2020  NaN
11    12              SA-SSD EBM   80.12%  ...  Enter  2020  NaN
12    13                 PC-RGNN    79.9%  ...  Enter  2020  NaN
13    14                   Joint   78.96%  ...  Enter  2020  NaN
14    15                     STD   77.63%  ...  Enter  2019  NaN
15    16             UberATG-MMF   76.75%  ...  Enter  2020  NaN
16    17               F-ConvNet   76.51%  ...  Enter  2019  NaN
17    18               PointRGCN   75.73%  ...  Enter  2019  GCN
18    19               PointRCNN   75.42%  ...  Enter  2018  NaN
19    20            PointPillars   74.99%  ...  Enter  2018  NaN
20    21               PC-CNN-V2   73.80%  ...  Enter  2018  NaN
21    22                 RoarNet   73.04%  ...  Enter  2018  NaN
22    23                  3D-FCT   72.79%  ...  Enter  2021  NaN
23    24                    IPOD   72.57%  ...  Enter  2018  NaN
24    25  AVOD + Feature Pyramid   71.88%  ...  Enter  2017  NaN
25    26       Frustum PointNets   70.39%  ...  Enter  2017  NaN
26    27                VoxelNet   65.11%  ...  Enter  2017  NaN
27    28                     PGD   11.77%  ...  Enter  2021  NaN

[28 rows x 8 columns]
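
Before starting a browser, it can be worth checking what the raw HTML actually contains: a page may have no table tags at all and still carry its data in a script block, which is what the first answer relies on. The following is a minimal standard-library sketch of such a check. The RawHtmlProbe class and the sample HTML string are illustrative assumptions, not the real page source; in practice the text returned by requests.get(url).text would be fed in instead.

```python
from html.parser import HTMLParser


class RawHtmlProbe(HTMLParser):
    """Counts <table> tags and records the ids of <script> tags."""

    def __init__(self):
        super().__init__()
        self.table_count = 0
        self.script_ids = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_count += 1
        elif tag == "script":
            script_id = dict(attrs).get("id")
            if script_id:
                self.script_ids.append(script_id)


# Hypothetical stand-in for the downloaded page (assumption, not the real source).
sample_html = """
<html><body>
  <div id="leaderboard"></div>
  <script id="evaluation-table-data" type="application/json">[{"rank": 1}]</script>
</body></html>
"""

probe = RawHtmlProbe()
probe.feed(sample_html)
print(probe.table_count, probe.script_ids)
# prints 0 ['evaluation-table-data']
```

A table count of zero together with a data-bearing script id suggests that parsing the embedded JSON is enough and Selenium is optional.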