<script type="text/javascript">
/**
* Define SVG path for target icon
*/
var targetSVG = "M9,0C4.029,0,0,4.029,0,9s4.029,9,9,9s9-4.029,9-9S13.971,0,9,0z M9,15.93 c-3.83,0-6.93-3.1-6.93-6.93S5.17,2.07,9,2.07s6.93,3.1,6.93,6.93S12.83,15.93,9,15.93 M12.5,9c0,1.933-1.567,3.5-3.5,3.5S5.5,10.933,5.5,9S7.067,5.5,9,5.5 S12.5,7.067,12.5,9z";
/**
* Create the map
*/
var i=1;
var countrydataprovider = {
"map": "indiaLow",
"getAreasFromMap": true,
"theme": "none",
"imagesSettings": {
"rollOverColor": "#089282",
"rollOverScale": 3,
"labelPosition": "middle",
"labelFontSize": 8,
"labelColor": "#fff",
"selectedScale": 3,
"selectedColor": "#089282",
"color": "#13564e"
},
"images": [
{
"imageURL": "nowcast_marker\/map-marker-icon-png-green.png",
"width": 20,
"height": 20,
"description": "<p>No Warning <\/br><\/br> Time of issue: 2022-10-07<\/br>1005 Hrs<\/br> Valid upto: 1305 Hrs <\/p>",
"zoomLevel": 5,
"scale": 0.5,
"title": "Bapatla",
"latitude": "15.905897",
"longitude": "80.471587"
},
I want to extract the data in the "images" subsection of this script. This is the code I have written so far, but I could not get any further. Could anybody please help?
import requests # HTTP client used to download the page
from bs4 import BeautifulSoup # HTML parser
url = "https://mausam.imd.gov.in/imd_latest/contents/stationwise-nowcast-warning.php"
html = requests.get(url).content # raw response body (bytes)
soup = BeautifulSoup(html, 'html.parser') # parsed document tree
a = soup.find('script', attrs={'type': 'text/javascript'}) # first matching <script> tag
CodePudding user response:
You are on the right track; you just need to dissect the contents of that tag further to get what you need. Here is one way to obtain the data:
import requests
import pandas as pd
from bs4 import BeautifulSoup as bs
import json
url = 'https://mausam.imd.gov.in/imd_latest/contents/stationwise-nowcast-warning.php'
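html = requests.get(url).text
# select_one returns the first <script type="text/javascript"> tag on the page
script_text = bs(html, 'html.parser').select_one('script[type="text/javascript"]').text
# Keep only the text between '"images": [' and the next ']'
script_w_data = script_text.split('"images": [')[1].split(']')[0]
# Put the brackets back so the fragment parses as a JSON array
obj = json.loads('[' + script_w_data + ']')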
df = pd.json_normalize(obj)
print(df)
Result in terminal:
imageURL width height description zoomLevel scale title latitude longitude
0 nowcast_marker/map-marker-icon-png-green.png 20 20 <p>No Warning </br></br> Time of issue: 2022-1... 5 0.5 Bapatla 15.905897 80.471587
1 nowcast_marker/map-marker-icon-png-green.png 20 20 <p>No Warning </br></br> Time of issue: 2022-1... 5 0.5 Eluru 16.71066 81.09524
2 nowcast_marker/map-marker-icon-png-yellow.png 20 20 <p>Light rain: < 5 mm/hr</br> Light Thundersto... 5 0.5 Gannavaram 16.540171 80.801249
3 nowcast_marker/map-marker-icon-png-green.png 20 20 <p>No Warning </br></br> Time of issue: 2022-1... 5 0.5 Guntur 16.306652 80.43654
4 nowcast_marker/map-marker-icon-png-green.png 20 20 <p>No Warning </br></br> Time of issue: 2022-1... 5 0.5 Kakinada 16.945181 82.238647
... ... ... ... ... ... ... ... ... ...
1115 nowcast_marker/map-marker-icon-png-green.png 20 20 <p>No Warning </br></br> Time of issue: 2022-1... 5 0.5 Namrup 27.12 95.18
1116 nowcast_marker/map-marker-icon-png-green.png 20 20 <p>No Warning </br></br> Time of issue: 2022-1... 5 0.5 Nazira 26.54 94.44
1117 nowcast_marker/map-marker-icon-png-green.png 20 20 <p>No Warning </br></br> Time of issue: 2022-1... 5 0.5 Moreh 24.2475 94.3045
1118 nowcast_marker/map-marker-icon-png-green.png 20 20 <p>No Warning </br></br> Time of issue: 2022-1... 5 0.5 Moirang 24.5028 93.7768
1119 nowcast_marker/map-marker-icon-png-green.png 20 20 <p>No Warning </br></br> Time of issue: 2022-1... 5 0.5 Jhandutta 31.3702 76.6369
1120 rows × 9 columns
See pandas documentation at https://pandas.pydata.org/docs/
Also BeautifulSoup docs: https://beautiful-soup-4.readthedocs.io/en/latest/
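One caveat about the split-based extraction: splitting on the first ']' would cut the data short if any value inside the array ever contained a ']' character. A minimal sketch of an alternative, assuming the "images" value in the script is valid JSON (as it appears to be in the snippet from the question), is to let the standard library's json.JSONDecoder.raw_decode find the end of the array. The helper name extract_images and the sample string are hypothetical; the sample is a trimmed copy of the question's snippet with a made-up "[test]" added to the description to exercise that case.

```python
import json


def extract_images(script_text, key='"images":'):
    """Return the JSON array that follows `key` in a block of JavaScript.

    Hypothetical helper: assumes the value after `key` is valid JSON.
    """
    start = script_text.index(key) + len(key)
    # raw_decode needs the index to point at the first character of the
    # JSON value, so skip any whitespace after the colon.
    while script_text[start].isspace():
        start += 1
    # raw_decode parses one JSON value and ignores whatever follows it.
    images, _end = json.JSONDecoder().raw_decode(script_text, start)
    return images


# Trimmed sample modelled on the question's snippet (not the live page).
# The "[test]" inside the description is made up for this illustration.
sample = r'''
var countrydataprovider = {
"map": "indiaLow",
"images": [
{
"imageURL": "nowcast_marker\/map-marker-icon-png-green.png",
"description": "<p>No Warning [test] <\/br> Valid upto: 1305 Hrs <\/p>",
"title": "Bapatla",
"latitude": "15.905897",
"longitude": "80.471587"
}
]
};
'''

images = extract_images(sample)
print(images[0]["title"], images[0]["latitude"], images[0]["longitude"])
```

With the real page, the text of the script tag would be passed in place of sample, and the returned list can be given to pd.json_normalize in the same way as obj above.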