I need to download the content of a web page using Python.
What I need is the TLE of a specific satellite from Space-Track.org website.
An example of the url I need to scrape is the following:
Below the unsuccesful code I wrote/copied:
import requests
url = 'https://www.space-
track.org/basicspacedata/query/class/gp/NORAD_CAT_ID/44235/format/tle/emptyresult/show'
res = requests.post(url)
html_page = res.content
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_page, 'html.parser')
text = soup.find_all(text=True)
print(text)
res.post(url) returns Response [204] and I can't access the content of the webpage.
Could this happen because of the required login?
I must admit that I am not experienced with Python and I don't have the knowledge to this myself.
What I can do is to manipulate text files and from the DevTools page I can get the HTML file and extrapolate the text, but how can I do this programmatically?
CodePudding user response:
To access the url you mentioned , you need USERNAME and PASSWORD Authorization.
to do this( customize to your need):
import mechanize
from bs4 import BeautifulSoup
import urllib2
import cookielib ## http.cookiejar in python3
cj = cookielib.CookieJar()
br = mechanize.Browser()
br.set_cookiejar(cj)
br.open("https://id.arduino.cc/auth/login/")
br.select_form(nr=0)
br.form['username'] = 'username'
br.form['password'] = 'password.'
br.submit()
print br.response().read()
CodePudding user response:
I don't have access to this API, so take my advice with a grain of salt, but you should also try using requests.get
instead of requests.post
.
Why? Because requests.post
POSTs data to the server, while requests.get
GETs data from the server. GET and POST are known as HTTP methods, and to learn more about them, see https://www.tutorialspoint.com/http/http_methods.htm. Since web browsers use GET, you should give that a try.
CodePudding user response:
My bad for not seeing it before, but Space-Track has already a solution on their website: