Scrape website data without BS or selenium (Python)

Time:05-28

My scenario: I have a webpage open in my browser and want to copy some of the text from it (there is a whole login process every time). For security reasons, I do not want to log in to the webpage repeatedly, so the requests library is not suitable. I also do not want to use Selenium, because it opens a new browser and I want to use my existing one. My question: with my browser already open on the page I want information from, is there some sort of script I can write that will retrieve certain information from the page and save it somewhere (almost like a macro, but one that can retrieve specific elements)? Is this possible?

CodePudding user response:

I'm not sure if I understood the question correctly.

One way might be to download (save) the entire .html file from your browser and then process the data you need "locally".
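
As a rough sketch of that idea, the saved page can be parsed with Python's standard library alone (no BeautifulSoup). The class name `TextByClass`, the helper `extract_text`, and the CSS class being searched for are all made up for illustration; the real page structure is not known from the question.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TextByClass(HTMLParser):
    """Collect the text inside every element carrying a given CSS class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.depth = 0        # > 0 while we are inside a matching element
        self.results = []     # one string per matching element
        self._buffer = []     # text pieces of the element being read

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1   # nested tag inside a matching element
        elif self.wanted_class in classes:
            self.depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self.depth:
            self._buffer.append(data)


def extract_text(html, wanted_class):
    """Return the text of each element whose class list contains wanted_class."""
    parser = TextByClass(wanted_class)
    parser.feed(html)
    parser.close()
    return parser.results
```

To use it, save the page from the browser (Ctrl+S), read the file into a string, and call something like `extract_text(page_source, "user-details")`. Note that this only sees what is in the saved file, so content the page loads later through JavaScript may be missing.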

CodePudding user response:

If you use "requests", as with Postman, you don't need to log in each time. If you have a valid JWT token, you can skip the login. But that depends on how your site works (your question lacks details).
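
A minimal sketch of reusing a token, assuming the site accepts it in an `Authorization: Bearer` header (many sites use session cookies instead, in which case this will not work as written). The URL, the token value, and the function names `build_authenticated_request` and `fetch_page` are placeholders, not anything taken from the question.

```python
import urllib.request


def build_authenticated_request(url, token):
    """Build a GET request that carries the token instead of logging in again."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Accept": "text/html",
        },
    )


def fetch_page(url, token, timeout=10):
    """Download the page and return its HTML as text."""
    request = build_authenticated_request(url, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

The token would typically be copied once from the browser's developer tools (Network tab, request headers) and reused until it expires, for example `fetch_page("https://example.com/report", token)`.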

I don't know about Selenium, but with Puppeteer (a competitor), you can reuse an already installed browser instead of downloading a new one.

Also, do you even need Selenium or Puppeteer? Can't you just run some code in your browser console? You can create and save snippets in the Sources tab in Chrome. If you need direct access to your file system (meaning that having the collected data downloaded automatically to the downloads folder, or getting a download pop-up to choose a folder, is not enough for you), you could take a look at the Tampermonkey extension. Or maybe you need to make a Chrome extension.


Update after reading your comment @JeanVanNiekerk:

// Get the display name of the user who asked the question.
console.log(
  document.querySelector('#question .user-details a').innerText
); // 'Jean Van Niekerk'

// Copy some text to the clipboard.
navigator.clipboard.writeText('stuff').then(
  () => {
    console.log('Copied text ready!');
  }
);

// If you run the code above directly in the console, you
// will get `Uncaught (in promise) DOMException: Document is not focused.`
// This is a security restriction (maybe it can be disabled for your special case; another
// option is to make an extension that has this kind of permission).

// To try it out right now, paste the code below into your console, then quickly click anywhere on the page.
setTimeout(() => {
  navigator.clipboard.writeText('stuff').then(
    () => {
      console.log('Copied text ready!');
    }
  );
}, 1000);
// Press Ctrl+V to paste your text :)