I'm working on a misinformation project and I want to scrape a couple of quarantined subreddits (r/russsia specifically).
When I follow the guidelines posted in the PRAW docs I get a prawcore.exceptions.Forbidden: received 403 HTTP response
error.
I saw a couple of solutions from 3 years ago about manually adding the subreddit in the browser and using quaran.opt_in(),
but no luck. Below is a code snippet:
reddit = praw.Reddit(user_agent='Comment Extraction (by /u/guy_asking_on_stackoverflow)',
                     client_id=sec.reddit_client_id, client_secret=sec.reddit_client_secret)
subred = reddit.subreddit(subreddit)
subred.quaran.opt_in()  # error happens here
# for post in subred.top(limit=10):  # error happened here before the opt_in() attempt; kept for post history
#     pass
subred is of type praw.models.reddit.subreddit.Subreddit, but it will not return submissions.
Any ideas for a solution?
Full error:
Traceback (most recent call last):
File "/Users/travisbarton/opt/anaconda3/envs/work3.8/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3361, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-7-9de81e112c74>", line 1, in <cell line: 1>
for post in subred.top(limit=10):
File "/Users/travisbarton/opt/anaconda3/envs/work3.8/lib/python3.8/site-packages/praw/models/listing/generator.py", line 63, in __next__
self._next_batch()
File "/Users/travisbarton/opt/anaconda3/envs/work3.8/lib/python3.8/site-packages/praw/models/listing/generator.py", line 73, in _next_batch
self._listing = self._reddit.get(self.url, params=self.params)
File "/Users/travisbarton/opt/anaconda3/envs/work3.8/lib/python3.8/site-packages/praw/reddit.py", line 595, in get
return self._objectify_request(method="GET", params=params, path=path)
File "/Users/travisbarton/opt/anaconda3/envs/work3.8/lib/python3.8/site-packages/praw/reddit.py", line 696, in _objectify_request
self.request(
File "/Users/travisbarton/opt/anaconda3/envs/work3.8/lib/python3.8/site-packages/praw/reddit.py", line 885, in request
return self._core.request(
File "/Users/travisbarton/opt/anaconda3/envs/work3.8/lib/python3.8/site-packages/prawcore/sessions.py", line 330, in request
return self._request_with_retries(
File "/Users/travisbarton/opt/anaconda3/envs/work3.8/lib/python3.8/site-packages/prawcore/sessions.py", line 266, in _request_with_retries
raise self.STATUS_EXCEPTIONS[response.status_code](response)
prawcore.exceptions.Forbidden: received 403 HTTP response
Answer:
To scrape quarantined subreddits, your client cannot be read-only.
You can make the client fully authorized by also providing the account's username and password (PRAW's password flow).
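You can confirm this is the issue first: PRAW exposes a read_only property on the client, and a client created with only client_id and client_secret reports itself as read-only (a quick diagnostic, not part of the fix):

print(reddit.read_only)  # True for the app-only client from the question

The fixed constructor then looks like this: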
reddit = praw.Reddit(user_agent='Comment Extraction (by /u/guy_asking_on_stackoverflow)',
                     client_id=sec.reddit_client_id, client_secret=sec.reddit_client_secret,
                     password=sec.reddit_password, username=sec.reddit_username)
https://praw.readthedocs.io/en/stable/getting_started/authentication.html#password-flow
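For completeness, here is a minimal end-to-end sketch, assuming the same sec secrets module from the question and that the subreddit name is just the string passed to reddit.subreddit(); once the client is authorized with the password flow, both the quarantine opt-in and the listing call should go through:

import praw
import sec  # your own secrets module, as in the question

# Script-type app authorized via the password flow, so the client is not read-only.
reddit = praw.Reddit(
    user_agent='Comment Extraction (by /u/guy_asking_on_stackoverflow)',
    client_id=sec.reddit_client_id,
    client_secret=sec.reddit_client_secret,
    username=sec.reddit_username,
    password=sec.reddit_password,
)

subred = reddit.subreddit('russia')  # the quarantined subreddit from the question

# Opt in to the quarantine for this account, then fetch submissions.
subred.quaran.opt_in()
for post in subred.top(limit=10):
    print(post.title, post.score)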