I am rn trying to scrape twitter for an nlp research. I already used snscrape to get tweets with the required filters, the issue is that we need tweets from a specific age range. In my head I guess some profiles on twitter have their birthdate public, so maybe we can fetch that. Maybe webscrape that from the profile? Any ideas are welcomed.
Till now I have tried some methods of webscraping but can't find something concrete
CodePudding user response:
Twitter has a pretty well documented API that works very well with Python. Try to make a simple crawler and see one of the JSONs that you get for a Tweet/User.
You will need to sign up and get some Access Tokens/Keys to use in your script, but other than that you are ready to go: https://developer.twitter.com/en/docs/twitter-api
CodePudding user response:
The age of Twitter users is not made available via the API (and the website may show birthday but not year). There are also a number of other factors you should read about in relation to analysis of Twitter user data.