Get Twitter username and number of followers with Twitter API

I want to search Twitter by keyword with the Twitter API. I'm using the Twitter Search API (recent search, v2).

import requests

# BEARER_TOKEN is assumed to be defined elsewhere (your app's bearer token)
query = 'football'
tweet_fields = "author_id,created_at,text,public_metrics,possibly_sensitive,source,lang"
max_results = "50"

# Build the request for the recent search endpoint
headers = {"Authorization": "Bearer {}".format(BEARER_TOKEN)}

url = "https://api.twitter.com/2/tweets/search/recent?query={}&tweet.fields={}&max_results={}".format(query, tweet_fields, max_results)
response = requests.get(url, headers=headers)

print("Response Status Code:", response.status_code)

# Stop early if the API did not return a successful response
if response.status_code != 200:
    raise Exception(response.status_code, response.text)

# The matching Tweets are in the 'data' array of the response
twitter_search_data = response.json()['data']

for data in twitter_search_data:
    print(data)

I'm getting good results, but I also want the author's username. For now I can only get author_id.

I tried adding the following to my API URL, but the user fields still do not show up in my results:

expansions=author_id&user.fields={}
user_fields = "description,username"

url = "https://api.twitter.com/2/tweets/search/recent?query={}&tweet.fields={}&expansions=author_id&user.fields={}&max_results={}".format(query, tweet_fields, user_fields, max_results)

This is an example result:

{'possibly_sensitive': False, 'source': 'Twitter for Android', 'lang': 'en', 'public_metrics': {'retweet_count': 1, 'reply_count': 0, 'like_count': 0, 'quote_count': 0}, 'created_at': '2021-10-05T12:23:05.000Z', 'id': '1445363916457005058', 'text': 'RT @COiNSTANTIN1: @MEXC_Global @PolkaExOfficial Check out @MiniFootballBsc We are bringing together the football and crypto community.\n⚽️Fa…', 'author_id': '1444275133854715912'}

Is there a way to add something to my Twitter API request so that I can get:

1. author username
2. author name
3. number of followers for the author
4. number of accounts the author is following

Answer:

You're close to what you need. The user information you're requesting via the expansions is delivered in a separate part of the response called includes (the user objects are in its users array), not inside data. You're missing it because your code only prints each value in the data array.

If you want the metrics (the followers_count and following_count for each user), add public_metrics as an additional user field in your query:

user_fields = "description,username,public_metrics"
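As a rough sketch of how the full request URL could be assembled with these fields, the snippet below uses the standard library's urlencode so the commas in the field lists are escaped consistently. The helper name build_search_url is hypothetical and not part of any Twitter library. If name is not returned for the users, it can be listed explicitly in user.fields as well.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"


def build_search_url(query, max_results=50):
    """Hypothetical helper: build a recent-search URL that also expands authors."""
    params = {
        "query": query,
        "tweet.fields": "author_id,created_at,text,public_metrics,possibly_sensitive,source,lang",
        # Ask the API to include the user object for each Tweet's author_id
        "expansions": "author_id",
        # Fields to return on each expanded user object
        "user.fields": "description,username,public_metrics",
        "max_results": max_results,
    }
    return SEARCH_URL + "?" + urlencode(params)


print(build_search_url("football"))
```

The resulting URL can be passed to requests.get together with the same Authorization header used in the question.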

Then, you can either list out the includes separately, or do some matching to combine the user object with the matching Tweet. The simplest thing to do would be:

print(response.json()['data'])
print(response.json()['includes'])

You can match the user with the Tweet data by checking the author_id in the Tweet object against the id value in the user object.
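A minimal sketch of that matching, assuming the response JSON has the data and includes.users layout described above. The function name attach_authors and the output keys are made up for illustration, and the sample payload is hand-written rather than real API output.

```python
def attach_authors(payload):
    """Return one merged record per Tweet, with its author's user fields added."""
    # Index the expanded user objects by id for quick lookup
    users = payload.get("includes", {}).get("users", [])
    users_by_id = {user["id"]: user for user in users}

    merged = []
    for tweet in payload.get("data", []):
        # Fall back to an empty dict if the author was not included
        author = users_by_id.get(tweet["author_id"], {})
        metrics = author.get("public_metrics", {})
        merged.append({
            "tweet_id": tweet["id"],
            "text": tweet["text"],
            "author_username": author.get("username"),
            "author_name": author.get("name"),
            "followers_count": metrics.get("followers_count"),
            "following_count": metrics.get("following_count"),
        })
    return merged


# Hand-written sample in the shape of a v2 search response (values are made up)
sample = {
    "data": [{"id": "1", "text": "hello", "author_id": "42"}],
    "includes": {
        "users": [{
            "id": "42",
            "name": "Example User",
            "username": "example_user",
            "public_metrics": {"followers_count": 10, "following_count": 5},
        }]
    },
}

for row in attach_authors(sample):
    print(row)
```

With a real response, you would call attach_authors(response.json()) instead of passing the sample.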

There are also tools and libraries that can help you do this automatically, for example, the latest version of twarc can "flatten" this data into single objects.