Append all results of get_all_tweets_count with Paginator from tweepy

I'm struggling to get the results from my tweepy count into a usable format.

I'm using elevated Twitter access and the tweepy package.

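import time

import tweepy

# client, query, start_time and end_time are defined elsewhere
counts = tweepy.Paginator(
    client.get_all_tweets_count,
    query=query,
    start_time=start_time,
    end_time=end_time,
    granularity='day')
time.sleep(1)

tweet_count = []

for count in counts:
    # each page's data is a list of dicts, so append nests the lists
    tweet_count.append(count.data)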

As a result, I get a list of lists of dictionaries instead of a DataFrame. The output looks like:

data = [[{'end': '2020-12-01T00:00:00.000Z', 'start': '2020-11-30T00:00:00.000Z', 'tweet_count': 5780}, {'end': '2020-12-02T00:00:00.000Z', 'start': '2020-12-01T00:00:00.000Z', 'tweet_count': 3093}, {'end': '2020-12-03T00:00:00.000Z', 'start': '2020-12-02T00:00:00.000Z', 'tweet_count': 7379}, ...]]

How can I get a clean pandas DataFrame from this, for example one that looks like:

   Start                     End                       Count
0  2020-11-30T00:00:00.000Z  2020-12-01T00:00:00.000Z  5780  
1  2020-12-01T00:00:00.000Z  2020-12-02T00:00:00.000Z  3093      
2  ...                        ...                      ...

I'm not that good at Python, but I guess I'm missing something about the format: a list with dictionaries inside.

CodePudding user response：

Use itertools.chain.from_iterable to flatten the nested list, then pass the flattened list to the pandas DataFrame() constructor to build your DataFrame.

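from itertools import chain
import pandas as pd

# Flatten the list of lists into a single list of dicts.
data = list(chain.from_iterable(data))

# Build the DataFrame with the columns in the desired order.
df = pd.DataFrame(data).reindex(['start', 'end', 'tweet_count'], axis=1)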

# Output :

print(df)

                      start                       end  tweet_count
0  2020-11-30T00:00:00.000Z  2020-12-01T00:00:00.000Z         5780
1  2020-12-01T00:00:00.000Z  2020-12-02T00:00:00.000Z         3093
2  2020-12-02T00:00:00.000Z  2020-12-03T00:00:00.000Z         7379
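
If you also want the Start / End / Count column names from the question, here is a minimal self-contained sketch of the same flattening approach. The sample pages only mimic the shape of the output shown in the question, and the helper name counts_to_frame is made up for illustration. Parsing the timestamps with pd.to_datetime is an optional extra, not something the question asked for.

```python
from itertools import chain

import pandas as pd

# Sample pages shaped like the output in the question:
# one inner list of dicts per page returned by the paginator.
pages = [
    [
        {'end': '2020-12-01T00:00:00.000Z', 'start': '2020-11-30T00:00:00.000Z', 'tweet_count': 5780},
        {'end': '2020-12-02T00:00:00.000Z', 'start': '2020-12-01T00:00:00.000Z', 'tweet_count': 3093},
    ],
    [
        {'end': '2020-12-03T00:00:00.000Z', 'start': '2020-12-02T00:00:00.000Z', 'tweet_count': 7379},
    ],
]


def counts_to_frame(pages):
    """Flatten a list of pages (lists of dicts) into one DataFrame.

    Hypothetical helper, named here only for illustration.
    """
    # Flatten the nested list into a single list of dicts.
    rows = list(chain.from_iterable(pages))
    # Select and order the columns while building the frame.
    df = pd.DataFrame(rows, columns=['start', 'end', 'tweet_count'])
    # Rename to the column names requested in the question.
    df = df.rename(columns={'start': 'Start', 'end': 'End', 'tweet_count': 'Count'})
    # Optional: parse the ISO 8601 strings into timezone-aware timestamps.
    df['Start'] = pd.to_datetime(df['Start'])
    df['End'] = pd.to_datetime(df['End'])
    return df


df = counts_to_frame(pages)
print(df)
```

Alternatively, since each count.data in the question's loop appears to be a list already, calling tweet_count.extend(count.data) instead of append should give a flat list from the start, so the flattening step would not be needed.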