Why am I getting 'ValueError: DataFrame constructor not properly called!'?

Time:11-04

I tried to build a DataFrame from an API response, but I'm getting this error.

url= 'https://www.reddit.com/r/Wallstreetbets/top.json?limit=10&t=year'
r= requests.get(url)
json= r.json()
message_df = pd.DataFrame(json['message'])
error_df = pd.DataFrame(json['error'])
message_df.head()

ValueError: DataFrame constructor not properly called!

CodePudding user response:

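The ValueError is raised because pd.DataFrame is being given a single value (most likely the plain string in json['message']) rather than tabular data. To get a proper response from the Reddit server, set the User-Agent HTTP header; otherwise you will receive an error message instead of the listing: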

import requests
import pandas as pd


headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:106.0) Gecko/20100101 Firefox/106.0"
}


url = "https://www.reddit.com/r/Wallstreetbets/top.json?limit=10&t=year"
r = requests.get(url, headers=headers)
data = r.json()

df = pd.DataFrame([c["data"] for c in data["data"]["children"]])
print(df.columns.to_list())

Prints:

['approved_at_utc', 'subreddit', 'selftext', 'author_fullname', 'saved', 'mod_reason_title', 'gilded', 'clicked', 'title', 'link_flair_richtext', 'subreddit_name_prefixed', 'hidden', 'pwls', 'link_flair_css_class', 'downs', 'thumbnail_height', 'top_awarded_type', 'hide_score', 'name', 'quarantine', 'link_flair_text_color', 'upvote_ratio', 'author_flair_background_color', 'subreddit_type', 'ups', 'total_awards_received', 'media_embed', 'thumbnail_width', 'author_flair_template_id', 'is_original_content', 'user_reports', 'secure_media', 'is_reddit_media_domain', 'is_meta', 'category', 'secure_media_embed', 'link_flair_text', 'can_mod_post', 'score', 'approved_by', 'is_created_from_ads_ui', 'author_premium', 'thumbnail', 'edited', 'author_flair_css_class', 'author_flair_richtext', 'gildings', 'content_categories', 'is_self', 'mod_note', 'created', 'link_flair_type', 'wls', 'removed_by_category', 'banned_by', 'author_flair_type', 'domain', 'allow_live_comments', 'selftext_html', 'likes', 'suggested_sort', 'tournament_data', 'banned_at_utc', 'url_overridden_by_dest', 'view_count', 'archived', 'no_follow', 'is_crosspostable', 'pinned', 'over_18', 'all_awardings', 'awarders', 'media_only', 'can_gild', 'spoiler', 'locked', 'author_flair_text', 'treatment_tags', 'visited', 'removed_by', 'num_reports', 'distinguished', 'subreddit_id', 'author_is_blocked', 'mod_reason_by', 'removal_reason', 'link_flair_background_color', 'id', 'is_robot_indexable', 'report_reasons', 'author', 'discussion_type', 'num_comments', 'send_replies', 'whitelist_status', 'contest_mode', 'mod_reports', 'author_patreon_flair', 'author_flair_text_color', 'permalink', 'parent_whitelist_status', 'stickied', 'url', 'subreddit_subscribers', 'created_utc', 'num_crossposts', 'media', 'is_video', 'post_hint', 'preview', 'link_flair_template_id']
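
Building on the list comprehension above, a more defensive version might check that the payload really is a listing before building the frame, and keep only a few columns. This is a minimal sketch: the helper name posts_to_frame, the chosen columns, and the sample payload are hypothetical and only mimic the shape of the listing; nothing here was captured from a real response.

```python
import pandas as pd


def posts_to_frame(payload, columns=("title", "author", "score", "num_comments")):
    """Build one row per post from a listing-shaped payload (hypothetical helper)."""
    listing = payload.get("data")
    # An error body has no "data" -> "children" path, so fail with a clear message.
    if not isinstance(listing, dict) or "children" not in listing:
        raise ValueError(f"Unexpected response: {payload!r}")
    rows = [child["data"] for child in listing["children"]]
    # reindex keeps only the requested columns, in the requested order.
    return pd.DataFrame(rows).reindex(columns=list(columns))


# Illustrative stand-in for r.json(); the values are made up.
sample = {
    "kind": "Listing",
    "data": {
        "children": [
            {"kind": "t3", "data": {"title": "First post", "author": "user_a",
                                    "score": 10, "num_comments": 2}},
            {"kind": "t3", "data": {"title": "Second post", "author": "user_b",
                                    "score": 7, "num_comments": 0}},
        ]
    },
}

frame = posts_to_frame(sample)
print(frame)
```

In real use you would pass r.json() instead of sample. If the server answered with an error body, the helper raises a ValueError that shows the payload, which is easier to diagnose than the constructor error.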

CodePudding user response:

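import requests
import pandas as pd

url = 'https://www.reddit.com/r/Wallstreetbets/top.json?limit=10&t=year'
r = requests.get(url)
data = r.json()  # renamed from json to avoid shadowing the standard json module

# orient='index' turns each top-level key of the response into a row
df = pd.DataFrame.from_dict(data, orient='index')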
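
It may help to see what from_dict does with an error body. The sketch below assumes the rejected request returned a small JSON object with message and error keys, as the question's code suggests; the values are illustrative, not captured from Reddit. It compares the failing constructor call with the from_dict call.

```python
import pandas as pd

# Assumed shape of the error body; the values are made up for illustration.
error_payload = {"message": "Too Many Requests", "error": 429}

# Passing a single string to the constructor raises the ValueError from the question.
try:
    pd.DataFrame(error_payload["message"])
    failure = None
except ValueError as exc:
    failure = str(exc)
print(failure)

# from_dict with orient="index" does not raise: each top-level key becomes a row.
df = pd.DataFrame.from_dict(error_payload, orient="index")
print(df)
```

So this call avoids the exception, but on an error body it only tabulates the error itself. It does not fetch or flatten the posts, so the User-Agent fix from the first answer is still needed to get the listing.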
