Empty list causing pd.DataFrame() to return no rows

Time:11-30

import pandas as pd
pd.DataFrame({'genre': 'Pop',
 'country': 'CA',
 'artist_name': 'Olivia Rodrigo',
 'title_name': 'good 4 u',
 'release_date': '2021-05-13',
 'core_genre': 'Pop',
 'metrics': [],
 'week_id': 202101,
 'top_isrc': 'USUG12101245'})

returns the column names but an otherwise empty dataframe. This happens because of the empty list for metrics. That is a problem: it would be better if this returned a 1-row dataframe with an empty list in the metrics column.

Here is an example of the data without missing metrics:

{'genre': 'Pop',
 'country': 'CA',
 'artist_name': 'Olivia Rodrigo',
 'title_name': 'drivers license',
 'release_date': '2021-01-07',
 'core_genre': 'Pop',
 'metrics': [{'name': 'Song w/SES On-Demand',
   'value': [{'name': 'tp', 'value': 1},
    {'name': 'lp', 'value': 0},
    {'name': 'ytd', 'value': 1},
    {'name': 'atd', 'value': 1}]},
  {'name': 'Song w/SES On-Demand Audio',
   'value': [{'name': 'tp', 'value': 0},
    {'name': 'lp', 'value': 0},
    {'name': 'ytd', 'value': 0},
    {'name': 'atd', 'value': 0}]},
  {'name': 'Streaming On-Demand Total',
   'value': [{'name': 'tp', 'value': 414},
    {'name': 'lp', 'value': 0},
    {'name': 'ytd', 'value': 414},
    {'name': 'atd', 'value': 414}]},
  {'name': 'Streaming On-Demand Audio',
   'value': [{'name': 'tp', 'value': 69},
    {'name': 'lp', 'value': 0},
    {'name': 'ytd', 'value': 69},
    {'name': 'atd', 'value': 69}]}],
 'week_id': 202101,
 'top_isrc': 'USUG12004749'}

pd.DataFrame() handles this quite nicely, creating one row for each of the 4 dicts in the metrics list. I assume that, for the same reason pd.DataFrame() returns 4 rows in this second example (4 dicts in the list), it returns 0 rows in the first example (0 dicts in the list). However, losing that row of data is a problem. How can we handle this?

Answer:

To get a single row with an empty list in the metrics column, pass a list containing an empty list:

df = pd.DataFrame({'genre': 'Pop',
 'country': 'CA',
 'artist_name': 'Olivia Rodrigo',
 'title_name': 'good 4 u',
 'release_date': '2021-05-13',
 'core_genre': 'Pop',
 'metrics': [[]],
 'week_id': 202101,
 'top_isrc': 'USUG12101245'})

Gives

  genre country     artist_name title_name release_date core_genre metrics  week_id      top_isrc
0   Pop      CA  Olivia Rodrigo   good 4 u   2021-05-13        Pop      []   202101  USUG12101245

Alternatively, you could pass a list containing an empty dict, [{}].
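
If the records arrive from elsewhere and you cannot edit the literal by hand, the same wrapping can be applied in code. The following is only a sketch: record_to_frame and its list_key parameter are hypothetical names, not part of pandas, and it assumes each record is a dict of scalars plus one list-valued key.

```python
import pandas as pd

def record_to_frame(record, list_key="metrics"):
    """Build a DataFrame from one record (hypothetical helper).

    If record[list_key] is an empty list, wrap it as [[]] so that
    pandas sees one element and keeps a single row.
    """
    rec = dict(record)  # shallow copy, so the caller's dict is not modified
    value = rec.get(list_key)
    if isinstance(value, list) and len(value) == 0:
        rec[list_key] = [[]]
    return pd.DataFrame(rec)

empty_record = {'genre': 'Pop',
                'title_name': 'good 4 u',
                'metrics': [],
                'week_id': 202101}

df = record_to_frame(empty_record)
print(df)
```

Records with a non-empty metrics list pass through unchanged, so they still produce one row per element.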

Comment:

It's interesting that passing an empty list returns no rows at all. I suppose that, from pandas's point of view, it is hard to distinguish a vector of row values from a single row value that happens to be a vector, and the default behaviour is, apparently, to throw the whole row away. Interesting.
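
To illustrate the point in this comment: when a dict mixes scalars with a list, the list length sets the number of rows and the scalars are repeated to match, so a list of length 0 leaves nothing to repeat. One way to sidestep the ambiguity, sketched below with shortened example records, is to pass a list of records, so that each dict is treated as one row and the list stays inside a single cell. DataFrame.explode can then expand it to one row per metric; for an empty list it keeps the row and fills the cell with NaN.

```python
import pandas as pd

# Shortened, illustrative versions of the records in the question
empty = {'title_name': 'good 4 u', 'metrics': []}
full = {'title_name': 'drivers license',
        'metrics': [{'name': 'Streaming On-Demand Total'},
                    {'name': 'Streaming On-Demand Audio'}]}

# Dict of scalars plus a list: the row count follows the list length
n_rows_empty = len(pd.DataFrame(empty))
n_rows_full = len(pd.DataFrame(full))

# List of records: each dict is one row, the list is a single cell value
one_row = pd.DataFrame([empty])

# explode() gives one row per list element; an empty list becomes NaN
exploded_full = pd.DataFrame([full]).explode('metrics').reset_index(drop=True)
exploded_empty = pd.DataFrame([empty]).explode('metrics').reset_index(drop=True)

print(n_rows_empty, n_rows_full, len(one_row))
print(exploded_full)
print(exploded_empty)
```

With this approach the row for 'good 4 u' survives in both forms, at the cost of a NaN in the metrics column after exploding rather than an empty list.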
