Unable to convert a pandas Dataframe to a list using literal_eval

Time:12-04

I have been trying to convert a pandas DataFrame column to a list, because the data in the column is read as a str by default. Sample data from the 'genres' column of the 'movie' DataFrame:

[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}, {"id": 14, "name": "Fantasy"}, {"id": 878, "name": "Science Fiction"}]

The code I am writing

import ast 
import pandas as pd
movie = pd.read_csv("tmdb_5000_movies.csv")
movie['genres'] = movie['genres'].apply(lambda x : ast.literal_eval(str(x)))
print(type(movie['genres']))

The output I am getting is

<class 'pandas.core.series.Series'>

I really can't wrap my head around where I am going wrong.

CodePudding user response:

A pandas.DataFrame is composed of Series objects (where a Series is simply a column). Series are container objects similar to Python lists and can be converted into a list with their Series.tolist method.

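ast.literal_eval is applied to each element inside your Series, converting each string into a list of dictionaries; those lists are then stored back into a Series. That is why type(movie['genres']) still reports a Series: you are checking the type of the column, not of its elements.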
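To make the difference between the column type and the element type concrete, here is a minimal sketch. It assumes a small hand-built DataFrame standing in for tmdb_5000_movies.csv; the two rows and the variable names parsed, column_type, element_type and inner_type are illustrative only.

```python
import ast

import pandas as pd

# Hypothetical two-row stand-in for the 'genres' column of tmdb_5000_movies.csv
movie = pd.DataFrame({
    "genres": [
        '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]',
        '[{"id": 14, "name": "Fantasy"}]',
    ]
})

# Parse every cell from str into a Python object
parsed = movie["genres"].apply(ast.literal_eval)

column_type = type(parsed)            # the column itself is still a Series
element_type = type(parsed.iloc[0])   # each cell is now a list
inner_type = type(parsed.iloc[0][0])  # each item of that list is a dict

print(column_type, element_type, inner_type)
```

Checking type(parsed.iloc[0]) rather than type(parsed) is the quickest way to confirm that the conversion worked.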

So your code is working. But if you want a plain Python list instead of a Series (the result holds one list of dictionaries per row), you'll need to do the following:

import ast 
import pandas as pd
movie = pd.read_csv("tmdb_5000_movies.csv")
movie['genres'] = movie['genres'].apply(lambda x : ast.literal_eval(str(x)))

genres = movie['genres'].tolist()
print(genres)
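Note that tolist() on this column returns a nested list, with one inner list of dictionaries per movie. If a single flat list is what you are after, one possible approach is to chain the inner lists together. This is only a sketch on hypothetical sample rows; flat_genres and genre_names are names made up for the example.

```python
import ast
from itertools import chain

import pandas as pd

# Hypothetical two-row stand-in for the 'genres' column of tmdb_5000_movies.csv
movie = pd.DataFrame({
    "genres": [
        '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]',
        '[{"id": 14, "name": "Fantasy"}]',
    ]
})

# One inner list of dicts per row
genres = movie["genres"].apply(ast.literal_eval).tolist()

# Flatten the per-row lists into a single list of dicts
flat_genres = list(chain.from_iterable(genres))

# Pull out just the genre names
genre_names = [genre["name"] for genre in flat_genres]

print(genre_names)
```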