Suppose a .csv file with a single column named title, whose first row is the string [senior innovation manager].
Note: both strings (the column name and the row) appear in the file exactly as written here:
title
[senior innovation manager]
The idea is to convert this string representation of a list into an actual Python list:
import ast
import pandas as pd
import numpy as np
# read the file
df = pd.read_csv(file_path, sep=',', na_values='NA', encoding='latin-1')
# convert first row to actual python list
df['title'][0]=ast.literal_eval(df['title'][0])
# inspect if ast.literal_eval() converted to actual list:
print(df['title'][0])
print(type(df['title'][0]))
However, when I run the code above, the following error is raised:
Traceback (most recent call last):
File "file_path", line 76, in <module>
df['title'][0]=ast.literal_eval(df['title'][0])
File "C:\Users\id\Anaconda3\lib\ast.py", line 46, in literal_eval
node_or_string = parse(node_or_string, mode='eval')
File "C:\Users\id\Anaconda3\lib\ast.py", line 35, in parse
return compile(source, filename, mode, PyCF_ONLY_AST)
File "<unknown>", line 1
[senior innovation manager]
What's the nature of this error?
Is it possible to convert this string representation of a list into an actual Python list?
CodePudding user response:
I don't see any advantage to treating this as a CSV file or using pandas. You could simply read the second line of the file and strip out the unwanted brackets. You can do that by taking a slice from the second character up to, but not including, the last one. In Python slice syntax, that's [1:-1].
with open(file_path) as fileobj:
    # skip title
    fileobj.readline()
    # get data
    title_list = [fileobj.readline().strip()[1:-1]]
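To make the result of the slice concrete, here is a minimal sketch that runs the same steps on an in-memory stand-in for the file. The io.StringIO object and the names sample, as_one_item and as_words are assumptions for the demo, not part of the question. Note that the snippet above yields a list with a single string in it; if you want one item per word instead, you could split the inner text on whitespace.

```python
import io

# In-memory stand-in for the CSV file from the question (assumption for this demo)
sample = io.StringIO("title\n[senior innovation manager]\n")

sample.readline()                # skip the header line
raw = sample.readline().strip()  # the bracketed text, without the trailing newline
inner = raw[1:-1]                # drop the surrounding brackets

as_one_item = [inner]            # a list holding the whole phrase as one string
as_words = inner.split()         # a list with one string per word

print(as_one_item)
print(as_words)
```

Which of the two shapes you want depends on whether the brackets are meant to hold one title or several separate words; the question doesn't say.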
CodePudding user response:
In order to use literal_eval, your string must be written exactly as it would be written in code. That is, the string values contained in your list must be in quotes and separated by commas, so your string should look something like this: ['senior', 'innovation', 'manager']
If you're set on using this method, you could try replacing each space in your string with ', ' and then adding a quote right after the opening bracket and another right before the closing bracket.
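A rough sketch of that replacement approach might look like the following. The helper name bracketed_words_to_list is made up for illustration, and the sketch assumes the words are separated by exactly one space and contain no quotes or backslashes; anything else would need extra handling before literal_eval can parse it.

```python
import ast

def bracketed_words_to_list(text):
    """Hypothetical helper: rewrite '[a b c]' as "['a', 'b', 'c']" and evaluate it."""
    inner = text.strip()[1:-1]                         # text between the brackets
    quoted = "['" + inner.replace(" ", "', '") + "']"  # add quotes and commas
    return ast.literal_eval(quoted)                    # now a valid list literal

result = bracketed_words_to_list("[senior innovation manager]")
print(result)
print(type(result))
```

If the whole column needs converting, the same helper could be applied to each value of the column rather than only to the first row.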