Pandas.read_csv stops at 50 characters

Time:09-06

I have some finance data stored in a CSV. The problem comes when loading my bids and asks, which are lists of lists stored as text in a single CSV column. My goal is to load the CSV into normal lists I can zip and map to whatever I need. So far I have found a way that uses .to_string() to create a string representation of my list of lists, and then ast.literal_eval to convert it to an actual list.

Raw CSV file data (just the 'Bids' column; I didn't want to crowd my post with the whole file):

[['1548.36000000', '36.94670000'], ['1548.35000000', '2.75850000'], ['1548.32000000', '4.56580000'], ['1548.31000000', '7.59050000'], ['1548.30000000', '18.99930000'], ['1548.26000000', '3.60850000'], ['1548.25000000', '4.30280000'], ['1548.17000000', '0.02000000'], ['1548.12000000', '29.70940000'], ['1548.03000000', '0.20000000']]

What I want to do:

import ast
import pandas

df = pandas.read_csv('C:/Users/Ethan/Desktop/csv/test.csv', names=['E','ID','Bids','Asks'])
df = (df['Bids']).to_string(index=False)
bids = ast.literal_eval(df)

This yields an error:

   [['1548.36000000', '36.94670000'], ['1548.35000...
                                       ^
SyntaxError: unterminated string literal (detected at line 1)

Simply running len(df) without ast.literal_eval() gives a string length of 50, which is obviously several hundred characters short:

df = pandas.read_csv('C:/Users/Ethan/Desktop/csv/test.csv', names=['E','ID','Bids','Asks'])
df = (df['Bids']).to_string(index=False)
print(len(df))

>50

So it seems Python can't literal_eval my string because the string simply doesn't contain the entire value, which would explain why the string literal is unterminated. Has anyone encountered this before? Another method for loading a list of lists from a single CSV column would be appreciated as well, but it would be great to know why this won't work.

CodePudding user response:

Thanks, everyone! With your help my solution wound up being simpler, using .at[]. .at[] returns the cell's value itself (here a plain string) rather than a display-formatted rendering, so I can pass it to literal_eval to convert it to a list!

df = pandas.read_csv('C:/Users/Ethan/Desktop/csv/test.csv', names=['E','ID','Bids','Asks'])
df2 = df.at[0,'Bids']
dflist = ast.literal_eval(df2)
print(type(dflist))
print(dflist)
print(dflist[0])

results:

<class 'list'>
[['1548.36000000', '36.94670000'], ['1548.35000000', '2.75850000'], ['1548.32000000', '4.56580000'], ['1548.31000000', '7.59050000'], ['1548.30000000', '18.99930000'], ['1548.26000000', '3.60850000'], ['1548.25000000', '4.30280000'], ['1548.17000000', '0.02000000'], ['1548.12000000', '29.70940000'], ['1548.03000000', '0.20000000']]
['1548.36000000', '36.94670000']
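
For what it's worth, once the cell is a real list, the "zip and map" step mentioned in the question could look roughly like the sketch below. The names `raw`, `levels`, `prices` and `quantities` are made up for illustration, the input is shortened to the first three bid levels from the post, and converting to float is an assumption (decimal.Decimal may be a better fit if exact decimal arithmetic matters).

```python
import ast

# Text as it would come back from df.at[0, 'Bids'],
# shortened here to the first three levels shown in the question.
raw = ("[['1548.36000000', '36.94670000'], "
       "['1548.35000000', '2.75850000'], "
       "['1548.32000000', '4.56580000']]")

# Parse the text into a real list of [price, quantity] pairs.
levels = ast.literal_eval(raw)

# Transpose the pairs into two sequences, then map the strings to floats.
price_strs, quantity_strs = zip(*levels)
prices = list(map(float, price_strs))
quantities = list(map(float, quantity_strs))

print(prices)
print(quantities)
```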

CodePudding user response:

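The pandas to_string method (on a DataFrame or a Series) doesn't return the full string representation if a value is too long: each cell is cut off at the display.max_colwidth option, which defaults to 50 characters. That is why len() reports 50 in the question.

For example: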

>>> import pandas as pd
>>> df = pd.DataFrame({1:[[1, [2]]*1000]})
>>> df
                                                   1
0  [1, [2], 1, [2], 1, [2], 1, [2], 1, [2], 1, [2...
>>> df[1].to_string(index=False)
'[1, [2], 1, [2], 1, [2], 1, [2], 1, [2], 1, [2]...'
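
To see that the cut-off comes from the display settings and not from read_csv, here is a small sketch that compares to_string with and without the column width limit. The `cell` value is a hypothetical stand-in for one long 'Bids' entry, and lifting the limit with pd.option_context is just one way to do it; to_string is meant for display, so this is more of a diagnostic than a recommended way to get data out.

```python
import pandas as pd

# Hypothetical stand-in for one long 'Bids' cell stored as text.
cell = str([['1548.36000000', '36.94670000']] * 10)
s = pd.Series([cell])

# With default display options the rendered text is cut off and ends in '...'.
truncated = s.to_string(index=False)

# Lift the per-cell width limit only inside this block.
with pd.option_context("display.max_colwidth", None):
    full = s.to_string(index=False)

print(len(cell), len(truncated), len(full))
```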

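Try something like this, which reads the values straight from the column instead of going through to_string (note that the column is named "Bids", with a capital B):

[ast.literal_eval(row) for row in df["Bids"].astype("string").to_list()]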
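
Another option, as a sketch only, is to let read_csv do the parsing through its converters parameter, so the 'Bids' and 'Asks' cells arrive as lists already. The in-memory file below is a hypothetical one-row stand-in for test.csv (the 'E', 'ID' and ask values are invented for illustration); with the real file you would pass its path instead of `buf`.

```python
import ast
import io

import pandas as pd

# Hypothetical sample values; only the two bid levels come from the question.
bids = [['1548.36000000', '36.94670000'], ['1548.35000000', '2.75850000']]
asks = [['1548.37000000', '1.25000000']]

# Build a one-row CSV in memory; to_csv takes care of quoting the list text.
buf = io.StringIO()
pd.DataFrame([["depth", 1, str(bids), str(asks)]]).to_csv(buf, header=False, index=False)
buf.seek(0)

# Parse the two list columns while reading, keyed by column name.
df = pd.read_csv(
    buf,
    names=["E", "ID", "Bids", "Asks"],
    converters={"Bids": ast.literal_eval, "Asks": ast.literal_eval},
)

first_bids = df.at[0, "Bids"]
print(type(first_bids))
print(first_bids[0])
```

ast.literal_eval only evaluates Python literals, so it is a reasonably safe choice for text like this, but it will raise an error on any cell that is empty or malformed.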