How can I convert a string that represents a list containing both string and numeric values into an actual list, given that the strings inside it are not in quotes?
import pandas as pd
df = pd.DataFrame({'col_1': ['[2, A]', '[5, BC]']})
print(df)
col_1
0 [2, A]
1 [5, BC]
col_1 [2, A]
Name: 0, dtype: object
My aim is to use the list in another function, so I tried to convert the string with built-in functions such as eval() and ast.literal_eval(). In both cases, however, I would first need to add quotes around the strings, so that they read "A" and "BC".
CodePudding user response:
You can first use a regex to add quotes around the potential strings (here I matched letters and underscores), then use literal_eval (for some reason I get an error with pd.eval).
from ast import literal_eval
df['col_1'].str.replace(r'([a-zA-Z_]+)', r'"\1"', regex=True).apply(literal_eval)
output (lists):
0 [2, A]
1 [5, BC]
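For reference, here is a minimal standalone sketch of the same quote-then-parse idea using only the standard library. The helper name parse_unquoted_list is made up for illustration, and it assumes every unquoted item is a single identifier-like token (letters, digits, underscores, not starting with a digit); items containing spaces or punctuation would need a different pattern.

```python
import re
from ast import literal_eval

# Assumption: bare items look like identifiers, e.g. A, BC, x_1.
# The leading \b keeps the pattern from matching inside numbers such as 1e5.
_BARE_WORD = re.compile(r'\b[A-Za-z_]\w*')


def parse_unquoted_list(text):
    """Quote each bare word in the string, then parse it as a Python literal."""
    quoted = _BARE_WORD.sub(lambda m: '"' + m.group(0) + '"', text)
    return literal_eval(quoted)


rows = ['[2, A]', '[5, BC]']
parsed = [parse_unquoted_list(row) for row in rows]
print(parsed)  # [[2, 'A'], [5, 'BC']]
```

On the DataFrame from the question, this could be applied with df['col_1'].apply(parse_unquoted_list).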
CodePudding user response:
It is already a string. If the data always follows this format, you can extract the string part directly (the final strip() removes the space left after the comma):
df['col_2'] = df['col_1'].apply(lambda x: x.split(',')[1].rstrip(']').strip())
CodePudding user response:
If you want the output to be a list:
import pandas as pd
df = pd.DataFrame({'col_1': ['[2, A]', '[5, BC]']})
print(df)
a = df["col_1"].to_list()
actual_list = [[int(i.split(",")[0][1:]), str(i.split(",")[1][1:-1])] for i in a]
actual_list
Output:
[[2, 'A'], [5, 'BC']]
CodePudding user response:
If you just need to convert the string representation of a list to a list of strings, you can use str.strip() together with str.split(), as follows:
df['col_1'].str.strip('[]').str.split(r',\s*')
Result:
print(df['col_1'].str.strip('[]').str.split(r',\s*').to_dict())
{0: ['2', 'A'], 1: ['5', 'BC']}
If you also want to convert the numeric strings to numbers, you can additionally use pd.to_numeric(), as follows:
df['col_1'].str.strip('[]').str.split(r',\s*').apply(lambda x: [pd.to_numeric(y, errors='ignore') for y in x])
Result:
print(df['col_1'].str.strip('[]').str.split(r',\s*').apply(lambda x: [pd.to_numeric(y, errors='ignore') for y in x]).to_dict())
{0: [2, 'A'], 1: [5, 'BC']} # 2, 5 are numbers instead of strings
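As far as I know, recent pandas releases (2.2 and later) deprecate errors='ignore' in pd.to_numeric(), so here is a rough sketch of the same split-and-convert idea without it, using only the standard library. The helper names to_number_or_str and split_list_string are hypothetical. Note that float() also accepts tokens such as "nan" and "inf", which may or may not be what you want.

```python
import re


def to_number_or_str(token):
    """Return an int or float if the token parses as one, else the stripped string."""
    token = token.strip()
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            continue
    return token


def split_list_string(text):
    """Strip the surrounding brackets, split on commas, convert each item."""
    inner = text.strip().strip('[]')
    return [to_number_or_str(item) for item in re.split(r',\s*', inner)]


result = [split_list_string(s) for s in ['[2, A]', '[5, BC]']]
print(result)  # [[2, 'A'], [5, 'BC']]
```

With the DataFrame from the question, this could be applied as df['col_1'].apply(split_list_string).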