I have a DataFrame that contains two columns, 'A_List' and 'B_List', which are of the string dtype. I have converted these to lists, and I would now like to perform element-wise addition of the elements in the lists at specific indices. I have attached an example of the CSV file I'm using. When I do the following, the output joins the elements at the specified indices instead of summing them. What can I try differently to get the sum instead?
For example, when I do row["A_List"][0] + row["B_List"][3], the desired output would be 0.16 (since 0.1 + 0.06 = 0.16). Instead, I am getting 0.10.06 as my answer.
import pandas as pd
df = pd.read_csv('Example.csv')
# Get rid of the brackets []
df["A_List"] = df["A_List"].apply(lambda x: x.strip("[]"))
df["B_List"] = df["B_List"].apply(lambda x: x.strip("[]"))
# Convert the string dtype of values into a list
df["A_List"] = df["A_List"].apply(lambda x: x.split())
df["B_List"] = df["B_List"].apply(lambda x: x.split())
for i, row in df.iterrows():
    print(row["A_List"][0] + row["B_List"][3])
CodePudding user response:
The problem is that split() returns a list of strings, so when you use the + operator, Python interprets it as string concatenation, not as addition of numeric values. To add the numeric values, you need to convert the elements of the lists from strings to floats before performing the addition. You can do this by using the map() function along with the float() constructor. Here's an updated version of your code:
import pandas as pd
df = pd.read_csv('Example.csv')
# Get rid of the brackets []
df["A_List"] = df["A_List"].apply(lambda x: x.strip("[]"))
df["B_List"] = df["B_List"].apply(lambda x: x.strip("[]"))
# Convert the string dtype of values into a list
df["A_List"] = df["A_List"].apply(lambda x: x.split())
df["B_List"] = df["B_List"].apply(lambda x: x.split())
# Convert the elements of the lists to floats
df["A_List"] = df["A_List"].apply(lambda x: list(map(float, x)))
df["B_List"] = df["B_List"].apply(lambda x: list(map(float, x)))
for i, row in df.iterrows():
    print(row["A_List"][0] + row["B_List"][3])
This will convert the elements of the lists from strings to floats before performing the addition, giving you the desired output.
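To see the difference in isolation, here is a minimal sketch that does not need the CSV file. The variable names a, b, concatenated and total are made up for illustration; the values 0.1 and 0.06 are the ones from the question.

```python
# Values as they come out of str.split(): still strings, not numbers
a = "0.1"
b = "0.06"

concatenated = a + b          # str + str joins the text
total = float(a) + float(b)   # float + float adds the numbers

print(concatenated)      # 0.10.06
print(round(total, 2))   # 0.16
```

The round() call is only there to keep the printed value tidy, since floating-point sums are not always exact.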
Alternatively, you can use the pd.to_numeric(s, downcast='float') function to change the string values to float in a more direct way.
import pandas as pd
df = pd.read_csv('Example.csv')
df[['A_List', 'B_List']] = df[['A_List', 'B_List']].applymap(lambda x: pd.to_numeric(x.strip("[]").split(), downcast='float'))
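This applies the conversion to both columns, A_List and B_List, in one line. Keep in mind that each cell then holds a NumPy array rather than a Python list, and that downcast='float' can reduce the values to float32; leave out downcast if you want to keep float64 precision. Also, DataFrame.applymap is deprecated as of pandas 2.1 in favor of DataFrame.map.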
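If you want the sum for every row as a new column rather than printing it inside a loop, something like the following should work. This is only a sketch: the DataFrame is a made-up stand-in for Example.csv (I am assuming bracketed, space-separated values, as the strip/split code implies), and parse_floats and Sum_0_3 are hypothetical names.

```python
import pandas as pd

# Hypothetical stand-in for Example.csv: bracketed, space-separated strings
df = pd.DataFrame({
    "A_List": ["[0.1 0.2 0.3 0.4]", "[1.0 2.0 3.0 4.0]"],
    "B_List": ["[0.01 0.02 0.03 0.06]", "[0.5 0.6 0.7 0.8]"],
})

def parse_floats(text):
    """Turn a string like '[0.1 0.2]' into a list of Python floats."""
    return [float(value) for value in text.strip("[]").split()]

# Parse both columns once, up front
for col in ["A_List", "B_List"]:
    df[col] = df[col].apply(parse_floats)

# Add A_List[0] and B_List[3] for every row without iterrows()
df["Sum_0_3"] = [a[0] + b[3] for a, b in zip(df["A_List"], df["B_List"])]

print(df["Sum_0_3"].round(2).tolist())
```

Zipping the two columns avoids the per-row overhead of iterrows(), and it will raise an IndexError if any list is shorter than the index you ask for, which is usually what you want to find out early.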