Is there a way to add two arrays in two columns into a third array using pandas?

Time:02-18

I am working on a project that uses a pandas DataFrame. Two of its columns hold array values like the ones in the example below.

I need to add the pos_vec and word_vec columns element-wise and store the result in a new column called sum_of_arrays. Each array in the new column should have size 2.

Eg: pos_vec                       word_vec                        sum_of_arrays
   [-0.22683072, 0.32770252]      [0.3655883, 0.2535131]          [0.13875758, 0.58121562]

Can anyone help me? I'm stuck here. :(

CodePudding user response:

If you convert the lists in each cell to NumPy arrays (np.array), you can simply add them, because NumPy arrays add element-wise.
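The conversion matters because `+` on plain Python lists concatenates them rather than adding their values. A minimal sketch of the difference, using the first row from the question (the variable names here are only for illustration):

```python
import numpy as np

pos = [-0.22683072, 0.32770252]
word = [0.3655883, 0.2535131]

# Adding two Python lists joins them end to end (four elements).
concatenated = pos + word

# Adding two NumPy arrays adds the values at matching positions (two elements).
elementwise = np.array(pos) + np.array(word)

print(len(concatenated), elementwise.shape)
```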

import pandas as pd
import numpy as np
df = pd.DataFrame({'pos_vec':[[-0.22683072,0.32770252],[0.14382899,0.049593687],[-0.24300802,-0.0908088],[-0.2507714,-0.18816864],[0.32294357,0.4486494]],
                  'word_vec':[[0.3655883,0.2535131],[0.33788466,0.038143277], [-0.047320127,0.28842866],[0.14382899,0.049593687],[-0.24300802,-0.0908088]]})

If you want to use NumPy:

df['col_sum'] = df[['pos_vec','word_vec']].applymap(lambda x: np.array(x)).sum(1)

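If you don't want to use NumPy:

# Add the two lists of each row pairwise; distinct names avoid reusing x for both the row and the pair
df['col_sum'] = df.apply(lambda row: [a + b for a, b in zip(row.pos_vec, row.word_vec)], axis=1)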
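Another option, sketched here as an alternative rather than as part of the answer above, is to stack each column into a 2-D NumPy array and add the two arrays in one step. This assumes every list in both columns has the same length. It also avoids `applymap`, which pandas 2.1 and later deprecate in favor of `DataFrame.map`. The shortened DataFrame below is only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "pos_vec": [[-0.22683072, 0.32770252], [0.14382899, 0.049593687]],
    "word_vec": [[0.3655883, 0.2535131], [0.33788466, 0.038143277]],
})

# Stack each column of equal-length lists into an array of shape (rows, 2).
pos = np.array(df["pos_vec"].tolist())
word = np.array(df["word_vec"].tolist())

# One vectorized addition, then back to one list per row.
df["col_sum"] = pd.Series((pos + word).tolist(), index=df.index)

print(df["col_sum"])
```

On large frames this usually does less per-row Python work than `apply` with `axis=1`, since the addition itself runs inside NumPy.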

CodePudding user response:

There may be cleaner approaches that stay within pandas, but this is the solution I came up with: extract the two columns from the DataFrame as lists, add them pairwise, and assign the result back as a new column.

# Extract data as lists
pos_vec = df["pos_vec"].tolist()
word_vec = df["word_vec"].tolist()

# Create new list with desired calculation
sum_of_arrays = [[x + y for x, y in zip(l1, l2)] for l1, l2 in zip(pos_vec, word_vec)]

# Add new list to DataFrame
df["sum_of_arrays"] = sum_of_arrays
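Since `zip` stops at the shorter of its inputs, a row with mismatched list lengths would be truncated silently. A quick check like the following, a sketch on a shortened version of the data, can confirm that every summed row has the size of 2 that the question asks for:

```python
import pandas as pd

df = pd.DataFrame({
    "pos_vec": [[-0.22683072, 0.32770252], [0.14382899, 0.049593687]],
    "word_vec": [[0.3655883, 0.2535131], [0.33788466, 0.038143277]],
})

# Same pairwise addition as above, on the shortened data.
df["sum_of_arrays"] = [
    [x + y for x, y in zip(l1, l2)]
    for l1, l2 in zip(df["pos_vec"].tolist(), df["word_vec"].tolist())
]

# Length of the list stored in each row of the new column.
lengths = df["sum_of_arrays"].map(len)
print((lengths == 2).all())
```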