Python Dataframe add a value to new column based on value from another column

Time:03-24

What is the shortest way to achieve this scenario:

Dataframe1: (Column A of Dataframe1 contains additional values, which is why I cannot simply do df2["Column C"] = df["Column B"])

Column A Column B
Cell 1 Valu 2
Cell w Valu 8
Cell 3 Valu 4

Condition: Insert into Column C (a new column) of Dataframe2 the value from Column B of Dataframe1 wherever the value in Column A of Dataframe1 (e.g. 'Cell 1') matches the value in Column X of Dataframe2 (e.g. 'Cell 1').

Dataframe2 Initial: (Has only Column X & Column J)

Column X Column J
Cell 1 Data c
Cell 3 Data f

Dataframe2 Final: (Previously had only Column X & Column J; now also has Column C, filled according to the condition above)

Column X Column J Column C
Cell 1 Data c Valu 2
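Cell 3 Data f Valu 4

import datetime

# Series.iteritems() was removed in pandas 2.0; Series.items() is the replacement
for key, value3 in df['Column A'].items():
    # Normalize the date string from MM/DD/YYYY to YYYY-MM-DD
    value2 = datetime.datetime.strptime(value3, '%m/%d/%Y').strftime('%Y-%m-%d')
    for key2, value4 in df2['Column X'].items():
        # Compare only the part of Column X before the first space
        stripped = str(value4).split(' ', 1)[0]
        if value2 == stripped:
            x = df[df['Column A'] == value3]['Column B'].values[0]
            # .loc avoids chained assignment (df2['Column C'][key2] = x)
            df2.loc[key2, 'Column C'] = x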

CodePudding user response:

You can use a merge to achieve the result that you want.

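import pandas as pd

# Lookup frame (df) and target frame (df1)
df = pd.DataFrame({'Col A': ['Cell 1', 'Cell 3'], 'Col B': ['Cell 2', 'Cell 4']})
df1 = pd.DataFrame({'Col X': ['Cell 1', 'Cell 3'], 'Col Y': ['Cell c', 'Cell F']})

# Match df1['Col X'] against df['Col A'] and pull in the matching 'Col B'
df2 = df1.merge(df, left_on='Col X', right_on='Col A', how='inner')
df2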

After this you can tidy up the result (remove extra columns, rename columns), but the merge itself is what brings 'Col B' alongside the columns of df1 wherever df['Col A'] == df1['Col X'].
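As a rough sketch of that clean-up step, the snippet below applies the same merge to frames shaped like the ones in the question, then drops the duplicate key column and renames the looked-up column. The frame names df1 and df2, the variable name result, and the choice of how="left" (to keep every row of Dataframe2 even without a match) are assumptions made for illustration, not part of the answer above.

```python
import pandas as pd

# Sample frames mirroring the question's layout.
df1 = pd.DataFrame({
    "Column A": ["Cell 1", "Cell w", "Cell 3"],
    "Column B": ["Valu 2", "Valu 8", "Valu 4"],
})
df2 = pd.DataFrame({
    "Column X": ["Cell 1", "Cell 3"],
    "Column J": ["Data c", "Data f"],
})

# how="left" keeps every row of df2, even when df1 has no matching key.
result = (
    df2.merge(df1, left_on="Column X", right_on="Column A", how="left")
       .drop(columns="Column A")                    # duplicate of Column X
       .rename(columns={"Column B": "Column C"})    # name asked for in the question
)
print(result)
```

With how="inner" instead, rows of df2 that have no counterpart in df1 would be dropped rather than filled with NaN.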

CodePudding user response:

This is how you can do it with the DataFrame.join(...) operation. The DataFrame.merge(...) method works as well.

import pandas as pd

# definition of the dataframes
df = pd.DataFrame({"A": [1, 2, 3], "B": ["b1", "b2", "b3"]})
df2 = pd.DataFrame({"X": [1, 3]})

# join operation
df2_final = df2.set_index("X").join(df.set_index("A")).reset_index()

Which outputs:

   X   B
0  1  b1
1  3  b3
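
For comparison, here is a minimal sketch of the same lookup written with DataFrame.merge(...), next to the join version. The variable names joined and merged, and the use of how="left", are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": ["b1", "b2", "b3"]})
df2 = pd.DataFrame({"X": [1, 3]})

# join matches on the index, so both key columns become the index first
joined = df2.set_index("X").join(df.set_index("A")).reset_index()

# merge matches on columns directly; both key columns are kept,
# so the redundant "A" is dropped afterwards
merged = (
    df2.merge(df, left_on="X", right_on="A", how="left")
       .drop(columns="A")
)

print(joined)
print(merged)
```

The merge form avoids the set_index/reset_index round trip, at the cost of dropping the extra key column afterwards.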