Home > other >  Compare quantity of each item in dataframe in pandas
Compare quantity of each item in dataframe in pandas

Time:11-03

I fairly new to programming in python and using the pandas library and I am having problems with comparing 2 dataframes with different quantities i want to see if the quantity in data_1 is less than the quantity in data_2 for each item

import pandas as pd

data_1 = [['banana',10],['orange',2],['strawberry',3]]
data_2 = [['banana',1],['orange',2],['strawberry',5],['melon',8]]

df_1 = pd.Dataframe(data_1,columns = ['item','quantity'])
df_2 = pd.Dataframe(data_2,columns = ['item','quantity'])

enter image description here

enter image description here

i was trying to use pd.merge() the 2 dataframes to compare but it's not quite what i was looking for..... i needed a 3° dataframe with the differences only

CodePudding user response:

For such comparison item should be index and then you might concat quantity columns and compute desired column following way

import pandas as pd

data_1 = [['banana',10],['orange',2],['strawberry',3]]
data_2 = [['banana',1],['orange',2],['strawberry',5],['melon',8]]

df_1 = pd.DataFrame(data_1,columns = ['item','quantity'])
df_2 = pd.DataFrame(data_2,columns = ['item','quantity'])

df = pd.concat({"data1":df_1.set_index("item").quantity,"data2":df_2.set_index("item").quantity},axis=1,sort=False)
df['isless'] = df['data1'] < df['data2']
print(df)

output

            data1  data2  isless
banana       10.0      1   False
orange        2.0      2   False
strawberry    3.0      5    True
melon         NaN      8   False
  • Related