Using Pandas to calculate the difference between column values

Time:12-19

I have a CSV with two columns, Dates and Profit/Losses, which I have read into a DataFrame.

import os
import pandas as pd

# Build the path to the CSV relative to the script
cpath = os.path.join('..', 'Resources', 'budget_data.csv')
df = pd.read_csv(cpath)
# Make sure the profit/loss column holds integers
df["Profit/Losses"] = df["Profit/Losses"].astype(int)

I want to know the change in profits and losses from one month to the next (each row is one month), so I thought to use df.diff to calculate the values:

df.diff()

However, this raises an error. I think it is trying to take the difference of the Dates column as well, and I'm not sure how to make it calculate only the profits and losses.

CodePudding user response:

Maybe you can do this:

import pandas as pd
x = [[1,2], [1,2], [1,4]]

d = pd.DataFrame(x, columns=['loss', 'profit'])

# Row-wise difference between the two columns, inserted as the first column
d.insert(0, "diff", d["profit"] - d["loss"])

d.head()

Gives:

   diff  loss  profit
0     1     1       2
1     1     1       2
2     3     1       4

CodePudding user response:

Is this what you are looking for?

import pandas as pd

data = pd.DataFrame(
    [
        ["2019-01-01", 40],
        ["2019-02-01", -5],
        ["2019-03-01", 15],
    ],
    columns = ["Dates", "Profit/Losses"]
)

data.assign(Delta=lambda d: d["Profit/Losses"].diff().fillna(0))

Yields

        Dates  Profit/Losses  Delta
0  2019-01-01             40    0.0
1  2019-02-01             -5  -45.0
2  2019-03-01             15   20.0
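
Applied to the DataFrame from the question, the same idea might look like the sketch below. The small in-memory frame is a hypothetical stand-in for budget_data.csv, and the names `Change` and `numeric_changes` are made up for illustration. It shows two ways to keep the Dates column out of the calculation: call diff on the one column you care about, or restrict the frame to its numeric columns first with select_dtypes.

```python
import pandas as pd

# Hypothetical stand-in for the contents of budget_data.csv
df = pd.DataFrame(
    {
        "Dates": ["2019-01-01", "2019-02-01", "2019-03-01"],
        "Profit/Losses": [40, -5, 15],
    }
)

# Option 1: keep only the numeric columns, then diff the whole frame.
# The string Dates column is dropped before any subtraction happens.
numeric_changes = df.select_dtypes(include="number").diff()

# Option 2: diff a single column and store the result alongside the data.
# The first row has no previous month, so its value is NaN.
df["Change"] = df["Profit/Losses"].diff()

print(df)
```

Leaving the first row as NaN rather than filling it with 0 keeps a later average of the monthly changes from being pulled toward zero, since pandas skips NaN by default in mean().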