This is my first time working with pandas, so I am still trying to wrap my head around what it can do.
Essentially, I want to download my bank statements as CSV, search for a keyword (e.g. steam), and compute how much money I spent on it.
I was able to use pandas to locate the rows that contain my keyword, but I do not know how to iterate through them and add the cost of each purchase to a running total.
As the image I uploaded shows, I can find the rows containing my keyword in the dataframe. What I want is, for each row found, to take the amount from the expense column and add them all together.
# importing pandas module
import pandas as pd
keyword = input("Enter the keyword you wish to search in the statement: ")
# reading the local CSV file (the bank export has no header row)
df = pd.read_csv('accountactivity.csv',header=None)
dff=df.loc[df[1].str.contains(keyword,case=False)]
value=df.values[68][2] #Fetches value of a specific cell in the CSV/dataframe created
print(dff)
print(value)
EDIT: I was almost able to complete the code I wanted using only the CSV reader, but I can't get that code to find substrings. It only works if I enter the exact string: if I enter netflix it finds nothing, and I have to write it exactly as it appears on the statement, like NETFLIX.COM _V. Here is another screenshot of that working code. I essentially want to mimic it, but with the ability to match substrings.
import csv
data = []
with open("accountactivity.csv") as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        data.append(row)
keyword = input("Enter the keyword you wish to search in the statement: ")
col = [x[1] for x in data]
Sum = 0
# Only an exact match on the whole cell is found here, not a substring
if keyword in col:
    for x in range(0, len(data)):
        if keyword == data[x][1]:
            PartialSum = float(data[x][2])
            Sum = Sum + PartialSum
            print(data[x][1])
    print("The sum for expenses at ", keyword, " is of: ", Sum, "$", sep='')
else:
    print("Keyword returned no results.")
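For the substring matching described in the EDIT, one possible approach with the csv module alone is to lowercase both the keyword and the transaction name and test with `in` instead of `==`. The following is only a sketch: the helper name `sum_expenses` and the sample rows are made up for illustration, and it assumes the "money spent" cell is empty on rows where money was received.

```python
import csv
import io


def sum_expenses(rows, keyword):
    """Sum column 2 for rows whose column 1 contains keyword, ignoring case."""
    needle = keyword.lower()
    total = 0.0
    matches = []
    for row in rows:
        # Skip short rows and rows with an empty "money spent" cell (assumed deposits).
        if len(row) < 3 or not row[2].strip():
            continue
        # Substring test instead of an exact comparison.
        if needle in row[1].lower():
            total += float(row[2])
            matches.append(row[1])
    return total, matches


# Made-up sample standing in for accountactivity.csv (hypothetical data).
sample = io.StringIO(
    "01/02/2022,NETFLIX.COM _V,16.99,\n"
    "01/05/2022,STEAM PURCHASE,25.00,\n"
    "01/09/2022,PAYROLL DEPOSIT,,1500.00\n"
    "01/15/2022,Steam Games,10.50,\n"
)
total, matches = sum_expenses(csv.reader(sample), "steam")
print(matches)
print("The sum for expenses at steam is of: ", total, "$", sep="")
```

With a real statement, the `io.StringIO` sample would be replaced by the opened file, as in the code above.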
The CSV has the following format:
column 0 Date of transaction
column 1 Name of transaction
column 2 Money spent from account
column 3 Money received to account
The CSV file downloaded directly from my bank has no headers, so I refer to columns by their index (0, 1, 2, ...).
Thanks for your help. In the meantime, I will keep looking into how to do this.
CodePudding user response:
dff[dff.columns[col_index]].sum()
where col_index is the index of the column you want to sum.
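To make that concrete, here is a small sketch of the filter followed by the column sum. The rows are made up for illustration and only follow the layout described in the question; with a real statement you would read the file itself instead of the in-memory text.

```python
import io

import pandas as pd

# Made-up rows in the layout described in the question (hypothetical data).
csv_text = (
    "01/02/2022,NETFLIX.COM _V,16.99,\n"
    "01/05/2022,STEAM PURCHASE,25.00,\n"
    "01/09/2022,PAYROLL DEPOSIT,,1500.00\n"
    "01/15/2022,Steam Games,10.50,\n"
)
df = pd.read_csv(io.StringIO(csv_text), header=None)

# Keep only the rows whose column 1 contains the keyword, ignoring case.
dff = df.loc[df[1].str.contains("steam", case=False)]

col_index = 2  # column 2 holds the money spent
total = dff[dff.columns[col_index]].sum()
print(dff)
print(total)
```

No explicit loop is needed: `.sum()` adds up the whole column of the filtered dataframe in one call.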
CodePudding user response:
Thanks everyone for your help. I now understand better how pandas dataframes work. I used the command dff[dff.columns[col_index]].sum() (suggested to me by Jonny Kong) on the column of interest, which in my case is column 2, containing my expenses. It computes the sum of my expenses for the searched keyword, which is exactly what I need!
#Importing pandas module
import pandas as pd
#Keyword searched through bank statement
keyword = input("Enter the keyword you wish to search in the statement: ")
#Reading the bank statement CSV file
df = pd.read_csv('accountactivity.csv',header=None)
#Creating dataframe from bank statement with lines that match search keyword
dff=df.loc[df[1].str.contains(keyword,case=False)]
#Sum the column which contains total money spent on the keyword searched
Sum=dff[dff.columns[2]].sum()
#Prints the created dataframe
print("\n",dff,"\n")
#Prints the sum of expenses for the keyword searched
print("The sum for expenses at ",keyword," is of: ",Sum,"$",sep = '')
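Two caveats that may or may not apply to a given statement, sketched here under assumptions: if any row has an empty transaction name, `str.contains` returns a missing value for it and the boolean filter fails, and a keyword such as netflix.com is treated as a regular expression in which the dot matches any character. Passing `na=False` and `regex=False` is one way to guard against both. The sample rows below are invented for the demonstration.

```python
import io

import pandas as pd

# Made-up rows (hypothetical data): one blank name and one near-miss name.
csv_text = (
    "01/02/2022,NETFLIX.COM _V,16.99,\n"
    "01/03/2022,,5.00,\n"
    "01/04/2022,NETFLIXXCOM,3.00,\n"
)
df = pd.read_csv(io.StringIO(csv_text), header=None)

keyword = "netflix.com"
# na=False treats rows with a missing name as non-matching;
# regex=False makes the keyword a literal string, so "." only matches ".".
mask = df[1].str.contains(keyword, case=False, na=False, regex=False)
dff = df.loc[mask]
total = dff[dff.columns[2]].sum()
print(dff)
print(total)
```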
Again, thanks everyone for helping and supporting me through my first post on SO!