What module will help me analyze this data?

Time:10-19

I have a CSV file that holds many records, and there are many other CSV files that hold the same records but with different values. I want to analyze these CSV files and get some useful information from them, but I do not know how. Can you tell me how to do it? The CSV file looks like this:

Compony name,DCP,OpenMarket,HIGH,LOW,CURRENT_MARKET,CHANGE,VOLUME
Company Moon, 8.07, 9.07, 9.07, 7.80, 8.22, 0.15, 4547500
Company Sun, 7.07, 6.07, 5.07, 3.80, 7.22, 0.10, 1233333
.
.
.

This is one CSV file containing stock market data for some companies. Another CSV file holds data for the same companies but with different values. I want to compare them by CHANGE, VOLUME, and the other properties to find out which company is doing well.

CodePudding user response:

Pandas and NumPy are the best libraries for this.
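
As a minimal sketch of what that could look like with pandas, the snippet below loads the sample rows from the question and ranks the companies. The helper name load_snapshot and the change_pct metric (CHANGE as a percentage of DCP) are assumptions made for illustration, not something the question defines.

```python
import io
import pandas as pd

# Sample rows copied from the question; a real script would pass a file path instead.
SAMPLE = """Compony name,DCP,OpenMarket,HIGH,LOW,CURRENT_MARKET,CHANGE,VOLUME
Company Moon, 8.07, 9.07, 9.07, 7.80, 8.22, 0.15, 4547500
Company Sun, 7.07, 6.07, 5.07, 3.80, 7.22, 0.10, 1233333
"""

def load_snapshot(source):
    """Hypothetical helper: read one CSV snapshot into a DataFrame."""
    # skipinitialspace drops the blank that follows each comma in the sample
    return pd.read_csv(source, skipinitialspace=True)

df = load_snapshot(io.StringIO(SAMPLE))

# Assumed metric: CHANGE expressed as a percentage of DCP
df["change_pct"] = (df["CHANGE"] / df["DCP"] * 100).round(2)

# Highest relative change first
ranked = df.sort_values("change_pct", ascending=False)
print(ranked[["Compony name", "CHANGE", "VOLUME", "change_pct"]])
```

Sorting by VOLUME or any other column works the same way; which column best indicates a company that is doing well depends on what the figures mean in your data.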

CodePudding user response:

Using pandas, if you want to diff cell by cell, and assuming the 2 files have the same number of rows / columns with the rows in the same order, I would do the following:

import pandas as pd

df1 = pd.read_csv('file1.csv')
df2 = pd.read_csv('file2.csv')

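# join the 2 dataframes on their row index to get a single dataframe
df = df1.join(df2, rsuffix='_2')
# columns will now be A, B, C and A_2, B_2, C_2 respectively
# (A, B, C are placeholder column names)
# check for diff with
df['A-diff'] = df['A'] - df['A_2']
# ...
# or to do it all; only numeric columns are used, since text columns
# such as the company name cannot be subtracted:
for col in df1.select_dtypes(include='number').columns:
    df[f'{col}-diff'] = df[col] - df[f'{col}_2']
# df['<column name>-diff'] has the diff for each column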

I'm showing a simple subtraction, but you can do whatever you want with the 2 columns. The point is to get the merged values in one dataframe to operate on them.

ref: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.join.html
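
If the rows are not guaranteed to be in the same order in both files, one alternative is to match them on the company name with merge instead of joining on the row index. The following is only a sketch under that assumption: the function name diff_snapshots is hypothetical, both frames are assumed to have the same columns, and the values in the second frame are invented for illustration.

```python
import pandas as pd

KEY = "Compony name"  # header spelled as in the question's CSV

def diff_snapshots(old, new, key=KEY):
    """Return per-company differences (new - old) for every numeric column.

    Assumes both frames share the same column names.
    """
    # Match rows by company name rather than by position
    merged = old.merge(new, on=key, suffixes=("_old", "_new"))
    out = merged[[key]].copy()
    for col in old.select_dtypes(include="number").columns:
        out[f"{col}-diff"] = merged[f"{col}_new"] - merged[f"{col}_old"]
    return out

# First snapshot uses values from the question; second snapshot is invented.
old = pd.DataFrame({
    KEY: ["Company Moon", "Company Sun"],
    "CURRENT_MARKET": [8.22, 7.22],
    "VOLUME": [4547500, 1233333],
})
new = pd.DataFrame({
    KEY: ["Company Sun", "Company Moon"],  # deliberately in a different order
    "CURRENT_MARKET": [7.50, 8.00],
    "VOLUME": [1500000, 4000000],
})

result = diff_snapshots(old, new)
print(result)
```

With the default inner merge, companies that appear in only one of the files are dropped from the result; passing how="outer" to merge would keep them, with missing values in the diff columns.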
