Home > Software engineering >  How to get only column value of a dataframe (without reference formula)
How to get only column value of a dataframe (without reference formula)

Time:03-30

I want to get only column value of csv file rather then reference formula.

 df_csv = pd.read_csv(file_name)
 print(df_csv["column_head"])

the output is not the value. It is a csv reference formula.

0    =ROUND(IF(J2,I2/J2,0),4)
1    =ROUND(IF(J3,I3/J3,0),4)

But I want only cell values not the formula. How can I do this in python?

CodePudding user response:

I think you are reading from a line which defines the formulas. I would open the csv in excel and then check if I am reading the data right. use

df=pd.read_csv('file_path', header=n); where n is the row number from which data 
                                       starts

CodePudding user response:

Something seems to be wrong with the format of the file you read from. I checked on a spreadsheet containing formulas for some cells. here as an example:

XLTest

The 'D' column contains moving averages from the 'B' column. I saved the spreadsheet as a 'xlsx' file and as a 'csv' file.

For the first, importing with:

import pandas as pd
df_ex = pd.read_excel("xltest.xlsx")
df_ex = df_ex[df_ex['Global 10 years MA'].notnull()]
df_ex.head()

and the second, importing with:

df_ex2 = pd.read_csv('xltest.csv')
df_ex2[df_ex2['Global 10 years MA'].notnull()].head()

both gave this output (note: the 'notnull()' mask is just extracting the lines with a value in this example):

enter image description here

Extracting the "Column with a Formula" from the first dataframe:

ser = df_ex['Global 10 years MA']
ser.head()

gives this output:

enter image description here

As you see, values, and no excel formulas. Check you save your file as xlsx or csv format, in your spreadsheet program.

  • Related