Extract specific row from xlsx file to a list using python

Time:12-16

I want to extract specific rows (assume for now that I already have the row number) from an .xlsx file into a list. In addition, I don't know whether it is possible, but I would like to use the first column as the list's name.

For example: the table I want to extract info from:

                   12/31/2020    12/31/2019    12/31/2018    12/31/2017
Revenue          1.823500e+11  1.614020e+11  1.369580e+11  1.110240e+11
Revenue Growth   1.298000e-01  1.785000e-01  2.336000e-01  2.373000e-01
Cost of Revenue  8.473200e+10  7.189600e+10  5.954900e+10  4.558300e+10
Gross Profit     9.761800e+10  8.950600e+10  7.740900e+10  6.544100e+10

If possible, I want to get the info in this form: Revenue = ["1.8235E+11", "1.61402E+11", "1.36958E+11", "1.11024E+11"]

I have already tried using xlrd to get this done, but I always get this error:

xlrd.biffh.XLRDError: Excel xlsx file; not supported

Thanks in advance and thank you for your help!

CodePudding user response:

Install openpyxl then use read_excel:

# Python env: pip install openpyxl
# Anaconda env: conda install openpyxl

import pandas as pd

# index_col=0 uses the first column (the row labels) as the index
df = pd.read_excel('data.xlsx', index_col=0, engine='openpyxl')
print(df)

# Output:
                   12/31/2020    12/31/2019    12/31/2018    12/31/2017
Revenue          1.823500e+11  1.614020e+11  1.369580e+11  1.110240e+11
Revenue Growth   1.298000e-01  1.785000e-01  2.336000e-01  2.373000e-01
Cost of Revenue  8.473200e+10  7.189600e+10  5.954900e+10  4.558300e+10
Gross Profit     9.761800e+10  8.950600e+10  7.740900e+10  6.544100e+10

To extract the row Revenue, use:

Revenue = df.loc['Revenue']
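
Note that df.loc['Revenue'] returns a pandas Series, not a plain list. Since the question asks for a list, for selection by row number, and for the first column as the name, here is a minimal sketch of one way to get all three. To stay self-contained it builds the DataFrame in memory from the values shown in the question instead of reading data.xlsx; the names revenue, row_number, same_row and rows are only illustrative.

```python
import pandas as pd

# Stand-in for pd.read_excel('data.xlsx', index_col=0, engine='openpyxl');
# the values are copied from the table in the question.
df = pd.DataFrame(
    {
        "12/31/2020": [1.8235e11, 1.298e-01, 8.4732e10, 9.7618e10],
        "12/31/2019": [1.61402e11, 1.785e-01, 7.1896e10, 8.9506e10],
        "12/31/2018": [1.36958e11, 2.336e-01, 5.9549e10, 7.7409e10],
        "12/31/2017": [1.11024e11, 2.373e-01, 4.5583e10, 6.5441e10],
    },
    index=["Revenue", "Revenue Growth", "Cost of Revenue", "Gross Profit"],
)

# Select the row by its label, then convert the Series to a Python list.
revenue = df.loc["Revenue"].tolist()

# Select the row by position instead (0 is the first row below the header).
row_number = 0  # hypothetical row number, as assumed in the question
same_row = df.iloc[row_number].tolist()

# One list per row, keyed by the label from the first column.
rows = {label: values.tolist() for label, values in df.iterrows()}

print(revenue)
print(rows["Gross Profit"])
```

A dictionary keyed by the row label is usually easier to work with than creating one variable per row, because the labels ("Cost of Revenue", for example) are not valid Python identifiers.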