Can you keep source formatting when reading Excel with Python?


Can Python read an Excel file while retaining the source format? The table below shows the values as they are displayed in Excel. Converting the workbook to CSV first does preserve that formatting, but the extra conversion step is not an option for this program.

Currency  Percent  Date Value
$345.0    50%      Mar-22

CSV method that extracts source format:

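import csv

rows = []
with open('filename.csv', newline='') as file:
    csvreader = csv.reader(file)
    header = next(csvreader)      # first row holds the column names
    for row in csvreader:
        rows.append(row)          # each cell stays a string, exactly as written in the CSV
print(header)
print(rows)

Output:

['Currency', 'Percent', 'Date Value']
[['$345.0', '50%', 'Mar-22']]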

Excel method that DOES NOT extract source format:

import pandas as pd
df = pd.read_excel('filename.xlsx', dtype=str)
print(df.columns.tolist())
print(df.values.tolist())

Output:

['Currency', 'Percent', 'Date Value']
[['345.0', '0.50', '2022-03-01 00:00:00']]

CodePudding user response:

You will likely need to write custom converter functions and pass them to read_excel through its converters argument, which maps a column name to a function applied to each cell in that column.

For example:

import pandas as pd

def tocurrency(value):
    # 345.0 -> "$345.00"
    val = float(value)
    return f"${val:.2f}"

def topercent(value):
    # 0.5 -> "50%"
    val = float(value)
    return f"{val:.0%}"

def todate(value):
    # Month abbreviation and two-digit year, e.g. "Mar-22" for March 2022
    return value.strftime("%b-%y")

df = pd.read_excel('filename.xlsx', converters={"Currency": tocurrency,
                                                "Percent": topercent,
                                                "Date Value": todate})
print(df)

Output:

  Currency Percent Date Value
0  $345.00     50%     Mar-22
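
To try the converter idea without a workbook, here is a minimal sketch that applies the same kind of functions to one row of raw values, roughly what the cells hold before any formatting. The sample row and the apply_converters helper are assumptions made for illustration and are not part of pandas. The date pattern assumes the sheet displays a month abbreviation and a two-digit year.

```python
from datetime import datetime


def tocurrency(value):
    # 345.0 -> "$345.00"
    return f"${float(value):.2f}"


def topercent(value):
    # 0.5 -> "50%"
    return f"{float(value):.0%}"


def todate(value):
    # Assumes the sheet shows month abbreviation and two-digit year
    return value.strftime("%b-%y")


converters = {
    "Currency": tocurrency,
    "Percent": topercent,
    "Date Value": todate,
}


def apply_converters(row, converters):
    """Hypothetical helper: format each value with the converter for its column."""
    return {col: converters.get(col, str)(val) for col, val in row.items()}


# Sample raw values, assumed for illustration
raw_row = {"Currency": 345.0, "Percent": 0.5, "Date Value": datetime(2022, 3, 1)}
formatted = apply_converters(raw_row, converters)
print(formatted)
```

Note that these converters hard-code the display pattern. They reproduce the look of the sheet only as long as the patterns match what the workbook actually uses.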

CodePudding user response:

You could remove dtype=str and instead pass a dictionary that maps each column to the data type you want it to have.

For example:

import numpy as np
dtype = {"Currency": int, "Percent": float, "Date Value": np.datetime64}
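
Both answers set the type or the display pattern by hand. If the goal is to see which format Excel itself stored, one option may be to read the cells with openpyxl, where each cell exposes its raw value and its number_format string separately. The sketch below assumes openpyxl is installed. It builds a small workbook in memory so it does not depend on filename.xlsx, and the three format codes are chosen for illustration only.

```python
from datetime import datetime
from io import BytesIO

from openpyxl import Workbook, load_workbook

# Build a small workbook in memory (stand-in for the real file)
wb = Workbook()
ws = wb.active
ws.append(["Currency", "Percent", "Date Value"])
ws.append([345.0, 0.5, datetime(2022, 3, 1)])

# Format codes assumed for illustration; a real sheet may use different ones
ws["A2"].number_format = '"$"#,##0.0'
ws["B2"].number_format = "0%"
ws["C2"].number_format = "mmm-yy"

buffer = BytesIO()
wb.save(buffer)
buffer.seek(0)

# Read it back: each cell carries the raw value plus its format code
sheet = load_workbook(buffer).active
cells = [(cell.value, cell.number_format) for cell in sheet[2]]
for value, fmt in cells:
    print(repr(value), fmt)
```

The format code is only a description of how Excel renders the value. Turning it into the displayed text (for example "$345.0") still requires interpreting the code yourself, for instance by mapping known codes to converter functions.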