Can you keep source formatting when reading Excel with Python?

Time:06-25

Can Python read an Excel file while retaining the source formatting? The table below shows the values as they are formatted in Excel. Converting the file from Excel to CSV preserves that formatting, but the conversion step is not an option for my program.

Currency  Percent  Date Value
$345.0    50%      Mar-22

CSV method that extracts source format:

import csv

rows = []
with open('filename.csv', newline='') as file:
    csvreader = csv.reader(file)
    header = next(csvreader)  # first row holds the column names
    for row in csvreader:
        rows.append(row)
print(header)
print(rows)

output:

['Currency', 'Percent', 'Date Value']
[['$345.0', '50%', 'Mar-22']]

Excel method that DOES NOT extract source format:

import pandas as pd
df = pd.read_excel('filename.xlsx', dtype=str)
print(df)

output:

['Currency', 'Percent', 'Date Value']
[['345.0', '0.50', '2022-03-01 00:00:00']]
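
Part of the reason for the difference is that an .xlsx cell stores the raw value (345, 0.5, a date) and a separate number-format string, and read_excel returns only the value. As a rough sketch, assuming the openpyxl package is installed (it is not mentioned in the question), both pieces can be read per cell. The file name demo_formats.xlsx and the format strings are made up for illustration, and the sketch writes its own small workbook so that it runs without the original file.

```python
import os
import tempfile
from datetime import datetime

from openpyxl import Workbook, load_workbook

# Build a small demo workbook (hypothetical stand-in for filename.xlsx)
path = os.path.join(tempfile.mkdtemp(), "demo_formats.xlsx")
wb = Workbook()
ws = wb.active
ws.append(["Currency", "Percent", "Date Value"])
ws.append([345.0, 0.5, datetime(2022, 3, 1)])
ws["A2"].number_format = '"$"#,##0.0'  # assumed currency format
ws["B2"].number_format = "0%"
ws["C2"].number_format = "mmm-yy"
wb.save(path)

# Read back the raw value and the number format of each cell in row 2
sheet = load_workbook(path).active
cells = [(c.value, c.number_format) for c in sheet[2]]
for value, number_format in cells:
    print(repr(value), number_format)
```

As far as I know, openpyxl does not render a format string such as mmm-yy into display text, so the value would still have to be formatted in Python once the format is known.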

CodePudding user response:

You will likely need to create custom converter functions and pass them in as arguments to the read_excel function.

For example:

import pandas as pd

def tocurrency(value):
    # Excel stores the raw number; re-apply the dollar sign
    val = float(value)
    return f"${val:.2f}"

def topercent(value):
    # 0.5 -> "50%"
    val = float(value)
    return f"{val:.0%}"

def todate(value):
    # %b-%y is abbreviated month and two-digit year, e.g. "Mar-22"
    return value.strftime("%b-%y")

df = pd.read_excel('filename.xlsx', converters={"Currency": tocurrency,
                                                "Percent": topercent,
                                                "Date Value": todate})

print(df)

output:

  Currency Percent Date Value
0  $345.00     50%     Mar-22

CodePudding user response:

You could remove dtype=str and instead pass a dictionary of the data types that you want assigned to those columns.

For example:

import numpy as np
dtype = {"Currency": int, "Percent": float, "Date Value": np.datetime64}
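
Typed columns still print in pandas' default representation, so the display format would have to be re-applied after loading. A minimal sketch of that step, assuming a hand-built DataFrame as a stand-in for what read_excel returns (the name formatted and the format strings are illustrative choices, not part of the answer above):

```python
import pandas as pd

# Stand-in for the frame read_excel would return with typed columns
df = pd.DataFrame({
    "Currency": [345.0],
    "Percent": [0.5],
    "Date Value": pd.to_datetime(["2022-03-01"]),
})

# Re-apply display formats; every column becomes a string column
formatted = pd.DataFrame({
    "Currency": df["Currency"].map("${:.1f}".format),
    "Percent": df["Percent"].map("{:.0%}".format),
    "Date Value": df["Date Value"].dt.strftime("%b-%y"),
})
print(formatted)
```

Keeping df typed and building formatted only for display means the numbers and dates stay usable for calculations.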