Can Python read an Excel file while retaining the source format? The table below shows the source format. Converting the Excel file to CSV preserves it, but that conversion is not an acceptable solution for my program.
Currency | Percent | Date Value |
---|---|---|
$345.0 | 50% | Mar-22 |
CSV method that extracts source format:
import csv

rows = []
with open('filename.csv', newline='') as file:
    csvreader = csv.reader(file)
    header = next(csvreader)  # first row holds the column names
    for row in csvreader:
        rows.append(row)
print(header)
print(rows)
['Currency', 'Percent', 'Date Value']
[['$345.0', '50%', 'Mar-22']]
Excel method that DOES NOT extract source format:
import pandas as pd
df = pd.read_excel('filename.xlsx', dtype=str)
print(df.columns.tolist())
print(df.values.tolist())
['Currency', 'Percent', 'Date Value']
[['345.0', '0.50', '2022-03-01 00:00:00']]
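For context, the difference arises because an Excel cell stores a raw value plus a separate number format, and read_excel returns only the value. Below is a minimal sketch of inspecting both with openpyxl. The workbook is built in memory only to keep the example self-contained, and the three format codes are assumptions about how the source file might be formatted, not values taken from the actual file. openpyxl exposes the format code but does not render the formatted string for you.

```python
from datetime import datetime
from io import BytesIO

from openpyxl import Workbook, load_workbook

# Build a small workbook in memory so the sketch is self-contained;
# in practice you would call load_workbook('filename.xlsx') instead.
wb = Workbook()
ws = wb.active
ws.append(["Currency", "Percent", "Date Value"])
ws.append([345.0, 0.5, datetime(2022, 3, 1)])
# Assumed format codes, chosen to resemble the table in the question.
ws["A2"].number_format = '"$"#,##0.0'
ws["B2"].number_format = "0%"
ws["C2"].number_format = "mmm-yy"

buffer = BytesIO()
wb.save(buffer)
buffer.seek(0)

# Reload and collect each cell's raw value alongside its number format.
sheet = load_workbook(buffer).active
header = [cell.value for cell in sheet[1]]
cells = [(cell.value, cell.number_format) for cell in sheet[2]]

print(header)
for value, number_format in cells:
    print(repr(value), number_format)
```

Turning a value and its format code into the displayed text still has to be done by your own code, for example with per-column formatting functions.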
CodePudding user response:
You will likely need to write custom converter functions and pass them to read_excel through its converters argument.
For example:
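import pandas as pd

def tocurrency(value):
    # Format the raw number as dollars with two decimals.
    val = float(value)
    return f"${val:.2f}"

def topercent(value):
    # Excel stores 50% as 0.5; render it as a whole-number percentage.
    val = float(value)
    return f"{val:.0%}"

def todate(value):
    # %y is the two-digit year, giving "Mar-22"; %d would give the day of the month.
    return value.strftime("%b-%y")

df = pd.read_excel('filename.xlsx', converters={"Currency": tocurrency,
                                                "Percent": topercent,
                                                "Date Value": todate})
print(df)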
Output:
  Currency Percent Date Value
0  $345.00     50%     Mar-22
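The formatting logic can be checked without an Excel file by calling the functions on sample values directly. The sketch below is a standalone variant using only the standard library. The function names and the blank-cell guard are illustrative assumptions: it assumes an empty cell may reach a converter as None or NaN and chooses to render it as an empty string.

```python
import math
from datetime import datetime


def is_blank(value):
    """Return True for None or NaN (assumed representations of an empty cell)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def to_currency(value):
    # Dollars with two decimals, e.g. 345.0 -> "$345.00".
    return "" if is_blank(value) else f"${float(value):.2f}"


def to_percent(value):
    # Fraction to whole-number percentage, e.g. 0.5 -> "50%".
    return "" if is_blank(value) else f"{float(value):.0%}"


def to_month_year(value):
    # Abbreviated month and two-digit year, e.g. "Mar-22" in an English locale.
    return "" if is_blank(value) else value.strftime("%b-%y")


# Sample raw values as they would come out of the spreadsheet.
sample = {"Currency": 345.0, "Percent": 0.5, "Date Value": datetime(2022, 3, 1)}
converters = {
    "Currency": to_currency,
    "Percent": to_percent,
    "Date Value": to_month_year,
}
formatted = {name: converters[name](raw) for name, raw in sample.items()}
print(formatted)
```

A dictionary shaped like converters here is what the converters argument of read_excel expects: column names mapped to one-argument functions.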
CodePudding user response:
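You could remove dtype=str and instead pass a dictionary of the data types you want for those columns. This controls the parsed types rather than the display format, and date columns are requested through parse_dates instead of dtype.
For example:
dtype = {"Currency": float, "Percent": float}
df = pd.read_excel('filename.xlsx', dtype=dtype, parse_dates=["Date Value"])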
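Once the columns hold typed values, the source-style strings can be produced at display time while the numeric and datetime data stay intact for calculations. The following is a rough sketch under that assumption. The DataFrame is built in memory to stand in for the result of read_excel, and the format strings are guesses at the intended display style.

```python
import pandas as pd

# Stand-in for a typed DataFrame loaded from the spreadsheet.
df = pd.DataFrame({
    "Currency": [345.0],
    "Percent": [0.5],
    "Date Value": pd.to_datetime(["2022-03-01"]),
})

# Build a separate, string-only view for display; df itself keeps its types.
display = pd.DataFrame({
    "Currency": df["Currency"].map("${:.2f}".format),
    "Percent": df["Percent"].map("{:.0%}".format),
    "Date Value": df["Date Value"].dt.strftime("%b-%y"),
})

print(display.values.tolist())
```

Keeping the formatted copy separate means sums, comparisons, and date arithmetic can still be done on df.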