How to store part of an Excel file name in a Python integer variable

Time:09-24

Imagine I have an Excel file named LP_Elements_Shocked_202108160517.xlsx

I would like to pull out this specific part of the file name and store it as an integer: 20210816

The pattern is consistent. Every file name begins with LP_Elements_Shocked_, followed by the eight digits I need, and then always four more digits that I do not need.

Here is what I have so far:

import pandas as pd
pd.read_excel('LP_Elements_Shocked_202108160517.xlsx')

CodePudding user response:

Since your pattern always starts with the same string, you can just use a substring (slice the string):

filename = 'LP_Elements_Shocked_202108160517.xlsx'

print(int(filename[20:28]))  # prints: 20210816 (int() turns the sliced string into an integer)

Otherwise, you could use a regex for more complex patterns.
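As a minimal sketch of the slicing approach, the offset can be derived from the prefix length instead of being hard-coded as 20. The names PREFIX, DATE_DIGITS and extract_date are illustrative and not from the question, and the check that the file name really starts with the prefix is an added assumption.

```python
PREFIX = "LP_Elements_Shocked_"  # fixed prefix described in the question
DATE_DIGITS = 8                   # number of digits to keep after the prefix


def extract_date(filename):
    """Return the eight digits after PREFIX as an int (hypothetical helper)."""
    if not filename.startswith(PREFIX):
        raise ValueError(f"unexpected file name: {filename!r}")
    start = len(PREFIX)  # 20 for this prefix
    return int(filename[start:start + DATE_DIGITS])


date_int = extract_date("LP_Elements_Shocked_202108160517.xlsx")
print(date_int)  # prints: 20210816
```

Deriving the offset this way keeps the slice correct if the prefix ever changes, and the int() call gives the integer the question asks for rather than a string.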

For the part (from the comments) where you want to keep the filename with each dataframe: the simplest option is to add a column holding the filename to each dataframe you read. By itself, pandas does not keep track of which Excel file a dataframe came from.

See this related Q&A: read_excel into data frame and keep file name as column (Pandas)
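One way this could look, as a rough sketch rather than the method from the linked Q&A: a helper that tags a dataframe with its file name and the extracted date. The function names and the column names source_file and file_date are made up for illustration, and the demonstration uses an in-memory dataframe so that no Excel file is needed.

```python
from pathlib import Path

import pandas as pd

PREFIX = "LP_Elements_Shocked_"  # fixed prefix described in the question


def add_source_columns(df, path):
    """Return a copy of df with the file name and its date digits as columns."""
    name = Path(path).name
    date_int = int(name[len(PREFIX):len(PREFIX) + 8])
    # Column names are hypothetical; pick whatever suits your data.
    return df.assign(source_file=name, file_date=date_int)


def read_with_source(path):
    """Read an Excel file and tag every row with where it came from."""
    return add_source_columns(pd.read_excel(path), path)


# Demonstration on an in-memory frame instead of a real workbook.
demo = add_source_columns(
    pd.DataFrame({"value": [1.5, 2.5]}),
    "LP_Elements_Shocked_202108160517.xlsx",
)
print(demo)
```

With several files, calling read_with_source on each path and passing the results to pd.concat would give one dataframe in which every row still carries its source file.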

CodePudding user response:

Use the re module:

import re
file_name = 'LP_Elements_Shocked_202108160517.xlsx'
num = int(re.findall(r"\d+", file_name)[0][:-4])  # first run of digits, minus the last 4, as an integer
print(num)

Output:

20210816
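A stricter variant, offered as a sketch: the pattern can spell out the whole file name so that only names of the expected shape match. PATTERN and date_from_name are hypothetical names, and returning None for a non-matching name is an assumption about the desired behaviour.

```python
import re

# Prefix, eight captured digits, four ignored digits, then the extension.
PATTERN = re.compile(r"LP_Elements_Shocked_(\d{8})\d{4}\.xlsx")


def date_from_name(file_name):
    """Return the captured eight digits as an int, or None if no match."""
    match = PATTERN.fullmatch(file_name)
    return int(match.group(1)) if match else None


print(date_from_name("LP_Elements_Shocked_202108160517.xlsx"))  # prints: 20210816
print(date_from_name("something_else.xlsx"))  # prints: None
```

Unlike taking the first run of digits, this does not pick up stray digits elsewhere in a name, at the cost of being tied to this exact naming scheme.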