How to store part of an Excel file name in a Python integer variable

Imagine I have an Excel file named LP_Elements_Shocked_202108160517.xlsx.

I would like to pull out one specific part of the file name, 20210816, and store it as an integer.

The pattern is consistent: every file name begins with LP_Elements_Shocked_, followed by the eight digits I need, and then always four more digits I do not need.

Here is what I have so far:

import pandas as pd
pd.read_excel('LP_Elements_Shocked_202108160517.xlsx')

CodePudding user response：

Since your pattern always starts with the same string, you can just use a substring (slice the string):

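filename = 'LP_Elements_Shocked_202108160517.xlsx'

# The prefix 'LP_Elements_Shocked_' is 20 characters long, so the eight digits sit at indices 20-27.
date_part = int(filename[20:28])
print(date_part) # prints: 20210816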

Otherwise, for more complex patterns, you could use a regex.
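As a rough sketch of the regex route, assuming the file names follow exactly the pattern described in the question (fixed prefix, eight wanted digits, four unwanted digits, .xlsx extension), a capture group can pick out the date and reject names that do not fit. The names PATTERN and extract_date are made up for this illustration:

```python
import re

# Assumed layout: fixed prefix, 8 digits to keep, 4 digits to drop, .xlsx extension.
PATTERN = re.compile(r"LP_Elements_Shocked_(\d{8})\d{4}\.xlsx")


def extract_date(filename):
    """Return the eight-digit date part of the file name as an int."""
    match = PATTERN.fullmatch(filename)
    if match is None:
        # Fail loudly instead of returning a wrong slice for an unexpected name.
        raise ValueError(f"unexpected file name: {filename!r}")
    return int(match.group(1))


print(extract_date("LP_Elements_Shocked_202108160517.xlsx"))
```

Compared with slicing, this costs a few more lines but catches files whose names do not match the expected layout.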

For the part (from the comments) where you want to keep the filename with each dataframe, the simplest approach is to add a column filled with the filename to each dataframe you read (by itself, pandas does not keep track of the filename of the Excel file).
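One possible way to do that, as a minimal sketch: the helper names (date_from_filename, tag_with_source) and the column names (source_file, file_date) are invented for this example, and the slice assumes the fixed LP_Elements_Shocked_ prefix from the question.

```python
from pathlib import Path

import pandas as pd

PREFIX = "LP_Elements_Shocked_"  # fixed prefix taken from the question


def date_from_filename(path):
    """Return the eight digits after the prefix as an int."""
    name = Path(path).name  # drop any directory part
    return int(name[len(PREFIX):len(PREFIX) + 8])


def tag_with_source(df, path):
    """Return a copy of df with the file name and its date in two new columns."""
    return df.assign(
        source_file=Path(path).name,
        file_date=date_from_filename(path),
    )


# Usage, assuming the workbook exists on disk:
# path = "LP_Elements_Shocked_202108160517.xlsx"
# df = tag_with_source(pd.read_excel(path), path)
```

Because the columns travel with the rows, the origin of each row is still known after several files are concatenated.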

CodePudding user response：

Use re:

import re
file_name = 'LP_Elements_Shocked_202108160517.xlsx'
# Take the first run of digits, drop its last four, and convert to int.
num = int(re.findall(r"\d+", file_name)[0][:-4])
print(num)

Output:

20210816