I am trying to merge files that have the same columns but different naming conventions. Some files have column names that contain a period ('.'), while others have column names that do not.
Some of the files look like this:
First.Name | Last.Name |
---|---|
Cell 1 | Cell 2 |
Cell 3 | Cell 4 |
While others look like this:
First Name | Last Name |
---|---|
Cell 1 | Cell 2 |
Cell 3 | Cell 4 |
I only want to change the column names if the column names contain a period. How do I go about this?
This is a snippet of my code:
li = []
for filename in all_files:
    df = pd.read_csv(filename, index_col=None, header=0)
    if df.loc[:,df.columns.str.contains('.')].any() == True:
        df.rename(columns = {'First.Name':'First Name', 'Last.Name':'Last Name'})
    li.append(df)
Running it raises this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
CodePudding user response:
You don't need to check for '.'; replacing it leaves column names without a period unchanged:
df.columns = df.columns.str.replace("."," ", regex=False)
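To illustrate, here is a minimal sketch of that replacement applied to both kinds of headers. The two small frames are hypothetical stand-ins for the files in the question, not the asker's real data.

```python
import pandas as pd

# Hypothetical frames standing in for the two kinds of files in the question.
dotted = pd.DataFrame({"First.Name": ["Cell 1"], "Last.Name": ["Cell 2"]})
spaced = pd.DataFrame({"First Name": ["Cell 3"], "Last Name": ["Cell 4"]})

for frame in (dotted, spaced):
    # regex=False treats "." as a literal period rather than "any character".
    frame.columns = frame.columns.str.replace(".", " ", regex=False)

dotted_cols = list(dotted.columns)
spaced_cols = list(spaced.columns)
print(dotted_cols)
print(spaced_cols)
```

As a side note on the original check: str.contains('.') interprets its pattern as a regular expression by default, so '.' matches any character, and calling .any() on the selected DataFrame returns a Series, which is why the if statement complains about an ambiguous truth value.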
CodePudding user response:
I'd do something like this:
dot_columns = [x for x in df.columns if '.' in x]
df = df.rename(columns={x: x.replace('.', ' ') for x in dot_columns})
Let me know if that works for you.
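Put back into the loop from the question, this might look roughly like the sketch below. The in-memory io.StringIO objects are hypothetical stand-ins for the real paths in all_files, and the final pd.concat is an assumption about how the files are meant to be merged, since the question does not show that step.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-ins for the CSV files in all_files.
all_files = [
    io.StringIO("First.Name,Last.Name\nCell 1,Cell 2\n"),
    io.StringIO("First Name,Last Name\nCell 3,Cell 4\n"),
]

li = []
for filename in all_files:
    df = pd.read_csv(filename, index_col=None, header=0)
    dot_columns = [x for x in df.columns if "." in x]
    # rename returns a new DataFrame, so the result must be assigned back.
    df = df.rename(columns={x: x.replace(".", " ") for x in dot_columns})
    li.append(df)

# Assumed merge step: stack the frames now that their headers agree.
merged = pd.concat(li, ignore_index=True)
print(merged)
```

Note that the code in the question called df.rename(...) without assigning the result, so even with a working condition the renamed frame would have been discarded.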