I have a dataframe with column names 0, 1, 2, 3, etc. I can easily assign the field names when I load the data but the issue becomes if the column name changes, I would not know. There is another file where the column names 0, 1, 2, 3 are identified along with their description. 0 is ID Number, 1, is Title, etc.
I did convert the dataframe to a dictionary.
layout_dict = layout_df.to_dict('records')
[{'FIELD_NUMBER': '0',
'FIELD_DESCRIPTION': 'ID Number',
'FIELD_TYPE': 'String',
'COMMENTS': 'SURE Voter ID number'},
{'FIELD_NUMBER': '1',
'FIELD_DESCRIPTION': 'Title',
'FIELD_TYPE': 'String',
'COMMENTS': nan},
From the layout_df dataframe with the column names 0, 1, 2, 3 is there a way to programmatically assign the column names to the FIELD_NUMBER in the dictionary and pull the FIELD_DESCRIPTION into the column name.
Here is how the data appears in the norco_df dataframe
0 | 1 |
---|---|
123 | dr |
234 | mr |
Once the mapping occurs to the data dictionary, the column names in norco_df dataframe should be renamed to:
ID Number | Title |
---|---|
123 | dr |
234 | mr |
CodePudding user response:
You can create a dictionary of the same shape as what .rename()
takes from your layout_df
:
columns = layout_df.set_index('FIELD_NUMBER')['FIELD_DESCRIPTION'].to_dict()
norco_df = norco_df.rename(columns=columns)
print(norco_df)
Output:
ID Number Title
0 123 dr
1 234 mr
CodePudding user response:
Note that field numbers are string in layout_dict. The following list comprehension converts the column names
norco_df.columns = [item['FIELD_DESCRIPTION'] for old_col in norco_df.columns for item in layout_dict if item['FIELD_NUMBER'] == str(old_col)]
CodePudding user response:
Use the layout_df
DataFrame to set up a Series and reindex
:
df = pd.DataFrame(columns=list('ABC'), index=[0])
s = layout_df.astype({'FIELD_NUMBER': int}).set_index('FIELD_NUMBER')['FIELD_DESCRIPTION']
df.columns = s.reindex(range(df.shape[1])).to_list()
Output:
ID Number Title NaN
0 NaN NaN NaN