What I am looking for, but can't seem to get right is essentially this: I'll read in a csv file to build a df. With the output being a table that I can manipulate or edit. Ultimately, I'm looking for a method to rename the column header placeholders with the first "string" in the column.
Currently:
import pandas as pd
df = pd.read_csv("NameOfCSV.csv")
df
Example Output:
m1_name | m1_delta | m2_name | m2_delta | m3_name | m3_delta |
---|---|---|---|---|---|
CO2 | 1 | NMHC | 2 | CH4 | 1 |
CO2 | 2 | NMHC | 1 | CH4 | 2 |
CO2 | 1 | NMHC | 2 | CH4 | 1 |
CO2 | 2 | NMHC | 1 | CH4 | 2 |
What I am trying to understand how to do is create a generic program that will grab the gas name within the column and rename the header "m*_name" and any other header corresponding to "m*_blah" with the gas name of that "m*_name" column.
Example Desired Output where the headers reflect the gas name:
**CO2_name | CO2_delta | NMHC_name | NMHC_delta | CH4_name | CH4_delta** |
---|---|---|---|---|---|
CO2 | 1 | NMHC | 2 | CH4 | 1 |
CO2 | 2 | NMHC | 1 | CH4 | 2 |
I've tried playing around with the several functions (primarily .rename()), trying to find some examples of similar problems, and digging for some documentation but have been unsuccessful with coming up with a solution. Beyond the couple column headers here in the example there are about a dozen per gas name, so I was also trying to build a loop structure to find the headers with the corresponding m_number,"m*_otherHeader", to populate those headers also. The datasets I'm working with are dynamic and the positions of these original columns from the csv change position as well. Some help would be appreciated, and/or pointing me in the direction of some examples or the proper documentation to read through would be great!
CodePudding user response:
df = df.drop_duplicates().set_axis([df[x '_name'].mode()[0] '_' y for x, y in df.columns.str.split('_')], axis=1)
Output:
>>> df
CO2_name CO2_delta NMHC_name NMHC_delta CH4_name CH4_delta
0 CO2 1 NMHC 2 CH4 1
1 CO2 2 NMHC 1 CH4 2