Remove (and replace) default column names after glob

Time:02-25

I am trying to glob a couple of performance data csv files from Open Hardware Monitor.

I can successfully glob and read the csv files with the following code:

import glob
import os
import pandas as pd
os.chdir('C:\\Users\\tolga\\Downloads\\Programs\\OpenHardwareMonitor\\Perf_Data')
perf_logs_list = glob.glob('*.csv')
perf_logs = pd.concat(map(pd.read_csv, perf_logs_list))

In the resulting DataFrame, the column names come out as:

['Unnamed: 0', '/intelcpu/0/load/1', '/intelcpu/0/load/2', '/intelcpu/0/load/3', '/intelcpu/0/load/4', '/intelcpu/0/load/0', '/intelcpu/0/temperature/0', '/intelcpu/0/temperature/1', '/intelcpu/0/temperature/2', '/intelcpu/0/temperature/3', '/intelcpu/0/temperature/4', '/intelcpu/0/clock/1', '/intelcpu/0/clock/2', '/intelcpu/0/clock/3', '/intelcpu/0/clock/4', '/intelcpu/0/power/0', '/intelcpu/0/power/1', '/intelcpu/0/power/2', '/intelcpu/0/power/3', '/intelcpu/0/clock/0', '/ram/load/0', '/ram/data/0', '/ram/data/1', '/hdd/0/load/0', '/hdd/1/load/0']

However, the first row, iloc[0], has more meaningful names:

['Time', 'CPU Core #1', 'CPU Core #2', 'CPU Core #3', 'CPU Core #4', 'CPU Total', 'CPU Core #1', 'CPU Core #2', 'CPU Core #3', 'CPU Core #4', 'CPU Package', 'CPU Core #1', 'CPU Core #2', 'CPU Core #3', 'CPU Core #4', 'CPU Package', 'CPU Cores', 'CPU Graphics', 'CPU DRAM', 'Bus Speed', 'Memory', 'Used Memory', 'Available Memory', 'Used Space', 'Used Space']

In my DataFrame, I'd like to use the meaningful names from iloc[0] as the column names. However, I could not find a way to drop the default column names that pandas assigns.

I tried resetting the index, but that only affects the row index, not the column names. Then I tried removing the row that contains the string 'load', but since that row has already been promoted to the column names, this attempt was unsuccessful too.

Can anyone point me in the right direction?

Many thanks!

tbalci


Further details if needed:

When I had a detailed look in the csv file using csvlook, here is what I got:

$ csvlook OpenHardwareMonitorLog-2020-05-18.csv | head -n 1
/usr/lib/python3/dist-packages/agate/utils.py:275: UnnamedColumnWarning: Column 0 has no name. Using "a".
| a                   | /intelcpu/0/load/1 | /intelcpu/0/load/2 | /intelcpu/0/load/3 | /intelcpu/0/load/4 | /intelcpu/0/load/0 | /intelcpu/0/temperature/0 | /intelcpu/0/temperature/1 | /intelcpu/0/temperature/2 | /intelcpu/0/temperature/3 | /intelcpu/0/temperature/4 | /intelcpu/0/clock/1 | /intelcpu/0/clock/2 | /intelcpu/0/clock/3 | /intelcpu/0/clock/4 | /intelcpu/0/power/0 | /intelcpu/0/power/1 | /intelcpu/0/power/2 | /intelcpu/0/power/3 | /intelcpu/0/clock/0 | /ram/load/0 | /ram/data/0 | /ram/data/1      | /hdd/0/load/0 | /hdd/1/load/0 |

That is, the csv file's first value is blank.

I tried slicing, without success. The following basic commands and their outputs seemed a bit strange:

perf_logs.iloc[0:2]

    Unnamed: 0  /intelcpu/0/load/1  /intelcpu/0/load/2  /intelcpu/0/load/3  /intelcpu/0/load/4  /intelcpu/0/load/0  /intelcpu/0/temperature/0   /intelcpu/0/temperature/1   /intelcpu/0/temperature/2   /intelcpu/0/temperature/3   ... /intelcpu/0/power/1 /intelcpu/0/power/2 /intelcpu/0/power/3 /intelcpu/0/clock/0 /ram/load/0 /ram/data/0 /ram/data/1 /hdd/0/load/0   /hdd/1/load/0   Unnamed: 0.1
0   Time    CPU Core #1 CPU Core #2 CPU Core #3 CPU Core #4 CPU Total   CPU Core #1 CPU Core #2 CPU Core #3 CPU Core #4 ... CPU Cores   CPU Graphics    CPU DRAM    Bus Speed   Memory  Used Memory Available Memory    Used Space  Used Space  NaN
1   05/18/2020 14:45:42 10.15625    7.8125  5.4567337   4.651439    7.0192337   59  58  60  58  ... 2.59845972  0.02348409  0.6255529   99.60028    42.48093    6.719555    9.098259    24.2333984  42.1686363  NaN

perf_logs.iloc[0]

Unnamed: 0                               Time
/intelcpu/0/load/1                CPU Core #1
/intelcpu/0/load/2                CPU Core #2
/intelcpu/0/load/3                CPU Core #3
/intelcpu/0/load/4                CPU Core #4
/intelcpu/0/load/0                  CPU Total
/intelcpu/0/temperature/0         CPU Core #1
/intelcpu/0/temperature/1         CPU Core #2
/intelcpu/0/temperature/2         CPU Core #3
/intelcpu/0/temperature/3         CPU Core #4
/intelcpu/0/temperature/4         CPU Package
/intelcpu/0/clock/1               CPU Core #1
/intelcpu/0/clock/2               CPU Core #2
/intelcpu/0/clock/3               CPU Core #3
/intelcpu/0/clock/4               CPU Core #4
/intelcpu/0/power/0               CPU Package
/intelcpu/0/power/1                 CPU Cores
/intelcpu/0/power/2              CPU Graphics
/intelcpu/0/power/3                  CPU DRAM
/intelcpu/0/clock/0                 Bus Speed
/ram/load/0                            Memory
/ram/data/0                       Used Memory
/ram/data/1                  Available Memory
/hdd/0/load/0                      Used Space
/hdd/1/load/0                      Used Space
Unnamed: 0.1                              NaN
Name: 0, dtype: object

I am seriously confused by this csv file, and I don't know whether that is relevant to what I am trying to achieve.

CodePudding user response:

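IIUC, try passing skiprows=1 to pd.read_csv, so that the line of sensor paths is skipped and the next line ('Time', 'CPU Core #1', ...) is used as the header: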

perf_logs = pd.concat([pd.read_csv(filename, skiprows=1)
                           for filename in perf_logs_list])
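
To see the effect without the real log files, here is a minimal sketch that uses two small in-memory samples. The sample strings and the names sample_a and sample_b are made up for illustration; they only mimic the two-line header layout shown in the question. Note that the labels repeat (for example 'CPU Core #1' appears for load, temperature and clock), so pandas makes them unique by appending .1, .2 and so on.

```python
import io

import pandas as pd

# Hypothetical samples mimicking the log layout: line 1 holds the sensor
# paths (with a blank first field), line 2 holds the readable labels.
sample_a = (
    ",/intelcpu/0/load/1,/intelcpu/0/temperature/1\n"
    "Time,CPU Core #1,CPU Core #1\n"
    "05/18/2020 14:45:42,10.15625,58\n"
    "05/18/2020 14:45:43,7.8125,59\n"
)
sample_b = (
    ",/intelcpu/0/load/1,/intelcpu/0/temperature/1\n"
    "Time,CPU Core #1,CPU Core #1\n"
    "05/19/2020 09:00:00,5.5,57\n"
)

# skiprows=1 drops the sensor-path line, so the label line becomes the header.
frames = [pd.read_csv(io.StringIO(text), skiprows=1)
          for text in (sample_a, sample_b)]

# ignore_index=True gives the combined frame one continuous row index
# instead of restarting at 0 for every file.
perf_logs = pd.concat(frames, ignore_index=True)

print(list(perf_logs.columns))
# ['Time', 'CPU Core #1', 'CPU Core #1.1']
```

With the real files, the same call works on the names in perf_logs_list instead of io.StringIO objects. If the .1 suffixes are too cryptic, the columns can be renamed afterwards with perf_logs.rename(columns=...).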