Home > Software design >  Most efficient way to transform this data using Pandas?
Most efficient way to transform this data using Pandas?

Time:05-07

I currently have several hundred .csv files in the format shown on the left below, and I need to transform them all into the format shown on the right. I tried to highlight the blocks of data to make it easier to see what I'm trying to do.

enter image description here

Is there an efficient way to do this using Pandas? I was trying to formulate something using df.iteritems() but couldn't think of a good way to do it.

CodePudding user response:

Given:

       Date-Time  L-A
0  5/1/2022 0:00  1.4
1  5/1/2022 0:05  1.4
2  5/2/2022 0:10  1.4

Doing:

name = df.columns[1]
df['x'] = name
df = df.reindex(columns=['x', 'Date-Time', name])
print(df.values)

Output:

[['L-A VLX' '5/1/2022 0:00' 1.4]
 ['L-A VLX' '5/1/2022 0:05' 1.4]
 ['L-A VLX' '5/2/2022 0:10' 1.4]]

CodePudding user response:

My beginner's level way.

Slicing by column index location, later adding them together using concat:

print(df) # initial data frame
  DateTime  Units DateTime  Units
0        a      1     a111     10
1        b      2     b222     20
2        c      3     c333     30

Slicing by column index location as the initial DF has duplicated headers:

df2 = df.iloc[: , [0, 1]].copy()
df3 = df.iloc[: , [2, 3]].copy()
# adding all back into new DF
df_result = pd.concat([df2,df3]).reset_index(drop=True)
print(df_result)

output:

  DateTime  Units
0        a      1
1        b      2
2        c      3
3     a111     10
4     b222     20
5     c333     30
  • Related