I am reading multiple csv files into pandas and concatenating them into one frame. Each csv file shape is 8x256. I am using the following code:
path = r'may12_fm'
all_files = glob.glob(os.path.join(path, '*.csv'))
df_may= pd.concat((pd.read_csv(f, header=None) for f in all_files ), ignore_index=True, axis=1,)
I want to keep the name of each file as a higher level header, and also the lower header numbering to be from 0 to 255 for all concatenated frames, something like this:
--- m_0 ----------------- m_1 -------------- m_3
--- 0 1 2 3 4 .... 255 ---- 0, 1, 2, ...255------ 0, 1, 2, .....255
0
1
2
3
4
...
I appreciate helping me in this. Thanks
CodePudding user response:
Since you know the files names used, and (if) you know the number of columns per file, maybe you could just do it after the files are loaded.
Example frame
0 1 2 0 1 2
0 1 1 1 1 1 1
1 2 2 2 2 2 2
2 3 3 3 3 3 3
3 4 4 4 4 4 4
What this answer assumes you know
all_files = ['m_0','m_1']
cols_per_file = 3
Then you could build a multi-index
idx = pd.MultiIndex.from_arrays([np.repeat(all_files, cols_per_file), df.columns])
print(idx)
MultiIndex([('m_0', 0),
('m_0', 1),
('m_0', 2),
('m_1', 0),
('m_1', 1),
('m_1', 2)],
)
And then assign back to the frame
df.columns = idx
print(df)
Result
m_0 m_1
0 1 2 0 1 2
0 1 1 1 1 1 1
1 2 2 2 2 2 2
2 3 3 3 3 3 3
3 4 4 4 4 4 4