I have this dataframe:
01100MS,02200MS,02500MS,03100MS,22
626323,616720,616288,611860,622375
5188431,5181393,5173583,5165895,5152605
1915,1499,1310,1235,1907
1,4.1,4.41,4.441,4.4441
2,4.2,4.42,4.442,4.4442
3,4.3,4.43,4.443,4.4443
4,4.4,4.44,4.444,4.4444
5,4.5,4.45,4.445,4.4445
6,4.6,4.46,4.446,4.4446
7,4.7,4.47,4.447,4.4447
8,4.8,4.48,4.448,4.4448
9,4.9,4.49,4.449,4.4449
10,5,4.5,4.45,4.445
11,5.1,4.51,4.451,4.4451
I would like to have multiple headers. According to this post, I have done this:
dfr = pd.read_csv(file_input,sep=',',header=None,skiprows=0)
cols = tuple(zip(dfr.iloc[0], (dfr.iloc[1]).apply(lambda x: x[1:-1])))
However, I get an error:
TypeError: 'float' object is not subscriptable
The problem, I suppose, is that 22 in the header is an integer. Indeed, if I replace 22 with A22, it works.
Because I have to work with many large dataframes, I cannot do it by hand. As a consequence, I have tried this solution:
dfr.iloc[0] = dfr.iloc[0].apply(str)
but it does not seem to work.
Do you have some suggestions?
CodePudding user response:
apply(lambda x: x[1:-1])
removes the first and last character of each value. This was needed in the other post you quote because the headers there had the format [col1], but in your case you want the same values as in the file.
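The error comes from the last column: it contains only numbers, so pandas reads it as float, and a float cannot be sliced with x[1:-1]. That is also why converting row 0 to str does not help, since the apply runs on row 1. So just remove the apply function and then you can build the MultiIndex.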
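A minimal sketch of that approach, assuming the first two rows of the file are meant to be the two header levels. The inline csv_text (a shortened copy of the data above), the dtype=str option, and the name data are choices made for this illustration, not part of the original code. Reading everything as text keeps the labels in the last column as '22' and '622375' instead of floats.

```python
import io
import pandas as pd

# Shortened copy of the data from the question (stands in for file_input)
csv_text = """01100MS,02200MS,02500MS,03100MS,22
626323,616720,616288,611860,622375
5188431,5181393,5173583,5165895,5152605
1915,1499,1310,1235,1907
1,4.1,4.41,4.441,4.4441
2,4.2,4.42,4.442,4.4442
"""

# dtype=str keeps every cell as text, so the all-numeric last column
# is not converted to float while the header rows are still in the data
dfr = pd.read_csv(io.StringIO(csv_text), sep=',', header=None, dtype=str)

# Pair row 0 with row 1, without slicing, to get the two header levels
cols = pd.MultiIndex.from_tuples(list(zip(dfr.iloc[0], dfr.iloc[1])))

# Drop the two header rows, attach the MultiIndex, and restore numeric values
data = dfr.iloc[2:].reset_index(drop=True)
data.columns = cols
data = data.astype(float)

print(data.columns.tolist())
```

pandas can also parse both header rows directly with read_csv(file_input, header=[0, 1]), which may be simpler if no further processing of the header rows is needed.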