I have an excel file with 10 sheets, every one of these 10 sheets has the name of the respondent, like> sheet1-Andres
, sheet2-Paul
and so on...
The columns's header are the same for all sheets also the questions. The only difference is how each respondent fill Answers A
to Answers Z
with zero and ones
Question Answer A Answer B Answer C
question 1 1 0 0
question 2 0 1 1
question 3 0 0 1
As the excel file has 10 sheets, I am looking for a way to open this xlsx
file as a dataframe using Pandas but as one single dataframe in a way that I can after load it make some calculations, like:
How many `1` and `zeros` the Answer A got considering all the 10 respondents.
What's the best approach here?
CodePudding user response:
you can pass None
to the sheet_name
argument and then reset the index after a concat.
df = pd.concat(
pd.read_excel('your_file', sheet_name=None)
).reset_index(0).rename(columns={'level_0':'names'})
df['names'] = df1["names"].str.replace("sheet\d{1}-", "", regex=True).str.strip()
print(df)
names Question Answer 1 Answer 2 Answer 3
0 Andreas question-1 1.0 0.0 1.0
1 Andreas question-2 0.0 1.0 0.0
2 Andreas question-3 1.0 1.0 1.0
0 Paul question-1 1.0 0.0 1.0
1 Paul question-2 0.0 1.0 0.0
2 Paul question-3 1.0 1.0 1.0