I am trying to read specific columns from an Excel file with pandas and convert them to a NumPy array so that I can manipulate individual elements within those columns.
Below is the code I am using:
cols= 'A, AO, BB, BC'
df= pd.read_excel(path_to_excel, sheet_name=None, usecols=cols)
df1= pd.DataFrame(df, index=[0])
df1.to_numpy()
My df has 124412 rows x 4 columns, but for df1 I get the following error: ValueError: Buffer has wrong number of dimensions (expected 1, got 2).
I realise it is kind of a stupid question, but any help is appreciated.
CodePudding user response:
Remove sheet_name=None from your code.
cols= 'A, AO, BB, BC'
df= pd.read_excel(path_to_excel, usecols=cols)
df1= pd.DataFrame(df, index=[0])
df1.to_numpy()
The default value for the sheet_name argument is 0 (the first worksheet), not None.
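Once the result is a single DataFrame, the goal from the question (editing individual values through a NumPy array) could look roughly like the sketch below. It assumes a small in-memory frame in place of the Excel file; the values are placeholders, not data from the question.

```python
import pandas as pd

# Placeholder data standing in for pd.read_excel(path_to_excel, usecols=cols).
df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "AO": [4.0, 5.0, 6.0]})

# 2-D array of shape (rows, columns); copy=True leaves df untouched.
arr = df.to_numpy(copy=True)

# Edit one element: row 0, second column ("AO").
arr[0, 1] = 99.0

# Put the edited values back into a DataFrame with the same labels.
df_edited = pd.DataFrame(arr, index=df.index, columns=df.columns)
print(df_edited)
```

Copying the array is a deliberate choice here so that the original DataFrame is not modified by accident.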
If you use sheet_name=None, pandas reads all worksheets and returns a dictionary with sheet names as keys and DataFrames as values. For example:
cols= 'A, B, C'
df = pd.read_excel(filename, usecols=cols)
df
Returns:
 | A | B | C |
---|---|---|---|
0 | 0 | 0 | foo1 |
1 | 1 | 1 | foo2 |
2 | 2 | 0 | foo3 |
3 | 3 | 1 | foo4 |
4 | 4 | 0 | foo5 |
5 | 0 | 0 | foo1 |
6 | 1 | 1 | foo2 |
7 | 2 | 0 | foo3 |
8 | 3 | 1 | foo4 |
9 | 4 | 0 | foo5 |
10 | 0 | 0 | foo1 |
11 | 1 | 1 | foo2 |
12 | 2 | 0 | foo3 |
13 | 3 | 1 | foo4 |
14 | 4 | 0 | foo5 |
While:
cols = 'A, B, C'
df = pd.read_excel(filename, sheet_name=None, usecols=cols)
Returns:
{'Sheet1': A B C
0 0 0 foo1
1 1 1 foo2
2 2 0 foo3
3 3 1 foo4
4 4 0 foo5
5 0 0 foo1
6 1 1 foo2
7 2 0 foo3
8 3 1 foo4
9 4 0 foo5
10 0 0 foo1
11 1 1 foo2
12 2 0 foo3
13 3 1 foo4
14 4 0 foo5}
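If all sheets are actually needed, the dictionary can still be turned into an array. The following is a minimal sketch, assuming a dictionary built in memory that mimics what sheet_name=None returns; the sheet names and values are made up for illustration.

```python
import pandas as pd

# Stand-in for pd.read_excel(filename, sheet_name=None, usecols=cols):
# a dict mapping sheet names to DataFrames (hypothetical names and values).
sheets = {
    "Sheet1": pd.DataFrame({"A": [0, 1, 2], "B": [0, 1, 0], "C": ["foo1", "foo2", "foo3"]}),
    "Sheet2": pd.DataFrame({"A": [3, 4], "B": [1, 0], "C": ["foo4", "foo5"]}),
}

# Option 1: pick a single sheet by name, then convert it.
arr_first = sheets["Sheet1"].to_numpy()

# Option 2: stack every sheet into one DataFrame, then convert it.
combined = pd.concat(list(sheets.values()), ignore_index=True)
arr_all = combined.to_numpy()

print(arr_first.shape)
print(arr_all.shape)
```

Option 2 only makes sense when every sheet has the same columns; otherwise pd.concat fills the gaps with NaN.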