Given df,
df = pd.DataFrame({'col1':np.arange(6), 'col2':[*'abcdef']})
   col1 col2
0     0    a
1     1    b
2     2    c
3     3    d
4     4    e
5     5    f
Then when selecting a single column, using:
df['col1']
# returns a pd.Series
0 0
1 1
2 2
3 3
4 4
5 5
Name: col1, dtype: int32
Likewise when selecting a single row,
df.loc[0]
# returns a pd.Series
col1 0
col2 a
Name: 0, dtype: object
How can we force a single column or single row selection to return pd.DataFrame?
CodePudding user response:
Getting a single row or column as a pd.DataFrame or a pd.Series
There are times you need to pass a dataframe column or a dataframe row as a series and other times you'd like to view that row or column as a dataframe. I am going to show you a few tricks using square brackets, [], and double square brackets, [[]], along with reindex and squeeze.
df[['col1']]
# Using double square brackets returns a pd.DataFrame
   col1
0     0
1     1
2     2
3     3
4     4
5     5
# Also, using pd.DataFrame.reindex we can return a single-column dataframe
df.reindex(['col1'], axis=1)
Now, let's go the other way and turn that output back into a series:
# Let's squeeze to get pd.Series from this dataframe
df.reindex(['col1'], axis=1).squeeze()
0 0
1 1
2 2
3 3
4 4
5 5
Name: col1, dtype: int32
And, likewise with rows:
df.loc[[0]]
# Using double square brackets returns a single row dataframe
   col1 col2
0     0    a
# Also using reindex
df.reindex([0])
# Let's squeeze to get pd.Series from this dataframe
df.reindex([0]).squeeze()
col1 0
col2 a
Name: 0, dtype: object
The advantage of using pd.DataFrame.reindex over pd.DataFrame.loc is that it handles columns or index labels that may or may not be present in your dataframe. With .loc, you get a KeyError if the column is not present. With reindex, you do not get an error; the result is filled with NaN, which allows the code to continue executing.
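As a minimal sketch of that difference, the snippet below requests a column that does not exist. The label 'col3' and the variable names are made up for illustration; the dataframe is the one from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': np.arange(6), 'col2': [*'abcdef']})

# 'col3' is a hypothetical label that is not in df.
# .loc raises KeyError when the requested column is absent.
try:
    df.loc[:, ['col3']]
    loc_raised = False
except KeyError:
    loc_raised = True

# reindex returns a dataframe with the missing column filled with NaN.
missing = df.reindex(['col3'], axis=1)

print(loc_raised)      # whether .loc raised KeyError
print(missing.shape)   # one column, same number of rows as df
```

Whether silently getting NaN is desirable depends on the situation; if a missing column indicates a bug upstream, the KeyError from .loc may be the safer behavior.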
Using pd.DataFrame.squeeze allows you to convert that single-column dataframe to a pd.Series without typing in the column header.
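One thing to watch for with squeeze: by default it collapses every axis of length 1, so a dataframe with one row and one column becomes a scalar rather than a series. The sketch below illustrates this and uses the axis argument to keep a series; the variable names are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': np.arange(6), 'col2': [*'abcdef']})

# Six rows, one column: squeeze collapses the column axis and gives a Series.
as_series = df[['col1']].squeeze()

# One row, one column: both axes have length 1, so squeeze gives a scalar.
one_cell = df.loc[[0], ['col1']]
as_scalar = one_cell.squeeze()

# Restricting squeeze to the column axis keeps a length-1 Series.
still_series = one_cell.squeeze(axis='columns')

print(type(as_series).__name__)
print(type(still_series).__name__, len(still_series))
```

Passing axis explicitly is a reasonable habit when the number of rows is not known in advance.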
CodePudding user response:
You can also use to_frame:
# One column
>>> df['col1'].to_frame()
   col1
0     0
1     1
2     2
3     3
4     4
5     5
# One row
>>> df.loc[0].to_frame().T
col1 col2
0 0 a
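A caveat with the one-row version: df.loc[0] is a series of dtype object because the row mixes integers and strings, so the transposed dataframe has object columns, whereas df.loc[[0]] keeps the original column dtypes. The sketch below compares the two and, as one possible remedy, calls infer_objects to recover the numeric dtype; the variable names are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': np.arange(6), 'col2': [*'abcdef']})

# Row as a Series (dtype object), then back to a one-row dataframe.
via_to_frame = df.loc[0].to_frame().T

# Row selected with a list of labels: column dtypes are preserved.
via_list = df.loc[[0]]

# infer_objects attempts to convert object columns to better dtypes.
recovered = via_to_frame.infer_objects()

print(via_to_frame.dtypes)
print(via_list.dtypes)
print(recovered.dtypes)
```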