How to return a single column dataframe or single row dataframe as a dataframe or a series?

Time:09-24

Given df,

import numpy as np
import pandas as pd

df = pd.DataFrame({'col1':np.arange(6), 'col2':[*'abcdef']})

   col1 col2
0     0    a
1     1    b
2     2    c
3     3    d
4     4    e
5     5    f

Then when selecting a single column, using:

df['col1']
# returns a pd.Series

0    0
1    1
2    2
3    3
4    4
5    5
Name: col1, dtype: int32

Likewise when selecting a single row,

df.loc[0]
# returns a pd.Series

col1    0
col2    a
Name: 0, dtype: object

How can we force a single column or single row selection to return pd.DataFrame?

CodePudding user response:

Getting a single row or column as a pd.DataFrame or a pd.Series

There are times you need to pass a dataframe column or a dataframe row as a series and other times you'd like to view that row or column as a dataframe. I am going to show you a few tricks using square brackets, [], and double square brackets, [[]], along with reindex and squeeze.

df[['col1']]
# Using double square brackets returns a pd.DataFrame

   col1
0     0
1     1
2     2
3     3
4     4
5     5

# Also, using pd.DataFrame.reindex we can return a single column dataframe
df.reindex(['col1'], axis=1)

Now, let's go the other way and turn that single column dataframe back into a series:

# Let's squeeze to get pd.Series from this dataframe
df.reindex(['col1'], axis=1).squeeze()

0    0
1    1
2    2
3    3
4    4
5    5
Name: col1, dtype: int32

And, likewise with rows:

df.loc[[0]]
# Using double square brackets returns a single row dataframe

   col1 col2
0     0    a

# Also using reindex
df.reindex([0])

Let's squeeze to get pd.Series from this dataframe

df.reindex([0]).squeeze()

col1    0
col2    a
Name: 0, dtype: object

The advantage of using pd.DataFrame.reindex over pd.DataFrame.loc is in handling column or index labels that may or may not be present in your dataframe. Using .loc with a list of labels, you will get a KeyError if a label is not present. With reindex, you will not get an error; the missing row or column is filled with NaN, allowing the code to continue executing.
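To make the difference concrete, here is a small sketch using the same df as in the question. The labels 'col3' and 99 are hypothetical, picked only because they are not in the dataframe.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': np.arange(6), 'col2': [*'abcdef']})

# Selecting a missing column with a list of labels raises a KeyError
try:
    df[['col3']]
    raised = False
except KeyError:
    raised = True

# reindex returns a dataframe of the requested shape, filled with NaN
missing_col = df.reindex(['col3'], axis=1)   # 6 rows, 1 column, all NaN
missing_row = df.reindex([99])               # 1 row, 2 columns, all NaN
```

Keep in mind that the NaN results are silent, so it may be worth checking for them with isna() before using the values further down.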

Using pd.DataFrame.squeeze allows you to convert that single column dataframe to a pd.Series without typing in the column header.
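One thing to watch with squeeze: called without arguments it squeezes every axis of length 1, so a dataframe with one row and one column collapses to a scalar rather than a series. Passing the axis argument restricts it to one axis. A minimal sketch with the same df (the variable names are only for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': np.arange(6), 'col2': [*'abcdef']})

# A dataframe with exactly one row and one column
one_cell = df.loc[[0], ['col1']]

# No axis given: both axes have length 1, so the result is a scalar
as_scalar = one_cell.squeeze()

# axis=1 squeezes only the columns, so the result is still a pd.Series
as_series = one_cell.squeeze(axis=1)
```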

CodePudding user response:

You can also use to_frame:

# One column
>>> df['col1'].to_frame()
   col1
0     0
1     1
2     2
3     3
4     4
5     5

# One row
>>> df.loc[0].to_frame().T
  col1 col2
0    0    a
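A caveat for the one row case, shown as a sketch with the same df: df.loc[0] is an object dtype series because the row mixes an integer and a string, so after to_frame().T every column is object dtype. Selecting with df.loc[[0]] keeps the original column dtypes, and infer_objects() is one way to try to recover them after the transpose.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': np.arange(6), 'col2': [*'abcdef']})

# Row -> Series -> transposed frame: col1 ends up as object dtype
row_via_t = df.loc[0].to_frame().T

# List selection keeps the integer dtype of col1
row_via_list = df.loc[[0]]

# infer_objects() attempts to convert object columns back to better dtypes
row_restored = row_via_t.infer_objects()
```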