Convert a DataFrame with two columns to a pandas Series

Time:10-24

I have a DataFrame with two columns:

data = train[['Age', "Pclass"]]

    Age Pclass
0   22.0    3
1   38.0    1
2   26.0    3
3   35.0    1
4   35.0    3
... ... ...
886 27.0    2
887 19.0    1
888 24.0    3
889 26.0    1
890 32.0    3

I want a pandas Series that looks like this:

Age       22.0
Pclass     3.0
Name: 0, dtype: float64
Age       38.0
Pclass     1.0
Name: 1, dtype: float64
Age       26.0
Pclass     3.0
Name: 2, dtype: float64
Age       35.0
Pclass     1.0
Name: 3, dtype: float64
Age       35.0
Pclass     3.0
...

How can I achieve this?

I tried several approaches:

  1. series = data.iloc[0]

But this gives me only the first row:

Age       22.0
Pclass     3.0
Name: 0, dtype: float64

  2. series = data.iloc[:]

But I get a DataFrame again

  3. series = pd.Series(data['Age'], data['Pclass'])

This is also not correct, because I get this result:

Pclass
3    35.0
1    38.0
3    35.0
1    38.0
3    35.0
     ... 
2    26.0
1    38.0
3    35.0
1    38.0
3    35.0
Name: Age, Length: 891, dtype: float64

Any other suggestions?

In other words, I want series[0] to give me back all the ages and series[1] all the Pclass values, with output like this:

38.0
26.0
35.0
...

CodePudding user response:

You can try the following (df here is the two-column DataFrame, called data in the question):

s = df.stack().droplevel(0)
print(s)
print(type(s))

OUTPUT:

Age       22.0
Pclass     3.0
Age       38.0
Pclass     1.0
Age       26.0
Pclass     3.0
Age       35.0
Pclass     1.0
Age       35.0
Pclass     3.0
dtype: float64
<class 'pandas.core.series.Series'>
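
A self-contained sketch of the same call, assuming df is a five-row sample with the question's two columns (the frame below is rebuilt by hand from the first rows shown in the question, not loaded from the real dataset):

```python
import pandas as pd

# Hand-built sample standing in for train[['Age', 'Pclass']] (assumption).
df = pd.DataFrame({"Age": [22.0, 38.0, 26.0, 35.0, 35.0],
                   "Pclass": [3, 1, 3, 1, 3]})

# stack() moves the column labels into an inner index level, producing one
# long Series indexed by (row label, column name).
# droplevel(0) then discards the row label, leaving only the column names.
s = df.stack().droplevel(0)

# The column names are now repeated index labels, so selecting by label
# returns every value that came from that column.
ages = s["Age"]
```

Because the integer Pclass column is combined with the float Age column, the stacked Series is float64, which is why Pclass prints as 3.0 rather than 3.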

As per the discussion in the comments, if you want one entry per column instead:

s = pd.Series(list(df.T.to_numpy()))
print(s)
print(type(s))

OUTPUT:

0    [22.0, 38.0, 26.0, 35.0, 35.0]
1         [3.0, 1.0, 3.0, 1.0, 3.0]
dtype: object
<class 'pandas.core.series.Series'>
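
A small variation on this, not part of the answer above: passing index=df.columns labels each entry with its column name, so the arrays can be looked up by name instead of by position. This is a sketch on the same hand-built five-row sample, which is an assumption rather than the real dataset:

```python
import pandas as pd

# Hand-built sample standing in for train[['Age', 'Pclass']] (assumption).
df = pd.DataFrame({"Age": [22.0, 38.0, 26.0, 35.0, 35.0],
                   "Pclass": [3, 1, 3, 1, 3]})

# df.T.to_numpy() is a 2 x 5 array; list() splits it into one array per column.
# Each array becomes a single element of an object-dtype Series.
s = pd.Series(list(df.T.to_numpy()), index=df.columns)

ages = s["Age"]        # NumPy array holding every age
classes = s["Pclass"]  # NumPy array holding every Pclass (as floats)
```

Note that to_numpy() on a frame mixing float and integer columns yields a single float array, so the Pclass values come back as floats here too.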

CodePudding user response:

You can try collecting each row as its own Series in a list:

series = []
for i in range(len(data)):
    series.append(data.iloc[i])

OUTPUT:

Age       22
Pclass     1
Name: 0, dtype: int64
Age       48
Pclass     2
Name: 1, dtype: int64
Age       12
Pclass     3
Name: 2, dtype: int64
Age       45
Pclass     1
Name: 3, dtype: int64

If you want series[0] to give back all the ages and series[1] all the Pclass values, you can simply use data['Age'] for the ages and data['Pclass'] for the Pclass values.
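
Both ideas can be sketched side by side. The frame below is an assumption, rebuilt from the values in the output above. The names rows and columns are only illustrative.

```python
import pandas as pd

# Sample rebuilt from the output shown above (assumption, not the real data).
data = pd.DataFrame({"Age": [22, 48, 12, 45], "Pclass": [1, 2, 3, 1]})

# One Series per row: the same result as the loop over data.iloc[i].
rows = [row for _, row in data.iterrows()]

# One Series per column: columns[0] holds the ages, columns[1] the Pclass values.
columns = [data[name] for name in data.columns]
```

With this layout, columns[0] and columns[1] behave like the series[0] and series[1] described in the question, while each element of rows is a single-row Series such as the ones printed above.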
