Pandas Separate categorical and numeric features from multiple data frames and store in a new data f

Time:11-24

I want to separate the categorical and numeric features in several data frames (df1, df2, df3 and df4, shown below) and store them in two new data frames named "Cont" and "Cat". I am looking for a process that loops over these data frames and produces the output described below. It should rely purely on the dtypes functionality of pandas to decide whether a column is categorical or numeric.

The input data frames look like:

df1:

Name1       Number1
ABC         123
DEF         234
XXX         456

df2:

Name2        Number2
ABCD         1232
DEFG         2342
XXXY         4562

df3:

Name3      Number3
AB         12
DE         23
XX         45

df4:

Name4      Number4
A          1
D          3
X          5

The output should look like:

Cat:

Name1      Name2    Name3    Name4
ABC        ABCD     AB       A  
DEF        DEFG     DE       D
XXX        XXXY     XX       X 

and similarly:

Cont:

Number1    Number2  Number3  Number4
123        1232     12       1
234        2342     23       3
456        4562     45       5

How can this be achieved?

CodePudding user response:

You can use pandas.DataFrame.select_dtypes to create the two dataframes.

Try this:

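import numpy as np
import pandas as pd

# Place the four frames side by side, then split the columns by dtype.
out = pd.concat([df1, df2, df3, df4], axis=1)

cat = out.select_dtypes(include="object")  # or include="category" for categorical dtype
cont = out.select_dtypes(include=np.number)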

# Output :

print(cat)
  Name1 Name2 Name3 Name4
0   ABC  ABCD    AB     A
1   DEF  DEFG    DE     D
2   XXX  XXXY    XX     X

print(cont)
   Number1  Number2  Number3  Number4
0      123     1232       12        1
1      234     2342       23        3
2      456     4562       45        5
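
Since the question asks for a process that loops over the data frames, here is a minimal sketch of a loop-based variant. The sample frames are rebuilt from the tables in the question, and the list names cat_parts and cont_parts are just illustrative. It assumes every non-numeric column should count as categorical, so it uses exclude="number" rather than include="object"; note that this would also pick up boolean or datetime columns if any were present.

```python
import pandas as pd

# Sample frames rebuilt from the tables in the question.
df1 = pd.DataFrame({"Name1": ["ABC", "DEF", "XXX"], "Number1": [123, 234, 456]})
df2 = pd.DataFrame({"Name2": ["ABCD", "DEFG", "XXXY"], "Number2": [1232, 2342, 4562]})
df3 = pd.DataFrame({"Name3": ["AB", "DE", "XX"], "Number3": [12, 23, 45]})
df4 = pd.DataFrame({"Name4": ["A", "D", "X"], "Number4": [1, 3, 5]})

cat_parts, cont_parts = [], []  # hypothetical helper lists
for df in [df1, df2, df3, df4]:
    # Numeric columns go to Cont; everything else is treated as categorical.
    cont_parts.append(df.select_dtypes(include="number"))
    cat_parts.append(df.select_dtypes(exclude="number"))

# Stitch the per-frame pieces together column-wise.
Cat = pd.concat(cat_parts, axis=1)
Cont = pd.concat(cont_parts, axis=1)

print(Cat)
print(Cont)
```

Both approaches line rows up by index, so they assume the frames share the same index (here the default 0, 1, 2).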