Split the single column to 4 different columns in Dataframe

Time:11-23

I need to split a single column of a dataframe into 4 different columns. I tried a few approaches, but none of them worked.

DATA1:

 Dump               
12525 2 153898 Winch
24798 1 147654 Gear
65116 4        Screw 
46456 1        Rowing
46563 5        Nut       

Expected1:

 Item  Qty  Part_no  Description             
12525  2    153898   Winch
24798  1    147654   Gear
65116  4             Screw 
46456  1             Rowing
46563  5             Nut       

DATA2:

 Dump               
12525 2 153898 Winch Gear
24798 1 147654 Gear nuts
65116 X        Screw bolts
46456 1        Rowing rings
46563 X        Nut       

Expected2:

 Item  Qty  Part_no  Description             
12525  2    153898   Winch Gear
24798  1    147654   Gear nuts
65116  X             Screw bolts
46456  1             Rowing rings
46563  X             Nut       

I tried the code below:

data_df[['Item','Qty','Part_no','Description']] = data_df["Dump"].str.split(" ", 3, expand=True)

and got output like this:

 Item  Qty  Part_no  Description             
12525  2    153898   Winch
24798  1    147654   Gear
65116  4    Screw 
46456  1    Rowing
46563  5    Nut       

Any suggestions on how I can fix this?

CodePudding user response:

Use str.extract:

data_df[['Item','Qty','Part_no','Description']] = \
data_df['Dump'].str.extract(r'(\d+)\s+(\d+)\s+(\d*)\s*(\w+)')

Output:

                    Dump   Item Qty Part_no Description
0   12525 2 153898 Winch  12525   2  153898       Winch
1    24798 1 147654 Gear  24798   1  147654        Gear
2   65116 4        Screw  65116   4               Screw
3  46456 1        Rowing  46456   1              Rowing
4     46563 5        Nut  46563   5                 Nut
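
The pattern above covers DATA1. For DATA2, where Qty can be a non-digit such as X and Description can contain several words, a looser pattern may work. The following is only a sketch under some assumptions: the sample frame is rebuilt by hand from the question, Qty never contains whitespace, and a Description never starts with digits (otherwise those digits would be captured as Part_no). Named groups are used so that str.extract labels the new columns directly.

```python
import pandas as pd

# Sample frame rebuilt by hand from DATA2 in the question (an assumption
# about how the real data looks, including the trailing spaces).
data_df = pd.DataFrame({
    "Dump": [
        "12525 2 153898 Winch Gear",
        "24798 1 147654 Gear nuts",
        "65116 X        Screw bolts",
        "46456 1        Rowing rings",
        "46563 X        Nut       ",
    ]
})

# Named groups become the column names of the extracted frame.
pattern = (
    r"^(?P<Item>\d+)\s+"      # leading item number
    r"(?P<Qty>\S+)\s+"        # any non-whitespace token, e.g. 2 or X
    r"(?P<Part_no>\d*)\s*"    # optional part number (may be empty)
    r"(?P<Description>.*)$"   # the rest of the line, spaces included
)

# Strip surrounding whitespace first so Description has no trailing spaces.
parts = data_df["Dump"].str.strip().str.extract(pattern)
data_df = data_df.join(parts)

print(data_df[["Item", "Qty", "Part_no", "Description"]])
```

Rows without a part number get an empty string in Part_no rather than NaN, because the group still matches (with zero characters).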