I need to split a single column of a dataframe into 4 different columns. I tried a few approaches, but none of them worked.
DATA1:
Dump
12525 2 153898 Winch
24798 1 147654 Gear
65116 4 Screw
46456 1 Rowing
46563 5 Nut
Expected1:
Item   Qty  Part_no  Description
12525  2    153898   Winch
24798  1    147654   Gear
65116  4             Screw
46456  1             Rowing
46563  5             Nut
DATA2:
Dump
12525 2 153898 Winch Gear
24798 1 147654 Gear nuts
65116 X Screw bolts
46456 1 Rowing rings
46563 X Nut
Expected2:
Item   Qty  Part_no  Description
12525  2    153898   Winch Gear
24798  1    147654   Gear nuts
65116  X             Screw bolts
46456  1             Rowing rings
46563  X             Nut
I tried the code below:
data_df[['Item','Qty','Part_no','Description']] = data_df["Dump"].str.split(" ", 3, expand=True)
and got output like this:
Item Qty Part_no Description
12525 2 153898 Winch
24798 1 147654 Gear
65116 4 Screw
46456 1 Rowing
46563 5 Nut
Any suggestions on how I can fix this?
CodePudding user response:
Use str.extract:
data_df[['Item','Qty','Part_no','Description']] = \
    data_df['Dump'].str.extract(r'(\d+)\s+(\d+)\s+(\d*)\s*(\w+)')
Output:
                   Dump   Item  Qty  Part_no  Description
0  12525 2 153898 Winch  12525    2   153898        Winch
1   24798 1 147654 Gear  24798    1   147654         Gear
2         65116 4 Screw  65116    4                 Screw
3        46456 1 Rowing  46456    1                Rowing
4           46563 5 Nut  46563    5                   Nut
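
The pattern above only accepts digits for Qty and captures a single word for Description, so it covers DATA1 but not DATA2, where Qty can be "X" and the description can contain several words. A minimal sketch of a looser variant follows. The pattern and the split_dump helper are my own assumptions, not part of the answer above, and they assume that Part_no, when present, is a run of digits followed by whitespace.

```python
import pandas as pd

# Hypothetical pattern (an assumption, not from the original answer):
#   (\d+)          Item: leading digits
#   (\S+)          Qty: any non-space token, so "X" is accepted
#   (?:(\d+)\s+)?  Part_no: optional digits that must be followed by whitespace
#   (.*)           Description: the rest of the line, spaces included
PATTERN = r'^(\d+)\s+(\S+)\s+(?:(\d+)\s+)?(.*)$'


def split_dump(df):
    """Return df with Item, Qty, Part_no and Description columns added."""
    parts = df['Dump'].str.extract(PATTERN)
    parts.columns = ['Item', 'Qty', 'Part_no', 'Description']
    # An unmatched optional group comes back as NaN; show it as an empty string.
    parts['Part_no'] = parts['Part_no'].fillna('')
    return df.join(parts)


# The DATA2 rows from the question.
data_df = pd.DataFrame({'Dump': [
    '12525 2 153898 Winch Gear',
    '24798 1 147654 Gear nuts',
    '65116 X Screw bolts',
    '46456 1 Rowing rings',
    '46563 X Nut',
]})

result = split_dump(data_df)
print(result)
```

Requiring whitespace after the optional Part_no group keeps a description that merely starts with a digit from being split into Part_no. All extracted columns are strings, so convert Item or Part_no with pd.to_numeric if you need numbers.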