I have a pandas dataframe that looks like this:
col1
/bill/works/out
/daniel/lifts/weights
/filip/drives/abroad
I want to extract the word between the first two forward slashes and store it in a separate column, for example:
col1 names
/bill/works/out bill
/daniel/lifts/weights daniel
/filip/drives/abroad filip
I have tried:
df[df["col1"].str.contains("bill")]
But this only selected the first row in col1 and not the word.
CodePudding user response:
You could use Python's str.split method, shown here on an example string:
>>> data = "/bill/works/out"
>>> split = data.split("/")
>>> split
['', 'bill', 'works', 'out']
>>> name = split[1]
>>> name
'bill'
Index 0 is the empty string before the leading slash, so the name sits at index 1.
To apply this to your dataframe, you could do the following:
df["names"] = df["col1"].str.split("/").str[1]
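As a minimal, self-contained sketch of the line above, assuming a dataframe built from the three sample rows in the question (the construction of df here is my own reconstruction, not the asker's actual code):

```python
import pandas as pd

# Sample data reconstructed from the question (an assumption about how df was built).
df = pd.DataFrame(
    {"col1": ["/bill/works/out", "/daniel/lifts/weights", "/filip/drives/abroad"]}
)

# Split each path on "/". Because every value starts with "/", element 0 is an
# empty string and element 1 is the first word.
df["names"] = df["col1"].str.split("/").str[1]

print(df)
```

The .str accessor is used twice: once to split every string in the column, and once to pick element 1 out of each resulting list.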
CodePudding user response:
Try with str.split:
df["names"] = df["col1"].str.split("/").str[1]
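If some values might not start with a slash, a regex-based variant may be worth considering. This is a sketch of a different technique (str.extract rather than str.split), and the sample dataframe is again an assumption based on the question:

```python
import pandas as pd

# Sample data reconstructed from the question (an assumption).
df = pd.DataFrame(
    {"col1": ["/bill/works/out", "/daniel/lifts/weights", "/filip/drives/abroad"]}
)

# Capture everything between the leading "/" and the next "/".
# expand=False returns a Series for a single capture group; rows that do not
# match the pattern become NaN instead of raising an error.
df["names"] = df["col1"].str.extract(r"^/([^/]+)", expand=False)

print(df)
```

For the sample rows this produces the same names column as the str.split version.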