Here's the dataset
Id Websites
1 facebook.com
2 linked.in
3 stackoverflow.com
4 harvard.edu
5 ugm.ac.id
Here's my expected output
Id Name
1 facebook
2 linked
3 stackoverflow
4 harvard
5 ugm
CodePudding user response:
You can use a regex to get the part before the first dot, combined with pop
to remove the Websites column:
df['Name'] = df.pop('Websites').str.extract('([^.]+)')
output:
Id Name
0 1 facebook
1 2 linked
2 3 stackoverflow
3 4 harvard
4 5 ugm
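If you want to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample in the question, and `expand=False` is my own addition so that `str.extract` hands back a Series rather than a one-column DataFrame; the rest follows the one-liner above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "Websites": ["facebook.com", "linked.in", "stackoverflow.com",
                 "harvard.edu", "ugm.ac.id"],
})

# [^.]+ matches one or more non-dot characters, so the capture group
# stops at the first dot. pop() removes the Websites column and
# returns it in the same step.
df["Name"] = df.pop("Websites").str.extract(r"([^.]+)", expand=False)

print(df)
```

Because `pop` modifies `df` in place, run it on a copy if you still need the original column.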
CodePudding user response:
You can split each value on "." and take the part before the first dot:
df['Names'] = df['Websites'].str.split('.').str[0]
Output:
Id Websites Names
1 facebook.com facebook
2 linked.in linked
3 stackoverflow.com stackoverflow
4 harvard.edu harvard
5 ugm.ac.id ugm
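A runnable version of the split approach might look like the sketch below. The sample frame is rebuilt from the question, and `n=1` is an optional tweak of mine (not part of the answer above) that stops after the first split, since only the first piece is needed.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "Websites": ["facebook.com", "linked.in", "stackoverflow.com",
                 "harvard.edu", "ugm.ac.id"],
})

# Split each string on the first "." only, then keep element 0 of
# every resulting list. The Websites column is left untouched.
df["Names"] = df["Websites"].str.split(".", n=1).str[0]

print(df)
```

Unlike the `pop` variant, this keeps the Websites column, so drop it afterwards if you only want Id and Names.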
CodePudding user response:
You can also use rsplit to split on the last occurrence of "." instead, which drops only the final suffix. For a value like <abc.cde.com> it returns <abc.cde>. Note that for ugm.ac.id this gives ugm.ac rather than ugm, so it does not match the expected output for that row.
df['Name'] = df['Websites'].str.rsplit('.', n=1).str[0]
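To see how splitting from the right differs from splitting from the left, here is a small comparison sketch on the question's sample. The column names `first_part` and `without_suffix` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "Websites": ["facebook.com", "linked.in", "stackoverflow.com",
                 "harvard.edu", "ugm.ac.id"],
})

# Split from the left: keep everything before the first dot.
df["first_part"] = df["Websites"].str.split(".", n=1).str[0]

# Split from the right: keep everything before the last dot.
df["without_suffix"] = df["Websites"].str.rsplit(".", n=1).str[0]

print(df)
```

The two columns agree on every row except ugm.ac.id, where the right-hand split keeps `ugm.ac`. Which one you want depends on whether multi-part names should survive.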