Consider the following Python pandas DataFrame:
NAME NUM_OWNERS NUM_DOCS NUM_RESIDENTS
Total 23900137 21028886 44571130.0
Macael-04062 366607 324413 727945.0
Spain 4283950 3642683 8464411.0
Badalona-08911 5829 6250 15480.0
Vallecas-28031 5691 5215 10358.0
I want to keep only the rows whose NAME contains a 5-digit number and replace the value of the NAME
column with that number.
Resulting DataFrame:
NAME NUM_OWNERS NUM_DOCS NUM_RESIDENTS
04062 366607 324413 727945.0
08911 5829 6250 15480.0
28031 5691 5215 10358.0
CodePudding user response:
Let us try str.contains to filter the rows, then str.split to isolate the number,
and assign to write the new value:
out = df[df.NAME.str.contains('-')].assign(NAME = lambda x : x['NAME'].str.split('-').str[-1])
Out[83]:
NAME NUM_OWNERS NUM_DOCS NUM_RESIDENTS
1 04062 366607 324413 727945.0
3 08911 5829 6250 15480.0
4 28031 5691 5215 10358.0
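For anyone who wants to reproduce this, here is a self-contained sketch of the same approach. It rebuilds the sample data from the question and assumes that NAME holds strings and that every row to keep has a hyphen directly before the code; a name with a hyphen but no 5-digit code would also pass this filter.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "NAME": ["Total", "Macael-04062", "Spain", "Badalona-08911", "Vallecas-28031"],
    "NUM_OWNERS": [23900137, 366607, 4283950, 5829, 5691],
    "NUM_DOCS": [21028886, 324413, 3642683, 6250, 5215],
    "NUM_RESIDENTS": [44571130.0, 727945.0, 8464411.0, 15480.0, 10358.0],
})

# Keep rows whose NAME contains a hyphen, then keep only the part after the last hyphen.
out = df[df["NAME"].str.contains("-")].assign(
    NAME=lambda x: x["NAME"].str.split("-").str[-1]
)
print(out)
```

The original index labels (1, 3, 4) are preserved; chain .reset_index(drop=True) if a fresh 0-based index is preferred.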
CodePudding user response:
df = df[df['NAME'].astype(str).str.contains(r'\d{5}')].assign(NAME=lambda x: x['NAME'].str.replace(r'[a-zA-Z]-?', '', regex=True))
This checks each NAME for five consecutive digits and, where they are found, removes the letters and the hyphen, leaving only the number.
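A possible variant, offered as a sketch rather than as part of either answer, is to pull the number out directly with str.extract instead of deleting the surrounding characters. The pattern below is an assumption about what "5-digit number" means: exactly five digits that are not part of a longer run of digits.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "NAME": ["Total", "Macael-04062", "Spain", "Badalona-08911", "Vallecas-28031"],
    "NUM_OWNERS": [23900137, 366607, 4283950, 5829, 5691],
    "NUM_DOCS": [21028886, 324413, 3642683, 6250, 5215],
    "NUM_RESIDENTS": [44571130.0, 727945.0, 8464411.0, 15480.0, 10358.0],
})

# Capture exactly five digits; the lookarounds reject longer digit runs.
# Rows without a match get NaN.
code = df["NAME"].astype(str).str.extract(r"(?<!\d)(\d{5})(?!\d)", expand=False)

# Keep only the rows that matched and overwrite NAME with the captured code.
out = df[code.notna()].assign(NAME=code)
print(out)
```

Because the code is captured rather than reconstructed, this does not depend on the hyphen or on the name consisting only of ASCII letters.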