I want to extract only the years from this string and put them in a list.
a = "Sam works in a company abc in New York. He joined the company last year 2019. Before joining ABC, he used to work for a small firm in Arizona. He worked there from 2015 to 2018. Before moving to Arizona Sam used to live in South Dakota and he has been living there since 2000's"
b = a.split()
year = []
for i in b:
    if i.isdigit():
        year.append(i)
print(year)
# ['2015'] : "2019.", "2018." and "2000's" are skipped because of the attached punctuation
CodePudding user response:
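The most straightforward approach is to use re.findall with a pattern that matches every 4-digit number surrounded by word boundaries.
The re.X (verbose) flag makes the regex ignore the spaces inside the pattern, and because the pattern has one capture group, findall returns only the captured digits.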
>>> a = "Sam works in a company abc in New York. He joined the company last year 2019. Before joining ABC, he used to work for a small firm in Arizona. He worked there from 2015 to 2018. Before moving to Arizona Sam used to live in South Dakota and he has been living there since 2000's"
>>> import re
>>> re.findall(r'\b ( \d{4} ) \b', a, re.X)
['2019', '2015', '2018', '2000']
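If the text could also contain 4-digit numbers that are not years, the pattern can be tightened. Here is a minimal sketch, assuming only years from 1900 to 2099 should count; `extract_years`, `YEAR_PATTERN` and the sample sentence are hypothetical names made up for illustration, not part of the answer above:

```python
import re

# Assumption: a "year" is a standalone 4-digit number starting with 19 or 20.
YEAR_PATTERN = re.compile(r'\b(?:19|20)\d{2}\b')

def extract_years(text):
    """Return every matching year in text as an int, in order of appearance."""
    return [int(match) for match in YEAR_PATTERN.findall(text)]

sample = "He joined in 2019, worked elsewhere from 2015 to 2018, and owns 1234 books since 2000's."
print(extract_years(sample))
# [2019, 2015, 2018, 2000]
```

The non-capturing group `(?:...)` keeps findall returning the whole match rather than just the century prefix.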
CodePudding user response:
text = """Sam works in a company abc in New York. He joined the company last year 2019. Before joining
ABC, he used to work for a small firm in Arizona. He worked there from 2015 to 2018. Before moving
to Arizona Sam used to live in South Dakota and he has been living there since 2000's"""
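years = []
year = ''
for v in text:
    if v.isdigit():
        # Accumulate consecutive digits into the current number
        year += v
    else:
        if year:
            years.append(year)
            year = ''
# Catch a number that sits at the very end of the text
if year:
    years.append(year)
print(years)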
Or, if every digit in the text belongs to a 4-digit year:
text = """Sam works in a company abc in New York. He joined the company last year 2019. Before joining
ABC, he used to work for a small firm in Arizona. He worked there from 2015 to 2018. Before moving
to Arizona Sam used to live in South Dakota and he has been living there since 2000's"""
numbers = [v for v in text if v.isdigit()]
years = [''.join(numbers[i:i+4]) for i in range(0, len(numbers), 4)]
print(years)
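The list-comprehension version chunks all digits in fours, so a single extra number in the text (say "42") would shift every later year. A possible variant, sketched with itertools.groupby, collects each run of consecutive digits separately and keeps only runs of the wanted length; `digit_runs` and its `length` parameter are hypothetical names for illustration:

```python
from itertools import groupby

def digit_runs(text, length=4):
    """Return runs of consecutive digits in text that are exactly `length` long."""
    # groupby splits the text into alternating digit and non-digit stretches
    runs = (
        ''.join(group)
        for is_digit, group in groupby(text, key=str.isdigit)
        if is_digit
    )
    return [run for run in runs if len(run) == length]

print(digit_runs("From 2015 to 2018, in room 42, and there since 2000's"))
# ['2015', '2018', '2000']
```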
CodePudding user response:
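Use replace(".", "") to remove the dots and replace("'", " ") to turn each quote into a space before splitting.
That way "2019." becomes "2019" and "2000's" is split into "2000" and "s", so isdigit() accepts the years.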
a = "Sam works in a company abc in New York. He joined the company last year 2019. Before joining ABC, he used to work for a small firm in Arizona. He worked there from 2015 to 2018. Before moving to Arizona Sam used to live in South Dakota and he has been living there since 2000's"
b = a.replace(".","").replace("'", " ").split()
print(b)
year = []
for i in b:
    if i.isdigit():
        year.append(i)
print(year)
# ['2019', '2015', '2018', '2000']
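The same idea can be generalized so that any punctuation mark, not just "." and "'", is handled. This is only a sketch, assuming every character in string.punctuation should act as a separator; `years_from_text` is a hypothetical helper name:

```python
import string

# Map every ASCII punctuation character to a space (assumption: all of them are separators)
PUNCT_TO_SPACE = str.maketrans(string.punctuation, ' ' * len(string.punctuation))

def years_from_text(text):
    """Split text on whitespace and punctuation, keeping 4-digit numeric tokens."""
    tokens = text.translate(PUNCT_TO_SPACE).split()
    return [token for token in tokens if token.isdigit() and len(token) == 4]

print(years_from_text("Joined in 2019. Worked there from 2015 to 2018; around since 2000's"))
# ['2019', '2015', '2018', '2000']
```

Mapping punctuation to spaces rather than deleting it avoids gluing neighbouring tokens together, for example "2015-2018" stays two separate numbers.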