I'm trying to parse HTML code and extract some data from it using regular expressions. The website that provides the data has no API, and I want to show this data in an iOS app built using Swift. The HTML looks like this:
$(document).ready(function() {
var years = ['2020','2021','2022'];
var currentView = 0;
var amounts = [1269.2358,1456.557,1546.8768];
var balances = [3484626,3683646,3683070];
rest of the html code
What I'm trying to extract is the years, amounts and balances.
So I would like to have an array with the years in it, [2020,2021,2022], and the same for amounts and balances. In this example there are 3 years, but there could be more or fewer. I'm able to extract all the numbers, but then I'm unable to link them to the years, amounts or balances. See this example: https://regex101.com/r/WMwUji/1, using this pattern: (\d|\[(\d|,\s*)*])
Any help would be really appreciated.
CodePudding user response:
Firstly, I think there are some errors in your expression. To capture a whole number you have to use \d+ (which matches 1 or more consecutive digits, e.g. 2020). A correct expression for the quoted years would therefore be ('\d+',\s*|'\d+'). To match the numbers in amounts, which have a dot in the middle and are not quoted, the pattern would be (\d+\.\d+,\s*|\d+\.\d+).
You can prefix your search with the property name, using the regular expression
(?:var years.*)('\d+',\s*|'\d+')
Then you have a regex that only matches on the line starting with var years.
The same can be repeated for amounts and balances. (Note that (?:foo) denotes a non-capturing group in a regular expression.)
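As a rough Swift sketch of the prefix idea: because a greedy .* in front of the capture group would leave only the last element of the line to capture, the version below splits the work into two steps. It first finds the line for a given variable name and captures the bracket contents, then collects every number inside that capture. The helper name extractValues(named:from:) and the sample string are made up for illustration; only NSRegularExpression and the patterns come from Foundation and this thread.

```swift
import Foundation

// Hypothetical helper (not from the original answer): returns the elements of
// `var <name> = [ ... ];` as strings, or an empty array if the line is missing.
func extractValues(named name: String, from source: String) -> [String] {
    // Step 1: anchor on the variable name and capture the text between the brackets.
    let escapedName = NSRegularExpression.escapedPattern(for: name)
    let linePattern = "var " + escapedName + "\\s*=\\s*\\[([^\\]]*)\\]"
    guard let lineRegex = try? NSRegularExpression(pattern: linePattern),
          let lineMatch = lineRegex.firstMatch(in: source, range: NSRange(source.startIndex..., in: source)),
          let listRange = Range(lineMatch.range(at: 1), in: source) else {
        return []
    }
    let list = String(source[listRange])

    // Step 2: collect every integer or decimal number inside the captured list.
    guard let numberRegex = try? NSRegularExpression(pattern: "\\d+(?:\\.\\d+)?") else {
        return []
    }
    let matches = numberRegex.matches(in: list, range: NSRange(list.startIndex..., in: list))
    return matches.compactMap { match in
        guard let range = Range(match.range, in: list) else { return nil }
        return String(list[range])
    }
}

// Sample input copied from the question.
let html = """
$(document).ready(function() {
var years = ['2020','2021','2022'];
var currentView = 0;
var amounts = [1269.2358,1456.557,1546.8768];
var balances = [3484626,3683646,3683070];
"""

// Convert the extracted strings to the numeric types the app needs.
let years = extractValues(named: "years", from: html).compactMap { Int($0) }
let amounts = extractValues(named: "amounts", from: html).compactMap { Double($0) }
let balances = extractValues(named: "balances", from: html).compactMap { Int($0) }
```

Keeping the name lookup separate from the number extraction means the same helper works for any number of years, and each result stays tied to the variable it came from.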
CodePudding user response:
You can continue with this regex:
^var (years \= (?'year'.*)|balances \= (?'balances'.*)|amounts \= (?'amounts'.*));$
It searches for lines with either years, balances or amounts entries and names the captures accordingly. Each group captures the whole array literal, brackets included. The ^ and $ anchors need multiline mode so that they match at each line break.
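A minimal Swift sketch of the named-group approach might look like the following. It assumes NSRegularExpression, which as far as I know only accepts the (?<name>...) spelling for named groups rather than (?'name'...), and range(withName:), which requires iOS 11 / macOS 10.13 or later. The helper name captureArrays(in:) and the sample string are invented for this example, and the first group is named years here so that all three names match the variables.

```swift
import Foundation

// Sample input copied from the question.
let page = """
$(document).ready(function() {
var years = ['2020','2021','2022'];
var currentView = 0;
var amounts = [1269.2358,1456.557,1546.8768];
var balances = [3484626,3683646,3683070];
"""

// Same alternation as in the answer, rewritten with (?<name>...) groups.
let arrayPattern = "^var (?:years = (?<years>.*)|balances = (?<balances>.*)|amounts = (?<amounts>.*));$"

// Hypothetical helper: maps each variable name to the raw array literal found for it.
func captureArrays(in source: String) -> [String: String] {
    // .anchorsMatchLines makes ^ and $ match at every line, not just at the ends of the input.
    guard let regex = try? NSRegularExpression(pattern: arrayPattern, options: [.anchorsMatchLines]) else {
        return [:]
    }
    var result: [String: String] = [:]
    let fullRange = NSRange(source.startIndex..., in: source)
    for match in regex.matches(in: source, range: fullRange) {
        for name in ["years", "balances", "amounts"] {
            let nsRange = match.range(withName: name)
            // Groups that did not take part in this match report NSNotFound.
            if nsRange.location != NSNotFound, let range = Range(nsRange, in: source) {
                result[name] = String(source[range])
            }
        }
    }
    return result
}

let arrays = captureArrays(in: page)
```

Each value is still a raw string such as ['2020','2021','2022'], so it needs a second pass (splitting on commas or another regex) before it can be turned into numbers.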