Consider the following multi-line string:
CreatedDate
Account.Example_1__c
Account.LastModifiedDate
Test__r.Example_2__c
Test__r.OwnerId
Test__r.Owner.Name
Test__r.Owner.Custom__c
Test__r.Owner.Custom__r.Id
$Action.Account.New
$ObjectType.Account
$Api.Session_ID
$Label.firstrun_helptext
.
...
How can we match the Salesforce Fields and skip the Global Variables (beginning with '$') using REGEX in JavaScript?
The REGEX should only match the following:
CreatedDate
Account.Example_1__c
Account.LastModifiedDate
Test__r.Example_2__c
Test__r.OwnerId
Test__r.Owner.Name
Test__r.Owner.Custom__c
Test__r.Owner.Custom__r.Id
/[\w.]+/g
matches the Salesforce Fields, but it also includes the single dots and the Global Variables in the results.
It should not include ., .., ..., etc., or the Global Variables in the matches.
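To make the problem concrete, here is a small sketch that runs the naive pattern (assumed to be /[\w.]+/g, with the + quantifier) against a shortened version of the input. The variable names are only illustrative.

```javascript
// A shortened version of the multi-line input from the question.
const sample = "CreatedDate\nTest__r.Owner.Name\n$Api.Session_ID\n.\n...";

// The naive character class accepts any run of word chars and dots.
const naive = sample.match(/[\w.]+/g);

console.log(naive);
// [ 'CreatedDate', 'Test__r.Owner.Name', 'Api.Session_ID', '.', '...' ]
```

The Global Variable still shows up (minus its leading $, which is not in the character class), and the dot-only lines are matched as well.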
Additional Examples:
1) Note that this can be a single or multi-line string, and the fields can appear before and/or after other data on the same line:
For example:
Test__r.Example_1__c >>>> (Test__r.Example_2__c) <<<< $Action.Account.New >>>> ... Test__r.Example_3__c
... should match:
Test__r.Example_1__c
Test__r.Example_2__c
Test__r.Example_3__c
2) These fields are used in formulas (like Excel formulas), so the following:
Example_1__c/Example_2__c*Example_3__c-Example_4__c Example_5__c<>Example_6__c!=Example_7__c,Example_8__c
... should return:
Example_1__c
Example_2__c
Example_3__c
Example_4__c
Example_5__c
Example_6__c
Example_7__c
Example_8__c
CodePudding user response:
How about this one?
(?<!\$)(?<=^|\n|\s|\()\w+[\w\.]*
We ensure that the match does not start right after "$", then require the start to be either the beginning of the string, the position after a newline or a space, or, as a special case, the position after an opening parenthesis. The pattern then matches any word that does not start with a ".", and allows dots after that.
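A quick way to check this approach is to run it against the two additional examples from the question. The sketch below assumes the pattern is meant to read (?<!\$)(?<=^|\n|\s|\()\w+[\w\.]* and that the JavaScript engine supports lookbehind; the variable names are illustrative only.

```javascript
// Lookbehind-based pattern: a match may only start at the beginning of the
// string or right after whitespace or an opening parenthesis.
const lookbehindRx = /(?<!\$)(?<=^|\n|\s|\()\w+[\w\.]*/g;

const example1 = "Test__r.Example_1__c >>>> (Test__r.Example_2__c) <<<< $Action.Account.New >>>> ... Test__r.Example_3__c";
const example2 = "Example_1__c/Example_2__c*Example_3__c-Example_4__c Example_5__c<>Example_6__c!=Example_7__c,Example_8__c";

const fromExample1 = example1.match(lookbehindRx);
const fromExample2 = example2.match(lookbehindRx);

console.log(fromExample1);
// [ 'Test__r.Example_1__c', 'Test__r.Example_2__c', 'Test__r.Example_3__c' ]
console.log(fromExample2);
// [ 'Example_1__c', 'Example_5__c' ]
```

The first example is handled as requested. In the formula example, only the fields preceded by the start of the string or a space are found, because operators such as /, *, - and , are not in the list of allowed preceding characters.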
CodePudding user response:
Match $ followed by word/dot chars to skip those matches, and match and capture the rest with a word/dot pattern:
/\$[\w.]+|(\w+(?:\.\w+)*)/g
See the regex demo. Details:
\$[\w.]+ - $ and then one or more word/dot chars
| - or
(\w+(?:\.\w+)*) - Group 1: one or more word chars, and then zero or more sequences of a dot and then one or more word chars.
See the JS demo:
const text = 'Test__r.Example_1__c >>>> (Test__r.Example_2__c) <<<< $Action.Account.New >>>> ... Test__r.Example_3__c';
const rx = /\$[\w.]+|(\w+(?:\.\w+)*)/g;
const matches = Array.from(text.matchAll(rx), (x) => x[1]).filter(Boolean);
console.log(matches);
The .filter(Boolean) call removes the undefined values from the result. They appear because, whenever the pattern matches a Global Variable through the first alternative, Group 1 does not take part in the match.
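The same skip-and-capture idea can be wrapped in a small helper and tried on the other inputs from the question. This is only a sketch: extractFields is a hypothetical name, not something from the answer above, and the pattern is assumed to be /\$[\w.]+|(\w+(?:\.\w+)*)/g.

```javascript
// Hypothetical helper: returns only the Group 1 captures, i.e. the names
// that were not consumed by the "$..." alternative.
function extractFields(text) {
  const rx = /\$[\w.]+|(\w+(?:\.\w+)*)/g;
  const fields = [];
  for (const m of text.matchAll(rx)) {
    // m[1] is undefined when the first alternative (a Global Variable) matched.
    if (m[1] !== undefined) fields.push(m[1]);
  }
  return fields;
}

const formula = "Example_1__c/Example_2__c*Example_3__c-Example_4__c Example_5__c<>Example_6__c!=Example_7__c,Example_8__c";
const multiLine = ["CreatedDate", "Test__r.Owner.Custom__r.Id", "$Label.firstrun_helptext", ".", "..."].join("\n");

const formulaFields = extractFields(formula);
const multiLineFields = extractFields(multiLine);

console.log(formulaFields);
// [ 'Example_1__c', 'Example_2__c', 'Example_3__c', 'Example_4__c',
//   'Example_5__c', 'Example_6__c', 'Example_7__c', 'Example_8__c' ]
console.log(multiLineFields);
// [ 'CreatedDate', 'Test__r.Owner.Custom__r.Id' ]
```

Because \w also covers digits and letters in general, this pattern would also capture numeric literals and function names if they appear in a formula, so those may need filtering separately.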