Split dotted path but replace two dots with a symbol

Time:05-29

In my domain-specific language, I am able to reference something inside a tree-like structure using a dotted path, except that I can also go "up" with two dots in a row:

"sibling"         // gets sibling
".sibling"        // same as above
".sibling.nested" // indexes into sibling with 'nested'
"..down"          // goes up the hierarchy, then down the key 'down'
"down.."          // doesn't do anything, really, goes down and then back up
".."              // references parent
"down..down"      // same as "down"

I need to split the above like this:

["sibling"]
["sibling"]
["sibling", "nested"]
[symbol, "down"]
["down", symbol]
[symbol]
["down", symbol, "down"]

In essence, I want to replace each .. with a symbol while still splitting on a single . as usual. I can already split plain dotted paths like this (it even respects an escaped period):

path.split(/(?<!\\)\./)
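As a quick illustration of what that lookbehind does, here is a minimal sketch. The sample paths are made up for the demo, and it assumes an engine with regex lookbehind support (for example a recent Node.js or a Chromium-based browser).

```javascript
// Split on every '.' that is not immediately preceded by a backslash.
const UNESCAPED_DOT = /(?<!\\)\./;

// Hypothetical sample paths, for illustration only.
const plain = "sibling.nested";
const escaped = "file\\.txt.size"; // the key 'file\.txt' contains an escaped dot

console.log(plain.split(UNESCAPED_DOT));   // [ 'sibling', 'nested' ]
console.log(escaped.split(UNESCAPED_DOT)); // [ 'file\\.txt', 'size' ]

// Two dots in a row leave an empty string between them.
console.log("down..down".split(UNESCAPED_DOT)); // [ 'down', '', 'down' ]
```

Note that the escaped key keeps its backslash; unescaping it would be a separate step.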

Keys can potentially contain any characters, so a word boundary will not work sometimes.

But I'm having a hard time finding a clean solution that can replace the .. sequences. My current solution is to replace all empty strings with the symbol (an empty string means there were two periods in a row), and to append a character to both ends of the path, removing it again after the splitting is done:

const symbol = Symbol();

function split(path) {
  path = "$" + path + "$"; // Any character can work; I just used '$' here
  // Split by dots, replacing empty strings with the symbol
  const split = path.split(/(?<!\\)\./).map((part) => (!part ? symbol : part));
  // Remove prefixed character from start
  split[0] = split[0].slice(1);
  // Remove suffixed character from end
  split[split.length - 1] = split[split.length - 1].slice(0, -1);
  // Remove start or end if they are empty
  if (!split[0]) split.shift();
  if (!split[split.length - 1]) split.pop();
  // Done
  return split;
}

const tests = [
  "sibling"        , // gets sibling
  ".sibling"       , // same as above
  ".sibling.nested", // indexes into sibling with 'nested'
  "..down"         , // goes up the hierarchy, then down the key 'down'
  "down.."         , // doesn't do anything, really, goes down and then back up
  ".."             , // references parent
  "down..down"     , // same as "down"
];
// (Stack Overflow snippet console displays null instead of Symbol)
console.log(tests.map(split));

It gets the job done, passes all the test cases, but it's much too verbose and clumsy. I'm looking for a hopefully shorter and easier solution than this.

CodePudding user response:

My trick is to split on both . and .., but to capture the first . of a pair so that it is kept in the result:

const symbol = Symbol();

// Your function
function split(path) {
  path = "$" + path + "$"; // Any character can work; I just used '$' here
  // Split by dots, replacing empty strings with the symbol
  const split = path.split(/(?<!\\)\./).map((part) => (!part ? symbol : part));
  // Remove prefixed character from start
  split[0] = split[0].slice(1);
  // Remove suffixed character from end
  split[split.length - 1] = split[split.length - 1].slice(0, -1);
  // Remove start or end if they are empty
  if (!split[0]) split.shift();
  if (!split[split.length - 1]) split.pop();
  // Done
  return split;
}

function split2(path) {
  return path.split(/(?<!\\)(\.)?\./g) // Split on . and .., but include the first . if available
             .filter(s => s) // Remove falsey things
             .map(s => s === '.' ? symbol : s)
}

const tests = [
  "sibling"        , // gets sibling
  ".sibling"       , // same as above
  ".sibling.nested", // indexes into sibling with 'nested'
  "..down"         , // goes up the hierarchy, then down the key 'down'
  "down.."         , // doesn't do anything, really, goes down and then back up
  ".."             , // references parent
  "down..down"     , // same as "down"
];
console.log(tests.map(split));
console.log(tests.map(split2))

Output

[
  [ 'sibling' ],
  [ 'sibling' ],
  [ 'sibling', 'nested' ],
  [ Symbol(), 'down' ],
  [ 'down', Symbol() ],
  [ Symbol() ],
  [ 'down', Symbol(), 'down' ]
]
[
  [ 'sibling' ],
  [ 'sibling' ],
  [ 'sibling', 'nested' ],
  [ Symbol(), 'down' ],
  [ 'down', Symbol() ],
  [ Symbol() ],
  [ 'down', Symbol(), 'down' ]
]
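To see why the optional capture group matters, it may help to look at the raw output of split before the filter and map steps. This is only a sketch using the same regex as split2; the sample paths are taken from the tests or made up for illustration.

```javascript
// Same regex as in split2: an optional captured '.', followed by a '.',
// where the match may not start right after a backslash.
const DOTS = /(?<!\\)(\.)?\./;

// When the separator regex has a capture group, split() splices the
// captured text (or undefined, if the group did not take part) into the result.
console.log("down.nested".split(DOTS)); // [ 'down', undefined, 'nested' ]
console.log("down..down".split(DOTS));  // [ 'down', '.', 'down' ]
console.log("..".split(DOTS));          // [ '', '.', '' ]
console.log(".sibling".split(DOTS));    // [ '', undefined, 'sibling' ]
```

After that, the filter drops the empty strings and undefined entries, and any bare '.' that remains can only have come from a .. pair, so mapping it to the symbol is safe.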

EDIT: Better algorithm, better writing

CodePudding user response:

If I understood correctly (at least my solution also passed all the tests), here is my shorter version:

const symbol = Symbol(); // one shared marker, so callers can compare against it
const f = str => str.split(/\b/).filter(x => x !== '.').map(x => (x === '..' ? symbol : x));
const tests = [
  "sibling"        , // gets sibling
  ".sibling"       , // same as above
  ".sibling.nested", // indexes into sibling with 'nested'
  "..down"         , // goes up the hierarchy, then down the key 'down'
  "down.."         , // doesn't do anything, really, goes down and then back up
  ".."             , // references parent
  "down..down"     , // same as "down"
];
console.log(tests.map(f));

Quick explanation: I assume that every tree node key is a single 'word'. I then split the string at word boundaries, which is easy with the regex \b, drop the lone dots, and map each .. to a symbol.
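One caveat, given that the question says keys can contain any characters: \b also matches inside keys that contain non-word characters. A minimal sketch of that effect follows; the name byWordBoundary, the key 'my-key', and the shared symbol are made up for illustration.

```javascript
const symbol = Symbol("parent"); // hypothetical shared marker for '..'

// The word-boundary approach: split at \b, drop lone dots, map '..' to the marker.
const byWordBoundary = (str) =>
  str.split(/\b/).filter((x) => x !== ".").map((x) => (x === ".." ? symbol : x));

// Works when every key is made of word characters only.
console.log(byWordBoundary("..down")); // [ Symbol(parent), 'down' ]

// 'my-key' contains a non-word character: \b also matches on both
// sides of the hyphen, so the key is torn apart.
console.log(byWordBoundary("my-key.child")); // [ 'my', '-', 'key', 'child' ]
```

So this version is fine for identifier-like keys, but not for arbitrary ones.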

CodePudding user response:

I'm not sure if this works for you, but I get the same output on your tests. It's only two lines shorter, but a little more readable.

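const symbol = Symbol();

function split2(path: string) {
    // Use a global regex: replace('..', ...) would only rewrite the first '..'
    path = path.replace(/\.\./g, '.$.')
    const split = path.split('.').filter(x => x)
    // '$' is a placeholder here, so a key that is literally '$' would be misread
    return split.map(x => (x === '$' ? symbol : x));
}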

Here's a playground.
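One thing to watch with the placeholder approach: String.prototype.replace with a string pattern only replaces the first occurrence, so paths with more than one .. need a global regex (or replaceAll). A hedged sketch in plain JavaScript; the name splitWithPlaceholder and the path "a..b..c" are made up for the demo, and it assumes no key is literally '$' and no dots are escaped.

```javascript
const symbol = Symbol("parent"); // hypothetical shared marker for '..'

// Placeholder approach with a global regex, so that every '..' is rewritten.
function splitWithPlaceholder(path) {
  return path
    .replace(/\.\./g, ".$.")
    .split(".")
    .filter((part) => part)
    .map((part) => (part === "$" ? symbol : part));
}

// A string pattern only replaces the first occurrence:
console.log("a..b..c".replace("..", ".$."));    // a.$.b..c
console.log("a..b..c".replace(/\.\./g, ".$.")); // a.$.b.$.c

console.log(splitWithPlaceholder("a..b..c"));
// [ 'a', Symbol(parent), 'b', Symbol(parent), 'c' ]
```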
