Counting Occurrences Value Substrings contained in entire Object

Time:06-01

I want to count how many object values contain a given string, i.e. the test across the whole object is "value contains string" rather than "value equals string". A working XPath expression in XSLT would be

count(//v[contains(.,current-grouping-key())])

But I can't figure this out in JavaScript.

I tried the following:

const obj = 
  [ { v: 'Bla Blu Bli' },
    { v: 'Bla Blu Bli' },
    { v: 'Bla Blu' },
    { v: 'Bla Bli' }
  ];

const count = obj.reduce( function(sums,entry) {
    sums[entry.v] = (sums[entry.v] || 0) + 1;
    return sums;
 },{});
 
console.log(count)

But this counts the exact strings only. So I get:

"Bla Blu Bli": 2,
"Bla Blu": 1,
"Bla Bli": 1

instead of

 "Bla Blu Bli": 2,
 "Bla Blu": 3,
 "Bla Bli": 3

Is there a way to count the substrings instead of the exact values?

CodePudding user response:

You can use this:

const obj = 
  [ { v: 'Bla Blu Bli' }
  , { v: 'Bla Blu Bli' }
  , { v: 'Bla Blu'     }
  , { v: 'Bla Bli'     }
  ];

const counts = obj
  .map(e=>e.v.split(' ').sort((a,b)=>a.localeCompare(b)))
  .reduce((r,a,_,all)=>
    {
    let terms = a.join(' ')
    if (!r[terms])
      r[terms] = all.reduce((c,x)=>c + (a.every(v=>x.includes(v))?1:0),0);
    return r
    },{})
    
console.log(  counts )
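
Spelled out in a longer form, the idea above is: split every value into words, then for each distinct value count the entries that contain all of its words, in any order or position. The following is only a sketch of that reading; the names `data`, `countWordMatches` and `wordCounts` are made up for illustration, and it keys the result by the original string rather than by the sorted words.

```javascript
const data = [
  { v: 'Bla Blu Bli' },
  { v: 'Bla Blu Bli' },
  { v: 'Bla Blu' },
  { v: 'Bla Bli' },
];

// Hypothetical helper: for each distinct value, count the entries
// whose word list contains every word of that value.
function countWordMatches(list) {
  const wordLists = list.map(({ v }) => v.split(' '));
  const result = new Map();
  list.forEach(({ v }, i) => {
    if (result.has(v)) return; // this exact value was already counted
    const needleWords = wordLists[i];
    const matches = wordLists.filter((words) =>
      needleWords.every((w) => words.includes(w))
    ).length;
    result.set(v, matches);
  });
  return Object.fromEntries(result);
}

const wordCounts = countWordMatches(data);
console.log(wordCounts);
```

For the sample data this gives 2 for 'Bla Blu Bli' and 3 for both 'Bla Blu' and 'Bla Bli', which matches the counts asked for in the question.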

CodePudding user response:

You have to use indexOf or similar to check whether a substring occurs in a string.

Example:

const obj = [
    {
        "v": "Bla † Blu † Bli"
    },
    {
        "v": "Bla † Blu † Bli"
    },
    {
        "v": "Bla † Blu"
    }
]

const counts = Object.fromEntries(
  obj.map(({v}) => [v, obj.reduce((acc, el) => {
    if (el.v.indexOf(v) > -1) acc++;
    return acc;
  }, 0)])
);

console.log(counts);
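
A variant of the same substring test, as a rough sketch: it uses String.prototype.includes instead of indexOf and only loops over the distinct values, so repeated strings are not counted again. The names `entries`, `countContaining` and `substringCounts` are made up for this example.

```javascript
const entries = [
  { v: 'Bla Blu Bli' },
  { v: 'Bla Blu Bli' },
  { v: 'Bla Blu' },
];

// Hypothetical helper: for each distinct value, count the entries
// whose string contains that value as a contiguous substring.
function countContaining(list) {
  const distinct = [...new Set(list.map(({ v }) => v))];
  const result = {};
  for (const needle of distinct) {
    result[needle] = list.filter(({ v }) => v.includes(needle)).length;
  }
  return result;
}

const substringCounts = countContaining(entries);
console.log(substringCounts);
```

Keep in mind that a plain substring test only matches contiguous text: 'Bla Bli' is not a substring of 'Bla Blu Bli', so matching words in any position needs a word-based test like the one in the first answer.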

CodePudding user response:

This second version should be faster.

(You wrote in a comment that you have 100k values.)

It first builds an array of the distinct word sets only, each stored with the number of entries that share that set.
It then walks this array and, for each set, adds the counts of the other sets that contain all of its words,
checking only sets that are strictly larger.

I used Set objects because, according to the docs, [set].has(value) is faster than [array].includes(value).

const obj = 
  [ { v: 'Bla Blu Bli' }
  , { v: 'Bla Bli Blu' }
  , { v: 'Bla Blu'     }
  , { v: 'Bla Bli'     }
  ];

const counts = obj
  .reduce((r,o) => // create arr with unique sets with count of copies
    {
    let 
      arr = o.v.split(' ')
    , sam = r.find(x=>(x.s.size===arr.length) && arr.every(a=>x.s.has(a)) )
      ;
    if (sam)    sam.n++   // one more copy
    else      r.push({arr, s:new Set(arr), n:1 })
       // keep both arr and Set: the next step needs both, and this avoids
       // converting between array and Set again
    return r
    },[]) 
  .reduce((c,e,_,all) =>
    {
    c[e.arr.join(' ')] = e.n +
        all.reduce((s,x)=>((x.s.size > e.s.size && e.arr.every(a=>x.s.has(a))) ? s + x.n : s),0)
      // look for supersets only among the larger sets
    return c
    },{})  

console.log(  counts  )
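
The subset check that both reduce steps rely on can be pulled out on its own. This is just an illustration; `isSubset` and `big` are made-up names, not part of the code above.

```javascript
// Hypothetical helper: true when every word in `words` is present in `wordSet`.
function isSubset(words, wordSet) {
  return words.every((w) => wordSet.has(w));
}

const big = new Set('Bla Blu Bli'.split(' '));

console.log(isSubset(['Bla', 'Bli'], big)); // true: both words are in the set
console.log(isSubset(['Bla', 'Foo'], big)); // false: 'Foo' is missing
```

Comparing sizes first, as the code above does, skips this check for sets that cannot be strict supersets.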
