Home > database >  How to sort an array of Java-style package names in Javascript?
How to sort an array of Java-style package names in Javascript?

Time:11-11

I have an array of Java-style package names and want to sort it in a specific way. The sort order I want is generally alphabetically, but with the additional requirement that lowercase always goes either before or after uppercase, depending on a user supplied flag to the sorting function (which allows also a third option of leaving the original order alone).

Example:

const importStrings = [
  "foo.bar.Zonk",
  "lorem.ipsum.acme.Rocket",
  "foo.bar.BazFoo",
  "lorem.ipsum.Blah",
  "foo.bar.baz.Gnarf",
  "lorem.ipsum.acme.Amboss",
];

The order I want is either uppercase first on any given .:

[
  "foo.bar.BazFoo",
  "foo.bar.Zonk",
  "foo.bar.baz.Gnarf",
  "lorem.ipsum.Blah",
  "lorem.ipsum.acme.Amboss",
  "lorem.ipsum.acme.Rocket",
]

or lowercase first:

[
  "foo.bar.baz.Gnarf",
  "foo.bar.BazFoo",
  "foo.bar.Zonk",
  "lorem.ipsum.acme.Amboss",
  "lorem.ipsum.acme.Rocket",
  "lorem.ipsum.Blah",
]

Effectively, I want the sort order to be A-Za-z/a-zA-Z instead of AaBb..Zz/aAbB..zZ, which seems to be surprisingly difficult to achieve unless I'm missing something really, really obvious. All my previous experiments with various custom Array.sort() functions and localeCompare() options didn't yield the output I need.

My current approach kinda-works, but contains a ton of boilerplate and cruft, and I'm thinking that there has to be a better way of doing this. I currently split the individual strings on ., recursively build a object graph containing each "level" of the package name in an array, and then sort that array. That gives me the ordering I want, which I can then merge back together:

function createImportData(data, parts, sortFunction, reduceFunction) {
  const packageName = parts.shift();

  let localData = data.find((obj) => obj.value === packageName);
  if (!localData) {
    localData = { value: packageName, children: [] };
    data.push(localData);
  }

  if (parts.length) {
    createImportData(localData.children, parts, sortFunction, reduceFunction);
    localData.children = localData.children
      .reduce(reduceFunction, [[], []])
      .flatMap((array) => array.sort(sortFunction));
  }
}

function getSortFunction(options) {
  if (!shouldOrganizeImports(options)) {
    // leave all
    return (obj1, obj2) => 0;
  } else {
    return (obj1, obj2) => {
      obj1.value.localeCompare(obj2.value);
    };
  }
}

function getReduceFunction(options) {
  if (!shouldOrganizeImports(options)) {
    // just put it in a single array without changing anything
    return (groups, obj) => {
      groups[0].push(obj);
      return groups;
    };
  } else {
    const upperIndex = shouldOrganizeImportsUppercaseFirst(options) ? 0 : 1;
    const lowerIndex = shouldOrganizeImportsUppercaseFirst(options) ? 1 : 0;
    const reduce = (upperIndex, lowerIndex) => (groups, obj) => {
      obj.value.charAt(0).toUpperCase() === obj.value.charAt(0)
        ? groups[upperIndex].push(obj)
        : groups[lowerIndex].push(obj);
      return groups;
    };
    return reduce(upperIndex, lowerIndex);
  }
}

//initial call looks like this:
  const importData = [];
  const sortFunction = getSortFunction(options);
  const reduceFunction = getReduceFunction(options);

  for (const importString of importStrings) {
    createImportData(
      importData,
      importString.split("."),
      sortFunction,
      reduceFunction
    );
  }

The generated structure looks like this, with each level's children array sorted the way I need:

[
  {
    value: "foo",
    children: [
      {
        value: "bar",
        children: [
          { value: "baz", children: [{ value: "Gnarf", children: [] }] },
          { value: "BazFoo", children: [] },
          { value: "Zonk", children: [] },
        ],
      },
    ],
  },
  {value: "lorem", children: [/* etc.etc. */]}
]

How can I improve on my logic, or is this already the way to do it?

CodePudding user response:

That's a fairly idiosyncratic sort, but it's possible to do it a bit more concisely using Intl.Collator along with breaking the package names into their individual parts and custom logic to always put something starting with upper case before/after something starting with lower case regardless of length:

// Tells us whether a string starst with an English uppercase char
// (may need tweaking for non-English)
const startsUppercase = /^[A-Z]/;
function sort(array, upperFirst) {
    // Get a collator for the current locale with the appropriate case first flag,
    // and grab its `compare` function
    const collator = new Intl.Collator(undefined, { caseFirst: upperFirst ? "upper" : "lower"});
    const { compare } = collator;
    // Get a direction flag based on the upper-first/lower-first flag
    const direction = upperFirst ? -1 : 1;
    // Sort
    array.sort((a, b) => {
        // Get the two package names split into their parts
        const aparts = a.split(".");
        const bparts = b.split(".");
        // Assume they're equal...
        let result = 0;
        // Loop through as long as A) they're still equal, and
        // B) we have both `a` and `b` parts
        const length = Math.min(a.length, b.length);
        for (let i = 0; result === 0 && i < length;   i) {
            const apart = aparts[i];
            const bpart = bparts[i];
            const aupper = startsUppercase.test(apart);
            const bupper = startsUppercase.test(bpart);
            if (aupper && !bupper) {
                // `a` comes first/last
                result = direction;
            } else if (!aupper && bupper) {
                // `b` comes first/last
                result = -direction;
            } else {
                // Same case, compare remainder
                result = compare(apart, bpart);
            }
        }
        // If still no difference, compare on segment length
        if (result === 0) {
            return aparts.length - bparts.length;
        }
        return result;
    });
    return array;
}

Live Example:

Show code snippet

const importStrings = [
    "foo.bar.Zonk",
    "lorem.ipsum.acme.Rocket",
    "foo.bar.BazFoo",
    "lorem.ipsum.Blah",
    "foo.bar.baz.Gnarf",
    "lorem.ipsum.acme.Amboss",
];

// Tells us whether a string starst with an English uppercase char
// (may need tweaking for non-English)
const startsUppercase = /^[A-Z]/;
function sort(array, upperFirst) {
    // Get a collator for the current locale with the appropriate case first flag,
    // and grab its `compare` function
    const collator = new Intl.Collator(undefined, { caseFirst: upperFirst ? "upper" : "lower"});
    const { compare } = collator;
    // Get a direction flag based on the upper-first/lower-first flag
    const direction = upperFirst ? -1 : 1;
    // Sort
    array.sort((a, b) => {
        // Get the two package names split into their parts
        const aparts = a.split(".");
        const bparts = b.split(".");
        // Assume they're equal...
        let result = 0;
        // Loop through as long as A) they're still equal, and
        // B) we have both `a` and `b` parts
        const length = Math.min(a.length, b.length);
        for (let i = 0; result === 0 && i < length;   i) {
            const apart = aparts[i];
            const bpart = bparts[i];
            const aupper = startsUppercase.test(apart);
            const bupper = startsUppercase.test(bpart);
            if (aupper && !bupper) {
                // `a` comes first/last
                result = direction;
            } else if (!aupper && bupper) {
                // `b` comes first/last
                result = -direction;
            } else {
                // Same case, compare remainder
                result = compare(apart, bpart);
            }
        }
        // If still no difference, compare on segment length
        if (result === 0) {
            return aparts.length - bparts.length;
        }
        return result;
    });
    return array;
}

console.log(`Upper first:`);
console.log(sort(importStrings.slice(), true));
console.log(`Lower first:`);
console.log(sort(importStrings.slice(), false));
.as-console-wrapper {
    max-height: 100% !important;
}
<iframe name="sif1" sandbox="allow-forms allow-modals allow-scripts" frameborder="0"></iframe>

If the array is large, it might be worth pre-splitting the package names so you do it only once rather than repeatedly (since the same string is often presented to the sort callback in the process of sorting large arrays). It's not a big change, see ***:

// Tells us whether a string starst with an English uppercase char
// (may need tweaking for non-English)
const startsUppercase = /^[A-Z]/;
function sort(array, upperFirst) {
    // Get a collator for the current locale with the appropriate case first flag,
    // and grab its `compare` function
    const collator = new Intl.Collator(undefined, { caseFirst: upperFirst ? "upper" : "lower"});
    const { compare } = collator;
    // Get a direction flag based on the upper-first/lower-first flag
    const direction = upperFirst ? -1 : 1;
    // *** Get the pre-split package names
    const names = new Map(array.map(name => [name, name.split(".")]));
    // Sort
    array.sort((a, b) => {
        // Get the two package names split into their parts
        const aparts = names.get(a); // ***
        const bparts = names.get(b); // ***
        // Assume they're equal...
        let result = 0;
        // Loop through as long as A) they're still equal, and
        // B) we have both `a` and `b` parts
        const length = Math.min(a.length, b.length);
        for (let i = 0; result === 0 && i < length;   i) {
            const apart = aparts[i];
            const bpart = bparts[i];
            const aupper = startsUppercase.test(apart);
            const bupper = startsUppercase.test(bpart);
            if (aupper && !bupper) {
                // `a` comes first/last
                result = direction;
            } else if (!aupper && bupper) {
                // `b` comes first/last
                result = -direction;
            } else {
                // Same case, compare remainder
                result = compare(apart, bpart);
            }
        }
        // If still no difference, compare on segment length
        if (result === 0) {
            return aparts.length - bparts.length;
        }
        return result;
    });
    return array;
}

Live Example:

Show code snippet

const importStrings = [
    "foo.bar.Zonk",
    "lorem.ipsum.acme.Rocket",
    "foo.bar.BazFoo",
    "lorem.ipsum.Blah",
    "foo.bar.baz.Gnarf",
    "lorem.ipsum.acme.Amboss",
];

// Tells us whether a string starst with an English uppercase char
// (may need tweaking for non-English)
const startsUppercase = /^[A-Z]/;
function sort(array, upperFirst) {
    // Get a collator for the current locale with the appropriate case first flag,
    // and grab its `compare` function
    const collator = new Intl.Collator(undefined, { caseFirst: upperFirst ? "upper" : "lower"});
    const { compare } = collator;
    // Get a direction flag based on the upper-first/lower-first flag
    const direction = upperFirst ? -1 : 1;
    // *** Get the pre-split package names
    const names = new Map(array.map(name => [name, name.split(".")]));
    // Sort
    array.sort((a, b) => {
        // Get the two package names split into their parts
        const aparts = names.get(a); // ***
        const bparts = names.get(b); // ***
        // Assume they're equal...
        let result = 0;
        // Loop through as long as A) they're still equal, and
        // B) we have both `a` and `b` parts
        const length = Math.min(a.length, b.length);
        for (let i = 0; result === 0 && i < length;   i) {
            const apart = aparts[i];
            const bpart = bparts[i];
            const aupper = startsUppercase.test(apart);
            const bupper = startsUppercase.test(bpart);
            if (aupper && !bupper) {
                // `a` comes first/last
                result = direction;
            } else if (!aupper && bupper) {
                // `b` comes first/last
                result = -direction;
            } else {
                // Same case, compare remainder
                result = compare(apart, bpart);
            }
        }
        // If still no difference, compare on segment length
        if (result === 0) {
            return aparts.length - bparts.length;
        }
        return result;
    });
    return array;
}

console.log(`Upper first:`);
console.log(sort(importStrings.slice(), true));
console.log(`Lower first:`);
console.log(sort(importStrings.slice(), false));
.as-console-wrapper {
    max-height: 100% !important;
}
<iframe name="sif2" sandbox="allow-forms allow-modals allow-scripts" frameborder="0"></iframe>

CodePudding user response:

This is an approach where the character are extended to two characters, depening on their case:

lower, after sorting
---------------------------------------------

 f o o . b a r . b a z .G  n a r f
 f o o . b a r .B  a zF  o o
 f o o . b a r .Z  o n k
 l o r e m . i p s u m . a c m e .A  m b o s s
 l o r e m . i p s u m . a c m e .R  o c k e t
 l o r e m . i p s u m .B  l a h

upper, after sorting
---------------------------------------------

f o o . b a r .  Ba z  Fo o 
f o o . b a r .  Zo n k 
f o o . b a r . b a z .  Gn a r f 
l o r e m . i p s u m .  Bl a h 
l o r e m . i p s u m . a c m e .  Am b o s s 
l o r e m . i p s u m . a c m e .  Ro c k e t 

The the original data is mapped by the sorted indices.

const
    sort = (array, priority) => {
        const order = priority === 'lower';
        return array
            .map((s, index) => {
                const o = { index, value: '' };
                for (const c of s) o.value  = c === c.toLowerCase() === order ? ' '   c : c   ' ';
                return o;
            })
            .sort((a, b) => a.value.localeCompare(b.value))
            .map(({ index }) => array[index]);
    },
    data = ["foo.bar.Zonk", "lorem.ipsum.acme.Rocket", "foo.bar.BazFoo", "lorem.ipsum.Blah", "foo.bar.baz.Gnarf", "lorem.ipsum.acme.Amboss"],
    sorted1 = sort(data, 'lower'),
    sorted2 = sort(data, 'upper');

console.log(sorted1);
console.log(sorted2);
.as-console-wrapper { max-height: 100% !important; top: 0; }
<iframe name="sif3" sandbox="allow-forms allow-modals allow-scripts" frameborder="0"></iframe>

  • Related