TypeScript: enforce flexible string requirements - all underscore-separated filename format

Time:01-28

Can one define a TypeScript type (I am using 4.7) that would allow a string to be any of the following:

XXX_YY_ZZZ.pdf
ABCD_GG_JJ_PPPP.pdf
VV_XX.pdf
ABCD.pdf

Essentially, I am trying to make the keys of an interface like:

export interface FileChecks {
    [key: string]: (string|RegExp)[];
}

conform to a known file naming format (all uppercase, underscore-separated, with an extension).

I have seen examples where strings could be limited to something like:

1234-5678-9012
3456-4566-8798

which made me wonder if it could be taken further.

Answer:

There is no specific type in TypeScript that works the way you want. That is, you won't ultimately be able to write

const good: ValidFile = "ABC_DEF.pdf"; // okay
const bad: ValidFile = "ABC_DeF.pdf"; // error!

TypeScript doesn't currently have regular-expression-validated string types as discussed in microsoft/TypeScript#41160, so there's nothing like type ValidFile = /^[A-Z_]*\.pdf$/; you can write.

Conceptually you could imagine the type as a union of possible string literal types that conform to your rules, but TypeScript unions can only hold on the order of tens of thousands of members, whereas you would need... let's see... an infinite number, because the length of your string is unbounded. Even if you could restrict the maximum length to, say, 32 characters, that would still be more possibilities than TypeScript can represent.
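To put rough numbers on that, here is a small sketch that counts how many strings of length 0 through n can be built from the 27 allowed prefix characters (A to Z plus underscore). The function name countCandidates is made up for illustration, and it uses ordinary floating-point numbers, so the large result is only approximate.

```typescript
// Hypothetical helper: counts all strings of length 0..maxLen over an
// alphabet of `alphabetSize` characters (27 = "A".."Z" plus "_").
function countCandidates(alphabetSize: number, maxLen: number): number {
  let total = 0;
  let perLength = 1; // number of distinct strings of the current length
  for (let len = 0; len <= maxLen; len++) {
    total += perLength;
    perLength *= alphabetSize;
  }
  return total;
}

console.log(countCandidates(27, 3));  // 1 + 27 + 729 + 19683 = 20440
console.log(countCandidates(27, 32)); // approximate; on the order of 10^45
```

Prefixes of at most three characters already produce a union in the tens of thousands, so a 32-character limit is far out of reach.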

So this is not possible directly.


Instead of trying to write a specific type, we can make a generic type that acts as a constraint. That is, instead of ValidFile, we write ValidFile<T>, which takes a candidate string literal type T; if T is valid, then ValidFile<T> evaluates to just T. Otherwise it evaluates to some valid string literal type that is "close" to T in some way. Then you can do checks like T extends ValidFile<T>. To spare anyone from having to write const good: ValidFile<"ABC_DEF.pdf"> = "ABC_DEF.pdf", we will make a generic helper function so you can write const good = validFile("ABC_DEF.pdf"); instead.

So when we're done, instead of

const good: ValidFile = "ABC_DEF.pdf"; // okay
const bad: ValidFile = "ABC_DeF.pdf"; // error!

you will have

const good = validFile("ABC_DEF.pdf"); // okay
const bad = validFile("ABC_DeF.pdf"); // error! 

which is similar, if you squint at it.


Okay, here goes. First let's create a union of characters allowed in the filename prefix (the part before the extension):

type AllowedChars = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
  "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" | "S" |
  "T" | "U" | "V" | "W" | "X" | "Y" | "Z" | "_";

And then we can write a generic ValidFilePrefix<T> that validates the prefix:

type ValidFilePrefix<T extends string, A extends string = ""> =
  T extends `${infer F}${infer R}` ?
  ValidFilePrefix<R, `${A}${F extends AllowedChars ? F : "A"}`> :
  A

This tail-recursive conditional type uses template literal types to parse T, character-by-character. Each allowed character is left alone, while each invalid character is converted to "A". So ValidFilePrefix<"PqR"> will be "PAR".
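A few compile-time checks can make the character-by-character behavior concrete. This is only a sketch: the Equal helper and the constant names are hypothetical additions for testing, not part of the approach itself. Each assignment should compile only if the computed type matches the expected literal.

```typescript
type AllowedChars = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
  "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" | "S" |
  "T" | "U" | "V" | "W" | "X" | "Y" | "Z" | "_";

type ValidFilePrefix<T extends string, A extends string = ""> =
  T extends `${infer F}${infer R}` ?
  ValidFilePrefix<R, `${A}${F extends AllowedChars ? F : "A"}`> :
  A;

// Hypothetical test helper: resolves to true only when X and Y are identical types.
type Equal<X, Y> =
  (<U>() => U extends X ? 1 : 2) extends (<U>() => U extends Y ? 1 : 2) ? true : false;

// A fully valid prefix comes back unchanged.
const keepsValid: Equal<ValidFilePrefix<"XXX_YY_ZZZ">, "XXX_YY_ZZZ"> = true;
// A lowercase letter is replaced by "A".
const replacesLower: Equal<ValidFilePrefix<"PqR">, "PAR"> = true;
// Any other disallowed character (here "-" and "1") is replaced as well.
const replacesSymbols: Equal<ValidFilePrefix<"A-1">, "AAA"> = true;
// The empty string has no characters to inspect, so the accumulator "" is returned.
const emptyStaysEmpty: Equal<ValidFilePrefix<"">, ""> = true;
```

Note that the empty prefix counts as valid here, which matches the `*` in the regular expression mentioned earlier.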

Then we deal with the ".pdf" suffix like this:

type ValidFile<T extends string> =
  `${ValidFilePrefix<T extends `${infer F}.pdf` ? F : T>}.pdf`;

Here we strip off ".pdf" from the end of the string if it exists, evaluate ValidFilePrefix on the result, and then add ".pdf" back to the end. So ValidFile<"PqR.pdf"> will be "PAR.pdf", and ValidFile<"PQR"> will be "PQR.pdf".
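The same kind of compile-time check can be applied to ValidFile. Again a sketch, with Equal and the constant names being hypothetical test scaffolding. The last case is worth noticing: the suffix match is case-sensitive, so an uppercase ".PDF" is treated as part of the prefix.

```typescript
type AllowedChars = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
  "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" | "S" |
  "T" | "U" | "V" | "W" | "X" | "Y" | "Z" | "_";

type ValidFilePrefix<T extends string, A extends string = ""> =
  T extends `${infer F}${infer R}` ?
  ValidFilePrefix<R, `${A}${F extends AllowedChars ? F : "A"}`> :
  A;

type ValidFile<T extends string> =
  `${ValidFilePrefix<T extends `${infer F}.pdf` ? F : T>}.pdf`;

// Hypothetical test helper: resolves to true only when X and Y are identical types.
type Equal<X, Y> =
  (<U>() => U extends X ? 1 : 2) extends (<U>() => U extends Y ? 1 : 2) ? true : false;

// A valid name evaluates to itself.
const keepsValidName: Equal<ValidFile<"ABCD_GG_JJ_PPPP.pdf">, "ABCD_GG_JJ_PPPP.pdf"> = true;
// An invalid character in the prefix is replaced; the extension is preserved.
const fixesPrefix: Equal<ValidFile<"PqR.pdf">, "PAR.pdf"> = true;
// A missing extension is appended.
const addsExtension: Equal<ValidFile<"PQR">, "PQR.pdf"> = true;
// ".PDF" does not match ".pdf", so the "." becomes "A" and ".pdf" is appended.
const upperExtension: Equal<ValidFile<"ABC.PDF">, "ABCAPDF.pdf"> = true;
```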

And now here's validFile():

const validFile = <T extends string>(
  x: T extends ValidFile<T> ? T : ValidFile<T>
) => x;

It uses a conditional type to help with inference: the parameter type T extends ValidFile<T> ? T : ValidFile<T> lets the compiler infer T as the literal type of whatever is passed as x, and then it evaluates to ValidFile<T> either way (when T is valid, ValidFile<T> is just T). It would be nicer to write const validFile = <T extends ValidFile<T>>(x: T) => x; but the compiler complains that this is a circular constraint, so the conditional type is a workaround.


Let's try it out:

const good = validFile("ABC_DEF.pdf");
const bad = validFile("ABC_DeF.pdf"); // error! 
// Argument of type '"ABC_DeF.pdf"' is not assignable to 
// parameter of type '"ABC_DAF.pdf"'.

Looks good. The valid file name is accepted, while the invalid one is rejected. Furthermore the error message should hopefully give some indication of what's wrong. It suggests that you should have written "ABC_DAF.pdf" instead of "ABC_DeF.pdf". There might be smarter versions of ValidFile that, for example, suggest "ABC_DEF.pdf" instead, by checking the capitalization of the invalid character before giving up and suggesting "A", but I'm not going to spend any extra time on that here.
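For what it's worth, one possible shape of such a smarter version is sketched below. It assumes the built-in Uppercase<T> intrinsic type is an acceptable way to check capitalization; the names SuggestChar, SuggestedPrefix, SuggestedFile and suggestingValidFile are hypothetical and exist only for this illustration.

```typescript
type AllowedChars = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
  "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" | "S" |
  "T" | "U" | "V" | "W" | "X" | "Y" | "Z" | "_";

// Hypothetical helper: keep an allowed character, upper-case a character if
// that makes it allowed, and fall back to "A" otherwise.
type SuggestChar<F extends string> =
  F extends AllowedChars ? F :
  Uppercase<F> extends AllowedChars ? Uppercase<F> : "A";

// Same recursion as ValidFilePrefix, but using SuggestChar for each character.
type SuggestedPrefix<T extends string, A extends string = ""> =
  T extends `${infer F}${infer R}` ?
  SuggestedPrefix<R, `${A}${SuggestChar<F>}`> :
  A;

type SuggestedFile<T extends string> =
  `${SuggestedPrefix<T extends `${infer F}.pdf` ? F : T>}.pdf`;

const suggestingValidFile = <T extends string>(
  x: T extends SuggestedFile<T> ? T : SuggestedFile<T>
) => x;

const good = suggestingValidFile("ABC_DEF.pdf"); // okay
// @ts-expect-error: the parameter type should now be "ABC_DEF.pdf", not "ABC_DAF.pdf"
const bad = suggestingValidFile("ABC_DeF.pdf");
```

Characters that have no allowed uppercase form, such as digits or hyphens, still fall back to "A" in this sketch.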
