Saturday, April 08, 2023

Strongly Typed String Literal Split-Map-Join in TypeScript

The Problem: Write a strongly typed TypeScript function that takes a valid CSS property name in kebab-case (as used in CSS) and returns the same property name in camelCase (as usually used in CSS-in-JS). For example, calling this function with "font-size" should return "fontSize".

The Strongly Typed String Literal Requirement: A TypeScript type that contains all valid kebab-case CSS property names is provided. Make sure TypeScript can infer the correct camelCase output from this function. To explain it in code:

type kebabCasePropertyName =
  | 'align-content' 
  | 'align-items'
  | 'align-self'
  | 'background'
  | 'background-attachment'
  | 'background-color'
  | 'background-image'
  /* remaining valid property names */

function convert(propertyName: kebabCasePropertyName) /*: define return type */ {
  /* implement function */
}

const camelCasePropertyName = convert('align-content');
// typeof camelCasePropertyName should be 'alignContent'

convert('invalid-property-name');
// TypeScript should report a compile-time error

Here is a TypeScript Playground with the same code. If you want to try solving this problem yourself, go ahead and try it out. You may find a solution better than the one I’m going to share below.

The Solution: This is the TypeScript Playground with the solution code. Now we can go into understanding how it works.

type Split<S, D>

This generic type splits string literal S with string literal delimiter D.

type Split<S extends string, D extends string>
  = string extends S ? Array<string>
  : S extends '' ? []
  : S extends `${infer T}${D}${infer U}` ? [T, ...Split<U, D>]
  : [S];

Here S extends string means S has to be a string. More precisely, S has to be a subtype of string: its possible values must form a subset of all possible string values. The same applies to D, so S and D have to be strings or TypeScript will report a compile-time error.

The next few lines use the ternary conditional operator several times to pattern match different possible types of S. The following pseudo-code may help you follow the logic:

type Split<S extends string, D extends string>
if (string extends S) then return Array<string>
else if (S extends '') then return []
else if (S extends `${infer T}${D}${infer U}`) then return [T, ...Split<U, D>]
else return [S];

First, we try to match string extends S. If it passes, that means S isn’t a string literal. (Previously we already knew S is a subset of string. If string is also a subset of S then S is exactly string. Nothing more. Nothing less.) It’s a string and its value is unknown at compile time. There’s nothing we can do here. Split<S, D> can only be narrowed down to Array<string>.
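
As a quick check of this first branch, here is a minimal sketch. IsWideString is a hypothetical helper that is not part of the solution; it only isolates the string extends S test so we can see which inputs pass it.

```typescript
// Hypothetical helper (not in the original solution): true only when S is
// the wide `string` type, false for literals and unions of literals.
type IsWideString<S extends string> = string extends S ? true : false;

// Each annotation only compiles if the type resolves as claimed.
const wideCheck: IsWideString<string> = true;
const literalCheck: IsWideString<'font-size'> = false;
const unionCheck: IsWideString<'a' | 'b'> = false;
```

Only the wide string type passes the test, which is why Split<S, D> falls back to Array<string> in that branch alone.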

Then we try to match S extends ''. This means S is the empty string, because the only string that fits into the empty string literal type is the empty string itself. In that case we can narrow Split<S, D> down to an empty array.

And then we try to match S extends `${infer T}${D}${infer U}`. There are two concepts we need to understand here:

  1. TypeScript template literal types. When using JavaScript template literals, TypeScript can infer all the possible string interpolation outcomes.
  2. The infer keyword. It can only be used in the extends clause of a conditional type. It can be used to deconstruct a type that’s constructed from other types.

So here we try to deconstruct S into template literal type `${infer T}${D}${infer U}`. For example, in Split<'hello-world', '-'>, D is '-', so S can be deconstructed into T = 'hello' and U = 'world', because `${T}${D}${U}` will construct 'hello-world'. By using infer, we ask TypeScript to figure out T and U for us.
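
To see infer at work in isolation, here is a small sketch. SplitOnce is a hypothetical type made up for this illustration; it performs a single deconstruction step with a hard-coded '-' delimiter and no recursion.

```typescript
// Hypothetical helper (not in the original solution): deconstruct S around
// the first '-' and return both inferred parts as a tuple.
type SplitOnce<S extends string>
  = S extends `${infer T}-${infer U}` ? [T, U]
  : never;

const helloWorld: SplitOnce<'hello-world'> = ['hello', 'world'];

// With several delimiters, T stops at the first '-' and U takes the rest.
const firstAndRest: SplitOnce<'a-b-c'> = ['a', 'b-c'];
```

Because U keeps everything after the first delimiter, feeding U back into the same pattern is what lets a recursive type walk through the whole string.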

If the deconstructing works, we can narrow down Split<S, D> into [T, ...Split<U, D>]. This is very similar to how we would implement a JavaScript split function with recursion:

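function split(string, delimiter) {
  // Assumes a non-empty delimiter.
  const index = string.indexOf(delimiter);
  return index >= 0
    ? [
      string.substring(0, index),
      // Skip the whole delimiter, not just one character.
      ...split(string.substring(index + delimiter.length), delimiter)
    ]
    : [string];
}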

If the deconstructing doesn’t work, the last line in the pseudo-code is just like the last line inside the JavaScript above. It means S is a string literal but it doesn’t contain D, so we return [S]. We can see the similarity between JavaScript and TypeScript type expressions.
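
The four branches can be exercised with a few annotated constants. This is only a sketch: the Split definition is repeated from above so the snippet stands alone, and the sample inputs are arbitrary choices for illustration.

```typescript
type Split<S extends string, D extends string>
  = string extends S ? Array<string>
  : S extends '' ? []
  : S extends `${infer T}${D}${infer U}` ? [T, ...Split<U, D>]
  : [S];

// Branch 1: a wide string can only become Array<string>.
const wideParts: Split<string, '-'> = ['anything', 'goes'];
// Branch 2: the empty string becomes an empty tuple.
const emptyParts: Split<'', '-'> = [];
// Branch 3: recursion peels off one word per delimiter.
const threeParts: Split<'border-top-width', '-'> = ['border', 'top', 'width'];
// Branch 4: no delimiter, so the literal is returned as a single element.
const singlePart: Split<'background', '-'> = ['background'];
```

One difference from the JavaScript function is worth keeping in mind: because the empty string maps to [] at the type level, an empty remainder after a trailing delimiter contributes no element, whereas the JavaScript version returns a trailing ''.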

type Join<A, D>

This is like reversing Split<S, D>, in a very similar recursive manner. `${T}${D}${Join<U, D>}` represents that recursion.

type Join<A extends Array<string>, D extends string>
  = A extends [] ? ''
  : A extends [infer T extends string] ? `${T}`
  : A extends [infer T extends string, ...infer U extends Array<string>] ? `${T}${D}${Join<U, D>}`
  : string;

Split<S, D> requires S and D to be string literals. Join<A, D> requires A to be a tuple type whose elements are all string literal types. If A doesn’t satisfy these requirements, we can only narrow Join<A, D> down to string.

Here we use template literal type `${T}${D}${Join<U, D>}` to construct one string type from multiple string types. This is the opposite operation of how we deconstruct in Split<S, D>.
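
A few sample inputs show each branch of Join<A, D>. This is a sketch with the definition repeated so it stands alone; it assumes TypeScript 4.7 or later, which is needed for the infer T extends string syntax.

```typescript
type Join<A extends Array<string>, D extends string>
  = A extends [] ? ''
  : A extends [infer T extends string] ? `${T}`
  : A extends [infer T extends string, ...infer U extends Array<string>] ? `${T}${D}${Join<U, D>}`
  : string;

// Two or more elements: the delimiter goes between them.
const joinedWords: Join<['font', 'size'], '-'> = 'font-size';
// A single element is returned as is, with no delimiter added.
const joinedOne: Join<['background'], '-'> = 'background';
// An empty tuple becomes the empty string.
const joinedNone: Join<[], '-'> = '';
// A plain array of unknown length can only be narrowed to string.
const joinedWide: Join<Array<string>, '-'> = 'any string at all';
```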

type LowercaseArray<A>

Again we are using recursion to iterate through an array. This is similar to Join<A, D>. However, we don’t return a template literal type. We return a new array type that contains new string literal types.

type LowercaseArray<A extends Array<string>>
  = A extends [] ? []
  : A extends [infer T extends string, ...infer U extends Array<string>] ? [Lowercase<T>, ...LowercaseArray<U>]
  : A;

TypeScript has a built-in Lowercase<T> that returns the lowercase version of a string literal type as another string literal type. We don’t need to do this ourselves.
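
Here is a short sketch of both the built-in and the array version in action. The definition is repeated so the snippet stands alone, and the mixed-case inputs are made up for illustration.

```typescript
type LowercaseArray<A extends Array<string>>
  = A extends [] ? []
  : A extends [infer T extends string, ...infer U extends Array<string>] ? [Lowercase<T>, ...LowercaseArray<U>]
  : A;

// The built-in works on a single string literal type.
const loweredWord: Lowercase<'WebKit'> = 'webkit';

// The recursive type applies it to every element of a tuple.
const loweredWords: LowercaseArray<['FONT', 'Size']> = ['font', 'size'];
```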

type CapitalizeArray<A>

It’s very similar to LowercaseArray<A>. We lowercase each string literal type first, then use the built-in Capitalize<T> to capitalize its first letter.

type CapitalizeArray<A extends Array<string>>
  = A extends [] ? []
  : A extends [infer T extends string, ...infer U extends Array<string>] ? [Capitalize<Lowercase<T>>, ...CapitalizeArray<U>]
  : A;
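
As a sketch of what this produces, the snippet below repeats the definition and applies it to an arbitrary mixed-case tuple. Because each word is lowercased before it is capitalized, an all-caps input still ends up with only its first letter in upper case.

```typescript
type CapitalizeArray<A extends Array<string>>
  = A extends [] ? []
  : A extends [infer T extends string, ...infer U extends Array<string>] ? [Capitalize<Lowercase<T>>, ...CapitalizeArray<U>]
  : A;

// 'font' gains a capital letter; 'SIZE' is lowercased, then capitalized.
const capitalizedWords: CapitalizeArray<['font', 'SIZE']> = ['Font', 'Size'];
```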

type CamelCaseArray<A>

In camelCase, the first word is all lowercase, and each subsequent word is lowercase with its first character capitalized. This can be achieved by combining LowercaseArray<A> and CapitalizeArray<A> into a new type CamelCaseArray<A>.
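
The definition of CamelCaseArray<A> is not spelled out in this post, so the following is only one possible sketch and may differ from the code in the Playground. It assumes the first element is lowercased with the built-in Lowercase<T> (the same operation LowercaseArray<A> applies to each element) and the remaining elements go through CapitalizeArray<A>.

```typescript
type CapitalizeArray<A extends Array<string>>
  = A extends [] ? []
  : A extends [infer T extends string, ...infer U extends Array<string>] ? [Capitalize<Lowercase<T>>, ...CapitalizeArray<U>]
  : A;

// Assumed definition: lowercase the first word, capitalize all the others.
type CamelCaseArray<A extends Array<string>>
  = A extends [] ? []
  : A extends [infer T extends string, ...infer U extends Array<string>] ? [Lowercase<T>, ...CapitalizeArray<U>]
  : A;

const camelWords: CamelCaseArray<['BACKGROUND', 'color']> = ['background', 'Color'];
const camelSingle: CamelCaseArray<['Background']> = ['background'];
```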

In the end, we can combine this type with Split<S, D> and Join<A, D> to create CamelCase<S>. It’s just like how we would implement this as a JavaScript function: a chained split-map-join operation.

function convert(propertyName) {
  return propertyName
    .split('-')
    .map((word, index) => index === 0
      ? word
      : `${word.charAt(0).toUpperCase()}${word.substring(1)}`)
    .join('');
}
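
Putting everything together, a typed version of convert could look like the sketch below. Several parts are assumptions rather than the Playground code: the CamelCaseArray and CamelCase definitions, the infer ... extends guards (added here so each intermediate result is known to be an array of strings), the shortened property name union, and the cast through unknown, which is needed because the runtime string operations cannot prove the type-level result. It assumes TypeScript 4.7 or later.

```typescript
type Split<S extends string, D extends string>
  = string extends S ? Array<string>
  : S extends '' ? []
  : S extends `${infer T}${D}${infer U}` ? [T, ...Split<U, D>]
  : [S];

type Join<A extends Array<string>, D extends string>
  = A extends [] ? ''
  : A extends [infer T extends string] ? `${T}`
  : A extends [infer T extends string, ...infer U extends Array<string>] ? `${T}${D}${Join<U, D>}`
  : string;

type CapitalizeArray<A extends Array<string>>
  = A extends [] ? []
  : A extends [infer T extends string, ...infer U extends Array<string>] ? [Capitalize<Lowercase<T>>, ...CapitalizeArray<U>]
  : A;

// Assumed definition: lowercase the first word, capitalize all the others.
type CamelCaseArray<A extends Array<string>>
  = A extends [] ? []
  : A extends [infer T extends string, ...infer U extends Array<string>] ? [Lowercase<T>, ...CapitalizeArray<U>]
  : A;

// Assumed definition: split on '-', camel-case the words, join with ''.
type CamelCase<S extends string>
  = Split<S, '-'> extends infer Words extends Array<string>
    ? CamelCaseArray<Words> extends infer Cased extends Array<string>
      ? Join<Cased, ''>
      : never
    : never;

// Shortened stand-in for the full list of valid property names.
type kebabCasePropertyName =
  | 'align-content'
  | 'background'
  | 'background-color';

function convert<S extends kebabCasePropertyName>(propertyName: S): CamelCase<S> {
  // Same split-map-join as the JavaScript version; the input is already lowercase.
  return propertyName
    .split('-')
    .map((word, index) => index === 0
      ? word
      : `${word.charAt(0).toUpperCase()}${word.substring(1)}`)
    .join('') as unknown as CamelCase<S>;
}

// These annotations only compile if the inferred return types match.
const camelAlignContent: 'alignContent' = convert('align-content');
const camelBackgroundColor: 'backgroundColor' = convert('background-color');

// @ts-expect-error 'invalid-property-name' is not in kebabCasePropertyName
convert('invalid-property-name');
```

Making convert generic in S is what lets TypeScript keep the exact literal that was passed in; with a plain kebabCasePropertyName parameter, the return type could only be the union of every camelCase name.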

How about the opposite operation? How can we create a TypeScript generic type that converts camelCase back to kebab-case? That’s an exercise for you. There’s no clear delimiter like '-' in this operation. Think about how you would do it in JavaScript and use pattern matching in TypeScript to achieve the same result.
