I’m now well over a week behind with these challenges, so it’s Thursday of week 5, but we’re just finishing up week 4. Part of the issue is I’m quite bust, but part of the issue is that I am trying to use this as an opportunity to learn elixir, and I’m doing the elixir learning track. This means that the exercises are locked until I do the relevant learning exercises that teach the concepts. Elixir has quite an extensive learning track, and this exercise is quite far down the tree, so I actually had to do quite a few learning exercises to unlock it.

This was a fun one, and I took my time to make three different(ish) solutions. This week’s challenge was Roman Numerals: turn a number into a roman numeral.

Elixir

The simplest of my solutions was one I did in Elixir.

defmodule RomanNumerals do
  @doc """
  Convert the number to a roman number.
  """
  @spec numeral(pos_integer) :: String.t()
  def numeral(number) do
    _numeral(number,[])
  end
  def _numeral(number, acc) do
    case number do
      number when number >= 1000 -> _numeral(number - 1000, acc ++ ~c"M")
      number when number >= 900 ->  _numeral(number - 900,  acc ++ ~c"CM")
      number when number >= 500 ->  _numeral(number - 500,  acc ++ ~c"D")
      number when number >= 400 ->  _numeral(number - 400,  acc ++ ~c"CD")
      number when number >= 100 ->  _numeral(number - 100,  acc ++ ~c"C")
      number when number >= 90 ->   _numeral(number - 90,   acc ++ ~c"XC")
      number when number >= 50 ->   _numeral(number - 50,   acc ++ ~c"L")
      number when number >= 40 ->   _numeral(number - 40,   acc ++ ~c"XL")
      number when number >= 10 ->   _numeral(number - 10,   acc ++ ~c"X")
      number when number >= 9->     _numeral(number - 9,    acc ++ ~c"IX")
      number when number >= 5 ->    _numeral(number - 5,    acc ++ ~c"V")
      number when number >= 4 ->    _numeral(number - 4,    acc ++ ~c"IV")
      number when number > 0 ->     _numeral(number - 1 ,   acc ++ ~c"I")
      0 -> to_string(acc)
      _ -> "NOPE"
    end
  end
end

Nothing much to explain here: if the number is big enough, append the relevant characters to the acc (accumulator) then recursively call the function with the modified acc and the smaller number. ~c"XYZ" is how elixir creates a list of characters (which elixir distinguishes from strings). The function passes around a pair of (number, acc) until number is 0, and then it just returns acc, the string. (This kind of ambiguity in return type is something that a language haskell would not appreciate).

One issue with this approach is that, as the number gets smaller, the recursive calls to _numeral still runs through all the checks. But if you work through how it works, number only ever gets smaller, so if the number >= x check is false, it will be false on all future recursive calls to _numeral.

The other issue is that there’s a lot of repetitious code here. In essence, we have a dozen or so lines that are all doing the same thing, just with a different number to check against, and a different character list to add to the accumulator.

Can we do better? That very much depends what you mean by better, but I decided to haskell-brain my way to a solution that is no shorter, probably no more efficient, but does not have either of the two specific defects I mentioned above.

Haskell

Here it is:

module Roman (numerals) where

type FunctionGen = (Integer, String)
type NumeralAccumulator = (Integer, String)

multiply :: (Integer, String) -> FunctionGen -> FunctionGen
multiply (multiplier , string_multiples) (number,roman_num) = (multiplier * number, map (character_multiply string_multiples) roman_num)
  where character_multiply (s:_) 'I' = s 
        character_multiply (_:s:_) 'V' = s
        character_multiply (_:_:s:[]) 'X' = s

numerals_list :: [FunctionGen]
numerals_list = units ++ tens ++ hundreds ++ thousands
  where units = [ (1, "I"), (4, "IV"), (5, "V"), (9, "IX") ]
        tens  = map (multiply (10, "XLC")) units
        hundreds = map (multiply (100, "CDM")) units
        thousands = map (multiply (1000, "MZZ")) units

function_creator :: FunctionGen -> NumeralAccumulator -> NumeralAccumulator
function_creator (number, roman_number) (remainder, accumulator) 
  | remainder >= number = until (\(x, _) -> x < number) (\(x, y) -> (x - number, y ++ roman_number)) (remainder, accumulator)
  | otherwise = (remainder, accumulator)

function_list :: [NumeralAccumulator -> NumeralAccumulator]
function_list = map function_creator numerals_list

numerals :: Integer -> Maybe String
numerals n = Just . snd $ foldr (.) id function_list $ (n,"")

Let’s start at the bottom and work backwards. We’re taking a similar approach of having a function that takes a number and an accumulator (string) and, if the number is big enough, appends something to the accumulator and returns a new number and the new accumulator. The haskell version wants the output to be Maybe String, so we take the second element of the output pair and wrap it in a Just: that’s what the Just . snd is about. The . is just function composition. Unlike the elixir solution, however, we are composing functions rather than using recursion. So we get a list of functions (one function that checks if number is bigger than 1000 and if so appends M to acc, one that checks 900 and appends CM etc etc) and we use foldr (.) to create one function that composes the whole list of functions together. Other languages sometimes call foldr “reduce” or similar. (Haskell has several different kinds of folding we don’t need to get into which of these corresponds to reduce in some other language). Now, function composition is a two-place function, and thus something that can be used in foldr. The starting accumulator for this fold is the identity id. So foldr (.) id [f,g,h] is going to give us f . g . h. Or, more carefully, f . (g . (h . id)). For this to work, you need the input and output types of the functions to match up (h’s output is g’s input etc). In our case, this works because the input and output types of all the functions is going to be the same: NumeralAccumulator.

Now we just want to create our list of functions. Each function is the same, it just depends on two parameters, the number to check against, and the string to append if the input is bigger than the number. This pair of parameters is gathered together in FunctionGen. So there’s two things we need to do: create the list of pairs of parameters, and turn each parameter pair into a function.

It’s probably worth explaining at this point why I have two type synonyms for the same type at the top. So FunctionGen and NumeralAccumulator are both synonyms for (Integer, String), why define both? The point is that FunctionGen and NumeralAccumulator use their String and their Integer for different purposes. For NumeralAccumulator, as we’ll see, the Integer is the remaining value of the number that hasn’t been converted into Roman numerals yet, and the String is the accumulated characters of the converted number. For FunctionGen, the Integer is the value that the function will check its input against, and the String is the string to append if the check passes. This is also a way to use code instead of comments. If you look at the type signature for function_creator (repeated below), for example, without the type synonyms, this would just be (Integer, String) -> (Integer, String) -> (Integer, String) and it would be harder to figure out what it’s doing. But since we’re using type synonyms we can see that the first parameter is to be treated as a FunctionGen and the second is a NumeralAccumulator. And since the function is called function_creator, and its first argument is called FunctionGen, we have some inkling that this is going to be partially applied to create functions. So we pass in a FunctionGen and get as output a function from NumeralAccumulator to NumeralAccumulator.

For creating a list of functions, we can use the same trick I’ve used before, map a two-argument function over a list of parameters (a list of the type of the first argument), to get a list of functions. Using some symbols, take a function a -> b -> c and map it over a list of a to get a list of functions b -> c. That’s what function_list is doing. The remaining work is to generate the list of parameters, and to write the function that turns a parameter into a function.

function_creator :: FunctionGen -> NumeralAccumulator -> NumeralAccumulator
function_creator (number, roman_number) (remainder, accumulator) 
  | remainder >= number = until (\(x, _) -> x < number) (\(x, y) -> (x - number, y ++ roman_number)) (remainder, accumulator)
  | otherwise = (remainder, accumulator)

Let’s take function generation – function_creator – first. So, as we said, this is a function that takes a FunctionGen and returns a function from NumeralAccumulator to NumeralAccumulator. But what we’re actually going to do is write a function that takes a FunctionGen and a NumeralAccumulator and returns a NumeralAccumulator but it’s then just going to be partially applied to return the function. So function_creator takes its two arguments, and uses pattern matching to break down both arguments into the component integers and strings. Then the function basically does what a single line of the elixir _numeral function did: check if remainder >= number, and if it is append roman_number to accumulator. But there is one important wrinkle here.

In the elixir code, consider what happens with the number 2. That is, consider _numeral(2, []). It drops down to the number when number > 0 line, and thus calls _numeral(2 - 1, [] ++ ~c"I"). This recursive call also drops down to the same line on the elixir code which thus calls _numeral(0, ~c"II") which outputs "II". Now with the haskell code, we aren’t using recursion, we are composing the functions, so the function generated from (1, "I") has to handle turning 2 into "II" and 3 into "III" in one pass. That’s why we are using until. Now, until p f x keeps doing f to x until p x is true. That is, it checks p x and if it’s false, it does f x. Then it checks if p (f x) and if it’s false it does f (f x). And so on. That is, it passes the value to both p and f and if p returns false, do the same thing with the output, otherwise return the output of f. So the until line in the above code is a bit horrible because each of those f and p functions are lambdas (anonymous functions). The p in this context is (\(x, _) -> x < number) which checks if the number is less than x, and the f is \(x, y) -> (x - number, y ++ roman_number) which takes the number away from x and appends roman_number to the y. So let’s do:

until (\(x, _) -> x < 1) (\(x, y) -> (x - 1, y ++ "I")) (2, "")

Let’s do the check from the first function, (\x, _) -> x < 1) on input (2, ""): well the first thing we do is throw away the second parameter (_ in the lambda) and we check if 2 < 1. It is not, so let’s do \(x, y) -> (x - 1, y ++ "I") to input (2, ""): this returns (1, "I"). Now, since the check was false, we do it all again with the new input. Do (\x, _) -> x < 1) on input (1, "I"): again, false. Do \(x, y) -> (x - 1, y ++ "I") to input (1, "I"): (0, "II"). The check was false so we do it all again: Do (\x, _) -> x < 1) on input (0, "II"): true! So we return (0,"II"). We need to do this for (10, "X"), (100, "C") and so on. But, importantly, it doesn’t matter if we also do it for, for example (5,"V") because the until check will always be true on the second pass at most, so we’ll never get repeats of "V" or "L". (Why this is so is left as an exercise for the reader).

So now we have a function that turns a FunctionGen – a pair of an Integer and a String – into a function from NumeralAccumulator to NumeralAccumulator. Now all we need to do is generate the list of FunctionGen values we need to pass in to make function_list. I could, at this stage, just write out all the pairs [(1, "I"), (4, "IV"), (5, "V"), (9, "IX")] etc etc. But instead I decided to produce that list by exploiting the patterns in the list (why am I like this?). Notice that, given the work that the until is doing providing us with repeats, we need functions for 1, 4, 5, 9 and then we need 10, 40, 50, 90. And the pattern of characters is the same, even if the characters themselves are different. That is, the value for 4 is the value for 1 followed by the value for 5, and the value for 40 is the value for 10 followed by the value for 50. So the values for the tens can be filled in by just replacing I, V, X with X, L, C; and likewise the hundreds can be generated by using replacements C, D, M. As for the thousands, the exercise says we only need to go up to 3999, so there’s no need to figure out a symbol for 4000 or 5000, but we can replace I by M to handle thousands. So using the FunctionGen list called units – [(1, "I"), (4, "IV"), (5, "V"), (9, "IX")] – we can generate the other function_creator parameters we need by multiplying the integers by the relevant power of ten and replacing the characters by their counterparts as described above. So that’s what the multiply function does:

multiply :: (Integer, String) -> FunctionGen -> FunctionGen
multiply (multiplier , string_multiples) (number,roman_num) = (multiplier * number, map (character_multiply string_multiples) roman_num)
  where character_multiply (s:_) 'I' = s 
        character_multiply (_:s:_) 'V' = s
        character_multiply (_:_:s:[]) 'X' = s

numerals_list :: [FunctionGen]
numerals_list = units ++ tens ++ hundreds ++ thousands
  where units = [ (1, "I"), (4, "IV"), (5, "V"), (9, "IX") ]
        tens  = map (multiply (10, "XLC")) units
        hundreds = map (multiply (100, "CDM")) units
        thousands = map (multiply (1000, "MZZ")) units

We use it to map over the units list to generate the rest of the FunctionGen parameters we need. It takes a number to multiply by (a power of ten) and a string that represents the substitutions that need doing, and it outputs the modified FunctionGen. So multiply (10, "XLC") (1,"I") will output (10, "X"). The 1 has been multiplied by 10, and the "I" has been transformed into "X" through the magic of map (character_multiply "XLC") "I". The inputs to character_multiply are chars not strings, but in haskell, unlike elixir, a string just is a list of chars. So mapping character_multiply over a string just performs that function on each character in the string.

And we’re done. Putting it all together, we generate numerals_list through the use of multiply to create the tens, hundreds and thousands lists out of the units list and we concatenate them all together. We use that list to generate a list of functions using function_creator, we compose all those functions together into one big function that takes a number and a string (we pass in the empty) string, and that function adds to the string, and subtracts from the number until the number is 0 and the string corresponds to the roman numeral of the number we started with. Does this method have any virtues over the elixir method? Not especially, but it was fun getting it pieced together.

JavaScript

Both of these methods, however, are infinitely better than the Programming Crime I created in JavaScript.

const UNITS = "IVX" 
const TENS = "XLC" 
const HUNDREDS = "CDM" 
const THOUSANDS = "MZZ"
const NUMERALS = [UNITS, TENS, HUNDREDS, THOUSANDS]
const PATTERNS = ["", "0", "00", "000", "01", "1", "10", "100", "1000", "02", "2"]
export const toRoman = (n) => {
  return n.toString().split("").reverse()
    .map((c,i) => PATTERNS[parseInt(c)].split("")
      .map(x => NUMERALS[i].split("")[parseInt(x)]).join(""))
    .reverse().join("");
};

Again, we’re using that idea that there’s a pattern to roman numerals. Multiplying something by 10 is simply to replace each I with a X, each V with a L and so on. So the pattern for the numbers 10, 20, 30 and so on reiterates the pattern for numbers 1, 2, 3 but with different characters. So instead of taking the input to be a number, let’s just take the input to be the string representation of that number. Let’s map each numeral 0, 1, 2 etc to a pattern, the pattern of the roman numeral representation of it. Then all we need to do is figure out the position of this character in the string to figure out which characters to use in filling out the pattern for that numeral. So take the number 14. Treat it as a string. Turn each character into the pattern for that character: 1 becomes 0, 4 becomes 01. Then, taking into account the position of the character, map the pattern to a string. So the 1 is in the tens column, so map 0 to X. The 4 is in the units column so 01 maps to IV. If the 4 had been in the tens column, 01 would have mapped to XL. That’s all there is to it, turn a number into a string, into a pair of a pattern and a column, and then map the pattern and column into a string. Concatenate the strings and we’re done. But, instead of writing it out normally, let’s use nested array maps, several reverse operations and let’s also actually make use of a javascript feature that normally trips me up instead of being useful.

Let’s do this line by line. So:

  return n.toString().split("").reverse()

Take a number, turn it into a string, turn the string into a list of characers, and reverse it so that the units are index 0, the tens are index 1 etc. This means we don’t have to mess about with figuring out the index of the units column from the length of the string blah blah. Yes it’s an expensive way of avoiding some fairly simple maths. Sue me. So if we passed in 291, at this intermediate stage, that would have been turned into ['1', '9', '2'].

    .map((c,i) => PATTERNS[parseInt(c)].split("")

Take that list of strings (characters, I don’t think javascript makes a distinction) and map each one to… something. But wait, the function we pass in to map has two parameters. The first is the character from the array we got above, and the second is the index of that element in the array. So we take the first parameter c, we treat it as a number, and use it as the index to PATTERNS. And then we turn that string – the elements of PATTERNS are strings – into a list of characters. So PATTERNS[1] = "0" and thus we map 1 to ["0"], 9 gets mapped to ["1", "2"], and 2 gets mapped to ["0", "0"]. Or, more carefully, we map (1,0) to ["0"], we map (9,1) to ["1", "2"] and (2, 2) to ["0", "0"].

      .map(x => NUMERALS[i].split("")[x]).join(""))

We’re still inside that map, so we are mapping each of ["0"], ["1", "2"], and ["0", "0"]. And this is where we use that second parameter i above. We use i to pick out the correct units for our translation. NUMERALS is a list of strings. Each string in NUMERALS provides the information for how to turn a pattern into the roman numeral string for that pattern. And each string in NUMERALS corresponds to a column in the number. So the first element of NUMERALS is the UNITS which converts 0 into I, 1 into V and so on. and the second is the TENS column, which converts 0 into X, 1 into L and so on. The UNITS string – IVX – is used to do this conversion in the following way: we split("") it into a list, and then use the values in the pattern to pick out the element of that list by index. So a "0" gets mapped to UNITS[0] which is "I". The list of characters then gets turned into a string using join(""). As I’m explaining this, I realise that javascript has silently coerced these strings into numbers to use as indices to an array. It would probably have been safer to use parseInt(x) in place of x here. And that certainly would have been necessary if we had had to do any index maths (if I hadn’t reversed the list to make UNITS be index 0, for example). So our (1,0) uses UNITS to map the pattern to "I", and (9,1) becomes "XC" using TENS and (2,2) gets mapped – using HUNDREDS – to "CC". That extra parenthesis at the end means that we’re now done with both the inner and the outer map.

        .reverse().join("");

Finally, we reverse the list back, so the bigger numbers are to the left, and then we join the parts together. So we get "CCXCI", which is, indeed, 291 in roman numerals.

Now, this works, but it’s horrible to look at, difficult to parse, and I’m just sorry I made you look at it. It does, at least, demonstrate a quite different way to approach creating roman numerals as compared to the elixir and haskell solutions.

Stay tuned for more javascript crimes and haskell-brain next week!

February 15, 2024

Javascript crimes: Week 4 of 48 in 24

Elixir

Haskell

JavaScript

February 15, 2024

Javascript crimes: Week 4 of 48 in 24

Elixir

Haskell

JavaScript

Similar articles: