Compress an Integer to a Base64 upper/lower case character in Ruby (to make an Encoded Short URL)

Time:08-06

I have a member number like "123456" that I want to encode into the shortest string I can, for use in a URL shortener (without a database).

The standard characters A-Z, a-z and 0-9 give me 62 characters to work with, or 64 characters if I add two special characters such as _ and !.

How can I convert any number up to, say, 64 into a single character? Something like:

convert(1)  # -> a
convert(10) # -> j
convert(26) # -> z
convert(27) # -> A
convert(52) # -> Z

So I could pass in any number and get back a shorter encoded string.

Attempts with things like the built-in Base64 return a string that's at least as long as the input.

Base64.encode64("10") # -> "MTA=\n" ... I want the output to be 1 character, not 5!

How can I encode integers to be a shorter base 64 string?

CodePudding user response:

First of all, there's base-64 (a numeral system) and Base64 (an encoding for binary data). Ruby's built-in Base64 module converts data (strings) to and from Base64 encoding.
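To illustrate the distinction, here is a minimal sketch of what the Base64 module does with the member number from the question. It encodes the six ASCII bytes of the string "123456", not the integer value, so the result grows: every 3 input bytes become 4 output characters.

```ruby
require "base64"

# Base64 (the encoding) sees the string "123456" as six bytes of data.
encoded = Base64.strict_encode64("123456")

puts encoded                          # Base64 text for those six bytes
puts encoded.length                   # longer than the 6-character input
puts Base64.strict_decode64(encoded)  # round-trips back to "123456"
```

strict_encode64 is used here instead of encode64 only because it omits the trailing newline.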

I assume that you, on the other hand, want to convert a number from base-10 to base-64 and then use a custom alphabet (A-Z, a-z, 0-9, _, !) to represent each digit.

Your input number 123456 is in base-10. You can convert it to base-64 via Integer#digits, which returns an array of digits with the least significant digit first (hence the reverse):

number = 123456
digits = number.digits(64).reverse
#=> [30, 9, 0]

And then map each digit to its corresponding character:

chars = [*'A'..'Z', *'a'..'z', *'0'..'9', '_', '!']

digits.map { |i| chars[i] }.join
#=> "eJA"
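Since a short URL also has to be turned back into the member number, the two steps above can be wrapped into a pair of helpers. This is only a sketch: the names ALPHABET, DIGIT_FOR, encode_id and decode_id are made up for illustration, and it assumes non-negative integers and input strings that contain only characters from the alphabet.

```ruby
# Alphabet from the answer above: index 0 is 'A', index 63 is '!'.
ALPHABET = [*'A'..'Z', *'a'..'z', *'0'..'9', '_', '!'].freeze

# Reverse lookup table (character => digit value) used for decoding.
DIGIT_FOR = ALPHABET.each_with_index.to_h.freeze

# Hypothetical helper: non-negative Integer -> short string.
def encode_id(number)
  raise ArgumentError, "number must be non-negative" if number.negative?

  # digits(64) is least significant first, so reverse before mapping.
  number.digits(64).reverse.map { |digit| ALPHABET[digit] }.join
end

# Hypothetical helper: short string -> Integer.
# fetch raises KeyError for a character outside the alphabet.
def decode_id(string)
  string.each_char.reduce(0) { |acc, char| acc * 64 + DIGIT_FOR.fetch(char) }
end

puts encode_id(123456)             # "eJA", as in the answer above
puts decode_id(encode_id(123456))  # 123456
```

The order of the alphabet is arbitrary as long as encoding and decoding share it. Putting 'a'..'z' first would give a lowercase-first mapping closer to the examples in the question, although it would still start counting at 0 rather than 1.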

CodePudding user response:

This isn't "base-64", but I've used base 36 a number of times in the past, using Ruby's built-in #to_s and #to_i methods:

2.7.2 :007 > 340.to_s(36)
 => "9g"
2.7.2 :008 > "9g".to_i(36)
 => 340

I'm not 100% sure which characters you would use for your character set. "Base 36" uses all the letters (26, lowercase only) plus all the digits (10), and 36 is the largest radix that #to_s and #to_i accept.
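Applied to the member number from the question, the base 36 approach looks like the sketch below. The variable names are only illustrative. The result is one character longer than a three-digit base-64 string for the same number, in exchange for needing no custom alphabet.

```ruby
member_number = 123456

short = member_number.to_s(36)  # digits 0-9, then a-z
back  = short.to_i(36)          # parse it back with the same radix

puts short  # "2n9c"
puts back   # 123456
```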
