Base-56 Integer Encoding in PHP

I recently needed to write a URL-shortening system, nothing particularly fancy, just something that would produce URLs similar to those of TinyURL and bit.ly. PHP’s base64_encode function is meant for encoding arbitrary string data as text; its output is longer than the input and can contain “+”, “/” and “=”, so it is not strictly “URL safe” either. PHP also has a base_convert function, but it only goes up to base-36, which seemed like a waste when a larger alphabet produces shorter codes.
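To make the comparison concrete, here is a small sketch of what the two built-in functions do with a numeric ID. The value 1000000 is just an arbitrary example, not something from the shortener itself.

```php
<?php
// Arbitrary example ID; any non-negative integer would do.
$id = 1000000;

// base64_encode works on the bytes of the decimal string, so the result grows.
$viaBase64 = base64_encode((string) $id);

// base_convert treats the input as a number, but stops at base 36.
$viaBase36 = base_convert((string) $id, 10, 36);

echo $viaBase64, " (", strlen($viaBase64), " characters)\n"; // MTAwMDAwMA== (12 characters)
echo $viaBase36, " (", strlen($viaBase36), " characters)\n"; // lfls (4 characters)
```

The base-36 result is already much shorter than the 7-digit decimal ID, and a larger alphabet can shorten the codes further.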

I found a couple of functions written in Python on Stack Overflow for base-62 conversion that solved the problem perfectly. The author of the solution, Baishampayan Ghose, also suggested a shorter alphabet, with easily confused characters removed for readability, which whittles it down to base-56. I’ve ported the functions to PHP and used a base-56 alphabet in the example below, but it would be trivial to swap in a longer one if required.
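As a quick sanity check, the base-56 alphabet used below can be derived from the full base-62 one by dropping six look-alike characters. This is only a sketch: the variable names are mine, and the list of removed characters is simply what is missing from the alphabet in this post.

```php
<?php
// Full base-62 alphabet: digits, then lowercase, then uppercase.
$base62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Characters that are easy to confuse when read or typed: 0/O/o and 1/l/I.
$ambiguous = str_split("01loIO");

// Removing them leaves the 56-character alphabet.
$base56 = str_replace($ambiguous, "", $base62);

echo strlen($base62), " -> ", strlen($base56), "\n"; // 62 -> 56
echo $base56, "\n";
```

Swapping in the longer alphabet is then just a matter of passing str_split($base62) to the functions instead.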

$alphabet_raw = "23456789abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ";
$alphabet = str_split($alphabet_raw);

/*
    Encode a non-negative integer in Base X, where X is the size of the alphabet

    `num`: The number to encode
    `alphabet`: The alphabet to use for encoding (array of single characters)
*/
function base56_encode($num, $alphabet){
    // Zero maps to the first symbol of the alphabet; a literal 0 is not part of it.
    if ($num == 0){
        return $alphabet[0];
    }

    $arr = array();
    $base = sizeof($alphabet);

    while($num){
        $rem = $num % $base;
        // Integer division (PHP 7+); avoids float rounding on large values.
        $num = intdiv($num, $base);
        $arr[] = $alphabet[$rem];
    }

    // Digits were collected least-significant first.
    $arr = array_reverse($arr);
    return implode('', $arr);
}

/*
    Decode a Base X encoded string into the number

    Arguments:
    - `string`: The encoded string
    - `alphabet`: The alphabet that was used for encoding
*/
function base56_decode($string, $alphabet){
    $base = sizeof($alphabet);
    $strlen = strlen($string);
    $num = 0;
    $idx = 0;

    $s = str_split($string);
    // Map each character back to its position in the alphabet.
    $tebahpla = array_flip($alphabet);

    foreach($s as $char){
        $power = ($strlen - ($idx + 1));
        $num += $tebahpla[$char] * (pow($base, $power));
        $idx += 1;
    }
    return $num;
}
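To see what the two functions do to a single value, the arithmetic for the number 1000 can be traced by hand. This is a sketch rather than part of the ported code; the variable names are made up for illustration and intdiv assumes PHP 7 or later.

```php
<?php
$alphabet = str_split("23456789abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ");
$base = count($alphabet); // 56

// Encoding: 1000 = 17 * 56 + 48, so the two digits are 17 and 48.
$id   = 1000;
$high = intdiv($id, $base); // 17
$low  = $id % $base;        // 48
$code = $alphabet[$high] . $alphabet[$low];
echo $code, "\n"; // jS

// Decoding: look each character up again and weight it by its position.
$positions = array_flip($alphabet);
$decoded = $positions[$code[0]] * $base + $positions[$code[1]];
echo $decoded, "\n"; // 1000
```

Calling base56_encode(1000, $alphabet) should give the same "jS", and base56_decode("jS", $alphabet) should give back 1000.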

Usual code-usage caveats apply.
