jpauclair

Because standing still is going backward…

Base64 Optimized as3 lib

with 25 comments

A few weeks ago I started a personal project that needed to send binary data to a text (not binary) receiver. This couldn’t be done without encoding the binary data in a format accepted by the receiver. There is a few well know encoder: Base64, yEnc, etc.
Base64 can have a 30% size overhead (compare to yEnc) but it’s fast and easy to use. I chose Base64.

While searching the net for some Base64 lib in as3, I found the class inside the as3Crypto package, and one from the PHPRPC protocol.

So I started with those lib. DAMN it was slow.

So why after a few years of as3 there is still no optimized library for something that common?

I went through both lib and optimized what I could. The final version is ~700% faster with almost no heap expansion compare to both libs.

What’s wrong with current as3 base64 libs

First of all, they are all using string manipulation. WHY?!

private static const BASE64_CHARS:String = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=";

By using a String as the character map, you must then use String manipulation such as:

BASE64_CHARS.charAt(outputBuffer[k]);

The as3Crypto is also using bytearray ::bytesAvailable for the main loop which is really slow !

// while there are still bytes to be processed
                        while (data.bytesAvailable > 0) {

For creating the final string, one lib join (array.join()) all the characters pushed in one array.
The other lib (crypto) make String += operation for each character added (it hurts!)

What shouold it be?

Well by definition, base64 is a binary-to-text encoder, but the encoding process have nothing to do with strings!
You start with an 8 bit byte (256) to get to a 6 bit byte (64)
So you take your first 6 bits and put them in one byte and the last 2 bit and put it in another byte.
But if you stop there you have 4 bits lost in the second byte. So we’ll compact this in the least common multiple of 8 and 6: 24!
So you read 3 bytes, modify it and output them in 4 bytes. (size overhead of ~30%)

				//Read 3 Characters (8bit * 3 = 24 bits)
                c = data[i++] << 16 | data[i++] << 8 | data[i++];

Both previous lib were using character by character concatenation.
But as I said previously, no strings in needed here… Its all byte!
So I decided to output the characters in a byte array, and return the final strings with bytearray.readUTFbytes(). (this is very fast compared to both others techniques)
One cool thing about the base64 algorithm is that you know the final length from the start:

out.length = (2 + data.length - ((data.length + 2) % 3)) * 4 / 3;

Optimization: we can set the ByteArray exact length and skip “grow” call and unneeded heap expansion. (Read my previous post concerning heap size)
The algorithm will extract your data into 6 bits chunks, but you still need to map these with the official base64 characters. In a byte array, what is stored is the “value” of that character (‘a’=97) so you don’t need to work with strings at all.
For the character mapping, you need a vector. of all the 64 characters values.

			var chars:String = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
			for (var i:int = 0; i < 64; i++)
			{
				encodeChars.push(chars.charCodeAt(i));
			}

We could write the data in the final buffer with:

				out.writeByte(_encodeChars[c >> 18] );
				out.writeByte(_encodeChars[c >> 12 & 0x3f]);
				out.writeByte(_encodeChars[c >> 6 & 0x3f]);
				out.writeByte(_encodeChars[c & 0x3f]);

But I have seen a lot of slower and older computer have better performances with only one ByteArray write so instead I used some bitwise to create an Int representing the 4 characters, and I write one Int at a time into the stream.

				//Convert to 4 Characters (6+2 bit * 4 = 24 bits)
				c = (_encodeChars[c >>> 18] << 24) | (_encodeChars[c >>> 12 & 0x3f] << 16) | (_encodeChars[c >>> 6 & 0x3f] << 8 ) | _encodeChars[c & 0x3f];
				out.writeInt(c);

On my old laptop, using one writeint instead of 4 writebyte cut the processing time in half!
And the last step before sending the final string is adding (if needed) the base64 padding characters (‘=’)
I won’t talk about the decoding code, but it’s all in the joined source

Conslusion

Don’t work with strings when you don’t REALLY need to. (They are slow)
Don’t use ByteArray methods if you don’t REALLY need to. (They are slow)

Full source code

The Base64 class is now on google code

And here it is:

/* Base64 library for ActionScript 3.0.
* Based on: Ma Bingyao code.
* Optimized by: Jean-Philippe Auclair  / jpauclair.wordpress.com
* Copyright (C) 2007 Ma Bingyao <andot@ujn.edu.cn>
* LastModified: Oct 26, 2009
* This library is free.  You can redistribute it and/or modify it.
*/
package
{
    import flash.utils.ByteArray;
    public class Base64
	{
        private static const _encodeChars:Vector.<int> = InitEncoreChar();
        private static const _decodeChars:Vector.<int> = InitDecodeChar();

        public static function encode(data:ByteArray):String
		{
			var out:ByteArray = new ByteArray();
			//Presetting the length keep the memory smaller and optimize speed since there is no "grow" needed
			out.length = (2 + data.length - ((data.length + 2) % 3)) * 4 / 3; //Preset length //1.6 to 1.5 ms
			var i:int = 0;
            var r:int = data.length % 3;
            var len:int = data.length - r;
            var c:int;   //read (3) character AND write (4) characters

            while (i < len)
			{
				//Read 3 Characters (8bit * 3 = 24 bits)
                c = data[i++] << 16 | data[i++] << 8 | data[i++];   

				//Cannot optimize this to read int because of the positioning overhead. (as3 bytearray seek is slow)
				//Convert to 4 Characters (6 bit * 4 = 24 bits)
				c = (_encodeChars[c >>> 18] << 24) | (_encodeChars[c >>> 12 & 0x3f] << 16) | (_encodeChars[c >>> 6 & 0x3f] << 8 ) | _encodeChars[c & 0x3f];

				//Optimization: On older and slower computer, do one write Int instead of 4 write byte: 1.5 to 0.71 ms
				out.writeInt(c);
				/*
				out.writeByte(_encodeChars[c >> 18] );
				out.writeByte(_encodeChars[c >> 12 & 0x3f]);
				out.writeByte(_encodeChars[c >> 6 & 0x3f]);
				out.writeByte(_encodeChars[c & 0x3f]);
				*/
            }   

            if (r == 1) //Need two "=" padding
			{
				//Read one char, write two chars, write padding
                c = data[i];
				c = (_encodeChars[c >>> 2] << 24) | (_encodeChars[(c & 0x03) << 4] << 16) | 61 << 8 | 61;
				out.writeInt(c);
            }
            else if (r == 2) //Need one "=" padding
			{
                c = data[i++] << 8 | data[i];
				c = (_encodeChars[c >>> 10] << 24) | (_encodeChars[c >>> 4 & 0x3f] << 16) | (_encodeChars[(c & 0x0f) << 2] << 8 ) | 61;
				out.writeInt(c);
            }   

            out.position = 0;
			return out.readUTFBytes(out.length);
        }   

        public static function decode(str:String):ByteArray
		{
            var c1:int;
            var c2:int;
            var c3:int;
            var c4:int;
            var i:int;
            var len:int;
            var out:ByteArray;
            len = str.length;
            i = 0;
            out = new ByteArray();
            var byteString:ByteArray = new ByteArray();
			byteString.writeUTFBytes(str);
            while (i < len)
			{
				//c1
                do
				{
                    c1 = _decodeChars[byteString[i++]];
                } while (i < len && c1 == -1);
                if (c1 == -1) break;   

				//c2
                do
				{
                    c2 = _decodeChars[byteString[i++]];
                } while (i < len && c2 == -1);
                if (c2 == -1) break;    

                out.writeByte((c1 << 2) | ((c2 & 0x30) >> 4));   

				//c3
                do
                {
                    c3 = byteString[i++];
                    if (c3 == 61) return out;   

                    c3 = _decodeChars[c3];
                } while (i < len && c3 == -1);
                if (c3 == -1) break;   

                out.writeByte(((c2 & 0x0f) << 4) | ((c3 & 0x3c) >> 2));   

				//c4
                do {
                    c4 = byteString[i++];
                    if (c4 == 61) return out;

                    c4 = _decodeChars[c4];
                } while (i < len && c4 == -1);
                if (c4 == -1) break;   

                out.writeByte(((c3 & 0x03) << 6) | c4);   

            }
            return out;
        }   

		public static function InitEncoreChar() : Vector.<int>
		{
			var encodeChars:Vector.<int> = new Vector.<int>();

			// We could push the number directly, but i think it's nice to see the characters (with no overhead on encode/decode)
			var chars:String = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
			for (var i:int = 0; i < 64; i++)
			{
				encodeChars.push(chars.charCodeAt(i));
			}
			/*
				encodeChars.push(
				65, 66, 67, 68, 69, 70, 71, 72,
				73, 74, 75, 76, 77, 78, 79, 80,
				81, 82, 83, 84, 85, 86, 87, 88,
				89, 90, 97, 98, 99, 100, 101, 102,
				103, 104, 105, 106, 107, 108, 109, 110,
				111, 112, 113, 114, 115, 116, 117, 118,
				119, 120, 121, 122, 48, 49, 50, 51,
				52, 53, 54, 55, 56, 57, 43, 47);
			*/
			return encodeChars;
		}

		public static function InitDecodeChar() : Vector.<int>
		{
			var decodeChars:Vector.<int> = new Vector.<int>();

			decodeChars.push(-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
							 -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
							 -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 62, -1, -1, -1, 63,
							 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, -1, -1, -1, -1, -1, -1,
							 -1,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14,
							 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, -1, -1, -1, -1, -1,
							 -1, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
							 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, -1, -1, -1, -1, -1
							 -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
							 -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
							 -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
							 -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
							 -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
							 -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
							 -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
							 -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1);
			return decodeChars;
		}
    }
}

References:

http://code.google.com/p/as3crypto/
http://ntt.cc/2009/11/28/base64-encoder-and-decoder-with-actionscript-3-full-source-code.html

add to del.icio.usAdd to Blinkslistadd to furlDigg itadd to ma.gnoliaStumble It!add to simpyseed the vineTailRankpost to facebook

Written by jpauclair

January 9, 2010 at 9:39 am

25 Responses

Subscribe to comments with RSS.

  1. nicely done.

    lab9

    January 9, 2010 at 11:33 am

  2. Social comments and analytics for this post…

    This post was mentioned on Twitter by destroytoday: Optimized Base64 AS3 Lib by @jpauclair: http://bit.ly/8EqgYH 700% faster than AS3Crypto….

    uberVU - social comments

    January 9, 2010 at 8:08 pm

  3. Awesome! I wonder how much faster yet this could be using tools from the Apparat project to optimize the ByteArray manipulation.

    Jackson Dunstan

    January 11, 2010 at 4:53 pm

  4. ohh my!!! little i know about all this byte stuff but all in all i’m really impressed by your work.

    vitaLee

    January 16, 2010 at 8:59 am

  5. Thanks for this implementation; for those of you who want to manipulate Strings instead of ByteArray’s you can add a couple of cheeky helpers methods:

    public static function encodeString(value : String) : String
    {
    	const source : ByteArray = new ByteArray();
    	source.writeUTFBytes(value);
    	return encode(source);
    }
    
    public static function decodeToString(str : String) : String
    {
    	return decode(str).toString();
    }
    

    Jonny Reeves

    January 25, 2010 at 5:02 pm

  6. Good version, but not best. How about my more faster and simple implementation – http://pastebin.org/84812.

    My version work without Vectors, and you may use it with 8 and 9 player versions.

    andyone

    February 1, 2010 at 3:24 am

    • I got to benchmark this because just by looking at it, doesn’t seem possible to be faster.

      jpauclair

      February 1, 2010 at 7:13 am

      • I test. You version faster. You right, sorry.

        andyone

        February 1, 2010 at 7:44 am

      • More faster :(

        andyone

        February 1, 2010 at 7:51 am

  7. [...] appreciated by the community: FlashPlayer Security, FlashPlayer Memory, AS3 Array under the hood, Optimized Base64 Encoding. But this one, I think, will rule them [...]

  8. [...] Base64 Optimized as3 lib « jpauclair A Base64 Encoder that is 700% faster than the one provided by Adobe. (tags: base64 code optimization) [...]

  9. [...] encoded image_data, which is just a very large string containing the image.  I’m using the Base64 classes provided here by Jean-Philippe Auclair, and merely getting the ByteArray.  I can then use Loader [...]

  10. [...] I did some minor modification in the Base64 Class, I will commit that a bit later but instead of doing “writeInt” I’m using [...]

  11. [...] Base64 Optimized as3 lib [...]

  12. I bet if you use readString instead of readUTFBytes it might be faster, given that Base64 doesn’t include any non-ASCII characters, and ByteArray wouldn’t have to try to decode.

    Jasper St. Pierre

    June 5, 2010 at 7:50 am

    • That probably would work if I wasn’t a moron. I thought there was a way to put the bytes in a bytestring, but it seems that all Flash supports is UTF, which means slightly slower due to decoding.

      Also, you sure you got your endianness correct? I don’t think it has a constant default between platforms (I believe it uses the native endianness) so you probably want to explicitly set it to be LITTLE_ENDIAN.

      Third, AS3 has a hidden syntax for declaring Vectors:

      var n:Vector. = new [1, 2, 3];

      So you could probably eliminate the cinit overhead for that function call for that. If they removed it sometime before release, you could also use a static class block.

      Jasper St. Pierre

      June 5, 2010 at 8:06 am

  13. Hi.
    Here’s some kind of a result of a contest me and my friend had on making Base64 decoder/encoder:
    http://code.google.com/p/e4xu/source/browse/trunk/haxe/src/org/wvxvws/encoding/Base64.hx
    (My friend won in the end, though the difference is minor). If you compare this to as3corelib, then, according to my tests it’ll be about 20-25 times faster when encoding and 10-12 times faster when decoding. Of course the results may vary in different players / OS, so I don’t claim that to be the very precise estimation, though, it must be faster ;)
    This may not be a polished version, however, if you like it, of course, you can use / modify it.

    Best.
    -wvxvw

    wvxvw

    June 6, 2010 at 9:33 am

    • This is a “best-case” for fast mem opcodes for sure.
      I wanted to make a pure as3 compatible file but it’s great that you made one with haXe.

      Thanks a lot for sharing!

      jpauclair

      June 6, 2010 at 8:55 pm

    • One thing I wanted to try with the fastmem opcodes was to read 12 byte at a time (3 int) and convert those to 16 bytes (encoded). it’s not possible with standard as3 because of the cost of the “bytearray.position”. But it would be possible to try it with the fast opcodes.

      jpauclair

      June 6, 2010 at 9:05 pm

  14. [...] Base64 Optimized as3 lib « jpauclair [...]

  15. Would you mind posting a few use case examples?

    Mr. Entropy

    July 27, 2010 at 1:25 pm

  16. Yes, a few examples would be helpful. I’m getting 1046 errors when I try to create an instance of the Base64 class…

    Scene 1, Layer ‘Layer 2′, Frame 1, Line 604 1046: Type was not found or was not a compile-time constant: Base64.

    Big Brother

    July 28, 2010 at 4:42 pm

    • I guess if you’re having this many errors it’s because you classpath does not include the Class.
      The class use all static function so it’s very easy to use.

      Include the source path in your project,
      Import the class Base64
      Base64.encode(*valid byte array to encode*)
      Base64.decode(*string previously encoded*)

      jpauclair

      July 28, 2010 at 5:53 pm

  17. [...] live demo is here: http://flashmx.us/temp/2010-08-02-webcam/. A great Base64 class was created by Jean-Philippe Auclair (http://code.google.com/p/jpauclair-blog/source/browse/trunk/Experiment/Base64/). Its [...]


Leave a Reply