The Ashes
Details on JS compression; Squeezing every last byte on the wire
Ray Cromwell has a great article on techniques he has used with JavaScript compression to bring down the payload of your Ajax application.
There are some fantastic advantages to JavaScript being “binary as source”, but there is also a real cost: verbose code means more bytes on the wire, even with minifiers and compressors. GWT has a compile step, which allows it to do a little more, and Ray has done some work to get smart compression in there:
Combining base-54/base-64 obfuscated identifier encoding, stable sort-order for identifier allocation, my greedy clustering-by-edit-distance sort algorithm, and 7-zip as a gzip-compatible compressor, yields an incredible 21% reduction of the Showcase application. On a large 500k Javascript application, this means an additional 100k bandwidth is saved, with no performance penalty!
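To make the "base-54/base-64" part concrete, here is a minimal sketch of one way such an identifier encoding could work. It is an illustration, not Ray's or GWT's actual code: the name `short_name` and the exact alphabet order are assumptions. The idea is that a JavaScript identifier may start with one of 54 characters (letters, `$`, `_`) and continue with any of 64 (those plus digits), so the n-th variable can be given the n-th shortest legal name.

```python
import string

# Assumed alphabets: 54 characters that may start a JavaScript identifier,
# and 64 characters that may appear in later positions.
FIRST = string.ascii_letters + "$_"
REST = FIRST + string.digits


def short_name(n: int) -> str:
    """Map a non-negative index to a short identifier (hypothetical helper).

    The first character is chosen in base 54, later characters in base 64.
    This sketch does not skip reserved words such as "do" or "if"; a real
    obfuscator would have to.
    """
    name = FIRST[n % 54]
    n //= 54
    while n > 0:
        n -= 1  # bijective numbering, so "aa" follows "_" with no gaps
        name += REST[n % 64]
        n //= 64
    return name


if __name__ == "__main__":
    print([short_name(i) for i in (0, 1, 53, 54, 55)])
```

The first 54 identifiers are one character long and the next 3,456 are two characters long. Allocating them in a stable order means the same variable tends to get the same short name from build to build, which is the "stable sort-order" part of the quote.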
The post itself walks you through some of the basics of compression, and techniques such as sliding windows:
LZ77 on the other hand, is a sliding window compression algorithm based on replacing strings with backwards references to previous strings in the input. For example, the string “this is a test” contains the substring ‘is’ repeated twice in a row, separated by a space, so that the second occurrence of ‘is’ can be replaced with a length (2 characters) and a backwards distance (-3 positions), called the length-distance pair. The compressor typically scans backwards in the input within a certain window (e.g. 8,192 characters or 32,768 characters) looking for matches and then encoding them as length-distance pairs. The compressor has some freedom as to how hard it will search for a match before giving up (something I’ll get to later).
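The matching step described in that quote can be sketched in a few lines. This is a toy brute-force matcher, not how gzip or 7-zip actually search (they use hash chains and other shortcuts); the function names and the `min_len` threshold are assumptions made for illustration.

```python
def longest_match(data: str, pos: int, window: int = 32768):
    """Return (length, distance) of the longest earlier match for data[pos:].

    Only candidates within `window` characters before `pos` are considered.
    Returns (0, 0) when nothing matches. Nearer candidates win ties.
    """
    best_len, best_dist = 0, 0
    for cand in range(pos - 1, max(0, pos - window) - 1, -1):
        length = 0
        # A match may run past `pos` and overlap itself, as in real LZ77.
        while pos + length < len(data) and data[cand + length] == data[pos + length]:
            length += 1
        if length > best_len:
            best_len, best_dist = length, pos - cand
    return best_len, best_dist


def lz77_tokens(data: str, window: int = 32768, min_len: int = 3):
    """Greedy tokenizer: literals are characters, matches are (length, distance)."""
    tokens, pos = [], 0
    while pos < len(data):
        length, dist = longest_match(data, pos, window)
        if length >= min_len:
            tokens.append((length, dist))
            pos += length
        else:
            tokens.append(data[pos])
            pos += 1
    return tokens


def lz77_decode(tokens) -> str:
    """Rebuild the text by copying `length` characters from `distance` back."""
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            length, dist = tok
            for _ in range(length):
                out.append(out[-dist])
        else:
            out.append(tok)
    return "".join(out)


if __name__ == "__main__":
    text = "this is a test"
    print(longest_match(text, 5))            # (3, 3)
    print(longest_match(text, 5, window=2))  # (0, 0)
    print(lz77_tokens(text))
```

On “this is a test” the sketch reports a match of length 3 at distance 3 rather than length 2, because the space after ‘is’ repeats as well. Shrinking the window to 2 characters makes the match disappear, which is the window limit in miniature.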
One important effect of the sliding window limit is that if two Javascript functions with common substrings are separated by more than this distance, they cannot be matched.
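The window effect is easy to observe with Python's standard `zlib` module, which implements DEFLATE with a 32,768-byte window. The sketch below is only an illustration under stated assumptions: seeded random bytes stand in for a repeated chunk of code and for the unrelated material between the two copies, and the helper names are made up for this example.

```python
import random
import zlib


def random_bytes(rng: random.Random, n: int) -> bytes:
    """Incompressible filler, used as a stand-in for unrelated code."""
    return bytes(rng.getrandbits(8) for _ in range(n))


def bytes_saved(payload: bytes) -> int:
    """How many bytes zlib at its highest level removes from the payload."""
    return len(payload) - len(zlib.compress(payload, 9))


rng = random.Random(0)             # fixed seed so runs are repeatable
block = random_bytes(rng, 2000)    # the "common substring", repeated twice

# Second copy starts 3,000 bytes after the first: well inside the window.
near = block + random_bytes(rng, 1000) + block
# Second copy starts 42,000 bytes after the first: outside the 32 KB window.
far = block + random_bytes(rng, 40000) + block

if __name__ == "__main__":
    print("saved when close together:", bytes_saved(near))
    print("saved when far apart:     ", bytes_saved(far))
```

When the two copies sit close together, most of the second copy is replaced by back-references. When they are more than a window apart, the compressor cannot see the first copy any more and saves essentially nothing, which is why the order in which functions are emitted matters.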
Really interesting stuff, Ray. I hope to see some of this in the minification libraries!