
Just released: lz-string

lz-string is a compression program for JavaScript in the browser, based on LZW. It is fast, meant for localStorage or any other string-based storage in JavaScript, and is particularly efficient on short strings.

Everything is explained on the home page: http://pieroxy.net/blog/pages/lz-string/index.html A live compression demo can be found here: http://pieroxy.net/blog/pages/lz-string/demo.html

Categories : News, JavaScript

Lua decompress

I implemented the decompression algorithm in Lua 3.2, but it takes too long to complete. I do not understand why, since compression is extremely fast.

[Compress / Decompress] Python

[Decompress only] Lua 3.2

Data is compressed in Python and decompressed in Lua. I wanted to use lz-string in a communication network (RPC).
Avatar: pieroxy

Lua decompress

I don't know anything about Lua, but I'll try to have a look anyway. My main question is: why would you want lz-string to do this? lz-string is made to compress a string into another string. Both Python and Lua have byte arrays/streams, which work with a large array of existing compression algorithms such as lzma, gzip, lz4 and plenty more, both better-performing and faster than mine.
Avatar: Charlie

Re: Just released: lz-string

This implementation/upgrade of LZW is genius!! Especially for Chrome, since localStorage stores every single byte as a word (2 bytes), occupying twice as much space; instead of having 5MB of storage it gets reduced to 2.5MB.

C'est génial!

Avatar: pieroxy

Re: Just released: lz-string

This is not specific to Chrome but applies to all browsers, AFAIK. localStorage stores JavaScript Strings, which are UTF-16, meaning every character takes up 16 bits of space. localStorage is effectively limited to 5MB, even on Chrome; it's just that Strings use a lot of memory in JavaScript. Note that this is not specific to JavaScript: it is the same in Java and C#, for example.

The real issue I tried to solve in JavaScript is "how to represent a binary stream of data". The best answer I found is using Strings and using the 16-bits of all characters as storage.
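A minimal sketch of that idea (not lz-string's actual code, just an illustration of the technique): pack an arbitrary stream of bits into the full 16 bits of each JavaScript string character.

```javascript
// Hypothetical illustration: pack a stream of bits (array of 0s and 1s)
// into a string, one character per 16 bits, padding the tail with zeros.
function packBits(bits) {
  let out = '';
  for (let i = 0; i < bits.length; i += 16) {
    let code = 0;
    for (let j = 0; j < 16; j++) {
      code = (code << 1) | (bits[i + j] || 0); // missing bits pad with 0
    }
    out += String.fromCharCode(code);
  }
  return out;
}
```

Sixteen one-bits become the single character U+FFFF; a lone 1 bit, once padded, ends up as U+8000. Note that some of the 65536 possible values are not valid UTF-16 characters, which is exactly the problem compressToUTF16 works around.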

Avatar: Anonymous

Re: Just released: lz-string

hi

Great job !!!

In my application I would like to get compressed template data from the server. Have you thought about implementing this algorithm in Java?

Avatar: pieroxy

Re: Just released: lz-string

 Just turn on gzip compression on your server, and everything your server sends to the browser should be gzip-compressed automatically, which is most likely much, much more efficient. You can verify this by checking that your server sends back a "Content-Encoding: gzip" header in the HTTP response to your browser. On Chrome, with the debugging tools open on the Network tab, you can see "Size/Content" for every request: Size is the actual number of bytes transferred, while Content is the size of your uncompressed content.

Re: Just released: lz-string

FYI, I'm using LZString within a chrome-app (aka packaged app) and found that while using LZString.compress() to compress ~12MB of string before saving in chrome.storage.local, Chrome started suspending my chrome app unpredictably. Switching to LZString.compressToUTF16() fixed the suspend problem.

I suspect Chrome has some background storage checker that suspends extensions that store malformed data, like LZString compressed strings.

Avatar: pieroxy

Re: Just released: lz-string

 Thanks for the feedback. Thinking about it, LZString.compressToUTF16() seems much safer. Strings are sets of characters after all and who can predict what bugs an invalid character will trigger?

What is this chrome.storage.local you're using? If I read you correctly you store more than 5MB in it, so it must be something else than localStorage...

Re: Just released: lz-string

chrome.storage.local is storage API available to Chrome apps and extensions.

You can find documentation here: http://developer.chrome.com/apps/storage.html

Unlike localStorage, it's async. It's still limited to 5MB.

Good news is that I've managed to cut down my data to ~3.7M *characters* which, using LZString, compresses down to 10% (w00t). Bad news is, when I call chrome.storage.local.getBytesInUse, the returned number is ~2.5M *bytes* which, if correct, suggests the chrome.storage.local implementation doesn't just store what it receives but transforms it somehow, such that much of the compression magic gets diffused. It could be that localStorage does the same.

Re: Just released: lz-string

I found myself needing a Java implementation of this for pre-compressing data from a Java program before putting it on a static web-server. I decided to not rely on the internal automatic compression of most web servers to reduce complexity.

I wrote one and put it on my github here if anyone is interested: https://github.com/ownaginatious/lz-string-java

Just a word of caution though; I haven't done a very extensive amount of testing.

Avatar: pieroxy

Re: Just released: lz-string

 Thanks for the effort. Just out of curiosity, why don't you rely on the webserver compression? That one is full-fledged gzip, which ought to compress a lot better than LZ-String...

 

Anyways, I'll link to your project from the documentation.

Re: Just released: lz-string

Hey, you can remove the link to my project (lz-string-java) from your website. I deleted it as it had too many problems and was confusing people. It looks as though someone implemented a better version. Thanks :)
Avatar: pieroxy

Re: Just released: lz-string

Just did it, thanks for the update.

Re: Just released: lz-string

Well, for my specific application it was crucial that data get compressed before being sent to clients. From what I understand, if I were to rely on the internal gzip of the webserver, there is a chance that some devices wouldn't support it and would request the uncompressed version, which would waste a lot of bandwidth. This kind of "guarantees" that the compression will always happen. The content I'm serving is already stored as files compressed using your algorithm :). I also wanted a method of sending compressed data back to a webserver, which from what I can tell isn't natively supported by many browsers.

I know that's an awfully specific purpose, but I figure someone else out there might find a clever use for a Java implementation :)

Also, I forgot to mention before - great library! Out of all the JS compression libraries I've looked through, yours is by far the easiest to use!

Avatar: Paul

lzString works ok, breaks Chrome debug tools

I am experimenting with lzString to store compressed data from JavaScript simulations.

LZString seems to be working fine.

But... when I bring up the debug tools and click over to the Local Storage tab, it hangs, and the debug tools cannot be dismissed with the X. The website keeps working though.

Local storage with uncompressed data, even though it is pretty big (250K chars), does not break the Chrome Local Storage viewing tab in the dev tools.

This is probably Chromium's problem, not yours. I have not yet tried other browsers.

Just thought you might want to be aware of it, and wondering what it might be...

Avatar: pieroxy

lzString works ok, breaks Chrome debug tools

 I have noticed this behavior with a full localStorage (5MB of packed strings), but it doesn't freeze; it just takes an awful lot of time rendering those weird strings. If you look at your task manager (or top in a terminal) while trying to open the dev tools, you'll notice that one full core is crunching data for Chrome. This is the rendering part. Once the rendering is done you can interact with your dev tools again.

Firebug exhibits the same behavior.

This has to do (I think) with the handling of ultra-long strings coupled with the rendering of obscure UTF-16 characters. With just a few kilobytes in localStorage everything is instantaneous.

Re: Just released: lz-string

 You're an absolute life saver, extremely easy to use and opens up lots of possibilities with cookies.

Re: Just released: lz-string

Hi! I just wanted to leave a comment to say thanks for writing and releasing this - as you found out yourself, there really isn't anything else comparable that is readily available! I wanted to be able to reduce the size of JSON data stored on the server and decompress it on the client and this has enabled me to do just that. I was hosting on NeoCities which has a limit on the size of hosted sites - so relying on gzip for transmission was no good as that compresses the content delivered over the wire but doesn't reduce the backend storage requirements. Shameless plug: I wrote about this at JavaScript Compression (Putting my JSON Search Indexes on a diet). But really I just wanted to say thanks. So thanks! :)

Avatar: Anonymous

nodejs error ?

I'm testing a Node client <--> server chat program, and was wondering if your script was working for sending base64 strings.
It is working and it is not working :S ;)

A simple test gives me this:

var stringex = "This is my compression test.";
console.log("Size of sample is: " + stringex.length);

var compressed = Base64String.compress(stringex);
console.log("Size of compressed sample is: " + compressed.length);
string = Base64String.decompress(compressed);
console.log("Base64String Sample is: " + string);

var compressed3 = LZString.compressToUTF16(stringex);
console.log("Size of compressToUTF16 sample is: " + compressed3.length);
string = LZString.decompressFromUTF16(compressed3);
console.log("compressToUTF16 Sample is: " + string);

output:

Size of sample is: 28
Size of compressed sample is: 10
Base64String Sample is: ThisismycompressiontestA
Size of compressToUTF16 sample is: 17
compressToUTF16 Sample is: null


Avatar: pieroxy

nodejs error ?

 So, I got around to it and I tested my lib on nodejs. 

Your first example (using Base64String.compress) cannot work, as Base64String is meant to compress base64-encoded content. Your string ("This is my compression test.") is not a valid base64 string, so it doesn't work. Basically, Base64String is meant to re-encode Base64 content (usually images). The rationale is that base64 takes up a full character to store 6 bits, while compressToUTF16 stores 15 bits per character.

Your second example is more useful in that it really doesn't work. It looks as if your compressed test string triggers a bug in either compressToUTF16 or decompressFromUTF16. I'll work on it as soon as I get home.

Avatar: pieroxy

nodejs error ?

I just released version 1.3.3. Version 1.3.2 was the result of a pull request that I obviously didn't test enough...

Avatar: Anonymous

nodejs error ?

 Okay, thanks.

I will use it like this:

Serverside:
LZString.compressToUTF16(JSON.stringify({JSON:DATA}))

Clientside:
data = JSON.parse(LZString.decompressFromUTF16(data.compr));

Will test more later.

Avatar: pieroxy

nodejs error ?

 That's the idea. How will you get the resulting encoded string on the client side? Ajax.responseText ?

Avatar: Anonymous

nodejs error ?

Something like that, using http://socket.io/ but I can only send UTF-8 (and no binary) :S
compressToUTF16() still works, but maybe not for all messages...

Maybe I can create a UTF-16 detection:
IF( ContainsUTF16Char( LZString.compressToUTF16(JSON.stringify(args[1])) )){
  nocompression
}


 

Avatar: pieroxy

nodejs error ?

I'm not worried about encoding problems here, but I am worried that a UTF-8 encoded string is going to be substantially bigger than the UTF-16 counterpart LZString is producing, resulting in wasted bandwidth. Your UTF-16 detection is going to return true all the time: compressToUTF16 *is* generating UTF-16 characters, as its name suggests.

The best way would probably be to generate ISO-8859-1 (Latin-1) characters, all 256 of them being valid, or UTF-16. But for this to be "optimal" you need the content type to be set correctly. Do you have control over the content encoding of these requests? If yes, I suggest switching to "Content-Type: text/html; charset=utf-16" for better bandwidth usage.

If not, we'd need to write a compressToUTF8, using 7 bits per character. Not that hard.

The ideal solution would be for socket.io to be able to transfer byte arrays instead of strings. After all, we're trying to transmit binary data, not text.

 

Avatar: Anonymous

nodejs error ?

Haha, thanks for the information, I don't know very much about UTF-8 / UTF-16.

Socket.io doesn't support binary yet :(
If you think it's easy to create a compressToUTF8(), then I say :)

A quick compression check gives me this:

msg = {data:'I say hello'}
console.log('no compression',JSON.stringify(msg).length)
compr = LZString.LZString.compressToUTF16(JSON.stringify(msg));
console.log('compression',compr.length);

no compression 22
compression 17
 

msg = {data:'I say hello, I can say more and more text so this is big'}
console.log('no compression',JSON.stringify(msg).length)
compr = LZString.LZString.compressToUTF16(JSON.stringify(msg));
console.log('compression',compr.length);

no compression 67
compression 37

As you can see there is some compression.

But if I understand you correctly, you can get more compression using compressToUTF8()?
Avatar: pieroxy

nodejs error ?

 You are confused between two very different things: compression and encoding. 

Compression is the act of taking bits as input and outputting fewer bits. This is where the LZ part of LZString comes in.

Encoding is the act of taking bits as input and outputting characters. This is where the String part of LZString comes in. It is needed because JavaScript doesn't know (consistently, at least) how to handle binary data. It can handle numbers and strings.

So no, you cannot "make more compression" using compressToUTF8(). compress returns a UTF-16 string that is packed to the maximum, but since not all 65536 values of a 16-bit int are valid characters, the string is an invalid UTF-16 string. This is why I created compressToUTF16, which only stores 15 bits per character (hence a slightly less optimal encoding) but produces a valid string. compressToBase64 gives you a string where only 6 bits are used per character. You might think that's complete bullshit, as the string will be more than twice as big as one produced with compressToUTF16, and yet it is the most efficient way to upload your data to your server, because everything passes through the "url encoding" encoder, making every character outside the Base64 range at least three bytes.

Now, to exchange data, computers usually use bytes, not strings. UTF-8, ISO-8859-1 and UTF-16 are methods (encodings) to represent a string as a byte stream. A 1024-character string may be represented as 1024 bytes in ISO-8859-1 but 2048 bytes in UTF-8. Similarly, another string may be represented as 1000 bytes in UTF-8 but 2000 bytes in UTF-16; yet another as 1000 bytes in UTF-16 and 1500 bytes in UTF-8. So you have to know what you're doing in order to optimize your stream of data. Just calling "length" on your strings gives you the number of characters, not the number of bytes needed to transfer these characters to your browser.

So, yes, anyone can write a compressToUTF8() method (it is actually very simple). But in UTF-8 the first bit is reserved for higher characters, so you can only store 7 bits per character, wasting a full 12.5% of bandwidth, whereas with UTF-16 I only waste half of that to be UTF-compliant. I am sure you can set the content type of your requests with your library, so why not just set it to UTF-16 and be done with it? You save bandwidth and time, and get a more elegant solution.

But if you use compressToUTF16 and send it out encoded in UTF-8, your stream may very well end up being twice as big as it needs to be. Again, only testing will tell.
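The character-count vs. byte-count distinction above is easy to check in Node (the sample characters here are made up for illustration):

```javascript
// One character holding 15 payload bits (as compressToUTF16 output would)
// costs up to 3 bytes once the string is serialized as UTF-8, while a
// Base64 character always costs exactly 1 byte.
const utf16Char = String.fromCharCode(0x7A3F); // a high code point: 1 character
const base64Char = 'A';                        // 1 character, 1 UTF-8 byte

console.log(utf16Char.length, Buffer.byteLength(utf16Char, 'utf8'));   // 1 3
console.log(base64Char.length, Buffer.byteLength(base64Char, 'utf8')); // 1 1
```

So a compressToUTF16 string that looks shorter in `.length` can still transfer more bytes than a longer Base64 string if the wire encoding is UTF-8.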

Ajax

 Hello, thanks for the information.

I am using the https://github.com/ownaginatious/lz-string-java code to compress a JSON file (UTF-16). I want to send the compressed data with Ajax to the client and use the JavaScript LZ-String to decompress it, but when the data is sent to the client, the JavaScript variable for the data is undefined.

Please help me.

Re: Just released: lz-string

Thanks for helping me with this haha, its hard to understand it :)

What i'm testing right now is this:

Sending message function(){

msg = JSON.stringify(args[1]);
compres = LZString.LZString.compressToBase64(msg);

if(Buffer.byteLength(compres, 'base64')<Buffer.byteLength(msg, 'utf8')){

    console.log('COMPRESSION');

       args = ['event',{compr:compres}];

}

}

http://nodejs.org/api/buffer.html#buffer_class_method_buffer_bytelength_string_encoding

So it checks if the compression is smaller then the utf-8 message.

 

Avatar: pieroxy

Re: Just released: lz-string

 Wow... I can see quite a few mistakes here. First, you compress your content and then encode it in Base64 twice... once with compressToBase64 and once with Buffer.byteLength(..., 'base64'), hence multiplying its size by ~1.8. It looks like you need to read a bit about what encoding is and what compression is.

Second, and this is a general advice whenever asking a question over the internet, we have no clue what you are trying to achieve. I mean, apart from calling LZString. What is it you want to do with LZString? Without this information, it's really hard to help you.

At last, what it looks like is that you're trying to compress some data on your server in order to send it to your client. Are you aware that pretty much all browsers and all webservers natively support compression? Furthermore, this is transparent, automatic and use gzip which is a far far better compression algorithm than LZString.

As I wrote in the documentation, LZString is meant to be used inside the browser. Of course, you can use it for other purposes, but then, be careful. It wasn't meant for that.

Re: Just released: lz-string

> Wow... I can see quite a few mistakes here. First, you compress your content and then encode it in Base64 twice... once with compressToBase64 and once with Buffer.byteLength(..., 'base64'), hence multiplying its size by ~1.8. It looks like you need to read a bit about what encoding is and what compression is.

?? Buffer.byteLength(..., 'base64') gives me the compressed length.
Buffer.byteLength(msg, 'utf8') gives me the non-compressed length.

If the compression is smaller, then I send the compressed data to the client using socket.io (websocket/flash/ajax/etc).
If the compression is not smaller, then I send the non-compressed data to the client using socket.io.

Socket.io doesn't use gzip for the messages. So that's why I'm searching for a compression method to send the messages to the clients and back using JavaScript. It works great now, with some small compression on some text messages.

Avatar: Jawa-the-Hutt

C# Class implementation

I needed this in a WCF Service, so I wrote a C# class file to go with it. You can find it here: https://github.com/jawa-the-hutt/lz-string-csharp
Avatar: pieroxy

C# Class implementation

Thanks, that's great! I referenced your implementation in the home page for the project.
Avatar: eguardiola

Re: Just released: lz-string

Hi,
Many thanks for lz-string, it's working great!!!
I'm using it from GWT. I needed to code this tiny wrapper:
public class LZString {

	public native static String compressToUTF16(String input) /*-{
		return $wnd.LZString.compressToUTF16(input)
	}-*/;

	public native static String decompressFromUTF16(String input) /*-{
		return $wnd.LZString.decompressFromUTF16(input)
	}-*/;

}
And include the lz-string.js file in the host page. :-)

Re: Just released: lz-string

First of all, great work Pieroxy! :) I want to use your project to compress a string before sending it via Ajax. The purpose is to stay within a restrictive LimitRequestBody Apache directive. The question is: is there a way to decompress the string in PHP? Thanks!
Avatar: pieroxy

Re: Just released: lz-string

I am not aware of any port in PHP at the moment. Porting the algorithm should be very straightforward though and shouldn't take more than a few hours. I wrote a small section about porting the lib on the home page to help you start up.

Re: Just released: lz-string

Hi, I did a beta in PHP. Feel free to use it @ https://github.com/nullpunkt/lz-string-php cheers

CompressToUTF8

Thanks for the awesome library! It would be great if you could add compressToUTF8 to the library, to use with XmlHttpRequests. XmlHttpRequests cannot be sent as UTF-16, only UTF-8. They won't get automatically urlencoded, so no overhead there, but the UTF-16 will be re-encoded to UTF-8 and that, as you know, is bad :)
Avatar: pieroxy

CompressToUTF8

How do you send data to the server with XmlHttpRequest without having it urlencoded ?

CompressToUTF8

I do a simple POST - XmlHttpRequest does not urlencode POST data. Depending on what data you post there are different rules, which can be read here: http://www.w3.org/TR/XMLHttpRequest/#the-send()-method POSTing a plain string means: let encoding be "UTF-8", let mime type be "text/plain;charset=UTF-8", let the request entity body be data converted to Unicode and encoded as UTF-8.
Avatar: pieroxy

CompressToUTF8

Indeed very simple... I don't know what I was thinking ;-) In the meantime you can send base64-encoded content. 6 bits used out of 8 for each character... It is less efficient but probably a lot better than transcoding the UTF-16 output into some UTF-8 crap.
Avatar: Jonathan Wagner

Re: Just released: lz-string

We are using lz-string for compression and are using the .net port. We are having serious performance issues the moment a string gets over say 200kb. We really want to keep it because it works great on the client but the compression/decompression serverside is killing our server.
Avatar: pieroxy

Re: Just released: lz-string

Well... I haven't had a look at the C# port of the lib, so there may be an issue there. I have a few remarks though:
  • Compression on the server looks rather useless here. gzip is built into every server and browser out there, performs better, and should be used when going from the server to the client.
  • The decompression routine is much faster than the compression one.
  • The lib is meant for small amounts of data, and its dictionary isn't capped, so with large chunks of data it may take a lot of memory. 200kb looks very low in this regard, and I have compressed much larger chunks of data in browsers in a timely fashion.
  • I have a port of lz4 in JavaScript coming, but I really have no timeline for it. I may be able to work on it a little more in the coming weeks (holiday season) but I make no promises.
Can you tell me if you are having memory issues or CPU issues? On which method do you see those issues (compress, decompress...) ?
Avatar: Jonathan Wagner

Re: Just released: lz-string

We found the issue: the decompression and compression functions use the string concatenation operator += which is highly inefficient for larger strings. We switched to StringBuilder and the performance issues disappeared. Files that were taking 30+ seconds to decompress dropped to under a second. The size of our data generally never exceeds 5MB. The reason we opted for lz-string was actually cross-browser support as far back as IE7. If you're interested, we use LZS with our API http://www.scribblemaps.com/api/
Avatar: pieroxy

Re: Just released: lz-string

You know it also works on IE6 ;-)

Great to hear you found out about the issue. Thanks for the update.

Avatar: Jawa-the-Hutt

Re: Just released: lz-string

I'm the author of the C# implementation. I do have a pull request that I haven't had time to look at yet that is supposed to address some performance issues. I'll try and get that done sometime in the next few days, pending my work schedule. Once I get it merged, I'll post back and let you know so you can give it a try.

Re: Just released: lz-string

Hi pieroxy. A week ago I decided to use LZString on my project because the JSON messages sent to the client were very large. I'm currently using Node.js as a backend, and your library did amazing things on compression and decompression. On my client side I have both Java and JavaScript, and on the Java side I had problems using ownaginatious/lz-string-java: the decompression method did not decompress data from the server, and my backend code wasn't decompressing data sent from the client either. So I decided to implement an exact port of version 1.3.3 of your JavaScript LZString library. And thanks to this post: http://www.productiverage.com/javascript-compression-putting-my-json-search-indexes-on-a-diet, I achieved what I needed using the UTF-16 compression and decompression on both sides (server and client). So I would like to share this Java code with whoever has the same needs: https://github.com/diogoduailibe/lzstring4j. If you or anybody have feedback, please let me know. Thanks.
Avatar: pieroxy

Re: Just released: lz-string

Thanks ! I'm adding your project to the main LZString page.

Re: Just released: lz-string

I am using this library to compress part of a GET URL. Since the compress function returns a binary response, I'm using the convert_string_to_hex method from your demo page to make the compressed data URL-friendly. IMHO this is much better than the compressToBase64 method, which yields a bigger string. I'll unhex the string and decompress on the server side. Can you think of any cases where this will not work? Thanks for the great work!
Avatar: pieroxy

Re: Just released: lz-string

Well, I can tell you that compressToBase64 uses 6 bits per character and a hex string uses 4 bits per character, so base64 will produce smaller strings in all cases. The only problem with base64 is that it uses characters that will be escaped in a URL. This is easily circumvented by using
compressToBase64(data).replace(/=/g, "$").replace(/\//g, "-")
Of course, when decompressing, you should apply the reverse replaces ;-)
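Spelled out as a pair of helper functions (the replacement characters follow the snippet above, with `+` also mapped since it turns into a space in query strings; global regexes are needed because `String.replace` with a plain string only replaces the first occurrence):

```javascript
// Hypothetical URL-safe wrappers around lz-string's Base64 output.
// '=' -> '$', '/' -> '-', '+' -> '.'  (and back again when decompressing).
function toUrlSafe(b64) {
  return b64.replace(/=/g, '$').replace(/\//g, '-').replace(/\+/g, '.');
}
function fromUrlSafe(s) {
  return s.replace(/\$/g, '=').replace(/-/g, '/').replace(/\./g, '+');
}
```

fromUrlSafe(toUrlSafe(x)) gives back x for any Base64 string x, so the restored string can be fed straight to decompressFromBase64.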

Re: Just released: lz-string

Realized that right after hitting submit. Now wisely using base64 encoding.
Avatar: kevin

Re: Just released: lz-string

This library and your work are inspiring, keep it up. This is minor and I feel bad pointing it out: http://pieroxy.net/blog/pages/lz-string/index.html#inline_menu_5 should reference base64 as being about 133% of the original binary size (not 166%). On a side note, do you have any interest in incorporating a base128 encoder into lz-string? I've got some pretty ugly JavaScript I'm happy to pass along. (This is an extreme edge case, but it yields a cost of only 114.3% in size for certain storage implementations that can be coaxed into charset=iso-8859-1, including offline HTML presented from disc or thumbdrive.)
Avatar: pieroxy

Re: Just released: lz-string

My math may be failing me, but 16/6 ≈ 2.66, hence the 166% bigger ;-) Not 166% of the original size. Remember compress returns a string where 16 bits are used per character. compressToBase64 produces a string where 6 bits are used per character.

Base128 really is a corner case, so I don't think it should be incorporated in the lib itself. It can be incorporated as an "add-on" for extra encoding options. You can submit a patch for a file called "lzstring-encoding.js" for example. I might add some other URL encoding as well in there. That sounds like a good idea.
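The arithmetic above, spelled out:

```javascript
// n characters of raw compress() output carry 16 bits each; re-encoded at
// 6 bits per Base64 character, the same data needs 16/6 times the characters.
const ratio = 16 / 6;                                // ≈ 2.66x the characters
const percentBigger = Math.floor((ratio - 1) * 100); // ≈ 166% bigger
console.log(ratio.toFixed(2), percentBigger + '%');
```

The 133% figure applies when comparing Base64 to raw *bytes* (8/6 ≈ 1.33), which is the source of the confusion above.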

Re: Just released: lz-string

Your math is good. I forgot about the 16-bit blocks in the traditional implementation. Good call.
Avatar: Scott

Minified lz-string-x.y.z.js

Examining 1.3.3 I don't think that it can be minified and function. (e.g. not all code blocks are surrounded by {}'s)

Have you tried this? Are you interested in taking the code back if I make the necessary modifications?

I'm looking at including it in an application where we will use the Closure compiler to minify and concatenate it with other code.

Avatar: pieroxy

Minified lz-string-x.y.z.js

Can you explain to me (or point me in the direction of) the purpose of this? The last three lines in the JS file are there for node.js exclusively, so if you don't want them just throw them away.
Avatar: Scott

Re: Minified lz-string-x.y.z.js

This is to be able to minify lz-string so that the download to devices (desktop or mobile) is as small as possible. It also means that lz-string could probably be concatenated to other JS files to reduce the number of HTTP GETs to the server, reducing latency.

This has nothing to do with node.js and I don't know whether the last three (3) lines are relevant, yet.

Avatar: pieroxy

Re: Minified lz-string-x.y.z.js

Ok, I know what minifying means... I don't see what the problem is there. I just created a minified version of the lib on github, but I didn't test it at all. Can you tell me if it works?
Avatar: Scott

Re: Minified lz-string-x.y.z.js

It appears to work, although I have only tried the de/comp->UTF16 routines.

I should have just compressed it first, however, our experience has been that code that passes JSLint always works when minified and code that doesn't regularly - though not always - has issues that are not always obvious.

Our standard has been to JSLint all code to increase our odds that all code paths are Ok, not just the ones we've managed to test (min'd vs orig).

I'd still encourage you to JSLint the source as it will minimize the chances there is code that will not minify gracefully.

Avatar: Tarak

Arabic characters

Hello, we are using LZString in an ASP.NET application. We have a web service (.asmx) in .NET that returns an XML string in compressed format using the compressToBase64 function, and we have an HTML page that calls the web service using Ajax. We get the response and read the string by decompressing with the decompressFromBase64 function in JS. This works fine until now, but we have a case where the data contains some Arabic characters (تستأنف في وقت لاحق) in the XML, and when that string is decompressed in JS, we don't get the exact characters that were passed from the .NET web service. I checked the compressToUTF16 function but it's not working. Can you please help or suggest any solution to work with Arabic (or any other language) characters during compress/decompress of a string? Thanks
Avatar: pieroxy

Arabic characters

I wrote LZString specifically for the purpose of handling UTF characters, so there shouldn't be any issue on that front. How do you know the decompressed version is garbled? In other words, what do you do with the decompressed string?
Avatar: Tarak

Arabic characters

Decompressed string is shown on web page i.e. assign the string to div tag of html page.
Avatar: pieroxy

Arabic characters

What is the page charset set to? If the page is displayed in ISO-8859-1 for example, Arabic characters cannot be shown (obviously). The page would need to be in UTF-8 (or another Unicode encoding) for this to work. One other question: if you debug the decompression process - with Firebug or any other debugging tool for your browser of choice - what does it show right after the decompression stage?

Last question: Is this on a page on the internet where I can have a look?

Avatar: Anonymous

Arabic characters

The page charset is correctly set. But I finally found where it was going wrong. In the .NET web service called by JS, we have a .NET object that is converted to an XML string through an XML serializer. When an object containing some Arabic characters was serialized, it produced wrong output. I was using Encoding.Default.GetString(memStream.ToArray()); earlier, and switching to Encoding.UTF8.GetString(memStream.ToArray()); solved the problem. Thanks for your help and quick reply.

decompressFromBase64 is null

When I try to use decompressFromBase64 on a textarea value previously compressed with compressToBase64, the result is always null. This is the JavaScript code I'm using:

jQuery( document ).ready(function() {
  var text = jQuery('#overall_summary').val();
  textC = LZString.decompressFromBase64(text);
  jQuery('#overall_summary').val(textC);
});

Any idea why it is not working?
Avatar: pieroxy

decompressFromBase64 is null

I can only imagine that the text you're passing to decompressFromBase64 isn't the same as the text you got from compressToBase64. What I would suggest:
  • Try to decompress the text right out of the compressToBase64 method. If it works, see below, if it doesn't, paste it here so I can investigate.
  • Try to see why the base64 data you're decompressing isn't the same as the one produced by LZString.
Avatar: peter

Java problem?

Hi there. I like the look of this. I want to use it to compress data being sent via websocket to the web browser client, to minimize bandwidth usage. However, the server side (Java) appears not to be working right. The compressed result is only 1 byte shorter than the input, and it does not decompress in a stand-alone test (the result string is null). I'm using JDK8u5. Please advise.
Avatar: pieroxy

Java problem?

It looks as if it is a problem with the Java implementation of LZString, not with LZString itself. I'd advise you to first try to debug it with a very small input (something like "abcabc" for example). Compress with lz-string in your browser, compress with Java on your server and compare the results. If they are not the same, the problem is with either the Java implementation or the way you use it. You can then go to the author of the Java port for more advice.

Also, as there are two Java ports, you can always try both.

Re: Just released: lz-string

It's taking me 5 seconds to compress an image which is already base64-encoded, and then 15 seconds to decompress it back to the original base64-encoded image format. My dev laptop specs are pretty good, with an i7 and 32GB of RAM. Is there something I am missing? This is in C#.
Avatar: pieroxy

Re: Just released: lz-string

It's hard to say with this little information. Can you post your code? Also, how big is the file you're trying to compress? I would also suggest posting your question to the author of the C# port.

Remember, LZ-String was always meant for small strings compression, not arbitrary length binary data.

Re: Just released: lz-string

Can you provide an array version of lz-string? I would like to compress a string into binary, send it over a websocket and store the binary directly in a database table.
Avatar: pieroxy

Re: Just released: lz-string

Errrr... An array of what? The output of the compress method is a string where all 16 bits of every character are used. That's pretty close to being an array. And it's also one for loop away from being an array. I'm sure you can write it in just one line of code.

The whole point of LZString is that it produces Strings. That's even in the name of the lib. What you do with those strings is up to you.
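For instance, the "one for loop" could be sketched like this (a sketch, not part of lz-string's API; the function names and the big-endian byte order are arbitrary choices made here for illustration):

```javascript
// Pack the 16-bit characters of compress() output into a Uint8Array,
// high byte first, and unpack them back into a string.
function stringToUint8Array(str) {
  var buf = new Uint8Array(str.length * 2);
  for (var i = 0; i < str.length; i++) {
    var code = str.charCodeAt(i); // 0..65535, one UTF-16 code unit
    buf[i * 2] = code >>> 8;      // high byte first (explicit endianness)
    buf[i * 2 + 1] = code & 0xff; // low byte
  }
  return buf;
}

function uint8ArrayToString(buf) {
  var chars = [];
  for (var i = 0; i < buf.length; i += 2) {
    chars.push(String.fromCharCode((buf[i] << 8) | buf[i + 1]));
  }
  return chars.join("");
}
```

Writing the byte order out explicitly, as here, also sidesteps the endianness concern raised below for typed-array views.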

Re: Just released: lz-string

An ArrayBuffer - this one can be sent easily over a WebSocket. But there is no built-in function in JavaScript to convert a string to a Uint8Array (in UTF-16 format).
Avatar: pieroxy

Re: Just released: lz-string

You can try the accepted answer in this StackOverflow question.

Re: Just released: lz-string

This solution does not consider endianness. If I write a function for Uint8Array, will you consider accepting it upstream?
Avatar: pieroxy

Re: Just released: lz-string

It would be great! You can add a couple of functions compressToUint8Array and decompressFromUint8Array.
Avatar: Jonathan Wagner

Re: Just released: lz-string

So I have come across what I believe may be a bug in the compression code, but I can't be sure since I don't have a complete understanding of what is going on. If you could send me a note at my email address, I can forward you a link to the lz-string code. It appears to be cutting off without finishing, and the dictionary size is not lining up with the c parameter in the JavaScript code.

Re: Just released: lz-string

I have a problem when porting to Java. I left an issue on GitHub. Versions 1.0.2 and 1.3.3 are found in the reference folder, and they are NOT completely binary-compatible in the 'compress' method: compress("hello1hello2hello3hello4hello5hello6hello7hello8hello9helloAhelloBhelloChelloDhelloEhelloF").length() is 23 with 1.3.3 but 22 with 1.0.2. Should I just port the 1.3.3 version instead? The code in 1.3.3 is full of duplication, though.
Avatar: pieroxy

Re: Just released: lz-string

Thanks, I am going to investigate this.

Re: Just released: lz-string

Anyway, the port is done, and the code has been in production for 2 weeks; all seems OK.

My code is much faster than dioduailibe's implementation, which I think is too slow to be used in real production.
Dioduailibe's also lacks compressToBase64, which is exactly what I need...
Also, mine has a Rhino engine test case.

Compressing a 559KB JSON on an i5-3570k @3.40GHz:
dioduailibe's code needs 907ms.
Mozilla's Rhino engine needs 649ms (and may do even better in continuous real production, because Rhino's environment is slow to initialize; dioduailibe's is slower than Rhino even so...).
My code needs 117ms.

My code is at
https://github.com/rufushuang/lz-string4java

Can you give a link in the doc?
Avatar: pieroxy

Re: Just released: lz-string

Thanks. Link done.

Re: Just released: lz-string

It seems that decompress and decompressFromBase64 jam on an empty string, as opposed to checking for it and just returning it.
Avatar: pieroxy

Re: Just released: lz-string

Thanks, I am going to have a look into this.
Avatar: eduardtomasek

lz-string for python 3

Hi, I made a port for Python 3. We store some data using lz-string and I need to process the data using a Python worker. The implemented and tested methods are compress, decompress, compressToBase64 and decompressFromBase64. I briefly tested it on production data (packed JSON) and it worked. https://github.com/eduardtomasek/lz-string-python
Avatar: pieroxy

lz-string for python 3

Thanks a lot for this. I'll add a link in the doc.
Avatar: Barney

base64 – URI-safe?

Hi pieroxy,
I'm writing a single-page application that I want to be RESTful (everything necessary to build the model stored in URI) while holding multiple variables in query params. Some of these hold keys in indefinite lists, and I'd like to persist as many of these as possible while keeping the state bookmarkable without using a server-side service to generate a link to a JSON-encoded state – this becomes necessary when the URI reaches over 2000 characters.
Do you think it's safe to use base64 to encode and decode URI query parameters? So far everything seems to be working well but there may be edge cases I'm not aware of…
Thanks for a fantastically efficient & functionally complete plugin!
Avatar: pieroxy

base64 – URI-safe?

Hi. To encode your string as a valid URI component you can use:
compressToBase64().replace("=","$").replace("/","-")
Indeed, the output of a base64 encoding is very close to being URI-compliant. Just replace = with $ and / with - and there's nothing more to URI-encode. Just remember to convert those two characters back to = and / respectively before decompressing.
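A small sketch of that substitution (the function names are illustrative, not part of lz-string). One caveat: in JavaScript, String.replace with a plain string only replaces the first occurrence, so global regexes are needed when the output contains more than one '=' or '/':

```javascript
// '=' -> '$' and '/' -> '-', and back again before decompressing.
// Global regexes are used because .replace("=", "$") would only
// touch the first occurrence in JavaScript.
function base64ToUriSafe(b64) {
  return b64.replace(/=/g, "$").replace(/\//g, "-");
}

function uriSafeToBase64(s) {
  return s.replace(/\$/g, "=").replace(/-/g, "/");
}
```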
Avatar: pieroxy

base64 – URI-safe?

FYI, I've released version 1.3.5 with a new method: compressToEncodedURIComponent
Avatar: Barney

Re: Just released: lz-string

Thanks a lot. Works for me!

Possibility of compressing a JPEG image binary

Is it possible to compress a JPEG image captured from the mobile camera directly, without converting the binary to a base64 string first? My requirement is:
  • Capture a JPEG image from the camera
  • Compress the JPEG binary
  • Convert it to a base64 string
  • Send this value to the server
  • Decode the base64, compressed image on the server
  • Decompress it
  • Use it.
Avatar: pieroxy

Possibility of compressing a JPEG image binary

JPEG is already a compressed format; LZ-String, like any other lossless compression program (gzip, 7zip, ...), will not squeeze much out of a JPEG binary and may very well produce an output that is bigger than the input.

Just encode your JPEG in base64 and send it to your server.

Avatar: Glenn

15-bit compress/decompress

Would generating the UTF16 versions be as simple as cloning the compress/decompress code and changing some constants? In compress, change 15 to 14, and in decompress change 32768 to 16384? Would that produce the same data stream as the double pass presently used? I don't understand the whole algorithm yet... but those constants seem to control the packing and unpacking of bits in the compressed data stream... Maybe some other constants or calculations would have to change along with those?
Avatar: pieroxy

15-bit compress/decompress

You are correct, all the methods other than compress() do make a second pass to re-encode the results. Modifying the core algorithm for every encoding output means cloning it every time.

That said, the compress method is pretty much useless since there's very little one can do with the result. So cloning the algo for compressToUTF16 probably makes sense. I'll get into that. Thanks.

Avatar: Glenn

15-bit compress/decompress

Your extra "confirm" makes me retype this message hours later, because it expired. Anyway, when saving as UTF-8, 7 or 11 bits per character give the optimal sizes and 8 & 12 bits are the worst, due to the design of UTF-8, but it's not quite a 10% savings at best. It was easy to change the Python port to store different numbers of bits per character, so I did that for testing. For UTF-16 storage, 15 bits works well, giving a 20% savings versus the uncompressed version in UTF-8. So I wind up with different application-specific encoding techniques that have saved me about 25%, instead of using your scheme. But I like your scheme, and may use it if I need to do significant local storage. At the moment, a few hundred bytes is all. If you want my tweaked code, send me an email address to send it to.
Avatar: pieroxy

15-bit compress/decompress

I rewrote the UTF16 and Base64 encodings around your original idea. The pre-release is here: https://github.com/pieroxy/lz-string/releases/tag/1.4.0-beta
Can you give me a hand and click on those JSPERF links so that I have a better sample to see if my changes are faster?
Thanks.
As for UTF-8 vs UTF-16: in JavaScript all strings are stored in UTF-16, so really, there is little choice.

Re: Just released: lz-string

Hi, I'm trying to release a Go implementation but am still facing some tricky issues. Compression looks good but decompression sometimes fails, or the reverse. https://github.com/darul75/moo/blob/master/src/github.com/darul75/lz/encoder.go You can run it with the command "go run encoder.go" to see what happens. Let me know if you have any ideas. Thx
Avatar: pieroxy

Re: Just released: lz-string

It looks as if you've ported the latest version. You should really NOT port the latest version which is full of bad stuff to make the various JS engines happy.

Can you try porting version 1.0.2? This will be much much easier to debug (and easier to write). Here it is

Re: Just released: lz-string

Thank you. I have now implemented it with the version you provided, with the same result in decompression: at the end of the routines it is sometimes good, sometimes ugly ;)

Re: Just released: lz-string

Good! Now we need to figure out whether it is the decompression or the compression that doesn't work.

You have two strings: one that works and one that doesn't. Here is the result of the compression in Base64 of both your strings:

LZString135.compressToBase64("hello1hello2hello3hello4hello5hello6") => "BYUwNmD2CMoZAmOUDMzIBZ0FZ0DYgA=="
LZString135.compressToBase64("hello1hello2hello3hello4hello5") => "BYUwNmD2CMoZAmOUDMzIBZ0FYgA="

Can you post here the result of the compression of both strings by your library (in base64)?

Re: Just released: lz-string

I do not want to pollute your thread; the details are here - it looks like the end of the compression stream is not good. https://github.com/darul75/moo/issues/1
Avatar: Anonymous

Re: Just released: lz-string

Hi, I have a requirement to compress data passed through HTTP headers and decompress it using C#. I am using compressToEncodedURIComponent for compression, but I am not finding a decompressFromEncodedURIComponent method to decompress in C#. Can you please let me know whether I need to use any other libraries, or how this can be done? Srinivas
Avatar: Srinivas

Re: Long time it has been released: lz-string :)

We are using the compressToBase64() method in the JavaScript library and sending the compressed string over HTTP headers; after receiving this compressed string, we use the C# version of the decompressFromBase64() method. It always returns null. Please let me know whether a fix is already available, or whether this is a known issue that will be fixed in the next version of the C# library.
Avatar: pieroxy

Re: Long time it has been released: lz-string :)

I assume both previous comments were from you. What you need to check is that the string you are feeding to decompressFromBase64() on the C# side is exactly the same as the string you are getting from compressToBase64() in your browser. My guess is that they are not the same, because some of the characters in base64 need to be URL-encoded.

You should use compressToEncodedURIComponent(). This is like base64, but the characters chosen are slightly different (see the last two chars below). With a couple of replaces on the server side - String.Replace("-","/") and String.Replace("$","=") - you should get back the original base64 string. Just feed that to decompressFromBase64() in your C# app.

The characters used for base64: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=

The characters used for uricomp: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+-$

If you still have issues, please open a bug in the C# repo (I'm not the author of the C# code).

IE8 Broken

It seems lz-string is broken for IE<8; try compress/decompress with the default text on your demo page... 0 bytes, null result. This appears to stem from the use of array-like string access (which doesn't work in IE<8) at _compress line 154 and _decompress lines 501/509. Using charAt fixes it, at least for plain compress/decompress calls (I didn't test all the other call variants).
Avatar: pieroxy

IE8 Broken

Thanks for the report. I'm going to have a look into it.
Avatar: pieroxy

IE8 Broken

I just released LZ-String 1.4.4 to fix this issue. There was also an issue with the demo page under IE8 and older which I fixed as well in the process. It now works under IE6.

Good catch. Thanks.

demo page over reports the uncompressed size

It seems the demo page uses 2 bytes per char when counting the number of bytes. If someone is using only ASCII chars, this is misleading, I think. It's still smaller using lz-string - but not by as much.
Avatar: pieroxy

demo page over reports the uncompressed size

The internal representations of strings in memory, localStorage, Indexed DB... all use UTF-16. These strings effectively consume two bytes per character, even for US-ASCII characters. Now, if you encode those strings in UTF-8, I agree with you.

LZString is meant for the browser. Browsers use UTF-16. localStorage quotas are by character, hence US-ASCII characters use 16 bits each.

This is why the demo page reflects this fact by counting all input characters as two bytes: because they consume that many bytes.
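The accounting above can be sketched in one line (the function name is made up for illustration): a JavaScript string costs its length in UTF-16 code units times two bytes, whatever the characters are.

```javascript
// In-browser storage cost of a JavaScript string: two bytes per
// UTF-16 code unit, even for plain US-ASCII text.
function utf16ByteSize(str) {
  return str.length * 2;
}
// utf16ByteSize("hello") and utf16ByteSize("héllo") are both 10
```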

demo page over reports the uncompressed size

Thanks for the explanation. In our case, we're trying to take a lot of query string data and compress it so the query string will not be too long. We would need to run the compressToEncodedURIComponent function to get a URI-safe string. It says that the size of a base64 string is approx 1.66 times the binary size. If we were making a comparison in this case, should we use 1/2 the size listed for the input text (1 byte per char), is that correct?
Avatar: pieroxy

demo page over reports the uncompressed size

Not quite. In order to add data to a query string, you have to process it through encodeURIComponent. In your case, assuming your data is in a string called data, you should compare the length of encodeURIComponent(data) and the length of LZString.compressToEncodedURIComponent(data). Each character of either version will take one byte to be sent to the server. If you have lots of characters that are not URI-safe in your original string (some JSON for example), there could be a sizeable gain.
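To make that comparison concrete, here is a sketch of why raw JSON inflates in a query string: encodeURIComponent turns most JSON punctuation into three-character %XX escapes, which is exactly the overhead a URI-safe compressed string avoids (the helper name is made up here).

```javascript
// Characters a raw string would occupy in a query string: every
// character that is not URI-safe becomes a three-character %XX escape.
function escapedLength(data) {
  return encodeURIComponent(data).length;
}
// '{"a":1}' is 7 characters but escapes to %7B%22a%22%3A1%7D (17 chars)
```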

Re: demo page over reports the uncompressed size

I believe one can get single-byte (8-bit) chars by using 8 bitsPerChar in compress and 128 resetValue in decompress like so:
compress8=function(uncompressed){
  //return LZString._compress(uncompressed,16,function(a){return String.fromCharCode(a);});
  return LZString._compress(uncompressed,8,function(a){return String.fromCharCode(a);});
}

decompress8=function(compressed){
  if (compressed == null) return "";
  if (compressed == "") return null;
  //return LZString._decompress(compressed.length,32768,function(index){return compressed.charCodeAt(index);});
  return LZString._decompress(compressed.length,128,function(index){return compressed.charCodeAt(index);}); 
}
Works for me anyway.
Avatar: pieroxy

Re: demo page over reports the uncompressed size

It works or not, depending on what you are trying to achieve. In JavaScript, all Strings are stored as UTF-16, so all characters use up 16 bits of memory. That's the way it is, and no matter what parameters you pass to lz-string, it will stay that way. The same goes for Indexed DB, localStorage and other JavaScript in-browser storage systems.

For example, storing 6 bits per character is important for Strings you are sending back to your server through a GET, because only 65 characters are allowed there, the rest occupying 24 bits (at best) by being escaped in the form %AB, where AB is the ASCII code of your char in hexadecimal. Furthermore, those characters are sent in UTF-8 (most of the time), and characters below 128 will occupy only 8 bits.

So it all depends on what you are trying to achieve with the result of the compression. Hence my question: what exactly are you trying to do with it?

Re: demo page over reports the uncompressed size

I'm not presently doing anything with LZString (other than playing around with it); I was merely pointing out how one could easily get LZString to work with 8-bit IO if need be. As to why one would want/need to do so, interfacing with existing code/libs that expect 8-bit IO is one reason. For example, I have several existing Base85 (et al) transcoders that operate on 8-bit IO (be it string or array); 16-bit IO from existing compress/decompress won't work as-is. I could write extra wrapper code to work with/around the 16-bit chars, but changing 2 numbers was faster/easier/cleaner (especially since I'm just evaluating and don't want to invest much time/effort...yet anyway). Just as your own code tweaks those values to align on 6-bit boundaries for Base64, one can similarly tweak them to align on 8-bit boundaries (e.g. for Base85). Naturally one would not want to use 8-bit when multibyte is a possibility, but there are numerous scenarios/standards/encodings where 8-bit values are guaranteed.

Re: Just released: lz-string

Is there no C++ version?
Avatar: pieroxy

Re: Just released: lz-string

Not that I know of. Porting the lib is very easy (I did the decompression for the Go version in less than 2 hours). But follow the instructions and don't try to port the latest version.

Perl implementation ...?

I just saw in a (somewhat recently) archived message that you had a Perl port that you were looking to add to CPAN. Do you have the source available in the meanwhile? If so where could I find it?
Avatar: pieroxy

Perl implementation ...?

There is no Perl implementation that I am aware of... Where did you see this recently archived message?

Re: Just released: lz-string

Hello. I work at a company that is making a software manual for a client. The manual will be made in Japanese then translated into English. Within the manual, there is a brief mention of "lz-string". For this reason, we would like to know if you require us to include a copyright phrase in our manual for your product. If so, could you please tell us what copyright phrase to use? (English only is fine.) Also, if you require some trademark attribution phrase ("lz-string is a trademark of..."), please let us know. If you need more specific details, please email me. Thank you for your time!
Avatar: pieroxy

Re: Just released: lz-string

The license for lz-string is literally that you can do whatever you want. Look up WTFPL in your favorite search engine.

This means that you can do whatever you want. If you want to give credit, by all means, do. If you don't want to, don't. If you want to put a link to this site, do. If you don't want to, don't.

You're on your own ;-)

Avatar: Josep

Re: Just released: lz-string

Great library! Thanks :) I plan to use it, but in the following way: in case the data is smaller than a certain threshold, it would be stored "as is", and lz-string would be used when that limit is passed. Could I rely on the first char being "{" as a way to distinguish between the compressed and uncompressed versions? I'm quite sure I could, but I thought I'd better ask :)
Avatar: pieroxy

Re: Just released: lz-string

I would suggest prepending all versions (whether compressed or not) with one character. Let's say if you compress, you prepend the result with 'z', and if you don't, you prepend with '_', for example. Then, on the decompress side, you check the first char of your string and decide whether to decompress depending on that char.

That said, if you plan on storing those strings locally (in the browser), no matter what size your string is, you'll most likely get a smaller version with LZString.
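That marker scheme can be sketched as follows (a sketch, not part of lz-string: the compress/decompress functions are passed in, e.g. LZString.compressToUTF16 and LZString.decompressFromUTF16, and the threshold and names are illustrative):

```javascript
// Prefix 'z' for compressed payloads, '_' for raw ones, then dispatch
// on that first character when reading back.
function pack(data, compress, threshold) {
  return data.length < threshold ? "_" + data : "z" + compress(data);
}

function unpack(stored, decompress) {
  var body = stored.slice(1);
  return stored.charAt(0) === "z" ? decompress(body) : body;
}
```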

Avatar: Josep

Re: Just released: lz-string

Yes, just using an underscore as a marker sounds like the right solution. Thanks!

Misleading but great

Hi. I just want to point out that the demo page looks misleading when showing the compressed string in "bytes". We were quite impressed at first when compressing a ~200k string into a ~40k result. But then, after implementing a prototype on our server, we got absolutely different, and much more "realistic", results. After a few moments of investigation we found that every "byte" of the result on the demo page is a "character", which takes around 2-3 bytes in total. We had planned to use it as an extra layer for sending our scripts to the client, but it seems gzip is still our best friend... Yet it's great work indeed if you want to store strings in localStorage ;)
Avatar: pieroxy

Misleading but great

Ah... Good catch, thanks. I'll fix it soon. For the record, characters always take 2 bytes internally in JS.
Avatar: Kris Erickson

Re: Just released: lz-string

Love the library and needed a way to be able to quickly view the strings in Chrome so I created the LZipped Local Storage Viewer extension for Chrome DevTools. https://chrome.google.com/webstore/detail/lzipped-local-storage/beicplgjaeliclenmidelkloajghllll. Maybe some of the users of your library will also find it useful. It's also available on GitHub: https://github.com/kriserickson/lz-localstorage-chrome-extension.
Avatar: pieroxy

Re: Just released: lz-string

This is genius !!! Thanks.
Avatar: koudelka

Re: Just released: lz-string

Hey there, I've released an elixir implementation, if you'd like to add it to your list. https://github.com/koudelka/elixir-lz-string :)
Avatar: jugglervr

Any way to "URI safe" a UTF16 compression?

I'm trying to store some data in the URI and I want to compress it to ensure that I don't run into problems with IE's 2k limit. UTF16 gives good ratios but has problems with some characters: "Uncaught SyntaxError: Failed to execute 'querySelectorAll' on 'Document':" Is there a good way to escape these strings or am I stuck with Base64?
Avatar: kreudom

Re: Just released: lz-string

I made a new implementation for C#. Seems to be faster and solves some problems where the results would be different from the javascript library. Just linking this here in case someone's interested. https://github.com/kreudom/lz-string-csharp
Avatar: tcolgate

Re: Just released: lz-string

I've got working compress and decompress in Go. It's mostly a direct port of the reference implementation, and not tuned at all (it does about 2MB/s in benchmarks). There are a few tests, and things seem OK. I need to implement a few of the external API functions, but that shouldn't take too long. Still need to add docs too. The code can be found here: https://github.com/tcolgate/gostikkit/tree/master/lzjs
Avatar: simone

strange chars decompressing a string

Hello, I'm developing a mobile app, but I sometimes have problems using decompressFromUtf16: it works but gives strange chars in the result. This problem seems to happen only on the iPhone. This is what I do: 1) from the Cordova app I compress (compressToUtf16) a JSON-based string and send it to a .NET web service. It's a simple string with userid, password, name, etc. 2) The web service takes the string and decompresses it (decompressFromUtf16); sometimes it works and sometimes not, displaying half the string OK and half garbled, with incomprehensible chars. I'm not able to tell whether the problem is in the JavaScript compression or the .NET decompression. Thanks for the help!
Avatar: Ben

Help understanding the code

I'm trying to understand the code and have a few questions. Apologies that they are pretty basic:
  • What is enlargeIn's purpose? I'm struggling to fully get it.
  • Comparing v1.0.2 to the latest version, some of the basic constants change: enlargeIn, dictSize and numBits are all +1 in the older version. Why the change?
  • In the latest optimized version there are mostly, if not exclusively, '==' instead of '===' comparisons. Is this intentional, or are they faster? I don't see any cases where the comparison result would be different, but all my reading points to the exact comparison as preferable unless coercion is intended.
Many thanks for helping out an amateur coder.
Avatar: pudel

Question about _decompress() implementation

I am trying to port LZString 1.4.4 to Qt, but I am struggling with the _decompress() function.
There is a loop at the beginning (line 346):
var dictionary = []

for (i = 0; i < 3; i += 1) {
  dictionary[i] = i;
}
which inserts the integers 0, 1, 2 into the dictionary array.
Later, some strings are added to this dictionary too (lines 438, 456):
dictionary[dictSize++] = f(bits);
Finally, at the end of _decompress(), there is this condition (line 469):
if (dictionary[c]) {
  entry = dictionary[c];
} else {
  ...
}
What is the purpose of this condition?
Currently it evaluates to false for dictionary[c] == null, for c >= dictionary.length, and for c = 0 (because dictionary[0] contains 0).

Shouldn't the for loop at the beginning of _decompress() be like:
for (i = 0; i < 3; i += 1) {
  dictionary[i] = f(i);    // or maybe: dictionary[i] = ''+i ???
}
so that it would only contain strings?

Different ports have different implementations:

diogoduailibe/lzstring4j:
List<String> dictionary = new ArrayList<String>(200);
for (int i = 0; i < 3; i += 1) {
   dictionary.add(i, "");
}

(...)

if (d < dictionary.size() && dictionary.get(d) != null) {
   entry = dictionary.get(d);
} else {
   (...)
}
rufushuang/lz-string4java:
ArrayList<String> result = new ArrayList<String>();
for (int i = 0; i < 3; i += 1) {
   dictionary.add(i, f(i));
}

(...)

if (cc < dictionary.size() && dictionary.get(cc) != null) {
   entry = dictionary.get(cc);
} else {
   (...)
}
jawa-the-hutt/lz-string-csharp:
List<string> dictionary = new List<string>();
for (i = 0; i < 3; i++)
{
   dictionary.Add(i.ToString());
}

(...)

if (dictionary.Count - 1 >= c) // if (dictionary[c] ) <------- original Javascript Equivalant
{
   entry = dictionary[c];
}
else
   (...)
}
kreudom/lz-string-csharp:
Dictionary<int, string> dictionary = new Dictionary<int, string>();
for (i = 0; i < 3; i++)
{
   dictionary[i] = Convert.ToChar(i).ToString();
}

(...)

if(dictionary.ContainsKey(c))
{
   entry = dictionary[c];
}
else
{
   (...)
}
Avatar: pieroxy

Question about _decompress() implementation

The answer is in the switch(c). If the decompressor receives an entry 0, 1 or 2, it has a special meaning. All other entry IDs are to be taken from the dictionary. 0 means it's followed by an 8-bit entity, 1 by a 16-bit entity, and 2 means it's the end of the stream. So put whatever you want in the first three entries - the code will never read from them. Or don't put anything. But start appending to your dictionary at index 3.

Let me know if you still have issues.
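That dispatch can be sketched like this (illustrative names, not the actual lz-string source: readBits stands for the decompressor's bit reader, and dictionary is the table rebuilt during decompression):

```javascript
// Decoder-side handling of the three reserved entries: 0 and 1
// introduce a literal character read from the bit stream, 2 ends the
// stream, and everything else indexes the dictionary.
function handleToken(c, readBits, dictionary) {
  switch (c) {
    case 0: return String.fromCharCode(readBits(8));  // new 8-bit literal
    case 1: return String.fromCharCode(readBits(16)); // new 16-bit literal
    case 2: return null;                              // end of stream
    default: return dictionary[c];                    // existing dictionary entry
  }
}
```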

Avatar: AmiArt

Re: Just released: lz-string

The port is done. Here is a Qt implementation of lz-string 1.4.4:
https://github.com/AmiArt/qt-lzstring

Some functions were not implemented, since I only needed UTF16 compression.
I did some optimizations in both the _compress() and _decompress() functions.

Benchmark results:

Config:
Intel Core i7-2600 3.40GHz / Win10
MSVC++ 2010 / Qt 5.5.1

JSON file size 604 KB - compress() / decompress()

Firefox 49 - 265ms / 16ms
Google Chrome 53 - 218ms / 14ms
Internet Explorer 11 - 326ms / 14ms
qt-lzstring - 62ms / 11ms

lz-string is not great or even good

Sorry to burst your bubble, but your implementation of the elegant LZW algorithm is neither particularly fast nor efficient, and definitely not elegant. Using the JavaScript object as an associative array is guaranteed to be slow(ish), and slavishly repeating code to eliminate function calls is naïve. It is far better to use the original LZW algorithm and tweak it to add the required functionality, while exploiting the full abilities of the target language.

So to demonstrate what I mean (and just for fun) I have written a js LZW implementation in just such a way, which can code any Unicode string (hence the use of %, * and / just in case) as a valid UTF16 string (like yours, ignoring the MSB and adding 32), which is about 50%-60% faster (yes, over twice as fast) at encoding and is, I believe, suitably memory efficient, and has an indistinguishable decoding performance compared to lzString. It also has about a 3% improvement in compression. The code source is 2.5k (128 lines of well spaced code) and when minified (then prettified a bit) comes in at 17 lines long and around 1k, and is included below. To test these claims, simply copy and paste the code as-is into a separate .js file, load as usual and substitute LZString.compressToUTF16(...) with lzw.Encode(...) and LZString.decompressFromUTF16(...) with lzw.Decode(...). Please note that this is not open source code and can only be legitimately used for test purposes. I reserve all rights, in the (probably vain) hope that others will devise their own efficient and elegant algorithms, hopefully more efficient and faster (but cuter?) than mine, and not persist in replicating badly wrought bloat-ware. Please also note that indexOf as used is horrendously slow in older browsers and would need to be replaced with a loop for acceptable performance.

As a real challenge to lz-string's ugliness, I have also written (but not included here) a “bells and whistles” JS version that can output a native JS array of bytes, a typed array of bytes, a char-encoded string of bytes, a true base64 that is fully compatible with atob and btoa (so when processed by these the result can be decoded without further modification), UTF-16, and also an “lzstring” mode (i.e. potentially invalid UTF-16) for comparison with your offering. It will encode any string or integer array (signed or not) and will return a string or array matching the original uncompressed data. This comes in at 254 lines (6k), or about 2.5k minified, and takes about 40%-50% less time, still with the 3% better compression, although it is possibly a few 1/1000ths of a second slower than lz-string when decoding, though this is hard to tell with any real certainty on a Windows platform using a WAMP stack.

Please write and publish better code in future.

// Copyright © 2016 Gary W. Hudson Esq. All rights reserved
var lzw = (function() {var z={

Decode:function(p)
{function f(){--h?k>>=1:(k=p.charCodeAt(q++)-32,h=15);return k&1}var h=1,q=0,k=0,e=[""],l=[],g=0,m=0,c="",d,a=0,n,b;
do{m&&(e[g-1]=c.charAt(0));m=a;l.push(c);d=0;for(a=g++;d!=a;)f()?d=(d+a>>1)+1:a=d+a>>1;if(d)c=l[d]+e[d],e[g]=c.charAt(0)
else{b=1;do for(n=8;n--;b*=2)d+=b*f();while(f());d&&(c=String.fromCharCode(d-1),e[g]="")}}while(d);return l.join("")},

Encode:function(p)
{function f(b){b&&(k|=e);16384==e?(q.push(String.fromCharCode(k+32)),e=1,k=0):e<<=1}function h(b,d){for(var a=0,e,c=l++;a!=c;)
e=a+c>>1,b>e?(a=e+1,f(1)):(c=e,f(0));if(!a){-1!=b&&(a=d+1);do{for(c=8;c--;a=(a-a%2)/2)f(a%2);f(a)}while(a)}}for(var q=[],k=0,
e=1,l=0,g=[],m=[],c=0,d=p.length,a,n,b=0;c<d;)a=p.charCodeAt(c++),g[b]?(n=g[b].indexOf(a),-1==n?(g[b].push(a),m[b].push(l+1),
c-=b?1:0,h(b,a),b=0):b=m[b][n]):(g[b]=[a],m[b]=[l+1],c-=b?1:0,h(b,a),b=0);b&&h(b,0);for(h(-1,0);1!=e;)f(0);return q.join("")}

};return z})();
if(typeof define==='function'&&define.amd)define(function(){return lzw})
else if(typeof module!=='undefined'&&module!=null)module.exports=lzw;

Re: Just released: lz-string

Thanks so much for this. It has really helped me build my web app. I can now load 10x as many product definitions as before. Lifesaver!
Avatar: Anonymous

Re: Just released: lz-string

I use Node.js. A web client compressed some base64 input, as in the “Compressing base64 input” section of the guide ( http://pieroxy.net/blog/pages/lz-string/guide.html ), and now Node.js needs to decompress it. How can I do this with Node.js? Please help me.