Foreword - encoding JPEGs
How do you encode your JPEGs?
Let's take a concrete example. You have a 150x150 image and you want to compress it as a JPEG, but which quality setting should you use? JPEG encoders usually take a quality setting as a parameter, between Q1 and Q100, where 1 is the worst quality and 100 the best.
|Q 10 - 1579B|
|Q 20 - 1960B|
|Q 30 - 2260B|
|Q 40 - 2513B|
|Q 50 - 2729B|
|Q 60 - 2973B|
|Q 70 - 3308B|
|Q 80 - 3826B|
|Q 90 - 4939B|
|Q 100 - 32KB|
We can see that the higher the quality setting, the better the image looks and the bigger the file gets. There is clearly a trade-off to be made: the Q100 file is far too big, and the increase in file size doesn't translate into an equivalent increase in visual quality. 80 seems a sensible choice, at a file size of 3.8KB.
While a quality setting of 80 is the most sensible default, we're going to show that in some circumstances it is completely off the mark.
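To reproduce this kind of size comparison, here is a minimal sketch using the JDK's built-in ImageIO JPEG writer. It is not the encoder used for the numbers above, so the sizes will differ. The class name `JpegQuality` and the generated gradient in `sampleImage` are hypothetical stand-ins for a real encoder setup and a real photo.

```java
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;
import javax.imageio.stream.MemoryCacheImageOutputStream;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

final class JpegQuality {

    /** Encodes the image as JPEG at the given quality (1..100) and returns the bytes. */
    static byte[] encode(BufferedImage image, int quality) throws IOException {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpeg").next();
        ImageWriteParam param = writer.getDefaultWriteParam();
        param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
        param.setCompressionQuality(quality / 100f); // ImageIO expects a value in 0.0..1.0
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ImageOutputStream out = new MemoryCacheImageOutputStream(bytes)) {
            writer.setOutput(out);
            writer.write(null, new IIOImage(image, null, null), param);
        } finally {
            writer.dispose();
        }
        return bytes.toByteArray();
    }

    /** Hypothetical stand-in for a real photo: a 150x150 colour gradient. */
    static BufferedImage sampleImage() {
        BufferedImage image = new BufferedImage(150, 150, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 150; y++) {
            for (int x = 0; x < 150; x++) {
                int rgb = (x * 255 / 149) << 16 | (y * 255 / 149) << 8 | ((x * y) & 0xFF);
                image.setRGB(x, y, rgb);
            }
        }
        return image;
    }

    public static void main(String[] args) throws IOException {
        BufferedImage image = sampleImage();
        // Print one line per quality step, in the same shape as the list above.
        for (int q = 10; q <= 100; q += 10) {
            System.out.println("Q " + q + " - " + encode(image, q).length + "B");
        }
    }
}
```

Swapping `sampleImage()` for `ImageIO.read(new File(...))` on a real picture gives a more meaningful curve, since a synthetic gradient does not compress like a photo.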
In this first part, we'll focus on high density displays, aka retina screens as they were first introduced by Apple a few years ago.
In the spirit of learning something new every day, I stumbled upon a nice article describing the flaws of the hamburger menu.
And I learned a thing or two. So instead of rehashing the article here, I invite you to click the link above, leave my website and have a good read.
Why did I test it?
First, why not just run those scripts with node?
There are several reasons. When I do Java, I know that what I do is highly portable, much more so than in any other language, node.js included. So dragging in a dependency on node.js annoys me. Also, it's much easier to communicate with Nashorn (give it inputs, get outputs back or even callbacks) than with an external process. But most of all, the stability of the whole ecosystem kind of scares me. npm is broken every other day and, more often than not, upgrading anything leads to a whole bunch of problems. The simple fact that this whole thing is not even packaged for Ubuntu leaves me wondering.
What did I test?
First of all, the compression result. I run slightly different scripts on node.js and on Java, but both produce an output of 28909 bytes from an input of 62498 bytes. I used the same options for both: do it all, compress as hell.
Notes: I'm running this on Ubuntu 14.04 LTS with a Q6600, an old Intel quad-core without hyper-threading. I've tested Java 1.7.0_11, 1.8.0_05 and 1.8.0_45 (with the optimistic type system enabled in Nashorn).
The results of the first run (that's after uglifyjs has been loaded):
- node.js: 910ms
- Java 8u05: 22s
- Java 8u45: 28s
- Java 7: 19s
So, on the first run, Java 8 is slower than Java 7. Node crushes them pretty hard, and that's quite an understatement. Note that the newer the version of Java, the worse it is. I surely did not expect that. I don't know much about V8 (the node.js engine), but I know a few things about Java. HotSpot takes its time to optimize the code it runs. Let's run the compression in a for loop with 10 iterations and see how much time the 10th iteration takes:
- node.js: 600ms - one core used
- Java 8u05: 3.6s - three cores used - about 10.8s cpu time
- Java 8u45: 3s - three cores used - about 12s cpu time
- Java 7: 17s - one core used
Now, this is more interesting. Java 8 is clearly ahead of Java 7, but still about 6 times slower than node. Java 8u45 is a bit ahead on that one. Java 7 is still dreadful, but that was expected.
A last note on the cores used. I noticed it during this test because it runs for longer than the first one. Both the Java 7 and the node.js tests take about 100% CPU, which means they use one core (out of my four). The Java 8 test uses about 300% CPU, which means about three cores. This is also true for the first test of course, but I didn't notice it at first.
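The measurement loop itself can be as simple as the following sketch. The class name `WarmupTimer` and the string-building workload are hypothetical; in the real test the task would be one uglifyjs compression run. It only records wall-clock time per iteration, not the CPU time spread over several cores.

```java
import java.util.concurrent.TimeUnit;

final class WarmupTimer {

    /** Runs the task `iterations` times and returns each run's wall-clock time in milliseconds. */
    static long[] timeRuns(Runnable task, int iterations) {
        long[] millis = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            task.run();
            millis[i] = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        }
        return millis;
    }

    public static void main(String[] args) {
        // Hypothetical workload standing in for one compression run.
        Runnable work = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 200_000; i++) {
                sb.append(i % 10);
            }
            if (sb.length() != 200_000) {
                throw new IllegalStateException("unexpected length");
            }
        };
        long[] millis = timeRuns(work, 10);
        System.out.println("iteration 1: " + millis[0] + "ms, iteration 10: " + millis[9] + "ms");
    }
}
```

Keeping every iteration's time, rather than only the last one, makes it easy to see where the optimizer kicks in.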
Let's do one more test: After 100 compressions, how much time does it take?
- node.js: 560ms, one core used
- Java 8u05: 1.5s, two cores used so about 3s cpu time
- Java 8u45: 2s, three cores used so about 6s cpu time
- Java 7: 17s - one core used
Well, this isn't much of a game changer, but it shows one thing: Java 8 takes its time to optimize its shit. The Java 8u05 process went from 300% to 200% CPU at around iteration 55. Java 8u45 went down to 200% CPU at iteration 140, getting the script done in around 1.3s, but that was too late for the iteration 100 threshold.
Even after the 10th iteration, it keeps getting faster. And after a while, it only takes two cores, not three. This is less bad for Java 8 than the previous test. Now Java is only 2.7 times slower than node, or 5.4 times if you count CPU time.
Edit: Let's do one more test: After 1000 compressions, how much time does it take?
- node.js: 670ms, one core used
- Java 8u05: 1.3s, 1.2 core used
- Java 8u45: 1.1s, 1 core used
- Java 7: 17s - one core used
Yes, you read that correctly, node.js is going slower after 1000 iterations...
Well, the least one can say is that Java isn't fast... to become fast. The CPU consumption went from 200% to 120% at around iteration 155. My guess is that the extra cores used were due to the optimizer. It seems to be a heck of a lot of work.
|seconds|Iteration 1|Iteration 10|Iteration 100|Iteration 1000|
|node.js|0.91|0.6|0.56|0.67|
|Java 8u05|22|3.6|1.5|1.3|
|Java 8u45|28|3|2|1.1|
|Java 7|19|17|17|17|
After discarding all results above 5s (so discarding Java 7 entirely) and discarding iterations above 200 (they don't seem to change anything, except for node, which keeps climbing a little to 670ms and then stabilizes), here are the numbers:
I've played a little with Java 9, but it isn't really mature at this point so I won't disclose anything. Results are on par with Java 8u45. I strongly hope Oracle will pull their shit together and optimize this further.
Oracle, please spend less on lawyers who undermine your core business and more on engineers who will give your products a chance. My $.02.
A final note on performance
The warmup that we have seen in the graph is a mix of JVM warmup and Nashorn engine warmup. But mostly, it is Nashorn. This means that if you create an engine every time you need to execute a script, you won't benefit from it. Here is the simplest way to create an engine:
Well, those pesky ScriptEngine objects you created? Never discard them, or else you'll discard all the optimizations with them.
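As an illustration of keeping the engine around, here is a minimal sketch built on the standard `javax.script` API. The class name `SharedEngine` and the `twice` script are hypothetical. It assumes single-threaded use, and it assumes a JVM where a "nashorn" engine is registered: on JDK 15 and later Nashorn is no longer bundled and `getEngineByName` returns null.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

final class SharedEngine {

    // Created once and kept for the life of the application, so the optimized
    // code produced during warmup is retained. This is null when no engine is
    // registered under the name "nashorn".
    private static final ScriptEngine ENGINE =
            new ScriptEngineManager().getEngineByName("nashorn");

    /** Returns the one shared engine instead of building a new one per script. */
    static ScriptEngine get() {
        return ENGINE;
    }

    public static void main(String[] args) throws ScriptException {
        ScriptEngine engine = get();
        if (engine == null) {
            System.out.println("No Nashorn engine available on this JVM");
            return;
        }
        // Hypothetical script: define a function once, then call it repeatedly
        // on the same warmed-up engine.
        engine.eval("function twice(x) { return x * 2; }");
        for (int i = 0; i < 3; i++) {
            System.out.println(engine.eval("twice(" + i + ")"));
        }
    }
}
```

The point is only that `new ScriptEngineManager().getEngineByName(...)` runs once; anything that calls it per request starts the warmup from scratch.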
It's the first time that an HDD in one of my RAID 1/5 arrays has failed on me. I guess there has to be a first time for everything.
Anyways, I have a RAID 5 array of six 2TB hard disk drives, for a total capacity of 10TB. I've had that array for about 5 years now and it has been working pretty smoothly for me. 450MB/s read throughput is how I like it. It's even faster than my gigabit network, so it has never been the bottleneck in any of my operations.
For the record, I used the tool idle3-tools from cbothamy to reset my sleep time to something worthy. I have WDC green drives (see the picture, which is actually the drive that failed), and their default settings are really nuts. Really nuts.
Also for the record, my boot drive is a single old Maxtor 320GB drive and has nothing to do with mdadm. So if you have issues with booting onto your mdadm array, look no further. This page is not for you.
Now, I had a failure this morning and I had to change one of the drives. I have a "regular" box (see here for pics and French text) so there was no hot-swapping involved. But a 5 minute downtime was good enough for me.
How did I find out about the faulty drive? Well, I have an "Error" section in my conky configuration and it outputs the diff of the regular /proc/mdstat with the initial one. Just logging into my desktop alerted me right away with a big red section that is usually empty right on my desktop. This is important, because if you don't monitor your raid array, it'll fail eventually on more than one drive and you'll lose everything. Note that from time to time, a red section appears in my conky as mdadm runs a routine check of the raid array.
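The same idea, comparing the live /proc/mdstat with a copy saved while the array was healthy, can be sketched as follows. This is not the conky setup described above: the class name `MdstatDiff` and the baseline file passed as an argument are hypothetical, and the comparison is a plain set difference on lines rather than a real diff.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class MdstatDiff {

    /** Returns the lines of `current` that do not appear anywhere in `baseline`. */
    static List<String> changedLines(List<String> baseline, List<String> current) {
        List<String> changed = new ArrayList<>(current);
        changed.removeAll(baseline);
        return changed;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: MdstatDiff <baseline-file>");
            return;
        }
        // args[0]: hypothetical copy of /proc/mdstat saved while the array was healthy.
        List<String> baseline = Files.readAllLines(Paths.get(args[0]));
        List<String> current = Files.readAllLines(Paths.get("/proc/mdstat"));
        // An empty output means nothing changed since the baseline was taken.
        changedLines(baseline, current).forEach(System.out::println);
    }
}
```

Any line that changes shows up, so progress lines printed during a check or a rebuild would be reported as well.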
Now, how do you get down to it? First, I needed to figure out which drive in my raid array was faulty. Here was the result of my mdstat file:
First, a little explanation. The [UUUUU_] shows one character per disk: U for a good disk, _ for a faulty one. Similarly, the 6/5 indicates the total number of disks in the array (6) vs. the number of working disks (5). Finally, the (F) next to sdf1 indicates the partition that isn't available anymore.
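Reading that status string programmatically is just a matter of counting characters. A small sketch, with a hypothetical class name `MdstatStatus`, using the string quoted above:

```java
final class MdstatStatus {

    /** Counts how many times `wanted` occurs in a status string such as "[UUUUU_]". */
    private static int count(String status, char wanted) {
        int n = 0;
        for (char c : status.toCharArray()) {
            if (c == wanted) {
                n++;
            }
        }
        return n;
    }

    /** Number of good disks, marked 'U'. */
    static int workingMembers(String status) {
        return count(status, 'U');
    }

    /** Number of faulty or missing disks, marked '_'. */
    static int failedMembers(String status) {
        return count(status, '_');
    }

    public static void main(String[] args) {
        String status = "[UUUUU_]"; // the status string quoted in the text
        System.out.println(workingMembers(status) + " working, " + failedMembers(status) + " failed");
        // prints "5 working, 1 failed"
    }
}
```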
Well... clearly one of the drives has made it to heaven, and it was /dev/sdf. Or has it? Could it just be a software glitch? When in doubt, reboot: I restarted the server to see if things would get better. After a fight against the BIOS to make it boot with a faulty drive, I saw the same thing. I kinda hoped it would be a defect in the disk driver. Alas...
Well, the HDD seems to be dead. Let's remove it.
First, let's remove the drive from the array in mdadm:
This is done and mdadm is now aware that there is no disk /dev/sdf1 anymore. Next, I need to open the box and locate the right drive. Now... which one is it? I have six WDC Green drives in my box... Let's list all the drives with their serial numbers:
Well... The faulty drive doesn't show its serial number anymore. The good news is that I have the serial numbers of the other drives and I'll be able to locate the one that I don't have.
I just opened the box, looked for a drive with a serial number not on my list, swapped it with a new one and put everything back together. Note that I took care not to swap SATA plugs: I wanted the new drive to be on the same SATA plug the faulty one was on. I don't know if it matters.
After booting up, my raid array was in the same state as before shutting down. First, I had to partition the drive the same way the others were partitioned. I ran (as root):
Now, don't get mixed up with your drives. This will effectively erase the partition table of your drive, copying from /dev/sde to /dev/sdf. Depending on your setup, you can lose everything on the target drive. Here it was /dev/sdf, my newly installed drive.
The last thing I needed was to add the new drive to the array:
And that's all... almost. Now mdadm will rebuild the new drive with the data it should contain. It took about 500 minutes for me, a little more than 8 hours, so be patient. This will take time.
A few hours later, and my raid array is slowly recovering from the disaster:
As you can see, I still have 102.6 minutes left.
When everything was done:
I have to say I'm surprised at how smoothly everything went. I haven't lost a byte and I still have a spare drive, although I'll need a few others if/when this happens again.
My first experience with mdadm was kind of a disaster, but that was because I used it for my / partition. Debugging it in busybox wasn't a lot of fun. But this time around, everything went smoothly.
References
Bootstrapping my mdadm CLI skills: http://tldp.org/HOWTO/Software-RAID-HOWTO-6.html
Where is that damn disk again? http://unix.stackexchange.com/questions/121757/harddisk-serial-number-from-terminal
mdadm recovery, removing, adding a drive: https://www.howtoforge.com/replacing_hard_disks_in_a_raid1_array
I know, I'm a programmer. And as such, nearly every day of my life for the last 20 years I've written code. Most of it has been thrown away by now, but a good chunk is still alive and kicking. And what happens to this code now? Well, from time to time someone takes a look at it and thinks "how am I going to make this code do what I want it to do?". And if this person finds an answer quickly, the code I wrote was good.
As Dan McKinley puts it very well, the best way for this code to make things easy for other people is to be understandable at first glance. Smart and clever code doesn't meet this standard. Boring code does.
Boring code means code that meets long-established standards. It's boring because those standards have been known for a long time and you're smart! You can do better! Yes, you can, but no, you shouldn't. Because the next person modifying your code won't have a clue about your smart idea of the day.
In the same way, when choosing a library to answer a need you have, before jumping onto the latest shiny bandwagon, ask yourself if the old and rock-stable lib everyone knows can do the job. If it can, there is probably no better choice. If and when it fails, you're more likely to have at least a few people onboard who know about that failure. When you push it to the limits, you also have experienced people who can better predict what will happen.
The only exception to this rule is areas in which you innovate, and there should be very few of them. You can't innovate on every front, because innovation drains your resources: it takes longer to set up, is harder for newcomers to pick up, harder to fine-tune, and more prone to bugs which are harder to debug. Just because it's new and no one has any experience with it. So choose those areas very carefully.
This is where GTD (Getting Things Done) should kick in. Use rock-solid and proven libs. Use rock-solid and proven patterns. You will be more likely to build a rock-solid platform that works without a glitch. And it will be faster and less expensive to build.