Tuesday, 17 January 2012

jqPlot, IE7, dateAxisRendering and "Object doesn't support property or method 'getTime'"

I am using the excellent graphing library jqPlot and, like most people (I expect), I develop in Chrome but unfortunately have to deploy to IE. IE7 of all things. Anyway, jqPlot allows you to specify your axis values as textual dates, and it interprets and handles them correctly (i.e. spacing them out and showing a sane number of ticks). The problem is that in Chrome the X label can be "2011-Dec-1", but in IE (7 and 9) that generates "Object doesn't support property or method 'getTime'". Changing the text to "2011-12-01" works, and since the format appears to be hard-coded to year, month and day, localisation issues aren't a concern....
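Until that is fixed upstream, the simplest workaround is to normalise the labels before handing them to the date axis. A minimal sketch of the idea (the helper name and the assumed "YYYY-Mon-D" input format are mine, not part of jqPlot):

```javascript
// Hypothetical helper: normalise labels like "2011-Dec-1" into the
// "YYYY-MM-DD" form that IE7/IE9 parse without complaint.
var MONTHS = {
  Jan: '01', Feb: '02', Mar: '03', Apr: '04', May: '05', Jun: '06',
  Jul: '07', Aug: '08', Sep: '09', Oct: '10', Nov: '11', Dec: '12'
};

function toIsoDate(label) {
  // Expects "YYYY-Mon-D", e.g. "2011-Dec-1"
  var parts = label.split('-');
  var year = parts[0];
  var month = MONTHS[parts[1]];
  var day = parts[2].length < 2 ? '0' + parts[2] : parts[2];
  return year + '-' + month + '-' + day;
}

console.log(toIsoDate('2011-Dec-1')); // "2011-12-01"
```

Run your series data through something like this once, and the same page then works in Chrome and IE alike.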

Friday, 22 April 2011

Another simplistic Groovy and Java comparison

I noticed that the home page for one of our web apps was very slow, and that the slowdown was linearly related to the number of items on the page. This was good and gave me an obvious place to start.

The webapp is a Groovy web app using Spring (not Grails) and is backed by the most excellent MongoDB.

So, firing up JProfiler and just hydrating the items that would appear on the home page showed it took 903 seconds to load 7143 items. Wow - too slow. Looking at the net I/O threads showed that only 16.5 seconds were spent waiting for Mongo. Considering there are hundreds of thousands of lookups (each item is pretty large and has many associations), I was pleasantly surprised.

This unfortunately meant the cost was in my code :) shucks.

The CPU view in JProfiler didn't really point out any obvious candidates (i.e. the time was spread evenly amongst lots of methods).

The code itself is written in Groovy and uses a lot of closures and method calls. I ran a little experiment to see exactly how much this costs. Turns out that it costs *a lot*.

The experiment basically ran a gazillion loops, which is pretty similar to the code I am profiling. The benchmark code:

package sandbox.performance;

public class TestPerformance {

    private static final int ITERATIONS = 1000000;

    public static void main(String[] args) {
        long groovyDuration = testGroovy();
        long javaDuration = testJava();

        System.out.println("Groovy: " + groovyDuration + ", java: " + javaDuration);
    }

    private static long testJava() {
        JavaClass counter = new JavaClass();
        long start = System.currentTimeMillis();
        for (int i = 0; i < ITERATIONS; i++) {
            counter.callPlusOneTenTimesTenTimes();
        }
        long end = System.currentTimeMillis();
        return end - start;
    }

    private static long testGroovy() {
        GroovyClass counter = new GroovyClass();
        long start = System.currentTimeMillis();
        for (int i = 0; i < ITERATIONS; i++) {
            counter.callPlusOneTenTimesTenTimes();
        }
        long end = System.currentTimeMillis();
        return end - start;
    }
}

JavaClass:

package sandbox.performance;

class JavaClass {

    int callPlusOneTenTimesTenTimes() {
        int total = 0;
        for (int i = 0; i < 10; i++) {
            total += callPlusOneTenTimes();
        }
        return total;
    }

    int callPlusOneTenTimes() {
        int total = 0;
        for (int i = 0; i < 10; i++) {
            total += plusOne(i);
        }
        return total;
    }

    int plusOne(int x) {
        return x + 1;
    }
}

GroovyClass:

package sandbox.performance

class GroovyClass {

    int callPlusOneTenTimesTenTimes() {
        int total = 0
        // note: the parentheses matter - "0..10.each {}" parses as
        // "0..(10.each {})" and the range is never iterated
        (0..<10).each { total += callPlusOneTenTimes() }
        total
    }

    int callPlusOneTenTimes() {
        int total = 0
        (0..<10).each { int x -> total += plusOne(x) }
        total
    }

    int plusOne(int x) {
        x + 1
    }
}

(I wish I could figure out the tags for code!)

The results were pretty terrifying: Groovy: 1759, java: 4

Wow - groovy is 439.75 times slower!!!

(This is running on Ubuntu 11.04 with sun-java6-jdk and groovy-1.7.5)

And whilst this test is pretty simple, it turns out to be pretty similar to the production code. Time to rewrite it in Java I guess!



Wednesday, 23 March 2011

Surprise when profiling a Java/Groovy app with JProfiler

First, if you are a Java programmer you really need to get hold of http://www.ej-technologies.com/products/jprofiler/overview.html!

Long story short, I profiled a long-running application thinking I knew exactly where the performance penalty was - it wasn't anywhere near there :) It was in the equals() method of a Groovy class, which was implemented as:

boolean equals(Object other) {
    other instanceof ThisClass && this.hashCode() == other.hashCode()
}

hashCode:

int hashCode() {
    this.name ? name.hashCode() : 0
}

(This class had a single property, a String set in the constructor, so I could easily cache the hashCode.) But flip me, who knew this would be a bottleneck accounting for 17% of the CPU time? OK, it was called over 3 million times, but still....

The solution? I don't know if it is the cost of Groovy or what, but my first call is to cache the hashCode and reference that in the hashCode and equals methods, then run the profiler and see if that helped. The second step will be to re-implement it in Java and see if that helps....
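To make the caching idea concrete, here is a minimal plain-Java sketch of what I mean (class and field names are illustrative only, not the real production class): the property is immutable and set in the constructor, so the hash can be computed once up front instead of on every equals()/hashCode() call.

```java
// Hypothetical sketch: cache the hash of a single immutable String property.
public class CachedHashExample {

    static final class ThisClass {
        private final String name;
        private final int cachedHash; // computed once, reused millions of times

        ThisClass(String name) {
            this.name = name;
            this.cachedHash = (name != null) ? name.hashCode() : 0;
        }

        @Override
        public int hashCode() {
            return cachedHash;
        }

        @Override
        public boolean equals(Object other) {
            // comparing hashes is only a safe equality test here because
            // the class has exactly one String property
            return other instanceof ThisClass
                    && this.cachedHash == ((ThisClass) other).cachedHash;
        }
    }

    public static void main(String[] args) {
        ThisClass a = new ThisClass("foo");
        ThisClass b = new ThisClass("foo");
        System.out.println(a.equals(b)); // prints "true"
    }
}
```

Note the caveat in the comment: equating objects by hash alone is normally wrong (different strings can share a hash), so this shortcut only stands because of the single-property shape of this particular class.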

Long story

There was a Java/Groovy based app using Hibernate and SQL Server which I migrated to MongoDB. The app had some non-trivially complex hierarchies which were a pain to store in a relational DB but trivial to store in a document DB. There was an obvious aggregate root for the data structure, so the conversion was very straightforward.

The application was fantastically quick at loading a single document, but the user's home page displays all their active documents, of which there might be hundreds. The SQL Server application uses a denormalised summary table which condenses each document into a single row, so it is lightning quick, but the Mongo app was really slow.

Given that hibernate has a first level cache *and* given that the documents have lots of references *and* given that MongoDB has no such first level cache you might follow my line of thinking that the performance cost was in Mongo loading each referenced object over and over again.

Like me, you would be wrong :)

Lesson of the day - use JProfiler - it just might surprise you!

Monday, 21 March 2011

VMware - moving hosts between data centers

I wanted to move vCenter to a virtual machine running on one of the ESXi hosts that it was managing. Chicken and egg, huh? :) But this is actually a supported configuration.

So I removed all the hosts, shut down vCenter, created the new virtual machine, installed vCenter and tried to add the first host, and received a very cryptic 'You do not hold privilege "System > View" on folder ""'.

Hmmm....

Turns out it was because I chose to turn on "Lockdown" mode on all the hosts, which restricts who can administer them. Logging in to the host itself via iLO and disabling lockdown in the VMware console does the trick.

Thursday, 17 March 2011

Best filesystem for virtual machine running SQL Server (part 2)

This is the second part - please read part 1.

I don't care about synthetic tests like bonnie++ or dd if=/dev/zero etc. I am only interested in finding out which configuration runs my test the fastest.

However, it is nice to see the difference :) so these are the results of copying an 18GB file into each of the three configurations:

(Recall: the first config is an XFS partition on a single disk, the second is RAID0 with XFS and the third is RAID1 with XFS.)

(Copying from the non-raid partition into a non-raid partition on the second disk)

time `dd if=test of=/data/singleb/test bs=1M ; sync`

18267+1 records in

18267+1 records out

19155058688 bytes (19 GB) copied, 164.756 s, 116 MB/s

real 2m52.385s

user 0m0.030s

sys 0m26.490s

(Copying from the non-raid partition into the RAID0 partition)

time `dd if=test of=/data/raid0/test bs=1M ; sync`

18267+1 records in

18267+1 records out

19155058688 bytes (19 GB) copied, 266.782 s, 71.8 MB/s

real 4m28.422s

user 0m0.010s

sys 0m27.840s

(Copying from the non-raid partition into the RAID1 partition)

time `dd if=test of=/data/raid1/test bs=1M ; sync`

18267+1 records in

18267+1 records out

19155058688 bytes (19 GB) copied, 400.644 s, 47.8 MB/s

real 6m47.905s

user 0m0.050s

sys 0m26.620s

Copying from one disk to another is the fastest, followed by RAID0 then RAID1. You might have expected RAID0 to be the fastest, but recall that the 18GB file is being read from a partition on the same disks, so one of the disks in RAID0 (and RAID1) is involved in reading as well as writing, hence the slowdown.

OK, since I opened this door, maybe a fairer (but still meaningless :)) test would be to use dd if=/dev/zero so the disks are purely available for writing..... Results of that silly test (time `dd if=/dev/zero of=18G bs=1M count=18000; sync`) are:

single disk:

18000+0 records in

18000+0 records out

18874368000 bytes (19 GB) copied, 156.651 s, 120 MB/s

real 2m43.003s

user 0m0.040s

sys 0m16.760s

raid0:

18000+0 records in

18000+0 records out

18874368000 bytes (19 GB) copied, 82.0213 s, 230 MB/s

real 1m25.195s

user 0m0.030s

sys 0m16.800s

raid1:

18000+0 records in

18000+0 records out

18874368000 bytes (19 GB) copied, 189 s, 99.9 MB/s

real 3m16.593s

user 0m0.040s

sys 0m17.110s

All of this is meaningless really - the copy from one disk to another is probably the closest to the performance you will get in real life. Whilst RAID0 flies for a long sequential write, it slows down significantly when it has to be read from as well - i.e. real life.

Based on this, I don't know whether it will be more performant to have the OS and one DB on disk1 and the second DB on disk2 or just both on RAID0... We shall see! :)

DISCLAIMER - RAID0 means losing all your data stored on that partition if *any* disk dies.

Best filesystem for virtual machine running SQL Server (part 1)

I need to run SQL Server in a VM on top of Linux (Ubuntu 10.10) using VirtualBox ('cause getting VMware Player running is just not worth it!).

I have two 640GB WD6402AAEX disks (fairly quick) and the OS is installed on a 64GB software RAID1 partition.

In fact, to be clear:

- /dev/sda1 is 16GB SWAP

- /dev/sda2 is 64GB RAID1 (software)

- /dev/sdb1 is 16GB SWAP

- /dev/sdb2 is 64GB RAID1 (software)

The base machine is a quad-core i5 760 CPU (2.80 GHz) with 16GB RAM (purchased from those great people at http://pcspecialist.co.uk/).

The question is, which is the best setup for virtualising SQL Server? My plan is to install a single Server 2008 64Bit VM with 12GB RAM (leaving 4 for the Ubuntu 10.10 host). That VM will have 48GB for the OS and 64GB for SQL Server data files. It will also have IntelliJ 8 (don't ask) configured to run a development job which sucks data from one database and sticks it in another database. The same configuration (using Windows 7) takes about an hour to run when installed directly onto the hardware.

I plan to test the following scenarios:

- guest OS on /dev/sda3 (XFS), data on /dev/sdb3 (XFS) i.e. no RAID

- guest OS and data on RAID 0 (XFS)

- guest OS and data on RAID 1 (XFS)

*remember* - the host OS is running on a RAID1 partition on the same disks.
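For anyone wanting to reproduce the three layouts, this is roughly what creating them looks like with Linux software RAID (a sketch only - the md device name /dev/md1 is an assumption, and these commands destroy whatever is on the partitions):

```shell
# Hypothetical sketch of the three test layouts. DESTRUCTIVE - do not run
# against partitions holding data you care about.

# 1. No RAID: plain XFS on each disk's spare partition
mkfs.xfs /dev/sda3
mkfs.xfs /dev/sdb3

# 2. RAID0 (striping) across both partitions, XFS on top
mdadm --create /dev/md1 --level=0 --raid-devices=2 /dev/sda3 /dev/sdb3
mkfs.xfs /dev/md1

# 3. RAID1 (mirroring) across both partitions, XFS on top
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3
mkfs.xfs /dev/md1
```

Between the RAID0 and RAID1 runs the array has to be stopped (mdadm --stop) and re-created, since the same partitions are reused.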

part 2 will show some meaningless synthetic tests and part 3 will show the results of the real test.

XFS or EXT4 for OS desktop?

Which file system should you use for the OS? Quite simply, don't use XFS - use EXT4 (or whatever). Installing a minimal Ubuntu server took absolutely ages (hours) on XFS over RAID0 and RAID1.

Think about it - XFS *rocks* at large files, but its weak spot is tiny files - and what does an installation involve? Lots of small files.

Changing it to EXT4 and re-installing reduced it from hours to mere minutes (20 or so - I didn't time it).

I will be using XFS for storing the virtual machine images though - for sure. Just not the OS.