I Am Not Charles

W2E Day 3: Morning Presentations

Posted in the Kitchen, the Living Room by Joe on September 30, 2010

JavaScript is the New Black – Why Node.js is Going to Rock Your World
by Tom Hughes-Croucher of Yahoo

Node is a Javascript interpreter that’s getting a lot of buzz. Basically it acts the same as a Python or Perl runtime (or as Tom said repeatedly, “Python or Ruby” – not a Perl fan apparently, which earns him some points with me), letting you run Javascript without a browser and putting it on the same level as the popular desktop or server-side scripting languages.

I’ve been wanting this for years: Javascript is a well-designed, powerful language with clean syntax, and there’s no reason it should be limited to embedding in browsers. And because it has a 100% lock on browser scripting, pretty much everybody has to learn it at some point anyway, so why switch back and forth between scripting languages for other tasks?

Tom makes this point more strongly, pointing out a huge number of job postings for Javascript programmers: web sites are now so complex that companies are not just hiring visual designers and expecting them to slap on some Javascript copied from a web site, they’re hiring full-fledged developers to code up their sites. Using Javascript on the server lets these developers write both back-end and front-end code rather than needing a separate team for each.

I don’t think this is a 100% win: every serious programmer should learn several languages so that they can distinguish the philosophy and structure of programming in general from the quirks of their particular language, so a pure Javascript developer who can’t pick up whatever language is being used on the server side isn’t much of a developer at all. But as long as you remain proficient in several languages – especially if they come from different paradigms – having to switch back and forth during day to day tasks which should be related does slow you down, so artificially limiting Javascript to the browser is a penalty even if it does help to discourage laziness.

The other big benefit touted by Tom is code reuse – which is a 100% win. There is often logic duplicated between client and server – form validation is a big example – and using Javascript on the server lets you use the exact same code, rather than having to rewrite the same algorithm in two different languages, a huge source of bugs. In fact, using Javascript on the server enables shared logic at a level that would be infeasible if it had to be written twice: consider a page that writes a lot of its HTML dynamically through Javascript. In a technique Tom refers to as “Progressive Enhancement”, the first pass is done on the server, using the complete widget set and dynamic logic used on the client, so that as soon as the HTML is received it can be rendered instantly. But the dynamic Javascript is also repeated on the client side so that as the user interacts the page is reconfigured in the browser without going back to the server. (The server-side and client-side code will never be 100% identical, but at least it will have the same base rather than trying to do the same thing twice from scratch.) There is an example of this in the YUI/Express demo, with Yahoo UI widgets rendered first on the server without sacrificing client interaction. Tom demonstrated the Table View widget, which turned up a glitch in this scheme: the spacing generated on the server did not exactly match the client, so the widget originally rendered with header tabs squished together slightly and then spaced them out, leading to a slight UI flash. This is ugly and needs to be addressed (although I don’t know if it’s a systemic problem or just because the simplistic demo didn’t include any code to deal with it.) Still, that split second when the initial layout was flashed would have been blank without server-side rendering.
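
To make the form-validation example concrete, here’s a minimal sketch of one validator shared between browser and server. This is my illustration, not code from the talk; the function name and the deliberately simple regex are mine:

```javascript
// One validation function, written once, used in both places.
function validateEmail(value) {
  // Deliberately simplistic check for illustration, not a full RFC parser.
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value);
}

// In the browser this file is just included via a <script> tag; under Node
// the usual guard exposes it as a module so server code can require() it.
if (typeof module !== 'undefined' && module.exports) {
  module.exports = { validateEmail: validateEmail };
}
```

The point isn’t the regex itself – it’s that when the validation rules change, there is exactly one place to change them.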

Under the hood, Node.js uses Google’s V8 engine and contains a non-blocking, event-driven HTTP server written in 100% Javascript which compares in performance with nginx. The performance graphs Tom showed were impressive and it seems to scale quite well (far better than Apache, for instance.) One big hole right now is that HTTPS support is sketchy, but this is being worked on.

One interesting technical note Tom highlighted: to make use of multi-core hardware with an event-driven server, new threads or processes need to be spun off by hand for heavy work (as opposed to automatically for each connection as in Apache). Although Node does support the fork system call, it also implements the HTML5 Web Workers spec. That means rather than using slightly different concepts to spawn helpers on the client and the server, developers can reuse their knowledge when writing code in both places.

As a new language (in this context), Javascript doesn’t have as many 3rd-party libraries available as, say, Python and Ruby. But with the buzz it’s getting, more are popping up quickly: Tom showcased several, all available at GitHub:

NPM, the Node Package Manager
Mustache, a logic-less templating language (which Twitter currently uses in JS in the client but Ruby on the server)
Express, an MVC framework similar to the lower levels of the Rails stack
Paperboy, a static file server
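
To give a flavour of why sharing templates between a JS client and a Ruby server is attractive, here’s a toy version of the idea – the real Mustache handles sections, partials, and escaping; this sketch of mine only does {{name}} substitution:

```javascript
// Toy {{name}}-substitution renderer; keys missing from the view become "".
function renderTiny(template, view) {
  return template.replace(/\{\{(\w+)\}\}/g, function (match, key) {
    return key in view ? String(view[key]) : '';
  });
}
```

Because the template itself contains no logic, a second implementation in another language can render the identical template file – which is exactly the client/server split described above.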

As well as using it as a web server, Node has an interactive shell just like Python’s or Ruby’s. Definitely going to be picking this up for my scripting needs, even though I don’t exactly do much server development.

Tom’s slides are online at http://speakerrate.com/sh1mmer.

When Actions Speak Louder Than Tweets: Using Behavioral Data for Decision-Making on the Web
by Jaidev Shergill, CEO of Bundle.com

Now here’s how to make a product focused presentation without sounding like a shill:

– Here are the resources we have that most people don’t (a large database of consumer behaviour data, including anonymized credit card purchases from a major bank, government statistics and nebulous “third party databases”)
– Here are some studies we did for our own information, whose results we think you’d find useful (“We tracked a group of people in detail and interviewed them to find out in depth how they make decisions”)
– Here’s a neat experiment we put together using these two pieces of information – we don’t even know if we’ll release it, we just wanted to find the results (and here they are)
– Oh, and here’s our actual product

Jaidev presented two theses, the first gleaned from interviewing study participants and the second from his own experience:

1. There’s more than enough information on the Web to make decisions, but 99% of it is useless for the specific person looking at it, because – especially when looking at opinions and reviews – people need to know how people that are like them feel about an option. (Here we are talking about subjective decisions like, “Is this a good restaurant?” or decisions with a lot of variables like, “Does this new device fit my exact needs?”)

2. Online user-generated content is nearly useless for finding opinions because it is not filtered right. For example, review sites tend to polarize between 5 star and 1 star reviews because only users with strong opinions bother to rate, so all reviews are distorted. Many people filter by their social circle since their friends (mentions on Facebook, Twitter, etc) have things in common so their recommendations carry more weight, but this means that recommendations are skewed towards options with the latest hype. It turns out people are much better at reporting new things they just found than what they actually use long-term.

To illustrate this, Jaidev presented an experiment in which he used his company’s credit card database to build a restaurant recommendation system, by drawing a map between restaurants based on where people spent their money, how often they returned, and how much they spent there. Type in a restaurant you like and the system would return a list of where else people who ate at that restaurant spend their money. Rather than a subjective rating, the tool returns a “loyalty index” quantifying how much repeat business the restaurant gets. Presumably this will be more useful to you than a general recommendation because the originators of this data share at least one important factor with you: a love of the original restaurant.
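
As a rough illustration of what a “loyalty index” might compute – my sketch; Bundle’s actual model is proprietary and also weighs how much and how often people spend – score a restaurant by the fraction of its customers who came back:

```javascript
// transactions: array of { customer, restaurant } records.
// Returns the fraction of this restaurant's customers with repeat visits.
function loyaltyIndex(transactions, restaurant) {
  var visits = {}; // customer -> visit count at this restaurant
  transactions.forEach(function (t) {
    if (t.restaurant === restaurant) {
      visits[t.customer] = (visits[t.customer] || 0) + 1;
    }
  });
  var customers = Object.keys(visits);
  if (customers.length === 0) return 0;
  var repeaters = customers.filter(function (c) { return visits[c] > 1; });
  return repeaters.length / customers.length;
}
```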

The result was that a restaurant which was highly recommended both on review sites and in Jaidev’s circle rated very low. Compared to restaurants with similar food and prices, customers returned to this one far less often and spent far less. Reading reviews in depth revealed that, while the highest ratings praised the food quality, middling ratings said that the food was good but management was terrible, with very slow service and high prices. Equally good food could be found elsewhere with less cost and hassle. This information was available in reviews, but hard to find since it was drowned out by the all-positive or all-negative reviews.

So the main point to take away from the presentation is: hard data through data mining is still more valuable than the buzz generated through social media. Which is obvious, but a good point to repeat at this conference which is full of people who are so excited about adding social components to everything.

Jaidev did a great job of demonstrating the value of his company’s data set without actually sounding like he was selling it. He only demonstrated bundle.com itself briefly: it seems to be a money management site which allows users to compare their financial situation to the average and median to answer questions like, “Am I spending too much on these products?” and, “How much should I budget for this?”. The example Jaidev showed was an interactive graph of the cost of pet ownership. Looks like a useful site.

Alas, the equally useful looking restaurant recommender was only a proof of concept and is not released to the public. (And only covers Manhattan.) Email jaidev@bundle.com if you want to see it made public.

(While I’m attending this conference on behalf of Research In Motion, this blog and its contents are my personal opinions and do not represent the views of my employers. How does the unicorn breathe?)


W2E Day 2: Morning Presentations

Posted in the Kitchen by Joe on September 29, 2010

I’ve resigned myself to being a day behind in blogging. I’m just too slow at organizing my thoughts to get them into an order worthy of publishing in the evenings. Better to come at it fresh in the morning.

The shift from Monday’s 3 hour long workshops to yesterday’s shorter presentations was pretty jarring. I kept thinking the presenters were just finishing their introduction and finally about to go into detail – and the talk would be over. So my biggest complaint about a lot of these talks is that they gave such a broad overview that I didn’t really learn much.

How to Develop a Rich, Native-quality User Experience for Mobile Using Web Standards
by David Kaneda of Sencha, a company that was namedropped quite a few times by Jonathan Stark yesterday as the makers of a really high quality Javascript UI toolkit.

I expected this to overlap quite a bit with Jonathan’s workshop, but I wanted to see it anyway because I spoke with David briefly afterwards and I wanted to see how his opinions differed.

He offered quite a few nuggets of info like this interesting graph: in January 2010 the breakdown of US mobile web traffic was 47% iOS (iPhone/iPad), 39% Android, 7% RIM, 3% WebOS (Palm-now-HP Pre), 2% Windows Mobile, 2% Other – so 95% WebKit. (Actually David was wrong about that – if these numbers are from January the RIM share would have to be pre-Torch, so that’s only 86% WebKit. For now.) Even at the height of IE’s popularity that kind of homogeneity is new for the web. It means that mobile developers, unlike web developers, are free to target the WebKit-specific subset of HTML5 without worrying about, for a trivial example, including both -webkit and -moz prefixed CSS properties. That’s good because with limited networking, avoiding redundancy is important. I suspect it’s temporary, though, as new features are added and phones diverge in which exact WebKit versions they support.

(Of course, this list reflects web usage, not phone ownership: these are the phones whose users spend the most time online. Even still I was surprised to see the Pre passing Windows Mobile.)

Wow, there are a lot more Android phones than I realized. David believes a year from now the basic free phone you get with your mobile account will be a shovelware Android phone, with a touch screen, GPU, etc, and a good browser, so really rich web apps will be available to everyone, not just the high end. (But, as I mentioned above, WebKit will continue to evolve, and these bargain phones are unlikely to ever see browser upgrades.)

As I expected, David mentioned a lot of the same things yesterday’s workshops did: he glossed over web storage and form validation (as well as web workers, which everybody seems to mention and then says “but I don’t have time to discuss that”.) He did cover, briefly, the new input types, video/audio (mentioning that video can be styled using the new CSS transforms, which stands to reason but hadn’t occurred to me to try; I assume it doesn’t distort the contents, that would be crazy!), meta viewport tags, CSS3 features, and the application cache manifest. That’s my biggest eye-opener so far – I helped to test the cache manifest in the Torch browser, although someone else did the implementation, and when I read the spec I thought it was ridiculous and would never catch on. But all the mobile app developers here seem really excited about it. It’s a huge pain in the ass to use, though, so clearly there’s some room for improvement (or at least tool support.)

David then went on to talk about some glitches in the mobile implementation, which is valuable info:

Touchscreen events: apparently there is a hardcoded 350ms delay in tapping an element before the click event is fired. That’s crazy sluggish! (He speculated that the reason was an attempt to allow people to cancel quickly if they touch the wrong thing, since fingers are thick and clumsy. I’ve seen other posts online guessing that this is to allow the user to continue and turn the click into a gesture.) David recommends working around this by binding custom events to touch down and touch up, which generate a click event immediately if the user does touch down and then touch up within a small radius. I dunno – sounds hackish and fragile to me. (I looked this up to see if it was in a spec somewhere or just an implementation detail, and all I could find was posts complaining about it on the iPhone. I’ll have to check with our UI people and see if it’s iPhone-specific or built into WebKit.)
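
The workaround David described looks roughly like this – my sketch, not his code, and the 10-pixel radius is a made-up threshold:

```javascript
// Pure helper: did the finger stay within `radius` pixels between down and up?
function isTap(start, end, radius) {
  var dx = end.x - start.x, dy = end.y - start.y;
  return dx * dx + dy * dy <= radius * radius;
}

// Browser-only binding: fire the handler on touchend instead of waiting
// ~350ms for the synthesized click event.
function bindFastClick(el, handler) {
  var start = null;
  el.addEventListener('touchstart', function (e) {
    var t = e.touches[0];
    start = { x: t.clientX, y: t.clientY };
  });
  el.addEventListener('touchend', function (e) {
    var t = e.changedTouches[0];
    if (start && isTap(start, { x: t.clientX, y: t.clientY }, 10)) {
      handler(e);
    }
    start = null;
  });
}
```

You can see where the fragility comes in: this has to reimplement the browser’s own tap-vs-drag judgment, and any threshold it picks will disagree with the platform sometimes.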

No fixed elements: position:fixed doesn’t work on mobile – he didn’t say why, so I had to look it up. Actually it works fine to fix an item to a specific position on the page, that’s just not very useful on mobile since most people want it fixed to the viewport. This makes building pages with independently scrolling areas difficult. To fix this, he again recommends writing custom touch handlers to track movement events and update content based on that – a simple implementation would just move linearly, but to match the platform native scrolling, you would need to track acceleration and add bounce at the end, which gets pretty complex.
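
To show where the “pretty complex” part comes from, here’s just the post-release decay piece of kinetic scrolling – a toy model of mine; real implementations also handle frame timing, edge bounce, and interruption by a new touch:

```javascript
// After the finger lifts with velocity v0 (pixels per frame), decay it by a
// friction factor each frame and accumulate the total scroll offset.
function kineticOffset(v0, friction, frames) {
  var offset = 0, v = v0;
  for (var i = 0; i < frames; i++) {
    offset += v;
    v *= friction;
  }
  return offset;
}
```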

Wow. That’s even worse! After all the talk of moving transitions and animations into CSS so they can be implemented in the browser and accelerated, it saddens me deeply to hear people talk about implementing kinetic scrolling of all things in Javascript.

Finally he listed some Javascript frameworks to help with mobile development:

iScroll and TouchScroll both wrap touch scrolling as described above, and contain their own implementations of acceleration.

jQTouch he described as “limited” and “HyperCard-like”; I guess he’s allowed to denigrate it since he wrote it. It fixes the 350 msec delay but not scrolling.

And Sencha Touch, his company’s product which is in beta, abstracts all touch events to generate artificial events like doubletap, swipe, and rotate, implements independently scrollable areas, and adds a ton of other features.

For deployment he mentioned OpenAppMkt, an app store for web apps, and PhoneGap again for wrapping web apps in a native shell.

David’s slides and additional links are online at his site, including a lot of links to further resources. (Including a bunch I took the time to Google for myself. Oops.)

And at this point my fingers started to cramp up and my notes became much sketchier, so the rest of these writeups will be much, much shorter.

The Browser Performance Toolkit
by Matt Sweeney of YUI/Yahoo!

I was expecting this to describe a specific product, but it turned out to be a metaphorical toolkit – a list of performance measurement tools.

I thought this talk could have benefited from some more demonstration: how to read a waterfall diagram, how to use a profiler. It came off as just a list of products to check out later. It also didn’t do much to distinguish between them: there was a lot of, “And this has a network waterfall display which is a lot like the last one.” So why should I choose this one over the last one? When somebody asked which products he personally used, Matt’s answer was, “All of them,” which isn’t too helpful. Would have liked more depth here.

I won’t bother linking to these since there are so many and they’re easy to Google:

Firebug: Firefox’s builtin debugger, gives a waterfall display of asset loading, and can break out each load into DNS lookup time, connection time, data transfer time, etc; also has a Javascript profiler, which unfortunately is limited because it only gives function names and call counts, not full call stacks.

YSlow: an open source Firebug add-on from Yahoo, which gives a grade on various categories of network usage and detailed stats, good for getting a high-level view of network usage.

Page Speed: another open source Firebug add-on, from Google; similar to YSlow but also tracks some runtime performance. It can profile deferable Javascript by listing functions not yet called at onload time (candidates for deferral), and displays waterfall diagrams including separate JS parse and execute times.

Web Inspector: WebKit’s built-in debugger. Similar to Firebug, plus an “Audits” tab similar to YSlow/Page Speed. It includes the standard network loading waterfall, and a JS profiler which unlike Firebug’s includes system events like GC and DOM calls, and does include call stacks but not call counts, just run times.

IE8 “has a robust set of developer tools”, but he didn’t say any more about them. IE9 has a network usage tracker and profiler which seem on par with Firebug and Web inspector, with one huge caveat: the profiler’s timer is in 15msec chunks! So runtimes are not close to accurate due to rounding. At least call counts are still useful.

DynaTrace AJAX Edition is a free IE add-on supporting versions 6-8, with IE9 and Firefox support coming soon. It has the standard network waterfall, plus some nice high-level pie charts showing percentage of time spent loading various resources. Its profiler tracks builtins like Web Inspector’s, and can also capture function arguments, which sounds very useful, and has the nicest UI demonstrated, including pretty-printing the source code when looking at a function (especially useful when debugging obfuscated JS).

Matt also mentioned a few online tools: apparently showslow.com aggregates YSlow, Page Speed and DynaTrace scores, and publishes the results so it’s a good way to compare your page to others. But when I tried to go there I got what looked like a domain squatter, and I see no mentions of it in Google – did I copy the name down wrong? Webpagetest.org does network analysis for IE7.

Mentioned but not detailed were Fiddler and Charles (no relation), proxies which among other things can be used to see Flash resource requests over the wire.

His final point was that, since browsers vary, you can’t just use one tool and need to profile in multiple browsers with multiple tools. Which makes sense, but it would have been nice to give more detail on what YSlow or Page Speed give you over the browsers’ builtin debuggers.

NPR Everywhere: The Power of Flexible Content
by Zach Brand, head of NPR’s digital media group

I could have gone to the HTML5 vs Flash presentation in this time slot, but after seeing 3 variants on HTML5 in a row I figured I should see something a little different instead. Here, the head of NPR’s digital media group described the challenges in getting NPR’s original news reporting formatted in a way that could be used in many contexts: large screens, small mobile devices, summarized on affiliates’ web sites, serialized through RSS, embedded in blogs.

This was another presentation where I would have liked to see more technical detail and less overview, but I guess there wasn’t time. The talk was interesting, but didn’t contain anything directly applicable to my work so I won’t bother to summarize it.

I will list two important links, though:

An API for public access to NPR’s news stories is at npr.org/api.

Zach promised a followup on the blog at npr.org/blog/inside

(While I’m attending this conference on behalf of Research In Motion, this blog and its contents are my personal opinions and do not represent the views of my employers. That ghost is full of cake.)

W2E Day 1: Building Cross-Platform Mobile Web Apps with Jonathan Stark

Posted in the Kitchen by Joe on September 28, 2010

Jonathan Stark, author of two O’Reilly books on mobile app development, shares techniques for doing it with web technologies.

This was a good companion to this morning’s session (okay, yesterday morning’s session now: Alex’s workshop took so long to write up that now I’m behind.) It covered a lot of the same ground but in a more hands-on, less theoretical way. It discussed the same CSS3 features: transforms, transitions, animation, gradients, rounded corners, and text shadows, but gave more complete code examples, took some time to explain them, and tweaked a lot of parameters to show their effect.

Jonathan disagreed with Alex on one thing: they both gave equal weight to Web Storage (for keeping simple persistent data on the client) and app cache manifests (for keeping resources on the client), but Jonathan went on to give a gung-ho demo of HTML5’s SQL database integration, which Alex dismissed yesterday saying the API was “a mess”. One reason might be that Jonathan was speaking specifically about writing mobile pages, which means WebKit (which has the SQL API), while Alex, despite being a Chrome developer, was being careful to keep his talk cross-platform and highlight the Firefox and Opera way to do things. I’m not sure now if Alex meant that the db situation is “a mess” because there is no convergence, or if he had actual problems with the API design.

Apart from the practical overview of new CSS and HTML features, Jonathan did cover some more philosophical questions: is it better to use native UI toolkits to write a mobile app for each device, or just write your “app” in HTML and use these new features to make it look as much like a native app, rather than a web page, as possible? HTML is the clear winner in terms of developing for multiple devices (no need to learn Objective-C just to build for iPhone), distribution, and updating (just dump the new version to the live site), but native apps still have slightly better cosmetics and – critically – access to device features. Random apps on a web site aren’t allowed to access the camera, dial the phone or open the address book.

Nonetheless, Jonathan strongly recommended using web tools to build apps, since it lets you target more than a single device. To get around the sandboxing problem, he touted PhoneGap, a very cool looking open source framework for compiling HTML apps into native packages. Just dump your web code into a dir, point PhoneGap at it, and it will generate an appropriate project to embed it for your platform (some Java classes and ant build files for Android, some Objective-C code and an Xcode project for iPhone, etc). Even better, it generates bridges to allow access to sandboxed features like the camera from Javascript (though obviously you can’t test this in the browser). Their supported feature list only mentions Blackberry 4.5, but that’s not too surprising as the WebKit browser is so new. Hopefully support for BB6 widgets will be coming soon. I’ll definitely need to take a look at this project.

It’s important to note that Jonathan was talking about building apps through HTML, not just web pages. His examples blatantly ignored a lot of Alex’s optimization (or more precisely “not sucking horribly”) advice, but as he reminded me when I talked to him afterwards, not blocking on network performance doesn’t matter as much once the app is installed locally. He swears by Steve Souders for performance advice: I’ll need to check out his stuff to see how it compares. (I see he and Alex both work for Google, so probably Alex got it from him in the first place.)

Jonathan also touted the jQTouch library that he maintains, originally written by David Kaneda, whose presentation I’m going to see tomorrow. (Today, now. In fact, I’ve already seen it. But let’s maintain the fiction that I just walked out of the workshop.) It looks like a pretty good widget toolkit – built on jQuery, but I won’t hold that against it for native development. I wonder how hard it would be to rip out the jQuery usage and replace it with Alex’s h5.js…

Jonathan had some useful advice for mobile developers as well: beyond the obvious small screen and slow, unreliable, expensive data channels, remember that the user’s interaction with a mobile device will be different because they are almost always using it in a distracting environment. Users want to pop open the app, perform a tiny task, and be done before they reach the head of the line. That means always let the user know how much is left in their interaction, break things up into tiny chunks, and if there is ever the slightest pause, for God’s sake throw up a progress bar or something!

A more specific recommendation: on a touch device, put controls (search bars, navigation buttons, etc) on the bottom of the page, since reaching up to tap a control will block the user’s view. Good common-sense advice; I hadn’t thought of that.

One thing that rubbed me the wrong way, as a traditional developer, is that Stark taught terrible coding discipline. Several times he pasted the same code into his sample app over and over and said, “this is a bit verbose, but you can just add a macro to your text editor to paste it in automatically.” Augh! No! No more cut and paste code – abstract it into a function if you use it more than once! (Unless maybe Javascript function calls have a lot of overhead I’m not aware of that make two line wrapper functions horribly inefficient.)

Other than that, a pretty good presentation, and it was good to see that a web designer is just as excited about new HTML features as browser developers expect (and for much the same reasons – both easier to write and faster).

Slides and some extended notes are promised at http://jonathanstark.com/web20.

(While I’m attending this conference on behalf of Research In Motion, this blog and its contents are my personal opinions and do not represent the views of my employers.)

O(log n) continues to beat O(n^2)

Posted in the Kitchen, the Utility Room by Joe on March 3, 2009

Two items:

1. I feel pretty bad about the several months of silence. I swore this time I wouldn’t start a nice professional journal and then let it languish. Oops.

2. I also didn’t want to post links with no original content but, well, this is a pretty cool result and I have nothing more to add to it. So here you go.

(Brief summary: my coworkers found a way to significantly speed up image tiling in Qt, using a simple algorithm that’s easily applicable to other toolkits and environments. Briefer summary: It makes painting faster, which is always good.)

Your clipboard isn’t broken, just confused

Posted in the Kitchen by Joe on July 4, 2008

This is kind of trivial, but it’s good to have it documented somewhere.

If you ever have to work with the Windows clipboard API directly (and it’s not too bad, as Windows APIs go) this might save you a lot of time: don’t try to step through any clipboard-related code in the debugger.

I was trying to figure out why pasting an image into my app didn’t work, so obviously the first thing to check is that the data is actually being retrieved from the clipboard correctly. I suspected it wasn’t being saved in the format I thought it was.

BOOL clipboardOpen = ::OpenClipboard(NULL);
if (clipboardOpen) {
    qDebug() << "Clipboard open";
} else {
    qDebug() << "Couldn't open clipboard: error" << ::GetLastError();
}

UINT format = 0;
// Note the parentheses: assign the result of EnumClipboardFormats first,
// then compare against 0.
while ((format = ::EnumClipboardFormats(format)) != 0) {
    qDebug() << "Clipboard contains format" << format;
}

qDebug() << "Last EnumClipboardFormat status was" << ::GetLastError();

MSDN is pretty clear on how these two functions work: OpenClipboard returns true if the clipboard’s open, and EnumClipboardFormats returns 0 when there’s an error (in which case GetLastError returns the error code) or if it’s out of formats (in which case GetLastError returns 0, SUCCESS).

Since I was too lazy to actually hook up the Qt debug logger I was just stepping through this in the Visual Studio debugger to examine the results. And the results were basically:

Clipboard open
Last EnumClipboardFormat status was 1418

Since my app is emphatically not multithreaded, I was pretty baffled about how “Clipboard open” could be immediately followed by 1418: ERROR_CLIPBOARD_NOT_OPEN. I thought my paste problem was because my clipboard was seriously broken (on any OS but Windows I’d have thought something that fundamental was impossible, but on Windows I never assume anything). Took me ages to realize that it worked fine if it wasn’t in the debugger.

The problem, I think, is that when you pass NULL to OpenClipboard it associates the clipboard with the current process, and when you’re stepping through in the debugger it’s switching back and forth between the application and Visual Studio. Somehow the system is getting confused about which process has the clipboard open. This example seemed to work if you pass an HWND to associate it with a specific window instead of a process, but I wouldn’t want to place any bets that more complicated code would keep working. On Windows I never assume anything.

“Troubleshooting icecream” is not a helpful search

Posted in the Kitchen by Joe on April 25, 2008

I’ve been a happy distcc user for many years – every time I got a new computer, I tended to have the old one still sitting around with nothing much to do except be part of a compile farm. Really, what else are you going to do with the old thing? But last time I upgraded, it was because the old computer completely died, so for a year or so I’ve been laptop-only and distcc-free.

However! My new job sent me a company laptop, and the combination of two computers in the house plus big projects to compile from scratch means it’s time to get distcc back in shape. Except the new laptop came with SuSE (which I’m not a fan of, so far), and SuSE’s package manager only contained some new thing called Icecream.

Icecream has an unfortunate naming scheme for at least three reasons: if I need help, do I Google for “icecream” or “icecc”, the actual binary name? (This is compounded by the fact that it’s so new there’s very little documentation out there to find, and it’s pretty much overwhelmed by hits about actual ice cream. Mmmm.) Ubuntu has a completely different package called icecream – name collision! And finally, they had the gall to name their work scheduler “scheduler”, which tells you nothing about what it’s for – Ubuntu sensibly renamed it to “icecc-scheduler”.

Apart from that, though, this is a pretty nice improvement. The first obvious difference is that there’s no more manually keeping DISTCC_HOSTS in sync – thank God! One server runs the aforementioned “scheduler” daemon, which keeps track of which hosts are available and how loaded they are, and every host that wants to take on jobs runs the “iceccd” daemon, which – in theory – finds the scheduler by multicast. Even if multicast fails (see below) you just have to feed them the one scheduler hostname manually, instead of the entire list.

The other big improvement is that icecc has the ability to package up a toolchain in a chroot, and transfer the whole thing to a compile host. That means you don’t have to worry about keeping your tools exactly in sync on each host any more (although you can avoid overhead by doing so). I haven’t tried it yet, but if it works it will save me a lot of headache, since I have no intention of taking Ubuntu off my personal laptop and I can see things getting out of sync in a hurry. It can even be used for cross-compiles.

And now the bad: like I said, it’s new and there are very few docs. (EDIT: apparently it’s not so new, but it’s still harder to find information for it than for distcc. I guess I should have said it has less mindshare.) The page I linked to is about all there is, really. So as a public service, here are some troubleshooting tips, in the order I figured them out:

  • The main symptom of icecc not working is that all compiles go to localhost only. There are two main causes: iceccd on other hosts can’t find the scheduler, so they never register, or they register but the scheduler decides not to use them.
  • To check that a host is registering, run iceccd with the “-vvv” (very verbose) flag, and tail -f the log file (/var/log/iceccd on SuSE, /var/log/iceccd.log on Ubuntu). If you see “scheduler not yet found.” more than once or twice, this is the problem – you’ll have to pass the scheduler hostname manually with the -s flag. Or just pass the hostname manually from the start, because in my experience the autodetection never works.
  • If you see “connection failed”, it’s found the host but couldn’t connect for some reason. Check your firewall settings.
  • If all the hosts are connected (you’ll be able to see them in the detailed hosts view of icemon) but the scheduler still sends all jobs to localhost, check the scheduler log (/var/log/icecc-scheduler.log on Ubuntu; I can’t remember the name on SuSE, because I renamed mine to match Ubuntu’s and forgot to write down the original). Again, use “-vvv”. You’ll see “login protocol version: ” lines, again confirming the hosts are registered, followed by lots of “handle_local_job” lines, which confirm that the scheduler’s not sending anything out.
  • Make sure all the hosts have the same network name. Ubuntu’s default parameters set the network name to ICECREAM; SuSE’s left it blank. I didn’t trust leaving it blank so I added a -n parameter to the command line on SuSE. I don’t know if it was necessary, because my problem turned out to be the next one:
  • Make sure all hosts use the same protocol version. Turns out the Ubuntu package I installed was a few versions behind the SuSE version, so it used protocol version 27 while SuSE used version 29. (icecc developers: if your protocol isn’t backwards compatible, this would be a WONDERFUL thing to give an error message about!) I ended up compiling from source on both machines to make sure I had the same version – fortunately, it’s small.

Whew. Yes, it was a bit of a pain in the ass to get working – I hope cross-compiling works, or I’ll have to ask myself whether it would’ve been easier to just track down distcc for SuSE in the first place.