I Am Not Charles


W2E Day 3: Morning Presentations

Posted in the Kitchen, the Living Room by Joe on September 30, 2010

JavaScript is the New Black – Why Node.js is Going to Rock Your World
by Tom Hughes-Croucher of Yahoo

Node is a Javascript interpreter that’s getting a lot of buzz. Basically it acts the same as a Python or Perl runtime (or as Tom said repeatedly, “Python or Ruby” – not a Perl fan apparently, which earns him some points with me), letting you run Javascript without a browser and putting it on the same level as the popular desktop or server-side scripting languages.

I’ve been wanting this for years: Javascript is a well-designed, powerful language with clean syntax, and there’s no reason it should be limited to embedding in browsers. And because it has a 100% lock on browser scripting, pretty much everybody has to learn it at some point anyway, so why switch back and forth between scripting languages for other tasks?

Tom makes this point more strongly, pointing out a huge number of job postings for Javascript programmers: web sites are now so complex that companies are not just hiring visual designers and expecting them to slap on some Javascript copied from a web site, they’re hiring full-fledged developers to code up their sites. Using Javascript on the server lets these developers write both back-end and front-end code rather than needing a separate team for each.

I don’t think this is a 100% win: every serious programmer should learn several languages so that they can distinguish the philosophy and structure of programming in general from the quirks of their particular language, so a pure Javascript developer who can’t pick up whatever language is being used on the server side isn’t much of a developer at all. But as long as you remain proficient in several languages – especially if they come from different paradigms – having to switch back and forth during day to day tasks which should be related does slow you down, so artificially limiting Javascript to the browser is a penalty even if it does help to discourage laziness.

The other big benefit touted by Tom is code reuse – which is a 100% win. There is often logic duplicated between client and server – form validation is a big example – and using Javascript on the server lets you use the exact same code, rather than having to rewrite the same algorithm in two different languages, a huge source of bugs.

In fact, using Javascript on the server enables shared logic at a level that would be infeasible if it had to be written twice: consider a page that writes a lot of its HTML dynamically through Javascript. In a technique Tom refers to as “Progressive Enhancement”, the first pass is done on the server, using the complete widget set and dynamic logic used on the client, so that as soon as the HTML is received it can be rendered instantly. But the dynamic Javascript is also repeated on the client side so that as the user interacts the page is reconfigured in the browser without going back to the server. (The server-side and client-side code will never be 100% identical, but at least it will have the same base rather than trying to do the same thing twice from scratch.)

There is an example of this in the YUI/Express demo, with Yahoo UI widgets rendered first on the server without sacrificing client interaction. Tom demonstrated the Table View widget, which showed up a glitch in this scheme: the spacing generated on the server did not exactly match the client, so the widget originally rendered with header tabs squished together slightly and then spaced them out, leading to a slight UI flash. This is ugly and needs to be addressed (although I don’t know if it’s a systemic problem or just because the simplistic demo didn’t include any code to deal with it). Still, that split second when the initial layout was flashed would have been blank without server-side rendering.
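Going back to the form validation example, here is a rough sketch of what sharing that logic could look like. The file name, the `validateSignup` function and the form fields are all made up for illustration; nothing here comes from Tom's slides.

```javascript
// validate.js: hypothetical shared module (name and fields are assumptions).
// The same file can be loaded by Node on the server and by a <script> tag
// in the browser.

// Returns a list of error messages; an empty list means the form is valid.
function validateSignup(form) {
  const errors = [];
  if (!form.name || form.name.trim() === '') {
    errors.push('name is required');
  }
  // Deliberately loose check: something@something.something
  if (!/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(form.email || '')) {
    errors.push('email looks invalid');
  }
  return errors;
}

// Export for Node; in a browser <script> the function is already a global.
if (typeof module !== 'undefined' && module.exports) {
  module.exports = { validateSignup: validateSignup };
}
```

The browser can call `validateSignup` before submitting to give instant feedback, and the server can call the very same function again on the posted data, since client-side checks alone can't be trusted.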

Under the hood, Node.js uses Google’s V8 engine and contains a non-blocking, event-driven HTTP server written in 100% Javascript which compares in performance with nginx. The performance graphs Tom showed were impressive and it seems to scale quite well (far better than Apache, for instance.) One big hole right now is that HTTPS support is sketchy, but this is being worked on.
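For a feel of what "event-driven" means here, this is a minimal sketch using Node's built-in `http` module. The handler name, the greeting and the port number are arbitrary choices of mine, not anything shown in the talk.

```javascript
const http = require('http');

// One callback handles every request. Nothing in it blocks, so the single
// event loop is free to pick up the next connection straight away.
function handleRequest(req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello from ' + req.url + '\n');
}

const server = http.createServer(handleRequest);

// Uncomment to start serving (8000 is an arbitrary port):
// server.listen(8000);
```

There is no thread or process per connection: the server just calls `handleRequest` whenever a request event arrives.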

One interesting technical note Tom highlighted: to make use of multi-core hardware with an event-driven server, new threads or processes need to be spun off by hand for heavy work (as opposed to automatically for each connection as in Apache). Although Node does support the fork system call, it also implements the HTML5 Web Workers spec. That means rather than using slightly different concepts to spawn helpers on the client and the server, developers can reuse their knowledge when writing code in both places.
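To make the idea of handing heavy work to a helper concrete, here is a sketch using `worker_threads`, the worker module built into current versions of Node. Be warned that this is a substitution on my part: it arrived long after this talk, and although it uses the same message-passing style as Web Workers it is not the Web Workers API Tom described. The `sumInWorker` helper and the toy workload are invented for the example.

```javascript
const { Worker } = require('worker_threads');

// The worker's code is kept in a string so the example fits in one file.
// It sums 1..limit and posts the result back to the parent thread.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  let sum = 0;
  for (let i = 1; i <= workerData.limit; i++) sum += i;
  parentPort.postMessage(sum);
`;

// Hypothetical helper: runs the loop off the main thread and resolves
// with the total, so the event loop stays free while it works.
function sumInWorker(limit) {
  return new Promise(function (resolve, reject) {
    const worker = new Worker(workerSource, {
      eval: true,                    // treat the first argument as code, not a path
      workerData: { limit: limit }
    });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

sumInWorker(1000).then(function (total) {
  console.log('sum computed in worker:', total);
});
```

The main thread only posts data and waits for a message, which is the same mental model as a worker in the browser.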

As a new language (in this context), Javascript doesn’t have as many 3rd-party libraries available as, say, Python and Ruby. But with the buzz it’s getting, more are popping up quickly: Tom showcased several, all available at GitHub:

NPM, the Node Package Manager
Mustache, a JSON-like templating language (which Twitter currently uses in JS in the client but Ruby on the server)
Express, an MVC framework similar to the lower levels of the Rails stack
Paperboy, a static file server

As well as using it as a web server, Node has an interactive shell just like Python’s or Ruby’s. Definitely going to be picking this up for my scripting needs, even though I don’t exactly do much server development.

Tom’s slides are online at http://speakerrate.com/sh1mmer.

When Actions Speak Louder Than Tweets: Using Behavioral Data for Decision-Making on the Web
by Jaidev Shergill, CEO of Bundle.com

Now here’s how to make a product focused presentation without sounding like a shill:

– Here are the resources we have that most people don’t (a large database of consumer behaviour data, including anonymized credit card purchases from a major bank, government statistics and nebulous “third party databases”)
– Here are some studies we did for our own information, whose results we think you’d find useful (“We tracked a group of people in detail and interviewed them to find out in depth how they make decisions”)
– Here’s a neat experiment we put together using these two pieces of information – we don’t even know if we’ll release it, we just wanted to find the results (and here they are)
– Oh, and here’s our actual product

Jaidev presented two theses, the first gleaned from interviewing study participants and the second from his own experience:

1. There’s more than enough information on the Web to make decisions, but 99% of it is useless for the specific person looking at it, because – especially when looking at opinions and reviews – people need to know how people that are like them feel about an option. (Here we are talking about subjective decisions like, “Is this a good restaurant?” or decisions with a lot of variables like, “Does this new device fit my exact needs?”)

2. Online user-generated content is nearly useless for finding opinions because it is not filtered right. For example, review sites tend to polarize between 5 star and 1 star reviews because only users with strong opinions bother to rate, so all reviews are distorted. Many people filter by their social circle since their friends (mentions on Facebook, Twitter, etc) have things in common so their recommendations carry more weight, but this means that recommendations are skewed towards options with the latest hype. It turns out people are much better at reporting new things they just found than what they actually use long term.

To illustrate this, Jaidev presented an experiment in which he used his company’s credit card database to build a restaurant recommendation system, by drawing a map between restaurants based on where people spent their money, how often they returned, and how much they spent there. Type in a restaurant you like and the system would return a list of where else people who ate at that restaurant spend their money. Rather than a subjective rating, the tool returns a “loyalty index” quantifying how much repeat business the restaurant gets. Presumably this will be more useful to you than a general recommendation because the originators of this data share at least one important factor with you: a love of the original restaurant.
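Jaidev didn't show how the tool works internally, so the following is only my guess at the general shape of such a recommender. The visit records are invented, the "loyalty index" formula (share of a restaurant's customers who came back at least once) is an assumption, and amount spent is left out entirely to keep it short.

```javascript
// Hypothetical purchase records: one entry per visit.
const visits = [
  { customer: 'a', restaurant: 'Luigi' },
  { customer: 'a', restaurant: 'Luigi' },
  { customer: 'a', restaurant: 'Noodle Bar' },
  { customer: 'a', restaurant: 'Noodle Bar' },
  { customer: 'b', restaurant: 'Luigi' },
  { customer: 'b', restaurant: 'Hype Cafe' },
  { customer: 'c', restaurant: 'Hype Cafe' },
  { customer: 'c', restaurant: 'Noodle Bar' }
];

// Number of visits each customer made to one restaurant.
function visitCounts(restaurant) {
  const counts = {};
  for (const v of visits) {
    if (v.restaurant === restaurant) {
      counts[v.customer] = (counts[v.customer] || 0) + 1;
    }
  }
  return counts;
}

// Assumed definition: fraction of a restaurant's customers who returned.
function loyaltyIndex(restaurant) {
  const counts = Object.values(visitCounts(restaurant));
  if (counts.length === 0) return 0;
  return counts.filter(function (n) { return n > 1; }).length / counts.length;
}

// Other places visited by people who ate at `liked`, most loyal first.
function recommend(liked) {
  const fans = new Set(Object.keys(visitCounts(liked)));
  const others = new Set();
  for (const v of visits) {
    if (fans.has(v.customer) && v.restaurant !== liked) {
      others.add(v.restaurant);
    }
  }
  return Array.from(others)
    .map(function (name) { return { name: name, loyalty: loyaltyIndex(name) }; })
    .sort(function (x, y) { return y.loyalty - x.loyalty; });
}

console.log(recommend('Luigi'));
```

With this toy data, asking about Luigi ranks Noodle Bar (half of its customers came back) above Hype Cafe (nobody did), which is the kind of signal a star rating would miss.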

The result was that a restaurant which was highly recommended on both review sites and in Jaidev’s circle rated very low. Compared to restaurants with similar food and prices, customers returned to this one far less often and spent far less. Reading reviews in depth revealed that, while the highest ratings praised the food quality, middling ratings said that the food was good but management was terrible, with very slow service and high prices. Equally good food could be found elsewhere for less price and hassle. This information was available in reviews, but hard to find since it was drowned out by the all-positive or all-negative reviews.

So the main point to take away from the presentation is: hard data through data mining is still more valuable than the buzz generated through social media. Which is obvious, but a good point to repeat at this conference which is full of people who are so excited about adding social components to everything.

Jaidev did a great job of demonstrating the value of his company’s data set without actually sounding like he was selling it. He only demonstrated bundle.com itself briefly: it seems to be a money management site which allows users to compare their financial situation to the average and median to answer questions like, “Am I spending too much on these products?” and, “How much should I budget for this?”. The example Jaidev showed was an interactive graph of the cost of pet ownership. Looks like a useful site.

Alas, the equally useful looking restaurant recommender was only a proof of concept and is not released to the public. (And only covers Manhattan.) Email jaidev@bundle.com if you want to see it made public.

(While I’m attending this conference on behalf of Research In Motion, this blog and its contents are my personal opinions and do not represent the views of my employers. How does the unicorn breathe?)


I’m a greasy little monkey

Posted in the Workshop by Joe on July 2, 2008

Wow, work’s really been kicking my ass lately. I’ve been meaning to update this blog for ages, but I’ve had no time. Finally got the day off and spent an hour or so learning to use Javascript and Greasemonkey. While we’re waiting for something more substantial, here’s my first script. You might even find it useful:

// ==UserScript==
// @name           Include Linked Images
// @namespace      ca.notcharles.greasemonkey
// @description    Add linked images to the end of a webpage.
// @include        http://www.wizards.com/*&pf=true
// ==/UserScript==

/*
Copyright (c) 2008 Joe Mason 

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/

var body = document.getElementsByTagName('body')[0];
var anchors = document.getElementsByTagName('a');

for (var i = 0; i < anchors.length; i++)
{
	var anchor = anchors[i];

	// only process anchors containing images
	if (anchor.getElementsByTagName('img').length == 0) continue;

	// append the link target (the full-sized image) to the end of the document
	var href = anchor.getAttribute('href');
	if (!href) continue; // skip anchors that have no link target
	var hr = document.createElement('hr');
	body.appendChild(hr);
	var img = document.createElement('img');
	img.setAttribute('src', href);
	body.appendChild(img);
}

I won’t bother going through it because there are a million Javascript tutorials out there.

So what’s it useful for? Well, Wizards of the Coast have been releasing Dragon and Dungeon magazine articles online – free, for the time being. Sooner or later they’ll start charging for them so I’ve been downloading as many as I can and saving them as PDFs. (The best way to do this is to click on the “Printer Friendly” link at the bottom of an article, and then print to PDF. On Windows you’ll have to install an add-on for this – I like PDFCreator.)

The problem is that some of them have thumbnailed images which link to a full-sized version, and I’d really like the full images to end up in the PDF. So this script just finds every image which is a link, and appends that image to the end of the page. It only runs on the printable format page (the “pf=true” at the end of the @include line). It just occurred to me it should really be checking that the link actually leads to an image, but meh – that’s not very likely for these articles, and if it happens I’ll deal with it then.

This article is a nice simple example to try it on.
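On the point about checking that a link actually leads to an image: one cheap way would be to look at the file extension in the URL. This is only a sketch, `looksLikeImage` is a made-up helper name, and the list of extensions is a guess at what these articles use.

```javascript
// Hypothetical helper: guess from the URL whether a link points at an image.
function looksLikeImage(href) {
  if (!href) return false;
  // Drop any query string or fragment before checking the extension.
  const path = href.split(/[?#]/)[0];
  return /\.(png|jpe?g|gif|bmp)$/i.test(path);
}

console.log(looksLikeImage('http://example.com/maps/dungeon.JPG?size=full'));
console.log(looksLikeImage('http://example.com/article.asp'));
```

In the script, a line like `if (!looksLikeImage(href)) continue;` right after `href` is read would skip anything else. The catch is that an image served from a URL with no extension would be skipped too.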