I Am Not Charles


The Best Albums of 2010, part 5: The Stupid Methods

Posted in the Workshop by Joe on January 10, 2011

Part 1

Part 2

Part 3

Part 4

We have a set of ballots containing the year end Best Albums of 2010 lists from 9 publications. Yesterday I listed 12 possible ways to count these ballots to come up with a unified list.

Today I’ll go through the ways that don’t work for this data.

First Past the Post

I’ve said that First Past the Post is a terrible voting system. Now, let me demonstrate:

First we count up the top vote from every ballot. (Easy to do by hand, since there are only 9 ballots.) The winner is the #1 album of 2010. Then we drop that winner and count up the top votes that remain. The winner is the #2 album of 2010. Repeat until we’ve ranked all the albums.

That gives:

4 Kanye West – My Beautiful Dark Twisted Fantasy
2 Arcade Fire – The Suburbs
1 Caribou – Swim
1 John Grant – Queen of Denmark
1 These New Puritans – Hidden

So Kanye West has the #1 album of 2010, by this count. That doesn’t make any sense – 4 publications ranked him #1, but 3 others didn’t like his album enough to rank it at all! Compare Arcade Fire, whose album was liked enough to put in the top 25 by everyone: on the 4 lists where Kanye is #1, Arcade Fire is #2, #3, #4 and #11, but on the 5 other lists, Arcade Fire beats Kanye West hands down. So over half of our voters greatly prefer Arcade Fire to Kanye West, and most of the remaining voters only prefer Kanye by a tiny amount. Only Pitchfork (Kanye West at #1, Arcade Fire at #11) would be greatly dissatisfied by putting Arcade Fire ahead of Kanye West, while Mojo, Q, and NME would be extremely dissatisfied to put Kanye ahead of Arcade Fire. (And NPR, with Arcade Fire at #1 and Kanye West at #10, is a mirror of Pitchfork – call that greatly dissatisfied. And while Rough Trade wouldn’t be happy with either result, having ranked Arcade Fire way down at #21, they’d surely be even more pissed off if the win went to Kanye West, who they didn’t rank at all.)

More concisely: Arcade Fire is the Condorcet Winner. Kanye West should not win. So we can discard this result already.

Just for fun, let’s remove Kanye and see how the top 3 ends up. Votes for #2 are:

3 Arcade Fire – The Suburbs
1 Caribou – Swim
1 Deerhunter – Halcyon Digest
1 John Grant – Queen of Denmark
1 LCD Soundsystem – This Is Happening
1 The Black Keys – Brothers
1 These New Puritans – Hidden

So Arcade Fire is #2. That’s not terrible. After removing Arcade Fire, votes for #3 are:

2 The Black Keys – Brothers
1 Beach House – Teen Dream
1 Caribou – Swim
1 Deerhunter – Halcyon Digest
1 John Grant – Queen of Denmark
1 LCD Soundsystem – This Is Happening
1 Robert Plant – Band of Joy
1 These New Puritans – Hidden

#3 is The Black Keys, by 1 vote. But we were perilously close to a 9-way tie for 3rd, which illustrates the weirdness of our data set: in most elections, there are a handful of candidates and hundreds of voters. We have hundreds of candidates and only a handful of voters. Some voting methods will produce a lot of ties, just because there are so few votes to go around that everyone will get one. These might be perfectly good voting methods for most elections; they just fall down on this edge case.

It looks like First Past the Post will be vulnerable to ties – if one list had ranked The Black Keys a little lower, we’d have one here. But it doesn’t matter, since we’ve already rejected it for failing to elect the Condorcet Winner. FAIL.
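The count-drop-recount procedure described at the start of this section can be sketched in a few lines of Python. This is only an illustration, not the tally I did by hand: the ballots below are made-up toy data, and the alphabetical tie-break is an assumption borrowed from the tiebreaker used elsewhere in this series.

```python
from collections import Counter

def fptp_ranking(ballots):
    """Rank candidates by repeatedly taking the first-past-the-post winner."""
    remaining = {c for ballot in ballots for c in ballot}
    ranking = []
    while remaining:
        # Each ballot votes for its highest-ranked candidate still in the race.
        tops = Counter()
        for ballot in ballots:
            for candidate in ballot:
                if candidate in remaining:
                    tops[candidate] += 1
                    break
        # Most votes wins; ties are broken alphabetically (an assumption).
        winner = min(tops, key=lambda c: (-tops[c], c))
        ranking.append(winner)
        remaining.remove(winner)
    return ranking

# Hypothetical toy ballots, best album first.
toy_ballots = [
    ["K", "A"],
    ["K", "C"],
    ["A", "L"],
]
toy_ranking = fptp_ranking(toy_ballots)
print(toy_ranking)
```

With so few ballots, the later rounds are decided almost entirely by the tie-break, which is the same weakness the real data shows.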

Approval Voting

Here’s one of those ties now. In approval voting, every album which appears on a list at all gets 1 point, and the album with the most points is #1. (Second most points is #2, etc.)

So we start off with a 2-way tie for first, followed by a 3-way tie for third:

9 Arcade Fire – The Suburbs
9 LCD Soundsystem – This Is Happening
8 Beach House – Teen Dream
8 The National – High Violet
8 Vampire Weekend – Contra

This is because we defined “approval” as “anywhere in the year end list”. In an actual approval vote, the voters would know beforehand how the votes would be counted and probably be more selective in who they vote for. These lists are not really saying “any of these 25 or 50 albums would be ok by us as Album of the Year”. Really, they would probably vote for their top 3 or so (or some would vote for their top 3, some for their top 5, some would vote only for their favourite, etc.) If everyone approved only their top 3, we’d get:

6 Arcade Fire – The Suburbs
4 Kanye West – My Beautiful Dark Twisted Fantasy
2 Beach House – Teen Dream
2 Deerhunter – Halcyon Digest
2 The Black Keys – Brothers
2 These New Puritans – Hidden
1 Caribou – Swim
1 Elton John and Leon Russell – The Union
1 Gil Scott-Heron – I’m New Here
1 John Grant – Queen of Denmark
1 LCD Soundsystem – This Is Happening
1 MGMT – Congratulations
1 Plan B – The Defamation of Strickland Banks
1 Robert Plant – Band of Joy
1 The National – High Violet

So now we have Arcade Fire at #1, Kanye West at #2 (both seem reasonable) and a 4-way tie for #3. And a 9-way tie for #7. And no way to rank anything after that.

In a real election, Approval Voting is better than First Past the Post because it has less need for tactical voting – for instance, take Rough Trade. The top of its Best Of list looks nothing like anyone else’s. If Rough Trade were a voter trying to actually influence an election, they would know (based on polls and publicity) that voting for Caribou, or Gil Scott-Heron, or These New Puritans, was useless – they have no hope of winning. They might even have picked up enough from the media to know that it’s shaping into a contest between Arcade Fire (who they rank 21) and Kanye West (who they hate – they didn’t even rank him). So they might be tempted to hold their nose and vote for Arcade Fire just to make sure Kanye West doesn’t win. It would be a tough choice, though, because what if their preferred candidates have a lot of underground support that isn’t getting media attention? With Approval Voting, Rough Trade could vote for their top 3 or 5 (or however many they wish) to show their support, plus throw in a vote for Arcade Fire just to make sure they have at least one vote that isn’t wasted. (You could say that this is still voting tactically, but Approval Voting at least gives more and better options for tactical voting.)

However, Approval Voting isn’t guaranteed to elect the Condorcet Winner – it depends entirely on how the voters choose to define “approval”. The various preferential ballot methods are clearly better at selecting the correct winner, because they let each voter give more information about their preferences. To balance this, Approval Voting is much easier to explain and count – you don’t even need a computer to count the ballots! So for a general election it may be a fair choice.

Regardless, it doesn’t work for our purposes, due to the number of ties we get when there are so many more candidates than there are ballots (which wouldn’t be a problem in a real election). FAIL.
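For anyone who wants to play with the approval cutoff, here is a minimal sketch of the two counts used above: "approve everything on the list" and "approve only the top N". The function names and the toy ballots are hypothetical; grouping by score is just a convenient way to make the ties visible.

```python
from collections import Counter

def approval_counts(ballots, top_n=None):
    """One point per ballot for each approved candidate.

    top_n=None approves every candidate on the ballot;
    otherwise only the first top_n entries are approved.
    """
    counts = Counter()
    for ballot in ballots:
        approved = ballot if top_n is None else ballot[:top_n]
        counts.update(set(approved))
    return counts

def tied_groups(counts):
    """Group candidates by score, highest score first."""
    groups = {}
    for candidate, score in counts.items():
        groups.setdefault(score, []).append(candidate)
    return [(score, sorted(groups[score])) for score in sorted(groups, reverse=True)]

# Hypothetical toy ballots, best album first.
toy_ballots = [
    ["A", "B", "C"],
    ["A", "C", "D"],
    ["B", "A", "E"],
]
everything = tied_groups(approval_counts(toy_ballots))
top_two = tied_groups(approval_counts(toy_ballots, top_n=2))
print(everything)
print(top_two)
```

Any group with more than one name in it is a tie that Approval Voting has no way to resolve on its own.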

Smith/Minmax

This one does need a computer to count. Using the ballots.txt file we generated in Part 3, we generate results with:

voteengine.py -m s//minmax < ballots.txt > sminmax-data.txt

This will think for a minute or two and then spit out “sminmax-data.txt”, a file containing a bunch of data about how it counted the votes, ending with the final results, in a line in the same format as the ballot:

71 > 41 > 84 > 167 > 196 > 230 > 256 > 258 > 119 > 1 > 20 > 171 > 26 > 73 > 188 > 214 > 3 > 208 > 37 > 247 > 23 > 15 > 162 > 114 > 217 > 240 > 201 > 154 > 142 > 102 > 161 > 143 > 116 > 13 > 185 > 78 > 147 > 62 > 183 > 220 > 223 > 137 > 210 > 88 > 18 > 244 > 67 > 118 > 211 > 81 > 259 > 6 > 86 > 229 > 122 > 197 > 47 > 180 > 257 > 108 > 145 > 35 > 233 > 176 > 141 > 101 > 231 > 56 > 58 > 212 > 103 > 129 > 40 > 204 > 24 > 42 > 252 > 195 > 39 > 187 > 253 > 239 > 218 > 98 > 105 > 21 > 155 > 138 > 33 > 205 > 79 > 243 > 100 > 16 > 19 > 226 > 199 > 224 > 38 > 131 > 169 > 173 > 207 > 68 > 163 > 134 > 96 > 245 > 120 > 72 > 193 > 8 > 249 > 9 > 66 > 123 > 209 > 90 > 153 > 255 > 236 > 10 > 202 > 178 > 121 > 127 > 242 > 53 > 82 > 159 > 237 > 182 > 2 > 30 > 189 > 250 > 148 > 44 > 126 > 170 > 221 > 177 > 29 > 12 > 248 > 174 > 112 > 92 > 50 > 109 > 139 > 151 > 34 > 94 > 146 > 52 > 99 > 117 > 89 > 110 > 28 > 140 > 150 > 45 > 190 > 251 > 106 > 65 > 104 > 175 > 203 > 46 > 61 > 95 > 149 > 70 > 115 > 191 > 235 > 135 > 234 > 85 > 22 > 17 > 184 > 130 > 132 > 260 > 43 > 136 > 213 > 200 > 80 > 111 > 157 > 216 > 181 > 14 > 198 > 164 > 238 > 49 > 97 > 246 > 25 > 75 > 63 > 36 > 133 > 107 > 32 > 124 > 165 > 91 > 179 > 11 > 254 > 158 > 77 > 54 > 222 > 74 > 232 > 125 > 152 > 113 > 215 > 144 > 83 > 5 > 31 > 227 > 186 > 64 > 57 > 168 > 76 > 69 > 261 > 27 > 228 > 93 > 225 > 206 > 241 > 166 > 59 > 192 > 160 > 4 > 60 > 128 > 219 > 51 > 87 > 48 > 194 > 156 > 172 > 7 > 55

Those are the candidate numbers of each album. To get a human-readable list out of that, we need to look up the name of each candidate. Remember that when we generated ballots.txt, we also saved the candidate names to candidates.txt – the name of candidate 1 is on line 1, candidate 2 is on line 2, etc. So we can write another simple Python script that reads candidates.txt and stores a map of candidate number to candidate name, and then reads the last line of sminmax-data.txt and looks up each candidate name.

The Python script.

Save this script as “interpret-results.py”, and feed the last line of sminmax-data.txt into it with:

tail -n 1 sminmax-data.txt | ./interpret-results.py > sminmax-results.txt
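The linked script isn't reproduced on this page, so here is a minimal sketch of the lookup it performs, under the assumptions stated above: candidate N's name is on line N of candidates.txt, and the result is a single line of numbers separated by " > ". The function names and the placeholder album names are hypothetical.

```python
def load_names(lines):
    """Map candidate number (1-based line number) to candidate name."""
    return {number: line.strip() for number, line in enumerate(lines, start=1)}

def interpret(result_line, names):
    """Turn '3 > 1 > 2' into the matching list of candidate names."""
    numbers = [int(token) for token in result_line.split(">")]
    return [names[number] for number in numbers]

# Stand-in for the contents of candidates.txt (placeholder names).
candidate_lines = ["Album One\n", "Album Two\n", "Album Three\n"]
names = load_names(candidate_lines)
readable = interpret("3 > 1 > 2\n", names)
for name in readable:
    print(name)
```

In a real script, `candidate_lines` would come from `open("candidates.txt")` and the result line from `sys.stdin.readline()`, so that it works at the end of the `tail` pipeline shown above.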

Now open up sminmax-results.txt and look at the list:

Broken Bells – ‘Broken Bells’
John Grant – ‘Queen of Denmark’
Abe Vigoda – ‘Crush’
Against Me! – ‘White Crosses’
Ali Farka Toure & Toumani Diabate – ‘Ali & Toumani’
Allo Darlin’ – ‘Allo Darlin’
Aloe Blacc – ‘Good Things’
Am – ‘Future Sons And Daughters’
Antony and the Johnsons – ‘Swanlights’
Arcade Fire – ‘The Suburbs’
Ariel Pink’s Haunted Graffiti – ‘Before Today’
Avey Tare – ‘Down There’
Avi Buffalo – ‘Avi Buffalo’
Band of Horses – ‘Infinite Arms’
Baths – ‘Cerulean’
Beach Fossils – ‘Beach Fossils’
Beach House – ‘Teen Dream’
Bear In Heaven – ‘Beast Rest Forth Mouth’
Belle and Sebastian – ‘Write About Love’
Ben Folds & Nick Hornby – ‘Lonely Avenue’
Best Coast – ‘Crazy For You’
Big Boi – ‘Sir Lucious Left Foot: The Son Of Chico Dusty’
Big K.R.I.T. – ‘K.R.I.T. Wuz Here’
Black Angels – ‘Phosphene Dream’
Black Rebel Motorcycle Club – ‘Beat The Devil’s Tattoo’

Woah. That ain’t right.

Broken Bells was ranked 5th by NPR and 11th by Rough Trade. And that’s it. There’s no way they should be anywhere near the top 3.

John Grant was ranked 1st by Mojo – so he’s got that going for him – and 6th by Q. And that’s it. Again, no way he should be ahead of Arcade Fire and Kanye West.

After that it starts spitting out albums in alphabetical order. Remember that in ballots.txt we specified that we’d break ties alphabetically. So this indicates that all the remaining albums are tied for 3rd – or tied for last, depending how you look at it. That’s not useful at all.

This looks to me like VoteEngine’s s//minmax algorithm is buggy, because these results are just too weird to explain any other way. But life’s too short to debug it when there are 9 other algorithms to test out. FAIL.

Tomorrow, I’ll start going through algorithms that work fairly well, and start looking for the best.

The Best Albums of 2010, part, uh, 4: Twelve Ways to Count Votes

Posted in the Workshop by Joe on January 9, 2011

Part 1

Part 2

Part 3. (Ha ha. That post starts with the correction that I’m using VoteEngine and not pyvote – but I didn’t actually fix the post title! And nobody wrote to correct me. Shows how much attention you’re paying.)

Whew. Going back to work did a number on the amount of thinking and typing I feel like doing in my off hours. Sorry for the delay.

In the first three posts, I showed how to turn a list of data into a ballot file for VoteEngine. Now I’ll run that ballot file through a bunch of voting systems and see what comes out. Here are the twelve voting methods I’ll be using. Ten of them just happen to be the methods supported by VoteEngine, and two of them are popular methods that I can count (and dismiss) just by looking at the list.

Before I list the methods, though, I need to explain an important concept in voting: the Condorcet Criterion. The Condorcet Criterion was invented by the Marquis de Condorcet in the 18th century. It’s an attempt to answer the question, “Does this voting system always give a reasonable winner?”

The Condorcet Winner is the one candidate that would beat all other candidates in head-to-head races. That is, take a pair of candidates (for example, Arcade Fire and LCD Soundsystem). Look at all the ballots, comparing only those two candidates. (Looking back at the original lists, Mojo has Arcade Fire in 2nd and LCD Soundsystem in 36th, so Arcade Fire wins on this ballot – the fact that there was somebody else in 1st is irrelevant at the moment, because we’re only looking at a contest between those two candidates. Q has Arcade Fire in 1st and LCD Soundsystem in 23rd, so again Arcade Fire wins. In total, 8 of our 9 ballots have Arcade Fire ahead of LCD Soundsystem, and only 1 – Pitchfork – has LCD Soundsystem ahead. So, overall, Arcade Fire beats LCD Soundsystem.) Now repeat that for every possible pair of candidates. (Arcade Fire vs. Beach House, Arcade Fire vs. Vampire Weekend, Beach House vs. Vampire Weekend, etc.) When you’re finished you’ll have a list of exactly which candidates win, lose and tie against which other candidates in head-to-head (or “pairwise”) elections.

Now, if there’s one candidate that beats all other candidates, that’s the Condorcet Winner. It seems reasonable that this should be the overall winner – after all, if a voting system says that LCD Soundsystem is the overall winner, you have to justify why it beats Arcade Fire overall when 8 out of 9 voters preferred Arcade Fire! So the Condorcet Criterion says, “If a Condorcet Winner exists, does this voting method guarantee that they will be chosen?” (There isn’t always a Condorcet Winner. There could easily be a bunch of candidates effectively tied for first – each of them beats every candidate outside the bunch, but none of them beats all the others inside it. The Smith Set is the smallest group of candidates that are tied in this way – again, it seems reasonable that one of them should be the overall winner, but it’s not always clear which. You could also look at the Condorcet Winner as being a Smith Set with only one candidate in it.) The Condorcet Winner and Smith Set are good ways to precisely define the common sense notion of “the candidates whose winning wouldn’t be ridiculous”.
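The head-to-head counting described above is easy to sketch in Python. This is an illustration only, not VoteEngine's code: the function names and the toy ballots are hypothetical, and I'm assuming that any candidate left off a ballot is tied for last on that ballot.

```python
from itertools import combinations

def pairwise_wins(ballots):
    """Return (candidates, wins) where wins[(a, b)] = ballots ranking a above b."""
    candidates = sorted({c for ballot in ballots for c in ballot})
    wins = {(a, b): 0 for a in candidates for b in candidates if a != b}
    for ballot in ballots:
        position = {c: i for i, c in enumerate(ballot)}
        unranked = len(ballot)  # unlisted candidates are tied for last
        for a, b in combinations(candidates, 2):
            pa, pb = position.get(a, unranked), position.get(b, unranked)
            if pa < pb:
                wins[(a, b)] += 1
            elif pb < pa:
                wins[(b, a)] += 1
    return candidates, wins

def condorcet_winner(ballots):
    """Return the candidate who beats every other head-to-head, or None."""
    candidates, wins = pairwise_wins(ballots)
    for a in candidates:
        if all(wins[(a, b)] > wins[(b, a)] for b in candidates if b != a):
            return a
    return None

# Toy data: K has the most first places, but A beats everyone head-to-head.
toy_ballots = [["K", "A"], ["K", "A"], ["A", "L"], ["L", "A"], ["C", "A"]]
# A three-way cycle (A beats B, B beats C, C beats A) has no Condorcet Winner.
cycle = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"]]
print(condorcet_winner(toy_ballots), condorcet_winner(cycle))
```

In the toy data, A beats K only 3 to 2, yet that is enough: a Condorcet Winner needs to win every pairing, not win any of them by a lot.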

Of course, the Condorcet Criterion isn’t the last word in evaluating a voting system. Even if a method passes the Condorcet Criterion, there are a lot of other things it could do poorly.

Officially, a Condorcet Method is a method of counting votes that passes the Condorcet Criterion. A method that doesn’t pass it is not a very good method, because it will sometimes elect the wrong person – a person who doesn’t make any sense as the winner. (Some people disagree that this is all that important – maybe it could theoretically happen, but doesn’t happen often in practice, and the system has some other property, like simplicity, that makes it attractive. I’ve even seen people argue that the Condorcet Winner tends to be a middle-of-the-road candidate who is many people’s second choice but few people’s first choice, and they would prefer to elect a candidate that is loved by a faction even if they’re hated by another faction. I – disagree – with this opinion.)

I will be using Condorcet Method more specifically, though: there are a lot of voting methods that are defined as, “Count up all the head-to-head elections between every possible pair of candidates, as defined above, to find the Condorcet Winner and, if it doesn’t exist, the Smith Set. If a Condorcet Winner exists, that’s the winner. Otherwise, use some tie breaking procedure to pick a member of the Smith Set to be the winner.” You don’t have to actually count the votes this way to get a method that satisfies the Condorcet Criterion, you just need a method that returns the same winner as you would get if you counted the votes this way. By the true definition, any method that satisfies the Condorcet Criterion is a Condorcet Method, but I’ll use the term only to describe methods that start out by finding the Condorcet Winner explicitly as described above. The only difference between all the different Condorcet Methods, by my definition, is what procedure they use to break ties if there’s no Condorcet Winner.

By the way, according to the 9 best-of lists we’re looking at, Arcade Fire is the Condorcet Winner. It’s the only band whose album was ranked above every other album on the majority of the lists. So we’d better hope that every voting system says it’s #1! It’s obvious just by looking at the lists that Arcade Fire had the most popular album of 2010 – it’s near the top of almost every list, and even on the Rough Trade and Pitchfork lists, it’s near the middle. The other obvious contender, Kanye West, is #1 on the lists that love him but doesn’t even appear on some lists. So the more interesting question is who else gets ranked at the top according to each system (and specifically, where does Kanye end up?)

With that out of the way, the ten voting systems supported by VoteEngine are (in alphabetical order):

  1. The Borda count (borda) – give each candidate points according to their position on the ballot, from 0 for a last place finish, 1 for second last, etc. up to N-1 for a first place finish among N candidates. The winner has the most points.
  2. Borda Elimination (aka Baldwin’s Method) (borda-elim) – Like the Borda Count, except that instead of returning the candidate with the most points immediately, you keep eliminating the candidate with the least points (and then recounting as if that candidate had never been on the ballot) until you’re left with one winner.
  3. Copeland’s Method (copeland) – A Condorcet Method where ties are broken by the number of head-to-head victories minus the number of head-to-head defeats.
  4. Instant Runoff Voting (irv) – Look at only the first place votes on the ballots. If one candidate has a majority of votes (not just “the highest number of votes”; they need over 50%) then they’re the winner. Otherwise, eliminate the candidate with the fewest first place votes, and then recount as if that candidate had never been on the ballot. Repeat until one candidate has a majority.
  5. Minimax (minmax) – A Condorcet Method where ties are broken by choosing the candidate with the smallest margin of defeat. Actually, the candidate with the “minimum maximum” margin of defeat – hence minimax. (For example, Beach House loses to Arcade Fire 2 to 7. It also loses to LCD Soundsystem 4 to 5. Look at how a candidate compares with all other candidates and find its “maximum” margin of defeat – Beach House’s is probably that 2 to 7, but I haven’t looked at all of its contests by hand. Now the overall winner is the one whose maximum margin of defeat is smallest – the “minimum maximum”. Whew. And some people call this the simplest Condorcet method!)
  6. Nanson’s Method (nanson) – Like the Borda Count, except that instead of returning the candidate with the most points immediately, you keep eliminating all the candidates with less than the average number of points (and then recounting as if those candidates had never been on the ballot) until you’re left with one winner.
  7. Pairwise Elimination (pw_elim) – Like Minimax, except that instead of returning the candidate with the lowest maximum immediately, you keep eliminating the candidate with the highest maximum (and then recounting as if that candidate had never been on the ballot) until you’re left with one winner.
  8. Ranked Pairs (aka Tideman’s Method) (rp) – A Condorcet Method where ties are broken by ranking all pairs of candidates by margin of victory, and then adding them each to a graph (in order), skipping any pair that would create a cycle in the graph. The final graph will be a tree (since it has no cycles) so the root of the tree is the winner.
  9. Schulze’s Method (schulze) – A Condorcet Method where ties are broken by – um – something to do with graphs again. Really, it’s complicated, which is unfortunate as it seems to give the best results. I’ll describe it when I discuss the results in detail.
  10. Smith/Minimax (s//minimax) – Minimax has a problem, which is that if there’s no Condorcet winner, then the winner it returns isn’t guaranteed to be in the Smith Set. So, first get rid of all the candidates outside the Smith Set, then use Minimax to count the remainder.
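Since the “minimum maximum” idea in the Minimax entry above is hard to follow in words, here is a minimal sketch of it. It is not VoteEngine's implementation: it assumes the “margins” reading of Minimax (defeat size = votes against minus votes for), treats unlisted candidates as tied for last, breaks ties alphabetically, and needs at least two candidates. All names and the toy ballots are hypothetical.

```python
def pairwise(ballots, candidates):
    """wins[a][b] = number of ballots that rank a above b."""
    wins = {a: {b: 0 for b in candidates if b != a} for a in candidates}
    for ballot in ballots:
        rank = {c: i for i, c in enumerate(ballot)}
        last = len(ballot)  # unlisted candidates are tied for last
        for a in candidates:
            for b in candidates:
                if a != b and rank.get(a, last) < rank.get(b, last):
                    wins[a][b] += 1
    return wins

def minimax_winner(ballots):
    """Pick the candidate whose worst head-to-head defeat is smallest."""
    candidates = sorted({c for ballot in ballots for c in ballot})
    wins = pairwise(ballots, candidates)

    def worst_defeat(c):
        # Largest margin by which any opponent beats c (negative if c never loses).
        return max(wins[o][c] - wins[c][o] for o in candidates if o != c)

    return min(candidates, key=lambda c: (worst_defeat(c), c))

# Toy cycle: A beats B 6-3, B beats C 7-2, C beats A 5-4.
toy_ballots = (
    [["A", "B", "C"]] * 4
    + [["B", "C", "A"]] * 3
    + [["C", "A", "B"]] * 2
)
print(minimax_winner(toy_ballots))
```

Here nobody beats everybody, so the winner is A, whose only defeat (4 to 5 against C) is the narrowest. In these terms, the Beach House example above is a margin of 5 against Arcade Fire and a margin of 1 against LCD Soundsystem.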

The other two methods are:

  1. First-Past-the-Post (aka Winner-Take-All) – Each voter can vote for exactly one candidate. The winner is the candidate who gets the most votes, even if that’s not a majority. (Even though we have ballots with complete preferences on them, we can count them using first-past-the-post by only looking at the #1 preference.) Pretty much a terrible voting system; the only thing it has to recommend it is that it’s simple to explain.
  2. Approval Voting – Each voter can vote for as many candidates as they want, but all their votes have the same score. (That is, each voter either “approves” or “disapproves” of each candidate.) The winner is the candidate approved by the most voters. This has the advantage that it’s less restrictive than first-past-the-post, but at the same time it’s easier to explain and fill in a ballot than systems needing full preferential ballots. (Ballot instructions are basically, “Mark an X next to any candidate you find acceptable. You may choose as few or as many as you wish.”) For this sample, we can assume that any album that appears on a list would be “approved” by that list’s compilers.

Note that all these voting methods return a single winner. To get a complete ranked list, just take that winner out and count the votes again as if that candidate had never been on the ballot – whoever wins this time is in 2nd place. Then do it again to get the 3rd place winner, etc. (For the methods that already say, “Eliminate the candidate with the least number of votes and then recount,” the complete list is just the candidates in reverse order of elimination.) VoteEngine already does this for all the methods it supports above.
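The “take the winner out and count again” trick works with any single-winner method, so it can be written once as a wrapper. Below is a minimal sketch using a simple Borda count as the single-winner method. It is not how VoteEngine does it internally (I haven't checked); the function names are hypothetical, the Borda function assumes every ballot ranks every remaining candidate, and ties go to the alphabetically first name.

```python
def borda_winner(ballots, candidates):
    """Borda winner: 0 points for last place up to len(candidates)-1 for first.

    Assumes every ballot ranks all of `candidates`.
    """
    n = len(candidates)
    points = {c: 0 for c in candidates}
    for ballot in ballots:
        for place, candidate in enumerate(ballot):
            points[candidate] += n - 1 - place
    return min(candidates, key=lambda c: (-points[c], c))

def full_ranking(ballots, single_winner):
    """Build a ranked list by repeatedly removing the winner and recounting."""
    remaining = sorted({c for ballot in ballots for c in ballot})
    ranking = []
    while remaining:
        # Recount as if the earlier winners had never been on the ballot.
        trimmed = [[c for c in ballot if c in remaining] for ballot in ballots]
        winner = single_winner(trimmed, remaining)
        ranking.append(winner)
        remaining.remove(winner)
    return ranking

toy_ballots = [["A", "B", "C"], ["A", "C", "B"], ["B", "C", "A"]]
toy_ranking = full_ranking(toy_ballots, borda_winner)
print(toy_ranking)
```

Passing the counting method in as a function keeps the wrapper independent of which voting system is being tested.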

There are four other methods supported by VoteEngine, which I didn’t include, mostly because they only return one winner rather than a list. I could have written a wrapper to update the ballot file and run them again to find the list, but it was too much work. The other four methods are:

  1. Bucklin’s Method (bucklin) – Count only 1st place votes. If one candidate has a majority, that’s the winner. Otherwise, add the second place votes. Repeat, adding lower placed votes each time, until one candidate with a majority is found. (In this test, Arcade Fire gets the majority after adding second place votes, and then VoteEngine stops counting.)
  2. Condorcet/IRV (c//irv) – Return the Condorcet Winner if one exists, otherwise use IRV. In other words, a Condorcet Method using IRV to break ties. (In this test, it returns Arcade Fire, the Condorcet Winner, and then stops counting.) In theory this could return a winner outside the Smith Set, because if there’s no Condorcet Winner it throws away all the pairwise data it just counted up.
  3. Smith/IRV (s//irv) – Get rid of all the candidates outside the Smith Set, then use IRV to find the winner. In other words, a Condorcet Method using IRV on the Smith Set only to break ties. (In this test it works the same as c//irv since there is a Condorcet Winner.)
  4. UK Usenet (ukvt) – “apply the rules used by the uk.* usenet hierarchy”. I just ignored this one, because it’s not a standard voting method, so honestly who cares?

Tomorrow I’ll take a closer look at each of the twelve methods and talk about the results in detail.

The Best Albums of 2010, part 3: Counting Votes With Pyvote

Posted in the Workshop by Joe on January 3, 2011

Part 1

Part 2

First, a correction: for the last two posts, I’ve been linking to pyvote as the program to automatically count votes. Except I actually used VoteEngine. Natural mistake – they’re both Python programs used to count votes, and “pywhatever” is a common naming scheme for Python.

Short one this time. Last time, we turned 9 end-of-year best album lists into preferential ballots in a standard format. In order to count these ballots with VoteEngine, though, we need one more thing: as well as the file of ballots, VoteEngine needs a complete list of all candidates, which can either be passed on the command line with the “-cands” parameter or added to the ballot file itself. Our candidates are named with numbers counting up from 1, so this is easy for us to generate.

Another parameter that’s helpful is “-tie”, which takes a list of candidates in an order to use as tiebreakers. Whenever a voting system returns a tie between two candidates, the one that appears first in the -tie list is counted as the winner. I’m not actually sure what order is used if a candidate doesn’t appear in the tiebreaker list, but since we’re autogenerating the candidate list anyway it’s easy to always fill in a complete -tie list. We’ll break ties in alphabetical order based on candidate name.

Since we plan to count the same list of ballots over and over again with different voting methods, it will make things much easier to add these two parameters to the ballot file itself. This is a simple edit to the script we wrote last time. First, since “-cands” needs to come at the start of the file, we delay actually writing lines to the ballot file until after all ballots have been read. Then, after reading all input files and filling in the candidate map (which records candidate numbers mapped to candidate names), we write two header lines, “-cands” and “-tie”, each followed by its list of candidate numbers, and then all the ballot lines.

Here is the updated script, with the new lines highlighted.

This will output a ballots.txt that looks like this: identical to the one from last post, but with two more lines at the start.

-cands 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260

-tie 85 167 195 229 255 257 119 1 20 171 26 74 187 213 3 207 37 246 23 15 162 114 216 239 200 154 142 72 102 161 143 116 13 184 79 147 63 182 219 222 137 209 89 18 243 68 118 210 82 258 6 87 228 122 196 47 179 256 108 145 35 232 175 141 101 230 56 58 211 103 129 40 203 24 42 251 194 39 186 252 238 217 98 105 21 155 138 33 204 80 242 100 16 19 225 198 223 38 131 169 62 206 69 163 134 244 120 73 192 8 248 9 41 67 123 208 91 153 254 235 10 201 177 121 127 241 53 83 159 236 181 2 30 188 249 148 44 126 170 220 176 29 12 247 173 112 93 50 109 139 151 34 95 146 52 99 117 90 110 28 140 150 45 189 250 106 66 104 174 202 46 61 96 149 71 115 190 234 135 233 86 22 17 183 130 132 259 43 136 212 199 81 111 157 215 180 14 197 164 237 49 97 245 25 76 64 36 133 107 32 124 165 92 178 11 253 158 78 54 221 75 231 125 152 113 214 144 84 5 31 226 185 65 57 168 77 70 260 27 227 94 224 205 240 166 59 191 160 4 60 128 218 51 88 48 193 156 172 7 55

41 > 1 > 12 > 35 > 11 > 46 > 52 > 71 > 78 > 122 > 6 > 95 > 9 > 50 > 26 > 20 > 130 > 13 > 132 > 19 > 134 > 136 > 81 > 144 > 42 > 16 > 40 > 68 > 21 > 8 > 28 > 22 > 31 > 60 > 29 > 2 > 92 > 100 > 172 > 175 > 94 > 177 > 178 > 36 > 183 > 96 > 192 > 67 > 195 > 34

1 > 22 > 115 > 30 > 4 > 41 > 16 > 5 > 46 > 12 > 11 > 21 > 3 > 38 > 7 > 42 > 67 > 50 > 19 > 133 > 74 > 137 > 2 > 143 > 53 > 83 > 52 > 26 > 69 > 154 > 158 > 75 > 166 > 54 > 93 > 31 > 78 > 8 > 62 > 96 > 37 > 94 > 180 > 60 > 182 > 9 > 191 > 35 > 6 > 97

70 > 1 > 3 > 2 > 30 > 42 > 55 > 43 > 44 > 75 > 5 > 13 > 105 > 125 > 7 > 19 > 129 > 51 > 12 > 6 > 8 > 36 > 20 > 4 > 26 > 148 > 23 > 150 > 45 > 83 > 18 > 163 > 109 > 10 > 33 > 169 > 170 > 15 > 173 > 14 > 97 > 110 > 93 > 50 > 39 > 46 > 190 > 90 > 64 > 98 > 63 > 108 > 201 > 203 > 206 > 99 > 9 > 89 > 211 > 37 > 24 > 21 > 61 > 218 > 17 > 16 > 221 > 35 > 225 > 102 > 40 > 112 > 69 > 25 > 231

13 > 21 > 70 > 116 > 71 > 118 > 63 > 80 > 48 > 32 > 72 > 26 > 5 > 56 > 3 > 128 > 61 > 89 > 11 > 84 > 1 > 43 > 142 > 145 > 9 > 34 > 51 > 44 > 2 > 41 > 159 > 164 > 64 > 60 > 18 > 99 > 24 > 23 > 174 > 30 > 4 > 35 > 7 > 52 > 185 > 14 > 189 > 33 > 55 > 198 > 199 > 62 > 202 > 59 > 100 > 207 > 20 > 209 > 210 > 212 > 213 > 215 > 91 > 108 > 37 > 219 > 222 > 223 > 226 > 227 > 228 > 229 > 36 > 112 > 232 > 66 > 236 > 237 > 238 > 25 > 241 > 19 > 243 > 245 > 107 > 65 > 39 > 247 > 109 > 110 > 114 > 68 > 251 > 252 > 255 > 256 > 257 > 40 > 38 > 260

10 > 6 > 1 > 2 > 73 > 8 > 19 > 29 > 121 > 17 > 4 > 7 > 15 > 23 > 27 > 47 > 3 > 76 > 45 > 12 > 18 > 79 > 16 > 88 > 106 > 5 > 33 > 28 > 14 > 11 > 49 > 162 > 165 > 167 > 22 > 113 > 57 > 58 > 13 > 92

10 > 11 > 101 > 1 > 73 > 4 > 47 > 22 > 58 > 2 > 54 > 123 > 124 > 17 > 5 > 127 > 3 > 53 > 29 > 28 > 15 > 49 > 141 > 146 > 76 > 7 > 149 > 90 > 31 > 86

10 > 1 > 3 > 25 > 17 > 14 > 4 > 119 > 48 > 6 > 9 > 18 > 2 > 7 > 77 > 5 > 104 > 85 > 24 > 20 > 44 > 139 > 43 > 107 > 147 > 32 > 8 > 87 > 34 > 82 > 51 > 59 > 12 > 27 > 36 > 84 > 62 > 171 > 80 > 33 > 66 > 176 > 15 > 40 > 61 > 186 > 187 > 111 > 197 > 56

1 > 11 > 5 > 117 > 72 > 2 > 4 > 25 > 3 > 10 > 16 > 14 > 91 > 126 > 74 > 65 > 6 > 49 > 8 > 7 > 135 > 140 > 102 > 9 > 79 > 12 > 81 > 151 > 153 > 155 > 157 > 161 > 37 > 15 > 23 > 168 > 68 > 13 > 31 > 53 > 98 > 28 > 179 > 24 > 184 > 54 > 188 > 57 > 196 > 113 > 64 > 200 > 19 > 204 > 205 > 22 > 208 > 27 > 58 > 17 > 214 > 216 > 217 > 18 > 38 > 29 > 220 > 224 > 32 > 69 > 39 > 95 > 63 > 230 > 233 > 234 > 235 > 67 > 239 > 240 > 30 > 242 > 244 > 106 > 103 > 20 > 246 > 248 > 111 > 101 > 104 > 249 > 250 > 253 > 254 > 105 > 258 > 114 > 259 > 21

10 > 2 > 6 > 15 > 3 > 4 > 9 > 120 > 20 > 27 > 1 > 8 > 45 > 24 > 17 > 14 > 13 > 103 > 131 > 66 > 57 > 138 > 82 > 38 > 25 > 59 > 39 > 5 > 152 > 77 > 156 > 160 > 65 > 18 > 16 > 56 > 55 > 86 > 23 > 85 > 87 > 47 > 32 > 181 > 21 > 34 > 193 > 194 > 48 > 88
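The two header lines at the top of the file above could be produced with something like the following. This is a sketch rather than the actual edit to my script: the function name is hypothetical, the candidate map is assumed to be a dict from candidate number to name, and sorting names case-insensitively is my assumption about what “alphabetical” should mean.

```python
def header_lines(names):
    """Build the -cands and -tie lines from a {number: name} candidate map."""
    numbers = sorted(names)
    cands = "-cands " + " ".join(str(n) for n in numbers)
    # Tiebreak order: candidate numbers sorted alphabetically by name.
    tie_order = sorted(numbers, key=lambda n: names[n].lower())
    tie = "-tie " + " ".join(str(n) for n in tie_order)
    return [cands, tie]

# Placeholder names, not real entries from candidates.txt.
toy_names = {1: "Zebra", 2: "apple", 3: "Mango"}
toy_header = header_lines(toy_names)
print("\n".join(toy_header))
```

The ballot lines themselves are then written after these two lines, unchanged.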

With this file, we can count the votes using a given method by running “voteengine.py -m METHOD < ballots.txt” (where METHOD is the name of the voting method) – giving ballots.txt on stdin. For a complete list of voting methods, see the VoteEngine docs.

Ok, preparations out of the way, tomorrow I’ll start evaluating voting systems!

The Best Albums of 2010, part 2: Making a Ballot

Posted in the Workshop by Joe on January 2, 2011

Yesterday I linked to 9 lists of the Best Albums of 2010, from magazines, web sites, and one radio show, and promised to explain how we distilled those lists into one canonical Top 25. The first step is to turn each list into a preferential election ballot.

The standard First Past the Post voting system used in pretty much every political election in North America is… not very good. Its one virtue is that it’s simple to explain: you can vote for one, and only one, candidate, and the candidate with the most votes wins. The problem is that this often makes it really tough to decide who to vote for. To get your preferred result, you need to consider a whole host of things other than “how good this candidate is”. You need to vote tactically. The obvious example is when you don’t think your favourite candidate can win – do you vote for them anyway, or switch your vote to your second choice? (It’s a tough decision if your candidate has ALMOST enough support to be viable.) Another example is if you mainly want one candidate to LOSE – you still need to pick one of their opponents to vote for. There are more serious problems in an election like the Canadian Parliament or US Electoral College, where a bunch of individual elections each elect one winner, and then the team with the most winners gets the grand prize, but let’s consider only elections where everyone votes directly for a single winner, like a city mayor or state governor.

Every voter can, in theory, rank all the candidates in order of preference (although if they don’t have strong opinions or aren’t very informed, that ranking may just have their favored candidate in first and everyone else tied for last…) Judging whether a voting system is any good basically involves measuring how much a voter’s “true preferences” contribute to the election’s outcome. First past the post is a poor system because it forces voters to leave out most of the information about their preferences, and in fact encourages them to “lie” by listing a candidate who isn’t actually their first choice. A better system would give each voter a more complicated ballot in which they could list all their preferences. Say, by putting a 1 next to their favoured candidate, a 2 next to their second candidate, etc. Or with a computer touch screen which removes each candidate’s name as the voter touches it, and keeps track of the order the voter chose them in. Or, although this has obvious practical problems in a large election, by writing down each candidate’s name in order on a sheet of paper (which is exactly what we have with our 9 best-of-2010 lists!) There are many physical ballots we can imagine that could record a voter’s complete preferences. But in order to study or simulate a voting system, it’s helpful to have a standard notation. Whoever counts the votes – human or computer – can start by translating each physical ballot into this standard notation.

The notation commonly used is to give each candidate a symbol (such as the first letter of their name or party), and for each ballot, list the symbols for each candidate on one line, from the most liked on the left to the least liked on the right. The symbols are separated by “>” to show that the candidate on the left is preferred to the candidate on the right, or “=” to show that they’re tied. (And, if not all candidates are listed, the unlisted candidates are assumed to be tied for last place.)

So, with the Canadian political parties Conservative (C), Liberal (L), New Democrat (N), and Green (G) – assume we are not in Quebec so the Bloc Quebecois is unavailable – we have examples like the following:

The extreme right-winger would like the Conservatives to win, and at all costs wants the NDP and Green Party to lose – they’d vote “C > L > N = G”.

The extreme left-winger is the opposite – “N = G > L > C” (assuming they can’t decide between NDP and Green).

Or you may have an NDP supporter who thinks the Green Party is also a good left wing choice, but that the Liberals are no better than the Conservatives – “N > G > C = L”.

Or an NDP supporter who thinks that the Green Party are a bunch of upstarts who can’t be trusted – “N > C = L > G”.

Or any weird and wonderful combination of these.
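To make the notation concrete, here is a minimal Python sketch of how a ballot line could be read into ranked groups. This is not how pyvote parses ballots; the function names (parse_ballot, complete_ballot, prefers) are made up for illustration, and the only rules assumed are the ones described above: “>” means preferred, “=” means tied, and unlisted candidates are tied for last.

```python
def parse_ballot(line):
    """Split a ballot such as "C > L > N = G" into ranked groups.

    Returns a list of lists: earlier groups are preferred to later ones,
    and candidates inside one group are tied with each other.
    """
    groups = []
    for part in line.split(">"):
        tied = [name.strip() for name in part.split("=")]
        groups.append([name for name in tied if name])
    # Drop empty groups left behind by stray separators.
    return [group for group in groups if group]


def complete_ballot(groups, all_candidates):
    """Append every unlisted candidate as one final group tied for last."""
    listed = {name for group in groups for name in group}
    unlisted = [name for name in all_candidates if name not in listed]
    return groups + [unlisted] if unlisted else list(groups)


def prefers(groups, a, b):
    """True if this ballot ranks candidate a strictly above candidate b."""
    rank = {name: i for i, group in enumerate(groups) for name in group}
    return rank[a] < rank[b]
```

For the extreme right-winger, parse_ballot("C > L > N = G") gives three groups, with N and G sharing the last one, so prefers reports that C beats N but that neither of N and G beats the other.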

So, we have 9 lists of songs, with over 200 songs between them. The first thing we need to do is get a symbol for each song, and make sure we can turn that symbol back into a name when it’s time to output the results. Save each list into a text file (with just the songs, one per line – no rank numbers). This may take some cutting and pasting. Then go through each list and make sure that each song is spelled EXACTLY the same each time it appears (the Unix commands sort and uniq may help with this).

Now we want to read all the text files and turn them into two data structures: a map of Candidate Number to song name, and a set of ballots in the above format. Since I’m using pyvote, a Python program to count votes, the natural way to do this conversion is to write another Python script.

The script will read each file given on the command line and write out two files. “candidates.txt” is the complete list of all candidate songs, in order of Candidate Number – given a number, the song name is on that line number of candidates.txt. “ballots.txt” is the list of ballots – so for our best-of lists, it will be 9 lines long. Since WordPress removes indentation in code blocks, which is fatal for Python code, I’ve put the script on pastebin:

The Python Script.

It’s pretty simple – read each line of each file, generate a number for the song, and save the name and number in a map called “candidates”. For each file, write the numbers of each song as a string joined by “ > ” to “ballots.txt” – since none of these lists have ties, we don’t need to worry about “=”. Finally, sort the candidate map by number, and write each name in “candidates.txt”.

Save the Python file as “convertLists2Ballots.py” and run it with “convertLists2Ballots.py <list of text files>”.
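For readers who can’t reach the pastebin link, the steps described above can be sketched roughly as follows. This is not the original script, just a minimal reconstruction from the description: the function names are invented, and it assumes candidate numbers are handed out in order of first appearance, which may not match the numbering in the real ballots.txt.

```python
import sys


def lists_to_ballots(list_files):
    """Turn ranked lists (one title per line) into candidates and ballots."""
    candidates = {}  # title -> candidate number, numbered from 1
    ballots = []
    for path in list_files:
        with open(path, encoding="utf-8") as handle:
            titles = [line.strip() for line in handle if line.strip()]
        numbers = []
        for title in titles:
            # Assumption: numbers are handed out in order of first appearance.
            if title not in candidates:
                candidates[title] = len(candidates) + 1
            numbers.append(str(candidates[title]))
        # No ties in these lists, so every separator is ">".
        ballots.append(" > ".join(numbers))
    return candidates, ballots


def write_outputs(candidates, ballots,
                  candidates_path="candidates.txt",
                  ballots_path="ballots.txt"):
    """Write candidate N's name on line N, and one ballot per line."""
    by_number = sorted(candidates.items(), key=lambda item: item[1])
    with open(candidates_path, "w", encoding="utf-8") as handle:
        for title, _number in by_number:
            handle.write(title + "\n")
    with open(ballots_path, "w", encoding="utf-8") as handle:
        for ballot in ballots:
            handle.write(ballot + "\n")


if __name__ == "__main__" and len(sys.argv) > 1:
    found, lines = lists_to_ballots(sys.argv[1:])
    write_outputs(found, lines)
```

Because titles are matched as exact strings, a single stray space or alternate spelling creates a second candidate, which is why the clean-up pass over the text files matters.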

The candidates.txt file this spits out isn’t very interesting, but ballots.txt looks like this:

41 > 1 > 12 > 35 > 11 > 46 > 52 > 70 > 77 > 122 > 6 > 94 > 9 > 50 > 26 > 20 > 130 > 13 > 132 > 19 > 134 > 136 > 80 > 144 > 42 > 16 > 40 > 67 > 21 > 8 > 28 > 22 > 31 > 60 > 29 > 2 > 91 > 100 > 172 > 176 > 93 > 178 > 179 > 36 > 184 > 95 > 193 > 66 > 196 > 34

1 > 22 > 115 > 30 > 4 > 41 > 16 > 5 > 46 > 12 > 11 > 21 > 3 > 38 > 7 > 42 > 66 > 50 > 19 > 133 > 73 > 137 > 2 > 143 > 53 > 82 > 52 > 26 > 68 > 154 > 158 > 74 > 166 > 54 > 92 > 31 > 77 > 8 > 173 > 95 > 37 > 93 > 181 > 60 > 183 > 9 > 192 > 35 > 6 > 97

69 > 1 > 3 > 2 > 30 > 42 > 55 > 43 > 44 > 74 > 5 > 13 > 105 > 125 > 7 > 19 > 129 > 51 > 12 > 6 > 8 > 36 > 20 > 4 > 26 > 148 > 23 > 150 > 45 > 82 > 18 > 163 > 109 > 10 > 33 > 169 > 170 > 15 > 174 > 14 > 97 > 110 > 92 > 50 > 39 > 46 > 191 > 89 > 63 > 98 > 62 > 108 > 202 > 204 > 207 > 99 > 9 > 88 > 212 > 37 > 24 > 21 > 61 > 219 > 17 > 16 > 222 > 35 > 226 > 102 > 40 > 112 > 68 > 25 > 232

13 > 21 > 69 > 116 > 70 > 118 > 62 > 79 > 48 > 32 > 71 > 26 > 5 > 56 > 3 > 128 > 61 > 88 > 11 > 83 > 1 > 43 > 142 > 145 > 9 > 34 > 51 > 44 > 2 > 41 > 159 > 164 > 63 > 60 > 18 > 99 > 24 > 23 > 175 > 30 > 4 > 35 > 7 > 52 > 186 > 14 > 190 > 33 > 55 > 199 > 200 > 96 > 203 > 59 > 100 > 208 > 20 > 210 > 211 > 213 > 214 > 216 > 90 > 108 > 37 > 220 > 223 > 224 > 227 > 228 > 229 > 230 > 36 > 112 > 233 > 65 > 237 > 238 > 239 > 25 > 242 > 19 > 244 > 246 > 107 > 64 > 39 > 248 > 109 > 110 > 114 > 67 > 252 > 253 > 256 > 257 > 258 > 40 > 38 > 261

10 > 6 > 1 > 2 > 72 > 8 > 19 > 29 > 121 > 17 > 4 > 7 > 15 > 23 > 27 > 47 > 3 > 75 > 45 > 12 > 18 > 78 > 16 > 87 > 106 > 5 > 33 > 28 > 14 > 11 > 49 > 162 > 165 > 167 > 22 > 113 > 57 > 58 > 13 > 91

10 > 11 > 101 > 1 > 72 > 4 > 47 > 22 > 58 > 2 > 54 > 123 > 124 > 17 > 5 > 127 > 3 > 53 > 29 > 28 > 15 > 49 > 141 > 146 > 75 > 7 > 149 > 89 > 31 > 85

10 > 1 > 3 > 25 > 17 > 14 > 4 > 119 > 48 > 6 > 9 > 18 > 2 > 7 > 76 > 5 > 104 > 84 > 24 > 20 > 44 > 139 > 43 > 107 > 147 > 32 > 8 > 86 > 34 > 81 > 51 > 59 > 12 > 27 > 36 > 83 > 96 > 171 > 79 > 33 > 65 > 177 > 15 > 40 > 61 > 187 > 188 > 111 > 198 > 56

1 > 11 > 5 > 117 > 71 > 2 > 4 > 25 > 3 > 10 > 16 > 14 > 90 > 126 > 73 > 64 > 6 > 49 > 8 > 7 > 135 > 140 > 102 > 9 > 78 > 12 > 80 > 151 > 153 > 155 > 157 > 161 > 37 > 15 > 23 > 168 > 67 > 13 > 31 > 53 > 98 > 28 > 180 > 24 > 185 > 54 > 189 > 57 > 197 > 113 > 63 > 201 > 19 > 205 > 206 > 22 > 209 > 27 > 58 > 17 > 215 > 217 > 218 > 18 > 38 > 29 > 221 > 225 > 32 > 68 > 39 > 94 > 62 > 231 > 234 > 235 > 236 > 66 > 240 > 241 > 30 > 243 > 245 > 106 > 103 > 20 > 247 > 249 > 111 > 101 > 104 > 250 > 251 > 254 > 255 > 105 > 259 > 114 > 260 > 21

10 > 2 > 6 > 15 > 3 > 4 > 9 > 120 > 20 > 27 > 1 > 8 > 45 > 24 > 17 > 14 > 13 > 103 > 131 > 65 > 57 > 138 > 81 > 38 > 25 > 59 > 39 > 5 > 152 > 76 > 156 > 160 > 64 > 18 > 16 > 56 > 55 > 85 > 23 > 84 > 86 > 47 > 32 > 182 > 21 > 34 > 194 > 195 > 48 > 87

Incomprehensible to a human, but pyvote will be able to read this and then try lots of different voting systems on it. Tomorrow I’ll show how to make this happen.

(As an aside, instead of a bunch of text files, it’s pretty common to get data you want to turn into votes as a spreadsheet. To process this with Python, save your spreadsheet in CSV format – that’s “comma separated value”, a simple text format that can’t handle formatting or formulas – and then read it into Python using the csv module.)
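A minimal sketch of the CSV route, using only the standard csv module. The spreadsheet layout is an assumption made for this example: one column per list, a header row naming each list, and entries running from best to worst down each column. The function name read_lists_from_csv is made up.

```python
import csv


def read_lists_from_csv(path):
    """Return {column header: [entries in rank order]} from a CSV file.

    Assumed layout: the first row holds one header per list, and each
    column below it holds that list's entries from best to worst.
    Shorter lists simply leave their trailing cells empty.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
    if not rows:
        return {}
    headers = rows[0]
    lists = {header: [] for header in headers}
    for row in rows[1:]:
        for header, cell in zip(headers, row):
            if cell.strip():
                lists[header].append(cell.strip())
    return lists
```

The csv module takes care of quoted cells, so a title that contains a comma survives intact as long as the spreadsheet program quoted it on export.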

The Best Albums of 2010 (According to Schulze)

Posted in the Workshop by Joe on December 31, 2010

Over on my lovely wife’s music blog is a list of the Top Ten Albums of 2010, culled from nine separate end-of-year lists:

All Songs Considered (NPR)
MOJO
NME
Pitchfork
Q
Rolling Stone
Rough Trade
Stereogum
Spin

These lists actually have 25 to 100 entries on them, and they don’t all have the same entries or the same order. The Rough Trade list doesn’t have a single one of our Top 10 in their top 10! (And our #3 song isn’t on their list at all!) So where’d we come up with these numbers?

Voting theory!

Treat each end-of-year list as a ballot in an election. In a standard election, several thousand or several million people all choose from a handful of candidates. Here we have just 9 ballots cast to choose between several hundred songs. So it’s the reverse of the elections that are commonly studied, but there’s no reason standard election tools wouldn’t work. And it’s an edge case that might reveal interesting properties of the methods used to count the votes.

I counted the votes using 10 separate methods (using vote-counting software – I’m not insane!) and then we picked the result that seemed to make the most sense. Not scientific at all, but nobody said this was a scientific study – it’s just for curiosity. We ended up choosing the results returned by the Schulze Method, the same method used for elections by a lot of open source software groups (including Debian and Gentoo).
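For the curious, the core of the Schulze Method is small enough to sketch in Python. This is not the vote-counting software’s code and was not used to produce the list below; it is a textbook-style illustration with invented function names. It assumes each ballot is a plain ranked list with no ties, that anything a ballot leaves out is tied for last, that path strength is measured in winning votes, and that leftover ties in the final order are broken by the order candidates were passed in.

```python
from itertools import permutations


def pairwise_counts(ballots, candidates):
    """d[a][b] = number of ballots ranking a strictly above b.

    Each ballot is a list of candidates, best first; anything a ballot
    leaves out is treated as tied for last on that ballot.
    """
    d = {a: {b: 0 for b in candidates} for a in candidates}
    for ballot in ballots:
        rank = {name: i for i, name in enumerate(ballot)}
        last = len(ballot)  # shared rank for every unlisted candidate
        for a, b in permutations(candidates, 2):
            if rank.get(a, last) < rank.get(b, last):
                d[a][b] += 1
    return d


def strongest_paths(d, candidates):
    """p[a][b] = strength of the strongest path of pairwise wins from a to b."""
    p = {a: {b: 0 for b in candidates} for a in candidates}
    for a, b in permutations(candidates, 2):
        if d[a][b] > d[b][a]:
            p[a][b] = d[a][b]
    # Floyd-Warshall variant: a path is as strong as its weakest link.
    for k in candidates:
        for i in candidates:
            if i == k:
                continue
            for j in candidates:
                if j == i or j == k:
                    continue
                p[i][j] = max(p[i][j], min(p[i][k], p[k][j]))
    return p


def schulze_ranking(ballots, candidates):
    """Order candidates by how many rivals they beat on strongest paths."""
    d = pairwise_counts(ballots, candidates)
    p = strongest_paths(d, candidates)
    wins = {a: sum(p[a][b] > p[b][a] for b in candidates if b != a)
            for a in candidates}
    return sorted(candidates, key=lambda a: -wins[a])
```

With three ballots A > B > C, A > C > B and B > A > C, candidate A wins every head-to-head contest, so schulze_ranking puts A first, then B, then C.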

Tomorrow I’ll explain how to turn a list into a ballot (and what that even means), and how to use pyvote to process the ballots. Then I’ll go through each of the 10 voting systems, describe them, and discuss the results. That’ll take a while… (A lot longer than it did to actually generate them!)

In the meantime, here’s the complete list, in order. All 261 of them.

1. Arcade Fire – The Suburbs
2. Beach House – Teen Dream
3. Kanye West – My Beautiful Dark Twisted Fantasy
4. LCD Soundsystem – This is Happening
5. Deerhunter – Halcyon Digest
6. Vampire Weekend – Contra
7. The National – High Violet
8. Janelle Monae – The ArchAndroid
9. The Black Keys – Brothers
10. Yeasayer – Odd Blood
11. MGMT – Congratulations
12. Joanna Newsom – Have One On Me
13. Ariel Pink’s Haunted Graffiti – Before Today
14. Big Boi – Sir Lucious Left Foot: The Son of Chico Dusty
15. Caribou – Swim
16. Best Coast – Crazy For You
17. Grinderman – Grinderman 2
18. Robyn – Body Talk Pt. 1
19. Crystal Castles – Crystal Castles
20. Gorillaz – Plastic Beach
21. Sleigh Bells – Treats
22. Flying Lotus – Cosmogramma
23. Gil Scott-Heron – I’m New Here
24. Avi Buffalo – Avi Buffalo
25. Robert Plant – Band of Joy
26. Sufjan Stevens – The Age of Adz
27. Kings of Leon – Come Around Sundown
28. M.I.A. – MAYA
29. Neil Young – Le Noise
30. Tame Impala – Innerspeaker
31. John Grant – Queen of Denmark
32. Laura Marling – I Speak Because I Can
33. Edwyn Collins – Losing Sleep
34. Matthew Dear – Black City
35. The Dead Weather – Sea of Cowards
36. The Roots – How I Got Over
37. Titus Andronicus – The Monitor
38. Emeralds – Does it Look Like I’m Here?
39. Hot Chip – One Life Stand
40. Foals – Total Life Forever
41. These New Puritans – Hidden
42. Wild Nothing – Gemini
43. Broken Bells – Broken Bells
44. Liars – Sisterworld
45. Salem – King Night
46. Warpaint – The Fool
47. Glasser – Ring
48. Belle and Sebastian – Write About Love
49. Manic Street Preachers – Postcards From a Young Man
50. Surfer Blood – Astro Coast
51. Swans – My Father Will Guide Me Up a Rope To the Sky
52. Four Tet – There Is Love in You
53. Charlotte Gainsbourg – IRM
54. Drake – Thank Me Later
55. Field Music – Measure
56. Gold Panda – Lucky Shiner
57. Jamey Johnson – The Guitar Song
58. No Age – Everything in Between
59. Oneohtrix Point Never – Returnal
60. Paul Weller – Wake Up the Nation
61. Band of Horses – Infinite Arms
62. Crocodiles – Sleep Forever
63. Interpol – Interpol
64. Midlake – The Courage of Others
65. Perfume Genius – Learning
66. Phosphorescent – Here’s To Taking it Easy
67. Spoon – Transference
68. Superchunk – Majesty Shredding
69. The Coral – Butterfly House
70. The Morning Benders – Big Echo
71. The Walkmen – Lisbon
72. Das Racist – Shut Up, Dude / Sit Down, Man
73. Eminem – Recovery
74. The-Dream – Love King
75. Twin Shadow – Forget
76. The Tallest Man on Earth – The Wild Hunt
77. Cee Lo Green – The Lady Killer
78. Danger Mouse and Sparklehorse – Dark Night of the Soul
79. John Legend and The Roots – Wake Up!
80. Elton John and Leon Russell – The Union
81. Erykah Badu – New Amerykah, Pt. 2: Return of the Ankh
82. Mavis Staples – You Are Not Alone
83. Of Montreal – False Priest
85. Peter Gabriel – Scratch My Back
86. Sharon Jones and the Dap-Kings – I Learned the Hard Way
87. Sharon Van Etten – Epic
88. The Gaslight Anthem – American Slang
89. Tom Jones – Praise and Blame
90. Villagers – Becoming a Jackal
91. Mount Kimbie – Crooks and Lovers
92. Teenage Fanclub – Shadows
93. Zola Jesus – Stridulum II
94. Abe Vigoda – Crush
95. Broken Social Scene – Forgiveness Rock Record
96. Delorean – Subiza
97. Dum Dum Girls – I Will Be
98. Frightened Rabbit – The Winter of Mixed Drinks
99. Gayngs – Relayted
100. Gonjasufi – A Sufi and a Killer
101. jj – jj no 3
102. Jonsi – Go
103. Klaxons – Surfing the Void
104. Male Bonding – Nothing Hurts
105. Marina and the Diamonds – The Family Jewels
106. My Chemical Romance – Danger Days
107. Mystery Jets – Serotonin
108. Rick Ross – Teflon Don
109. Tamaryn – The Waves
110. The Drums – The Drums
111. Everything Everything – Man Alive
112. Steve Mason – Boys Outside
113. Wavves – King of the Beach
114. Black Angels – Phosphene Dream
115. Antony and the Johnsons – Swanlights
116. Caitlin Rose – Own Side Now
117. Darkstar – North
118. Doug Paisley – Constant Companion
119. James Blake – The Bells Sketch
120. How to Dress Well – Love Remains
121. Girls – Broken Dreams Club
122. John Mellencamp – No Better Than This
123. Kid Cudi – Man on the Moon II: The Legend of Mr. Rager
124. Big K.R.I.T. – K.R.I.T. Wuz Here
125. Marnie Stern – Marnie Stern
126. Ceo – White Magic
127. Avey Tare – Down There
128. Lower Dens – Twin Hand Movement
129. Frank (Just Frank) – The Brutal Wave
130. Baths – Cerulean
131. Mumford and Sons – Sigh No More
132. Local Natives – Gorilla Manor
133. Plan B – The Defamation of Strickland Banks
134. Ray LaMontagne and the Pariah Dogs – God Willin’ and the Creek Don’t Rise
135. New Pornographers – Together
136. Massive Attack – Heligoland
137. Josh Ritter – So Runs the World Away
139. Roots Manuva meets Wrongtom – Duppy Writer
140. Rumer – Seasons of My Soul
141. Isobel Campbell and Mark Lanegan – Hawk
142. Sam Amidon – I See the Sign
143. She and Him – Volume II
144. Bruce Springsteen – The Promise
145. Small Black – New Chain
146. Take That – Progress
147. Corinne Bailey Rae – The Sea
148. Bryan Ferry – Olympia
149. Brandon Flowers – Flamingo
150. Taylor Swift – Speak Now
151. Kid Rock – Born Free
152. Elizabeth Cook – Welder
153. Maximum Balloon – Maximum Balloon
154. Peter Wolf – Midnight Souvenirs
155. Ted Leo and the Pharmacists – The Brutalist Bricks
156. Against Me! – White Crosses
157. The Chemical Brothers – Further
158. The Fall – Your Future, Our Clutter
159. Factory Floor – Untitled
160. Les Savy Fav – Root for Ruin
161. New Young Pony Club – The Optimist
162. Islet – Wimmy
163. Hurts – Happiness
164. Lonelady – Nerve Up
165. Magnetic Man – Magnetic Man
166. Pulled Apart By Horses – Pulled Apart By Horses
167. Kelis – Flesh Tone
168. First Aid Kit – The Big Black and the Blue
169. Ikonika – Contact, Love, Want, Have
170. Errors – Come Down With Me
171. The Fresh and Onlys – Play It Strange
172. The Jim Jones Revue – Burning Your House Down
173. The White Stripes – Under Great White Northern Lights
174. Drive-By Truckers – The Big To-Do
175. Carolina Chocolate Drops – Genuine Negro Jig
176. Laura Veirs – July Flame
177. Dr. Dog – Shame Shame
178. Bob Dylan – The Witmark Demos
179. Gogol Bordello – Trans-Continental Hustle
180. Trampled By Turtles – Palomino
181. Johnny Cash – American VI: Ain’t No Grave
182. The Hold Steady – Heaven Is Whenever
183. Black Rebel Motorcycle Club – Beat the Devil’s Tattoo
184. Freelance Whales – Weathervanes
185. Los Lobos – Tin Can Trust
186. Tom Petty and the Heartbreakers – Mojo
187. Elvis Costello – National Ransom
188. Richard Thompson – Dream Attic
189. Ra Ra Riot – The Orchard
190. Justin Townes Earle – Harlem River Blues
191. Blitzen Trapper – Destroyer of the Void
192. Trent Reznor And Atticus Ross – The Social Network Soundtrack
193. Goldfrapp – Head First
194. Jakob Dylan – Women and Country
195. Ben Folds and Nick Hornby – Lonely Avenue
196. Jimi Hendrix – Valleys of Neptune
197. Leonard Cohen – Songs From the Road
198. OK Go – Of the Blue Colour of the Sky
199. The Books – The Way Out
200. Junip – Field
201. Deer Tick – Black Dirt Sessions
202. Sade – Soldier of Love
203. Tricky – Mixed Race
204. I Am Kloot – Sky at Night
205. Skream – Outside the Box
206. Cherry Ghost – Beneath This Burning Shoreline
207. Two Door Cinema Club – Tourist History
208. Voice of the Seven Thunders – Voice of the Seven Thunders
209. Brian Eno – Small Craft on a Milk Sea
210. Dylan LeBlanc – Paupers Field
211. Konono No 1 – Assume Crash Position
212. Smoke Fairies – Through Low Light and Trees
213. PVT – Church With No Magic
214. The Soft Pack – The Soft Pack
215. O Children – O Children
216. Holly Miranda – The Magician’s Private Library
217. Sea of Bees – Songs For the Ravens
218. Pantha Du Prince – Black Noise
220. Cours Lapin – Cours Lapin
221. Darwin Deez – Darwin Deez
222. School of Seven Bells – Disconnect From Desire
223. Beach Fossils – Beach Fossils
224. Shit Robot – From the Cradle to the Rave
225. Chilly Gonzales – Ivory Tower
226. Connan Mokasin – Please Turn Me Into the Snat
227. Holy Fuck – Latin
228. The School – Loveless Unbeliever
229. Tobacco – Maniac Meat
230. Dios – We Are Dios
231. Allo Darlin’ – Allo Darlin’
232. El Guincho – Pop Negro
233. Kort (Kurt Wagner and Cortney Tidwell) – Invariable Heartache
234. Solar Bears – She Was Coloured In
235. Free Energy – Stuck on Nothing
236. Kings Go Forth – The Outsiders Are Back
237. Dan Michaelson and the Coastguards – Shakes
238. Stornoway – Beachcomber’s Windowsill
239. Magic Kids – Memphis
240. Fool’s Gold – Fool’s Gold
241. Frankie Rose and the Outs – Frankie Rose and the Outs
242. Aloe Blacc – Good Things
243. Drums of Death – Generation Hexed
244. Am – Future Sons and Daughters
245. Time and Space Machine – Set Phazer to Stun
246. Walls – Walls
247. The Dillinger Escape Plan – Option Paralysis
248. Happy Birthday – Happy Birthday
249. The Eighties Matchbox B-Line Disaster – Blood and Fire
250. Woods – At Echo Lake
251. Tyler, The Creator – Bastard
252. Kylesa – Spiral Shadow
253. Women – Public Strain
254. Forest Swords – Dagger Paths
255. Wyatt, Atzmon, Stephens – For the Ghosts Within
256. Eli Paperboy Reed – Come and Get It
257. Kelley Stoltz – To Dreamers
258. The Besnard Lakes – Are the Roaring Night
259. Roky Erickson with Okkervil River – True Love Cast Out All Evil
260. Jane Weaver and Septieme Soeur – The Fallen By Watch Bird
261. Ali Farka Toure and Toumani Diabate – Ali and Toumani

…and that’s it for the W2E writeups

Posted in the Foyer by Joe on October 3, 2010

I have notes on the remaining 1.5 days of the Web 2.0 Expo, which I had planned to write up on the plane home on Thursday and post on Friday. But due to a horrendous series of hurricane-related travel problems, I didn’t get home until late Friday evening and writing things up for this blog was the last thing on my mind while rushing around trying to find out if I could get home at all. And tomorrow I leave for a 2 week vacation, so that’s all I have time for. Sorry about that, folks.

W2E Day 3: Morning Presentations

Posted in the Kitchen, the Living Room by Joe on September 30, 2010

JavaScript is the New Black – Why Node.js is Going to Rock Your World
by Tom Hughes-Croucher of Yahoo

Node is a Javascript interpreter that’s getting a lot of buzz. Basically it acts the same as a Python or Perl runtime (or as Tom said repeatedly, “Python or Ruby” – not a Perl fan apparently, which earns him some points with me), letting you run Javascript without a browser and putting it on the same level as the popular desktop or server-side scripting languages.

I’ve been wanting this for years: Javascript is a well-designed, powerful language with clean syntax, and there’s no reason it should be limited to embedding in browsers. And because it has a 100% lock on browser scripting, pretty much everybody has to learn it at some point anyway, so why switch back and forth between scripting languages for other tasks?

Tom makes this point more strongly, pointing out a huge number of job postings for Javascript programmers: web sites are now so complex that companies are not just hiring visual designers and expecting them to slap on some Javascript copied from a web site, they’re hiring full-fledged developers to code up their sites. Using Javascript on the server lets these developers write both back-end and front-end code rather than needing a separate team for each.

I don’t think this is a 100% win: every serious programmer should learn several languages so that they can distinguish the philosophy and structure of programming in general from the quirks of their particular language, so a pure Javascript developer who can’t pick up whatever language is being used on the server side isn’t much of a developer at all. But as long as you remain proficient in several languages – especially if they come from different paradigms – having to switch back and forth during day to day tasks which should be related does slow you down, so artificially limiting Javascript to the browser is a penalty even if it does help to discourage laziness.

The other big benefit touted by Tom is code reuse – which is a 100% win. There is often logic duplicated between client and server – form validation is a big example – and using Javascript on the server lets you use the exact same code, rather than having to rewrite the same algorithm in two different languages, a huge source of bugs. In fact, using Javascript on the server enables shared logic at a level that would be infeasible if it had to be written twice: consider a page that writes a lot of its HTML dynamically through Javascript. In a technique Tom refers to as “Progressive Enhancement”, the first pass is done on the server, using the complete widget set and dynamic logic used on the client, so that as soon as the HTML is received it can be rendered instantly. But the dynamic Javascript is also repeated on the client side, so that as the user interacts, the page is reconfigured in the browser without going back to the server. (The server-side and client-side code will never be 100% identical, but at least it will have the same base rather than trying to do the same thing twice from scratch.) There is an example of this in the YUI/Express demo, with Yahoo UI widgets rendered first on the server without sacrificing client interaction. Tom demonstrated the Table View widget, which showed up a glitch in this scheme: the spacing generated on the server did not exactly match the client, so the widget originally rendered with header tabs squished together slightly and then spaced them out, leading to a slight UI flash. This is ugly and needs to be addressed (although I don’t know if it’s a systemic problem or just because the simplistic demo didn’t include any code to deal with it). Still, that split second when the initial layout was flashed would have been blank without server-side rendering.

Under the hood, Node.js uses Google’s V8 engine and contains a non-blocking, event-driven HTTP server written in 100% Javascript which compares in performance with nginx. The performance graphs Tom showed were impressive and it seems to scale quite well (far better than Apache, for instance.) One big hole right now is that HTTPS support is sketchy, but this is being worked on.

One interesting technical note Tom highlighted: to make use of multi-core hardware with an event-driven server, new threads or processes need to be spun off by hand for heavy work (as opposed to automatically for each connection as in Apache). Although Node does support the fork system call, it also implements the HTML5 Web Workers spec. That means that rather than using slightly different concepts to spawn helpers on the client and the server, developers can reuse their knowledge when writing code in both places.

As a new language (in this context), Javascript doesn’t have as many 3rd-party libraries available as, say, Python and Ruby. But with the buzz it’s getting, more are popping up quickly: Tom showcased several, all available at GitHub:

NPM, the Node Package Manager
Mustache, a JSON-like templating language (which Twitter currently uses in JS in the client but Ruby on the server)
Express, an MVC framework similar to the lower levels of the Rails stack
Paperboy, a static file server

As well as using it as a web server, Node has an interactive shell just like Python’s or Ruby’s. Definitely going to be picking this up for my scripting needs, even though I don’t exactly do much server development.

Tom’s slides are online at http://speakerrate.com/sh1mmer.

When Actions Speak Louder Than Tweets: Using Behavioral Data for Decision-Making on the Web
by Jaidev Shergill, CEO of Bundle.com

Now here’s how to make a product focused presentation without sounding like a shill:

– Here are the resources we have that most people don’t (a large database of consumer behaviour data, including anonymized credit card purchases from a major bank, government statistics and nebulous “third party databases”)
– Here are some studies we did for our own information, whose results we think you’d find useful (“We tracked a group of people in detail and interviewed them to find out in depth how they make decisions”)
– Here’s a neat experiment we put together using these two pieces of information – we don’t even know if we’ll release it, we just wanted to find the results (and here they are)
– Oh, and here’s our actual product

Jaidev presented two theses, the first gleaned from interviewing study participants and the second from his own experience:

1. There’s more than enough information on the Web to make decisions, but 99% of it is useless for the specific person looking at it, because – especially when looking at opinions and reviews – people need to know how people that are like them feel about an option. (Here we are talking about subjective decisions like, “Is this a good restaurant?” or decisions with a lot of variables like, “Does this new device fit my exact needs?”)

2. Online user-generated content is nearly useless for finding opinions because it is not filtered properly. For example, review sites tend to polarize between 5 star and 1 star reviews because only users with strong opinions bother to rate, so all reviews are distorted. Many people filter by their social circle, since their friends (mentions on Facebook, Twitter, etc.) have things in common with them so their recommendations carry more weight, but this means that recommendations are skewed towards options with the latest hype. It turns out people are much better at reporting new things they just found than what they actually use long-term.

To illustrate this, Jaidev presented an experiment in which he used his company’s credit card database to build a restaurant recommendation system, by drawing a map between restaurants based on where people spent their money, how often they returned, and how much they spent there. Type in a restaurant you like and the system would return a list of where else people who ate at that restaurant spend their money. Rather than a subjective rating, the tool returns a “loyalty index” quantifying how much repeat business the restaurant gets. Presumably this will be more useful to you than a general recommendation because the originators of this data share at least one important factor with you: a love of the original restaurant.
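The idea can be sketched in a few lines of Python. Everything here is hypothetical: the talk did not say how the “loyalty index” is calculated, so this sketch stands in visits per distinct customer for it, and the data layout and the recommend function are invented for illustration.

```python
from collections import defaultdict


def recommend(transactions, liked):
    """Rank other restaurants visited by people who ate at `liked`.

    transactions: iterable of (customer, restaurant, amount) tuples.
    Returns (restaurant, loyalty, total_spent) tuples, most loyal first.
    The "loyalty" value is a stand-in: visits per distinct customer.
    """
    transactions = list(transactions)  # iterated twice below
    fans = {cust for cust, rest, _amt in transactions if rest == liked}
    visits = defaultdict(lambda: defaultdict(int))  # restaurant -> customer -> visits
    spent = defaultdict(float)                      # restaurant -> total spent by fans
    for cust, rest, amt in transactions:
        if cust in fans and rest != liked:
            visits[rest][cust] += 1
            spent[rest] += amt
    results = []
    for rest, per_customer in visits.items():
        loyalty = sum(per_customer.values()) / len(per_customer)
        results.append((rest, loyalty, spent[rest]))
    # Highest repeat business first; total spend breaks ties.
    return sorted(results, key=lambda item: (-item[1], -item[2]))
```

The point of the design is that only customers who also ate at the restaurant you typed in contribute to the numbers, so the results reflect where people with at least one taste in common with you keep going back.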

The result was that a restaurant which was highly recommended on both review sites and in Jaidev’s circle rated very low. Compared to restaurants with similar food and prices, customers returned to this one far less often and spent far less. Reading reviews in depth revealed that, while the highest ratings praised the food quality, middling ratings said that the food was good but management was terrible, with very slow service and high prices. Equally good food could be found elsewhere for less price and hassle. This information was available in reviews, but hard to find since it was drowned out by the all-positive or all-negative reviews.

So the main point to take away from the presentation is: hard data through data mining is still more valuable than the buzz generated through social media. Which is obvious, but a good point to repeat at this conference which is full of people who are so excited about adding social components to everything.

Jaidev did a great job of demonstrating the value of his company’s data set without actually sounding like he was selling it. He only demonstrated bundle.com itself briefly: it seems to be a money management site which allows users to compare their financial situation to the average and median to answer questions like, “Am I spending too much on these products?” and, “How much should I budget for this?”. The example Jaidev showed was an interactive graph of the cost of pet ownership. Looks like a useful site.

Alas, the equally useful looking restaurant recommender was only a proof of concept and is not released to the public. (And only covers Manhattan.) Email jaidev@bundle.com if you want to see it made public.

(While I’m attending this conference on behalf of Research In Motion, this blog and its contents are my personal opinions and do not represent the views of my employers. How does the unicorn breathe?)

W2E Day 2: Afternoon Presentations

Posted in the Living Room by Joe on September 29, 2010

What to Expect From Browsers in the Next Five Years: A Perspective
Panel discussion

This was a great discussion, although it really didn’t touch on its ostensible topic much! It was more a discussion of the current state of the browser. The panelists were:

Douglas Crockford, creator of JSON, active on the ECMAScript committee, and Javascript evangelist at Yahoo
Alex Russell, whose workshop I already went to on day 1
Brendan Eich, original inventor of Javascript and Mozilla CTO
Håkon Wium Lie, creator of CSS and Opera CTO

The discussion moved way too fast to take detailed notes, so here are some quotes (paraphrased):

On the IE9 beta:
Hakon: “I’ve dedicated my life to improving IE”
Doug: “They have no ECMAScript 5 – there’s no provision in the standard for subsets, meaning they’ve chosen to be noncompliant”
Alex: “It’s not available on XP, which is a concern because XP still has so many users”

On how to differentiate a browser:
Hakon: “Opera has a good emphasis on JS performance, but getting on every device in Asia and Africa is another goal.”
Brendan: “Mozilla is behind on performance but improvements are ongoing. Integration with the user’s identity is what I’m interested in (password management, etc., so that the browser has your info and doles it out to sites rather than the user passing info to sites individually).”
Doug: “Do we work AT ALL – getting better.” (I have no idea what the context of this was any more.)
Alex: “Developer productivity. The big bottleneck is now network behaviour: we need more expressive API’s because using javascript to address HTML and CSS shortcomings makes the web less semantic.”

On data privacy with the browser knowing your identity:
Hakon: “I’m scared about evercookie. I would rather have the browser shield the user.”
Brendan: “The browser should be the first point of trust, providing APIs to the user, instead of Facebook providing APIs.”
Doug: “XSS is the biggest security problem: all scripts share a common global context, and there is a complicated language and protocol stack with different conventions at each level, making it impossible to reason about. HTML5 was irresponsible in adding new networking, local storage, and DB access before fixing XSS.”
Brendan: “We need to fix incrementally, or nobody will adopt. HTML5 has some security features.”
Alex: “We have theoretical solutions. The battle will be good vs usable: it has to be a system that everyone can use because that is the web’s strength.”

On the use of “HTML5” to refer to a set of features, many of which are actually CSS additions, rather than the actual spec:
Hakon: “This is a marketing problem: people want one name. The solution to ensure MS supports these features is ACID tests.”
Alex: “This is unproductive because lag time on creating ACID tests and getting browsers in hands is very long. I’m more interested in dynamics of getting browsers upgraded and released quickly.”
Doug: “This gets worse in the short term for web developers because the differences between browsers will increase, but longterm IE6 will die and it will become better.”
Brendan: “We need to use JS libraries for extensibility, because building everything into the browser is too slow to deploy.”
Hakon: “Yes, we need to use JS as a sandbox and move successes into the declarative side.”

On apps vs the Web:
Hakon: “Native apps will be a footnote.”
Alex: “The Chrome web store lets you monetize a web page. Targeting fixed hardware lets you target the edge of a device, but as hardware improves we can burn cycles and the Web becomes a cost reduction to avoid writing things 5 times.”

On Google’s NaCl:
Somebody, I didn’t write down who (maybe the moderator): “It’s a promising research and prototyping story to get better performance out of the browser.”
Brendan: “It’s too complex. People here are calling it ActiveG. Nobody wants it. I don’t want pthreads running in my browser.”

On the lack of audience questions:
Moderator: “I don’t think you understand – this man invented CSS. If you have any questions about why your transforms aren’t working, ask them now!”

On developer tools:
Everybody: “Our browser has a tool built in. This is its name.”

On XML compatibility:
Doug: “XML is obsolete, didn’t you get the memo?”
Brendan: “Firefox uses heavy XML. We bought into it heavily, but now we’ve ripped a bunch of it out.”

On Mozilla’s evolving role now that every browser is debuting new user features, and the fight to get IE to follow web standards is won:
Brendan: “We represent the user first and only.”
Hakon: “Competition is great. We need several rendering engines to verify the models.”
Brendan: “People think engines are too complex to maintain several, but it’s not true.”

On standards bodies:
Brendan: “We need smart people who can do both practical and theory to serve on them, and we need them to work together.”
Doug: “ECMAscript is the most important standard because if the others are broken, JS is the workaround.”

Some genius in the audience asked the best question of the session: “What’s the biggest unfilled hole in the HTML/CSS/Javascript stack?” On that:
Doug: “Security.”
Alex: “Integration. There are leaky seams between standards, and too many standards groups that don’t cooperate.”
Hakon: “CSS has an object model coming to help with that.”
Brendan: “The security problem lies in the leaky seams.”

On plugin privacy concerns (users uniquely identifiable by their list of installed plugins):
Brendan: “We’re going to start turning off old plugins.”
Alex: “We’re trying to reduce the surface area of plugins. We need to leak some info so that they can be instantiated, but at least we need to hide their upgrade path and only keep the most recent versions installed.”

10 Things You Never Knew About Opera
by Håkon Wium Lie

I wanted to see this after seeing Hakon at the last panel, but it turned out to be pretty useless: basically a big ad for Opera. I guess that’s expected from the title, but I was hoping for some interesting new technology they incorporated or something. Instead it was all, “We employ 500 engineers,” and, “We’re big in Russia.” The only new thing mentioned was Opera Unite, which didn’t work when he tried to demo it. (My big question about it was how it dealt with firewalls and NAT – apparently it doesn’t, which is pretty useless.)

David Kaneda mentioned in the morning that he was required to make his presentation about general technology and not his particular product. I guess sponsors are immune.

Personal, Relevant, Connected: Designing Integrated Mobile Experiences for Apps and Web
by some drones from Microsoft

This was even worse. A lot of people walked out. I won’t waste my time by elaborating.

(While I’m attending this conference on behalf of Research In Motion, this blog and its contents are my personal opinions and do not represent the views of my employers. No, I would not like a pig.)

W2E Day 2: Morning Presentations

Posted in the Kitchen by Joe on September 29, 2010

I’ve resigned myself to being a day behind in blogging. I’m just too slow at organizing my thoughts to get them into an order worthy of publishing in the evenings. Better to come at it fresh in the morning.

The shift from Monday’s 3-hour-long workshops to yesterday’s shorter presentations was pretty jarring. I kept thinking the presenters were just finishing their introduction and finally about to go into detail – and the talk would be over. So my biggest complaint about a lot of these talks is that they gave such a broad overview that I didn’t really learn much.

How to Develop a Rich, Native-quality User Experience for Mobile Using Web Standards
by David Kaneda of Sencha, a company that was namedropped quite a few times by Jonathan Stark yesterday as the makers of a really high quality Javascript UI toolkit.

I expected this to overlap quite a bit with Jonathan’s workshop, but I wanted to see it anyway because I spoke with David briefly afterwards and I wanted to see how his opinions differed.

He offered quite a few nuggets of info like this interesting graph: in January 2010 the breakdown of US mobile web traffic was 47% iOS (iPhone/iPad), 39% Android, 7% RIM, 3% WebOS (Palm-now-HP Pre), 2% Windows Mobile, 2% Other – so 95% WebKit. (Actually David was wrong about that – if these were released in January the RIM numbers would have to be pre-Torch, so that’s only 86% WebKit. For now.) Even compared to the height of IE’s popularity, that kind of homogeneity is new for the web. It means that mobile developers, unlike web developers, are free to target the WebKit-specific subset of HTML5 without worrying about, for a trivial example, including both -webkit- and -moz- prefixed CSS properties. That’s good because with limited networking, avoiding redundancy is important. I suspect it’s temporary, though, as new features are added and phones diverge in which exact WebKit versions they support.

(Of course, this list reflects web usage, not phone ownership: these are the phones whose users spend the most time online. Even still I was surprised to see the Pre passing Windows Mobile.)

Wow, there are a lot more Android phones than I realized. David believes a year from now the basic free phone you get with your mobile account will be a shovelware Android phone, with a touch screen, GPU, etc, and a good browser, so really rich web apps will be available to everyone, not just the high end. (But, as I mentioned above, WebKit will continue to evolve and these phones would be good candidates for lack of upgrade support.)

As I expected, David mentioned a lot of the same things yesterday’s workshops did: he glossed over web storage and form validation (as well as web workers, which everybody seems to mention and then say “but I don’t have time to discuss that”). He did cover, briefly, the new input types, video/audio (mentioning that video can be styled using the new CSS transforms, which stands to reason but hadn’t occurred to me to try; I assume it doesn’t distort the contents, that would be crazy!), meta viewport tags, CSS3 features, and the application cache manifest. That’s my biggest eye-opener so far – I helped to test the cache manifest in the Torch browser, although someone else did the implementation, and when I read the spec I thought it was ridiculous and would never catch on. But all the mobile app developers here seem really excited about it. It’s a huge pain in the ass to use, though, so clearly there’s some room for improvement (or at least tool support).

David then went on to talk about some glitches in the mobile implementation, which is valuable info:

Touchscreen events: apparently there is a hardcoded 350ms delay in tapping an element before the click event is fired. That’s crazy sluggish! (He speculated that the reason was an attempt to allow people to cancel quickly if they touch the wrong thing, since fingers are thick and clumsy. I’ve seen other posts online guessing that this is to allow the user to continue and turn the click into a gesture.) David recommends working around this by binding custom events to touch down and touch up, which generate a click event immediately if the user does touch down and then touch up within a small radius. I dunno – sounds hackish and fragile to me. (I looked this up to see if it was in a spec somewhere or just an implementation detail, and all I could find was posts complaining about it on the iPhone. I’ll have to check with our UI people and see if it’s iPhone-specific or built into WebKit.)
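To make the workaround concrete, here is a minimal sketch of the decision at its core: given a touch down and a touch up, should a click be generated right away? This is not David's code or any library's API. The names (TouchPoint, isTap) and both thresholds are assumptions picked for illustration.

```typescript
// Hypothetical names and thresholds, chosen only for illustration.
interface TouchPoint {
  x: number;      // horizontal position in pixels
  y: number;      // vertical position in pixels
  timeMs: number; // timestamp of the event in milliseconds
}

const TAP_RADIUS_PX = 10;        // assumed "small radius"
const TAP_MAX_DURATION_MS = 300; // assumed cutoff so long presses don't count

// Returns true when a touch down / touch up pair should be treated as a click.
function isTap(down: TouchPoint, up: TouchPoint): boolean {
  const dx = up.x - down.x;
  const dy = up.y - down.y;
  const moved = Math.sqrt(dx * dx + dy * dy);
  const heldMs = up.timeMs - down.timeMs;
  return moved <= TAP_RADIUS_PX && heldMs >= 0 && heldMs <= TAP_MAX_DURATION_MS;
}
```

In a real page the two points would be recorded in touchstart and touchend listeners, and a true result would trigger the element's click handler immediately instead of waiting for the browser's delayed click. The fragile part is everything this sketch leaves out: suppressing the later native click, multi-touch, and scrolling that starts on the element.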

No fixed elements: position:fixed doesn’t work on mobile – he didn’t say why, so I had to look it up. Actually it works fine to fix an item to a specific position on the page, that’s just not very useful on mobile since most people want it fixed to the viewport. This makes building pages with independently scrolling areas difficult. To fix this, he again recommends writing custom touch handlers to track movement events and update content based on that – a simple implementation would just move linearly, but to match the platform native scrolling, you would need to track acceleration and add bounce at the end, which gets pretty complex.

Wow. That’s even worse! After all the talk of moving transitions and animations into CSS so they can be implemented in the browser and accelerated, it saddens me deeply to hear people talk about implementing kinetic scrolling of all things in Javascript.
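For a sense of what "track acceleration and add bounce" involves, here is a rough sketch of the physics loop such a handler would run once the finger lifts. It is a toy model, not how iScroll, TouchScroll, or any platform implements it: the names and the friction and bounce constants are all assumptions, and real implementations are considerably more involved.

```typescript
// Hypothetical sketch: names and constants are assumptions for illustration.
interface ScrollState {
  position: number; // scroll offset in pixels
  velocity: number; // pixels per frame
}

const FRICTION = 0.95; // assumed fraction of velocity kept each frame
const BOUNCE = 0.5;    // assumed fraction of speed kept when hitting an edge
const MIN_SPEED = 0.1; // below this speed the scroll counts as stopped

// Advances one animation frame: move, bounce off the edges, apply friction.
function stepScroll(s: ScrollState, maxOffset: number): ScrollState {
  let position = s.position + s.velocity;
  let velocity = s.velocity * FRICTION;
  if (position < 0 || position > maxOffset) {
    // Overshot an edge: clamp to it and reverse direction with damping.
    position = Math.min(Math.max(position, 0), maxOffset);
    velocity = -velocity * BOUNCE;
  }
  if (Math.abs(velocity) < MIN_SPEED) {
    velocity = 0;
  }
  return { position, velocity };
}

// Runs frames until the scroll comes to rest (or a frame cap is hit)
// and returns the final offset.
function settle(start: ScrollState, maxOffset: number): number {
  let s = start;
  for (let frame = 0; frame < 10000 && s.velocity !== 0; frame++) {
    s = stepScroll(s, maxOffset);
  }
  return s.position;
}
```

A touch handler would estimate the starting velocity from the last few move events, then call something like stepScroll once per animation frame and apply the position to the content. All of that runs in script on every frame, which is exactly the work that would be better done natively by the browser.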

Finally he listed some Javascript frameworks to help with mobile development:

iScroll and TouchScroll both wrap touch scrolling as described above, and contain their own implementations of acceleration.

jQTouch he described as “limited” and “hypercard-like”; I guess he’s allowed to denigrate it since he wrote it. It fixes the 350 msec delay but not scrolling.

And Sencha Touch, his company’s product which is in beta, abstracts all touch events to generate artificial events like doubletap, swipe, and rotate, implements independently scrollable areas, and has a ton of other features.

For deployment he mentioned OpenAppMkt, an app store for web apps, and PhoneGap again for wrapping web apps in a native shell.

David’s slides and additional links are online at his site, including a lot of links to further resources. (Including a bunch I took the time to Google for myself. Oops.)

And at this point my fingers started to cramp up and my notes became much sketchier, so the rest of these writeups will be much, much shorter.

The Browser Performance Toolkit
by Matt Sweeney of YUI/Yahoo!

I was expecting this to describe a specific product, but it turned out to be a metaphorical toolkit – a list of performance measurement tools.

I thought this talk could have benefited from some more demonstration: how to read a waterfall diagram, how to use a profiler. It came off as just a list of products to check out later. It also didn’t do a lot to distinguish between them: there was a lot of, “And this has a network waterfall display which is a lot like the last one.” So why should I choose this one over the last one? When somebody asked which products he personally used, Matt’s answer was, “All of them,” which isn’t too helpful. I would have liked more depth here.

I won’t bother linking to these since there are so many and they’re easy to Google:

Firebug: Firefox’s debugger add-on, gives a waterfall display of asset loading, and can break out each load into DNS lookup time, connection time, data transfer time, etc; also has a Javascript profiler, which unfortunately is limited because it only gives function names and call counts, not full call stacks.

YSlow: an open source Firebug add-on from Yahoo, which gives a grade on various categories of network usage and detailed stats, good for getting a high-level view of network usage.

Page Speed: another open source Firebug add-on, from Google; similar to YSlow but also tracks some runtime performance. It can profile deferrable Javascript by listing functions not yet called at onload time (candidates for deferral), and displays waterfall diagrams including separate JS parse and execute times.

Web Inspector: WebKit’s built-in debugger. Similar to Firebug, plus an “Audits” tab similar to YSlow/Page Speed. It includes the standard network loading waterfall, and a JS profiler which unlike Firebug’s includes system events like GC and DOM calls, and does include call stacks but not call counts, just run times.

IE8 “has a robust set of developer tools”, but he didn’t say any more about them. IE9 has a network usage tracker and profiler which seem on par with Firebug and Web Inspector, with one huge caveat: the profiler’s timer is in 15msec chunks! So runtimes are not close to accurate due to rounding. At least call counts are still useful.

DynaTrace AJAX Edition is a free IE add-on supporting versions 6-8, with IE9 and Firefox support coming soon. It has the standard network waterfall, plus some nice high-level pie charts showing percentage of time spent loading various resources. Its profiler tracks builtins like Web Inspector’s, and can also capture function arguments, which sounds very useful, and has the nicest UI demonstrated, including pretty-printing the source code when looking at a function (especially useful when debugging obfuscated JS).
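Going back to the 15 msec timer noted for IE9's profiler, a small sketch shows why coarse ticks make per-function runtimes unreliable. This models a clock that only advances in 15 msec steps; that model is an assumption for illustration, not a description of how IE9's profiler is actually built, and the function names are made up.

```typescript
// Hypothetical model of a coarse profiler clock; names are illustrative.
const TICK_MS = 15; // the granularity quoted in the talk

// What a clock that only advances every TICK_MS would read at a given true time.
function coarseNow(trueTimeMs: number): number {
  return Math.floor(trueTimeMs / TICK_MS) * TICK_MS;
}

// Duration a profiler would report if it sampled the coarse clock
// when a function starts and again when it ends.
function measuredDuration(startMs: number, endMs: number): number {
  return coarseNow(endMs) - coarseNow(startMs);
}

const withinOneTick = measuredDuration(0, 4);    // a 4 ms call that starts on a tick: reports 0
const acrossTick = measuredDuration(13, 17);     // the same 4 ms call straddling a tick: reports 15
```

Under this model the same 4 msec call shows up as either 0 or 15 msec depending on where it lands, so only functions that run for many ticks get a meaningful number. Call counts are unaffected by the timer.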

Matt also mentioned a few online tools: apparently showslow.com aggregates YSlow, Page Speed and DynaTrace scores, and publishes the results so it’s a good way to compare your page to others. But when I tried to go there I got what looked like a domain squatter, and I see no mentions of it in Google – did I copy the name down wrong? Webpagetest.org does network analysis for IE7.

Mentioned but not detailed were Fiddler and Charles (no relation), a proxy which among other things can be used to see Flash resource requests over the wire.

His final point was that, since browsers vary, you can’t just use one tool and need to profile in multiple browsers with multiple tools. Which makes sense, but it would have been nice to give more detail on what YSlow or Page Speed give you over the browsers’ builtin debuggers.

NPR Everywhere: The Power of Flexible Content
by Zach Brand, head of NPR’s digital media group

I could have gone to the HTML5 vs Flash presentation in this time slot, but after seeing 3 variants on HTML5 in a row I figured I should see something a little different instead. Here, the head of NPR’s digital media group described the challenges in getting NPR’s original news reporting formatted in a way that could be used in many contexts: large screens, small mobile devices, summarized on affiliates’ web sites, serialized through RSS, embedded in blogs.

This was another presentation where I would have liked to see more technical detail and less overview, but I guess there wasn’t time. The talk was interesting, but didn’t contain anything directly applicable to my work so I won’t bother to summarize it.

I will list two important links, though:

An API for public access to NPR’s news stories is at npr.org/api.

Zach promised a followup on the blog at npr.org/blog/inside

(While I’m attending this conference on behalf of Research In Motion, this blog and its contents are my personal opinions and do not represent the views of my employers. That ghost is full of cake.)

W2E Day 1: Building Cross-Platform Mobile Web Apps with Jonathan Stark

Posted in the Kitchen by Joe on September 28, 2010

Jonathan Stark, author of two O’Reilly books on mobile app development, shares techniques for doing it with web technologies.

This was a good companion to this morning’s session (okay, yesterday morning’s session now: Alex’s workshop took so long to write up that now I’m behind.) It covered a lot of the same ground but in a more hands-on, less theoretical way. It discussed the same CSS3 features: transforms, transitions, animation, gradients, rounded corners, and text shadows, but gave more complete code examples, took some time to explain them, and tweaked a lot of parameters to show their effect.

Jonathan disagreed with Alex on one thing: they both gave equal weight to Web Storage (for keeping simple persistent data on the client) and app cache manifests (for keeping resources on the client), but Jonathan went on to give a gung-ho demo of HTML5’s SQL database integration, which Alex dismissed yesterday, saying the API was “a mess”. One reason might be that Jonathan was speaking specifically about writing mobile pages, which means WebKit (which has the SQL API), while Alex, despite being a Chrome developer, was being careful to keep his talk cross-platform and highlight the Firefox and Opera way to do things. I’m not sure now if Alex meant that the db situation is “a mess” because there is no convergence, or if he had actual problems with the API design.
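For reference, the Web Storage side of this is about as simple as client-side persistence gets: string keys mapped to string values. The sketch below is not from either workshop. It uses the two real Storage methods, getItem and setItem, behind a small interface so it can run outside a browser; MemoryStore, saveSettings, loadSettings and the "settings" key are hypothetical names invented for the example.

```typescript
// The subset of the Web Storage interface used here; in a browser,
// window.localStorage provides these two methods.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical in-memory stand-in so the sketch runs without a browser.
class MemoryStore implements KeyValueStore {
  private data: { [key: string]: string } = {};
  getItem(key: string): string | null {
    return Object.prototype.hasOwnProperty.call(this.data, key) ? this.data[key] : null;
  }
  setItem(key: string, value: string): void {
    this.data[key] = value;
  }
}

// Web Storage only holds strings, so structured data goes through JSON.
function saveSettings(store: KeyValueStore, settings: Record<string, string>): void {
  store.setItem("settings", JSON.stringify(settings));
}

// Returns an empty object when nothing has been saved yet.
function loadSettings(store: KeyValueStore): Record<string, string> {
  const raw = store.getItem("settings");
  return raw === null ? {} : JSON.parse(raw);
}
```

In a page, passing window.localStorage in place of the MemoryStore would make the data persist across visits. Anything that needs querying rather than lookup by key is where the database APIs come in.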

Apart from the new CSS and HTML practical overview, Jonathan did cover some more philosophical questions: is it better to use native UI toolkits to write a mobile app for each device, or just write your “app” in HTML and use these new features to make it look as much like an app instead of a web page as possible? HTML is the clear winner in terms of developing for multiple devices (no need to learn Objective-C just to build for iPhone), distribution, and updating (just dump the new version to the live site), but native apps still have slightly better cosmetics and – critically – access to device features. Random apps on a web site aren’t allowed to access the camera, dial the phone or open the address book.

Nonetheless, Jonathan strongly recommended using web tools to build apps, since it lets you target more than a single device. To get around the sandboxing problem, he touted PhoneGap, a very cool looking open source framework for compiling HTML apps into native packages. Just dump your web code into a dir, point PhoneGap at it, and it will generate an appropriate project to embed it for your platform (some Java classes and ant build files for Android, some Objective-C code and an Xcode project for iPhone, etc). Even better, it generates bridges to allow access to sandboxed features like the camera from Javascript (though obviously you can’t test this in the browser). Their supported feature list only lists BlackBerry 4.5, but that’s not too surprising as the WebKit browser is so new. Hopefully support for BB6 widgets will be coming soon. I’ll definitely need to take a look at this project.

It’s important to note that Jonathan was talking about building apps through HTML, not just web pages. His examples blatantly ignored a lot of Alex’s optimization advice (or more precisely, “not sucking horribly” advice), but as he reminded me when I talked to him afterwards, not blocking on network performance doesn’t matter as much when the app is all installed. He swears by Steve Souders for performance advice: I’ll need to check out his stuff to see how it compares. (I see he and Alex both work for Google, so probably Alex got it from him in the first place.)

Jonathan also touted the jQTouch library that he maintains, originally written by David Kaneda, whose presentation I’m going to see tomorrow. (Today, now. In fact, I’ve already seen it. But let’s maintain the fiction that I just walked out of the workshop.) It looks like a pretty good widget toolkit – built on jQuery, but I won’t hold that against it for native development. I wonder how hard it would be to rip out the jQuery usage and replace it with Alex’s h5.js…

Jonathan had some useful advice for mobile developers too: besides the obvious small screen and slow, unreliable, expensive data channels, remember that the user’s interaction with a mobile device will be different because they are almost always using it in a distracting environment. Users want to pop open the app, perform a tiny task, and be done before they reach the head of the line. That means always let the user know how much is left to go in their interaction, try to break things up into tiny chunks, and if there is ever the slightest pause for God’s sake throw up a progress bar or something!

A more specific recommendation: on a touch device, put controls (search bars, navigation buttons, etc) on the bottom of the page, since reaching up to tap a control will block the user’s view. Good common-sense advice; I hadn’t thought of that.

One thing that rubbed me the wrong way, as a traditional developer, is that Stark taught terrible coding discipline. Several times he pasted the same code into his sample app over and over and said, “this is a bit verbose, but you can just add a macro to your text editor to paste it in automatically.” Augh! No! No more cut and paste code – abstract it into a function if you use it more than once! (Unless maybe Javascript function calls have a lot of overhead I’m not aware of that make two line wrapper functions horribly inefficient.)

Other than that, a pretty good presentation, and it was good to see that a web designer is just as excited about new HTML features as browser developers expect (and for much the same reasons – both easier to write and faster).

Slides and some extended notes are promised at http://jonathanstark.com/web20.

(While I’m attending this conference on behalf of Research In Motion, this blog and its contents are my personal opinions and do not represent the views of my employers.)
