I Am Not Charles


The Best Albums of 2010, part 5: The Stupid Methods

Posted in the Workshop by Joe on January 10, 2011

Part 1

Part 2

Part 3

Part 4

We have a set of ballots containing the year end Best Albums of 2010 lists from 9 publications. Yesterday I listed 12 possible ways to count these ballots to come up with a unified list.

Today I’ll go through the ways that don’t work for this data.

First Past the Post

I’ve said that First Past the Post is a terrible voting system. Now, let me demonstrate:

First we count up the top vote from every ballot. (Easy to do by hand, since there are only 9 ballots.) The winner is the #1 album of 2010. Then we drop that winner and count up the top votes that remain. The winner is the #2 album of 2010. Repeat until we’ve ranked all the albums.

That gives:

4 Kanye West – My Beautiful Dark Twisted Fantasy
2 Arcade Fire – The Suburbs
1 Caribou – Swim
1 John Grant – Queen of Denmark
1 These New Puritans – Hidden

So Kanye West has the #1 album of 2010, by this count. That doesn’t make any sense – 4 publications ranked him #1, but 3 others didn’t like his album enough to rank it at all! Compare Arcade Fire, whose album was liked enough to put in the top 25 by everyone: on the 4 lists where Kanye is #1, Arcade Fire is #2, #3, #4 and #11, but on the 5 other lists, Arcade Fire beats Kanye West hands down. So over half of our voters greatly prefer Arcade Fire to Kanye West, and most of the remaining voters only prefer Kanye by a tiny amount. Only Pitchfork (Kanye West at #1, Arcade Fire at #11) would be greatly dissatisfied by putting Arcade Fire ahead of Kanye West, while Mojo, Q, and NME would be extremely dissatisfied to put Kanye ahead of Arcade Fire. (And NPR, with Arcade Fire at #1 and Kanye West at #10, is a mirror of Pitchfork – call that greatly dissatisfied. And while Rough Trade wouldn’t be happy with either result, having ranked Arcade Fire way down at #21, they’d surely be even more pissed off if the win went to Kanye West, who they didn’t rank at all.)

More concisely: Arcade Fire is the Condorcet Winner. Kanye West should not win. So we can discard this result already.
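To make "Condorcet Winner" concrete, here is a minimal sketch of the pairwise check. The ballots are made-up single letters, not the real lists, and the helper names `prefers` and `condorcet_winner` are only for illustration. Unlisted candidates are treated as tied for last, which matches how the year-end lists are being read here.

```python
def prefers(ballot, a, b):
    """True if this ballot ranks a above b; unlisted candidates tie for last."""
    if a not in ballot:
        return False
    if b not in ballot:
        return True
    return ballot.index(a) < ballot.index(b)


def condorcet_winner(ballots):
    """Return the candidate who beats every other one head-to-head, or None."""
    candidates = {c for ballot in ballots for c in ballot}
    for a in candidates:
        beats_all = all(
            sum(prefers(b, a, other) for b in ballots)
            > sum(prefers(b, other, a) for b in ballots)
            for other in candidates
            if other != a
        )
        if beats_all:
            return a
    return None


# Hypothetical ballots: K has the most first-place votes,
# but A is preferred to K on three of the five ballots.
ballots = [
    ["K", "A", "B"],
    ["K", "A"],
    ["A", "B", "K"],
    ["B", "A"],
    ["C", "A", "B"],
]
print(condorcet_winner(ballots))
```

In this toy set K leads on first-place votes while A wins every head-to-head contest, which is the same shape as the Kanye West versus Arcade Fire situation described above.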

Just for fun, let’s remove Kanye and see how the top 3 ends up. Votes for #2 are:

3 Arcade Fire – The Suburbs
1 Caribou – Swim
1 Deerhunter – Halcyon Digest
1 John Grant – Queen of Denmark
1 LCD Soundsystem – This Is Happening
1 The Black Keys – Brothers
1 These New Puritans – Hidden

So Arcade Fire is #2. That’s not terrible. After removing Arcade Fire, votes for #3 are:

2 The Black Keys – Brothers
1 Beach House – Teen Dream
1 Caribou – Swim
1 Deerhunter – Halcyon Digest
1 John Grant – Queen of Denmark
1 LCD Soundsystem – This Is Happening
1 Robert Plant – Band of Joy
1 These New Puritans – Hidden

#3 is The Black Keys, by 1 vote. But we were perilously close to a 9-way tie for 3rd. Which illustrates the weirdness of our data set: in most elections, there are a handful of candidates and hundreds of voters. We have hundreds of candidates and only a handful of voters. Some voting methods will produce a lot of ties, just because there are so few votes to go around that everyone will get one. These might be perfectly good voting methods for most elections; they just fall down on this edge case.

It looks like First Past the Post will be vulnerable to ties – if one list had ranked The Black Keys a little lower, we’d have one here. But it doesn’t matter, since we’ve already rejected it for failing to elect the Condorcet Winner. FAIL.
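The repeated count described in this section (tally the top votes, take the winner, strike it, repeat) could be sketched like this. The ballots are invented letters rather than the real lists, and the alphabetical tiebreak is an assumption for the sake of the example.

```python
from collections import Counter


def fptp_ranking(ballots):
    """Rank candidates by repeated first-past-the-post rounds.

    Each round counts the top remaining entry on every ballot, picks the
    candidate with the most votes, and strikes it from all ballots.
    Returns (winner, votes, was_tied) for each round.
    """
    remaining = [list(b) for b in ballots]
    ranking = []
    while any(remaining):
        tally = Counter(b[0] for b in remaining if b)
        top = max(tally.values())
        leaders = sorted(name for name, votes in tally.items() if votes == top)
        winner = leaders[0]  # assumption: break ties alphabetically
        ranking.append((winner, top, len(leaders) > 1))
        remaining = [[c for c in b if c != winner] for b in remaining]
    return ranking


# Hypothetical ballots, best first
ballots = [
    ["K", "A", "B"],
    ["K", "A"],
    ["A", "B", "K"],
    ["B", "A"],
    ["C", "A", "B"],
]
for winner, votes, tied in fptp_ranking(ballots):
    print(winner, votes, "(tie broken)" if tied else "")
```

The `was_tied` flag is there because, with so few ballots, it is worth knowing how often a round was only decided by the tiebreak.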

Approval Voting

Here’s one of those ties now. In approval voting, every album which appears on a list at all gets 1 point, and the album with the most points is #1. (Second most points is #2, etc.)

So we start off with a 2-way tie for first, followed by a 3-way tie for third:

9 Arcade Fire – The Suburbs
9 LCD Soundsystem – This is Happening
8 Beach House – Teen Dream
8 The National – High Violet
8 Vampire Weekend – Contra

This is because we defined “approval” as “anywhere in the year end list”. In an actual approval vote, the voters would know beforehand how the votes would be counted and probably be more selective in who they vote for. These lists are not really saying “any of these 25 or 50 albums would be ok by us as Album of the Year”. Really, they would probably vote for their top 3 or so (or some would vote for their top 3, some for their top 5, some would vote only for their favourite, etc.) If everyone approved only their top 3, we’d get:

6 Arcade Fire – The Suburbs
4 Kanye West – My Beautiful Dark Twisted Fantasy
2 Beach House – Teen Dream
2 Deerhunter – Halcyon Digest
2 The Black Keys – Brothers
2 These New Puritans – Hidden
1 Caribou – Swim
1 Elton John and Leon Russell – The Union
1 Gil Scott-Heron – I’m New Here
1 John Grant – Queen of Denmark
1 LCD Soundsystem – This Is Happening
1 MGMT – Congratulations
1 Plan B – The Defamation of Strickland Banks
1 Robert Plant – Band of Joy
1 The National – High Violet

So now we have Arcade Fire at #1, Kanye West at #2 (both seem reasonable) and a 4-way tie for #3. And a 9-way tie for #7. And no way to rank anything after that.

In a real election, Approval Voting is better than First Past the Post because it has less need for tactical voting – for instance, take Rough Trade. The top of its Best Of list looks nothing like anyone else’s. If Rough Trade were a voter trying to actually influence an election, they would know (based on polls and publicity) that voting for Caribou, or Gil Scott-Heron, or These New Puritans, was useless – they have no hope of winning. They might even have picked up enough from the media to know that it’s shaping into a contest between Arcade Fire (who they rank 21) and Kanye West (who they hate – they didn’t even rank him). So they might be tempted to hold their nose and vote for Arcade Fire just to make sure Kanye West doesn’t win. It would be a tough choice, though, because what if their preferred candidates have a lot of underground support that isn’t getting media attention? With Approval Voting, Rough Trade could vote for their top 3 or 5 (or however many they wish) to show their support, plus throw in a vote for Arcade Fire just to make sure they have at least one vote that isn’t wasted. (You could say that this is still voting tactically, but Approval Voting at least gives more and better options for tactical voting.)

However, Approval Voting isn’t guaranteed to elect the Condorcet Winner – it depends entirely on how the voters choose to define “approval”. The various preferential ballot methods are clearly better at selecting the correct winner, because they let each voter give more information about their preferences. To balance this, Approval Voting is much easier to explain and count – you don’t even need a computer to count the ballots! So for a general election it may be a fair choice.

Regardless, it doesn’t work for our purposes, due to the number of ties we get when there are so many more candidates than there are ballots (which wouldn’t be a problem in a real election). FAIL.
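For the curious, an approval count with an optional "top N only" cutoff might look something like the sketch below. The ballots are made-up letters and the function names are illustrative; `top_n=None` means everything on the list counts as approved.

```python
from collections import Counter


def approval_tally(ballots, top_n=None):
    """Give one point to every candidate a ballot approves of.

    With top_n=None the whole list counts as approved; otherwise only
    the first top_n entries do.
    """
    tally = Counter()
    for ballot in ballots:
        approved = ballot if top_n is None else ballot[:top_n]
        tally.update(set(approved))
    return tally


def tied_groups(tally):
    """Group candidates by score, highest score first."""
    groups = {}
    for name, score in tally.items():
        groups.setdefault(score, []).append(name)
    return [(score, sorted(names)) for score, names in sorted(groups.items(), reverse=True)]


# Hypothetical ballots, best first
ballots = [
    ["K", "A", "B"],
    ["K", "A"],
    ["A", "B", "K"],
    ["B", "A"],
    ["C", "A", "B"],
]
print(tied_groups(approval_tally(ballots)))
print(tied_groups(approval_tally(ballots, top_n=1)))
```

Grouping by score makes the ties obvious: any group with more than one name is a tie that this method cannot resolve on its own.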

Smith/Minmax

This one does need a computer to count. Using the ballots.txt file we generated in Part 3, we generate results with:

voteengine.py -m s//minmax < ballots.txt > sminmax-data.txt

This will think for a minute or two and then spit out “sminmax-data.txt”, a file containing a bunch of data about how it counted the votes, ending with the final results, in a line in the same format as the ballot:

71 > 41 > 84 > 167 > 196 > 230 > 256 > 258 > 119 > 1 > 20 > 171 > 26 > 73 > 188 > 214 > 3 > 208 > 37 > 247 > 23 > 15 > 162 > 114 > 217 > 240 > 201 > 154 > 142 > 102 > 161 > 143 > 116 > 13 > 185 > 78 > 147 > 62 > 183 > 220 > 223 > 137 > 210 > 88 > 18 > 244 > 67 > 118 > 211 > 81 > 259 > 6 > 86 > 229 > 122 > 197 > 47 > 180 > 257 > 108 > 145 > 35 > 233 > 176 > 141 > 101 > 231 > 56 > 58 > 212 > 103 > 129 > 40 > 204 > 24 > 42 > 252 > 195 > 39 > 187 > 253 > 239 > 218 > 98 > 105 > 21 > 155 > 138 > 33 > 205 > 79 > 243 > 100 > 16 > 19 > 226 > 199 > 224 > 38 > 131 > 169 > 173 > 207 > 68 > 163 > 134 > 96 > 245 > 120 > 72 > 193 > 8 > 249 > 9 > 66 > 123 > 209 > 90 > 153 > 255 > 236 > 10 > 202 > 178 > 121 > 127 > 242 > 53 > 82 > 159 > 237 > 182 > 2 > 30 > 189 > 250 > 148 > 44 > 126 > 170 > 221 > 177 > 29 > 12 > 248 > 174 > 112 > 92 > 50 > 109 > 139 > 151 > 34 > 94 > 146 > 52 > 99 > 117 > 89 > 110 > 28 > 140 > 150 > 45 > 190 > 251 > 106 > 65 > 104 > 175 > 203 > 46 > 61 > 95 > 149 > 70 > 115 > 191 > 235 > 135 > 234 > 85 > 22 > 17 > 184 > 130 > 132 > 260 > 43 > 136 > 213 > 200 > 80 > 111 > 157 > 216 > 181 > 14 > 198 > 164 > 238 > 49 > 97 > 246 > 25 > 75 > 63 > 36 > 133 > 107 > 32 > 124 > 165 > 91 > 179 > 11 > 254 > 158 > 77 > 54 > 222 > 74 > 232 > 125 > 152 > 113 > 215 > 144 > 83 > 5 > 31 > 227 > 186 > 64 > 57 > 168 > 76 > 69 > 261 > 27 > 228 > 93 > 225 > 206 > 241 > 166 > 59 > 192 > 160 > 4 > 60 > 128 > 219 > 51 > 87 > 48 > 194 > 156 > 172 > 7 > 55

Those are the candidate numbers of each album. To get a human-readable list out of that, we need to look up the name of each candidate. Remember that when we generated ballots.txt, we also saved the candidate names to candidates.txt – the name of candidate 1 is on line 1, candidate 2 is on line 2, etc. So we can write another simple Python script, that reads candidates.txt and stores a map of candidate number to candidate name, and then reads the last line of sminmax-data.txt and looks up each candidate name.

The Python script.

Save this script as “interpret-results.py”, and feed the last line of sminmax-data.txt into it with:

tail -n 1 sminmax-data.txt | ./interpret-results.py > sminmax-results.txt
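The linked script isn't reproduced on this page, so here is a minimal sketch of the lookup it is described as doing. This is not the original code: the function names are made up, and the three-line candidate file and result line are tiny stand-ins so the example runs on its own.

```python
import io


def load_candidates(lines):
    """Map candidate number -> name; candidate 1 is on line 1, and so on."""
    return {number: line.strip() for number, line in enumerate(lines, start=1)}


def interpret(result_line, names):
    """Turn a line like '3 > 1 > 2' into the matching list of names."""
    return [names[int(token)] for token in result_line.split(">") if token.strip()]


# Made-up stand-ins for candidates.txt and the last line of the results file
candidates_file = io.StringIO("Album A\nAlbum B\nAlbum C\n")
names = load_candidates(candidates_file)
ranked = interpret("3 > 1 > 2\n", names)
print("\n".join(ranked))
```

To use something like this in the pipeline above, pass `open("candidates.txt")` to `load_candidates` and `sys.stdin.readline()` to `interpret`.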

Now open up sminmax-results.txt and look at the list:

Broken Bells – ‘Broken Bells’
John Grant – ‘Queen of Denmark’
Abe Vigoda – ‘Crush’
Against Me! – ‘White Crosses’
Ali Farka Toure & Toumani Diabate – ‘Ali & Toumani’
Allo Darlin’ – ‘Allo Darlin’
Aloe Blacc – ‘Good Things’
Am – ‘Future Sons And Daughters’
Antony and the Johnsons – ‘Swanlights’
Arcade Fire – ‘The Suburbs’
Ariel Pink’s Haunted Graffiti – ‘Before Today’
Avey Tare – ‘Down There’
Avi Buffalo – ‘Avi Buffalo’
Band of Horses – ‘Infinite Arms’
Baths – ‘Cerulean’
Beach Fossils – ‘Beach Fossils’
Beach House – ‘Teen Dream’
Bear In Heaven – ‘Beast Rest Forth Mouth’
Belle and Sebastian – ‘Write About Love’
Ben Folds & Nick Hornby – ‘Lonely Avenue’
Best Coast – ‘Crazy For You’
Big Boi – ‘Sir Lucious Left Foot: The Son Of Chico Dusty’
Big K.R.I.T. – ‘K.R.I.T. Wuz Here’
Black Angels – ‘Phosphene Dream’
Black Rebel Motorcycle Club – ‘Beat The Devil’s Tattoo’

Woah. That ain’t right.

Broken Bells was ranked 5th by NPR and 11th by Rough Trade. And that’s it. There’s no way they should be anywhere near the top 3.

John Grant was ranked 1st by Mojo – so he’s got that going for him – and 6th by Q. And that’s it. Again, no way he should be ahead of Arcade Fire and Kanye West.

After that it starts spitting out albums in alphabetical order. Remember that in ballots.txt we specified that we’d break ties alphabetically. So this indicates that all the remaining candidates are tied for 3rd – or tied for last, depending how you look at it. That’s not useful at all.

This looks to me like VoteEngine’s s//minmax algorithm is buggy, because these results are just too weird to explain any other way. But life’s too short to debug it when there are 9 other algorithms to test out. FAIL.

Tomorrow, I’ll start going through algorithms that work fairly well, and start looking for the best.


The Best Albums of 2010, part 3: Counting Votes With Pyvote

Posted in the Workshop by Joe on January 3, 2011

Part 1

Part 2

First, a correction: for the last two posts, I’ve been linking to pyvote as the program to automatically count votes. Except I actually used VoteEngine. Natural mistake – they’re both Python programs used to count votes, and “pywhatever” is a common naming scheme for Python.

Short one this time. Last time, we turned 9 end-of-year best album lists into preferential ballots in a standard format. In order to count these ballots with VoteEngine, though, we need one more thing: as well as the file of ballots, VoteEngine needs a complete list of all candidates, which can either be passed on the command line with the “-cands” parameter or added to the ballot file itself. Our candidates are named with numbers counting up from 1, so this is easy for us to generate.

Another parameter that’s helpful is “-tie”, which takes a list of candidates in an order to use as tiebreakers. Whenever a voting system returns a tie between two candidates, the one that appears first in the -tie list is counted as the winner. I’m not actually sure what order is used if a candidate doesn’t appear in the tiebreaker list, but since we’re autogenerating the candidate list anyway it’s easy to always fill in a complete -tie list. We’ll break ties in alphabetical order based on song name.

Since we plan to count the same list of ballots over and over again with different voting methods, it will make things much easier to add these two parameters to the ballot file itself. This is a simple edit to the script we wrote last time. First, since “-cands” needs to come at the start of the file, we delay actually writing lines to the ballot file until after all ballots have been read. Then, after reading all input files and filling in the candidate map (which maps candidate numbers to song names), we write two extra lines at the top, “-cands” and “-tie”, each followed by its list of candidate numbers, and then all the ballot lines.

Here is the updated script, with the new lines highlighted.
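As a rough illustration of just the change (not the linked script itself), the two header lines could be built from the candidate map like so. The candidate numbers, names and ballot lines below are invented for the example, and `header_lines` is a hypothetical helper.

```python
def header_lines(candidates):
    """Build the '-cands' and '-tie' lines from a map of number -> name.

    '-cands' lists every candidate number in numeric order; '-tie' lists
    the same numbers sorted alphabetically by name.
    """
    numbers = sorted(candidates)
    by_name = sorted(candidates, key=lambda n: candidates[n].lower())
    return [
        "-cands " + " ".join(str(n) for n in numbers),
        "-tie " + " ".join(str(n) for n in by_name),
    ]


# Invented numbering, for illustration only
candidates = {
    1: "Caribou – Swim",
    2: "Arcade Fire – The Suburbs",
    3: "Beach House – Teen Dream",
}
ballot_lines = ["1 > 2 > 3", "2 > 3"]

# Header first, then the ballots that were held back until the end
output = header_lines(candidates) + ballot_lines
print("\n".join(output))
```

Sorting case-insensitively is a choice made here so that lowercase names don't all sink to the bottom of the tiebreak order.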

This will output a ballots.txt that looks like this: identical to the one from last post, but with two more lines at the start.

-cands 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260

-tie 85 167 195 229 255 257 119 1 20 171 26 74 187 213 3 207 37 246 23 15 162 114 216 239 200 154 142 72 102 161 143 116 13 184 79 147 63 182 219 222 137 209 89 18 243 68 118 210 82 258 6 87 228 122 196 47 179 256 108 145 35 232 175 141 101 230 56 58 211 103 129 40 203 24 42 251 194 39 186 252 238 217 98 105 21 155 138 33 204 80 242 100 16 19 225 198 223 38 131 169 62 206 69 163 134 244 120 73 192 8 248 9 41 67 123 208 91 153 254 235 10 201 177 121 127 241 53 83 159 236 181 2 30 188 249 148 44 126 170 220 176 29 12 247 173 112 93 50 109 139 151 34 95 146 52 99 117 90 110 28 140 150 45 189 250 106 66 104 174 202 46 61 96 149 71 115 190 234 135 233 86 22 17 183 130 132 259 43 136 212 199 81 111 157 215 180 14 197 164 237 49 97 245 25 76 64 36 133 107 32 124 165 92 178 11 253 158 78 54 221 75 231 125 152 113 214 144 84 5 31 226 185 65 57 168 77 70 260 27 227 94 224 205 240 166 59 191 160 4 60 128 218 51 88 48 193 156 172 7 55

41 > 1 > 12 > 35 > 11 > 46 > 52 > 71 > 78 > 122 > 6 > 95 > 9 > 50 > 26 > 20 > 130 > 13 > 132 > 19 > 134 > 136 > 81 > 144 > 42 > 16 > 40 > 68 > 21 > 8 > 28 > 22 > 31 > 60 > 29 > 2 > 92 > 100 > 172 > 175 > 94 > 177 > 178 > 36 > 183 > 96 > 192 > 67 > 195 > 34

1 > 22 > 115 > 30 > 4 > 41 > 16 > 5 > 46 > 12 > 11 > 21 > 3 > 38 > 7 > 42 > 67 > 50 > 19 > 133 > 74 > 137 > 2 > 143 > 53 > 83 > 52 > 26 > 69 > 154 > 158 > 75 > 166 > 54 > 93 > 31 > 78 > 8 > 62 > 96 > 37 > 94 > 180 > 60 > 182 > 9 > 191 > 35 > 6 > 97

70 > 1 > 3 > 2 > 30 > 42 > 55 > 43 > 44 > 75 > 5 > 13 > 105 > 125 > 7 > 19 > 129 > 51 > 12 > 6 > 8 > 36 > 20 > 4 > 26 > 148 > 23 > 150 > 45 > 83 > 18 > 163 > 109 > 10 > 33 > 169 > 170 > 15 > 173 > 14 > 97 > 110 > 93 > 50 > 39 > 46 > 190 > 90 > 64 > 98 > 63 > 108 > 201 > 203 > 206 > 99 > 9 > 89 > 211 > 37 > 24 > 21 > 61 > 218 > 17 > 16 > 221 > 35 > 225 > 102 > 40 > 112 > 69 > 25 > 231

13 > 21 > 70 > 116 > 71 > 118 > 63 > 80 > 48 > 32 > 72 > 26 > 5 > 56 > 3 > 128 > 61 > 89 > 11 > 84 > 1 > 43 > 142 > 145 > 9 > 34 > 51 > 44 > 2 > 41 > 159 > 164 > 64 > 60 > 18 > 99 > 24 > 23 > 174 > 30 > 4 > 35 > 7 > 52 > 185 > 14 > 189 > 33 > 55 > 198 > 199 > 62 > 202 > 59 > 100 > 207 > 20 > 209 > 210 > 212 > 213 > 215 > 91 > 108 > 37 > 219 > 222 > 223 > 226 > 227 > 228 > 229 > 36 > 112 > 232 > 66 > 236 > 237 > 238 > 25 > 241 > 19 > 243 > 245 > 107 > 65 > 39 > 247 > 109 > 110 > 114 > 68 > 251 > 252 > 255 > 256 > 257 > 40 > 38 > 260

10 > 6 > 1 > 2 > 73 > 8 > 19 > 29 > 121 > 17 > 4 > 7 > 15 > 23 > 27 > 47 > 3 > 76 > 45 > 12 > 18 > 79 > 16 > 88 > 106 > 5 > 33 > 28 > 14 > 11 > 49 > 162 > 165 > 167 > 22 > 113 > 57 > 58 > 13 > 92

10 > 11 > 101 > 1 > 73 > 4 > 47 > 22 > 58 > 2 > 54 > 123 > 124 > 17 > 5 > 127 > 3 > 53 > 29 > 28 > 15 > 49 > 141 > 146 > 76 > 7 > 149 > 90 > 31 > 86

10 > 1 > 3 > 25 > 17 > 14 > 4 > 119 > 48 > 6 > 9 > 18 > 2 > 7 > 77 > 5 > 104 > 85 > 24 > 20 > 44 > 139 > 43 > 107 > 147 > 32 > 8 > 87 > 34 > 82 > 51 > 59 > 12 > 27 > 36 > 84 > 62 > 171 > 80 > 33 > 66 > 176 > 15 > 40 > 61 > 186 > 187 > 111 > 197 > 56

1 > 11 > 5 > 117 > 72 > 2 > 4 > 25 > 3 > 10 > 16 > 14 > 91 > 126 > 74 > 65 > 6 > 49 > 8 > 7 > 135 > 140 > 102 > 9 > 79 > 12 > 81 > 151 > 153 > 155 > 157 > 161 > 37 > 15 > 23 > 168 > 68 > 13 > 31 > 53 > 98 > 28 > 179 > 24 > 184 > 54 > 188 > 57 > 196 > 113 > 64 > 200 > 19 > 204 > 205 > 22 > 208 > 27 > 58 > 17 > 214 > 216 > 217 > 18 > 38 > 29 > 220 > 224 > 32 > 69 > 39 > 95 > 63 > 230 > 233 > 234 > 235 > 67 > 239 > 240 > 30 > 242 > 244 > 106 > 103 > 20 > 246 > 248 > 111 > 101 > 104 > 249 > 250 > 253 > 254 > 105 > 258 > 114 > 259 > 21

10 > 2 > 6 > 15 > 3 > 4 > 9 > 120 > 20 > 27 > 1 > 8 > 45 > 24 > 17 > 14 > 13 > 103 > 131 > 66 > 57 > 138 > 82 > 38 > 25 > 59 > 39 > 5 > 152 > 77 > 156 > 160 > 65 > 18 > 16 > 56 > 55 > 86 > 23 > 85 > 87 > 47 > 32 > 181 > 21 > 34 > 193 > 194 > 48 > 88

With this file, we can count the votes using a given method by running “voteengine.py -m <method> < ballots.txt”, which feeds ballots.txt to VoteEngine on stdin. For a complete list of voting methods, see the VoteEngine docs.

Ok, preparations out of the way, tomorrow I’ll start evaluating voting systems!

The Best Albums of 2010, part 2: Making a Ballot

Posted in the Workshop by Joe on January 2, 2011

Yesterday I linked to 9 lists of the Best Albums of 2010, from magazines, web sites, and one radio show, and promised to explain how we distilled those lists into one canonical Top 25. The first step is to turn each list into a preferential election ballot.

The standard First Past the Post voting system used in pretty much every political election in North America is… not very good. Its one virtue is that it’s simple to explain: you can vote for one, and only one, candidate, and the candidate with the most votes wins. The problem is that this often makes it really tough to decide who to vote for. To get your preferred result, you need to consider a whole host of things other than “how good this candidate is”. You need to vote tactically. The obvious example is when you don’t think your favourite candidate can win – do you vote for them anyway, or switch your vote to your second choice? (It’s a tough decision if your candidate has ALMOST enough support to be viable.) Another example is if you mainly want one candidate to LOSE – you still need to pick one of their opponents to vote for. There are more serious problems in an election like the Canadian Parliament or US Electoral College, where a bunch of individual elections each elect one winner, and then the team with the most winners gets the grand prize, but let’s consider only elections where everyone votes directly for a single winner, like a city mayor or state governor.

Every voter can, in theory, rank all the candidates in order of preference (although if they don’t have strong opinions or aren’t very informed, that ranking may just have their favored candidate in first and everyone else tied for last…) Judging whether a voting system is any good basically involves measuring how much a voter’s “true preferences” contribute to the election’s outcome. First past the post is a poor system because it forces voters to leave out most of the information about their preferences, and in fact encourages them to “lie” by listing a candidate who isn’t actually their first choice. A better system would give each voter a more complicated ballot in which they could list all their preferences. Say, by putting a 1 next to their favoured candidate, a 2 next to their second candidate, etc. Or with a computer touch screen which removes each candidate’s name as the voter touches it, and keeps track of the order the voter chose them in. Or, although this has obvious practical problems in a large election, by writing down each candidate’s name in order on a sheet of paper (which is exactly what we have with our 9 best-of-2010 lists!) There are many physical ballots we can imagine that could record a voter’s complete preferences. But in order to study or simulate a voting system, it’s helpful to have a standard notation. Whoever counts the votes – human or computer – can start by translating each physical ballot into this standard notation.

The notation commonly used is to give each candidate a symbol (such as the first letter of their name or party), and for each ballot, list the symbols for each candidate on one line, from the most liked on the left to the least liked on the right. The symbols are separated by “>” to show that the candidate to the left is preferred to the candidate on the right, or “=” to show that they’re tied. (And, if not all candidates are listed, all the unlisted candidates are assumed to all be tied for last place.)

So, with the Canadian political parties Conservative (C), Liberal (L), New Democrat (N), and Green (G) – assume we are not in Quebec so the Bloc Quebecois is unavailable – we have examples like the following:

The extreme right-winger would like the Conservatives to win, and at all costs wants the NDP and Green Party to lose – they’d vote “C > L > N = G”.

The extreme left-winger is the opposite – “N = G > L > C” (assuming they can’t decide between NDP and Green).

Or you may have an NDP supporter who thinks the Green Party is also a good left wing choice, but that the Liberals are no better than the Conservatives – “N > G > C = L”.

Or an NDP supporter who thinks that the Green Party are a bunch of upstarts who can’t be trusted – “N > C = L > G”.

Or any weird and wonderful combination of these.
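The notation is simple enough to parse in a couple of lines. Here is a small sketch that splits a ballot string into tiers and then assigns each candidate a rank, with unlisted candidates tied for last as described above. The function names are illustrative only.

```python
def parse_ballot(text):
    """Split 'C > L > N = G' into tiers, best first: [['C'], ['L'], ['N', 'G']]."""
    return [[symbol.strip() for symbol in tier.split("=")] for tier in text.split(">")]


def rank_of(tiers, all_candidates):
    """Map each candidate to its tier index; unlisted candidates share last place."""
    ranks = {}
    for position, tier in enumerate(tiers):
        for symbol in tier:
            ranks[symbol] = position
    for symbol in all_candidates:
        ranks.setdefault(symbol, len(tiers))
    return ranks


right_winger = parse_ballot("C > L > N = G")
print(right_winger)
print(rank_of(parse_ballot("N > G"), "CLNG"))
```

In the second example C and L are not listed, so they both land in the tier after G.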

So, we have 9 lists of songs, with over 200 songs between them. The first thing we need to do is get a symbol for each song, and make sure we can turn that symbol back into a name when it’s time to output the results. Save each list into a text file (with just the songs, one per line – no rank numbers). This may take some cutting and pasting. Then go through each list and make sure that each song is spelled EXACTLY the same each time it appears (the Unix commands sort and uniq may help with this).
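If sort and uniq aren't catching everything, a few lines of Python using the standard difflib module can flag entries that are almost, but not exactly, the same. This is only a sketch: the 0.85 cutoff is a guess that would need tuning, and `near_duplicates` is a made-up helper name.

```python
import difflib


def near_duplicates(names, cutoff=0.85):
    """Return pairs of distinct entries that look like the same title."""
    unique = sorted(set(names))
    pairs = []
    for i, name in enumerate(unique):
        # Only compare against later entries so each pair is reported once
        for match in difflib.get_close_matches(name, unique[i + 1:], n=3, cutoff=cutoff):
            pairs.append((name, match))
    return pairs


names = [
    "LCD Soundsystem – This Is Happening",
    "LCD Soundsystem – This is Happening",
    "Caribou – Swim",
]
for a, b in near_duplicates(names):
    print(repr(a), "vs", repr(b))
```

Anything it reports still needs a human to decide which spelling wins.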

Now we want to read all the text files and turn them into two data structures: a map of Candidate Number to song name, and a set of ballots in the above format. Since I’m using pyvote, a Python program to count votes, the natural way to do this conversion is to write another Python script.

The script will read each file given on the command line and write out two files. “candidates.txt” is the complete list of all candidate songs, in order of Candidate Number – given a number, the song name is on that line number of candidates.txt. “ballots.txt” is the list of ballots – so for our best-of lists, it will be 9 lines long. Since WordPress removes indentation in code blocks, which is fatal for Python code, I’ve put the script on pastebin:

The Python Script.

It’s pretty simple – read each line of each file, generate a number for the song, and save the name and number in a map called “candidates”. For each file, write the numbers of each song as a string joined by ” > ” to “ballots.txt” – since none of these lists have ties, we don’t need to worry about “=”. Finally, sort the candidate map by number, and write each name in “candidates.txt”.

Save the Python file as “convertLists2Ballots.py” and run it with “convertLists2Ballots.py <list of text files>”.
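Since the real script lives on pastebin, here is a minimal sketch of the same idea working on in-memory lists instead of files. It is not the original code. In particular, numbering candidates in order of first appearance is an assumption, so the numbers it produces won't necessarily match the ones shown in this post.

```python
def lists_to_ballots(lists):
    """Convert ranked lists of names into numbered ballots.

    lists: one list of names per publication, best first.
    Returns (candidates, ballots), where candidates[i] is the name of
    candidate number i + 1 and each ballot is a string like '2 > 3 > 1'.
    """
    numbers = {}  # name -> candidate number
    ballots = []
    for names in lists:
        ballot = []
        for name in names:
            if name not in numbers:
                numbers[name] = len(numbers) + 1  # assumption: number by first appearance
            ballot.append(str(numbers[name]))
        ballots.append(" > ".join(ballot))
    candidates = sorted(numbers, key=numbers.get)
    return candidates, ballots


# Made-up lists standing in for the contents of the text files
lists = [
    ["Album A", "Album B"],
    ["Album B", "Album C", "Album A"],
]
candidates, ballots = lists_to_ballots(lists)
print("\n".join(candidates))
print("\n".join(ballots))
```

Writing `candidates` and `ballots` out one per line would give files shaped like candidates.txt and ballots.txt.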

The candidates.txt file this spits out isn’t very interesting, but ballots.txt looks like this:

41 > 1 > 12 > 35 > 11 > 46 > 52 > 70 > 77 > 122 > 6 > 94 > 9 > 50 > 26 > 20 > 130 > 13 > 132 > 19 > 134 > 136 > 80 > 144 > 42 > 16 > 40 > 67 > 21 > 8 > 28 > 22 > 31 > 60 > 29 > 2 > 91 > 100 > 172 > 176 > 93 > 178 > 179 > 36 > 184 > 95 > 193 > 66 > 196 > 34

1 > 22 > 115 > 30 > 4 > 41 > 16 > 5 > 46 > 12 > 11 > 21 > 3 > 38 > 7 > 42 > 66 > 50 > 19 > 133 > 73 > 137 > 2 > 143 > 53 > 82 > 52 > 26 > 68 > 154 > 158 > 74 > 166 > 54 > 92 > 31 > 77 > 8 > 173 > 95 > 37 > 93 > 181 > 60 > 183 > 9 > 192 > 35 > 6 > 97

69 > 1 > 3 > 2 > 30 > 42 > 55 > 43 > 44 > 74 > 5 > 13 > 105 > 125 > 7 > 19 > 129 > 51 > 12 > 6 > 8 > 36 > 20 > 4 > 26 > 148 > 23 > 150 > 45 > 82 > 18 > 163 > 109 > 10 > 33 > 169 > 170 > 15 > 174 > 14 > 97 > 110 > 92 > 50 > 39 > 46 > 191 > 89 > 63 > 98 > 62 > 108 > 202 > 204 > 207 > 99 > 9 > 88 > 212 > 37 > 24 > 21 > 61 > 219 > 17 > 16 > 222 > 35 > 226 > 102 > 40 > 112 > 68 > 25 > 232

13 > 21 > 69 > 116 > 70 > 118 > 62 > 79 > 48 > 32 > 71 > 26 > 5 > 56 > 3 > 128 > 61 > 88 > 11 > 83 > 1 > 43 > 142 > 145 > 9 > 34 > 51 > 44 > 2 > 41 > 159 > 164 > 63 > 60 > 18 > 99 > 24 > 23 > 175 > 30 > 4 > 35 > 7 > 52 > 186 > 14 > 190 > 33 > 55 > 199 > 200 > 96 > 203 > 59 > 100 > 208 > 20 > 210 > 211 > 213 > 214 > 216 > 90 > 108 > 37 > 220 > 223 > 224 > 227 > 228 > 229 > 230 > 36 > 112 > 233 > 65 > 237 > 238 > 239 > 25 > 242 > 19 > 244 > 246 > 107 > 64 > 39 > 248 > 109 > 110 > 114 > 67 > 252 > 253 > 256 > 257 > 258 > 40 > 38 > 261

10 > 6 > 1 > 2 > 72 > 8 > 19 > 29 > 121 > 17 > 4 > 7 > 15 > 23 > 27 > 47 > 3 > 75 > 45 > 12 > 18 > 78 > 16 > 87 > 106 > 5 > 33 > 28 > 14 > 11 > 49 > 162 > 165 > 167 > 22 > 113 > 57 > 58 > 13 > 91

10 > 11 > 101 > 1 > 72 > 4 > 47 > 22 > 58 > 2 > 54 > 123 > 124 > 17 > 5 > 127 > 3 > 53 > 29 > 28 > 15 > 49 > 141 > 146 > 75 > 7 > 149 > 89 > 31 > 85

10 > 1 > 3 > 25 > 17 > 14 > 4 > 119 > 48 > 6 > 9 > 18 > 2 > 7 > 76 > 5 > 104 > 84 > 24 > 20 > 44 > 139 > 43 > 107 > 147 > 32 > 8 > 86 > 34 > 81 > 51 > 59 > 12 > 27 > 36 > 83 > 96 > 171 > 79 > 33 > 65 > 177 > 15 > 40 > 61 > 187 > 188 > 111 > 198 > 56

1 > 11 > 5 > 117 > 71 > 2 > 4 > 25 > 3 > 10 > 16 > 14 > 90 > 126 > 73 > 64 > 6 > 49 > 8 > 7 > 135 > 140 > 102 > 9 > 78 > 12 > 80 > 151 > 153 > 155 > 157 > 161 > 37 > 15 > 23 > 168 > 67 > 13 > 31 > 53 > 98 > 28 > 180 > 24 > 185 > 54 > 189 > 57 > 197 > 113 > 63 > 201 > 19 > 205 > 206 > 22 > 209 > 27 > 58 > 17 > 215 > 217 > 218 > 18 > 38 > 29 > 221 > 225 > 32 > 68 > 39 > 94 > 62 > 231 > 234 > 235 > 236 > 66 > 240 > 241 > 30 > 243 > 245 > 106 > 103 > 20 > 247 > 249 > 111 > 101 > 104 > 250 > 251 > 254 > 255 > 105 > 259 > 114 > 260 > 21

10 > 2 > 6 > 15 > 3 > 4 > 9 > 120 > 20 > 27 > 1 > 8 > 45 > 24 > 17 > 14 > 13 > 103 > 131 > 65 > 57 > 138 > 81 > 38 > 25 > 59 > 39 > 5 > 152 > 76 > 156 > 160 > 64 > 18 > 16 > 56 > 55 > 85 > 23 > 84 > 86 > 47 > 32 > 182 > 21 > 34 > 194 > 195 > 48 > 87

Incomprehensible to a human, but pyvote will be able to read this and then try lots of different voting systems on it. Tomorrow I’ll show how to make this happen.

(As an aside, instead of a bunch of text files, it’s pretty common to get data you want to turn into votes as a spreadsheet. To process this with Python, save your spreadsheet in CSV format – that’s “comma-separated values”, a simple text format that can’t handle formatting or formulas – and then read it into Python using the csv module.)
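For instance, assuming a spreadsheet with one column per publication and one row per rank (that layout, and the names in it, are made up for this example), the csv module can turn it into one ranked list per column:

```python
import csv
import io

# Made-up CSV export: header row of list names, then one row per rank
data = io.StringIO(
    "List A,List B\n"
    "Album X,Album Y\n"
    "Album Y,Album Z\n"
)

reader = csv.DictReader(data)
columns = {name: [] for name in reader.fieldnames}
for row in reader:
    for name, album in row.items():
        if album:  # shorter lists leave blank cells at the bottom
            columns[name].append(album)

print(columns)
```

With a real file, replace the `io.StringIO(...)` with `open("lists.csv", newline="")`, where lists.csv is whatever the spreadsheet was saved as.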

The Best Albums of 2010 (According to Schulze)

Posted in the Workshop by Joe on December 31, 2010

Over on my lovely wife’s music blog is a list of the Top Ten Albums of 2010, culled from nine separate end-of-year lists:

All Songs Considered (NPR)
MOJO
NME
Pitchfork
Q
Rolling Stone
Rough Trade
Stereogum
Spin

These lists actually have 25 to 100 entries on them, and they don’t all have the same entries or in the same order. The Rough Trade list doesn’t have a single one of our Top 10 in their top 10! (And our #3 song isn’t on their list at all!) So where’d we come up with these numbers?

Voting theory!

Treat each end-of-year list as a ballot in an election. In a standard election, several thousand or several million people all choose from a handful of candidates. Here we have just 9 ballots cast to choose between several hundred songs. So it’s the reverse of the elections that are commonly studied, but there’s no reason standard election tools wouldn’t work. And it’s an edge case that might reveal interesting properties of the methods used to count the votes.

I counted the votes using 10 separate methods (using vote-counting software – I’m not insane!) and then we picked the result that seemed to make the most sense. Not scientific at all, but nobody said this was a scientific study – it’s just for curiosity. We ended up choosing the results returned by the Schulze Method, the same method used by elections in a lot of open source software groups (including Debian and Gentoo).

Tomorrow I’ll explain how to turn a list into a ballot (and what that even means), and how to use pyvote to process the ballots. Then I’ll go through each of the 10 voting systems, describe them, and discuss the results. That’ll take a while… (A lot longer than it did to actually generate them!)

In the meantime, here’s the complete list, in order. All 261 of them.

1. Arcade Fire – The Suburbs
2. Beach House – Teen Dream
3. Kanye West – My Beautiful Dark Twisted Fantasy
4. LCD Soundsystem – This is Happening
5. Deerhunter – Halcyon Digest
6. Vampire Weekend – Contra
7. The National – High Violet
8. Janelle Monae – The ArchAndroid
9. The Black Keys – Brothers
10. Yeasayer – Odd Blood
11. MGMT – Congratulations
12. Joanna Newsom – Have One On Me
13. Ariel Pink’s Haunted Graffiti – Before Today
14. Big Boi – Sir Lucious Left Foot: The Son of Chico Dusty
15. Caribou – Swim
16. Best Coast – Crazy For You
17. Grinderman – Grinderman 2
18. Robyn – Body Talk Pt. 1
19. Crystal Castles – Crystal Castles
20. Gorillaz – Plastic Beach
21. Sleigh Bells – Treats
22. Flying Lotus – Cosmogramma
23. Gil Scott-Heron – I’m New Here
24. Avi Buffalo – Avi Buffalo
25. Robert Plant – Band of Joy
26. Sufjan Stevens – The Age of Adz
27. Kings of Leon – Come Around Sundown
28. M.I.A. – MAYA
29. Neil Young – Le Noise
30. Tame Impala – Innerspeaker
31. John Grant – Queen of Denmark
32. Laura Marling – I Speak Because I Can
33. Edwyn Collins – Losing Sleep
34. Matthew Dear – Black City
35. The Dead Weather – Sea of Cowards
36. The Roots – How I Got Over
37. Titus Andronicus – The Monitor
38. Emeralds – Does it Look Like I’m Here?
39. Hot Chip – One Life Stand
40. Foals – Total Life Forever
41. These New Puritans – Hidden
42. Wild Nothing – Gemini
43. Broken Bells – Broken Bells
44. Liars – Sisterworld
45. Salem – King Night
46. Warpaint – The Fool
47. Glasser – Ring
48. Belle and Sebastian – Write About Love
49. Manic Street Preachers – Postcards From a Young Man
50. Surfer Blood – Astro Coast
51. Swans – My Father Will Guide Me Up a Rope To the Sky
52. Four Tet – There Is Love in You
53. Charlotte Gainsbourg – IRM
54. Drake – Thank Me Later
55. Field Music – Measure
56. Gold Panda – Lucky Shiner
57. Jamey Johnson – The Guitar Song
58. No Age – Everything in Between
59. Oneohtrix Point Never – Returnal
60. Paul Weller – Wake Up the Nation
61. Band of Horses – Infinite Arms
62. Crocodiles – Sleep Forever
63. Interpol – Interpol
64. Midlake – The Courage of Others
65. Perfume Genius – Learning
66. Phosphorescent – Here’s To Taking it Easy
67. Spoon – Transference
68. Superchunk – Majesty Shredding
69. The Coral – Butterfly House
70. The Morning Benders – Big Echo
71. The Walkmen – Lisbon
72. Das Racist – Shut Up, Dude / Sit Down, Man
73. Eminem – Recovery
74. The-Dream – Love King
75. Twin Shadow – Forget
76. The Tallest Man on Earth – The Wild Hunt
77. Cee Lo Green – The Lady Killer
78. Danger Mouse and Sparklehorse – Dark Night of the Soul
79. John Legend and The Roots – Wake Up!
80. Elton John and Leon Russell – The Union
81. Erykah Badu – New Amerykah, Pt. 2: Return of the Ankh
82. Mavis Staples – You Are Not Alone
83. Of Montreal – False Priest
85. Peter Gabriel – Scratch My Back
86. Sharon Jones and the Dap-Kings – I Learned the Hard Way
87. Sharon Van Etten – Epic
88. The Gaslight Anthem – American Slang
89. Tom Jones – Praise and Blame
90. Villagers – Becoming a Jackal
91. Mount Kimbie – Crooks and Lovers
92. Teenage Fanclub – Shadows
93. Zola Jesus – Stridulum II
94. Abe Vigoda – Crush
95. Broken Social Scene – Forgiveness Rock Record
96. Delorean – Subiza
97. Dum Dum Girls – I Will Be
98. Frightened Rabbit – The Winter of Mixed Drinks
99. Gayngs – Relayted
100. Gonjasufi – A Sufi and a Killer
101. jj – jj no 3
102. Jonsi – Go
103. Klaxons – Surfing the Void
104. Male Bonding – Nothing Hurts
105. Marina and the Diamonds – The Family Jewels
106. My Chemical Romance – Danger Days
107. Mystery Jets – Serotonin
108. Rick Ross – Teflon Don
109. Tamaryn – The Waves
110. The Drums – The Drums
111. Everything Everything – Man Alive
112. Steve Mason – Boys Outside
113. Wavves – King of the Beach
114. Black Angels – Phosphene Dream
115. Antony and the Johnsons – Swanlights
116. Caitlin Rose – Own Side Now
117. Darkstar – North
118. Doug Paisley – Constant Companion
119. James Blake – The Bells Sketch
120. How to Dress Well – Love Remains
121. Girls – Broken Dreams Club
122. John Mellencamp – No Better Than This
123. Kid Cudi – Man on the Moon II: The Legend of Mr. Rager
124. Big K.R.I.T. – K.R.I.T. Wuz Here
125. Marnie Stern – Marnie Stern
126. Ceo – White Magic
127. Avey Tare – Down There
128. Lower Dens – Twin Hand Movement
129. Frank (Just Frank) – The Brutal Wave
130. Baths – Cerulean
131. Mumford and Sons – Sigh No More
132. Local Natives – Gorilla Manor
133. Plan B – The Defamation of Strickland Banks
134. Ray LaMontagne and the Pariah Dogs – God Willin’ and the Creek Don’t Rise
135. New Pornographers – Together
136. Massive Attack – Heligoland
137. Josh Ritter – So Runs the World Away
139. Roots Manuva meets Wrongtom – Duppy Writer
140. Rumer – Seasons of My Soul
141. Isobel Campbell and Mark Lanegan – Hawk
142. Sam Amidon – I See the Sign
143. She and Him – Volume II
144. Bruce Springsteen – The Promise
145. Small Black – New Chain
146. Take That – Progress
147. Corinne Bailey Rae – The Sea
148. Bryan Ferry – Olympia
149. Brandon Flowers – Flamingo
150. Taylor Swift – Speak Now
151. Kid Rock – Born Free
152. Elizabeth Cook – Welder
153. Maximum Balloon – Maximum Balloon
154. Peter Wolf – Midnight Souvenirs
155. Ted Leo and the Pharmacists – The Brutalist Bricks
156. Against Me! – White Crosses
157. The Chemical Brothers – Further
158. The Fall – Your Future, Our Clutter
159. Factory Floor – Untitled
160. Les Savy Fav – Root for Ruin
161. New Young Pony Club – The Optimist
162. Islet – Wimmy
163. Hurts – Happiness
164. Lonelady – Nerve Up
165. Magnetic Man – Magnetic Man
166. Pulled Apart By Horses – Pulled Apart By Horses
167. Kelis – Flesh Tone
168. First Aid Kit – The Big Black and the Blue
169. Ikonika – Contact, Love, Want, Have
170. Errors – Come Down With Me
171. The Fresh and Onlys – Play It Strange
172. The Jim Jones Revue – Burning Your House Down
173. The White Stripes – Under Great White Northern Lights
174. Drive-By Truckers – The Big To-Do
175. Carolina Chocolate Drops – Genuine Negro Jig
176. Laura Veirs – July Flame
177. Dr. Dog – Shame Shame
178. Bob Dylan – The Witmark Demos
179. Gogol Bordello – Trans-Continental Hustle
180. Trampled By Turtles – Palomino
181. Johnny Cash – American VI: Ain’t No Grave
182. The Hold Steady – Heaven Is Whenever
183. Black Rebel Motorcycle Club – Beat the Devil’s Tattoo
184. Freelance Whales – Weathervanes
185. Los Lobos – Tin Can Trust
186. Tom Petty and the Heartbreakers – Mojo
187. Elvis Costello – National Ransom
188. Richard Thompson – Dream Attic
189. Ra Ra Riot – The Orchard
190. Justin Townes Earle – Harlem River Blues
191. Blitzen Trapper – Destroyer of the Void
192. Trent Reznor And Atticus Ross – The Social Network Soundtrack
193. Goldfrapp – Head First
194. Jakob Dylan – Women and Country
195. Ben Folds and Nick Hornby – Lonely Avenue
196. Jimi Hendrix – Valleys of Neptune
197. Leonard Cohen – Songs From the Road
198. OK Go – Of the Blue Colour of the Sky
199. The Books – The Way Out
200. Junip – Field
201. Deer Tick – Black Dirt Sessions
202. Sade – Soldier of Love
203. Tricky – Mixed Race
204. I Am Kloot – Sky at Night
205. Skream – Outside the Box
206. Cherry Ghost – Beneath This Burning Shoreline
207. Two Door Cinema Club – Tourist History
208. Voice of the Seven Thunders – Voice of the Seven Thunders
209. Brian Eno – Small Craft on a Milk Sea
210. Dylan LeBlanc – Paupers Field
211. Konono No 1 – Assume Crash Position
212. Smoke Fairies – Through Low Light and Trees
213. PVT – Church With No Magic
214. The Soft Pack – The Soft Pack
215. O Children – O Children
216. Holly Miranda – The Magician’s Private Library
217. Sea of Bees – Songs For the Ravens
218. Pantha Du Prince – Black Noise
220. Cours Lapin – Cours Lapin
221. Darwin Deez – Darwin Deez
222. School of Seven Bells – Disconnect From Desire
223. Beach Fossils – Beach Fossils
224. Shit Robot – From the Cradle to the Rave
225. Chilly Gonzales – Ivory Tower
226. Connan Mockasin – Please Turn Me Into the Snat
227. Holy Fuck – Latin
228. The School – Loveless Unbeliever
229. Tobacco – Maniac Meat
230. Dios – We Are Dios
231. Allo Darlin’ – Allo Darlin’
232. El Guincho – Pop Negro
233. Kort (Kurt Wagner and Cortney Tidwell) – Invariable Heartache
234. Solar Bears – She Was Coloured In
235. Free Energy – Stuck on Nothing
236. Kings Go Forth – The Outsiders Are Back
237. Dan Michaelson and the Coastguards – Shakes
238. Stornoway – Beachcomber’s Windowsill
239. Magic Kids – Memphis
240. Fool’s Gold – Fool’s Gold
241. Frankie Rose and the Outs – Frankie Rose and the Outs
242. Aloe Blacc – Good Things
243. Drums of Death – Generation Hexed
244. AM – Future Sons and Daughters
245. Time and Space Machine – Set Phazer to Stun
246. Walls – Walls
247. The Dillinger Escape Plan – Option Paralysis
248. Happy Birthday – Happy Birthday
249. The Eighties Matchbox B-Line Disaster – Blood and Fire
250. Woods – At Echo Lake
251. Tyler, The Creator – Bastard
252. Kylesa – Spiral Shadow
253. Women – Public Strain
254. Forest Swords – Dagger Paths
255. Wyatt, Atzmon, Stephens – For the Ghosts Within
256. Eli Paperboy Reed – Come and Get It
257. Kelley Stoltz – To Dreamers
258. The Besnard Lakes – Are the Roaring Night
259. Roky Erickson with Okkervil River – True Love Cast Out All Evil
260. Jane Weaver and Septieme Soeur – The Fallen By Watch Bird
261. Ali Farka Toure and Toumani Diabate – Ali and Toumani

Advogato #2

Posted in the Junk Room by Joe on April 18, 2008

Originally posted on Advogato on March 25, 2004:

Wonderful timing – just after I got all excited in my last post, Mark Hahn announced Prothon, a prototype-based language closely based on Python.

Pros: they advertise a better interpreter (but why not just use the same terp for Python?), no rough edges like the __init__ problem.

Cons: not actually Python compatible, so you can’t use existing class libraries.

I imagine I’ll look quickly at it, say, “Very nice,” and then keep using Python. Because this is pre-alpha, and Python works now. And hacking with metaclasses is fun!

Commentary: prothon.org is dead, and I assume the language went with it. I stopped following it because the author kept adding his own pet syntax changes and preferences that had nothing to do with prototypes, so it got farther and farther from Python.

Advogato #1

Originally posted on Advogato on March 22, 2004:

Fun with Python

A. M. Kuchling, in rec.arts.int-fiction, just showed me a really neat Python trick which he attributed to Michael Hudson. But first, some background:

I use a special-purpose language called Inform to write interactive fiction. It’s sort of a hybrid between standard class-based and prototype-based OO languages – there’s a shorthand to create a single object and add properties and methods to it, but if you want to create a bunch of identical objects you still need to create a class for them. It’s very handy for world modelling. Most of the specialized IF languages use the same scheme.

I’ve been thinking for a while about using a standard language – Python or Ruby, by preference – since every IF language has annoyances and weirdnesses. This involves writing a standard library for world-modelling, which is a big job. There are already a few for Python, but they’re really cumbersome compared to the Inform standard library, because creating a new object with a few specialized behaviours always involves both a new class and a new object.

Behold – prototyped Python (direct from amk’s post)! Now we get to see how well the whitespace survives its trip through HTML and back:

Wow. The answer to that is “not at all”. The <pre> tag appears to do precisely nothing.

Instead, the code is on my Wiki page at work. To summarize: it lets you say “class SpecialRoom(Room)” and get both a class SpecialRoom and an instance (also conveniently named SpecialRoom) automatically.
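Since the original code didn't survive, here is a minimal sketch of how such a trick can work. It is not amk's or Michael Hudson's code: it assumes Python 3.7 or later (it leans on the `__mro_entries__` hook so that an instance can appear in a base list), and the names `PrototypeMeta`, `Prototype` and `describe` are made up for illustration.

```python
class PrototypeMeta(type):
    """Hypothetical metaclass: a class statement yields an instance."""

    def __new__(mcls, name, bases, namespace):
        # Build the real class as usual...
        cls = super().__new__(mcls, name, bases, namespace)
        # ...but hand back an instance of it, so the name in the class
        # statement ends up bound to an object rather than to the class.
        return cls()


class Prototype(metaclass=PrototypeMeta):
    def __mro_entries__(self, bases):
        # Lets an instance be listed as a base: Python substitutes the
        # class behind the instance (Python 3.7+).
        return (type(self),)


class Room(Prototype):
    description = "An empty room."

    def describe(self):
        return self.description


class SpecialRoom(Room):
    description = "A room with a trapdoor."


# SpecialRoom is an instance; its class is reachable through type().
print(SpecialRoom.describe())
print(type(SpecialRoom).__name__)
```

Under these assumptions, `issubclass(type(SpecialRoom), type(Room))` is true, while `issubclass(SpecialRoom, Room)` raises a TypeError because neither name refers to a class.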

This still has a few weirdnesses. Unlike a true prototype-based language, there are still classes and objects, so you can’t directly say “issubclass(SpecialRoom, Room)” – you need to “issubclass(SpecialRoom.__class__, Room.__class__)”. I’ll need to experiment and read up some more to find out exactly how the class and the instance differ, and decide how to make it more seamless (or even if it’s worth it). This also can’t handle dynamically creating objects, which can be handled by adding a clone() method:

    def clone(self):
        class anon(self): pass
        return anon

Except I’m not sure exactly where to put this. I tried initializing it in Prototype.__new__, which works fine but apparently isn’t correct – that’s what Prototype.__init__ is for, except it’s not actually getting called in the above code.
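One possible explanation for the missing `__init__` call, offered as a guess rather than a diagnosis of the original code: if `Prototype` was the metaclass and its `__new__` returned the instance instead of the class, Python skips `__init__`, because `__init__` only runs when `__new__` returns an instance of the type being called. The sketch below shows that behaviour and one place `clone()` could live. It assumes Python 3.7 or later, and `PrototypeMeta`, `Prototype` and `calls` are hypothetical names.

```python
calls = []  # records which metaclass hooks actually run


class PrototypeMeta(type):
    def __new__(mcls, name, bases, namespace):
        calls.append("__new__")
        cls = super().__new__(mcls, name, bases, namespace)
        # The result is an instance of cls, not of PrototypeMeta.
        return cls()

    def __init__(cls, name, bases, namespace):
        # Never reached here: __init__ is skipped when __new__ returns
        # something that is not an instance of the metaclass.
        calls.append("__init__")
        super().__init__(name, bases, namespace)


class Prototype(metaclass=PrototypeMeta):
    def __mro_entries__(self, bases):
        # Allow instances to be used as bases (Python 3.7+).
        return (type(self),)

    def clone(self):
        # Defined once on the root prototype, so every object inherits it.
        class anon(self):
            pass
        return anon


class Room(Prototype):
    description = "An empty room."


copy = Room.clone()
print(calls)  # ['__new__', '__new__', '__new__']
```

Note that a clone made this way inherits class-level attributes only. Anything assigned directly to the original object after creation lives in its instance dictionary and is not carried over.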

Still, this is lots of fun! Massive new project, here I come…

Commentary: The massive new project never materialized. I updated the wiki link to go directly to the relevant page, where you can see why – it’s a big disorganized mess of “here’s one way to do it” and “here’s another way to do it”. It was an interesting subject, though, and Python’s gotten a bunch of new features since then (decorators came out just after the original post) so I should really get back to that. It would be nice to pull it all together into a complete solution.

Advogato #0

Posted in the Junk Room by Joe on April 18, 2008

Originally posted on Advogato on March 22, 2004:

First post!

Anyone else find the list of project relationships is a bit limited? (At least, it’s not defined as well as the Certifications.) I just listed myself as a “Helper” on SWIG, even though all I’ve really contributed are a few minor patches. I’ve been using it quite a bit now at work – including digging into the massive and complex support libraries – and I figure I know as much about it as anyone who isn’t one of the core developers by now. (Now I will go in on Monday and be proved entirely wrong about that.) But there’s really no way to list myself as “Observer” or “Interested Follower” or “Available to help you if you’re having problems with it”, which is really what I want to say.

Commentary: 4 years later, I don’t really care much how Advogato’s organized. And I’ve forgotten most of what I knew about SWIG – the last time I worked with Python bindings, we used Boost.Python, which had a different set of tradeoffs: clunkier for simple things (you have to wrap every method by hand, but it’s only one line each), but much, much nicer as soon as you need to do something complicated. (That’s only a vague impression though, as I was mostly a consumer – the framework was already written when I arrived.)