IAmA blogger for FiveThirtyEight at The New York Times. Ask me anything.
I use Stata for anything hardcore and Excel for the rest.
Most of the one-off charts are just done in Excel. It isn't that hard to make Excel charts look unExcellish if you take a few minutes and get away from the awful default settings. For anything more advanced, like the stuff that appears in the right-hand column at 538, I'm relying on the help of the NYT's awesome team of interactive journalists.
I'd certainly like to aim to increase the level of disclosure at 538 going forward. Sometimes what happens is that I have the best intentions to write a super detailed, 5000-word methodology post, and then some Senate candidate does or says something stupid, and I get caught up in the news cycle and it gets forgotten about. Which is a pretty lame excuse, I know. At the same time, 538 is a commercial business and the ability to license proprietary intellectual property is a fairly big part of how I make my living, so the disclosure would probably stop short of outright releasing source code or my database in most cases.
More often than not, people overrate the reliability of predictions in systems with a lot of complexity. There are certainly exceptions, and presidential elections are almost certainly one of them, but it's a bit weird/ironic that I'm known for one of the exceptional cases.
One of the things I'm trying to figure out is what range of topics to cover at 538. After the 2008 election, it became sort of a quantitatively-flavored politics blog, and I think that was something of a mistake. Some things, like cabinet nominations, really do require careful reporting, and statistical analysis will provide a dollop of color commentary at best. On other days, the lead political story is just gossipy and stupid and isn't really newsworthy at all. So on a day like today, when the Chuck Hagel nomination is the major political story and that doesn't really play into our strengths, I'd rather write about something like baseball instead. The ambition is to expand 538 "horizontally" across topics, based on HOW we cover the news, rather than into the politics vertical, if that makes sense.
We're definitely overdue to do a couple of posts on same-sex marriage, however.
I don't own (or rent) a cat.
I'd encourage you to read my book and ask whether she fairly interprets my hypothesis. I don't think she does. The financial crisis chapter is quite explicit about asserting that the credit ratings agencies were not just stupid, but also a bunch of dirty rotten scoundrels, so to speak. And the book is generally quite skeptical about the role played by "experts".
Maybe this is too vague, but I think the most important thing is just to lessen the amount of book-learnin' that you do and start to play around with some data sets instead.
When I was in Mexico last week, I got recognized at the top of the Sun Pyramid at Teotihuacan, which I'm pretty sure really is a sign of the Apocalypse.
Groupthink and perverse incentives were the causes; to the extent their polling or analysis was bad, it flowed from that.
It's a tricky problem, statistically. The issue is that while gun ownership rates could plausibly be a cause of fatal crimes and accidents, they can also be a reaction to them, i.e. people purchase guns because they feel unsafe.
I'm not saying that the issue is intrinsically inscrutable. But it's something that requires more of a PhD-thesis-level treatment than a blog post to really add much insight, I think.
I'd probably lower the threshold for players getting dropped from the ballot, from 5 percent to 2 percent or so, or have some sort of a sliding scale where the threshold depends on how many times a player's name has appeared. It now seems plausible that Alan Trammell will eventually get in, for example, and it's a little weird that Lou Whitaker got dropped from the ballot years ago when he might otherwise be gathering some support along with Trammell right now.
Perhaps I can convince Penguin that my next book should be a 256-taqueria burrito bracket with entries from all across the country.
At some point in the last few weeks of the election, I guess I decided to lean into the upside outcome a little bit in terms of pushing back at the pundits in my public appearances -- as opposed to emphasizing the uncertainty in the model, as I had for most of the year. (Nothing about the model design itself changed -- just how I tended to talk about it.)
Stupid poker analogy: part of playing well is in maximizing the amount of value you get from a hand in the event that things go well, in addition to mitigating your losses if they don't.
News organizations tend to have incentives to "root for the story". Part of what we were saying for much of the campaign -- both at different stages of the general election and perhaps even more emphatically in the end-stage of the primary when Romney pretty much had things wrapped up -- is that the outcome had become fairly certain. So that creates a bit of a culture clash.
Traditionally, soccer leagues just kept track of goals and bookings, and there's only so much value you can mine from that data. But I know that the EPL and MLS are starting to track all other sorts of statistics as well: tackles, passes, time of possession, etc. Would be interesting to explore that at some point. I suspect there is some low-hanging fruit since the soccer culture (even more than in most American sports) tends not to be very data-friendly.
Historically, periods of greater polarization are associated with better performance for third-party candidates, so the chances of a successful independent campaign are probably higher than average. However, that still might mean there's a 3 or 5 percent chance of an independent candidate winning the 2016 election as opposed to a 1 or 2 percent chance. You might need a perfect storm where (i) Obama is perceived as really having screwed up and (ii) the Republicans nominate someone terrible and (iii) someone VERY talented runs and takes his campaign very seriously and (iv) then gets a few breaks in the Electoral College, etc. None of those individual steps are impossible, but the odds against the parlay are pretty long.
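To make the parlay arithmetic concrete, here is a minimal Python sketch. The per-step probabilities are made-up placeholders, not estimates from the answer above, and the calculation assumes the steps are independent, which real political events rarely are.

```python
from math import prod

# Hypothetical per-step probabilities, chosen only for illustration.
# None of these numbers come from the answer above.
step_probabilities = {
    "incumbent perceived as having screwed up": 0.30,
    "major party nominates a weak candidate": 0.30,
    "very talented independent runs seriously": 0.30,
    "favorable breaks in the Electoral College": 0.50,
}


def parlay_probability(probabilities):
    """Chance that every step happens, assuming the steps are independent."""
    return prod(probabilities)


combined = parlay_probability(step_probabilities.values())
print(f"Combined probability of the parlay: {combined:.4f}")
```

Even when each step is individually plausible, multiplying them leaves a combined probability in the low single digits of percent, which is the sense in which the odds against the parlay are long.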
Mostly from trying to win my fantasy baseball league and my NCAA tournament pool.
Politics. I don't think it's close. Between the pundits and the partisans, you're dealing with a lot of very delusional people. And sports provides for much more frequent reality checks. If you were touting how awesome Notre Dame was, for example*, you got very much slapped back into reality last night. In politics, you can go on being delusional for years at a time.
It's a complicated issue that maybe doesn't lend itself so well to the reddit treatment.
My quick-and-dirty view is that people are too quick to affiliate themselves with identity groups of all kinds, as opposed to carving out their own path in life.
Obviously, there is also the issue of how one is perceived by others. Living in New York in 2013 provides one with a much greater ability to exercise his independence than living in Uganda -- or for that matter living in New York forty years ago. So perhaps there's a bit of a "you didn't build that" quality in terms of taking for granted some of the freedoms that I have now.
And/but/also, one of the broader lessons in the history of how gay people have been treated is that perhaps we should empower people to make their own choices and live their own lives, and that we should be somewhat distrustful about the whims and tastes and legal constraints imposed by society.
In terms of quality of life, it's very close. But New York is a lot better for someone working in "the media", and probably also more broadly for most people who are super ambitious about their careers. One of the big cultural differences here -- very much for better and worse -- is that people are often very career-driven well into their 40s, 50s, 60s.
Tempted, yes, but sometimes resisting temptation is a good thing.
Sorry for a brief answer to a very long question, but I've long been surprised that there isn't an elections-reference.com. Sean Forman had better get on that or I might steal the idea.
If we had a list of exactly who used steroids and when, you could do a lot of clever things. But we don't, and the sample of alleged and actual steroids users is liable to be nonrandom and biased in various ways.
Intellectually, the defense is pretty simple, which is that 20 percent outcomes happen 20 percent of the time. In fact, the 20 percent outcomes are supposed to happen 20 percent of the time (not substantially more OR substantially less) or you've calibrated your model incorrectly.
OK, not quite that simple: any time a low-probability event occurs (although I'm not sure that I'd describe a 20 percent outcome as a "low-probability event") you ought to be asking whether your model of the universe was correct, particularly in cases where there is a considerable amount of structural uncertainty. The answer may well be "yes" -- you shouldn't necessarily be in a rush to change your model and there can be harm in doing so -- but you should be posing the question.
But I have no illusion: this defense would have been less than persuasive to many people. If you watch a poker hand, and a guy gets all-in before the flop with aces against kings (an 80/20 bet), our animal instinct is very much to tag him as a LOSER if a king comes up on the flop, even though he probably played his hand perfectly. So I'd just have had to take my lumps and acknowledge that I'd been very fortunate in many respects in life (i.e. often getting much more credit than I deserved) up through 11/6/12.
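The calibration point ("20 percent outcomes happen 20 percent of the time") can be checked mechanically. What follows is a rough Python sketch on simulated data, not the actual 538 model or its forecasts: the function name `calibration_table`, the forecast levels, and the sample size are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def calibration_table(forecasts, outcomes):
    """Map each distinct forecast probability to the observed frequency
    of the event among the cases that received that forecast."""
    grouped = defaultdict(list)
    for probability, happened in zip(forecasts, outcomes):
        grouped[probability].append(happened)
    return {
        probability: sum(results) / len(results)
        for probability, results in sorted(grouped.items())
    }


# Simulated, perfectly calibrated forecaster (an assumption for this sketch):
# each event really does occur with exactly the stated probability.
rng = random.Random(0)
levels = [0.1, 0.2, 0.5, 0.8, 0.9]
forecasts = [rng.choice(levels) for _ in range(50_000)]
outcomes = [rng.random() < p for p in forecasts]

table = calibration_table(forecasts, outcomes)
for probability, observed in table.items():
    print(f"forecast {probability:.0%} -> observed {observed:.1%}")
```

For a well-calibrated forecaster, each observed frequency lands close to its forecast. A model whose 20 percent calls come in far more or far less often than 20 percent of the time is miscalibrated in one direction or the other.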
Yes, I think, in large part because the split-the-baby solutions to steroids use are hard to apply in practice. I might use steroids use as a tiebreaker for otherwise very close cases (and I think McGwire, Sosa and Palmeiro all fall into that category). But I don't think people should pretend that we can put each player's stats through some kind of algorithm and come up with "steroid-neutral" statistics. We just don't know all that much about who did and didn't use steroids, and when.
There are certainly cases where applying objective measures badly is worse than not applying them at all, and education may well be one of those.
In my job out of college as a consultant, one of my projects involved visiting public school classrooms in Ohio and talking to teachers, and their view was very much that teaching-to-the-test was constraining them in some unhelpful ways.
But this is another topic that requires a book- or thesis-length treatment to really evaluate properly. Maybe I'll write a book on it someday.
Yes, definitely. The New York Times guys really are the very best in the world at this. Part of that is because they really are journalists in addition to being programmers and/or graphic artists: the goal is to communicate complex information clearly and accurately, and not just to make something cool or pretty. There should be a Pulitzer category for this stuff.
Yes, it's called a playoff. Ideally an 8- or 12- or 16-team playoff, I think.
The irony is that of all college and professional sports, NCAA football is the one that might most necessitate a playoff because 12 games just isn't enough to tell you very much -- especially when many/most are played against mediocre competition. If instead a team needs to win 3 or 4 games against top-flight opponents to win the national championship, you can say with a bit more confidence that they're deserving.
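One way to see why 12 games is a small sample is to ask how often a merely good team posts an elite-looking record. This is a back-of-the-envelope Python sketch using the binomial distribution. The per-game win probabilities are hypothetical, and treating every game as an independent coin flip with the same odds is a simplifying assumption.

```python
from math import comb


def prob_at_least(wins, games, win_prob):
    """Binomial probability of winning at least `wins` of `games`,
    assuming independent games with the same win probability."""
    return sum(
        comb(games, k) * win_prob**k * (1 - win_prob) ** (games - k)
        for k in range(wins, games + 1)
    )


# Hypothetical true strengths: a "good" team and a "great" team.
good_team = prob_at_least(11, 12, 0.75)
great_team = prob_at_least(11, 12, 0.90)

print(f"Good team (75% per game) goes 11-1 or better: {good_team:.1%}")
print(f"Great team (90% per game) goes 11-1 or better: {great_team:.1%}")
```

Under these assumptions, a good-but-not-great team finishes 11-1 or better often enough that the record alone cannot reliably separate it from a truly great one.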
It worries me a bit. There is probably a danger zone in which a candidate's supporters take for granted that he'll win the election and so don't turn out to vote, but the election is nevertheless close enough for him to lose. That may have happened in the Democratic primary in New Hampshire in 2008, for example. There were a lot of reasons why Hillary beat her polls, but one contributing factor may have been that a lot of independent voters who would otherwise have voted for Barack chose to vote in the GOP primary instead since it seemed more competitive.
2012 was a reasonably close election. Not 2000 close, obviously, but closer than average.
The distinction that got lost a bit was between closeness and uncertainty. If a baseball game is 3-2 in the bottom of the 9th inning and you've got Papelbon on the mound or whatever, it has definitely been a "close" game but not one in which the outcome is in all that much doubt.
Less abstractly: when it became clear (i) Romney's "momentum" from Denver had begun to recede and (ii) that the final major news event of the campaign (Hurricane Sandy) was working to Obama's benefit, some of the uncertainty was removed.
Well, I guess I'd put it like this: statistical analysis may not get you as far in basketball* or (especially) football as it does in baseball. But it still probably gets you much further than in most industries.