Saturday, February 25, 2017

Predict the Magic Online Championships Metagame and Win $40 in Cardhoarder/Isle of Cards Credit!

The Magic Online Championship being held at WoTC HQ on March 3-5 is a unique event which invites 16 players - some top players from the Pro Tour and some successful online qualifiers - to 14 rounds of grueling Swiss play for a total prize pool of $116,000. The select nature of the competition means that making the right metagame call for the 8 Standard rounds in this tournament will give you a huge advantage in taking down the first place prize of $40,000 and instant Platinum status.


Think you've got a read on the meta but somehow missed getting an invite to this 16-competitor tournament? Well, you might not be in the running for the 40 grand, but nail the questions in the FREE metagame prediction survey and you can still get bragging rights and earn $40 credit at Cardhoarder/Isle of Cards! Runner up gets $10 in credit.

Fill out the survey here. Good luck!

Tuesday, February 7, 2017

Lessons Learned about Running the Pro Tour Aether Revolt Metagame Prediction Game

The PT Aether Revolt Metagame Prediction Game has concluded and our winner has been identified. The response has been quite positive so it’s likely I’ll run a similar game again in the future (though I am running out of random unwanted sealed product to put up as prizes). So with that in mind, I’m going to take a moment to review how this game went from an operational perspective, to guide how I iterate and improve to deliver an even better experience for participants for PT Amonkhet.


1. Picking cards seems far more popular than picking players… why not both?
The prediction game for PT AER was actually the second such survey I ran; the first one was for PT EMN when I ran a fantasy draft of Pro Players. Now the PT EMN Fantasy Draft had a number of advantages over the PT AER Metagame Prediction Game. I put up a more valuable product as a prize for the winner. Because the survey was picking players and not cards, I could schedule it closer to the actual event and ride more of the PT hype. And my posts and content leading up to the game were much higher visibility than the ones for this PT (one of the interactive dashboards I made to provide context for people’s fantasy drafts got a shout-out from the mothership). That’s a lot of people seeing my invitation to fill out the survey. Nevertheless, even with less visibility, a start date well before the Pro Tour, and a less valuable prize, the PT AER game ended up with over twice the participation of the PT EMN Fantasy Draft.


Ultimately I think the reason here is just that most of the Magic community has a stronger relationship to the cards than to the pro players. Sure, the biggest names on the PT have pretty broad name recognition - your Jon Finkels and Luis Scott-Vargases. But something I heard many times from my personal friends that I invited to join the PT EMN Fantasy Draft, even ones who were quite into Magic, was this: “I’m not sure if I even know 8 Pros to pick for my fantasy roster!” The fact is, your average Magic player is probably not going to have a very strong opinion on how well some mid-tier Gold-level Pro is going to do in the next PT. On the other hand, everyone who plays Magic can have an opinion on whether or not that sweet card in the spoiler is going to be OP. Ultimately, even with all its relative handicaps, especially in visibility, the PT AER prediction game is just more accessible to a bigger potential audience.


That said, I don’t want to completely stop running Pro Player Fantasy drafts, because personally I love them. I enjoy following the Pro scene and I still find it more emotionally engaging to root for a player than to root for specific cards. And another advantage of the player fantasy drafts is, unlike the metagame prediction game, it gives me something to run in the week immediately prior to the PT, so it doesn’t necessarily even need to conflict with the metagame prediction game. Honestly, I wanted to do it again this PT, and just couldn’t find the time to organize it.


Ideally for PT Amonkhet there will be two games. One, to be run immediately after set release, for contestants to predict the PT meta. The second, to be run the week prior to the PT, for contestants to predict Pro Player performance. This will be the best of both worlds and keep people engaged.


2. Scoring Structure Could be Tweaked to Reward Non-Obvious Picks
Unlike the PT EMN Fantasy Draft, scoring of the PT AER Metagame Prediction Game was done pretty much straight-up. There were no odds, no handicaps, not many limitations other than the segregation of colored/colorless/gold cards. I didn’t want to discourage people too much from making the “obvious” choices because remember, one of my objectives was to answer the question of just how “obvious” development’s “obvious” mistakes truly are if we don’t have the benefit of weeks and weeks of metagame iteration.


I think this did make the game a little less interesting as a consequence, because there was little incentive to deviate away from these obvious picks and no payoff for correctly identifying an under-hyped card. The way the scoring worked out, you still got enough points for picking Felidar Guardian—a hyped card that actually significantly underperformed its expectation—that it didn’t really hurt your chance of winning. I wish there were some way to reward the lone submission which correctly identified Release the Gremlins as a constructed-worthy card. Over 300 submissions, and only one person saw that Manic Vandal might be constructed-playable in an artifact set! Finding the hidden gem is an accomplishment and for future iterations of these games I’ll try to tweak the scoring to reflect the difficulty of that.
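One possible shape for such a tweak, purely as an illustration: scale each card's points by how few ballots picked it. The function name, the linear formula, and the `bonus_weight` knob below are all assumptions made for this sketch, not a scoring rule that was actually used or announced, and the numbers in the example are made up.

```python
def adjusted_points(base_points: float, times_picked: int, total_ballots: int,
                    bonus_weight: float = 1.0) -> float:
    """Scale a card's base points up as fewer ballots picked it.

    Hypothetical rule: a card on every ballot keeps its base points,
    while a card on almost no ballots approaches
    base_points * (1 + bonus_weight).
    """
    if not 0 < times_picked <= total_ballots:
        raise ValueError("times_picked must be between 1 and total_ballots")
    pick_share = times_picked / total_ballots
    return base_points * (1.0 + bonus_weight * (1.0 - pick_share))


# Illustrative numbers only: the same 4 base points are worth more
# to the one ballot (out of 300) that found the hidden gem.
consensus_pick = adjusted_points(4.0, times_picked=300, total_ballots=300)
hidden_gem = adjusted_points(4.0, times_picked=1, total_ballots=300)
print(consensus_pick, hidden_gem)
```

A linear bonus like this keeps the "obvious" picks worth taking, which matters because part of the point of the game is to see how obvious they really are. A steeper curve would reward contrarian ballots more aggressively.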


3. Getting Visibility/Accessibility Is Tricky
Speaking of visibility and accessibility, both are key in letting people know about the game.


My primary avenue for advertising was just making posts on this blog, which I would then post links to on reddit. There were three posts that I made in all about the PT AER metagame prediction game: one post announcing the game, one post reviewing the entries so far after a few days, and one post after the PT was over wrapping everything up.


The funny thing is how the first two posts both struggled to get any traction at all, with 0 (!) and 6 net upvotes respectively. The most popular post, by far, was the wrap-up post after the PT, which shot straight to the top of the front page the Monday after the Pro Tour and is sitting at 350 net upvotes and counting. Of course, by this time it was too late for people to actually participate and join the game.


So what happened here? Shouldn’t the posts where people can actually, you know, play the game and win a free duel deck, be more popular than the post where I just throw up some charts about an event that has already concluded? Well I think a few things are at work here.


First, there’s just the element of luck. Just like in Magic, sometimes when it comes to social media things just break your way. If you get a few early upvotes that can give you enough momentum to get on the main page and grow from there. If you get a few early downvotes you never make it off the new tab and are buried. I do think there was a bit of a snowball effect here. The third post got a cluster of upvotes pretty much immediately after I submitted it, which definitely helped give it that nudge in the right direction to get things rolling.


Second, though, there’s definitely that element of accessibility again. It’s widely been observed that the Reddit model tends to reward easily digestible content that garners quick upvotes. My recap post was a bunch of charts you can scroll through, go “huh that’s interesting,” and then immediately upvote. This is much more accessible than the initial posts, which are entire surveys which you need to click through, consider your options, and then make selections. By the time you’ve done all that you might not be interested in going back to reddit to give me an upvote.


Here I may personally be somewhat to blame. You may notice that my post announcing the game doesn’t just announce the game - I have to first go on a rant wherein I disparage the collective card-evaluation prowess of the Magic community and defend WoTC RND. Now, this wasn’t a non sequitur. The point of all that was to advance a thesis I do genuinely believe, and establish the concepts of prediction commitment and testing for which the prediction game could serve as a functional exercise. And I actually thought that framing the game this way would give it a “hook” that would encourage participation: let’s prove this random guy on the internet wrong and show him that I actually am smarter than Wizards RND! But still, this made the entire announcement more difficult to digest.


Third, timing was probably an issue. Because I wanted participants in the metagame prediction game to make their predictions without the benefit of any major tournament results, I needed to schedule it for the week immediately after prerelease, prior to any SCG Opens. At this point, the PT is on the horizon, but the Magic zeitgeist is still very much revolving around cool things that happened at prereleases and release events, not the PT. The recap post had the advantage of going up the Monday after the PT, when the conversation in the Magic community was all about… the PT. So the recap post was far more topical for its time than the game announcement posts.


Fourth, and I don’t want to over-emphasize this as a potential factor because I do believe it’s the least important of the four, there was perhaps a bit of gamesmanship from the game participants themselves seeking to prevent broader participation. Posters of giveaways in other subreddits have often observed how often early voting patterns seem to skew negative on posts where they are giving something away (ahem, like a free duel deck) because lurkers are trying to improve their own odds of winning by reducing the visibility of the giveaway and driving down participation. Magic players can be value fiends with a solid grasp of game theory, so this kind of behavior on /magictcg wouldn’t be inconceivable to me. This theory is made more convincing by the observation that my unpopular 6-point second blog post was actually very similar in content to the eventual popular 350-point recap post; both posts go through charts looking at survey results, but the recap post went up after there was no longer anything on the line.


Still on the whole I think the effect of this is minor, and comes only from a small minority of the community. If a post doesn’t get initial traction, though, that small minority could become the “vocal” minority as the more positive-minded broader community never gets a chance to see and appreciate a giveaway. And in any case, maybe my first post just got a bunch of downvotes because I was ranting about how stupid the community was - that’s not an unreasonable theory either.


TL;DR, My Plans for PT Amonkhet
  1. Time permitting, run both a prediction game about picking cards and one about picking Pro players.
  2. Tweak the scoring model to reward people who correctly identify non-obvious/under-hyped choices.
  3. Keep the lessons learned about getting visibility in mind. Getting traction on reddit is never going to be an exact science, but I can emphasize accessibility and timing to try and do better. Also, maybe don’t announce the contest with an extended rant this time.

Well, I do hope to see everyone who participated this time back next time for the PT Amonkhet prediction games. Now I can’t guarantee 100% that I will run them, as this is of course a hobby for all of us, and I just happened to have a lot of spare time at the appropriate periods this time around, but the positive response to the event this time definitely encourages me to continue doing this. (Shoutout to the random stranger who bought me reddit gold!)

As you've seen, it can be difficult to get the word out about these free prediction games. If you're interested in being notified about the next Pro Tour prediction game, leave your email address. I'll only use email addresses collected this way to notify you about new Pro Tour Prediction Games.

Monday, February 6, 2017

Comparing the Community's Predictions with the Pro Tour Top Decks

A few weeks ago I asked the /magictcg and /spikes reddit communities to fill out a survey with their card predictions for Pro Tour Aether Revolt. The idea was to get a sense for just how good we are at identifying the broken cards in a standard format before any tournament results come in.


Well, the results are in. As per the contest rules, I scored each card according to how many top decks (7+ constructed wins) it appeared in, with sideboard-only appearances counting for 0.25. With over 300 entries, here's how things turned out.


The Colored Cards

I asked respondents to select 5 monocolored cards (in any distribution of colors). The chart below shows all the cards with >20 selections on the survey.
The story of Pro Tour Aether Revolt was the complete dismantling of Jeskai CopyCat decks, mostly by Mardu Vehicles, so it’s no surprise that Felidar Guardian is the biggest disappointment here, vastly underperforming its hype. On the other hand, solid answers such as Fatal Push and Shock were as good as advertised.


The biggest breakout card was Rishkar, Peema Renegade, whose value in the BG deck with +1/+1 counters synergy was underestimated by the survey respondents (keep in mind the survey was intentionally closed just prior to the first SCG opens.)


Also of note is the gap between Disallow and Metallic Rebuke. Many people picked the versatile hard counter, and while it did show up in a reasonable number of top decks, the overlooked Metallic Rebuke actually far surpassed Disallow in point value even though it appeared in 1 fewer maindeck. This is thanks to its frequent inclusion in the sideboard of the deck of the tournament, Mardu Vehicles.


Other notable misses on the survey’s part are Yahenni’s Expertise, Sram’s Expertise, and Herald of Anguish. All appeared to be solid constructed-worthy cards but in the end did not fit into any top decks. Also a miss, though in the other direction: Release the Gremlins. This card showed up in a few Mardu Vehicles lists as a mirror-breaker, but was completely ignored by survey respondents, and is left off the chart above as it garnered just 1 pick in the survey.


The Gold Cards

I asked respondents to select just one gold card. The chart below shows all selections.
No real surprises here. Winding Constrictor turned out to be as solid as it looked (though to be fair, it didn’t really have much competition in its category.)


The Colorless Cards

I asked respondents to select just one colorless card. The chart below shows all selections.
No real surprises here either, even though the community didn’t quite nail the relative prevalence of Walking Ballista, Heart of Kiran, and Aethersphere Harvester, significantly overvaluing the Ballista. All three are clearly solid cards and their relative performance at this Pro Tour likely says more about the particular meta at that event (Mardu Vehicles for days) than it does about the strength of the cards themselves.


So What Does it All Mean?

In my announcement for this contest, you’ll find that part of the motivation behind this particular experiment was to test my belief that “WoTC makes (almost) no obvious development mistakes.” The idea was that the community, with the benefit of hindsight bias and the crowdsourced testing resources of millions, vastly overestimates how “obvious” WoTC’s development mistakes are. With the survey I sought to discover if (1) The community picks would coalesce around some suspected broken cards, and (2) If those cards would truly turn out to be broken.


The answers to these questions appear to be (1) sort of, and (2) no. If any deck is broken in the current standard meta, it is the Mardu Vehicles deck. The predictions of the survey didn’t exactly reflect that, with respondents heavily favoring the more hyped Felidar Guardian and Walking Ballista over the Mardu Vehicles options in those respective categories.


That said… my assertion that WoTC development makes very few obvious mistakes was rather undermined by Sam Stoddard’s admission that development completely missed the Felidar Guardian/Saheeli Rai interaction. Yes, CopyCat fell flat at the Pro Tour so perhaps development dodged a bullet. Or perhaps, like the Aetherworks decks in previous standard, CopyCat will just go into hiding for a while, to burst back into a changed meta that won’t be quite so hostile to it. Either way, the power level of CopyCat is so high that Stoddard’s admission is truly very troubling for those of us, like myself, that are generally sanguine about WoTC R&D’s capabilities.


So what does this all mean? Perhaps we need to wait and see. The single-deck-dominated Pro Tour meta is certainly troubling and created one of the worst Top 8 viewing experiences in recent memory. On the other hand, that dominant deck was not CopyCat, and it’s not impossible for a relatively stale Pro Tour metagame to eventually create a diverse standard metagame. Will Mardu Vehicles stay on top now that it’s public enemy number 1? And if it doesn’t, does that just mean that CopyCat is broken after all? I will be very interested to find out.


I hope everyone who participated had fun playing the prediction game and following the Pro Tour. Keep your eyes peeled for future events of this nature for Pro Tour Amonkhet. (Or leave your email address so I can let you know about the next one!)


But Wait, Who Won?


Oh yeah, there’s the little matter of the Japanese Elspeth vs Kiora I promised to send the winner. Congratulations to the winner of the Pro Tour Aether Revolt Metagame Prediction Game: “M” from New York. M had close to the theoretical maximum-scoring ballot. These were the picks of the winning ballot:


Pick 5 Monocolored Cards: Felidar Guardian; Metallic Rebuke; Fatal Push; Shock; Rishkar, Peema Renegade
Pick 1 Gold Card: Winding Constrictor
Pick 1 Colorless Card: Heart of Kiran


Looking at the top-scoring submissions, they all shared many of the same “probably strong” cards. There was a cluster of leaders that established an advantage by picking the undervalued Rishkar, and M opened a gap on the rest of the field by taking Metallic Rebuke over Disallow. As discussed above, the greater splashability of Metallic Rebuke led to its inclusion in many Mardu Vehicles sideboards, pushing it over the top in this Pro Tour.


The only way for M to have improved on the winning submission theoretically would have been to swap Felidar Guardian for Greenbelt Rampager, which would have marginally improved the score by just 1 point.


Here’s what M had to say about his picks:


1. Felidar Guardian. This one was pretty obvious. Give the masses a splinter twin and so shall they go forth and abuse.
2. Metallic Rebuke. Basically a mana leak in most decks, something standard has been clamoring for for quite some time. Probably just a strict upgrade to already played cards like spell shrivel or revolutionary rebuff.
3. Fatal Push. Many a nay-sayer advocated that this was made for modern and legacy, and wouldn't impact standard nearly as much. Well that was obviously wrong. The sheer power level of this card is off the charts, especially against the top tier decks of the current format. Nice spaceship you got there, its a shame if some spartan king shoved it into a pit.
4. Shock. Another obvious pick due to interaction with saheeli-cat combo, however this card overperformed further with double duty as additional fatal push copies in some matchups.
5. Rishkar, Peema Renegade. I'll be honest this is one of my favorite cards in the set. Obvious snake synergies aside, its pure value and mana ramp all in one bundled up package.
6. Winding Constrictor. What happens when you put a hardened scales effect on a 2/3 body for 2 mana? An entire deck archetype thats what. As previously mentioned, its best friends with Rishkar, and also finally gave verdurous gearhulk the spotlight it very much deserved.
7. Heart of Kiran. Ban smuggler's copter they said. Vehicles will be fair again they said. The format will be more diverse they said. I guess there's one downfall: dies to king leonidas?

Congrats to M on his win!

If you're interested in staying in touch and participating in future free games like this one, leave your email address. I'll only use email addresses collected this way to notify you when I create a Pro Tour Prediction Game.

Friday, January 20, 2017

The Community's Predictions (So Far) for Pro Tour AER

A few days ago I challenged the community to prove how good they were at identifying broken Magic cards by predicting the metagame of PT Aether Revolt, with a free duel deck up as a prize. After nearly 300 entries, let’s take a look at the results so far.

If you would like to play the prediction game before reading this summary of responses so far, go here. Entries will be open for about another day until StarCity begins broadcasting their Standard Open.


Top 8 Predictions


Let’s start by looking at the tiebreaker questions. These are questions I added to the survey to break any ties in the scoring on the card-by-card predictions. Unlike the main ballot, which asked about the best constructed decks (7+ wins), the tiebreaker section sought predictions about the Top 8 decks only. Also, these questions were optional, so some respondents left them blank.




The community predicts a mean of 4.48 unique archetypes in the top 8, with a mode of 5. This would make for a reasonably diverse top 8, but I think the community actually isn’t optimistic enough here. In the past, we’ve seen that even PTs which inaugurated relatively stale Standard seasons (such as PT SOI) had diverse top 8s, due to the early state of the metagame, the inclusion of limited rounds confounding the constructed results, and the sheer skill of certain top players. (Seriously, how many people built that middling Seasons Past deck because Jon Finkel is a beast?) My personal over/under on number of Top 8 archetypes would have been set at 5.5.


Here things get pretty interesting. White and Blue together combine for nearly 2/3rds of the responses here. This makes sense as those colors represent a number of archetypes (Various combos, Tempo/Flash, Control) that can be expected to put players into the Top 8.

I still remember how last year, everyone was all talking about how Magic was all about creatures now and Green with all the high value creatures was just too good and Blue is always going to suck. Or how the year before that people were wondering if maybe RDW was always just going to be the right deck to bring to Pro Tours because it seemed to always feast on the inchoate PT metagames. The lesson here is to remember that WoTC likes to constantly mix things up in Standard and that the pendulum swings around by design.


Mean here is 2.67, which seems reasonable to me just based on my gut impression of past Pro Tours. More and more, we are seeing “stacked PT T8s” reminding us that Magic is at the end of the day a game of skill. Yes, you always need a bit of luck to Top 8, but over and over again we’ve seen how the consistently good players are more likely to put themselves in a position to benefit from that luck.

I made a small mistake on this question and failed to include an option for 0 first time top 8s, which likely skews the results a little. If this question is needed to break a tie, I will be giving the win to the respondent with the answer closest to the true number.

Ok, now it’s time to look at some cards. On the survey I segregated the colorless cards, the monocolored cards, and the gold cards. This was so that cards that could go into any colored deck wouldn't be compared against cards that could only go into decks of specific colors.


Colorless Cards



Walking Ballista is the favorite here, but Heart of Kiran and Aethersphere Harvester aren’t far behind.
Metallic Mimic and Inspiring Statuary come in as dark horse picks among the colorless cards.
These picks are all fairly conventional. In fact, as of this writing, the ranking of the top 4 colorless rares in the survey perfectly corresponds with their ranking by market price! (As a Mythic, Heart of Kiran’s greater scarcity of course throws off this pattern.)


Gold Cards


Winding Constrictor’s combo potential made it by far the community pick. Oath of Ajani is a distant second.


Monocolored Cards


And finally, we have the biggest chunk of the survey, the monocolored cards. Each submission picked 5 cards, distributed as the respondent desired among the different colors (i.e., it’s a valid strategy to stack up on a single color’s cards if you anticipate that color to truly be that strong in PT AER). There are too many cards here to list, so the table below is all the cards with >20 selections.
The Copycat hype is real. Felidar Guardian was a clear top pick here and frankly, the only top pick that is potentially truly “broken.”
The next top 3 selections were strong answers that have a very good chance of being staples in the format, but aren’t going to be format-destroying cards anytime soon. Sure, we know that Shock is a good card that will probably see quite a bit of play. But we also know it’s probably not a development mistake.
Rounding out the top 5 is another card with some brokenness potential. As both a Cryptolith Rites and a Travel Preparations on a stick, the community thinks Rishkar is a card with some upside.
There’s a natural tension to prediction games like this - to some extent, you can use the conventional wisdom as a guide to your picks. But if you hew too closely to conventional wisdom, then your submission won’t stick out from the crowd in any way and your chance of winning is lower. There is a long tail of dark horse picks among the submissions so far: many cards with 5-20 picks that I haven’t included in the chart above.


What it All Means


As a reminder, one of the objectives of this whole game is to resolve the question of whether or not the community is actually able to identify development mistakes better than the people who design the game, or if complaints about RnD’s incompetence are just hindsight bias. Here’s what I had to say in my post announcing the contest:

If I’m wrong, and there is an obvious development mistake, the community’s picks should concentrate into a few (<3) cards, and those cards should turn out to be OP. If the community’s picks are spread out among a lot of cards, and some of them do turn out to be OP - then I’m right, and there were no obvious development mistakes. If the community’s picks are concentrated into a few cards, and those cards do not turn out to be OP, then I’m still right, and there were still no obvious development mistakes, because the ones we thought were “obviously too good” turned out not to be.

I’m pretty sure I’m right, but hey—maybe you guys will prove me wrong.
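The three cases in that quoted passage boil down to a small decision table, sketched here in Python. The function and argument names are made up for illustration, and how to decide that the picks are "concentrated" (the passage only says a few, fewer than 3, cards) or that a card is "OP" is left to the caller. The fourth combination, spread-out picks with nothing turning out OP, is not addressed in the passage, so the sketch reports it as such rather than guessing.

```python
def thesis_verdict(picks_concentrated: bool, picks_turned_out_op: bool) -> str:
    """Evaluate the 'no obvious development mistakes' thesis.

    picks_concentrated: the community's picks cluster on a few cards.
    picks_turned_out_op: the picked cards (if concentrated) or at least
        some of them (if spread out) turned out to be overpowered.
    """
    if picks_concentrated and picks_turned_out_op:
        # Everyone saw it coming and it really was broken.
        return "thesis refuted"
    if picks_concentrated and not picks_turned_out_op:
        # The "obviously too good" cards were not too good after all.
        return "thesis holds"
    if picks_turned_out_op:
        # Hits were scattered among many candidates: not obvious.
        return "thesis holds"
    # Spread-out picks and no OP cards: the quoted passage does not say.
    return "not covered"


print(thesis_verdict(picks_concentrated=True, picks_turned_out_op=False))
```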

Looking at the results so far, I would say the only cards the community has pegged as potential development mistakes are Felidar Guardian and Winding Constrictor. If CopyCat turns out to be a truly degenerate combo, or Winding Constrictor turns out to be oppressive, I think it would be fair to conclude that RnD missed some pretty “obvious” mistakes in this set.

It’s not too late to prove me wrong and win a Japanese Elspeth vs Kiora Duel Deck! The FREE prediction game is open until StarCity begins streaming its first Standard content, which will likely occur sometime on Saturday morning!

Monday, January 16, 2017

There are (Almost) No Obvious Development Mistakes and Complaints are (Almost) Always Hindsight Bias: Issuing the PT AER Metagame Prediction Challenge

(Click here if you just want to go straight to where you can win a Japanese Elspeth vs Kiora for predicting the PT AER metagame. Keep reading if you want to know why I think you - the collective you - aren't any good at identifying broken MtG cards.)

Complaints about WoTC R&D’s card balancing/metagame prediction abilities basically all fall into something like this pattern:

Step 1) New cards are spoiled. Something like 10-20 strong standout cards with high competitive potential are identified by the community as cards to watch.

Step 2) Competitive metagame forms over the course of many weeks of high-level competitive play and deck refinement from a playerbase of millions. Eventually a handful of the true standout cards are discovered from among the group identified in Step 1. In some cases these cards might even be broken.

Step 3) “R&D is terrible at their jobs, how did they miss this obviously broken card, I saw that this card was broken the second it was spoiled. Clearly the FFL is no better than a bunch of drunk monkeys.”

The key bias that allows people to falsely believe step 3 is that they did identify the development mistakes when they were spoiled. It’s just that they identified them as one of a bunch of possibly good cards. Some of these good cards didn’t quite make it, some of them were actually good, and some of them were broken. Then, we forget our misses and zero in with hindsight bias on the hits, and wonder why R&D isn’t as good as we are at identifying the broken cards.

I’m not exaggerating about the level of contempt that is sometimes expressed for R&D, by the way:
I hope we all can see the hindsight bias at play here. Particularly telling is how this poster’s ability to identify development “mistakes” seems to take a nosedive as we approach recent sets. Grim Flayer’s deck just got knocked out of the meta. And Mardu Vehicles is certainly a top contender, but Depala is hardly a card people point to as an OP development mistake. It’s not impossible that Yahenni’s Expertise will turn out to be a development mistake, but at this point it’s just one of many cards that could possibly be OP. Even Sylvan Advocate no longer seems like such a big mistake now that it’s completely fallen out of the metagame.

It’s instructive to go back and read some of the set reviews of sets that contained obvious development mistakes. Yes, we thought Dromoka’s Command and Collected Company might be good. We also thought Sidisi, Undead Vizier and Narset and Thunderbreak Regent and Secure the Wastes might be good. Some of those cards were good, some weren’t, and some were perhaps too good.

It’s Impossible to Evaluate Individual Cards without knowing the Metagame Context (And the Metagame is Impossible for the FFL to Accurately Predict)

As the fate of “obviously broken” cards fading in and out of the meta shows us, the difference between OP, good, fringe, and not-quite-good-enough cards isn’t the cards themselves. Their actual strength is always an emergent property of the metagame in which they exist. Take one of development’s biggest mistakes of recent memory:
There’s no denying that Collected Company was a pretty big miss on R&D’s part, and the eventual Dragons/Origins-BFZ-Shadows standard it dominated was truly quite stale. But even this card’s strength in standard was highly context-dependent! On release CoCo was seen mainly as a great addition to modern, spawning a new but not broken archetype. Meanwhile in standard it was middling, forming part of a good-not-great Green/White aggro deck. It wasn’t until two sets later that enough solid 3-drops were released to create an environment for CoCo to become the oppressively format-warping card that we grew to hate.

CoCo started out fine and became strong, so in hindsight we consider it broken. It’s also instructive to consider a card that had something of the reverse dynamic:
Half a year ago Duskwatch Recruiter was labelled a “development mistake” just as often as Collected Company was. Green getting a recurring card draw effect that’s also an extremely efficient beater that’s also a ramp spell? How ridiculous is that? Of course this is broken, what are the morons in R&D doing? But wait, the meta changed, and these days even when there is a green deck in the format, Recruiter doesn’t make the cut.

There’s been a lot of whining about FNM promos lately, so let’s look at one of the FNM whiffs of last year:
Flaying Tendrils has always been a bulk card. Which intern in R&D do they have picking these promos anyway?

But wait, what happened the last time they printed a similar effect?
You may not remember, but while it was in standard, Drown in Sorrow was a $2+ uncommon. Sounds like something that would have been a solid FNM card!

Of course the difference is that Flaying Tendrils is in a standard environment where a mass -2/-2 is pretty useless, and Drown in Sorrow was in a standard environment where a mass -2/-2 was amazing.

So: individual cards are impossible to evaluate absent foreknowledge about how the entire metagame will shape up, and the metagame is the emergent result of the crowdsourced efforts of a horde of highly motivated and intelligent players (and even then it takes us a few months to really shake it out). Given this, I feel confident asserting that there are almost no obvious development mistakes, and if you think you’ve identified some, you’re likely operating under hindsight bias.


You Totally Predicted that CoCo was OP, Though, and can Prove it


Any such assertions made after-the-fact (and any arguments that rely on such post-hoc assertions) are indistinguishable from hindsight bias and should be discounted. The only way to rigorously test whether you are as good at identifying OP cards as you say you are is to pre-commit, before the tournament results come in. Since we’ve just finished the Aether Revolt prerelease, that means… now. In the vein of the PT EMN Fantasy Draft, I am happy to unveil...


The Pro Tour Aether Revolt Metagame Prediction Challenge - Win a Japanese Elspeth vs Kiora!

Contest is here.

All entries will be individually scored. You receive 1 point every time one of your cards appears in a top-performing standard deck (7 wins/21 points or better). For each sideboard-only appearance, you will receive 0.25 points. To capture the impact of cards that may be format-defining despite not being 4-ofs in their decks (such as Emrakul), you will receive the full point value even when your selected card is a 1-of, 2-of, or 3-of in its decks. The top individual entry will win a new Japanese Elspeth vs Kiora.
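For anyone who wants to sanity-check their own score, here is a rough sketch of how the rules above could be tallied. It is only an illustration, not the script actually used for the contest: the `Deck` class, the `score_entry` function, and the placeholder card names are all made up for this example, and it assumes that a card appearing in both the main deck and the sideboard of the same deck counts as a single 1-point appearance.

```python
from dataclasses import dataclass, field
from typing import Dict, List

MIN_WINS = 7             # "top-performing" threshold from the rules (7 wins/21 points)
MAIN_POINTS = 1.0        # any number of copies in the main deck
SIDE_ONLY_POINTS = 0.25  # card appears only in the sideboard


@dataclass
class Deck:
    """Hypothetical decklist record: card name -> number of copies."""
    wins: int
    main: Dict[str, int] = field(default_factory=dict)
    side: Dict[str, int] = field(default_factory=dict)


def score_entry(picks: List[str], decks: List[Deck]) -> float:
    """Score one entry's picks against a list of standard decks."""
    total = 0.0
    for deck in decks:
        if deck.wins < MIN_WINS:
            continue  # only top-performing decks count
        for card in set(picks):  # duplicate picks are only scored once
            if deck.main.get(card, 0) > 0:
                # Full value whether the card is a 1-of or a 4-of.
                # Assumption: main deck plus sideboard still scores 1 point.
                total += MAIN_POINTS
            elif deck.side.get(card, 0) > 0:
                total += SIDE_ONLY_POINTS
    return total


# Placeholder decklists, not real tournament data.
example_decks = [
    Deck(wins=8, main={"Card A": 4, "Card B": 1}, side={"Card C": 2}),
    Deck(wins=7, main={"Card A": 2}, side={"Card B": 3}),
    Deck(wins=6, main={"Card A": 4}),  # below the threshold, ignored
]
example_score = score_entry(["Card A", "Card B", "Card C"], example_decks)
print(example_score)
```

In this made-up example, the first deck is worth 2.25 points (two main deck cards plus one sideboard-only card), the second is worth 1.25, and the 6-win deck is worth nothing.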

The challenge here is to prove me wrong. If I’m wrong, and there is an obvious development mistake, the community’s picks should concentrate into a few (<3) cards, and those cards should turn out to be OP. If the community’s picks are spread out among a lot of cards, and some of them do turn out to be OP - then I’m right, and there were no obvious development mistakes. If the community’s picks are concentrated into a few cards, and those cards do not turn out to be OP, then I’m still right, and there were still no obvious development mistakes, because the ones we thought were “obviously too good” turned out not to be.
I’m pretty sure I’m right, but hey—maybe you guys will prove me wrong.


Appendix - In Which I Concede a Situation Where I Look Pretty Wrong, but Not Really because Reasons


Funny thing about the terms of the metagame prediction challenge - had I run this challenge for Pro Tour Kaladesh, I would probably have lost. Why? Well, you may recall that this card was recently banned:
And honestly, had I run the prediction challenge pre-PT KLD, I expect a lot of people would have picked Copter. Now, I could submit the small quibble that Smuggler’s Copter is colorless. For a prediction game that’s scored simply on the number of decks in which your picked cards appear, the strategic choice is to load up on powerful colorless cards that can go in many archetypes, rather than powerful archetype-specific cards. Even if you thought, say, Chandra, Torch of Defiance would end up being stronger than Smuggler’s Copter, your incentive would still be to pick Copter. My point is not that Chandra is stronger than Copter, as we’ve learned that she’s not, just that someone *believing* at that time that Chandra is stronger would still have picked Copter, thus misrepresenting the wisdom of the playerbase versus development. It’s for this reason that I’ve segregated the colorless cards from the pick pool in the PT AER prediction challenge.

But that’s just a quibble, and to be completely honest, a lot of people pegged Copter as the defining card of the set shortly after release, head and shoulders above the rest of the set. I don’t believe many people predicted a banworthy level of brokenness from Copter, but it was definitely a case where FFL missed something that was reasonably obvious.

That said, a single exception does not disprove a general rule, and there is an “almost” in my assertion for a reason: “complaints are (almost) always hindsight bias.” So even granting that complaints about FFL missing on Smuggler’s Copter are more reasonable than most MtG balance whining, I still believe overall that obvious development mistakes are extremely rare.