This is an example of why piloting new ideas is wise. The truth is we often don't pilot stuff. Many times it works out fine (and no-one mentions we didn't pilot it on a small scale). When you don't pilot and it then fails on a big scale, this is the question, I think:
Were we bozos for not seeing the risk? Looking back, is there a pretty strong case that we should have piloted?
If we often don't pilot and it works 99 times out of 100, it may be that we are pretty good at knowing what needs to be piloted, and accepting some failures is ok in order to get things done. A critical part of the decision on whether to pilot is making sure you don't skip the pilot when it is really costly to be wrong.
We can always just point to the failure to pilot as the dumb thing to do when something fails. But I see that as a bit overly simplistic. Many organizations don't pilot well. Getting them to pilot all the time would likely stop them from doing better stuff. Getting them to pilot makes sense when
- there are likely to be things we should learn
- there are significant questions about how it would work
- the costs of widespread failure are large
- we can't consider the potential risks and make a judgement that there is likely not to be a problem
With this particular example, it seems to me the risk could have been thought about rationally and a decent case could have been made that we didn't need to pilot. And that illustrates that there is always a risk to implementing without piloting. (There is a risk even if you do pilot, including a very big one: failing to catch the problems because your pilot did not capture some important feature. For example, you didn't think of the need to pilot with pink towels; this would be an easy mistake to make.)
And it shows why thinking about pilots is important: considering how to make the pilot cover the risky scenarios that may take place is another thing we often fail to do. Sometimes organizations will use certain locations to pilot stuff, which can be useful (you can train these locations to provide good feedback, etc.). But as soon as the pilot locations differ from where the change will actually be rolled out, there is a risk of not catching things.
It is often the interaction of variables that creates problems, as it did this time. Pink flags meet the initial criteria of being noticeable. The interaction of putting many other pink items into play (certain jerseys, towels, etc.) is what seems to be the issue.
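To make the interaction point concrete, here is a rough sketch in Python. The factors, their levels, and the failure rule are all my own assumptions for illustration (nothing here comes from the actual event). It lists every combination of a few variables and compares that with a pilot that changes only the flag color and leaves everything else at its default.

```python
from itertools import product

# Hypothetical factors and levels, chosen only to illustrate the idea.
factors = {
    "flag": ["yellow", "pink"],
    "jersey": ["team colors", "pink"],
    "towel": ["white", "pink"],
}

def flag_is_confusable(scenario):
    # Assumed failure rule: a pink flag is hard to pick out
    # when any other pink item is also in play.
    other_pink = scenario["jersey"] == "pink" or scenario["towel"] == "pink"
    return scenario["flag"] == "pink" and other_pink

names = list(factors)
all_scenarios = [dict(zip(names, combo)) for combo in product(*factors.values())]

# A pilot that varies only the flag and keeps the other items at their defaults.
pilot = [
    s for s in all_scenarios
    if s["jersey"] == "team colors" and s["towel"] == "white"
]

pilot_failures = [s for s in pilot if flag_is_confusable(s)]
full_failures = [s for s in all_scenarios if flag_is_confusable(s)]

print("scenarios:", len(all_scenarios), "pilot scenarios:", len(pilot))
print("failures seen in pilot:", len(pilot_failures))
print("failures across all combinations:", len(full_failures))
```

Under these assumptions the pilot sees no failures at all, while the full set of combinations contains several. The one-variable-at-a-time pilot looks clean precisely because the problem only shows up when variables interact.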
Related: What is the Explanation Going to be if This Attempt Fails? - Accept Taking Risks, Don’t Blithely Accept Failure Though - Management is Prediction - Combinatorial Testing for Software - European Blackout: Human Error-Not