Editor's note: Derek Beaton is a PhD student in the School of Behavioral and Brain Sciences at The University of Texas at Dallas. He also loves beer, so recently he set out to scientifically prove which local beer is Dallas' favorite. Here are his results.
Have you ever wondered what Dallas' favorite local craft beer is? You're probably thinking, "Yeah, it's clearly Lone Star because it's the 'National Beer of Texas'," or "Duh, it's the one in my hand right now, bro!" While those are worthy guesses, they're also wrong, and you should feel bad.
If you're a craft-beer nerd, you probably have some ideas, based on what you see on tap around town. But pure availability doesn't make a beer Dallas' favorite, or else Lone Star would indeed be the favorite. As a hybrid beer and stats junkie, I decided I had to know: Of all the local craft beers that are produced and available throughout DFW, which are Dallas' favorites?
Thirty-five beers made it onto my survey, selected according to the following criteria:
- The brewery itself must have been in operation for at least one year
- The beer itself must have been available for at least six months
- The beer must be offered year-round (no seasonals, specials or one-offs)
Franconia, Peticolas, Revolver, Martin House, Four Corners, Lakewood, Rahr, Deep Ellum, Community, and Cedar Creek were all represented*. I randomized the order in which the beers were listed and sent the survey out. Here's a quick breakdown of the demographics of the participants:
- 202 respondents. One was excluded**.
- Gender: 36 Females, 160 Males, 1 Meat Popsicle, 1 Unicorn, 1 Minotaur and 3 non-responses.
- 33 people who work in the beer industry (brewers, bartenders, waitstaff, etc.).
- 58 people who consider themselves homebrewers.
Participants were asked to select one of the six following options for each beer***:
- It is one of my favorite beers.
- I like this beer.
- This beer is OK.
- I don't like this beer.
- I've never had this beer.
- I have no opinion.
We're looking at the beers (listed vertically) and the proportion (out of 201) of responses. I reordered the beers so they're listed in order of most to least "It is one of my favorite beers" responses.
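For anyone who wants to build this kind of chart from their own survey, here is a minimal sketch of the tally-and-sort step. It is written in Python rather than the R I used, and the beer names and answers are made up for illustration; they are not the survey data.

```python
from collections import Counter

OPTIONS = ["Favorite", "Like", "OK", "Do Not Like", "Never Had", "No Opinion"]

# Hypothetical answers for three made-up beers (not the survey data).
answers = {
    "Beer A": ["Favorite", "Favorite", "Like", "OK", "Never Had"],
    "Beer B": ["Like", "OK", "Do Not Like", "Never Had", "Never Had"],
    "Beer C": ["Favorite", "Like", "Like", "No Opinion", "Never Had"],
}


def proportions(choices):
    """Share of respondents choosing each option for one beer."""
    counts = Counter(choices)
    total = len(choices)
    return {option: counts[option] / total for option in OPTIONS}


table = {beer: proportions(choices) for beer, choices in answers.items()}

# Order beers from most to least "Favorite" responses, as in the chart.
ranked = sorted(table, key=lambda beer: table[beer]["Favorite"], reverse=True)

for beer in ranked:
    print(beer, {option: round(share, 2) for option, share in table[beer].items()})
```

Dividing by the number of respondents (201 in the real survey) is what makes the bars comparable across beers.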
There are some clear favorites: Temptress, Velvet Hammer, Blood & Honey and Mosaic IPA all received a lot of "Favorite" responses.
As a stats nerd, though, this picture feels a bit ... rudimentary. There are better ways to assess and visually represent Dallas' favorite beer. So let's turn to one of my favorite statistical methods: correspondence analysis (CA). CA is a technique that takes a large table made up of a bunch of variables (in this case the responses) and turns them into new variables that better represent what's happening****.
The data from above looks something like this:
What will CA do with a table of data like this? It will tell us which beers are most similar to one another -- based on the different categories. It can tell us if any of the categories are similar to one another, too. But most importantly, it tells us which beers are more related to certain responses than other beers. Let's take a look at what it produced:
CA produces new variables called "components" -- denoted by the axes (horizontal and vertical lines) in these pictures. There are three other axes besides these, but those aren't very important. Just these first two explain 87 percent of the variation in the data.
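If you're curious what happens under the hood, here is a rough sketch of textbook correspondence analysis using a singular value decomposition. It is Python with NumPy, not the ExPosition code I actually ran, and the counts are invented for a toy table of four beers and four responses. The number of components is at most the smaller table dimension minus one, which is why a table of 35 beers by 6 responses yields five axes.

```python
import numpy as np

# Hypothetical counts (rows: beers, columns: responses); not the survey data.
beers = ["Beer A", "Beer B", "Beer C", "Beer D"]
options = ["Favorite", "Like", "OK", "Never Had"]
counts = np.array([
    [40, 30, 10, 20],
    [10, 25, 30, 35],
    [5, 10, 15, 70],
    [30, 35, 20, 15],
], dtype=float)

P = counts / counts.sum()   # correspondence matrix (proportions of the grand total)
r = P.sum(axis=1)           # row masses (beers)
c = P.sum(axis=0)           # column masses (responses)

# Standardized residuals: how far each cell is from independence.
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

U, sing, Vt = np.linalg.svd(S, full_matrices=False)

# At most min(rows, columns) - 1 components carry information.
k = min(counts.shape) - 1
inertia = sing[:k] ** 2
explained = inertia / inertia.sum()

# Principal coordinates: the positions plotted on the map.
beer_coords = (U[:, :k] * sing[:k]) / np.sqrt(r)[:, None]
option_coords = (Vt.T[:, :k] * sing[:k]) / np.sqrt(c)[:, None]

share = round(float(explained[:2].sum()), 3)
print("Share explained by the first two components:", share)
```

The first two columns of `beer_coords` and `option_coords` are the horizontal and vertical positions on a map like the ones here, and `explained` holds each component's share of the total.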
With what we know about CA, we can say the following:
- Temptress, Velvet Hammer, Blood & Honey and Mosaic are more associated with "A Favorite" than other beers (both figures).
- The responses of "OK" and "Do Not Like" are essentially the same -- which probably means people are being polite when they say "OK" or they're being excessively harsh when they say "Do Not Like."
- Cedar Creek's Scruffy's, Cedar Creek's Elliot's Phoned Home and Martin House's XPA are grouped at the bottom left of the left figure, which means they are nearly identical based on their responses. Most people haven't had these beers. Sad times.
Let's go a bit further. Let's combine "OK" with "Do Not Like," because they are basically one and the same. We'll also combine "No Opinion" with "Never Had," since they're essentially non-responses. Let's do another CA and this time color each beer by the response it is most similar to.
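Combining categories just means adding their counts together before re-running the analysis. A small Python sketch of that step, with made-up counts and merged labels of my own choosing:

```python
# Hypothetical counts for two made-up beers (not the survey data).
counts = {
    "Beer A": {"Favorite": 40, "Like": 30, "OK": 10, "Do Not Like": 5,
               "Never Had": 12, "No Opinion": 3},
    "Beer B": {"Favorite": 8, "Like": 20, "OK": 25, "Do Not Like": 15,
               "Never Had": 30, "No Opinion": 2},
}

# Which original option feeds which combined category (labels are illustrative).
MERGE = {
    "Favorite": "Favorite",
    "Like": "Like",
    "OK": "OK / Do Not Like",
    "Do Not Like": "OK / Do Not Like",
    "Never Had": "Never Had / No Opinion",
    "No Opinion": "Never Had / No Opinion",
}


def merge_categories(row):
    """Add up the counts of options that map to the same combined category."""
    merged = {}
    for option, n in row.items():
        label = MERGE[option]
        merged[label] = merged.get(label, 0) + n
    return merged


merged_counts = {beer: merge_categories(row) for beer, row in counts.items()}

for beer, row in merged_counts.items():
    print(beer, row)
```

Going from six columns to four leaves each beer's total untouched, and it also drops the maximum number of components from five to three.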
With the combined responses, we can see the general configuration is essentially the same, except this time we can explain 92.5 percent of the variation in the data instead of just 87 percent. It's also a little clearer that from right to left is a gradient of liking (or having tried) a beer.
We also have a clearer idea of which beers people have never had (in gray), which ones are not particularly cared for (in red), which ones are liked (in yellow) and which are Dallas' favorites (in green).
The favorites are still Temptress, Velvet Hammer, Blood & Honey and Mosaic. So why did I exclude poor ol' Blood & Honey from the top three? Let's take a look at the responses in these four categories, like we did initially. Beers are sorted by those with the most "A Favorite" responses:
Now we have a different perspective -- one that we get directly from the CA results. Some beers are very related to "A Favorite" and rarely receive a "Do Not Like." Unfortunately for Blood & Honey, responses of "A Favorite", "Like" and "Do Not Like" are all equally likely.
Very few people would say they "Do Not Like" Temptress, Velvet Hammer and Mosaic IPA. Thus, these three -- in that order -- are Dallas' favorite beers. It's science.
In about a year I'll re-do this survey. By then, approximately 30,786 new breweries will have opened, and some that are currently open -- but didn't qualify this time -- might also be in the running.
All analyses performed in R. Correspondence Analysis was performed with the ExPosition package, a package created by particularly attractive and smart people.
*I only realized after I sent out the survey that I had made two glaring errors: I mistakenly excluded Firewheel and Armadillo Ale Works. Whoops -- sorry!
**They responded with "I've never had this beer" to all beers.
***For the stats nerds: these are survey options not usually seen. Oftentimes when you get a survey, you're asked to respond with a 1, 2, 3, 4, or 5 (or some similar numeric scale). Well, what if people have no opinion? What if they don't want to answer the question? They need a way to opt out. Also, categories aren't numbers, you dummy! For your (statistical) health!
****For the stats nerds (again). Technically both the beers and the responses are variables. The observations (people) are kind of hiding. Each person simply helps increase the number of responses within a particular cell of this table. CA is analogous to principal components analysis but for data more suited for χ² analyses.