Visualizing and comparing setlist variance

30 Sep 2020

Link to Dashboard

I love live music, and miss it dearly. As part of my project to teach myself data science, I have been doing a lot of projects focused on live music, mostly on Phish. One of the things that makes Phish ripe for data analysis is their dedication to switching up setlists from night to night, surprising fans with bustouts, and letting jams come out of pretty much any song. It’s also one of the main things that keeps Phish fans coming back for multiple runs a year and seeing every show in a weekend. At the other end of the spectrum are bands that are extremely consistent: you pretty much know what they’ll play, and many of their fans go hoping to see the hits, a polished set, and the band at the top of its game every night. This got me wondering whether there was a way to measure how much bands change up their setlists, and to rank them by it.

The first step was getting the data I needed, and the only place to do that is setlist.fm. They have a relatively straightforward API that lets you page through all of a band’s setlists. One small hump was that this was the first time I was working with data in JSON format. But once I realized I could pretty much treat JSON objects like Python dicts, it became easy to pull down a JSON file of all the setlists the site has for any band.
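For illustration, here’s a minimal sketch of that kind of paging loop using the requests library. The endpoint, parameters, and response fields follow the setlist.fm API docs as I understand them; the API key and MusicBrainz id are placeholders.

```python
import time
import requests

API_KEY = "YOUR_API_KEY"  # placeholder; setlist.fm issues keys for free
HEADERS = {"x-api-key": API_KEY, "Accept": "application/json"}
SEARCH_URL = "https://api.setlist.fm/rest/1.0/search/setlists"

def get_all_setlists(artist_mbid):
    """Page through every setlist for one artist (looked up by MusicBrainz id)."""
    setlists, page = [], 1
    while True:
        resp = requests.get(SEARCH_URL, headers=HEADERS,
                            params={"artistMbid": artist_mbid, "p": page})
        resp.raise_for_status()
        data = resp.json()  # the JSON response parses into plain Python dicts/lists
        setlists.extend(data.get("setlist", []))
        if page * data["itemsPerPage"] >= data["total"]:
            break
        page += 1
        time.sleep(1)  # stay polite with the API's rate limit
    return setlists
```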

My next step was to find some initial bands to prototype my metrics and graphs with. I started with Vampire Weekend; I knew enough about their history to know that after the release of their last album they started changing up setlists more often, so I figured they would be a good band for spotting patterns in my prototypes. After working with them a bit, I added Arcade Fire, Beck, and Pearl Jam so I would have some variety of bands to check against as I built the graphs.

Once I had a script to get setlists and a few bands to work with, I needed measurements to focus on. I started with a few guiding principles. First, I wanted at least a few different measurements that I could combine into a setlist variance index of sorts. Second, each measurement should have a similar impact on the sum; to that end, I aimed for each to be a percentage, so that each would stay between 0 and 1. Last, I wanted each metric to capture a different aspect of how a band changes up its setlists, not multiple measurements of the same concept.
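As a tiny sketch of that combination step (the function and argument names are mine, not anything from the dashboard; one metric gets flipped, as explained in the Top 10 section below):

```python
def setlist_variance_index(song_gap, yearly_turnover, top10_freq, placement_iqr):
    """Sum four metrics that each live on a 0-1 scale.

    top10_freq is flipped because a lower play rate means more variety.
    """
    return song_gap + yearly_turnover + (1 - top10_freq) + placement_iqr
```

Here’s what I came up with: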

Song Gap Analysis

This is where I started, because the most basic way a band changes up its setlist is by deciding whether to play any of the same songs it played the night before. So I made a measure that cycles through every show, computes the percentage of that show’s songs that weren’t played the night before, and then averages that over the band’s career. On the graph, I color-coded songs by whether the band hadn’t played them in two shows or in three or more; that’s only a visual aid on the graph. The final metric is just the average share of songs that weren’t played the night before.

To make the data a bit cleaner, I dropped all shows with fewer than 5 songs. This avoids an issue I was seeing where some bands did a lot of TV appearances with 2 or 3 songs, which made the next show look like it was full of new songs, since it was being compared to the TV appearance rather than the previous full concert.

Another issue with this metric is empty setlists. Ultimately, I decided to drop them and treat each show as following the most recent show that has a recorded setlist. This isn’t a great solution: a band could play a show, have two shows where no setlist was recorded, and then on the third show repeat the first show’s songs exactly. That would make it look like they never change their setlist, even though they may have played entirely different songs at the two unrecorded shows. However, I did a little digging and found no real pattern in the types of bands with missing setlists, and a lot of the gaps were long stretches early in a band’s career. So while this solution isn’t perfect, I think it’s better to keep as much data in the formulas as possible instead of only using shows whose previous show also has a recorded setlist.
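Putting those pieces together, a sketch of the calculation might look like this (the data shape, each show reduced to a chronological list of song names, is my assumption, not how the dashboard stores things):

```python
def song_gap_average(shows):
    """shows: chronological list of setlists, each a list of song names.

    Returns the career-average share of songs not played the night before.
    """
    # dropping setlists with fewer than 5 songs also removes empty ones
    shows = [s for s in shows if len(s) >= 5]
    fractions = []
    for prev, cur in zip(shows, shows[1:]):
        prev_songs = set(prev)
        new_count = sum(1 for song in cur if song not in prev_songs)
        fractions.append(new_count / len(cur))
    return sum(fractions) / len(fractions) if fractions else 0.0
```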

This metric ended up having the biggest interval between the most consistent band (Iron Maiden - 2%) and the band with the most song turnover from show to show (String Cheese Incident - 93%). That aligned with my expectations, and I think changing up the setlist each night should have the biggest impact on the final rank.

Song Turnover Each Year

After looking at how bands change from night to night, I wanted to look at how they change from year to year. Maybe a band doesn’t change much from night to night, but if you see them on their next tour, will they play new songs or the same hits as last year? So I took all the songs played in a year and computed what percentage of them weren’t played the year before. I considered doing this cumulatively, counting only songs that had never been played before, but I thought a band that pulls songs that haven’t been played in 10 years back into the rotation should get some credit for that, and bands that have been around forever shouldn’t see their percentage drop just because they don’t have many brand-new songs each year.

A few bands were surprisingly far ahead here (U2 and The Rolling Stones), and I realized I was hitting a similar issue to the one above: some bands had a year with only one concert, usually a short TV or festival appearance. So I dropped all years where the band played fewer than 10 distinct songs. I also realized that this measure is influenced by the previous one. Bands that change up their setlists a lot from night to night burn through a larger share of their catalog each year, which leaves fewer songs that would count as new the following year; and since they play so many different songs in a year, they have to debut more new ones to move the percentage. You can see this with bands like Phish, Grateful Dead, and Widespread Panic, who all have very high song gap numbers but low new-songs-per-year numbers.
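A sketch of the year-over-year calculation under the same assumed data shape (shows grouped by calendar year; as a simplification, each kept year is compared to the immediately preceding calendar year):

```python
def yearly_turnover(shows_by_year):
    """shows_by_year: {year: list of setlists, each a list of song names}.

    Returns {year: share of that year's distinct songs absent the year before}.
    """
    songs = {year: {s for show in shows for s in show}
             for year, shows in shows_by_year.items()}
    # drop years with fewer than 10 distinct songs (lone TV/festival spots)
    songs = {year: played for year, played in songs.items() if len(played) >= 10}
    turnover = {}
    for year in sorted(songs):
        if year - 1 in songs:
            turnover[year] = len(songs[year] - songs[year - 1]) / len(songs[year])
    return turnover
```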

This metric had the second smallest range between the lowest (Widespread Panic - 21%) and the highest (Neil Young, at a little over 62%). That makes sense to me for a metric that should have a low impact on the final rank, especially since a high score on the previous metric drives a lower score here.

Top 10 Song Frequency

This one is the easiest to explain: how often does the band play its most popular songs? I got a list of the top 10 songs, counted how many times each was played, and divided that by the number of concerts the band has had since the song debuted. Then I averaged that number over all the top 10 songs to get a metric for ranking bands on whether you’ll see their most popular songs when you go see them. Unlike the other three measures, a lower number here indicates more setlist variety, so I subtracted the average from 100% when doing the final rankings.

I didn’t have to make many adjustments to this metric. Originally, the denominator was simply the total number of shows the band had ever played. Then I realized that a band that played hundreds of shows before having a hit and getting big would appear to play that hit a much smaller percentage of the time, so I switched the denominator for each song to the number of shows since that song’s debut.

One other issue I found is that some bands play their most popular song twice in a show, usually as parts of the song spread throughout the set, and I considered counting each song only once. But since this measure is ultimately about how much of a show is taken up by the most popular songs, I counted every performance of a song, even a second one in the same show.
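Here’s a sketch with both adjustments baked in (same assumed data shape as above, and I’m taking “top 10” to mean the ten most-played songs, which may differ from the popularity list the dashboard uses):

```python
from collections import Counter

def top10_frequency(shows):
    """Average play rate of the band's ten most-played songs, where each
    song's rate is its total performances (repeats included) divided by
    the number of shows since that song's debut."""
    plays = Counter(song for show in shows for song in show)  # counts repeats
    debut_index = {}
    for i, show in enumerate(shows):
        for song in show:
            debut_index.setdefault(song, i)  # first show where the song appears
    rates = [plays[song] / (len(shows) - debut_index[song])
             for song, _ in plays.most_common(10)]
    return sum(rates) / len(rates)
```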

This metric had the second biggest gap between highest (Kendrick Lamar - 96%) and lowest (String Cheese Incident - 24%). This feels right as the second most impactful metric.

Top Song Placement

The last metric I included looks at how bands change up the order of songs within a show. Even if a band plays the same songs every night, shuffling the order adds more variety than playing them in the same order each time. So I took the top 10 songs and graphed the point in the show where each was played, computed as the song’s position in the show (say, the 3rd song) divided by the number of songs in the show (say, 20 songs). Then, to see how much the band moves each song around, I found the interquartile range of all its placements: the interval that contains the middle half of the times the song was played. Averaging these 10 intervals gives a song placement index for comparison. The intervals can be seen in yellow on the graph.

It took some experimenting to settle on this metric. I first tried variance and standard deviation, the more typical measures of spread, but the differences between bands were really small, since I was measuring the spread of hundreds of numbers that were all below 1 and pretty close together. That’s why I went with the interquartile range: it scaled the measure up to be more comparable to the other metrics while still measuring spread.
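A sketch of the placement calculation (numpy’s percentile function does the IQR work; the data shape is the same assumption as above):

```python
import numpy as np

def placement_index(shows, top_songs):
    """Average interquartile range of where each top song falls in a show,
    with position measured as song number divided by setlist length."""
    positions = {song: [] for song in top_songs}
    for show in shows:
        for i, song in enumerate(show, start=1):
            if song in positions:
                positions[song].append(i / len(show))  # 3rd of 20 songs -> 0.15
    iqrs = [np.percentile(p, 75) - np.percentile(p, 25)
            for p in positions.values() if p]
    return sum(iqrs) / len(iqrs)
```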

One issue I noticed is that a few songs get played a lot at both the beginning and the end of shows; (I Can’t Get No) Satisfaction and My Iron Lung are examples. That makes their interquartile range span essentially the entire show. However, the few bands where this happened all had a few songs with very tiny IQR intervals, so the effects pretty much cancel out.

This metric had the smallest gap between the highest (Railroad Earth - 51%) and lowest (The National - 11%) averages.

Potential Other Measures

I got a couple of suggestions that I considered but left out for now. One was the number of unique songs played in a year divided by the total number of songs played that year. I’m still considering this one, since learning a lot of different songs each year seems like a big factor, but it overlaps a lot with my first two metrics, so I’m leaving it out for now. The other was a different way of measuring song gaps: for each show, average the number of shows since each song was last played. If every song was played the night before, each gets a 1 and the show averages to 1; if half the songs were played the show before and the other half three shows earlier, the show scores a 2. I don’t think this adds enough new information compared with the song gap metric, and I’m not sure how I would scale it to sit in the same 0-1 range as the other metrics.

Analysis of Final Ranking

Here are a few things I noticed in the final ranking:

A lot of the bands that originally had the highest setlist variance had fewer than 250 shows, due to a combination of those bands being new and fans not uploading every setlist. Both of these things artificially push the setlist variance metrics up, so I added a filter that removes them from the list unless you check the box.

Part of the reason String Cheese Incident and Umphrey’s McGee end up ahead is that they play their most popular songs less often than other bands. I guess I can’t blame them for struggling to pick any of their songs to play more often.

It’s cool to see how high on the list some of the older acts are, like Zappa and Prince.

It’s really interesting to see the shifts during a band’s career in the Song Gap graph. Both Springsteen and Dylan recently stopped changing things up as much and, as mentioned above, Vampire Weekend saw a big shift after FOTB came out.

Feedback and Feature Requests

I don’t know many of these bands well enough to quality-check their data. If you notice any issues with a band’s data, please let me know so I can improve the graph. Also, if you have any questions that aren’t answered by this writeup, or suggestions for improving the dashboard, send me a message. I can also add more bands if your favorite isn’t on here.

@JesseRoe55 on Twitter and u/PhishStatSpatula on Reddit