Today’s Top Tip: Challenge Gender Bias within Spotify’s Most Streamed Playlist
Authors: Clara Kim, Jackeline Lopez Ruiz, Francisca Moya Jimenez
Spotify’s Today’s Top Hits playlist has over 27 million followers, making it without a doubt one of the most influential streaming playlists of our digital music age. With more than 20 billion streams, this Spotify-generated playlist not only reflects but also shapes our music culture. In the past, Spotify has been accused of promoting masculine pop culture by disproportionately featuring male artists on several playlists. According to Data Feminism by Catherine D’Ignazio and Lauren F. Klein, algorithmic bias can perpetuate oppression, so that those in power stay in power and those without power stay without it. Our team was curious how well the most-streamed playlist on Spotify, Today’s Top Hits, has embraced gender diversity over the past few years.
Since we didn’t want to draw conclusions from a snapshot of the playlist at a single timestamp, we began by searching for archives of Today’s Top Hits. Turns out that Spotify doesn’t have that data available for its users. What then? Thank you mackorone on GitHub for recording snapshots of public Spotify playlists! What a hero. Thanks to Mack’s archives, we were able to collect records of Today’s Top Hits every 2 weeks for the past year and a half for a total of 37 records.
Now we had a list of all the artists that were on Today’s Top Hits for the past year and a half. In order to analyze gender diversity from this list, we needed to know the gender of the artists that appeared in the records. Guess what? We couldn’t find that information within Spotify either! For any given artist, Spotify’s database only provides the artist’s name, current popularity ranking of the artist on Spotify, and genre tags.
We faced a new obstacle. We had over 300 artists but no way of knowing their gender. Luckily, Wikidata, the free and open knowledge base that stores information from other Wikimedia projects (such as Wikipedia), includes gender information for many of the people who have a page. We decided to use Wikidata and were able to get the gender for most of the artists found in our data collection. However, Wikidata was missing data for some artists, so we were unable to find the gender for all of them this way.
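For readers who want to try a lookup like this, here is a minimal sketch of how a Wikidata gender query could look. It is not our exact code. It assumes the artist is matched by their exact English label, uses Wikidata’s “sex or gender” property (P21) and the “human” item (Q5), and the function names and the User-Agent string are placeholders we made up for this example. Exact label matching can miss artists or pick up namesakes, so treat the results as a first pass.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

WIKIDATA_SPARQL_URL = "https://query.wikidata.org/sparql"


def build_gender_query(artist_name: str) -> str:
    """Build a SPARQL query for the 'sex or gender' (P21) of a human (Q5)."""
    # json.dumps gives a double-quoted, escaped string usable as a SPARQL literal.
    label = json.dumps(artist_name, ensure_ascii=False)
    return (
        "SELECT ?genderLabel WHERE { "
        f"?person rdfs:label {label}@en ; wdt:P31 wd:Q5 ; wdt:P21 ?gender . "
        'SERVICE wikibase:label { bd:serviceParam wikibase:language "en". } '
        "} LIMIT 1"
    )


def parse_gender(response: dict) -> Optional[str]:
    """Return the gender label from a SPARQL JSON response, or None if missing."""
    bindings = response.get("results", {}).get("bindings", [])
    if not bindings:
        return None
    return bindings[0]["genderLabel"]["value"]


def fetch_gender(artist_name: str) -> Optional[str]:
    """Query the public Wikidata endpoint (requires network access)."""
    params = urllib.parse.urlencode(
        {"query": build_gender_query(artist_name), "format": "json"}
    )
    request = urllib.request.Request(
        f"{WIKIDATA_SPARQL_URL}?{params}",
        headers={"User-Agent": "playlist-gender-sketch/0.1 (placeholder)"},
    )
    with urllib.request.urlopen(request, timeout=30) as reply:
        return parse_gender(json.load(reply))
```

A call such as fetch_gender("Sam Smith") returns whatever label Wikidata currently stores, or None when no matching entry has a gender recorded, which is the gap we ran into.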
This was the moment where we got to be more creative and explored different options. Manual data collection was possible, but it was not the most efficient way to get the data we needed. Therefore, we decided to crowdsource the information from Spotify playlists.
We looked at both user-made and Spotify-made playlists that featured cis-female artists, which included playlists such as “Female Artists” or “Best Female Songs Of All Time — Most Popular Music Hits by Female Artists”. We also looked at user-made playlists that featured cis-male artists, which included playlists such as “Top Male Artists of 2020” or “Male Solo Artists.” Sadly, there were not enough playlists dedicated to transgender or non-binary artists to be able to compare these with the female or male playlists.
After collecting these playlists, we classified artists according to how many times they appeared in each set of playlists. If they appeared in more cis-female playlists than cis-male playlists, we classified them as female, and if they appeared in more cis-male playlists than cis-female playlists, we classified them as male. After testing the accuracy of this method, we realized that it worked pretty well! Out of 20 artists, it misclassified 2, which is 90% accuracy on a small sample. We reviewed the labels afterwards, but we recognize that along with algorithmic error, there may also be some human error.
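The counting rule described above can be sketched in a few lines of Python. This is an illustration rather than our exact code: it assumes each playlist is a list of artist names, counts an artist at most once per playlist, and leaves ties and unseen artists unlabeled for manual review. The function names are ours for this example.

```python
from collections import Counter
from typing import Iterable, List, Optional


def count_appearances(playlists: Iterable[List[str]]) -> Counter:
    """Count how many playlists each artist appears in (once per playlist)."""
    counts: Counter = Counter()
    for artists in playlists:
        counts.update(set(artists))  # set() avoids double-counting within a playlist
    return counts


def classify_artist(
    artist: str, female_counts: Counter, male_counts: Counter
) -> Optional[str]:
    """Label an artist by which set of playlists features them more often."""
    female_hits = female_counts[artist]  # Counter returns 0 for unseen artists
    male_hits = male_counts[artist]
    if female_hits > male_hits:
        return "female"
    if male_hits > female_hits:
        return "male"
    return None  # tie or never seen: assumption, leave for manual review


# Toy data with placeholder names, only to show the calling pattern.
female_playlists = [["Artist A", "Artist B"], ["Artist A"]]
male_playlists = [["Artist C"], ["Artist C", "Artist B"]]

female_counts = count_appearances(female_playlists)
male_counts = count_appearances(male_playlists)
labels = {
    name: classify_artist(name, female_counts, male_counts)
    for name in ["Artist A", "Artist B", "Artist C", "Artist D"]
}
```

Here Artist B appears once in each set and Artist D appears in neither, so both stay unlabeled instead of being guessed.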
We then saved all of the artists that were featured in our records of Today’s Top Hits so that we could build a library that stored their gender information. This helped us save time so that we didn’t have to go search for their gender more than once.
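One simple way to build such a library is a small cache that only runs the lookup for artists it has not seen before. The sketch below is an assumption about how this could be done, not a description of our actual storage: the class name, the JSON file format, and the optional file path are all made up for this example.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Optional


class GenderLibrary:
    """Remember gender labels so each artist is only looked up once."""

    def __init__(self, path: Optional[str] = None) -> None:
        # With no path, the library lives in memory only.
        self.path = Path(path) if path else None
        self.labels: Dict[str, Optional[str]] = {}
        if self.path and self.path.exists():
            self.labels = json.loads(self.path.read_text(encoding="utf-8"))

    def get(self, artist: str, lookup: Callable[[str], Optional[str]]) -> Optional[str]:
        """Return the stored label, calling lookup() only on a cache miss."""
        if artist not in self.labels:
            self.labels[artist] = lookup(artist)
            self._save()
        return self.labels[artist]

    def _save(self) -> None:
        if self.path:
            self.path.write_text(
                json.dumps(self.labels, ensure_ascii=False, indent=2),
                encoding="utf-8",
            )
```

The lookup argument can be any function that maps an artist name to a label, for example a Wikidata query or the playlist-based classifier. Storing None for artists that could not be labeled also prevents repeated failed lookups.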
We finally reached the point where we had our data collected and a gender classifier to help us actually perform our investigation. Phew! Are you still with us? Okay, great! To clean and tidy the data, we decided to remove all bands, duos, and groups (e.g. Black Eyed Peas, 5 Seconds of Summer, TWICE) from the playlist’s records to avoid miscounting the total number of people of any specific gender. Then, looking at the number of single artists in each archive and labeling the gender for each artist, we computed the proportion of cis male artists in each record. We deliberately chose to find the proportion of cis male artists so that the remainder would represent the proportion of all other gender identities, including but not limited to women, non-binary, trans, and gender fluid.
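As a rough sketch of this step, the function below computes the cis male share for a single record. It assumes a record is a list of artist names, that each artist is counted once per record, that groups are supplied as a set of names to drop, and that the label "male" stands for cis male. These names and choices are ours for illustration and may differ from the project’s code.

```python
from typing import Dict, Iterable, Optional, Set


def cis_male_proportion(
    record_artists: Iterable[str],
    gender_of: Dict[str, str],
    groups: Set[str],
) -> Optional[float]:
    """Share of labeled single artists in one record who are cis male."""
    # Drop bands, duos, and groups, and count each single artist once.
    single_artists = {a for a in record_artists if a not in groups}
    # Keep only artists we have a gender label for.
    labeled = [gender_of[a] for a in single_artists if a in gender_of]
    if not labeled:
        return None  # nothing to measure in this record
    return sum(1 for g in labeled if g == "male") / len(labeled)


# Toy record with placeholder names, only to show the calling pattern.
record = ["Artist A", "Artist B", "Artist C", "Some Band", "Artist A"]
genders = {"Artist A": "male", "Artist B": "female", "Artist C": "male"}
share = cis_male_proportion(record, genders, groups={"Some Band"})
other_share = 1 - share  # everyone who is not cis male
```

Running this over every record gives one proportion per snapshot, which can then be averaged or plotted over time.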
Today’s Top Hits consists of only 50 tracks, which makes it selective and difficult for artists to be featured on. The extremely limited number of spots suggests they may be reserved for the biggest artists and viral emerging artists. We found that, on average, 70% of single artists within Today’s Top Hits were cis male. In other words, on average only 30% of the single artists featured in Spotify’s Today’s Top Hits playlist were not cis male. Within that 30%, most were female artists. The only artists in our data who identified as neither male nor female were Miley Cyrus and Sam Smith, who identify as gender fluid and non-binary, respectively. This further exposes the lack of gender diversity within our music culture.
The proportions of non-cis male artists over time were not so impressive either. Over the past year and a half, non-cis male artists never claimed 50% of the playlist tracks. Considering that we classified non-cis male artists to include all genders besides cis male, these results are quite concerning but not so surprising.
At the end of the day, it’s important to recognize how powerful playlists such as Today’s Top Hits continue to undermine gender diversity. If anything, we hope our investigation inspires you to reflect on your music consumption. Who do you choose to listen to and why?
Interested in our work? Feel free to check out our project here!