More Money, Mathematics
Data can assist with decisions and predictions. Finding, combining, and using the correct data is important to a successful venture in the age of streaming and non-linear consumption of programming. The music business has experienced a net gain from the ubiquity of data, but the data that artists receive from digital distribution platforms lacks sufficient detail to support good relative decisions about sales performance. These platforms report basic details on streaming quantity and the dollar amounts attributed to those sales, which is enough to discern a trend in sales and listenership. Anything beyond that, including simple statistical analysis, is missing from the information they provide.
Artists assess their sales from an absolute perspective, with minimal use of a relative benchmark. For example, Tyler, the Creator and DJ Khaled released their respective albums the same week, but the two should not simply be compared with each other. A better assessment is how likely each artist's result was relative to the total hip-hop field. Stopping at a one-week comparison between Tyler and Khaled gives up the analysis needed to make good decisions. To get a better grasp of music sales over a 36-week period in 2019, this article uses the sales of the album Bandana by Freddie Gibbs as a case study in performance and the probability of achieving a given level of sales.
Bandana is an album that embodies a focus on design, production, and lyricism. Gibbs uses diverse flows to enhance the production and bends each song toward the journey he wants to take the audience on without losing the beat. The album is a throwback to a period when the rapper's ability was featured and the beat took a backseat. Two songs stand out on Bandana: Freestyle S**t and Palmolive. On Freestyle S**t, Gibbs narrates what he had to do before achieving success with his music sales. When the music started moving, he continued to pursue his street hustles. That lyric, "when music started moving," was the catalyst for asking whether music moves units. A free and reliable source of music data is difficult to find. Fortunately, Freshhiphoprnb.com provides the Billboard 200 sales figures for hip-hop records on a weekly basis.
The data provided by Freshhiphoprnb shows the chart position, artist name, album title, physical sales, and physical plus streaming sales for each week. Bandana was released in week 27 of 2019, selling 6,500 physical copies and 16,600 physical plus streaming units in its debut week. During that same week, the average album sold 2,400 physical units and 24,000 physical plus streaming units. Gibbs, an artist not chasing mainstream recognition, outshone many of his competitors in a week that included top sellers such as DJ Khaled, Travis Scott, and Cardi B. His numbers look stronger still against the median, which was 380 physical units and just over 16,000 physical plus streaming units for that week.
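The week 27 comparison can be worked through as simple ratios. The short R sketch below uses only the figures quoted in this article; the variable names are illustrative, and the median for physical plus streaming is rounded to 16,000 because the article gives it only as "just over 16,000."

```r
# Week 27 figures quoted in the article.
gibbs_physical <- 6500
gibbs_combined <- 16600          # physical plus streaming
week_mean_physical <- 2400
week_mean_combined <- 24000
week_median_physical <- 380
week_median_combined <- 16000    # assumption: "just over 16,000", rounded down

# Ratio of Bandana's debut to each benchmark (1 means equal to the benchmark).
ratios <- c(
  physical_vs_mean   = gibbs_physical / week_mean_physical,
  physical_vs_median = gibbs_physical / week_median_physical,
  combined_vs_mean   = gibbs_combined / week_mean_combined,
  combined_vs_median = gibbs_combined / week_median_combined
)
print(round(ratios, 2))
```

On these figures, Bandana's physical sales were roughly 2.7 times the weekly average and about 17 times the weekly median, while its physical plus streaming total sat below the average but slightly above the median. The gap between the two benchmarks suggests that a few top sellers pull the average up.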
Debut week is the most important week for a record because a significant portion of sales occurs during the opening week, with sales declining week over week after that. Artists, excluding the top sellers, focus on absolute sales volume, with limited analysis of relative performance over a sales cycle or a cross-sectional period that would give a true reflection of their album's performance. Simple calculations such as the average, median, or standard deviation can provide significant insights into performance. In addition, probability theory can provide much-needed comfort for artists when the absolute number does not match their expectations in a debut week.
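As a rough illustration of the simple calculations named above, the R sketch below gathers the average, median, and standard deviation for one week and places a single album within that week. The function name relative_summary and the example vector are hypothetical; the numbers in example_week are placeholders chosen for easy arithmetic and are not real chart data.

```r
# Hypothetical helper: summarise one week's sales and locate one album in it.
relative_summary <- function(week_sales, album_sales) {
  c(
    mean       = mean(week_sales),
    median     = median(week_sales),
    sd         = sd(week_sales),
    # Distance from the weekly average, in standard deviations.
    z_score    = (album_sales - mean(week_sales)) / sd(week_sales),
    # Share of albums that week selling the same amount or less.
    percentile = mean(week_sales <= album_sales) * 100
  )
}

# Placeholder values for illustration only; NOT real chart data.
example_week <- c(200, 300, 400, 500, 600, 1000, 4000)
res <- relative_summary(example_week, album_sales = 1000)
print(round(res, 2))
```

In this made-up week, an album selling 1,000 units sits exactly at the average (a z-score of 0) yet outsells most of the field, because one large seller pulls the average well above the median of 500.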
The digital distribution companies do a poor job of providing artists with information that can be used without further analysis. Although the information is timely, it does not enhance decision making or predictability. As a result, the artist is left to find simple ways to draw insight from the data provided.
The data from Freshhiphoprnb covers sales performance over a 36-week period for hip-hop albums on the Billboard 200. The sample contained 953 albums, with average physical sales of 1,664 units and average physical plus streaming sales of 23,646 units. The albums that ranked first across the 36-week period averaged 21,936 physical units with a standard deviation of 25,145. With streaming and physical sales combined, the albums ranked first sold on average 103,000 units with a standard deviation of 44,000.
Based on an average of 22,000 units sold and a standard deviation of 25,000 units sold, the probability of selling more than 50,000 units over the 36-week period is approximately 13% under a normal approximation. The chart below shows the distribution of sample means of sales for the 36-week period, built from samples of size 30 drawn 10,000 times, as the central limit theorem describes[3]. Many artists attempting to reach the top spot on the chart should know that their probability of doing so is effectively zero percent. It is unfortunate that many artists do not use these simple concepts to understand their true performance among their peers in the genre. Predicting future sales is difficult in the music business given the volatility in market sentiment or preference for a particular song or artist. The concepts discussed above are preliminary ways for artists to assess their viability, using cross-sectional data to evaluate their current and historical performance.
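The 13% figure can be reproduced with one call to R's pnorm, assuming sales follow a normal distribution with the rounded mean and standard deviation quoted above. That assumption is this sketch's, not something the data guarantees: sales cannot be negative and are likely right-skewed, so the result is a rough guide only. The variable names are illustrative.

```r
# Rounded figures quoted in the article.
mean_sales <- 22000
sd_sales   <- 25000
threshold  <- 50000

# Upper-tail probability P(X > threshold) under a normal approximation.
p_over <- pnorm(threshold, mean = mean_sales, sd = sd_sales, lower.tail = FALSE)
print(round(p_over, 3))  # about 0.131, i.e. roughly 13%
```

The threshold sits 1.12 standard deviations above the average, which is why only about one result in eight would be expected to exceed it under this approximation.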
Killer Mike, on Palmolive, says make money, more money, [is all] mathematics, a lyric that not only sums up the thesis of the album but also reiterates the hypothesis that statistics can assist with decisions and predictions, which can lead to more money.
Notes
[1] Chart 1 - R Code:
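# albumdata is assumed to be a data frame of the weekly chart data with columns Sales and Sales.Streams
hist(albumdata$Sales.Streams, breaks = seq(0, 200000, 1000), col = "cyan", main = "Distribution of Sales+Streams in 2019", xlab = "Sales + Streams #")
# Mark the average (dashed line) and the median (dotted line)
abline(v = c(mean(albumdata$Sales.Streams), median(albumdata$Sales.Streams)), lty = c(2, 3), lwd = 2, col = "magenta3")
legend("topright", legend = c("Average", "Median"), lty = 2:3, lwd = c(2, 2), col = "magenta3")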
[2] Chart 2 - R Code:
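n <- 10000
s30 <- numeric(n)  # preallocate the vector of sample means
for (i in 1:n) {
  # Mean of 30 physical sales figures drawn with replacement from the observed data
  s30[i] <- mean(sample(albumdata$Sales, 30, replace = TRUE))
}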
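hist(s30, col = "lightcyan1", main = "Sample Distribution of Physical Album Sales", xlab = "Album Sales", ylab = "Frequency")
# Mark the average (dashed line) and the median (dotted line) of the sample means
abline(v = c(mean(s30), median(s30)), lty = c(2, 3), lwd = 2, col = "hotpink")
legend("topright", legend = c("Average", "Median"), lty = 2:3, lwd = c(2, 2), col = "hotpink", bty = "n")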
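[3] The Central Limit Theorem states that when random samples are repeatedly drawn from a distribution with finite variance and the average of each sample is recorded, the distribution of those averages approaches a normal distribution as the sample size grows, whatever the shape of the original distribution.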