Data Acquisition

Since we want to investigate popular music, we needed sources that list the popular songs of each year. We found two options. On the one hand, Spotify offers "Top Hits of YEAR" playlists ranging from 1970 to 2019, each containing 100 songs picked by Spotify for that year. On the other hand, we researched chart placements in the global year-end charts ranging from 1955 to 2022. For roughly the first 40 years there are 20 tracks per year, and after that there are 40 tracks per year.
We used the Spotify API to extract information for all songs from these two sources. In addition, we used the Genius API to search for the lyrics of the top songs of each year.
For most analyses, we worked on the Spotify data set because it is much larger, with 100 songs per year. For the charts, only 20 songs per year are available up until 1998, and 40 songs per year after that.

Spotify API

Request the song features of the top songs.

Documentation of SpotifyAPI
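
As a rough illustration of such a request, the sketch below prepares calls to Spotify's "several tracks' audio features" endpoint, which accepts up to 100 track IDs per call. The function name and the batching helper are assumptions made for this example, and the client actually used in this project is not stated. The requests are only built here, not sent; sending them would require a valid access token.

```python
from urllib.parse import urlencode
from urllib.request import Request

AUDIO_FEATURES_URL = "https://api.spotify.com/v1/audio-features"
MAX_IDS_PER_REQUEST = 100  # limit of IDs per call for this endpoint


def build_audio_feature_requests(track_ids, access_token):
    """Return one prepared GET request per batch of up to 100 track IDs.

    Hypothetical helper: it only prepares the requests and does not send them.
    """
    prepared = []
    for start in range(0, len(track_ids), MAX_IDS_PER_REQUEST):
        batch = track_ids[start:start + MAX_IDS_PER_REQUEST]
        # The endpoint expects a comma-separated list in the "ids" parameter.
        query = urlencode({"ids": ",".join(batch)})
        prepared.append(
            Request(
                AUDIO_FEATURES_URL + "?" + query,
                headers={"Authorization": "Bearer " + access_token},
            )
        )
    return prepared
```

Each prepared request could then be passed to urllib.request.urlopen, and the JSON response parsed with the json module.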

Mediatraffic

Extraction of the songs of the year end charts.

Mediatraffic chart archive

Genius API

Search for song lyrics.

Documentation of GeniusAPI

Our data

We requested the information, cleaned it, and preprocessed it a bit to omit unnecessary data. The data was then saved as JSON files for further analysis.
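
A minimal sketch of what such a cleaning step could look like is shown here. The list of kept fields and both function names are hypothetical; the exact preprocessing used for this project is not documented on this page.

```python
import json

# Hypothetical selection of fields to keep; the project's real list is not given.
KEPT_FIELDS = ("id", "name", "artists", "duration_ms", "explicit", "popularity")


def clean_track(raw_track):
    """Keep only the selected fields that are present in one raw track record."""
    return {field: raw_track[field] for field in KEPT_FIELDS if field in raw_track}


def save_tracks(raw_tracks, path):
    """Clean every record and write the result to a JSON file."""
    cleaned = [clean_track(track) for track in raw_tracks]
    with open(path, "w", encoding="utf-8") as handle:
        # ensure_ascii=False keeps non-ASCII artist and track names readable.
        json.dump(cleaned, handle, ensure_ascii=False, indent=2)
    return cleaned
```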

What is in the data?

Spotify provides numerous features that describe a specific song. Some are more interesting than others, but all of them help to get a better understanding of popular music. Let's investigate the meaning behind every feature we have at our disposal.


              

Discover the Features

Artist

The artist who performed the song. If more than one artist performed the song together, all of them are listed.

Duration (duration_ms)

The length of the track in milliseconds.
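
Because milliseconds are hard to read at a glance, a small helper can convert the value into the familiar minutes:seconds form. This is only an illustrative sketch; the function name is made up for this example.

```python
def format_duration(duration_ms):
    """Convert a duration in milliseconds to a "m:ss" string (fractions of a second are dropped)."""
    total_seconds = duration_ms // 1000
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}:{seconds:02d}"
```

For example, a track with duration_ms = 215000 is 3:35 long.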

Explicit Lyrics (explicit)

Does the track contain explicit lyrics? If yes, the track contains words or phrases that are considered inappropriate or offensive. Examples are graphic depictions of sexual acts, references to drug use or violence, or the use of vulgar language and swear words.

ID

The Spotify ID for the track.

Name

The name of the track.

Popularity

The popularity of the track. A value between 0 and 100 is assigned, where 100 stands for the highest popularity. The value is generated by an internal Spotify algorithm and is largely based on the total number of plays the track has had and how recent those plays are.

Track number

The number of the track within its album. If an album consists of multiple discs, it is the number on the specified disc.

Genres

The genres of the album the track belongs to.

Danceability

The value for danceability describes how suitable the song is for dancing and is based on a combination of musical elements like tempo, stability of the rhythm, strength of the beat, and the overall uniformity. A value between 0 and 1 is given, where a value of 1 means that you can dance to the track particularly well.

Energy

Energy is a perceptual measure of intensity and activity, given as a value between 0 and 1. Energy increases with speed, loudness, and noise. For example, death metal songs have high energy, while a Mozart piece scores low.

Key

The key of the track. If no key was detected, the value is -1.

Loudness

The value describes the overall loudness of a track in decibels (dB). Loudness refers to the perceived strength or intensity of a sound, which is closely related to its physical amplitude. The value is averaged across the entire track, which gives a sense of its overall loudness and makes tracks comparable. Values typically fall within a range of -60 dB to 0 dB.

Mode

The term "Mode" indicates the modality (major or minor) of a track, as well as the type of scale from which its melodic content is derived. Major is represented by 1 and minor is represented by 0.
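
Key and mode are easier to read when they are combined into a label such as "A minor". The sketch below assumes that the key integers follow standard pitch class notation (0 = C, 1 = C#/Db, 2 = D, and so on), which is how Spotify's documentation describes them; the function name is hypothetical.

```python
# Pitch classes in ascending order, assuming 0 = C as in standard pitch class notation.
PITCH_CLASSES = ["C", "C#/Db", "D", "D#/Eb", "E", "F",
                 "F#/Gb", "G", "G#/Ab", "A", "A#/Bb", "B"]


def describe_key(key, mode):
    """Turn the numeric key and mode of a track into a readable label."""
    if key == -1:
        # -1 means that no key was detected for the track.
        return "unknown"
    scale = "major" if mode == 1 else "minor"
    return f"{PITCH_CLASSES[key]} {scale}"
```

With this mapping, key = 9 and mode = 0 reads as "A minor".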

Speechiness

Speechiness detects the presence of spoken words in a track. The more exclusively speech-like the recording, the closer the value is to 1.0. Values above 0.66 suggest tracks that are probably made entirely of spoken words, values between 0.33 and 0.66 indicate tracks with both speech and music, and values below 0.33 most likely represent music.
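
The thresholds described above can be written as a small classification function. This is just a sketch: the category labels and the function name are chosen for this example, and values exactly on the 0.33 boundary are assigned to the mixed category as an assumption.

```python
def classify_speechiness(speechiness):
    """Map a speechiness value (0.0 to 1.0) to one of three rough categories."""
    if speechiness > 0.66:
        return "spoken word"
    if speechiness >= 0.33:
        return "speech and music"
    return "music"
```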

Acousticness

Acousticness is a confidence measure of whether the track is acoustic. It ranges from 0.0 to 1.0, with 1.0 indicating high confidence that the track is acoustic.

Instrumentalness

Predicts whether a track contains no vocals. "Ooh" and "aah" sounds are treated as instrumental, while rap or spoken-word tracks are considered "vocal". The closer the instrumentalness value is to 1.0, the more likely the track contains no vocal content. Values above 0.5 represent instrumental tracks, with higher confidence as the value approaches 1.0.

Liveness

This feature detects audience presence in the recording. Higher liveness values indicate a greater chance of the track being performed live. A value above 0.8 strongly suggests the track is live.

Valence

A measure from 0.0 to 1.0 that describes the positive or negative emotion conveyed by a track. High valence indicates positive emotions (e.g. happy, cheerful, euphoric), while low valence indicates negative emotions (e.g. sad, depressed, angry).

Tempo

The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration.

Comparison of features

Bye Bye acoustic songs! Or is there a comeback?

Here, some audio features are compared, based on the songs of the Mediatraffic year-end charts. All of these values lie between zero and one. Comparing the features with each other is of little interest; much more interesting is how each individual feature changes over time. As we can see, acousticness has seen the sharpest decline after its peak in 1960. However, it has been rising again in recent years. Is it making a comeback? Other values such as liveness and speechiness have changed little over the years. They therefore seem to have little influence on what becomes popular, and musical taste regarding them has largely stayed the same.
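
A trend like this can be computed by averaging a feature over all chart songs of each year. The following sketch assumes that every song is a dictionary with a "year" entry and one entry per audio feature; this record layout and the function name are assumptions, not the project's actual data format.

```python
from collections import defaultdict
from statistics import mean


def yearly_feature_mean(songs, feature):
    """Return the mean value of one audio feature per year, ordered by year."""
    values_by_year = defaultdict(list)
    for song in songs:
        values_by_year[song["year"]].append(song[feature])
    return {year: mean(values) for year, values in sorted(values_by_year.items())}
```

Calling yearly_feature_mean(songs, "acousticness") yields one value per year, which can then be plotted as a line over time.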

"All songs sound the same"

The claim that most songs are similar, or that only a few stand out, is something we hear again and again. That's why we looked at the similarity within the individual features and compared the values across all years. It is noticeable that many audio features actually show very little variation, whereas instrumentalness, acousticness and speechiness show a wider spread than the other features. At the same time, this means that these features are probably a lesser criterion for whether songs end up in the charts or not.
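
One simple way to quantify how much a feature varies is its standard deviation across all songs. The sketch below uses that measure as an assumption; the measure actually used for the comparison on this page is not stated, and the function name and record layout are hypothetical.

```python
from statistics import pstdev


def feature_spread(songs, features):
    """Return the population standard deviation of each feature across all songs."""
    return {
        feature: pstdev([song[feature] for song in songs])
        for feature in features
    }
```

A small value means that the songs are very similar with respect to that feature, while a large value means that the songs differ widely.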