Generating the Mountain Goats lyrics

John Darnielle with the green-scaled slipcase for In League with Dragons (Merge Records, via Giphy)

The Mountain Goats released In League with Dragons today, their seventeenth studio album.

John Darnielle has written a lot of words across the Mountain Goats’ back catalogue. His lyrics are poetic and descriptive, covering fictional and autobiographical themes that include substance abuse, professional wrestling and cadaver-sniffing dogs.

Can we generate new Mountain Goats lyrics given this rich text data set? This is a short post to do exactly that using the {spotifyr}, {genius} and {markovifyR} packages for R.

Get lyrics

The {spotifyr} package pulls artist and album information from the music streaming service Spotify, along with some interesting audio features like ‘danceability’ and ‘acousticness’. It also fetches lyrics from Genius via the {genius} package.

First get a developer account for the Spotify API. Run usethis::edit_r_environ() and add your client ID and secret in the form SPOTIFY_CLIENT_ID=X and SPOTIFY_CLIENT_SECRET=Y. The get_spotify_access_token() function will add an access token to your environment, which will authenticate each API request.

library(spotifyr)  # install.packages("spotifyr")
access_token <- get_spotify_access_token()
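
If the token request fails, it may be worth checking that the two environment variables are actually visible to your R session (R reads .Renviron at startup, so restart after editing it). Below is a minimal sketch using base R only; missing_spotify_vars() is a hypothetical helper of my own, not part of {spotifyr}.

```r
# Hypothetical helper (not from {spotifyr}): list credentials that are unset
missing_spotify_vars <- function(
  vars = c("SPOTIFY_CLIENT_ID", "SPOTIFY_CLIENT_SECRET")
) {
  values <- Sys.getenv(vars, unset = "")  # "" for anything not defined
  vars[values == ""]  # names of variables that are empty or unset
}

missing_spotify_vars()  # character(0) means both variables were found
```

An empty result means both variables are set; it does not confirm that the ID and secret themselves are valid.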

The get_discography() function fetches a named artist’s back-catalogue, including the lyrics. Beware: this may include some duplicates from different regions or because of reissues or deluxe versions.

goat_discography <- spotifyr::get_discography("the mountain goats")

You can run the line above, or you can just use download.file() to get an RDS version stored on rostrum.blog (note that this file will become out of date as the Mountain Goats release more material).

dim(goat_discography)  # number of rows (songs) and columns
## [1] 399  41

This is a relatively wide data frame with 41 columns of data for nearly 400 songs. Let’s simplify the columns and, for fun, look at five random songs and their ‘energy’.

library(dplyr)  # for data manipulation and %>%

goat_disco <- goat_discography %>% 
  ungroup() %>% 
  select(
    album_name, album_release_year,  # album
    track_name, track_number, duration_ms,  # track info
    key_name, mode_name, key_mode, tempo, time_signature,  # music info
    danceability, energy, loudness, mode, speechiness,  # audio features
    acousticness, instrumentalness, liveness, valence,  # audio features
    lyrics
  )

sample_n(goat_disco, 5) %>%
  select(album_name, track_name, energy)  # a sample
## # A tibble: 5 x 3
##   album_name         track_name                         energy
##   <chr>              <chr>                               <dbl>
## 1 All Eternals Deck  Damn These Vampires                 0.273
## 2 Heretic Pride      Marduk T-Shirt Men's Room Incident  0.171
## 3 All Eternals Deck  High Hawk Season                    0.27 
## 4 Goths              Shelved                             0.598
## 5 Nine Black Poppies Pure Money                          0.56

I’ll be saving this data frame for some other analysis, but for now we’ll need only the lyrics. The lyrics are stored in a list-column as a separate tibble (data frame) per song.

library(tidyr)  # for unnest()

goat_lyrics <- goat_disco %>%
  filter(lyrics != "NULL") %>%  # remove rows where lyrics weren't collected
  unnest(lyrics) %>%  # unpack the lyrics list-column
  filter(!is.na(lyric)) %>%  # remove empty lyrics
  select(-line) %>%  # unneeded column
  group_by(lyric) %>% slice(1) %>%  ungroup() %>% # remove duplicate lyrics
  pull(lyric)  # convert column to character vector

sample(goat_lyrics, 10)  # a sample
##  [1] "You can't cross the same river twice"                     
##  [2] "And other times the sickness howls"                       
##  [3] "Holt Boulevard"                                           
##  [4] "I know you've been waiting for a long, long time"         
##  [5] "Hands in your pockets and sun on your face"               
##  [6] "Anyone here mentions \"Hotel California\" dies before"    
##  [7] "The way the wind seems to pass straight through your body"
##  [8] "In a studio in Harlem"                                    
##  [9] "Wait for the coming disaster"                             
## [10] "Of the children playing double-dutch"
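
If the list-column step is hard to picture, here is a minimal sketch on a made-up three-song tibble (the track names and lyrics are invented, not real data). It swaps in is.null() and distinct() for the "NULL" comparison and the group_by()/slice() step, which is my assumption about a roughly equivalent approach and not what was run for this post.

```r
library(dplyr)  # for %>%, tibble(), filter(), distinct(), pull()
library(tidyr)  # for unnest()

# Invented data: one tibble of lyrics per song, with one song missing lyrics
toy <- tibble(
  track_name = c("Song A", "Song B", "Song C"),
  lyrics = list(
    tibble(line = 1:2, lyric = c("first line", "shared line")),
    NULL,  # lyrics weren't collected for this song
    tibble(line = 1:2, lyric = c("shared line", NA))
  )
)

toy_lyrics <- toy %>%
  filter(!vapply(lyrics, is.null, logical(1))) %>%  # drop songs with no lyrics
  unnest(lyrics) %>%         # one row per lyric line
  filter(!is.na(lyric)) %>%  # remove empty lines
  distinct(lyric) %>%        # keep the first instance of each line
  pull(lyric)                # convert column to character vector

toy_lyrics
```

The repeated line and the NA are dropped, leaving two unique lines. Unlike group_by(), distinct() keeps the lines in their original order and does not sort them.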

Generate lyrics

We can use a Markov chain to generate new lyrics based on our data set. Basically, it picks each next word at random from the words that followed the current word (or words) in the input data set, weighted by how often each one occurred. You can read more about this principle elsewhere.
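
To make the principle concrete, here is a toy sketch in base R. It is not how markovify is implemented: the three input lines are made up, the function names are my own, and it looks back only one word, whereas the model in this post uses a state size of two.

```r
# Toy corpus: three made-up lines standing in for the real lyrics
toy_lines <- c(
  "the goat climbs the hill",
  "the goat eats the grass",
  "the hill is green"
)

# Build a lookup: for each word, every word that followed it in the corpus.
# Repeats are kept, so common followers are proportionally more likely.
build_chain <- function(lines) {
  chain <- list()
  for (line in lines) {
    words <- c(strsplit(line, " ", fixed = TRUE)[[1]], "<END>")
    for (i in seq_len(length(words) - 1)) {
      chain[[words[i]]] <- c(chain[[words[i]]], words[i + 1])
    }
  }
  chain
}

# Walk the chain from a starting word until the end marker or a word cap
generate_line <- function(chain, start, max_words = 20) {
  out <- start
  current <- start
  while (length(out) < max_words) {
    followers <- chain[[current]]
    if (is.null(followers)) break  # word never seen in the corpus
    nxt <- followers[sample.int(length(followers), 1)]
    if (nxt == "<END>") break  # the line ended here in the corpus
    out <- c(out, nxt)
    current <- nxt
  }
  paste(out, collapse = " ")
}

toy_chain <- build_chain(toy_lines)
set.seed(1)  # for a repeatable result
generate_line(toy_chain, start = "the")
```

Every adjacent pair of words in the output occurred somewhere in the input, which is why generated lines often sound half-familiar.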

The {markovifyR} package is a wrapper for the Python package markovify, which ‘is a simple, extensible Markov chain generator’. You can install markovify at the command line via R’s system() function. {furrr} is also needed.

# system("pip install markovify")
library(markovifyR)  # remotes::install_github("abresler/markovifyR")
library(furrr)  # install.packages("furrr")

Now we can generate the model given all the lyrics.

markov_model <- generate_markovify_model(
    input_text = goat_lyrics,
    markov_state_size = 2L,
    max_overlap_total = 25,
    max_overlap_ratio = 0.7
  )

You can meddle with these controls, but I’ve kept to the suggested defaults for now. Note that the ‘overlap’ arguments limit how much of a generated sentence may match an existing lyric word-for-word, which reduces the chance of reproducing whole lines that already exist. See markovify for more detail.
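
As a rough illustration of what the overlap arguments guard against, here is a sketch in base R. It is loosely modelled on my understanding of markovify’s check and is not its actual code; passes_overlap_test() is a hypothetical name, and the simple substring matching is an assumption made for brevity.

```r
# Hypothetical helper: FALSE if the candidate copies too long a run of words
passes_overlap_test <- function(candidate, corpus,
                                max_overlap_ratio = 0.7,
                                max_overlap_total = 25) {
  words <- strsplit(candidate, " ", fixed = TRUE)[[1]]
  if (length(words) == 0) return(TRUE)  # nothing to check
  # Longest verbatim run allowed: a share of the sentence, capped by the total
  allowed <- min(max_overlap_total, round(max_overlap_ratio * length(words)))
  window <- min(allowed + 1, length(words))  # a run this long is too much
  starts <- seq_len(length(words) - window + 1)
  for (s in starts) {
    run <- paste(words[s:(s + window - 1)], collapse = " ")
    if (any(grepl(run, corpus, fixed = TRUE))) return(FALSE)
  }
  TRUE
}

corpus <- c("You can't cross the same river twice")
passes_overlap_test("You can't cross the same river twice", corpus)
passes_overlap_test("You can't cross the same blocked intersection", corpus)
```

The first call should be rejected because it reproduces an existing line; the second passes because no six-word run of it appears in the corpus. Lowering max_overlap_ratio makes the check stricter.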

Generate lines

Use the markovify_text() function with our markov_model object to generate single lines.

Fans of the Mountain Goats will no doubt recognise some of the phrases from existing songs.

goat_speak <- markovify_text(
  markov_model = markov_model,
  maximum_sentence_length = NULL,
  output_column_name = 'goat_speak',
  count = 50,
  tries = 100,
  only_distinct = TRUE,
  return_message = TRUE
)
## goat_speak: The human element drags you down to hell
## goat_speak: High as a rose
## goat_speak: And I know who might or might not be allowed to touch anything
## goat_speak: Open up your fishnets
## goat_speak: Clouds bounced against the sink
## goat_speak: I cut myself a switchblade
## goat_speak: I take what I needed
## goat_speak: But in the net
## goat_speak: I'm lying on the edge
## goat_speak: For dear life, I guess but Jesus what a mess
## goat_speak: They formed a heart pumping blood
## goat_speak: But it's something I won't be cashing in your car and we clink our glasses
## goat_speak: You can't cross the same blocked intersection
## goat_speak: Long dinner with some skinheads
## goat_speak: We are warm and the completeness
## goat_speak: You were standing in the skin
## goat_speak: Lighter than the first time
## goat_speak: Leann Rimes on the kitchen
## goat_speak: Down on the West Coast
## goat_speak: Red squirrel looking down to the end of things, where the blinds back up
## goat_speak: Planets in the front door
## goat_speak: And God is my resting place
## goat_speak: On the morning when I should not be your boy
## goat_speak: We've got stars in the harbor dawn
## goat_speak: Always seems to pass straight through your hair
## goat_speak: I let the water all day
## goat_speak: Where will the spell remain?
## goat_speak: And they pick me up and blesses me
## goat_speak: We went down to the shadows but the shadow of a second rate songwriter from the darkening light
## goat_speak: Remember soaring higher than a vanishing act
## goat_speak: And some days I don't know what I was cold, clear water in the door
## goat_speak: Headed straight for the flash
## goat_speak: But everything ends in a slow drawl
## goat_speak: And they're trying to beat to death
## goat_speak: Try to see the look on my yellow shirt
## goat_speak: In the squall of the hidden self-inflicted wounds
## goat_speak: And there was wind
## goat_speak: When I try to remember all your weapons
## goat_speak: Can't ever set aside an hour
## goat_speak: And brought me a seat?
## goat_speak: Wear black wherever you may be the place the river began to bend
## goat_speak: And I am the white bird was gone
## goat_speak: I will be hell to pay down the strangers at the racetrack
## goat_speak: Maybe tomorrow, maybe the next day someone's initials
## goat_speak: And I walked out of breath
## goat_speak: On the morning you went down to the floor now every last one
## goat_speak: Friends who don't have to come
## goat_speak: We'd been staring at the water hotter than the devil's heart
## goat_speak: For several hours and then orange then opting for secession
## goat_speak: But I will come at me in jail until you see

I ran this function a few times and here are a few outputs that made me laugh (or think):

  • But I felt all the Portuguese water dogs?
  • I write reminders on my kimono that I could not remember
  • Leann Rimes on the ocean
  • Sunset spilling through your megaphone
  • It’s the most gorgeous cow I’d ever wanted
  • I hope I never liked Morrissey
  • Went and got the case of vodka from a disco in old east Berlin
  • Fresh coffee at sunrise, warm my lips like a dying man
  • But my love is like a tattoo into my ear
  • And you brought me a bowl of cooked wild grasses
  • We had hot caramel sticking to her skin
  • And then the special chicken
  • And a bird we would have liked brought the Norman invasion
  • How come there’s peacocks in the face of the rainbow

Generate a verse

You can also seed the first word of each sentence. Choose the seed words carefully and you can create a sort-of plausible-sounding stanza.

goat_speak <- markovify_text(
  markov_model = markov_model,
  maximum_sentence_length = NULL,
  output_column_name = 'goat_lyric',
  start_words = c("I", "And", "But", "So"),
  count = 1,
  tries = 100,
  only_distinct = TRUE,
  return_message = TRUE
)
## goat_lyric: I felt free, and I won't be here for
## goat_lyric: And the dying hours dry
## goat_lyric: But you're gonna feel it in my pocket
## goat_lyric: So I tore off to the dock dry

…or not.

I think John Darnielle probably remains the best generator of Mountain Goats lyrics for now.

Session info

## [1] "Last updated 2019-09-08"
## R version 3.5.2 (2018-12-20)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.6
## 
## Locale: en_GB.UTF-8 / en_GB.UTF-8 / en_GB.UTF-8 / C / en_GB.UTF-8 / en_GB.UTF-8
## 
## Package version:
##   askpass_1.1      assertthat_0.2.1 backports_1.1.4  base64enc_0.1.3 
##   BH_1.69.0.1      blogdown_0.11    bookdown_0.9     cli_1.1.0       
##   clipr_0.6.0      codetools_0.2-16 compiler_3.5.2   crayon_1.3.4    
##   curl_3.3         digest_0.6.19    dplyr_0.8.1      evaluate_0.14   
##   fansi_0.4.0      furrr_0.1.0      future_1.13.0    genius_0.0.1.0  
##   globals_0.12.4   glue_1.3.1       graphics_3.5.2   grDevices_3.5.2 
##   grid_3.5.2       highr_0.8        hms_0.4.2        htmltools_0.3.6 
##   httpuv_1.5.1     httr_1.4.0       jsonlite_1.6     knitr_1.23      
##   later_0.8.0      lattice_0.20-38  listenv_0.7.0    lubridate_1.7.4 
##   magrittr_1.5     markdown_1.0     markovifyR_0.101 Matrix_1.2-16   
##   methods_3.5.2    mime_0.7         openssl_1.4      parallel_3.5.2  
##   pillar_1.4.1     pkgconfig_2.0.2  plogr_0.2.0      promises_1.0.1  
##   purrr_0.3.2      R6_2.4.0         Rcpp_1.0.2       readr_1.3.1     
##   reticulate_1.12  rlang_0.4.0      rmarkdown_1.13   rvest_0.3.4     
##   selectr_0.4.1    servr_0.13       spotifyr_2.1.0   stats_3.5.2     
##   stringi_1.4.3    stringr_1.4.0    sys_3.2          tibble_2.1.3    
##   tidyr_0.8.3      tidyselect_0.2.5 tinytex_0.13     tools_3.5.2     
##   utf8_1.1.4       utils_3.5.2      vctrs_0.1.0      xfun_0.7        
##   xml2_1.2.0       yaml_2.2.0       zeallot_0.1.0