[Devember 2021] Spotify archiver (for playlists etc)

Full name pending, work name SAINT

Project

I’ll be creating a tool to archive/back up whatever makes sense from the Spotify API regarding the user’s usage of it. Off the top of my head, at least the following data will be stored in the application’s database:

  • Play history
  • User’s own playlists, followed playlists
  • Liked lists

The main goal is to archive relevant information into a database that the user has control over.

So in a sense, I want to back up my Spotify profile for my own use, hopefully with some data analysis later on.

Technologies

While Linode has sponsored Devember with some free credits for new signups, I want to run this thing as cheaply as I can in perpetuity, with as little maintenance as possible. It will therefore (initially) be hosted on AWS, as I’m already familiar with its services and am interested in the ecosystem.

The tech/service stack I’ll be using will most likely be:

  • Telegram
  • Spotify
  • Terraform
  • Python
  • AWS Lambda
  • AWS API Gateway
  • AWS EventBridge or Step Functions
  • AWS DynamoDB?

The plan so far is:

  • User interaction through Telegram’s bot API where Telegram will then call the bot through its webhook system
  • Lambda function to do the API calls to Spotify in a scheduled manner and store this data somewhere
  • Lambda function to process the data scraped from the API and store it in the database
  • Database, will probably use DynamoDB, although I haven’t really used it before
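
As a rough sketch of the "process the data" step, assuming the scraper Lambda has already fetched the JSON from Spotify's recently-played endpoint: the function below flattens that response into one record per play. The `pk`/`sk` key layout and the function name are hypothetical, just one way the items could be shaped for DynamoDB, not a decided schema.

```python
from typing import Any


def flatten_play_history(user_id: str, payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn a recently-played response into flat records, one per play."""
    records = []
    for item in payload.get("items", []):
        track = item["track"]
        records.append(
            {
                # Hypothetical key layout: one partition per user, sorted by play time.
                "pk": f"USER#{user_id}",
                "sk": f"PLAY#{item['played_at']}",
                "track_id": track["id"],
                "track_name": track["name"],
                "artists": [artist["name"] for artist in track["artists"]],
            }
        )
    return records
```

Keying on the play timestamp means re-processing the same response overwrites existing records instead of duplicating them, which should make overlapping scheduled runs harmless.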

The architecture will probably end up looking something like this from a top level point of view (just a quick draft):

Project goals

Devember goals (must have):

  • Have a working setup that can be deployed from nothing to running on an AWS account
  • Archive/backup data relevant to the user’s music preferences (see above)
  • Allow user to download their archived/backed up stuff in some format
  • (Not sure if this is possible, but let’s assume yes) “Restoration” of the backups, for example creating a new playlist from the song data in the database
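
On the restoration point: the Spotify Web API does have endpoints for creating a playlist and adding items to it, and adding items is limited to 100 per request. A minimal sketch of the batching that implies, assuming the archived track URIs have already been read from the database (the function name is made up for illustration):

```python
from typing import Iterator

# Spotify accepts at most 100 items per "add items to playlist" request.
MAX_ITEMS_PER_REQUEST = 100


def batch_track_uris(uris: list[str], size: int = MAX_ITEMS_PER_REQUEST) -> Iterator[list[str]]:
    """Yield consecutive slices of the URI list, each small enough for one request."""
    for start in range(0, len(uris), size):
        yield uris[start:start + size]
```

Each yielded batch would then go into one request against the new playlist, in order, so the restored playlist keeps the archived track order.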

Stretch goals (will make it into Devember if there’s time left, but also something I’ll probably work on at a later date anyway):

  • Try porting it to Kubernetes for the fun of it (looking at LKE)
  • Web UI maybe, some kind of Last.fm ripoff website
  • Data analysis of some sort
  • Automated deployment through GitHub Actions or something similar
  • ???

Repo

https://github.com/notimetoidle/saint


Been working on this since last week, progress so far:

  • Basic project structure is there
  • Some documentation
  • Terraform modules for deploying the API Gateway and the Lambda functions are there. The stack automatically registers the webhook with Telegram and correctly receives messages through it (not parsed so far, though)
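
For the parsing that is still missing, something along these lines might be enough to start with. This is only a sketch, assuming the Lambda sits behind an API Gateway proxy integration that passes Telegram's update through as a plain JSON string in `body`; `extract_command` is a hypothetical helper name.

```python
import json
from typing import Any, Optional


def extract_command(event: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return chat ID, command and arguments, or None if the update is not a /command."""
    update = json.loads(event.get("body") or "{}")
    message = update.get("message") or {}
    text = message.get("text") or ""
    if not text.startswith("/"):
        return None
    command, _, args = text.partition(" ")
    # In group chats Telegram appends "@botname" to the command; drop it.
    command = command.split("@", 1)[0]
    return {"chat_id": message["chat"]["id"], "command": command, "args": args.strip()}
```

Updates without a text message (edits, stickers, joins and so on) simply return `None` here, so the handler can acknowledge them and move on.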

Well, while I’ve been working on this on and off, I certainly won’t be finishing this before 2022 :smiley:

Some notes on the road so far:

  • Coming up with a good Terraform code architecture/structure is harder than I thought, not having used it that much (I mostly have experience with small TF projects and CloudFormation on the bigger scale)
    • Having read a few articles/blog posts on this I think I now have a vision of what might work
  • The Telegram bot was a small monolith, which I didn’t like, so it’s now split up into several parts where data and invocations go through EventBridge (the API/webhook handler sends events based on /commands)
  • Not having used NoSQL databases that much, DynamoDB is giving me a small headache. The only basically-free alternative I can think of would be storing an SQLite database in a bucket and using that somehow, but that would be a pretty bad database system in this landscape
  • Spotify OAuth is not making the user experience very fluid as far as connecting the user’s Spotify profile to their Telegram account goes. Or maybe there is something I’m missing
    • Right now I have it set up so that the user first has to register their profile with the bot and then do the OAuth login, after which the backend can match the Spotify user ID to the Telegram user/chat ID
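
On the OAuth linking: one option that might smooth this out is the standard OAuth `state` parameter, which Spotify echoes back to the redirect URI unchanged. The sketch below is not how the bot works today; it assumes the bot sends the user a login link with the Telegram chat ID signed into `state`, so the callback can recover it. The function names, the secret handling and the scope list are illustrative assumptions.

```python
import hashlib
import hmac
from typing import Optional
from urllib.parse import urlencode

AUTHORIZE_URL = "https://accounts.spotify.com/authorize"


def sign_state(chat_id: int, secret: bytes) -> str:
    """Pack the chat ID and an HMAC over it into one state string."""
    payload = str(chat_id)
    digest = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{digest}"


def verify_state(state: str, secret: bytes) -> Optional[int]:
    """Return the chat ID if the signature matches, otherwise None."""
    payload, _, digest = state.partition(".")
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(digest.encode(), expected.encode()):
        return None
    return int(payload)


def build_login_url(chat_id: int, client_id: str, redirect_uri: str, secret: bytes) -> str:
    """Build the Spotify authorization URL the bot would send to the user."""
    params = {
        "client_id": client_id,
        "response_type": "code",
        "redirect_uri": redirect_uri,
        # Assumed scopes for play history, playlists and liked songs.
        "scope": "user-read-recently-played playlist-read-private user-library-read",
        "state": sign_state(chat_id, secret),
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"
```

The callback Lambda would call `verify_state` on the returned `state` before exchanging the code. A real setup would probably also want an expiry or one-time nonce in the state so old links cannot be replayed.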

As for the inevitable Kubernetes port (as I’ve been toying around with k3s on my home servers), I’ve been looking a little bit at some components which would allow a similar setup in Kubernetes:

  • For functionality similar to Lambda functions, both OpenFaaS and Knative look pretty neat, at least on the surface
  • Replacing the EventBridge dependency would be Argo Events I guess, or maybe a more traditional MQ might work