Introduction
I’m planning to do something a little different with this post. It’s probably more code heavy than my usual writing and more wordy than my usual coding. I wanted to share it because I think it’s simultaneously an interesting story and potentially useful script. My hope is to provide a gentle introduction to Application Programming Interfaces (or APIs): what they are, how to use them, and why having open access to them is so important in today’s society. Some familiarity with Python will be helpful when it comes to understanding the code snippets, but feel free to skip over those sections if you’re not feeling it.
Furthermore, I’d also caution in advance that this post contains a brief discussion of self-harm related content and mature language. It’s okay to set boundaries for yourself and skip to the code if that topic is sensitive.
While the narrative is my own, the code here is based on code samples from Google and Spotify licensed under the Apache 2.0 License.
Google’s quick start guide is available at: https://developers.google.com/youtube/v3/quickstart/python
Spotify’s quick start guide is available at: https://developer.spotify.com/documentation/web-api/quick-start/
You may obtain a copy of the Apache 2.0 License at: http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
The Python code that follows is designed to be run in Google Colab. Hypothetically, you should be able to open this code in your own notebook by using the following link: https://colab.research.google.com/github/rruff82/misc/blob/main/YTM2Spotify_clean.ipynb
The reason I’m using Colab for this is that it makes it extremely easy to authenticate to your Google account. Please understand that doing things the easy way carries inherent risks, and you should be mindful of any code you run in this manner.
If you’ve never used an interactive notebook before, it’s broken down into “cells” which can be either text or code. I’m going to talk you through all the steps I went through to migrate my likes and describe what I was thinking at each stage of the process. You’ll need to run each code cell one at a time as you follow along. You can do this directly in your web browser by selecting a code cell and either clicking the “play” icon on the left or by using CTRL+Enter on your keyboard.
Background
This story begins back in 2014 when Google did a collabaration with Marvel to release the Guardians of the Galaxy soundtrack for free on Google Play Music. Total marketing campaign success. It got me using the app more frequently, and the more I used it, the more I loved it! Before long I was buying an album a month and it seemed totally reasonable to pay for a subcription to the service.
The performance of the Play Music app was top notch. I especially appreciated the option for caching songs while on WiFi so I could listen to my favorites without eating through my cell phone’s data plan. It also gave me the ability to upload some of the more obscure albums in my collection and easily integrate them into playlists with subscription content. As I started liking songs on the platform, the algorithm started to get really good at providing me with new music that I actually liked. Furthermore, the integration with Google’s voice activated assistant was perfect for controlling my music mix while driving.
Unfortunately, Google terminated the Play Music service in 2020 and replaced it with YouTube Music. They migrated over my library well and my Play Music subscription had been upgraded to suppress ads on YouTube, but as a music service it wasn’t really meeting my needs. Songs started cutting out when my phone changed networks. The voice commands I relied on while driving stopped working. I’m also pretty sure the web app had a memory leak which would cause it to crash if I listened to a single station for too long. Everything that I loved about “Play Music” gradually slipped away.
The nail in the coffin came in 2022 when Google started rolling out content filters on YouTube that flagged videos for depictions of “self-harm”. I can understand that this is a good idea in theory, but it essentially made YouTube Music inoperable for me because there was no way to bypass this warning within the YouTube Music app. Rage Against the Machine’s entire debut album became inaccessible on the platform overnight — presumably because of the cover art’s depiction of self-immolation. What was most frustrating was that this ban was predominantly affecting playlists I had explicitly made to cope with the grief of having lost close family members to suicide. When you’ve relied on certain playlists of music to help you process painful emotions for years, having that safety net abruptly taken away from you by a third-party vendor feels pretty fuckin’ horrible. And then they had the balls to try to raise the price of their shitty service too? I felt so betrayed.
I don’t think Google understood the role music really played in my life. They knew the music was a commodity that I was willing to pay for, but they didn’t know the extent to which music was the cornerstone of my emotional foundation. Heck, I don’t think I understood it fully either. Other people talk about feeling as originating from the heart, but I’ve started to wonder if I’m different. I feel like my feelings are closely tied to the physical sensations I associate with music. It’s in my fingers when I feel the vibration of the strings against my fingers as I press down up on the frets. I feel it in my lungs when I try to belt out that falsetto. I can feel it resonate through my whole body when I stand to close to the bass amp. Music is more than just a background soundtrack in my life; the rock opera is a mirror to my soul.
For better or worse, Google had forced me into paying closer attention to how much of my music library was connected to suicidal themes. It was disturbing see some corners of my music collection getting banned while also seeing obvious references go unnoticed by the algorithm. The lyrics to “Pardon Me” by Incubus describe the same event as shown in Rage Against the Machine’s album cover but somehow wasn’t affected by the filter. Who gives a damn what the album cover looks like if I’m listening to it in my car? Trying to protect me from unanticipated self-harm related imagery in music is a impossible task when I grew up listening to artists like Kurt Cobain, Chris Cornell, and Chester Bennington. The line between the music that heals and the music that hurts is one that that only I can draw.
The problem with migrating to a new music service lies in training it to understand which music I want and what I don’t. Over the 8 years I used Google’s music services I had accumulated over 2200 “likes” and built dozens of playlists. This music profile has an emotional value that made it difficult to leave the platform. Cory Doctorow recently wrote about the “enshittification” of social media platforms, and I couldn’t help but feel that’s essentially what happened to Play Music. I was locked in by my data and needed a way out.
Having made a decision to leave, signed up for Spotify and found a service called TuneMyMusic that imported some of my key playlists. However, there was a 500 song limit on what they’d transfer for free and the playlists I did import weren’t as accurate as a would have liked. There was also the problem of it playing a lot of music that I didn’t like on radio because my library of favorites was still empty. Hearing Creed in my alt rock radio is enough to make me want to break shit. That’s when I decided I would use the Spotify API to just move my Youtube Music “likes” over myself.
There’s something empowering about being able to do this. The continual enshittification of Twitter serves as a constant reminder of the powerful impact of API access. Despite all its flaws, Google had graciously provided me with all the tools I’d need to exit it’s own platform — provided I was willing to put in some effort. I respect that and it’s comforting to know I can always reverse this process later if I so choose.
At the same time, this is probably not something the “average person” would be able to do on their own. But I think they could do it with assistance! I wrote this code primarily for myself, but hopefully this will provide a detailed enough example for others to build on.
I don’t think of APIs as a difficult concept to understand. They’re basically just a computer’s version of a contract for certain services. The difficulty lies mostly in knowing of their existence in the first place. You have to know who provides the service and what they allow you to do with it. Without some idea of what’s possible using an API, you’d have no idea what you could use them for!
Code
First, we’ll need some credentials to use the APIs. These credentials contain sensitive information that proves we’re supposed to have access to the APIs we want to use. It’s good practice to keep these secrets separate from your code, so I’m storing them in Google Drive because it’s convenient.
PLEASE DON’T SHARE THESE FILES WITH ANYONE!
For Youtube Music, you’ll need to first open up the Google Cloud Console:
Go to “Credentials” and select “Create OAuth client ID”. Configure the consent screen then create and download OAuth client secret. Upload this file to google drive as “creds_google.json”.
For Spotify, you’ll need to go to the Developer Dashboard:
Select the option to create a new application. Where it asks you for a “redirect URI”, add “https://localhost:8888/callback”. We won’t actually use this since we’re not building a full web application, but we need to put in something here so we’re choosing something that would be useful in a debugging environment later.
Once you’ve created the credentials, take note of the “client_id” and “client_secret”. You’ll also need your Spotify username which you can find on the account overview page:
Using your favorite text editor, create a new text file containing the information above using the following JSON format:
{
"client_id":"REPLACE WITH CLIENT ID",
"client_secret":"REPLACE WITH SECRET",
"user_id":"REPLACE WITH SPOTIFY USERNAME"
}
Save and upload to Google Drive as “creds_spotify.json”.
Next, we’re going to install a Python module to simplify access to the Spotify API. The following command will tell Colab to install the latest “Spotipy” package from the PyPI repository. We do this first because we might need to restart the interpreter afterwards.
!pip install spotipy
Now that the “Spotipy” package is installed, we’ll import all the modules we’ll need for the rest of the script. Typically I import them as I produce my code but collect them at the top. This makes it easy to check that all my prerequisites are accounted for before I get too deep into my code.
# The first section are modules we'll need to read our credentials files in
import json
import os
import os.path
# The following are modules for accessing the Spotify API using the package
# we just installed.
import spotipy
import spotipy.util as util
from spotipy.oauth2 import SpotifyOAuth
from spotipy.oauth2 import SpotifyClientCredentials
# We'll need "sleep" so that I can limit the rate of my Spotify API calls
from time import sleep
# The following are modules that we'll need to authenticate to the Youtube API
import google_auth_oauthlib.flow
import googleapiclient.discovery
import googleapiclient.errors
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google.colab import drive
Since we stored our credentials in Google Drive, we’re going need to give ourself access to it from this notebook. The following line will tell the interpreter to treat our Google Drive as a local folder. Please exercise caution when doing this as we’re technically granting more access than we actually need.
drive.mount('/content/drive',readonly=True)
Now that we have access to our credentials files, we’ll need to authorize the application to read our playlists from Youtube. The following code will prompt you to log in with your Google account and warn you that you are providing your information to an application that’s under development. This is okay because we created the app. Copy and paste the code from the confirmation screen into the running cell to complete the authorization process.
api_service_name = "youtube"
api_version = "v3"
client_secrets_file = "/content/drive/My Drive/creds_google.json"
scopes = ["https://www.googleapis.com/auth/youtube.readonly"]
flow = google_auth_oauthlib.flow.InstalledAppFlow.from_client_secrets_file(
client_secrets_file, scopes)
credentials = flow.run_console()
The following code may prompt you to enable the YouTube API for the project in the Google Cloud Console the first time you run it. Follow the directions as needed.
youtube = googleapiclient.discovery.build(
api_service_name, api_version, credentials=credentials)
Getting this far means we now have access to our YouTube data. Youtube Music basically stores all your likes in a playlist called “LM”. I’m presuming it stands for “Liked Music”. The API has a limit to how many results it would return at a time, so the following code will loop through the pages of the API response until we’ve collected the entire list.
def get_all_items_by_playlist(playlistId="LM"):
request = youtube.playlistItems().list(
playlistId=playlistId,
part="contentDetails",
maxResults=50
)
response = request.execute()
all_items = response['items']
while response.get('nextPageToken',"") != "":
request = youtube.playlistItems().list(
playlistId=playlistId,
part="contentDetails",
pageToken=response['nextPageToken'],
maxResults=50
)
response = request.execute()
all_items.extend(response['items'])
return all_items
all_items = get_all_items_by_playlist()
As a quick check, I compared the length of this list with my expected value of 2255 to ensure I’m getting the correct playlist.
print(len(all_items))
Next I took a peek at the structure of the data by printing the first 5 items.
all_items[0:5]
It turns that “contentDetails” didn’t contain very many details about the content. There’s not enough information to identify a song based on this, so I reached back out to the YouTube API and asked for a “snippet” instead.
def get_music_metadata(playlistItem):
request = youtube.videos().list(
id=playlistItem['contentDetails']['videoId'],
part="snippet"
)
response = request.execute()
return response
test_md = get_music_metadata(all_items[1])
print(test_md)
At least now we have some information to work with here! There’s no clear cut way to pick out the song and artist though. Some YouTube videos have both the artist and the song in the title, but the majority only include the artist name in the channel. For my purposes, I’m going to treat the channel name up to the first dash as a proxy for the artist and video title as the song title. It’s not going to be 100% accurate, but hopefully will be close enough that Spotify can figure it out.
def guess_song_title(metadata):
if len(metadata['items']):
return metadata['items'][0]['snippet']['title']
else:
print('Failed to find song name for item:')
print(metadata)
return "Unknown Song"
def guess_artist_name(metadata):
if len(metadata['items']):
return metadata['items'][0]['snippet']['channelTitle'].split('-')[0].strip()
else:
print('Failed to find artist for item:')
print(metadata)
return "Unknown Artist"
print(guess_song_title(test_md))
print(guess_artist_name(test_md))
Having verified that I get a reasonable guess as to the song and artist associated with a sample video ID, I looped back through my list of all likes to acquire the rest of the dataset.
for i,item in enumerate(all_items):
all_items[i]['metadata'] = get_music_metadata(item)
all_items[i]['artist'] = guess_artist_name(all_items[i]['metadata'])
all_items[i]['song'] = guess_song_title(all_items[i]['metadata'])
My results showed that my script failed on 1 item which seems to have been deleted, but that’s not a bad success rate considering how many it ran through. The next step is to see if we can find the songs on Spotify, so lets pull our credentials from the file we created earlier.
SPOTIFY_CREDS = "/content/drive/My Drive/creds_spotify.json"
with open(SPOTIFY_CREDS,"r") as f:
spotify_credentials = json.load(f)
Now that we have our credentials in hand, we need to authenticate to the Spotify API like we did with Google earlier. Note that the redirect URI appears again here, so you may need to update this if you used something different.
market = [ "AD", "AR", "AT", "AU", "BE", "BG", "BO", "BR", "CA", "CH", "CL", "CO", "CR", "CY",
"CZ", "DE", "DK", "DO", "EC", "EE", "ES", "FI", "FR", "GB", "GR", "GT", "HK", "HN", "HU",
"ID", "IE", "IS", "IT", "JP", "LI", "LT", "LU", "LV", "MC", "MT", "MX", "MY", "NI", "NL",
"NO", "NZ", "PA", "PE", "PH", "PL", "PT", "PY", "SE", "SG", "SK", "SV", "TH", "TR", "TW",
"US", "UY", "VN" ]
sp_scope = "user-library-read user-library-modify"
sp_credentials = SpotifyClientCredentials(
client_id=spotify_credentials["client_id"],
client_secret=spotify_credentials["client_secret"])
auth_manager = SpotifyClientCredentials(client_id=spotify_credentials["client_id"],client_secret=spotify_credentials["client_secret"])
sp_oauth = SpotifyOAuth(
username=spotify_credentials["user_id"],
scope=sp_scope,
client_id=spotify_credentials["client_id"],
client_secret=spotify_credentials["client_secret"],
redirect_uri="https://localhost:8888/callback",
open_browser=False
)
Like the Google API before it, Spotify will also ask us to provide an access token. We don’t actually have a web server set up to provide a nice user interface here, so you’re going to need to pull the access token from the address bar in your web browser.
When you run the following cell, it provide a link that opens in your web browser. When Spotify completes the authorization, it sends you to the redirect URI which will likely fail because your computer is probably not running a web server on the specified port (if you are, you might need to adapt a little here). This “Site can’t be reached” error is okay, but what we need here is the URL from the address bar. It should look something like this:
This whole URL will needed to be copy and pasted into this notebook to complete the authorization process.
sp_token = sp_oauth.get_access_token()
Now that we have an access token, let’s finish the the log in process and verify that we’re authenticated as the correct user.
sp = spotipy.Spotify(auth_manager=sp_oauth)
print(sp.current_user())
The next step is to use the artists and song titles we extracted from YouTube to find the corresponding track on Spotify. I’m going to put a dash between the artist name and the song title for my search string, then use the first hit in the results so see what I found.
def get_search_string(item):
return item['artist']+" - "+item['song']
res = sp.search(get_search_string(all_items[0]), type="track", market=market, limit=1)
print(res)
The key piece of information we need here is the “track id”.
res['tracks']['items'][0]['id']
Since we need to do this for a lot of songs, we’ll wrap this process in a function and verify that it works.
def get_track_id(item):
if item['song']=="Unknown Song" or item['artist']=="Unknown Artist":
print(f"Skipping unknown item: {item}")
return None
search_str = get_search_string(item)
res = sp.search(search_str, type="track", market=market, limit=1)
if len(res['tracks']['items']):
return res['tracks']['items'][0]['id']
else:
return None
my_track_id = get_track_id(all_items[0])
print(my_track_id)
Having extracted this track id, we can use it to add the item to our Spotify library. Before we do that, we should probably make sure it’s not already in the collection.
sp.current_user_saved_tracks_contains([my_track_id])
If it’s your first time running the above code, it should return “[False]” because we haven’t added anything to our library yet. Next, we’ll add it and try again.
def add_track_to_library(item):
track_id = get_track_id(item)
track_name = get_search_string(item)
if track_id is None:
print(f"Couldn't find track for: {track_name}")
return
if sp.current_user_saved_tracks_contains([track_id])[0]:
print(f"Track is already liked: {track_name}")
return
print(f"Attempting to add track: {track_name}")
sp.current_user_saved_tracks_add([track_id])
add_track_to_library(all_items[0])
sp.current_user_saved_tracks_contains([my_track_id])
The cell above should return “[True]” to indicate that the item has been successfully added to our library. The only thing left is to apply this process to the rest of our list. I’ve added a small sleep step to this loop to ensure that I stay within the Spotify usage limits
for i,item in enumerate(all_items):
add_track_to_library(item)
sleep(0.1)
Conclusion
That’s it! We’ve successfully migrated our YouTube Music Likes to Spotify Favorites. Now when Spotify tries to dynamically generate radio stations, it at least has some general information about what I’d like to listen to. There’s plenty of room for improvement in this code, but it got the job I needed done so I probably won’t be coming back to it any time soon.
If you’re looking for some project ideas to further develop your own understanding of these APIs, here are some potential modifications you might consider:
- Try to improve the search criteria for better accuracy.
- Try to create a new Spotify playlist from a Youtube Music playlist.
- Try to sync Spotify favorites back to YouTube Music.
- Try to implement this as a complete web application.
- Try to analyze patterns in your music collection using data available through Spotify’s “Audio Features”.
There’s just so much you can do with these APIs! All it takes is a little imagination and some elbow grease!