Decentralized Push Notifications for Podcasting
Building any type of podcast consumption or aggregation platform is terribly difficult. And a good 75% of the difficulty revolves around discovering new episodes in a timely manner when they are published. Podping is a system we created to take 100% of that difficulty away. It has been working 24/7 for the past 6 months. It handled 21,089 notifications yesterday.
If you aren’t familiar with the way the open RSS podcasting ecosystem works, it’s incredibly simple. A “podcast” is really just a file - a file that contains all of the episodes. It looks like this:
You can see the <item> section at the bottom of this picture. That <item> is an episode. There are many of them in this podcast file.
When a new episode is published for this podcast, all that happens is that this file (called the “RSS feed”) changes and a new <item> section is added to it. That’s it. That’s how podcasting works. It’s incredibly, elegantly simple.
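To make that concrete, here is a minimal sketch of spotting a new episode by comparing two copies of a feed. The sample feed is hypothetical and heavily trimmed, and matching on the `<guid>` tag is an assumption about how you might identify episodes, not something every feed guarantees.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed feed; real feeds carry many more tags.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Show</title>
    <item>
      <title>Episode 2</title>
      <guid>ep-2</guid>
    </item>
    <item>
      <title>Episode 1</title>
      <guid>ep-1</guid>
    </item>
  </channel>
</rss>"""


def episode_guids(feed_xml: str) -> list[str]:
    """Return the <guid> text of every <item>, in document order."""
    root = ET.fromstring(feed_xml)
    return [item.findtext("guid", default="") for item in root.iter("item")]


def new_episode_guids(old_xml: str, new_xml: str) -> list[str]:
    """Return guids present in the new copy of the feed but not the old one."""
    seen = set(episode_guids(old_xml))
    return [guid for guid in episode_guids(new_xml) if guid not in seen]


print(episode_guids(SAMPLE_FEED))
```

The catch is that you only learn about the new `<item>` after you have downloaded the file again, which is exactly the problem described next.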
But, that simplicity comes with a cost. It’s very hard to keep up with millions of files spread across the internet. And, it’s even harder to keep up with them in a way that lets you know very quickly when they have changed. It boils down to checking those files as fast as you can, over and over and over. It’s a total waste of energy, time, money and bandwidth.
If you want to know within 1 minute if a podcast has a new episode (i.e. if the RSS feed file has changed) you would have to hammer that server once every minute: did the feed change yet? Did the feed change yet? Did the feed change yet?
Now multiply that scenario times millions of podcasts and you see the scale of the problem.
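A rough back-of-the-envelope calculation shows the scale. This sketch assumes every feed is polled once a minute and uses the podcast count quoted later in this post; real platforms poll on mixed schedules, so treat it as an upper bound for illustration only.

```python
SECONDS_PER_DAY = 86_400
PODCAST_COUNT = 4_389_710      # podcast count quoted later in this post
POLL_INTERVAL_SECONDS = 60     # "did the feed change yet?" once a minute


def polls_per_day(feeds: int, interval_seconds: int) -> int:
    """Total feed requests per day if every feed is polled on a fixed interval."""
    return feeds * SECONDS_PER_DAY // interval_seconds


total = polls_per_day(PODCAST_COUNT, POLL_INTERVAL_SECONDS)
print(f"{total:,} requests per day")
print(f"{total / SECONDS_PER_DAY:,.0f} requests per second")
```

That works out to billions of requests per day, nearly all of which return a file that has not changed.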
Various solutions to this problem have been developed over the years. Most of them have been proprietary. A few, like RSScloud, have been open. The one that has gained the most traction and is open source is WebSub (formerly called PubSubHubbub). It solves the problem of how to be notified when an RSS feed file (the podcast) changes.
The way WebSub works is right there in the name: Sub. It uses a subscription mechanism. If my software wants to know when a podcast has changed, it can “subscribe” to that podcast’s RSS feed file by sending a subscription message to the “hub” that has been chosen to manage that podcast’s subscriptions. The “hub” computer will then send my server a message every time the file changes.
In the example feed above, you can see the hub server listed. Each feed that supports WebSub lists its hub server in a “link” tag that has the rel="hub" attribute.
This is great. It means that all I have to do is subscribe to each podcast feed file using its designated hub and I can just sit back and let the hub do all the work of notifying me when a new episode is published.
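Here is a small sketch of those two steps: finding the hub in a feed and building the subscription message. The `hub.mode`, `hub.topic`, `hub.callback` and `hub.lease_seconds` field names come from the WebSub specification; the feed, hub and callback URLs are made-up placeholders, and nothing is actually sent over the network.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical trimmed feed that advertises a hub.
SAMPLE_FEED = """<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example Show</title>
    <atom:link rel="hub" href="https://hub.example.com/" />
    <atom:link rel="self" href="https://feeds.example.org/podcast/rss" />
  </channel>
</rss>"""


def find_hub(feed_xml: str):
    """Return the href of the first Atom link with rel="hub", or None."""
    root = ET.fromstring(feed_xml)
    for link in root.iter(f"{{{ATOM_NS}}}link"):
        if link.get("rel") == "hub":
            return link.get("href")
    return None


def subscription_body(topic_url: str, callback_url: str, lease_seconds=None) -> str:
    """Build the form-encoded body that would be POSTed to the hub."""
    fields = {
        "hub.mode": "subscribe",
        "hub.topic": topic_url,        # the feed we want to hear about
        "hub.callback": callback_url,  # our webhook; must be reachable by the hub
    }
    if lease_seconds is not None:
        fields["hub.lease_seconds"] = str(lease_seconds)
    return urlencode(fields)


print(find_hub(SAMPLE_FEED))
print(subscription_body(
    "https://feeds.example.org/podcast/rss",
    "https://aggregator.example.net/websub/callback",
))
```

Note the callback URL: the subscriber has to run a server the hub can reach, which matters later in this post.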
But, it’s not all candy and nuts…
Subscriptions at Scale
WebSub subscriptions last, at most, 15 days. That means, in order to keep getting notified, you must track your subscription status in a database and be sure to resubscribe to that feed file before the 15 days (or whatever the hub allows) expires. The burden of resubscribing on a per-podcast basis every 7-15 days grows right along with the number of podcasts being monitored, and that number runs into 6 or 7 digits.
Some quick math shows that with 4,389,710 podcasts (at the time of writing) a podcast platform would be sending about 3.4 resubscriptions per second just to maintain those subscriptions if they all had 15-day expirations. They don’t. Some are even shorter at 7 days (at 7 days across the board it would be more than 7 per second), so the problem is even worse. At Podcastindex.org we maintain as many as 700k WebSub subscriptions at a given time. Our servers are constantly resubscribing to feeds 24/7.
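The quick math is easy to reproduce. This sketch assumes resubscriptions are spread evenly across the lease period, which is the best case; in practice they tend to bunch up.

```python
SECONDS_PER_DAY = 86_400


def resubscriptions_per_second(subscriptions: int, lease_days: float) -> float:
    """Average renewal rate needed to keep every subscription alive."""
    return subscriptions / (lease_days * SECONDS_PER_DAY)


for lease_days in (15, 7):
    rate = resubscriptions_per_second(4_389_710, lease_days)
    print(f"{lease_days}-day leases: {rate:.2f} resubscriptions per second")
```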
There Aren’t Many Hubs
Almost all publishers end up using Google’s appspot hub since it’s free to use. The other big hub on the net is Superfeedr, but it’s a paid service, so it sees less podcasting traffic than Google. This means that while WebSub is technically decentralized, in practice it’s highly centralized. If Google’s hub has issues or outages you stop getting notifications.
In essence, you can’t know if you aren’t receiving a notification from the hub because the feed didn’t update or because the hub is down. The silence tells you nothing.
Which leads to the last big problem…
WebSub Needs Servers
WebSub is a server protocol. In order to subscribe to change notifications for a podcast’s RSS feed file your software sends a message to the hub server telling it the address of your “webhook” server. The hub then sends the notifications to that “webhook” server each time. If your server isn’t online 24/7, you don’t get the notification. This makes it useless for apps and client software that can’t accept incoming connections on the internet.
Also, if your server goes down for an hour you have no way of knowing what notifications you missed. Because of this, most platforms still check podcast files over and over even if they also subscribe via WebSub. It’s the only way to safely ensure you aren’t missing notifications.
The Podping solution uses a public “blockchain” to record all podcast file updates into a public ledger that anyone can monitor without the need to subscribe to anything. It’s also highly decentralized, with each part of the ecosystem having redundancy, failover and geographic distribution.
Here is what the Podping network currently looks like:
Hopefully you can see how the Podping network operates from this diagram. Podcast hosting companies and platforms send a notification to any Podping server whenever a podcast RSS feed on their system changes or posts a new episode. The notification is just a standard web (GET) request and contains the podcast RSS feed’s url. If you’re curious it looks like this:
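As a rough sketch, a host could build that request like this. The `?url=` query parameter, the endpoint address and the `ExampleHost/1.0` user agent are assumptions for illustration; the Authorization header is the token mentioned later in this post. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

PODPING_ENDPOINT = "https://podping.cloud/"  # assumed endpoint address


def build_podping_request(feed_url: str, auth_token: str,
                          user_agent: str = "ExampleHost/1.0") -> Request:
    """Build (but do not send) a GET request announcing that feed_url changed."""
    query = urlencode({"url": feed_url})  # "url" parameter name is an assumption
    return Request(
        f"{PODPING_ENDPOINT}?{query}",
        headers={"Authorization": auth_token, "User-Agent": user_agent},
        method="GET",
    )


request = build_podping_request("https://feeds.example.org/podcast/rss", "YOUR-TOKEN")
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would send it. One call per changed feed is all a host needs to make.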
The Podping server accepts that feed URL and writes it to the Hive blockchain using one of the numerous Hive API servers scattered around the globe. Each block in the Hive blockchain can contain many podcast feed URLs and a new block is created every 3 seconds.
It’s also extremely fast. From the time that a hosting platform notifies Podping that a podcast RSS feed has changed until the time it shows up for apps to see it is about 45 seconds, on average. Less than a minute. A good part of that time (about 15-20 seconds) is actually just some internal buffering we do to make sure that quick, back-to-back feed updates don’t result in duplicate notifications.
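To illustrate the idea behind that buffering, here is a hypothetical sketch, not the actual Podping server code: each URL is held for a short window, and repeat pings for the same URL inside that window collapse into one. The class name and the 15-second default are made up for the example.

```python
class PingBuffer:
    """Hold feed URLs briefly so back-to-back pings produce one notification."""

    def __init__(self, window_seconds: float = 15.0):
        self.window_seconds = window_seconds
        self._pending = {}  # url -> time the first ping for it arrived

    def add(self, url: str, now: float) -> None:
        # A repeat ping for a URL already waiting is simply absorbed.
        self._pending.setdefault(url, now)

    def flush(self, now: float) -> list:
        """Return URLs that have waited out the window and forget them."""
        ready = [url for url, first_seen in self._pending.items()
                 if now - first_seen >= self.window_seconds]
        for url in ready:
            del self._pending[url]
        return ready


buffer = PingBuffer()
buffer.add("https://feeds.example.org/a", now=0)
buffer.add("https://feeds.example.org/a", now=5)   # duplicate, absorbed
buffer.add("https://feeds.example.org/b", now=10)
print(buffer.flush(now=15))
```

Timestamps are passed in explicitly so the behavior is easy to test; a real server would use a clock.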
No Subscription Required
Hive is also a fully public blockchain, so any software can monitor it and see those feed URLs show up every 3 seconds. So apps, directories, aggregators and anyone else can simply watch this public blockchain and see when a podcast posts a new episode. There are no subscriptions required and there is no need to even run a server. You can watch the blockchain on a workstation, smartphone app or web app.
Watching the blockchain for new episodes is as easy as this:
python3 -u ./hive-watcher.py --old=1 --urls_only
It’s just a simple Python script that outputs each podcast feed URL so you can capture that information in any other software and know to pull a fresh copy of that feed. When you watch the blockchain like this, you get the full firehose of podcast feed updates. If there are some you don’t care about, just ignore them. But, you’ll see them all regardless.
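A consumer of that output can be very small. This sketch assumes the watcher prints one feed URL per line when run with --urls_only; the `feed_urls` helper and the sample lines are illustrative, not part of the Podping code.

```python
def feed_urls(lines):
    """Yield feed URLs from watcher output, skipping blanks and other noise."""
    for line in lines:
        url = line.strip()
        if url.startswith(("http://", "https://")):
            yield url


# Stand-in for the watcher's output; in real use, iterate over sys.stdin.
sample_output = [
    "https://feeds.example.org/podcast/rss\n",
    "\n",
    "https://feeds.example.com/another/rss\n",
]

for url in feed_urls(sample_output):
    print("refetch", url)
```

In real use you would pipe the watcher into a script like this and hand each URL to whatever fetches and parses feeds in your system.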
To see the Podping network in action right now, you can go to Podping.watch and see each podcast update come through in real time.
Downtime? No Biggie
That other big gotcha of WebSub - your server has to be online 24/7 in order to never miss a webhook call - is solved with Podping as well. Since blockchains are immutable and permanent, you can always “walk” back in time to see which podcasts have updated. With Podping, your aggregators aren’t receiving notifications. They are reading notifications from the Hive blockchain every 3 seconds. If you need to reboot your aggregators, fine. Go for it. When they come back up after maintenance, they can just start back a few hundred blocks in the chain and pick up where they left off. No podcast notifications are ever missed.
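Working out how far back to start is simple arithmetic on the 3 second block time. In this sketch the function names and the 100-block safety margin are assumptions; the only fact taken from above is the block interval.

```python
import math

BLOCK_INTERVAL_SECONDS = 3  # Hive produces a new block every 3 seconds


def blocks_to_rewind(downtime_seconds: float, safety_margin_blocks: int = 100) -> int:
    """How many blocks back to start reading after being offline."""
    missed = math.ceil(downtime_seconds / BLOCK_INTERVAL_SECONDS)
    return missed + safety_margin_blocks


def restart_block(head_block: int, downtime_seconds: float,
                  safety_margin_blocks: int = 100) -> int:
    """Block number to resume from, given the chain's current head block."""
    rewind = blocks_to_rewind(downtime_seconds, safety_margin_blocks)
    return max(1, head_block - rewind)


# Ten minutes of maintenance is only a couple hundred blocks.
print(blocks_to_rewind(600, safety_margin_blocks=0))
```

Overshooting is harmless: reading a few blocks twice just means refetching a feed you already have. If your aggregator records the last block it processed, it can skip the estimate and resume from that block directly.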
It’s hard to overstate how important this is. We live in a world of BGP routing issues, DDoS attacks and constant server patching. Problems do happen. It’s great to not have to worry about whether those things will impact your ability to get podcast episodes in a timely manner for your users.
Podping is 100% open source. The code is here, and anyone can run a Podping node if they want to. To bootstrap the network, however, we at Podcastindex.org are running 3 geographically load-balanced Podping servers. They are very cheap to run (about $12 each) and each one can handle over 2 million pings per day, which is plenty of capacity for years to come. We’ve also invested in the necessary Hive blockchain credits (again, not very expensive) to get things going. It’s a public service we are glad to provide forever.
There are 7 hosting companies already using Podping.cloud, which between them publish more than 210,000 podcasts:
There are also some independent podcasters sending updates as well. These hosts and independents could run their own Podping servers just as well, but Podping.cloud just makes the whole thing easier. In the future this will become more and more distributed.
For hosting companies, Podping offers a simple (single call) way to ensure that your customers’ episodes show up rapidly without the entire world of podcasting hammering you constantly for updates. The Podcastindex.org API doesn’t even poll those 7 hosting companies. We rely entirely on Podping to be notified of new episodes. It’s an important way we will eventually make this industry more green.
To start sending updates to Podping.cloud just reach out to us and we can give you an Authorization header token and an example of how to send the notifications. It’s very simple. If you already support WebSub, it would be effortless to add an extra call to Podping.
But, Wait… There’s More!
What we have now, with Podping, is a simple push-only notification system for publishers to let the whole world know when a podcast publishes a new episode, or if the feed changes. But, this is just the foundation of what is to come.
With a global push notification system we can now send different types of messages for podcasts. One of the first uses of this will be the forthcoming <podcast:liveItem> tag in the Podcast Namespace. We will be adding a “reason=live” property to Podping so that a publisher can signal to all watching platforms that a podcast has just started a live stream.
Think of all the possibilities this offers: being able to send out different notification types when certain events happen with your podcast or episodes. It’s pretty mind-blowing, and we’re just getting started.
All of the Podcasting 2.0 projects we create are open source and we’d love to have anyone and everyone participate to make them better. If you are a Rust or Python developer with an interest in this topic, feel free to join the discussions at Podcastindex.social and help us build Podping.
How do we use the live tag when hosting from wordpress/powerpress plugin?
Podping is now supported in Hive-Tube: https://hive-tube.com