IPFS, Again

By Tom MacWright

In 2017, I wrote about how to decentralize your website with IPFS. I wasn’t able to do it back then.

Two years later, a few things have changed. Instead of 52 million dollars, Protocol Labs has about 257 million more, bringing the total over 300 million dollars: the kind of money where you start giving money away and hiring top-tier talent. Two years of development have produced a multitude of open source repositories: 245 in the IPFS organization, plus another 67 under multiformats, 168 under libp2p, and 102 under IPFS Shipyard.

So what should we expect of IPFS? At five years old, is this a project that’s usable ‘here and now’, as the homepage promised in 2017? Are all the parts in place, just waiting for web and application developers to see the light? Have the stumbling blocks I noticed in 2017 been smoothed over?

No

IPFS is still not usable for websites.

I tried, again, to make IPFS work. Where I ran into bugs, I reported them. I asked around in the forums, and hopped on a very informative call with a team from Pinata. When the process required scripting and workarounds on my side, I implemented what was necessary. I wrote & proposed fixes for many of the documentation issues.

I want something like IPFS to exist. I’m skeptical of the ‘crypto’ ecosystem that IPFS sits next to, but I also take the crisis of the web seriously. If blockchains prove to be useful, I’ll welcome them as part of the solution.

In other words, this isn’t a knee-jerk rant. I tried to make IPFS work until I had to call it quits. These are my notes, and some conclusions I reached.

The goal

The goal is the same as it was in 2017: make a decentralized version of macwright.org. A decentralized website.

macwright.org is the easiest possible kind of website to decentralize. It’s lightweight, has few external dependencies, no trackers, and a simple, customizable build process. If you can make any website work, you should be able to make this one work.

My only restrictions on this mission are:

I won’t use a service. It’s great that folks are creating services around IPFS to make it easy. But I’m evaluating IPFS, not a service, and by relying on something prebuilt, I’d likely insulate myself from knowing about the kinks. I also don’t want to host on, say, DigitalOcean or a cloud server at first. Decentralization means it needs to work in my house.

Updates need to be reasonable. This is a blog. I write new posts, and link those posts from Twitter. IPFS aims for immutability of content, but this blog isn’t immutable, so I need a mechanism for updating a pointer to the latest version.

It needs to be addressable. IPFS uses lots of hashes. You’ll see a lot of references to things like Qmc5cLBCY6fsWkxqpMS1RTovhERoXZSSDrjkjogrVfrDfJ. That’s perfectly great for architecture, but that’s not human-friendly. There should be a way for humans to find things.
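To make the last two restrictions concrete, here is a small sketch of content addressing. It uses plain SHA-256 (via sha256sum, assumed to be installed) as a stand-in: real IPFS hashes are computed over a different structure and encoded differently, and content_id is a hypothetical helper, not an IPFS command.

```shell
# content_id prints a SHA-256 digest of stdin.
# Assumption: this is only a stand-in for an IPFS hash, not the real algorithm.
content_id() {
  sha256sum | cut -d ' ' -f 1
}

v1=$(printf 'Hello, world\n' | content_id)    # first version of a post
v2=$(printf 'Hello, world!\n' | content_id)   # the same post after a one-character edit
same=$(printf 'Hello, world\n' | content_id)  # the first version again

# Identical content always yields the same address; any edit yields a new one.
echo "v1:   $v1"
echo "v2:   $v2"
echo "same: $same"
```

Every edit to the blog produces a new address, which is why a mutable pointer and a human-friendly name both have to be layered on top.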

The process

What follows are my notes from trying to build a system, step by step. I aim for little goals in increments.

Step 1: Install IPFS locally

In 2017, to get from ipfs.io to the ‘Download’ button, it took 4 clicks through redundant pages. This number has been reduced to 3. But the inconsistencies remain: macOS is referred to as darwin in one place, OS X in another, Mac OS X in another. And there are some new bugs: the IPFS desktop app is included in the installation instructions, but the link to it is broken.

The naming inconsistency is going to be a theme. One of the biggest problems with the Protocol Labs ecosystem is its carelessness with language. Jargon and acronyms are reproducing without need or limit, and producing a world of words that mean too much and too little at the same time.

Step 2: Browse your website locally

Okay, so I’ve got IPFS running. Let’s check out the simplest case: serving my website locally and browsing it locally. In my case, using Jekyll to build, this looks like:

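# Start the IPFS daemon in the background
ipfs daemon &
# Build the site into _site
bundle exec jekyll build
cd _site
# Recursively add the current directory to IPFS
ipfs add -r .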

This prints many lines of output, ending with

added QmesFFG92JmyZsPfSrAtbEMSWELsfkPfADhhcg8FjxKB3z _site/resources
added QmTrLEV3NQBhgYM2ZiW52eoSdvVqwU36p3qsQvH4oTfYkM _site/simple-statistics
added QmbkWSVC1VqbswL6NCN4VSZZRoKPtvcBEfkfFbeWVnddGn _site/swift
added QmNb7T9sa3zgvpoRs1k92MQ1pNZxjk3FxaHDc6U9tGAPfj _site/talks
added QmP662VnQbyxK6tbnqhueTzJnPgkFBx7G6hwwTwAtAYyz4 _site/tmp
added QmR2eSRGdDhnnUHWYFvwwbyMDkAJ2gPVMJQZh4TQwKKJW4 _site/topics
added QmVwS9a5LYs3rU3nCyMkmbdHj2ZguE5PpbQy14gjeSytQt _site

Rather terse, right? No ‘success’ message to tell me whether it’s done or the process just died. No indication of which one’s the root entry, or how I might be able to get to it. Just a lot of hashes.

Now, this is a CLI, but it’s also the recommended route for getting a website set up. Some creature comforts might be worthwhile here.
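As a sketch of the kind of comfort I mean, a few lines of shell can pick the root entry out of that output. It assumes the format shown above, one "added HASH PATH" line per entry with the top-level directory last. root_hash is a hypothetical helper, not part of the ipfs CLI.

```shell
# root_hash reads `ipfs add -r` output on stdin and prints the hash from the
# last "added" line, which in the output above is the top-level directory.
root_hash() {
  awk '$1 == "added" { hash = $2 } END { print hash }'
}

# Example using two lines copied from the output above.
sample='added QmR2eSRGdDhnnUHWYFvwwbyMDkAJ2gPVMJQZh4TQwKKJW4 _site/topics
added QmVwS9a5LYs3rU3nCyMkmbdHj2ZguE5PpbQy14gjeSytQt _site'
root=$(printf '%s\n' "$sample" | root_hash)
echo "Root hash: $root"
```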

You access this website by copying the last hash in that output, installing the IPFS Chrome Extension, and prefixing the hash with http://127.0.0.1:8080/ipfs/. So here we are:

[Image: IPFS]
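The copy-and-prefix step is mechanical enough to script. This is a minimal sketch, assuming the local gateway address quoted above. gateway_url and the GATEWAY override are hypothetical names of mine.

```shell
# gateway_url builds a gateway URL for a root hash and an optional page path.
# GATEWAY is a hypothetical override; the default is the local gateway address.
gateway_url() {
  hash=$1
  path=${2:-}
  # Strip one leading slash from the path so the URL has no double slash.
  printf '%s/ipfs/%s/%s\n' "${GATEWAY:-http://127.0.0.1:8080}" "$hash" "${path#/}"
}

gateway_url QmVwS9a5LYs3rU3nCyMkmbdHj2ZguE5PpbQy14gjeSytQt
gateway_url QmVwS9a5LYs3rU3nCyMkmbdHj2ZguE5PpbQy14gjeSytQt /talks/
```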

Step 3: Make links work again

Ready to publish it to the web? Not so fast. Clicking a link brings us back to my issue in 2017: the way that the IPFS gateway works will break your links.

Essentially, since you’re seeing your website rooted at /ipfs/HASH/, then any link you have that starts with / will go to the wrong place. And you can’t just add that prefix to all your links, because the hash comes from the content, and also because folks might be looking at your website from /ipns/hash or /ipns/domainname.com.

So, links don’t work. I filed an issue detailing the problem, and while I got an encouraging response that there’s a real solution planned, there’s no real solution yet. People use specific plugins just for IPFS, like this one for GatsbyJS, to get it to work.

I ended up writing make-relative, a script that rewrites my built site to use relative links. This is where the story about IPFS being useful here and now for web developers breaks down a little. I’ve done enough HTML-mangling and path-resolution in my decade in industry that writing this script was straightforward. But the knowledge required to do it is not all that common, and I think this is where a majority of web developers would call it quits, because IPFS’s ‘website hosting’ story would look broken.
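To give a sense of what such a rewrite involves, here is a simplified sketch of the idea. It is not the make-relative script itself. It only handles double-quoted href and src attributes, and rel_prefix and relativize_links are hypothetical names.

```shell
# rel_prefix prints the "../" prefix leading from a file back to the site root,
# given the file's path relative to that root.
rel_prefix() {
  # Count the slashes in the path: "posts/2019/index.html" is two levels deep.
  depth=$(( $(printf '%s' "$1" | tr -cd '/' | wc -c) ))
  prefix=""
  i=0
  while [ "$i" -lt "$depth" ]; do
    prefix="../$prefix"
    i=$((i + 1))
  done
  # A file at the root gets "./" so that href="/" still points at the root.
  printf '%s' "${prefix:-./}"
}

# relativize_links rewrites root-absolute links in HTML read from stdin.
# Protocol-relative URLs such as "//cdn.example/x.js" are left untouched.
relativize_links() {
  prefix=$(rel_prefix "$1")
  sed -e "s|href=\"/\([^/]\)|href=\"${prefix}\1|g" \
      -e "s|src=\"/\([^/]\)|src=\"${prefix}\1|g"
}

printf '<a href="/about/">About</a>\n' | relativize_links posts/index.html
```

A real version also has to deal with single-quoted attributes, srcset, CSS url() references, and feeds, which is where most of the HTML-mangling experience comes in.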

And it is pretty much broken. The current way of using IPFS in a browser is a bad hack, just like it was in 2017. Websites will be broken by default by it, and it still has no semblance of the web security model.

Step 4: Give it a URL

So I’ve written a custom script and now have a browsable local website. I’ll want to have a local server that keeps this website online, but first I’ll dive into the URL problem.

Okay, so we rely on the web’s existing, working, good-enough addressing system every day. You’re on this page right now by typing in macwright.org, or clicking a link, and in the address bar is the domain name macwright.org. Domain names are things that you pay for: I pay Gandi about $18 a year for the privilege of that name, and they split the money with Verisign, who runs .com, and ICANN, which runs the bureaucracy around disputes, renewals, and governance. Your browser then uses DNS, a decentralized naming system, to resolve that name to the IP address of the server that has this content.

Decentralized web projects have to decide whether they can replicate this system in a purely decentralized way (with no ICANN, Registrar, or company involved), or to rely on DNS to ‘bootstrap’ their systems.

The IPFS stack chose to implement all of the ways.

There’s DNSLink, a rather practical system that has you add another DNS record to ‘link’ it to your decentralized website.

There’s also IPNS, which is purely decentralized, but at the cost of being very, extremely, notably, slow. It also doesn’t produce pretty names – instead it gives you a long hash, just like the ones I saw with ipfs add -r.

Those are the fundamentals. Let’s add some detail.

So, an IPNS address looks like

/ipns/QmSrPmbaUKA3ZodhzPWZnpFgcPMFWF4QsxXbkWfEptTBJd

The documentation advises us that DNSLink is faster than IPNS and yields prettier names. So here’s a DNSLink address:

/ipns/macwright.org

Careful readers will notice that this address also begins with /ipns/. But, as far as I can tell, it doesn’t necessarily use IPNS: it could be, and usually is, that macwright.org uses DNSLink to point at an IPFS address. Why would DNSLink use /ipns/ as a prefix if it’s often used instead of IPNS? I’m as confused by this as you are.
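One way to read the shared prefix, offered as my own working assumption rather than documented behavior, is that the name after /ipns/ decides which system gets used. The sketch below treats a name containing a dot as a domain (DNSLink) and anything else as a key hash (IPNS proper). classify_ipns_path is a hypothetical helper.

```shell
# classify_ipns_path guesses how the name in an /ipns/ path would be resolved.
# Assumption for this sketch: a dot in the name means a DNS domain (DNSLink);
# a name without one is treated as a key hash (IPNS proper).
classify_ipns_path() {
  case $1 in
    /ipns/*)
      name=${1#/ipns/}   # drop the /ipns/ prefix
      name=${name%%/*}   # drop any trailing page path
      ;;
    *)
      echo "not-ipns"
      return 1
      ;;
  esac
  case $name in
    *.*) echo "dnslink" ;;
    *)   echo "ipns-key" ;;
  esac
}

classify_ipns_path /ipns/QmSrPmbaUKA3ZodhzPWZnpFgcPMFWF4QsxXbkWfEptTBJd
classify_ipns_path /ipns/macwright.org/about/
```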

So, anyway, I set up DNSLink within IPNS, thinking that this must be what most folks do. Surely they don’t update their DNS records every time they update their website.

That was an incorrect assumption. IPFS-based websites do update their DNS records every time that they update their website, so that they can avoid using IPNS, because IPNS is just too slow.

This was a tough discovery, because it works against everything I know about DNS – a system that isn’t particularly designed to be fast or scriptable.

But, I had to forge on, so I wrote gandi-ipfs, a tool that would let me update my DNS records to use DNSLink. Reportedly other folks use CloudFlare and have their own scripts. I gleaned this information from a call with the very friendly and helpful Pinata team.
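For reference, the record such a tool has to keep current is small. As I understand the DNSLink convention, it is a TXT record on the _dnslink subdomain whose value is dnslink= followed by the content path. The sketch below only formats that record in zone-file style. It is not how gandi-ipfs works internally, and dnslink_record and the 300-second TTL are hypothetical choices.

```shell
# dnslink_record prints a zone-file style TXT record for a domain and root hash.
# The optional third argument is the TTL; 300 is a hypothetical default, kept
# low because the record changes on every publish.
dnslink_record() {
  domain=$1
  hash=$2
  ttl=${3:-300}
  printf '_dnslink.%s. %s IN TXT "dnslink=/ipfs/%s"\n' "$domain" "$ttl" "$hash"
}

dnslink_record macwright.org QmVwS9a5LYs3rU3nCyMkmbdHj2ZguE5PpbQy14gjeSytQt
```

Once a record like this is live, something like dig +short TXT _dnslink.macwright.org should show whether the update has propagated.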

Step 5: Host it

To mark our point in the journey, here’s what my script is looking like now.

ipfs daemon &
echo "Jekyll building…"
bundle exec jekyll build
cd _site
make-relative https://macwright.org/
# -q prints only hashes; the last line is the root directory's hash
hash=$(ipfs add -r -q . | tail -n 1)
echo "Root hash: ${hash}"
gandi-ipfs "$hash"

So I’m using my custom script make-relative to make all the links on the site relative, and the script gandi-ipfs to update DNS records.

But the server that I start with ipfs daemon stops every time I close my laptop’s lid. So the next step is setting up a server.

I decided to start with a Raspberry Pi kit, the Pi-hole in particular, which is intended to be set up as an ad blocker, but happens to contain all the parts I need for a simple server. I know full well that Raspberry Pis are small, low-power, low-memory computers and I might need something bigger for long-term performance, like an Intel NUC. But it should do the trick for testing.

Setting up the Pi was pretty smooth, and it’s magic to just log into a tiny computer. It’s an ARM chip and runs a certain flavor of Linux.

Installing IPFS on the Raspberry Pi was also smooth: just download the ARM distribution of go-ipfs and you’re set.

Unfortunately, once I started getting this set up as a ‘pinning device’, the fun stopped.

I tried running ipfs pin add with the hash generated earlier from ipfs add -r, but it just ‘hung’, outputting nothing at all. After a while I realized that, as ipfs pin add had just demonstrated, IPFS doesn’t communicate very well when it’s having a problem. So I figured out how to turn logging information all the way up, and then… I was never able to get past a ‘got error on dial’ failure, despite trying all potential configurations of the IPFS daemon, enabling logging, upgrading to the newest version, and so on. There are about 63 similar issues in the tracker, 21 of which are marked as bugs.
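A silent hang is at least easier to spot with a deadline wrapped around it. This sketch would not have fixed the dial error. It assumes GNU timeout is available, which exits with status 124 when the deadline passes. pin_with_deadline and the PIN_CMD override are hypothetical names; PIN_CMD exists so the wrapper can be tried without an IPFS node.

```shell
# pin_with_deadline runs a pin command and reports explicitly if it hangs.
# Usage: pin_with_deadline HASH [SECONDS]
pin_with_deadline() {
  hash=$1
  seconds=${2:-120}
  status=0
  # PIN_CMD is split into words on purpose, so it can hold a command plus flags.
  timeout "$seconds" ${PIN_CMD:-ipfs pin add --progress} "$hash" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "pin of $hash produced no result within ${seconds}s" >&2
  fi
  return "$status"
}

# Dry run with a stand-in command instead of a real pin.
PIN_CMD="echo would pin"
pin_with_deadline QmVwS9a5LYs3rU3nCyMkmbdHj2ZguE5PpbQy14gjeSytQt 5
```

On the Pi itself, the call would just be pin_with_deadline followed by the root hash, with PIN_CMD left unset.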

Epilogue

That’s as far as I got.

Like last time, I might have missed some critical step or made the process harder on myself. For ✨journalistic integrity✨, this post hasn’t been edited or reviewed by anyone. Hence the typos.

I tried to make the process documentation straightforward and fact-based, but couldn’t help but add some foreshadowing about the issues. And there really are issues.

Here are the problems

An overextended, under-documented, and unfinished constellation of projects. No single part of the ecosystem is truly finished and well-documented, and new parts are spawned every day. Projects that are almost universally avoided – like IPNS – are still included in the main documentation and recommended as if they’re usable.

The same goes for language: IPFS has a sprawling set of jargon that’s inconsistently used. Is it merkledag or merkle-dag? There are enough terms that there’s a glossary, but the glossary itself refers to both merkle-dag and merkledag. Is a node a peer or a piece of content? A low point was discovering two repos whose readmes referred to each other as the same thing.

And then there are the three big issues: usability, reliability, and performance. Do CLI commands have explanatory output? Are error messages informative? Can I count on tools working as advertised? Will the decentralized web have Bittorrent-like scalability, or will it be more like Bitcoin?

Here are the recommendations

First, make recommendations. I went down the IPNS rabbit hole because the documentation sent me, and only by a chance encounter did I learn that it’s almost universally avoided. There’s no shame in saying that a project’s not ready. Recommending unusable projects burns goodwill. Recommend paths that work.

Second, fix your words. Words are work, and that work is not happening. Finish a glossary, standardize usage and meaning, and cull unnecessary jargon. Treat new bits of jargon like technical debt, because that’s what they are.

Third, set realistic goals and make realistic statements. IPFS.io still has a web-centric message and promises that it’s useful here and now. It promises ‘fast performance’, and support for ‘huge datasets’. These are goals, not realities. An effort to put 300TB of data on IPFS was met with mixed results and notes that adding and retrieving data was extremely slow.

Fourth, set a goal. This is a slightly different point than the last one. A core question is: is IPFS trying to be an internet? The website would say yes, as would some of the documentation. But the 2019 goals punt the ‘decentralized web’ to 2020+, instead focusing on NPM on IPFS. Which then leads us to entropic, the most promising distributed package manager, which has a discussion about using IPFS that immediately brings up its performance problems.

Maybe I’m being too tough on IPFS. But this isn’t 2014. IPFS isn’t a new project, and it isn’t resource-limited. Protocol Labs has raised over 300 million dollars, and has been around for 5 years. That’s a lot of money to pay a lot of smart people.

So a few scenarios are possible. Maybe most IPFS users are using it for file storage and as an API backend, kind of like textile. I’m the odd one out expecting it to be useful for websites. Which would explain the haphazardness of DNSLink and IPNS, but not the performance issues. Or maybe I’m misjudging the arc of history: Protocol Labs is a 20 year project, not a 10 year one. But really I suspect that some of the hype exists because folks are talking about IPFS but they don’t rely on it. People excited about the potential of FileCoin and otherwise hyped on crypto technology want to imagine uses and combinations of technology without being tethered by the reality of what doesn’t work.

I hope that Protocol Labs sets a goal and achieves it. The IPFS future is exciting. But we aren’t there yet, and I’m not sure we will be.

Issues & PRs: docs/161, docs/162, multicodec/132, go-ipfs/6354, go-ipfs/6357, docs/173, forum thread