I’ve recently discovered that it’s not possible to do a server-side redirect for anchor links on web pages! The reason is that the anchor part of the URL (the “#” and the anchor name after it, also known as the fragment) is never sent to the server. Not ever!
This is what happened to me recently: I had a long document, presented as one long web page. I decided to split it into two pages. A number of the headings have IDs that act as anchors, and we know that external documents link to those anchors. I could fix any incoming links in our own documents, but not in the unknown number of documents out there on the world wide web.
This is a prime case for a redirect, or so you’d think.
Let’s say I have a page called “Ice Floes”, and I have the following HTML in my document:
<h2 id="penguin_joke">Penguins are cool</h2>
<h3 id="introduction">Introduction</h3>
<p>Did you hear the one about the penguins on an ice floe?</p>
<h3 id="story">The story</h3>
<p>There are two penguins on an ice floe,
drifting north into warmer waters.
These penguins are very fond of each other,
but they don't speak English very well.
Suddenly, with a terrific crack,
the ice floe splits in half, right between the penguins.
As they begin drifting apart, one penguin
sadly waves a flipper and calls out,
It’s quite feasible that there’d be incoming links to this document, like this:
Read all about the
<a href="http://example.com/icefloes#penguin_joke">cool penguins</a>.
That’s called an “anchor link” because it points to a specific anchor in the page. In this case, the anchor is a heading ID.
Now let’s say I want to move the penguins to a page all of their own, called “Penguins”. I’d like to redirect the relevant links to the new page. So, dear server, please redirect all links of this form:
http://example.com/icefloes#penguin_joke
To the matching heading on the new “Penguins” page. Or even redirecting to the top of the new page would do:
http://example.com/penguins
So, what’s the problem?
The server can only redirect the link at page level. It cannot redirect incoming anchor links, because it never sees the anchor part of the URL.
For a link like this:
http://example.com/icefloes#penguin_joke
The browser only sends the server this:
http://example.com/icefloes
That’s right! The browser removes the anchor, stores it, and then puts it back when it needs it. Huh, who’d a thunk it.
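If you'd like to see the split for yourself, here is a minimal Python sketch using the standard library's urllib.parse module. The helper name request_target is made up for illustration, and it only approximates what a browser puts on the HTTP request line (the path plus any query string). The point is simply that the anchor is not part of it.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Approximate what a client sends on the request line: path plus query.

    Hypothetical helper for illustration only. The fragment (the part
    after "#") is deliberately left out, because clients never send it.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


link = "http://example.com/icefloes#penguin_joke"

print(urlsplit(link).fragment)  # penguin_joke (stays in the browser)
print(request_target(link))     # /icefloes (all the server gets to see)
```

As far as the server can tell, a request for the anchor link and a request for the plain page look identical, so there is nothing for a redirect rule to match on.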
Is there a workaround?
In my case, I’m lucky because the original page will still be there. So I’ve left a heading in the page, with a textual explanation and a link for people to click. Manual redirection.
I’ve also added a bunch of dummy, empty <div> sections with IDs, to cater for all the subheadings that used to exist within the section. This will bring all relevant incoming links to the same part of the original page, and people can click through to the new page from there. Ugly, but at least the readers will find their way to the right place.
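If there are a lot of old subheadings, typing the placeholders by hand gets tedious. Here is a small Python sketch that generates them from a list of the old IDs. The function name placeholder_divs is hypothetical, and the IDs are the subheading IDs from the "Ice Floes" example.

```python
from html import escape


def placeholder_divs(old_ids):
    """Build one empty <div> per old heading ID (hypothetical helper).

    Each ID is escaped so that it is safe inside the attribute value.
    """
    return "\n".join(
        f'<div id="{escape(old_id, quote=True)}"></div>' for old_id in old_ids
    )


# Subheading IDs that used to live inside the penguin section.
print(placeholder_divs(["introduction", "story"]))
# <div id="introduction"></div>
# <div id="story"></div>
```

The output can be pasted into the original page next to the remaining heading, so that every old anchor still lands somewhere sensible.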
This is what the updated, minimised section would look like, on the original “Ice Floes” page:
<p>Read all about penguins in the
<a href="http://example.com/penguins">dedicated penguins document</a>.</p>
If I wanted to, I could also add explicit references to each section of the new page, but in my case that was too much text.
I hope this post is useful to someone who may run into the same problem as I did. If you have any more tips to share, I’d love to hear them.