Using the Wayback Machine to Archive (and Backup) WordPress

Sometimes, WordPress backups fail and restoring your many blog posts and pages can become a brutal challenge. Fortunately, archiving them using the Wayback (time) Machine can make them easily recoverable.

Sure, you can’t actually go back in time, but using the Wayback Machine isn’t far off. It archives public web documents to preserve human culture for future generations.

In this post, I’ll go into detail about the Wayback Machine, what it is, how you can use it to automatically or manually archive your blog posts and pages, and also how you can retrieve archived content. I’ll also show you a few plugins you can use for easy archiving.

What Is the Wayback Machine?

The Wayback Machine is a three-dimensional index that archives publicly accessible web pages by crawling them, similar to search engines. It is run by the Internet Archive, a non-profit that began archiving the web in 1996 and opened the Wayback Machine to the public in 2001.

Its name is a reference to the popular cartoon The Rocky and Bullwinkle Show. In the show, Mister Peabody’s fictional time machine, the WABAC, was pronounced “way back”, and that’s where the index got its name.

Archiving your blog posts and pages using the Wayback Machine can be useful if your site breaks and your backups fail. While you can’t archive all dynamic content, the text of your posts and pages is saved, which means you can copy and paste it into a new post.

You can recover the posts and content you’re missing while also contributing to a non-profit project. By archiving your site, you’re preserving information and artifacts from the cultures and heritages of humanity for future generations and civilizations.

Future human beings can look at everything the Wayback Machine archived and gain a digital history and a reference for learning from us, much as archaeologists uncover ancient artifacts from our past that we use for the betterment of our future.

In fact, here’s a really cool post of ours that was almost entirely written on captures from the Wayback Machine.

When and What Is Archived?

The Wayback Machine only crawls public web pages and can’t access content that’s password-protected or on a secure, private server. It also doesn’t crawl sites that discourage search engines from crawling them.

Popular sites that get a lot of traffic are automatically crawled, but you can manually archive pages in a few seconds.

The only prerequisite is that you need to make sure your WordPress website is set up to let crawlers go through your pages and posts. To ensure your site can be archived:

  1. From the WordPress admin dashboard, click on Settings > Reading.
  2. Under Search Engine Visibility, make sure the box for Discourage search engines from indexing this site is unchecked, then click Save Changes (if you made one).

If you have any plugins installed and activated that have a similar setting, be sure to change it to let crawlers through.
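If you want to double-check the result from outside the dashboard, here is a minimal sketch that scans a page's HTML for a robots meta tag containing `noindex`. It assumes your WordPress version signals the Discourage search engines setting through such a tag (older versions and some plugins may use robots.txt instead), and the names `RobotsMetaParser` and `discourages_crawlers` are made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            # Split "noindex, nofollow" into individual directives.
            parts = attr_map.get("content", "").lower().split(",")
            self.directives.extend(p.strip() for p in parts if p.strip())


def discourages_crawlers(html: str) -> bool:
    """Return True if the page carries a robots meta tag with 'noindex'.

    Assumption: the site signals "do not index" through a meta tag.
    """
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

Paste in the source of your home page (View Source in your browser) and a `True` result suggests something is still telling crawlers to stay away.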

Once that’s done, you’re ready to archive your posts and pages.

Archiving Your Blog Posts and Pages

There are two main ways to archive your site using the Wayback Machine.

The first method is by typing web.archive.org/save/ in front of the URL in your browser’s address bar. You don’t need to omit the http:// or https:// at the beginning of the web address.
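If you have a long list of posts, building those addresses by hand gets tedious. As a small sketch, the prefix trick can be written as a helper; `build_save_url` and the example.com address are placeholders I made up, not part of any official tool.

```python
SAVE_PREFIX = "https://web.archive.org/save/"


def build_save_url(page_url: str) -> str:
    """Put the Wayback Machine save prefix in front of a page URL.

    The scheme (http:// or https://) is kept as-is.
    """
    page_url = page_url.strip()
    if not page_url:
        raise ValueError("page_url must not be empty")
    return SAVE_PREFIX + page_url


# Hypothetical post URL, for illustration only.
print(build_save_url("https://example.com/my-post/"))
```

Opening the printed address in your browser should start a capture, just as if you had typed the prefix yourself.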

You can also go to the Wayback Machine Web Archive page and enter the URL of the page or post you want to archive in the field under Save Page Now. Then, click the Save Page button.

You can archive your post by visiting the Wayback Machine and entering a URL.

In either case, the process takes a few seconds but can take a bit longer depending on the size of the page. Once the archiving has been completed, you should see a direct URL you can copy and save to directly access the archived post or page later.

Once your page has been archived using the URL prefix, you can get a direct link to the archive.
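Direct links to archived pages generally look like `web.archive.org/web/<14-digit timestamp>/<original URL>`. Assuming that pattern, here is a rough sketch that pulls the capture time and the original address back out of a link you saved. The function name `parse_snapshot_url` and the sample link are illustrative only.

```python
import re
from datetime import datetime

# Assumed pattern: /web/<YYYYMMDDhhmmss>[optional modifier]/<original URL>
SNAPSHOT_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})[a-z_]*/(.+)$")


def parse_snapshot_url(snapshot_url: str):
    """Return (capture time, original URL) for an archived-page link."""
    match = SNAPSHOT_RE.match(snapshot_url.strip())
    if match is None:
        raise ValueError("not a recognizable archived-page link")
    captured_at = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return captured_at, match.group(2)
```

Keeping the capture time next to each saved link makes it easier to tell which copy of a post is the most recent one.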

Accessing Your Archived Content

Once you have archived your posts and pages, you can access them by visiting the Wayback Machine. Keep in mind that it can take several days for a page to be fully archived, so you may not be able to access the content right away, but it should be there later on.
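Because captures may not show up immediately, it can help to check programmatically whether one exists yet. The sketch below uses the Internet Archive's availability endpoint at `archive.org/wayback/available`; the response shape it reads (`archived_snapshots` then `closest`) is my understanding of that endpoint, so treat it as an assumption and adjust if the JSON you get back differs. The helper names are my own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url: str) -> str:
    """Build the lookup address for one page."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url})


def closest_snapshot(payload: dict):
    """Return the archived URL from a decoded response, or None.

    Assumed shape: {"archived_snapshots": {"closest": {"available": ..., "url": ...}}}
    """
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url: str, timeout: float = 10.0):
    """Ask the endpoint over the network; returns None if nothing is archived."""
    with urlopen(availability_query(page_url), timeout=timeout) as response:
        return closest_snapshot(json.load(response))
```

A `None` result right after saving a page doesn't necessarily mean the capture failed; try again later.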

You can search for archived pages and posts by clicking on the web icon. Then, enter a URL into the field that dynamically appears toward the top of the page and press Enter on your keyboard.

You can search for your previously archived posts and pages.

If you don’t remember the exact URL of the post or page you’re trying to recover, you can enter only your main web address or the link to your blog. The Wayback Machine should pull up all the results related to the address you entered, including URL strings.
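If you would rather get that list as data than click through the site, the Wayback Machine also exposes a CDX search endpoint. Here is a hedged sketch that builds a prefix query for everything under an address and turns the JSON rows into dictionaries. I'm assuming the `url`, `matchType`, `output` and `fl` parameters and a JSON reply whose first row is the header; `cdx_query` and `parse_cdx_rows` are names invented for this example.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query(site: str) -> str:
    """Build a query for every capture whose URL starts with `site`."""
    params = {
        "url": site,
        "matchType": "prefix",       # match everything under this address
        "output": "json",            # rows come back as a JSON array
        "fl": "timestamp,original",  # only the fields used below
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_rows(rows):
    """Convert [[header...], [values...], ...] into a list of dicts."""
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]
```

Fetch the address from `cdx_query` in your browser or a script, decode the JSON, and pass it to `parse_cdx_rows` to get one entry per capture.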

The search results return a calendar with colored circles highlighting the days when content was archived. You can hover over one of these circles to view a list of the captures made on that day.

Hover over one of the days on the calendar to view archived pages.

You can click on one of the hyperlinked times that are listed to view the archived page.
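The calendar is essentially a list of capture times grouped by day. If you have a handful of capture timestamps in the 14-digit `YYYYMMDDhhmmss` form used in archived-page links, a small sketch like this reproduces that grouping; `group_by_day` is a name I made up.

```python
from collections import defaultdict


def group_by_day(timestamps):
    """Group 14-digit capture timestamps into {"YYYY-MM-DD": ["hh:mm:ss", ...]}."""
    days = defaultdict(list)
    for ts in sorted(timestamps):
        day = f"{ts[0:4]}-{ts[4:6]}-{ts[6:8]}"
        time = f"{ts[8:10]}:{ts[10:12]}:{ts[12:14]}"
        days[day].append(time)
    return dict(days)
```

Each key corresponds to one circled day on the calendar, and each value to the list of times you would see when hovering over it.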

From there, you can copy and paste the text into your post or page editor and save a new copy of your content to recover your site.
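For more than a few posts, copying by hand gets slow. As a rough sketch, you could save the archived page's HTML and strip it down to plain text with Python's built-in parser. This is a generic text extractor, not something the Wayback Machine provides, and it will also pick up navigation and sidebar text, so expect to tidy the result. `TextExtractor` and `extract_text` are illustrative names.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping script and style blocks."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Return the text chunks of an HTML document, one per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Formatting such as bold text and links is lost, so you will still need to reapply it in the WordPress editor.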

Voilà! Your site is fully recovered.

WordPress Archiving Plugins

If you would like different ways to archive your posts and pages, check out these plugins. Not all of them archive to the Wayback Machine, but they offer other capabilities that complement archiving.

Simple Yearly Archive

Simple Yearly Archive is a rather neat and simple WordPress plugin that allows you to display your archives in a year-based list.

It works mostly like the usual WP archive, but displays all published posts separated by their year of publication. In addition, you can also restrict the output to certain categories, and much more.

Smart Archive Page Remove

The Smart Archive Page Remove plugin allows you to remove Archive Pages automatically generated by WordPress.

WordPress automatically generates author-based, category-based, tag-based and date-based (daily, monthly and yearly) archives for your posts. Even if you do not want to use these pages (for example, you don’t want a daily archive because you don’t post several times a day), they exist and can be accessed through their automatically generated URLs.

This plugin adds an ‘Archive Pages’ item to the ‘Settings’ section of your WordPress admin. There, you can select which archive pages you want to remove, and you can restore them at any time.

Broken Link Checker

The Broken Link Checker plugin doesn’t archive anything, but it can help you figure out what pages or posts are missing on your site since it searches for broken links.

Once you know what’s been lost, you can search for it in the Wayback Machine. Then, you can copy and paste your text content into a new page or post and replace the old links with the new ones.
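To speed up that search, you can turn a list of broken links into Wayback Machine lookup links. The sketch below assumes the `web.archive.org/web/*/<URL>` address form, which opens the calendar of captures for a page; `lookup_links` and the sample addresses are placeholders.

```python
CALENDAR_PREFIX = "https://web.archive.org/web/*/"


def lookup_links(broken_urls):
    """Return (broken URL, calendar lookup link) pairs, skipping blanks and repeats."""
    seen = set()
    links = []
    for url in broken_urls:
        url = url.strip()
        if not url or url in seen:
            continue
        seen.add(url)
        links.append((url, CALENDAR_PREFIX + url))
    return links


# Hypothetical broken links exported from a link-checking report.
for broken, lookup in lookup_links(["https://example.com/lost-post/"]):
    print(broken, "->", lookup)
```

Work through the printed links one by one and copy the content from the most recent capture of each page.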

Wrapping Up

It’s not a real time machine, but if you’re having trouble restoring your posts or page content after your site breaks, searching the Wayback Machine for your previously archived content can help you get it back. Archiving your site can act as a backup to your backups in case disaster strikes and your site can’t be fully restored.

Obviously, archiving your site with the Wayback Machine is not a solid solution for backing up your websites. If you’re looking for a more reliable solution, check out our managed backups and all-new storage plans that make backing up your sites not only a no-brainer, but simple and affordable.

Editor’s Note: This post has been updated for accuracy and relevancy.
[Originally Published: February 2017 / Revised: March 2022]

Do you use the Wayback Machine? Have you ever had issues restoring your site? Feel free to share your experience in the comments below.
Jenni McKinnon
A copywriter, copy editor, web developer and course instructor, Jenni has spent over 15 years developing websites and almost as long for WordPress. A self-described WordPress nerd, she enjoys watching The Simpsons and names her test sites after references from the show.
Janette Burhans
Janette is a wily wordsmith and impassioned illustrator who does her best work surrounded by pink, sparkly things. Aside from creating articles and art, she treasures time with family (human & furry), reading or watching stories, and long walks down the makeup aisle. She is happy to share her thoughts and feelings, but draws the line at ice cream.