Using the Wayback Machine to Archive (and Back Up) WordPress
Sometimes, WordPress backups fail and restoring your many blog posts and pages can become a brutal challenge. Fortunately, archiving them using the Wayback (time) Machine can make them easily recoverable.
Sure, you can’t actually go back in time, but using the Wayback Machine isn’t far off. It archives public web documents to preserve human culture for future generations.
In this post, I’ll go into detail about the Wayback Machine, what it is, how you can use it to automatically or manually archive your blog posts and pages, and also how you can retrieve archived content. I’ll also show you a few plugins you can use for easy archiving.
Continue reading, or jump ahead using these links:
- What is the Wayback Machine?
- When and What Is Archived?
- Archiving Your Blog Posts and Pages
- Accessing Your Archived Content
- WordPress Archiving Plugins
What is the Wayback Machine?
The Wayback Machine is a three-dimensional index that archives publicly accessible web pages by crawling them, similar to search engines. It was created in 1996 as a non-profit project by The Internet Archive.
Its name is actually a reference to the popular cartoon, Rocky and Bullwinkle. In the show, Mr. Peabody’s fictional time machine, the WABAC, was pronounced “way back”, and that’s where the index got its name.
Archiving your blog posts and pages using the Wayback Machine can be useful if your site breaks and your backups fail. While you can’t archive all dynamic content, the text of your posts and pages is saved, which means you can copy and paste it into a new post.
You can recover the posts and content you’re missing while also contributing to a non-profit project. By archiving your site, you’re preserving information and artifacts from the cultures and heritages of humanity for future generations and civilizations.
Future human beings can look at everything the Wayback Machine has archived and gain access to a digital history they can learn from, much like archaeologists uncover ancient artifacts from our past that we use for the betterment of our future.
In fact, here’s a really cool post of ours that was almost entirely written on captures from the Wayback Machine.
When and What Is Archived?
The Wayback Machine only crawls public web pages and can’t access content that’s password-protected or on a secure, private server. It also doesn’t crawl sites that discourage search engines from crawling them.
Popular sites that get a lot of traffic are automatically crawled, but you can manually archive pages in a few seconds.
The only prerequisite is that you need to make sure your WordPress website is set up to let crawlers go through your pages and posts. To ensure your site can be archived:
- From the WordPress admin dashboard, click on Settings > Reading.
- Under Search Engine Visibility, make sure the box for Discourage search engines from indexing this site is unchecked, then click Save Changes if you changed the setting.
If you have any plugins installed and activated that have a similar setting, be sure to change it to let crawlers through.
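If you want to double-check the setting from outside the dashboard, you can look at the HTML your site serves. The sketch below is a minimal, offline illustration in Python: it scans a page's markup for a robots meta tag containing noindex, which is how recent WordPress versions typically signal the Discourage search engines setting. Older versions and some plugins may use robots.txt or HTTP headers instead, which this sketch does not cover. The names RobotsMetaParser and has_noindex and the sample markup are hypothetical, not part of WordPress or the Wayback Machine.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            content = attr_map.get("content", "")
            # Split "noindex, nofollow" into separate lowercase directives.
            self.directives.extend(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def has_noindex(html: str) -> bool:
    """Return True if the markup asks crawlers not to index the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical sample markup; in practice you would pass in your page source.
blocked = "<head><meta name='robots' content='noindex, nofollow' /></head>"
allowed = "<head><meta name='robots' content='max-image-preview:large' /></head>"
print(has_noindex(blocked))  # True
print(has_noindex(allowed))  # False
```

If the function returns True for your own page source, a setting somewhere (WordPress itself or a plugin) is probably still discouraging crawlers.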
Once that’s done, you’re ready to archive your posts and pages.
Archiving Your Blog Posts and Pages
There are two main ways to archive your site using the Wayback Machine.
The first method is to type web.archive.org/save/ in front of the URL in your browser’s address bar. You don’t need to omit the http:// or https:// at the beginning of the web address.
You can also go to the Wayback Machine Web Archive page and enter the URL of the page or post you want to archive in the field under Save Page Now. Then, click the Save Page button.
In either case, the process takes a few seconds but can take a bit longer depending on the size of the page. Once the archiving has been completed, you should see a direct URL you can copy and save to directly access the archived post or page later.
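If you have many posts to archive, the first method is easy to script, since it only amounts to prefixing each address. Here is a minimal sketch in Python that builds the save links without contacting the Wayback Machine. The function name build_save_url and the example.com addresses are placeholders for illustration; opening each resulting link in a browser is what would trigger the capture.

```python
SAVE_PREFIX = "https://web.archive.org/save/"


def build_save_url(page_url: str) -> str:
    """Prefix a page address with the Wayback Machine save path."""
    page_url = page_url.strip()
    if not page_url:
        raise ValueError("page_url must not be empty")
    # The scheme (http:// or https://) can stay in place, as noted above.
    return SAVE_PREFIX + page_url


# Placeholder addresses; replace them with your own posts and pages.
pages = [
    "https://example.com/blog/hello-world/",
    "https://example.com/about/",
]
for page in pages:
    print(build_save_url(page))
```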
Accessing Your Archived Content
Once you have archived your posts and pages, you can access them by visiting the Wayback Machine. Keep in mind that it can take several days for a page to get fully archived so you may not be able to access the content you archived right away, but it should be there later on.
You can search for archived pages and posts by clicking on the web icon. Then, enter a URL into the field that dynamically appears toward the top of the page and press Enter on your keyboard.
If you don’t remember the exact URL of the post or page you’re trying to recover, you can enter only your main web address or the link to your blog. The Wayback Machine should pull up all the results related to the address you entered, including URL strings.
The search results show a calendar with colored circles that highlight the days when content was archived. You can hover over one of these circles to view a list of the captures made on that day.
You can click on one of the hyperlinked times that are listed to view the archived page.
From there, you can copy and paste the text into your post or page editor and save a new copy of your content to recover your site.
Voilà! Your site is fully recovered.
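The lookup in this section can also be scripted. The sketch below, in Python, builds the calendar link for an address and reads the closest capture out of a response from the Internet Archive's JSON availability endpoint. That endpoint is not part of the steps above, and the sample response is an illustrative assumption about its shape rather than real data, so treat this as a starting point and check the Internet Archive's own documentation before relying on it. The function names are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

CALENDAR_PREFIX = "https://web.archive.org/web/*/"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/availability"


def build_calendar_url(page_url: str) -> str:
    """Link to the calendar view that lists captures of an address."""
    return CALENDAR_PREFIX + page_url.strip()


def build_availability_url(page_url: str, timestamp: Optional[str] = None) -> str:
    """Build a query for the closest capture (timestamp is YYYYMMDD, optional)."""
    params = {"url": page_url.strip()}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the archived URL from a decoded JSON response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Illustrative response only (assumed shape, not real data).
sample_response = {
    "url": "example.com/blog/hello-world/",
    "archived_snapshots": {
        "closest": {
            "available": True,
            "status": "200",
            "timestamp": "20220301000000",
            "url": "http://web.archive.org/web/20220301000000/https://example.com/blog/hello-world/",
        }
    },
}

print(build_calendar_url("example.com/blog/"))
print(build_availability_url("example.com/blog/hello-world/", "20220301"))
print(closest_snapshot(sample_response))
```

An empty archived_snapshots object would mean no capture was found, in which case closest_snapshot returns None.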
WordPress Archiving Plugins
If you would like different ways to archive your posts and pages, check out these plugins. Not all of them archive to the Wayback Machine, but they offer other capabilities that complement archiving.
Simple Yearly Archive
Simple Yearly Archive is a rather neat and simple WordPress plugin that allows you to display your archives in a year-based list.
It works mostly like the usual WP archive, but displays all published posts separated by their year of publication. In addition, you can also restrict the output to certain categories, and much more.
Smart Archive Page Remove
The Smart Archive Page Remove plugin allows you to remove Archive Pages automatically generated by WordPress.
WordPress automatically generates author-based, category-based, tag-based and date-based (daily, monthly and yearly) archives for your posts. Even if you do not want to use these pages (for example, you don’t want a daily archive because you don’t post several times a day), they exist and can be accessed at their automatically generated URLs.
This plugin adds an item ‘Archive Pages’ in the ‘Settings’ section of your WordPress Admin. Here you can select which Archive Pages you want to remove, and you can restore them at any time.
Broken Link Checker
The Broken Link Checker plugin doesn’t archive anything, but it can help you figure out what pages or posts are missing on your site since it searches for broken links.
Once you know what’s been lost, you can search for it in the Wayback Machine. Then, you can copy and paste your text content into a new page or post and replace the old links with the new ones.
It’s not a real time machine, but if you’re having trouble restoring your posts or page content after your site breaks, searching the Wayback Machine for your previously archived content can help you get it back. Archiving your site can act as a backup to your backups in case disaster strikes and your site can’t be fully restored.
Obviously, archiving your site with the Wayback Machine is not a solid solution for backing up your websites. If you’re looking for a more reliable solution, check out our managed backups and all-new storage plans that make backing up your sites not only a no-brainer, but simple and affordable.
Editor’s Note: This post has been updated for accuracy and relevancy.
[Originally Published: February 2017 / Revised: March 2022]