How @WPEngine Is Failboating Your SEO and Leaking Your Information

If you’re hosting sites with WP Engine then the following article might ruffle your feathers a bit. From the number of indexed Google results you’ll see in a moment, we estimate that at least a few hundred thousand websites are currently running with potential SEO flaws. Plus, to make matters worse, there are countless development sites fully exposed to the public via a simple Google search.

Aside from the SEO implications, there is a fundamental privacy problem with the way WP Engine’s system is set up. This problem is partially the fault of the developers who use the system, but WP Engine really needs to fix this or educate their customers on how to set up a site properly.

At first we were pretty shocked that something so simple could be overlooked, but at the same time we definitely understand how something like this could happen. Rather than worry too much about the how and why, this post will serve as a guide for WP Engine customers to fix the problem and hopefully encourage WP Engine to change the way their system works. Since our primary focus around here is SEO, let’s jump into the SEO problems first.

Two SEO Problems Lurking in the Darkness

Although it’s one of the best content management systems on the market, WordPress still has quite a few SEO problems out of the box. This post isn’t about those problems. This is about some pretty serious SEO issues that are directly tied to the way WP Engine’s servers are set up. Let’s dive in.

Problem #1 – Improper Subdomain Redirection

Can You Say Duplicate Content?

If you know anything about SEO, then you know duplicate content is something you want to avoid. Now… a duplicate version of your entire site? That makes an SEO cringe. Unfortunately, this is exactly the issue affecting a massive number of WP Engine users. In fact, it’s even present on their own publication, Torque, as well as WooThemes, one of the largest WordPress theme shops on the web, as you can see here:

torque-wpengine-fail

woothemes-wpengine-fail

Note: In these examples there is a canonical tag implemented, but keep in mind this will ONLY be present on sites running an SEO plugin or when the tag has been manually added to the theme. The canonical tag can help with this problem, but really it’s best to avoid it entirely. On another note, the canonical tag on WooThemes is pointing to https which redirects to http on their site… but that’s another post for another day. ;)

We don’t want to single anyone out here, but if their own publication and one of the most well-known theme shops are suffering from this issue, it’s clearly widespread on the WP Engine network. Surprisingly, we discovered this with a simple, yet revealing, Google query – site:.wpengine.com

wpengine subdomain google results
Yup, you read that right, 1,650,000 results. By simply scanning through the results, you’ll see the insane number of duplicate sites currently indexed in Google. Hopefully Google is smart enough to determine which is the correct site and not let this issue hurt organic ranking positions.

With that said, in our experience, it is never smart to let Google decide which is the correct site and which is a duplicate. We love you Google, but let’s face it, you’re far from perfect.

Luckily, the fix for this is pretty easy. You just need to implement a 301 redirect in your control panel. Don’t worry, we’ll explain exactly how to fix this step-by-step so you can get it taken care of. Hopefully once you see how easy this is to fix, you’ll never make the mistake again on a new WP Engine install.

How To Fix Problem #1 – Set Some Redirects

When you create a new WordPress install on WP Engine, a subdomain is created with your username.

Example: username.wpengine.com

Once you’ve got your site running on your actual domain, you should see something like this in your dashboard:

wpengine backend screen 1

Highlighted is the initial subdomain and the button to edit/delete the domain. Copy the subdomain first so you have it handy for the next step, then click the button and delete it.

Now click “Add Domain Redirect to this Domain”, the link under your main domain.

Now paste in the subdomain and click the Add Domain Redirect button to save.

wpengine backend screen 2

Ok, we’re almost there. The next step depends on how you set up your domain initially. You need to pick a permanent location for your domain if you haven’t already, either with the www. or without.

You can see here on Auditwp.com we have gone without it.

wpengine-failboat

So, if your setup is the same, you will need to redirect the www. version to the root of the domain. If you want to keep the www., you’ll need to do the opposite and redirect the root domain to the www. version instead. This part is somewhat trivial as WordPress will handle this redirect for you, but just to be extra safe we recommend that people do this redirect at the server level whenever possible.

wpengine backend screen 3

Alright, so now we have successfully configured our domain with the proper redirects to avoid any duplicate content or indexation issues! If you have your site set up like this, there’s little to no chance that your site is listed in the query we mentioned earlier, which is exactly what you want.
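
If you want to double-check the redirects from outside the dashboard, here’s a minimal sketch in Python. It uses only the standard library, and the function names and the your-domain.com placeholder are ours for illustration, not anything WP Engine provides. It requests a page without following redirects and checks that the answer is a 301 pointing at the domain you picked.

```python
import http.client
from urllib.parse import urlsplit


def redirects_to_canonical(status, location, canonical_host):
    """Return True if the response is a 301 whose Location is on canonical_host."""
    if status != 301 or not location:
        return False
    # urlsplit().hostname is already lowercased, so lowercase the target too.
    return urlsplit(location).hostname == canonical_host.lower()


def fetch_redirect(host, path="/"):
    """Request `path` from `host` without following redirects.

    Returns a (status, Location header) tuple. This makes a live HTTP request.
    """
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


# Example (hypothetical hostnames, uncomment to run against your own install):
# status, location = fetch_redirect("username.wpengine.com")
# print(redirects_to_canonical(status, location, "your-domain.com"))
```

Run the same check against both the username.wpengine.com subdomain and whichever of the www. or non-www. versions you decided not to use. Anything other than a 301 to your chosen domain means the redirect still needs attention.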

It should also be noted that some time ago WP Engine started setting default WordPress installs to use the “Discourage search engines from indexing this site” option in the WordPress admin reading settings. This is certainly helping the problem, but it’s not a real fix. With over 1.5 million results currently indexed in Google, it’s pretty obvious that the additional steps we outlined above need to be taken.

SEO Problem #2 – Improper CDN Configuration

The second problem area has to do with the content delivery network configuration. When you enable the CDN that WP Engine offers, you’ll usually see decreased load times, which is great. Unfortunately, turning on the CDN without properly configuring a CNAME record in your DNS zone will create yet another SEO issue. If you use the default settings when enabling your CDN, you will essentially be offloading your images to a domain that is shared by thousands of other users.

To Google, these images aren’t on your site, they are on - wpengine.netdna-cdn.com

You may notice that this is not even the wpengine.com domain but netdna-cdn.com. NetDNA is WP Engine’s CDN provider, so don’t panic. The problem is that you have effectively orphaned all of your images on a worthless subdomain of NetDNA-CDN.com, no bueno if you’re trying to get some image traffic from Google.

Similar to the duplicate site problem, we use this query – site:wpengine.netdna-cdn.com – to estimate the number of affected sites.

wpengine cdn google results
98,500 results… now that’s a lot of sites underperforming in Google image search! Especially since it’s only because they haven’t modified a few settings that anyone can change. So, if you’re in this boat and you like Google traffic as much as we do, the next section will help you get things buttoned up for good.

matt cutts image serp

How to Fix Problem #2 – Edit Your DNS Zone

Both of these fixes are pretty simple, but like many things they can be easily overlooked when setting up a new blog or website. In order to fix this problem, you’re going to need to log into your domain registrar’s control panel and create what’s called a CNAME. This lets you create an alias so that it appears as though your images are being served from your site’s domain even though they’re being served from NetDNA’s network.

Assuming you’ve already got your CDN enabled, the first thing you need to do is find your raw CDN URL. You can find this by inspecting a media file, such as an image, on your site. Here’s how to do it:

First, right click on the image, and choose to either “Open Image in New Tab,” or “View Image Info.” The choice you see depends on what browser you’re working in.

With the CDN turned on, the image path in your browser address bar should look something like http://username.wpengine.netdna-cdn.com/wp-content/uploads/2014/02/some-picture.jpg.

In order to set up your CNAME, you’re only going to need the CDN URL without the image information like this: http://username.wpengine.netdna-cdn.com. Keep this tab open for now because you’re going to need to reference this URL during the next step.
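
If you’d rather not trim the URL by hand, the same step can be sketched in a few lines of Python. The function name is our own, and the URL is just the example from above.

```python
from urllib.parse import urlsplit


def cdn_base_url(image_url):
    """Strip the path from an image URL, leaving only scheme and hostname."""
    parts = urlsplit(image_url)
    return f"{parts.scheme}://{parts.netloc}"


example = "http://username.wpengine.netdna-cdn.com/wp-content/uploads/2014/02/some-picture.jpg"
print(cdn_base_url(example))  # http://username.wpengine.netdna-cdn.com
```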

Create a CNAME record

To create a custom CDN URL, you need to set up a CNAME record for cdn.your-domain.com to point at username.wpengine.netdna-cdn.com. You’ll need to do this within your domain registrar’s control panel. Here are some resources that will explain exactly how to do this on some of the most popular domain registrars:

How to Set a CNAME at Popular Registrars

Once you’ve got your CNAME set up, there’s one final step that you need to take. You need to log back into your WP Engine control panel and submit a support ticket asking them to enable a custom CNAME for your CDN. It’s a little ridiculous that they make you submit a support request for this, but as of right now this is the only way to get it done.

Once their techs respond to the ticket and set up your custom CNAME, you should be all set. No more outside domain images!
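
To confirm the change took, you can scan a page’s HTML for images that are still served from a hostname outside your own domain. The sketch below is one rough way to do that with Python’s standard library. The class and function names, and the sample markup, are assumptions of ours for illustration. It only looks at the src attribute of img tags, so images loaded through CSS or srcset would be missed.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class ImageHostCollector(HTMLParser):
    """Collect the hostnames used in <img src="..."> attributes."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            host = urlsplit(src).hostname  # None for relative URLs
            if host:
                self.hosts.add(host)


def offsite_image_hosts(html, site_domain):
    """Return image hostnames that are neither site_domain nor a subdomain of it."""
    collector = ImageHostCollector()
    collector.feed(html)
    site_domain = site_domain.lower()
    return sorted(
        host for host in collector.hosts
        if host != site_domain and not host.endswith("." + site_domain)
    )


# Hypothetical page markup: one image on the shared CDN host, one on a custom CNAME.
sample = (
    '<img src="http://username.wpengine.netdna-cdn.com/wp-content/uploads/a.jpg">'
    '<img src="http://cdn.your-domain.com/wp-content/uploads/b.jpg">'
    '<img src="/wp-content/uploads/c.jpg">'
)
print(offsite_image_hosts(sample, "your-domain.com"))
# ['username.wpengine.netdna-cdn.com']
```

An empty list means every image the script found is on your domain or one of its subdomains, which is what you want once the custom CNAME is live.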

But Wait, There’s More

At the risk of causing more of a ruckus than necessary, we wanted to point out the fact that the first SEO problem we fixed is also a major privacy issue. When developers are building a website, they typically work on a staging server of some kind. Ideally, this server would always be completely hidden from Google’s view.

Unfortunately, if they’re working on WP Engine, that’s probably not the case. In the past, even WP Engine’s staging environment was being indexed, but thankfully they have cleaned up most of that. Still, many developers are using standard sites rather than staging areas, or both in some cases.

Your Developer Probably Knows How to Keep Google Out of Their Staging Environment

With that said, the data seems to suggest that a large number of developers don’t. As of right now, there are tons of development sites indexed and many of them haven’t been launched yet. This might not sound like a huge deal, but having an unfinished site indexed could open a real can of worms. To illustrate this point, how would you like to see what the new version of The Harvard Law Review is probably going to look like? Here you go:

harvard-law-review-redesign

Although this site is looking really nice, we’re going to assume they probably don’t want this being viewed while it’s still under construction. Yet, here it is, publicly available to anyone who runs a Google search: http://upstatement.wpengine.com

Talk about a potential PR nightmare!

Update: It looks like Upstatement has made their development site private after seeing this post. Much better! If anyone else is suffering from this problem, that’s a quick way to add some protection.

This isn’t an isolated case. We found dozens of sites like this by just looking through the first pages of results. If you dig through the queries we posted you’ll find all kinds of interesting things that probably shouldn’t be public. Again, some of the responsibility here is on the developers of these sites, but it really wouldn’t be that hard for WP Engine to prevent this from happening in the first place.
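
One quick sanity check for a development site is whether its robots.txt asks crawlers to stay out. Here’s a minimal sketch using Python’s built-in urllib.robotparser. The helper name, the deny-all rules, and the hostname are illustrative assumptions of ours. Keep in mind that robots.txt is only a request to well-behaved crawlers: it does not stop anyone from opening the URL, so making the site private is still the stronger protection.

```python
from urllib.robotparser import RobotFileParser


def allows_crawler(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A deny-all robots.txt of the kind you would want on a development site.
DENY_ALL = "User-agent: *\nDisallow: /\n"

print(allows_crawler(DENY_ALL, "Googlebot", "http://username.wpengine.com/"))  # False
print(allows_crawler("", "Googlebot", "http://username.wpengine.com/"))        # True
```

To test a live site, paste the contents of its /robots.txt into the first argument. A True result for a site that is still under construction is a sign it may end up in search results.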

Man, That’s a Lot of Information

Well there you have it, two ridiculously easy fixes that will improve your SEO. As an added bonus, following these instructions will also reduce your visibility to hackers and anyone who might want to snoop on your projects. Enjoy the increased search traffic and if you have friends running WP Engine, be sure to share this post with them!

Comments

  1. pmgarman says

    Just want to point out that the CDN fix only works with HTTP-only sites. If a site runs HTTPS (hopefully most ecommerce sites) as far as I know you cannot use your own CNAME for the CDN, at least while you are using WP Engine’s offering. This is primarily because at CDN services, custom SSL certificates are a paid feature.

    So, to get a custom CNAME for your CDN && HTTPS support, you may need to simply find another CDN provider of your own.

      • pmgarman says

        WPE’s CDN does work over SSL yes, but you cannot use your custom CNAME unless WPE has the CDN configure the CNAME with your SSL certificate (usually not free, not sure if WPE even offers it). The process is exactly the same regardless of if the SSL is a wildcard or not.

        • says

          Hi Brian, this is the Founder of WP Engine. Yes we do support loading your SSL on a custom CNAME on the CDN. It’s a bit of a manual process, meaning we do it for you in tech support rather than fully self-serve, but we will do it for you upon request.

          • pmgarman says

            Hey Jason – Is there any extra fee for that? iirc WPE’s CDN is over netdna/maxcdn and they do charge for it?

          • says

            We have a direct contract with NetDNA so you don’t need to worry about that. :-) One of the ways we’re able to provide the services we do at the prices we do is that we are purchasing in such large bulk from vendors.

            In any case, I’m sure a conversation over tech support or phone etc would be a better way to get your questions answered and/or get you set up with that, but thank you!

  2. says

    Hi Jacob, this is the Founder of WP Engine.

    Quite a bit of the information above is factually incorrect, but some is good feedback that we’re going to take away and act on.

    First, on staging areas, it is incorrect that they are counted as duplicate content, because we force a “deny robots everything” robots.txt file on staging. Using your own example above of our TorqueMag website: http://torque.staging.wpengine.com/robots.txt

    It’s true that some bots will ignore robots.txt, but all the major search engines that matter for SEO, which is your point, do honor it. Some — including Google! — will *scan* it anyway, but it doesn’t count for duplicate content. Matt Cutts has been extremely clear on this point, publicly.

    Second, on duplicate content on the WP Engine domains (e.g. torque.wpengine.com), again what you’re saying is factually incorrect *for Google* but is a good point for some other search engines. Here’s why:

    Google maintains a set of root domains that they know are companies that do exactly what we and many other hosting companies do. Included in that list are WordPress.com, SquareSpace, and us. When they detect “duplicate content” on subdomains from that list, they know that’s not actually duplicate content. You can see it in Google Search, but it’s not counted against you.

    We have had a dialog directly with Matt Cutts on this point, so this is not conjecture, but fact.

    However, your suggestion that it’s better to 301 that domain is still *also* very valid. Also, not all search engines are aware of this scenario, and thus one of the take-aways we have from your article is that we should auto-force robots.txt for the XYZ.wpengine.com domains just as we do for the staging domains, so that other search engines won’t be confused.

    So thank you for causing us to improve, but hopefully likewise you can correct the objectively incorrect information in the article as well.

    • says

      Hey Jason, yeah we didn’t mean the staging area, we said

      “In the past, even WP Engine’s staging environment was being indexed, but thankfully they have cleaned up most of that.”

      I saw that quite some time ago but it has since been corrected. The query would have to be site:.staging.wpengine.com, which we can see there are next to no results for.

      I’m not sure what you guys call the main blog

      Not the – username.staging.wpengine.com

      But this still indexed just fine – username.wpengine.com

      For example, even with the canonical set still indexing – https://www.google.com/#q=site:torque.wpengine.com

      This is something I was unaware of – “Google maintains a set of root domains that they know are companies that do exactly what we and many other hosting companies do”

      And that makes perfect sense, thanks for bringing that up, I’ll be adding that to the post shortly.

      To this – “We have had a dialog directly with Matt Cutts on this point, so this is not conjecture, but fact.”

      I consider no information from Matt Cutts to be “fact”, part of his job is to literally spread misinformation and propaganda throughout the SEO community. So while I can see how his words might be comforting, I wouldn’t trust him as far as you can throw him.

    • Gar says

      “…one of the take-aways we have from your article is that we should auto-force robots.txt for the XYZ.wpengine.com domains just as we do for the staging domains…”

      That’d be awesome, Jason. I’ve been trying to do this manually for all sites (very tedious) but hadn’t found a good way to do it efficiently. 9/10 times a site on a subdomain is a work in progress and hasn’t been mapped to a domain yet. For those rare occasions where it’s a blog.domain.com or something similar, a way to disable the default disallow robots.txt for certain sites in the User Portal would be an easy fix.

      • says

        You’re absolutely right that it’s not at all easy, whereas we could just make it automated. So we *must* do the latter! It’s on the schedule for next week’s development, in fact.

  3. Joost de Valk says

    “Failboating” your SEO. Wow Jacob you’re out-doing yourself. I honestly think that this is completely overblown. I don’t mind a bit of hubris, I’m probably well known for it myself, but this is too much. You’re in fact incorrect in several parts, as I see Jason Cohen has already told you.

    WP Engine should and probably will prevent some of these things you describe from happening, but to be honest I think the reason for that is not to “fix” everyone’s SEO: it’s to prevent people like you from jumping to stupid conclusions. Whether or not you believe Matt Cutts is up to you, just as it’s up to me to believe ANY post written by someone who I basically only know because he “baited” the entire SEO community with a very tacky “SEO Assholes” competition. “All in good fun”, of course. You have a talent for creating a fuss, that much is clear.

    Now where you *could* have impressed me is if you’d come up with good suggestions for WP Engine to fix the issues you described and had reached out to them. Since you neglected to I just did, with a couple of suggestions on how they could fix some of the issues. And you know what? Within 5 minutes of me sending the email I was talking to Jason and Tomas from WP Engine on Skype.

    • says

      Hey Joost, maybe my verbiage was a bit strong, but how else is an issue supposed to get noticed amongst all the noise each day?

      Personally I don’t have Jason on Skype so that wasn’t an option. But do you really think they would have done something when little old me submitted a ticket to give them a heads up?

      I doubt it, so we ran this post. Sure the actual “SEO” implications of the indexation issue can be argued, and I’ll admit it, maybe I did blow it out of proportion a bit, but the CDN issue of having your images on a completely different domain seems like it should grant some merit.

      And on the privacy side, it took me just 12 hours of scraping to harvest over 2,000 WP Engine user emails from those sites, that has to be an issue worth the salt alone.

      For the record, I have a ton of respect for you and I definitely don’t think you’re an “asshole”. And given your spot in 6th place I’d say neither does the rest of the community!

      • Tomas Puig says

        Hello Jacob,

        I’d love to connect with you so you feel like you have someone to talk with at WP Engine when these things come up. We’re also incredibly active on twitter so you could have reached us there if need be. I’ll send you a tweet to connect.

    • Alister Amo says

      first:
      “it’s marketing, idiots!”
      so it’s pretty hilarious to see so-called “marketing ninjas” complaining about this post.
      somebody call the wuuuambulanceeee

      second:
      FULL DISCLOSURE FOREVER! Users have full right to know that this happened and this is the only way.

      Or do you prefer to never notice that Apple has blatantly failed protecting the security of your data through SSL connections during more than two years because of a single line of code?

      COME ON!!

      Everybody is human and makes mistakes, but hiding the failures of the weaker is being unfair with the stronger and more skilled. We want to know so we can make good decisions with our money.

      Some errors are really braindead level. They should feel embarrassed at least. If I were them, I’d say thanks for the audit, say sorry for the mistake, and start working on fixes ASAP.
      If they don’t make more mistakes like that and continue caring about the customer base, surely 99% people will quickly forgive and everything will come back to normality. If they continue complaining, things can go worse. It’s a matter of attitude against problems.

      • says

        Thanks Alister.

        So I replied to their comments yesterday, with civility, although the comments from Jason and Joost are a bit patronizing of the overall picture here.

        Their comments immediately received a spike in votes (Jason’s got 12 in about 60 seconds) and were basically stickied to the top of the comment section. It was a busy morning of course so I didn’t think much of it. Then I started to notice the “counter-strike” they had launched on Twitter replying to ANYONE who tweets this article linking to the comments to expose the “fallacies”.

        Well I’m still looking for them (the fallacies), but with how easily influenced people can be, I’d like to set the record 100% straight, with the click bait aside.

        —————————-

        These Are NOT Fallacies, These Are Facts

        1. There is a massive indexation issue of WP Engine user blogs, over 1,500,000 indexed results currently.

        SEO or no SEO, WP engine users are exposed with an incredibly simple Google query.

        2. There are nearly 100,000 indexed results highlighting the improper CDN configuration.

        Many things in SEO are up for interpretation, such as example 1, but the improper CDN config hindering visibility can’t be. Any SEO knows about domain equity, it’s our job to build it. So hosting our images on a completely different domain is hurting our image visibility, fact.

        3. The above issues have both SEO and privacy implications.

        To what extent, that can be argued, but they seem pretty hell bent on making sure every person on Twitter thinks these truths are fallacies.

        —————————-

        It seems like an attempt to discredit me by focusing on the “duplicate content” SEO aspect but this isn’t the biggest issue, it’s just one piece of the puzzle that indicates a larger problem. As I said in my reply to Jason, we did NOT say the staging areas are being indexed. Quote from the piece:

        “In the past, even WP Engine’s staging environment was being indexed, but thankfully they have cleaned up most of that.”

        username.staging.wpengine.com

        Not indexed ^^

        username.wpengine.com

        Indexed ^^

        That’s not a fallacy, just shows that he didn’t read the article fully.

        What about the CDN issue?

        What about the massive list of WP Engine blogs I scraped last night?

        Is this a fallacy? – http://auditwp.com/wp-content/uploads/2014/02/WPengine-User-Blogs.txt

        They’re focusing way too much on picking apart the SEO when there is a much bigger issue ^^

        • Asterisk_Admin says

          Even if the staging area is blocked, it is still accessible by anyone who knows you are on WPengine, or by someone who got the link from a careless employee. In either case I just had an order come in on the staging version of my ecommerce site that got my system in a tizzy because of duplicate order numbers, and I just came across this thread because I was looking to solve this issue – and I have come up with the following solution which works quite well.

          Basically, just block the domain and redirect to the main site if not being accessed by an admin using htaccess.

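          Options +FollowSymlinks
          RewriteEngine on
          # Let the admin's own IP through (123.45.678 is a placeholder; use your real address)
          RewriteCond %{REMOTE_ADDR} !=123.45.678
          # Only act on requests for the staging hostname
          RewriteCond %{HTTP_HOST} ^yourinstall\.staging\.wpengine\.com$
          # Permanently redirect paths ending in a slash to the live domain
          RewriteRule (.*)/$ http://yourdomain.com/$1 [R=301,L]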

    • Asterisk_Admin says

      I just dropped this in my .htaccess via your plugin to solve this issue. (Thanks for a great plugin)

      Options +FollowSymlinks
      RewriteEngine on
      RewriteCond %{REMOTE_ADDR} !=123.45.678
      RewriteCond %{HTTP_HOST} ^yourinstall.staging.wpengine.c…$
      RewriteRule (.*)/$ http://yourdomain.com/$1 [R=301,L]

  4. says

    There’s actually a surprising amount of good information in this article but despite this the title is wayyy linkbaited and it’s clear you should have reached out to WP Engine first. It’s similar to finding a bug in Core or a plugin and gloating or showboating rather than doing the responsible and humble thing and informing first. If after 3 tickets you never hear a peep then maybe post an open letter or blog post without such a ridiculous title to get some important information out there for the community.

    That aside, my main takeaway from this article is to definitely make sure you use a CNAME with the CDN – I believe sites like flywheel require this before they set it up (don’t quote me on this). Secondly, the staging sites being indexed is an issue for me. That’s why I use Maintenance Mode plugins to effectively hide the site. It’s alarming that you were able to find new versions of unreleased sites like Harvard. I can just imagine the call now from that client… oh boy. So maybe as a suggestion WP Engine could force some sort of maintenance mode plugin to keep non-user traffic out.

    Ultimately the outcome of this article will be an improved service from WP Engine. I have had a mainly positive experience with their hosting and the people who run the company.

    • David Abramson says

      I agree…this article is massively helpful for people hosting with WP Engine but I feel like the title is more than a little inflammatory (perhaps on purpose).

      I just put my first client on WP Engine…not using the CDN yet so that’s good info to have for the future :)

      In their instructions for switching DNS they give pretty clear instructions for how to redirect the .wpengine subdomain site to your main site so that one isn’t really their fault.

      In the limited time with WP Engine, I’ve experienced slow support but they do seem to solve the problems which I can’t really say after many 15 minute phone calls with GoDaddy. That said, they need to hire more L1 and L2 tech support people sooner rather than later if they don’t want people to leave for another hosting company..

  5. Jacob Nicholson says

    Umm so, sorry, you might want to remove your link for http://upstatement.wpengine.com. When we tried it, it was pulling up the WordPress install script. We didn’t think it would work, but it did…

    Probably a good time to talk about general WordPress security :)

  6. says

    I was just working on a client’s WP Engine account and noticed this problem as well. One positive note is, their tech support were some of the best I have encountered if that helps anyone.

  7. says

    The multi installs come with a required domain mapping plugin. With this you don’t need to set a redirect, as the plugin does it automatically when you set the primary domain….

  8. SureFireWeb says

    Wow, awesome post!! I had no idea that this was happening =D. Love it, keep up the amazing work!
