Everything You Need To Know About The X-Robots-Tag HTTP Header


SEO, in its most fundamental sense, relies on one thing above all others: search engine spiders crawling and indexing your site.

But almost every website has pages that you don't want included in this exploration.

For instance, do you really want your privacy policy or internal search pages showing up in Google results?

In a best-case scenario, these do nothing to actively drive traffic to your site, and in a worst-case, they can divert traffic from more important pages.

Fortunately, Google allows webmasters to tell search engine bots what pages and content to crawl and what to ignore. There are several ways to do this, the most common being the use of a robots.txt file or the meta robots tag.

We have an excellent and in-depth explanation of the ins and outs of robots.txt, which you should definitely read.

But in high-level terms, it's a plain text file that lives in your site's root directory and follows the Robots Exclusion Protocol (REP).

Robots.txt provides crawlers with instructions about the site as a whole, while meta robots tags contain directions for specific pages.
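For reference, a minimal robots.txt might look like the sketch below (the disallowed path and sitemap URL are placeholders, not recommendations):

User-agent: *
Disallow: /internal-search/
Sitemap: https://example.com/sitemap.xml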

Some meta robots tags you might use include index, which tells search engines to add the page to their index; noindex, which tells them not to add a page to the index or include it in search results; follow, which instructs search engines to follow the links on a page; nofollow, which tells them not to follow links; and a whole host of others.
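A meta robots tag is placed in a page's <head> section. Here's a minimal example, using noindex and nofollow as an illustrative directive set:

<meta name="robots" content="noindex, nofollow">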

Both robots.txt and meta robots tags are useful tools to keep in your toolbox, but there's also another way to instruct search engine bots to noindex or nofollow: the X-Robots-Tag.

What Is The X-Robots-Tag?

The X-Robots-Tag is another way for you to control how your webpages are crawled and indexed by spiders. Sent as part of the HTTP header response for a URL, it controls indexing for an entire page, as well as specific elements on that page.
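Here's an illustrative (hypothetical) HTTP response for a URL, with the X-Robots-Tag sitting alongside the other headers:

HTTP/1.1 200 OK
Content-Type: application/pdf
X-Robots-Tag: noindex, nofollow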

And whereas using meta robots tags is fairly straightforward, the X-Robots-Tag is a bit more complicated.

But this, of course, raises the question:

When Should You Use The X-Robots-Tag?

According to Google, "Any directive that can be used in a robots meta tag can also be specified as an X-Robots-Tag."

While you can set robots directives with both the meta robots tag and the X-Robots-Tag, there are certain situations where you would want to use the X-Robots-Tag, the two most common being when:

  • You want to control how your non-HTML files are being crawled and indexed.
  • You want to serve directives site-wide instead of on a page level.

For example, if you want to block a specific image or video from being crawled, the HTTP response method makes this easy.

The X-Robots-Tag header is also useful because it allows you to combine multiple tags within an HTTP response or use a comma-separated list of directives to specify instructions.

Maybe you don't want a certain page to be cached and want it to be unavailable after a specific date. You can use a combination of the "noarchive" and "unavailable_after" directives to instruct search engine bots to follow these instructions.
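In an HTTP response, that combination might look like the sketch below (the date is just a placeholder):

X-Robots-Tag: noarchive
X-Robots-Tag: unavailable_after: 25 Jun 2025 15:00:00 PST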

Essentially, the power of the X-Robots-Tag is that it is much more flexible than the meta robots tag.

The advantage of using the X-Robots-Tag with HTTP responses is that it allows you to use regular expressions to apply crawl directives to non-HTML content, as well as apply directives on a larger, global scale.
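As an illustration, here's a hedged Apache sketch (assuming the mod_headers module is enabled) that uses a regular expression to apply the same directives to several document types at once:

<FilesMatch "\.(doc|docx|pdf)$">
  Header set X-Robots-Tag "noindex, noarchive, nosnippet"
</FilesMatch>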

To help you understand the difference between these directives, it's useful to categorize them by type. That is, are they crawler directives or indexer directives?

Here’s a handy cheat sheet to explain:

Crawler directives:

  • Robots.txt – uses the user-agent, allow, disallow, and sitemap directives to specify where on-site search engine bots are allowed and not allowed to crawl.

Indexer directives:

  • Meta robots tag – allows you to specify and prevent search engines from showing particular pages of a site in search results.
  • Nofollow – allows you to specify links that should not pass on authority or PageRank.
  • X-Robots-Tag – allows you to control how specified file types are indexed.

Where Do You Put The X-Robots-Tag?

Let's say you want to block specific file types. An ideal approach would be to add the X-Robots-Tag to an Apache configuration or a .htaccess file.

The X-Robots-Tag can be added to a site's HTTP responses in an Apache server configuration via the .htaccess file.

Real-World Examples And Uses Of The X-Robots-Tag

So that sounds great in theory, but what does it look like in the real world? Let's take a look.

Let's say we wanted search engines not to index .pdf file types. This configuration on Apache servers would look something like the below:

<Files ~ "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</Files>

In Nginx, it would look like the below:

location ~* \.pdf$ {
  add_header X-Robots-Tag "noindex, nofollow";
}

Now, let's look at a different scenario. Let's say we want to use the X-Robots-Tag to block image files, such as .jpg, .gif, .png, etc., from being indexed. You could do this with an X-Robots-Tag that would look like the below:

<Files ~ "\.(png|jpe?g|gif)$">
  Header set X-Robots-Tag "noindex"
</Files>

Please note that understanding how these directives work and the effect they have on one another is crucial.

For example, what happens if both the X-Robots-Tag and a meta robots tag are present when crawler bots discover a URL?

If that URL is blocked from crawling by robots.txt, then any indexing and serving directives cannot be discovered and will not be followed.

If directives are to be followed, then the URLs containing them cannot be disallowed from crawling.
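To make that interplay concrete: if your robots.txt contains a rule like the hypothetical one below, crawlers never fetch the matching URLs, so an X-Robots-Tag: noindex header served on them is never seen, and those URLs can still end up indexed (URL only) if other pages link to them.

User-agent: *
Disallow: /pdfs/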

Checking For An X-Robots-Tag

There are a few different methods you can use to check for an X-Robots-Tag on a site.

The easiest way to check is to install a browser extension that will show you X-Robots-Tag information about the URL.

Screenshot of Robots Exclusion Checker, December 2022

Another plugin you can use to determine whether an X-Robots-Tag is being used is the Web Developer plugin.

By clicking on the plugin in your browser and navigating to "View Response Headers," you can see the various HTTP headers being used.
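You can also check headers without a plugin by using curl from the command line (the URL below is a placeholder):

curl -I https://example.com/sample.pdf

The -I flag requests only the response headers; if an X-Robots-Tag is being served for that URL, it will appear in the output.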

Another method that can be used to scale, in order to pinpoint issues on websites with millions of pages, is Screaming Frog.

After running a site through Screaming Frog, you can navigate to the "X-Robots-Tag" column.

This will show you which sections of the site are using the tag, along with which specific directives.

Screenshot of Screaming Frog report, X-Robots-Tag, December 2022

Using X-Robots-Tags On Your Site

Understanding and controlling how search engines interact with your website is the cornerstone of search engine optimization. And the X-Robots-Tag is a powerful tool you can use to do just that.

Just be aware: It's not without its risks. It is very easy to make a mistake and deindex your entire site.

That said, if you're reading this piece, you're probably not an SEO beginner.

So long as you use it wisely, take your time, and check your work, you'll find the X-Robots-Tag to be a useful addition to your arsenal.