WordPress 5.7 features a new Robots API that provides filter-based control over the robots meta tag. So if your site is running WordPress 5.7 or later, you will notice a new <meta /> tag included in the <head></head> section of your web pages. By default, the meta tag added by WordPress has a value of max-image-preview:large, which is fine IF it is the only robots meta tag on the page. If your site already has its own meta robots tag, […] Continue reading »
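To see whether a page ends up with more than one robots meta tag, something like the following Python sketch could help. It is not from the post itself: the `RobotsMetaCollector` class and `find_robots_meta` function are hypothetical names, and the sketch only inspects HTML you pass in as a string, using the standard library's `html.parser`.

```python
from html.parser import HTMLParser


class RobotsMetaCollector(HTMLParser):
    """Collect the content value of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.robots_values = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() == "robots":
            self.robots_values.append(attr_map.get("content") or "")


def find_robots_meta(html_text):
    """Return the content values of all robots meta tags, in document order."""
    collector = RobotsMetaCollector()
    collector.feed(html_text)
    collector.close()
    return collector.robots_values
```

If `find_robots_meta(page_source)` returns more than one value, the page carries multiple robots meta tags, which is the situation the excerpt warns about.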
One way to prevent Google from crawling certain pages is to use <meta /> elements in the <head></head> section of your web documents. For example, if I want to prevent Google from indexing and archiving a certain page, I would add the following code to the head of my document: Continue reading »
Controlling the spidering, indexing and caching of your (X)HTML-based web pages is possible with meta robots directives such as these: <meta name="googlebot" content="index,archive,follow,noodp"/> <meta name="robots" content="all,index,follow"/> <meta name="msnbot" content="all,index,follow"/> I use these directives here at Perishable Press and they continue to serve me well for controlling how the “big bots” crawl and represent my (X)HTML-based content in search results. For other, non-(X)HTML types of content, however, using meta robots directives to control indexing and caching is not an option. An […] Continue reading »
Want to make sure that your feeds are not indexed by Google and other compliant search engines? Add the following code to the channel element of your XML-based (RSS, etc.) feeds: <xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex"></xhtml:meta> Here is an example of how I use this tag for Perishable Press feeds (vertical spacing added for emphasis): Continue reading »
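For feeds that are generated or post-processed by a script, the same tag could be added programmatically. The Python sketch below is an illustration only, not the method used at Perishable Press: `add_noindex_to_feed` is a hypothetical helper, and it assumes an RSS document whose root element has a direct `channel` child.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def add_noindex_to_feed(feed_xml):
    """Insert an xhtml:meta robots noindex element as the first child of <channel>."""
    # Ask ElementTree to serialize the XHTML namespace with the "xhtml" prefix.
    ET.register_namespace("xhtml", XHTML_NS)
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    if channel is None:
        raise ValueError("no <channel> element found")
    meta = ET.Element(
        "{%s}meta" % XHTML_NS,
        {"name": "robots", "content": "noindex"},
    )
    channel.insert(0, meta)
    return ET.tostring(root, encoding="unicode")
```

Note that ElementTree declares the `xhtml` namespace on the root element rather than on the meta element itself; the resulting XML is equivalent to the tag shown above.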
During the most recent Perishable Press redesign, I noticed that several of my WordPress admin pages had been assigned significant levels of PageRank. Not good. After some investigation, I realized that my ancient robots.txt rules were insufficient in preventing Google from indexing various WordPress admin pages. Specifically, the following pages have been indexed and subsequently assigned PageRank: Continue reading »
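One way to sanity-check a set of robots.txt rules before relying on them is to run them through a parser. The Python sketch below uses the standard library's `urllib.robotparser`; the `blocked_paths` helper, the `example.com` host, and the sample paths in the usage note are assumptions for illustration, not the pages or rules from the post.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_lines, paths, user_agent="Googlebot"):
    """Map each path to True if the given robots.txt rules disallow it."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    # can_fetch() returns True when crawling is allowed, so invert it.
    # The host is a placeholder; only the path is matched against the rules.
    return {
        path: not parser.can_fetch(user_agent, "https://example.com" + path)
        for path in paths
    }
```

For example, with the rules `["User-agent: *", "Disallow: /wp-admin/"]`, a path such as `/wp-admin/index.php` is reported as blocked while `/wp-login.php` is not, which shows how a rule set can miss admin-related pages.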
This XHTML header tags resource is a work in progress, perpetually expanding and evolving as new information is obtained, explored, and integrated. Hopefully, you will find it useful in some way. Even better, perhaps you will share any complimentary or critical information concerning the contents of this article. To get a better idea, scroll through the Table of Contents. Continue reading »
Head Meta Data (previously known as Head MetaData Plus) adds a complete set of <meta /> tags to the <head></head> section of all posts and pages on your site. Including meta information about your site is a great way to refine definition, enhance branding, and improve the semantic quality of your pages. Continue reading »