
HTML Help & Resources

META Data

For the purposes of this reference, and in coordination with our product, only the meta data elements that are specifically relevant are discussed in detail here. They are as follows:

<META NAME="Description">

This meta tag is used somewhat inconsistently, but it does provide a snapshot of what to expect on a given page. Ideally, this tag is supposed to give a visitor a textual description of what they will read or see upon clicking a link to the page. It is usually displayed directly below the TITLE of a document in search results.

 Example of search result

While the intended purpose of this tag is to provide descriptions, or summaries, to search engines, it has too often been misused or abused. As a result, Google and some other search engines may choose to display content from the page itself instead of relying on this meta tag. In other words, it's a useful tag, but there is no guarantee that a search engine will display it.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
  <HEAD>
    <TITLE>My First HTML Document</TITLE>
    <META NAME="Description" CONTENT="This is my first HTML document">
  </HEAD>
  <BODY>
  This is my first HTML document.
  </BODY>
</HTML>

When creating your meta description tag, always include text that is visible inside the document. In fact, the description should probably include a prominent message, quote, or sales pitch found on your page that will inspire people to click on a link to it. Avoid descriptions that are too short or too long. Also avoid stuffing keywords or unrelated information into this tag, as most search engines (namely Google) may disregard its contents, penalize your page rank, substitute their own description of the page, or ignore the page altogether.
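As a rough sketch of the advice above, the description below reuses a sentence that also appears in the visible body text. The page, its title, and its wording are invented purely for illustration.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
  <HEAD>
    <TITLE>Hand-Made Oak Furniture</TITLE>
    <!-- Hypothetical description: repeats the sales pitch shown in the body below -->
    <META NAME="Description" CONTENT="Hand-made oak tables and chairs, built to order and delivered in four weeks.">
  </HEAD>
  <BODY>
    <H1>Hand-Made Oak Furniture</H1>
    <P>Hand-made oak tables and chairs, built to order and delivered in four weeks.</P>
  </BODY>
</HTML>
```

Because the description matches text a visitor will actually find on the page, it is less likely to be treated as unrelated or stuffed content.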

<META NAME="Keywords">

This meta tag has already been virtually phased out. This is in large part due to misuse and abuse by web designers who, in years past, used the tag to stuff keywords and manipulate search results. To a lesser extent, it is due to the advancement of search technology, whereby search engines crawl pages to determine the relevant keywords themselves.

However, including this tag is not going to harm a document's status, unless your intention is to place meaningless and irrelevant words and phrases inside it, which may result in a penalty. Always include words that can be found inside the document, but not too many (a maximum of 8-10). Preferably, choose words found in heading tags, anchors, and paragraphs, which informs search engines that the words you chose do hold some importance to the document.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
  <HEAD>
    <TITLE>My First HTML Document</TITLE>
    <META NAME="Description" CONTENT="This is my first HTML document">
    <META NAME="Keywords" CONTENT="first,html,document">
  </HEAD>
  <BODY>
  This is my first HTML document.
  </BODY>
</HTML>

Form a list of up to 10 keywords that can be found inside the document. Choose the words carefully, and remember that these are “key” words, meaning they hold some important relevance to the document.

Separate your keywords with commas, omit spaces where possible, and use lower case letters. Avoid phrases, contractions, and common words such as is, has, you, would, and so on.
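The rules above can be sketched as follows. The page content and file name are hypothetical; the point is only that every keyword is lower case, separated by commas without spaces, and also appears in a heading, anchor, or paragraph of the document.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
  <HEAD>
    <TITLE>Oak Dining Tables</TITLE>
    <!-- Hypothetical keywords: five words, each found in the body below -->
    <META NAME="Keywords" CONTENT="oak,dining,tables,chairs,furniture">
  </HEAD>
  <BODY>
    <!-- "oak", "dining", and "tables" come from the heading -->
    <H1>Oak Dining Tables</H1>
    <!-- "furniture" comes from the paragraph, "chairs" from the anchor -->
    <P>Our furniture is built to order. See our matching
    <A HREF="chairs.html">chairs</A>.</P>
  </BODY>
</HTML>
```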

<META NAME="Robots">

The Robots meta tag has evolved over time, advancing the interaction between search engines, webmasters, websites, and documents. All of the popular search engines will respect and adhere to the properties found inside this tag. The properties of this tag can be of some importance, because there are circumstances in which a document found on your website may not be for public viewing, or examination by a search engine.

Documents of this type could include personal information stored about users of your website, older pages that are no longer relevant to your site, or special promotional pages intended for a limited audience, such as subscribers or users with clearance to view them. Keep in mind that the tag only asks robots not to index a page; it does not stop anyone from opening it, so restricted pages still need their own access control. Another possible circumstance is a page that is necessary to your website but does not reflect its actual content or provide any meaningful or relevant data.

An example of this type of document would be a website “Terms of Use” page, which could be filled with legal jargon and terms that have nothing to do with the purpose of your site and are not representative of its content. You would not want search engines to index this page, because doing so attaches relevance and importance to it, which in turn decreases the overall effectiveness of the more relevant data and keywords found on your other pages.

Expressed another way, the terms of use could contain more textual content, by virtue of the detailed nature of legal documents, than all other pages on your website combined. In that case, your site would overwhelmingly produce keywords and content that visitors rarely see, resulting in unusual and unpredictable search results for your site.

Below is a typical example of a robots meta entry.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
  <HEAD>
    <TITLE>My First HTML Document</TITLE>
    <META NAME="Description" CONTENT="This is my first HTML document">
    <META NAME="Keywords" CONTENT="first,html,document">
    <META NAME="Robots" CONTENT="index,follow">
  </HEAD>
  <BODY>
  This is my first HTML document.
  </BODY>
</HTML>

“index,follow” tells robots to index the page contents and to follow all links found on the page.

“noindex,follow” tells robots not to index the page contents, but to follow all links found on the page.

“index,nofollow” tells robots to index the page contents, but not to follow any links found on the page.

“noindex,nofollow” tells robots not to index the page contents and not to follow any links found on the page.
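For the “Terms of Use” situation described earlier, a page might be marked up roughly as follows. This is only a sketch: the title, body text, and file name are invented, and “noindex,follow” is one reasonable choice, since it keeps the legal text out of the index while still letting robots follow the link back to the rest of the site.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
  <HEAD>
    <TITLE>Terms of Use</TITLE>
    <!-- Ask robots not to index this page, but to follow its links -->
    <META NAME="Robots" CONTENT="noindex,follow">
  </HEAD>
  <BODY>
    <H1>Terms of Use</H1>
    <P>By using this website you agree to the following terms...</P>
    <!-- Hypothetical link that robots may still follow -->
    <P><A HREF="index.html">Return to the home page</A></P>
  </BODY>
</HTML>
```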

<META NAME="Date">

The Date meta tag helps to define, for robots and visitors, when the document was created. This information can help robots determine whether the page needs to be crawled again, or can inform visitors as to when the document was created.

This tag is entirely optional, but adding it can help search engines and robots.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
  <HEAD>
    <TITLE>My First HTML Document</TITLE>
    <META NAME="Description" CONTENT="This is my first HTML document">
    <META NAME="Keywords" CONTENT="first,html,document">
    <META NAME="Robots" CONTENT="index,follow">
    <META NAME="Date" CONTENT="Mon, 21 May 2012 08:33:05 GMT">
  </HEAD>
  <BODY>
  This is my first HTML document.
  </BODY>
</HTML>

The date in this example follows the format defined in RFC 2616. Use whatever date information is relevant to you, as you are not required to add a time or a time zone.

<META NAME="Author">

This tag is practically self-explanatory: it describes who is responsible for the content inside the document.

This tag is entirely optional, but adding it can help with search engines and robots, since the author's name may be indexed and linked as a possible keyword combination.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
  <HEAD>
    <TITLE>My First HTML Document</TITLE>
    <META NAME="Description" CONTENT="This is my first HTML document">
    <META NAME="Keywords" CONTENT="first,html,document">
    <META NAME="Robots" CONTENT="index,follow">
    <META NAME="Date" CONTENT="Mon, 21 May 2012 08:33:05 GMT">
    <META NAME="Author" CONTENT="John Doe">
  </HEAD>
  <BODY>
  This is my first HTML document.
  </BODY>
</HTML>

Any relevant name will do here. It could be a company name, your name, or the name of the author of a web page you are displaying on your site.

<META HTTP-EQUIV="Content-Type">

The Content-Type tag declares the document's media type and the character set the browser should use to read it. There are many different languages, and that means many different character sets too. This tag is helpful if you are planning to publish documents in different languages that rely on different character sets.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
  <HEAD>
    <TITLE>My First HTML Document</TITLE>
    <META NAME="Description" CONTENT="This is my first HTML document">
    <META NAME="Keywords" CONTENT="first,html,document">
    <META NAME="Robots" CONTENT="index,follow">
    <META NAME="Date" CONTENT="Mon, 21 May 2012 08:33:05 GMT">
    <META NAME="Author" CONTENT="John Doe">
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
  </HEAD>
  <BODY>
  This is my first HTML document.
  </BODY>
</HTML>

ISO-8859-1, or “Latin-1”, is the most commonly used character set for Western European languages, while “UTF-8” is the most common universal character set.

However, there are many different character sets available for use. For a complete reference guide, visit http://www.w3.org.
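As an illustration of the universal case, the sketch below declares UTF-8 instead of ISO-8859-1. The page title and body text are invented, and the example assumes the file itself is saved in UTF-8 so that the declared character set matches the bytes the browser receives.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
  <HEAD>
    <!-- Declare the character set early so the browser decodes the rest correctly -->
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
    <TITLE>My Multilingual HTML Document</TITLE>
  </HEAD>
  <BODY>
    <!-- Characters from several scripts can share one UTF-8 document -->
    <P>Hello. Grüße. Привет. こんにちは。</P>
  </BODY>
</HTML>
```

If the declared character set and the saved file disagree, characters outside plain ASCII are likely to display incorrectly.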

<META HTTP-EQUIV="Expires">

Use the document expiration tag to inform robots when the content is no longer valid, so that the document can be discarded or removed from future indexes once the date has passed. However, this tag is somewhat unreliable, and it would be wise to back it up with your own processing. For example, add a server-side script that displays alternate content once the expiration date has passed, to ensure the expired document is not viewed by an audience you did not intend or plan for.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
  <HEAD>
    <TITLE>My First HTML Document</TITLE>
    <META NAME="Description" CONTENT="This is my first HTML document">
    <META NAME="Keywords" CONTENT="first,html,document">
    <META NAME="Robots" CONTENT="index,follow">
    <META NAME="Date" CONTENT="Mon, 21 May 2012 08:33:05 GMT">
    <META NAME="Author" CONTENT="John Doe">
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <META HTTP-EQUIV="Expires" CONTENT="Mon, 21 May 2012 09:33:05 GMT">
  </HEAD>
  <BODY>
  This is my first HTML document.
  </BODY>
</HTML>

The date in this example follows the format defined in RFC 2616. Use whatever date information is relevant to you, as you are not required to add a time or a time zone.

<META HTTP-EQUIV="Refresh">

The meta refresh tag, if defined, requests an automatic, timed browser reload of the current document. This is particularly useful if the contents of the page are drawn from an external source that updates frequently, so that visitors always see a current version of the document.

It's important to note that not all browsers recognize this meta tag, and that users can turn the meta refresh feature on or off in their browser options. Users will often disable this feature to avoid being re-directed to a page without their consent.

Avoid using this tag to re-direct users to another page. However, if it's necessary to use this tag, always provide a visible link that users can see and click manually in the event the meta refresh has been disabled. An example follows.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
  <HEAD>
    <TITLE>My First HTML Document</TITLE>
    <META NAME="Description" CONTENT="This is my first HTML document">
    <META NAME="Keywords" CONTENT="first,html,document">
    <META NAME="Robots" CONTENT="index,follow">
    <META NAME="Date" CONTENT="Mon, 21 May 2012 08:33:05 GMT">
    <META NAME="Author" CONTENT="John Doe">
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <META HTTP-EQUIV="Expires" CONTENT="Mon, 21 May 2012 09:33:05 GMT">
    <META HTTP-EQUIV="Refresh" CONTENT="10; URL=http://www.re-directed.com/url.html">
  </HEAD>
  <BODY>
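  <P>This is my first HTML document.</P>
  <P>Sending you to my second HTML document...</P>
  <!-- Visible fallback link in case meta refresh is disabled -->
  <P>If this page does not refresh in 10 seconds, go to the
  <A HREF="http://www.re-directed.com/url.html">second HTML document</A>.</P>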
  </BODY>
</HTML>

Refresh operates with a time interval and a related URL. The time interval is given in seconds, followed by a semicolon ( ; ) and the URL. In this example, the browser should re-direct the user to another URL after 10 seconds have passed.
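For the reload-only use described at the start of this section, the URL can be left out, in which case the browser reloads the current document. A minimal sketch, assuming a hypothetical page whose contents should be refreshed every 30 seconds:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
  <HEAD>
    <TITLE>Latest Scores</TITLE>
    <!-- No URL given: the browser reloads this same document every 30 seconds -->
    <META HTTP-EQUIV="Refresh" CONTENT="30">
  </HEAD>
  <BODY>
    <P>This page reloads automatically every 30 seconds.</P>
    <!-- Tell users how to update manually if meta refresh is disabled -->
    <P>If the page does not update, use your browser's reload button.</P>
  </BODY>
</HTML>
```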

Move on to the Next Topic - LINK Document Relationships.