Search Engine Optimization

Anatomy of a Web Page

Breaking Down the Elements of a Web Page

A web page, which is really a web document, is divided into three distinct areas: the document type, the head, and the body. Beginners to HTML programming largely ignore the document type and the head, while intermediate programmers tend to over-emphasize the importance, or relevance, of certain aspects found inside the document head, which in some cases can harm your domain's reputation with search engines.

The body, on the other hand, is the most important part of any web page, and contains the content that will be displayed to visitors viewing the page. Because of its importance to a viewer, the body usually receives the most attention.

The Document Type

This is the first area of any web document that a browser reads or interprets. Professional web designers and SEO experts do not underestimate its importance. Browsers use the document type to determine which type of HTML is being used, and there are many types. The most common are listed below.

HTML 4.01 Strict Definition
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
HTML 4.01 Transitional Definition
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
XHTML 1.0 Strict Definition
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
XHTML 1.0 Transitional Definition
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

The document type definition, or DTD, defines the legal structure, elements, and attributes of a type of HTML. These definitions are particularly useful for encouraging web browsers to render HTML code in a standardized mode, or format, because each browser has the ability to render documents using its own mode or format.
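
As a rough sketch of where the declaration sits in a file, the minimal page below places the HTML 4.01 Strict declaration on the very first line. The title and paragraph text are placeholders chosen for illustration, not taken from a real site.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<!-- The declaration above comes first, before the HTML element. -->
<HTML>
  <HEAD>
    <TITLE>Placeholder title</TITLE>
  </HEAD>
  <BODY>
    <P>Placeholder paragraph.</P>
  </BODY>
</HTML>
```

If the declaration is left out, browsers typically fall back to an older, non-standard rendering mode, commonly called quirks mode, which is why the first line matters.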

It's important to note here that some browsers, such as Internet Explorer, still ignore or disregard parts of the document type definition, and render HTML code in their own format and style. However, organizations such as the W3C and Mozilla have been putting pressure on Microsoft to comply with standardized document definitions. Internet Explorer 8, Microsoft's latest achievement, has begun to comply with such a standard, but not quite all the way.

What is the difference between Strict and Transitional?

You can almost guess what the strict document definition describes: it means complying, 100%, with how the W3C has defined HTML coding. Our own personal opinion is that the strict definition is a headache to comply with, and that it neither improves the display of a web page nor affords any other advantage over the transitional definition. Unless, of course, as a programmer, you enjoy doing extra tedious work with no real benefit.

Transitional definition documents can be loosely defined, hence the name of the DTD file referenced in the declaration - loose.dtd. Every so often, the W3C reviews the standards of HTML and publishes drafts for review and approval by its members, to define new aspects of the language or improve the current ones. During this review, some HTML tags are deemed “deprecated”. In other words, these tags are being phased out and may not exist in future versions of the language. An example of a deprecated element in HTML 4.01 is the <FONT> tag, which sets the size and color of text.

The transitional definition allows you to continue using deprecated HTML elements and presentational attributes, such as the align attribute on <P> or the bgcolor attribute on <BODY>, which the strict definition rejects. Quoting rules, by contrast, depend on the language rather than on strict or transitional. XHTML requires every attribute value to be quoted, as in <p class="normal">, while HTML 4.01 also accepts single quotes and lets simple values, made up of letters, digits, hyphens, periods, underscores, and colons, go unquoted.
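
The fragment below is a small sketch of the kind of markup the two definitions treat differently. The class names notice and price-note are hypothetical and would need matching rules in your own stylesheet.

```html
<!-- Accepted by the Transitional DTD only: CENTER, FONT, and align are deprecated. -->
<CENTER><FONT color="red" size="4">Sale ends Friday</FONT></CENTER>
<P align="right">Prices in USD</P>

<!-- Accepted by both DTDs: the presentation is left to CSS classes. -->
<P class="notice">Sale ends Friday</P>
<P class="price-note">Prices in USD</P>
```

The second pair of paragraphs validates under either definition, so markup written this way can later move to the strict definition without changes.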

It can be quite a process to code HTML documents so that browsers display web content as intended, without the additional burden of making sure you haven't forgotten to dot an i. MF Web Services recommends using the transitional document definition, especially for those of you who are just starting to learn to program in HTML.

The Document Head

One could describe the document head as the “brains” of a web page, in that its contents and definitions tell the browser how to render certain aspects or elements of HTML, CSS, or client-side scripting in the document body.

There are two important considerations, regarding the document head:

  1. It is the second area of a web document to be loaded or examined, before any content contained in the document body; and
  2. It is an area where no displayed content is permitted. Its elements are strictly for processing.

Every properly designed document head contains just enough information to inform the layout, or display, of the document body. Adding too much information is not likely to benefit your page, and can ultimately harm its status with search engines.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
  <HEAD>
    <TITLE>Title of Web Document</TITLE>
    <META HTTP-EQUIV="content-type" CONTENT="text/html; charset=iso-8859-1">
    <META NAME="robots" CONTENT="index,follow">
    <META NAME="date" CONTENT="Sun, 12 Sep 2010">
    <META NAME="description" CONTENT="A summary description of the page.">
    <META NAME="keywords" CONTENT="keyword,keyword">
    <LINK REL="shortcut icon" TYPE="image/x-icon" HREF="http://www.site.com/favicon.ico">
    <LINK REL="stylesheet" TYPE="text/css" HREF="http://www.site.com/style/default.css">
    <SCRIPT TYPE="text/javascript" SRC="http://www.site.com/script.js"></SCRIPT>
  </HEAD>
  <!-- Nothing belongs in between these definitions. -->
  <BODY>

A common mistake among beginners is to place script, style, and even HTML elements between the end of the document head and the beginning of the document body, which can produce inconsistent display results or script errors.

What you see above is a simple, yet informative, example of a document head. It provides enough information to display the document body properly, and helps browsers identify document relationships and script execution, not to mention aiding search engines.

There are many uses for the document head. The example above does not list all of the meta data and link relationships available, and you can also use the head for your own specific purposes. However, providing only what is necessary will keep your pages loading quickly and error free.
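
To make the placement mistake described in this section concrete, here is a hedged sketch showing a misplaced script next to one possible correction. The file name script.js is a placeholder, and both fragments stop at the opening of the body, as the head example does.

```html
<!-- Mistake: the SCRIPT element sits between the head and the body. -->
<HTML>
  <HEAD>
    <TITLE>Title of Web Document</TITLE>
  </HEAD>
  <SCRIPT TYPE="text/javascript" SRC="script.js"></SCRIPT>
  <BODY>

<!-- Correction: the SCRIPT element is moved inside the head. -->
<HTML>
  <HEAD>
    <TITLE>Title of Web Document</TITLE>
    <SCRIPT TYPE="text/javascript" SRC="script.js"></SCRIPT>
  </HEAD>
  <BODY>
```

Moving the element inside the body instead is another valid choice. The point is only that it belongs to one of the two areas, not to the gap between them.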

The Document Body

Last, but certainly not least, is the document body. This is where the success of your page will be largely determined, provided you've formatted the preceding two areas well enough.

The document body is also the easiest area to describe and explain, as it's where all of your text, images, and video should be referenced.
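
As a minimal sketch, a body that references text, an image, and a video might look like the following in HTML 4.01. The file paths, dimensions, and wording are placeholders, and OBJECT is only one of several ways to embed video.

```html
<BODY>
  <H1>Page heading</H1>
  <P>Introductory text for visitors.</P>
  <P><IMG src="images/photo.jpg" alt="Description of the photo" width="320" height="240"></P>
  <!-- OBJECT embeds the media file; the link inside it is a fallback
       for browsers that cannot play the clip. -->
  <P>
    <OBJECT data="media/clip.mpg" type="video/mpeg" width="320" height="240">
      <A href="media/clip.mpg">Download the video clip</A>
    </OBJECT>
  </P>
</BODY>
```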

For more information about which HTML elements should appear in the document body, and how, read our article about Proper HTML Document Structure.