HTML Element Structure
Jakob Jenkov
As mentioned, an HTML document consists of HTML elements. This text takes a closer look at the basic structure of an HTML element.
An HTML element consists of the following parts:
- Start tag
- Attributes
- Element body
- End tag
Each of these parts is explained throughout the rest of this text.
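To see how the four parts fit together, here is a small annotated sketch. The title attribute and its value are just an illustration chosen for this example; any other attribute would show the same structure.

```html
<!-- start tag: <b title="Example tooltip">   (element name: b)      -->
<!-- attribute: title="Example tooltip"       (inside the start tag) -->
<!-- body:      Text bla bla                                         -->
<!-- end tag:   </b>                                                 -->
<b title="Example tooltip">Text bla bla</b>
```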
Element Names + Start and End Tags
HTML elements have a start tag and an end tag. For example, <html> and </html>.
The < character opens a tag and the > character closes it. The slash character (/) before the tag name marks it as an end tag.
The element content is written between these two tags. The element content is also called the element body. Here is an example:
<b>Text bla bla</b>
The example shows a <b> element consisting of a start tag (<b>), element content (Text bla bla), and an end tag (</b>).
The name of the element is written inside both the start and end tags. In the example above, the name of the element is b, which is short for "bold". The b element marks its body to be displayed in bold. Later texts will get into more detail about the various elements available.
Empty Elements
Some HTML elements have no element body, meaning you cannot write any text between the start tag and end tag.
An example is the br element, which puts a line break into the HTML document. Here is an example:
<br>
Empty HTML elements have no end tag. You just write the start tag, as shown in the example above.
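As a small illustration of an empty element in context, the sketch below uses br inside a paragraph, plus two other empty elements (hr and img) that are not covered in this text. The file name logo.png is a made-up placeholder.

```html
<!-- br has no body and no end tag; it just breaks the line -->
<p>First line<br>Second line</p>

<!-- hr (a horizontal rule) and img (an image) are also empty elements -->
<hr>
<img src="logo.png" alt="Logo">
```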
Element Attributes
An HTML element can have attributes. Attributes are embedded inside the start tag of the HTML element. Here is an example:
<table border="0"> ... </table>
The table element above has an attribute named border. The value of the attribute is 0.
The format of an attribute declaration is:
name="value"
The = character and the quotes are part of the attribute declaration, but they are part of neither the attribute name nor the value. You can use either double or single quote characters as quotes. Here are two examples:
<table border="0"> ... </table>
<table border='0'> ... </table>
Both of these examples are valid.
You are also allowed to omit the quotes around the attribute value. Here is an example:
<table border=0> ... </table>
This is the same table element with the same border attribute, but without quotes around the attribute value. If you omit the quotes, the attribute value cannot contain spaces. If the attribute value contains spaces, enclose the attribute value in quotes.
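The sketch below shows why the quotes matter when a value contains a space. The input element and its value attribute are used only as an illustration, and "John Doe" is a made-up value.

```html
<!-- Quoted: the value of the value attribute is "John Doe" -->
<input type="text" value="John Doe">

<!-- Unquoted: the value ends at the space, so value is just "John",
     and Doe is read as a second attribute with no value -->
<input type=text value=John Doe>
```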
Some attributes have no value. Their presence or absence carries the meaning itself. Here is an example:
<input type="checkbox" checked>
This input element has an attribute named checked, which has no value. If the checked attribute is present, it means that the checkbox should be displayed as checked. If the checked attribute is absent, it means that the checkbox should be displayed unchecked.
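Here is a short side-by-side sketch of the presence and absence cases. The second variant, which writes the attribute name as its own value, is an equivalent long form that is not mentioned above.

```html
<!-- Displayed checked: the checked attribute is present -->
<input type="checkbox" checked>

<!-- Also displayed checked: the long form of the same attribute -->
<input type="checkbox" checked="checked">

<!-- Displayed unchecked: the attribute is absent -->
<input type="checkbox">
```

Because only the presence of the attribute counts, writing checked="false" still gives a checked checkbox. To get an unchecked checkbox, leave the attribute out.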
Nested Elements
Some HTML elements can have other HTML elements nested inside them. Here is an example:
<p> Mary had a <b>little</b> lamb. </p>
This example shows a <p> element with text inside. Part of the text is surrounded by a <b> element. This <b> element is nested inside the <p> element. The <p> element is used around paragraphs of text, and the <b> element is used to mark text to be displayed in bold. Both of these elements will be explained in more detail in later texts.
Make sure that the nesting of elements is done correctly, meaning that the start and end tags are written in the correct order. Here is an example of a text that is both bold and italic:
<p><b><i>Bold italic text</i></b></p>
Notice how the inner element, <i>, both starts and ends inside the <b> element. A common mistake is to swap the end tags, for instance like this:
<p><b><i>Bold italic text</b></i></p>
Notice how the end tags of the <i> and <b> elements are now swapped. The <b> element is ended before the <i> element. This is not valid HTML. Most browsers will still display the HTML correctly, but don't rely on it. Make sure your nesting of start and end tags is correct.
Nesting elements is often necessary to achieve more advanced document formatting, as you will see later in this book.
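One way to keep the nesting correct is to indent each nested element under the element that contains it, so that every end tag lines up with its start tag. A minimal sketch, reusing the elements from the examples above:

```html
<p>
  <b>
    <i>Bold italic text</i>
  </b>
</p>
```

The extra indentation and line breaks do not change what is displayed, because the browser ignores them, as the next section explains.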
White Space and Line Breaks
The browser ignores extra white space and line breaks. White space characters include the space character, tab character, line break character and similar characters that represent blank (white) space in the document. Line breaks are characters that represent new lines in a normal text editor like Notepad.
Here are two examples with different amounts of white space:
<p>John is good</p>
<p>
    John    is    good
</p>
These two paragraphs of text will be displayed the same way in the browser. The browser ignores white space before the first non-white-space character in a paragraph. The browser also ignores the line breaks inside the <p> element.
Additionally, the browser ignores the extra white space between the words, displaying only a single space between the words, and it ignores the extra white space and line break after the last word in the paragraph.
A single line break between two words will be displayed as a space. Extra line breaks in between words will be ignored, just like white space characters.
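The following sketch illustrates the single line break case, reusing the sentence from the nesting example. Both paragraphs should be displayed the same way.

```html
<!-- The line break between "a" and "little" is displayed as a single space -->
<p>Mary had a
little lamb.</p>

<!-- Displayed the same as the paragraph above -->
<p>Mary had a little lamb.</p>
```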
White Space Inside Tags
The browser also ignores extra white space and line breaks at some points inside HTML tags. Here are some examples:
<p > Text </p>
<table width="100" height="200" > </table>
Both of these examples are valid HTML.
It is not allowed to have white space between the < character and the element name, though. Thus, this is not valid HTML:
< p>
The browser uses the text right after the < character to determine what HTML element it has encountered. If white space follows the < character, the browser assumes that it is not the start of an HTML tag but just a < character, which is then displayed as is.
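A practical use of this rule is to spread a long start tag over several lines, one attribute per line. The sketch below reuses the table attributes from the example above and contrasts them with the invalid case.

```html
<!-- Valid: line breaks and spaces between the attributes -->
<table
    width="100"
    height="200"
>
</table>

<!-- Not a tag: the space after < makes the browser display "< p>" as text -->
< p>
```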
Uppercase and Lowercase Characters in Element Names
HTML element names are case insensitive, meaning it does not matter if you write them in uppercase or lowercase. That means that the following two examples are displayed the same in the browser:
<b>My Text</b>
<B>My Text</B>
In many cases the attribute names and attribute values are also case insensitive.
Personally I always use lowercase characters when writing element names, attribute names and values. I find it easier to read, and easier to write too.
The content inside the element is case sensitive, though. The text inside the <b> element (My Text) will be displayed with uppercase and lowercase characters exactly as you write them.
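The sketch below illustrates one attribute case where uppercase and lowercase make no difference and one where they do. The input elements and the "My Text" value are chosen only for illustration.

```html
<!-- These two elements are treated the same: element names, attribute
     names, and the type value are matched without regard to case -->
<input type="checkbox" checked>
<INPUT TYPE="CHECKBOX" CHECKED>

<!-- Here the case matters: the value attribute holds text, so these
     two fields show different text -->
<input type="text" value="My Text">
<input type="text" value="MY TEXT">
```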