HTML and URLs

Jakob Jenkov
Last update: 2014-06-15

Every HTML document which is accessible via the web, is located at some URL. The word URL is short for Uniform Resource Locator. The URL is the "address" on the web of the HTML document.

URLs as Addresses of Resources

In fact it is not just HTML documents that have URL's. Any file accessible via the web has a URL, including images, JavaScripts, CSS style sheet files, Flash files etc. All these files are called resources. Even dynamically generated files of data have a URL.

When you want to view a HTML document in a browser, you type in the URL of the document into the browsers address bar.

Here is an example URL:

http://www.jenkov.com/books/jquery/index.html

This URL consists of three parts:

  1. Protocol
  2. Domain
  3. Resource path

All three parts are illustrated here:

A URL consists of a protocol name, domain name, and resource path.
A URL consists of a protocol name, domain name, and resource path.

The protocol tells what protocol is needed to access the resource the URL points to. Typically the protocol is either http or https.

The domain is a name that is translated into an IP address. Thus, the domain name really points to a server somewhere on the internet. This is the server hosting the resource. In the example above the domain is www.jenkov.com .

The resource path is the location of the resource within the server the resource is hosted on. In the example above, the resource path is /books/jquery/index.html . The resource path can be thought of as a logical directory structure. In the example above, the resource path contains two logical directories: books and jquery.

Query String

A URL can contain a query string. Here is an example:

http://www.jenkov.com/books/jquery/index.html?param1=value1¶m2=value2

The query string part of the above URL is:

?param1=value1¶m2=value2

The query string starts with a ? character, and then comes one or more name=value pairs. Each pair is separated by an & character. In the example above the query string contains two name=value pairs. The param1=value1 and the param2=value2.

The name of a name=value pair is the name of a parameter passed to the server where the resource is hosted. The value is the value of the parameter named by the name.

It is up to the server how it interprets the query string, and whether a query string is needed at all to access a given resource. If a resource does not need a query string in its URL, but you add one anyways, the server typically just ignores the query string.

Fragment Identifier

A URL can contain a fragment identifier. A fragment identifier points to (identifies) a fragment of the HTML document the URL points to. Fragments are typically only used in HTML documents. Using a fragment identifier in the URL you can point not only to the HTML document itself, but to a location inside the HTML document. This is covered in more detail in the text about links.

Here is an example of a URL with a fragment identifier:

http://www.jenkov.com/books/jquery/index.html#someFragmentId

The fragment identifier is appended after the # character. Thus, the fragment identifier in the example above is someFragmentId.

The fragment identifier has to point to a fragment ID in the target HTML document. How to insert that, is explained in the text on links.

A URL can consist of just the fragment identifier. Here is an example:

#someFragmentId

In that case the URL is interpreted as pointing to a fragment ID inside the same document as the URL is contained in.

In case a URL has a query string appended to it, the fragment identifier is appended after the query string. Here is an example URL with both query string and fragment identifier:

http://www.jenkov.com/books/jquery/index.html?param1=value1¶m2=value2#someFragmentId

Relative URLs

A URL can be relative. A relative URL consists of just the resource path itself.

A relative URL is interpreted relative to the URL of the HTML document that contains the URL. Thus, if the URL of the HTML document containing the URL is:

http://jenkov.com/books/jquery/index.html

then all relative URL's inside that HTML document are intepreted as being relative to that URL.

A relative URL containing just a document name, e.g. html-book.html is interpreted as being located in the same logical directory as the /books/jquery/index.html page. That means in the logical directory /books/jquery. The full resource path will be interpreted as:

/books/jquery/html-book.html

The protocol and domain name is also interpreted as being the same as the document containing the relative URL. Thus, the resource path /books/jquery/html-book.html is interpreted to be located at the URL:

http://jenkov.com/books/jquery/html-book.html.

You can use two dots (..) to signal that the relative URL points one directory up from the resource path of the document containing the URL. Thus, this relative URL

../html4/html-book.html

found inside a HTML document at the resource path /books/jquery/index.html, will be interpreted as the resource path:

/books/html4/html-book.html

Notice how the directory jquery is cut off the resource path, before the rest of the relative URL is appended to it.

The full URL will be interpreted as:

http://jenkov.com/books/html4/index.html

You can use multiple dots (..) separated by a slash character (/) if you want the relative URL to go up more than one logical directory. Thus, the relative URL

../../html-book.html

inside a document located at resource path /books/jquery/index.html, will be interpreted as:

/html-book.html

Notice how both the books and jquery logical directory path is cut off the documents URL, before the rest of the relative URL is appended to it.

The full URL will be interpreted as:

http://jenkov.com/html-book.html

A relative URL that starts with the slash (/) character is always interpreted as being relative to the root of the logical directory hierarchy, instead of to the URL of the document containing it.

Here is an example of a URL that is relative to the root of the logical directory structure:

/books/jquery/index.html

It doesn't matter what the URL is of the HTML document containing this URL, it will always be interpreted the same - as relative to the root of the logical directory hierarchy.

The Advantage of Relative URL's

It can be an advantage to link internally between pages in your website using relative URL's instead of full URL's including protocol and domain name.

Often when you develop a website on your local machine, the web server is running on the URL http://localhost:8080. If you use a full URL in your links and URL's, you will need to search and replace all URL's before putting the website online, on e.g. a domain like jenkov.com.

Using relative URL's, though, the URL's will be interpreted as being either relative to http://localhost:8080 or your websites domain (e.g. http://jenkov.com), depending on where you access the website. This makes development a lot easier, since you don't have to change URL's before uploading the website to the web server.

Jakob Jenkov

Featured Videos

Java Generics

Java ForkJoinPool

P2P Networks Introduction



















Close TOC
All Tutorial Trails
All Trails
Table of contents (TOC) for this tutorial trail
Trail TOC
Table of contents (TOC) for this tutorial
Page TOC
Previous tutorial in this tutorial trail
Previous
Next tutorial in this tutorial trail
Next