Uniform Resource Locator (World-Wide Web)
(URL, previously "Universal") A standard way of specifying the location of an object, typically a web page, on the Internet. Other types of object are described below. URLs are the form of address used on the World-Wide Web. They are used in HTML documents to specify the target of a hypertext link, which is often another HTML document (possibly stored on another computer).
Here are some example URLs:
    http://w3.org/default.html
    http://acme.co.uk:8080/images/map.gif
    http://foldoc.org/?Uniform+Resource+Locator
    http://w3.org/default.html#Introduction
    ftp://wuarchive.wustl.edu/mirrors/msdos/graphics/gifkit.zip
    ftp://spy:firstname.lastname@example.org/pub/topsecret/weapon.tgz
    mailto:email@example.com
    news:alt.hypertext
    telnet://dra.com
The part before the first colon specifies the access scheme or protocol. Commonly implemented schemes include ftp, http (World-Wide Web), gopher and file. The "file" scheme should only be used to refer to a file on the same host. Other less commonly used schemes include news or mailto (e-mail).
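As a minimal sketch of the rule that the scheme is whatever precedes the first colon, the following Python splits a few of the example URLs by hand. The helper name `scheme_of` is an assumption made for illustration and is not part of any specification.

```python
def scheme_of(url: str) -> str:
    """Return the text before the first colon, lower-cased.

    Hypothetical helper: it applies the "part before the first colon"
    rule literally and does no further validation.
    """
    scheme, colon, _rest = url.partition(":")
    if not colon:
        raise ValueError(f"no scheme found in {url!r}")
    return scheme.lower()


examples = [
    "http://w3.org/default.html",
    "ftp://wuarchive.wustl.edu/mirrors/msdos/graphics/gifkit.zip",
    "mailto:email@example.com",
    "news:alt.hypertext",
    "telnet://dra.com",
]

for url in examples:
    print(f"{scheme_of(url):8} {url}")
```

Real software would normally use a URL parsing library rather than splitting by hand. The manual split is shown only because it mirrors the wording above.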
The part after the colon is interpreted according to the access scheme. In general, two slashes after the colon introduce a hostname (host:port is also valid, or for FTP user:passwd@host or user@host). The port number is usually omitted and defaults to the standard port for the scheme, e.g. port 80 for HTTP.
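The host, port and user fields can be picked apart with Python's standard `urllib.parse.urlsplit`, as sketched below. The `DEFAULT_PORTS` table and the `describe_authority` helper are assumptions for illustration, since `urlsplit` itself reports a missing port as `None` rather than filling in a default. The FTP URL in the usage lines is a made-up example.

```python
from urllib.parse import urlsplit

# Hypothetical lookup table of conventional default ports per scheme.
DEFAULT_PORTS = {"http": 80, "ftp": 21, "telnet": 23, "gopher": 70}


def describe_authority(url: str) -> dict:
    """Split out user, password, host and port from the part after '//'."""
    parts = urlsplit(url)
    port = parts.port  # None when the URL does not give a port
    if port is None:
        port = DEFAULT_PORTS.get(parts.scheme)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": port,
    }


print(describe_authority("http://acme.co.uk:8080/images/map.gif"))
print(describe_authority("http://w3.org/default.html"))
print(describe_authority("ftp://user:passwd@ftp.example.org/pub/file.txt"))
```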
For an HTTP or FTP URL the next part is a pathname, which is usually related to the pathname of a file on the server. The file can contain any type of data but only certain types are interpreted directly by most browsers. These include HTML and images in GIF format. The file's type is given by a MIME type in the HTTP headers returned by the server, e.g. "text/html", "image/gif", and is usually also indicated by its filename extension. A file whose type is not recognised directly by the browser may be passed to an external "viewer" application, e.g. a sound player.
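The link between filename extension and MIME type can be illustrated with Python's standard `mimetypes` module. This is only a sketch of the extension-based guess, and the helper name `guess_mime_from_url` is an assumption. The authoritative type is still the one the server sends in its HTTP headers.

```python
import mimetypes
import posixpath
from urllib.parse import urlsplit


def guess_mime_from_url(url: str):
    """Guess a MIME type from the extension of the URL's pathname.

    Returns None when the extension is not in the module's table.
    """
    path = urlsplit(url).path
    filename = posixpath.basename(path)
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type


print(guess_mime_from_url("http://acme.co.uk:8080/images/map.gif"))
print(guess_mime_from_url("http://w3.org/default.html"))
```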
The last (optional) part of the URL may be a query string preceded by "?" or a "fragment identifier" preceded by "#". The latter indicates a particular position within the specified document.
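As a short illustration, Python's `urllib.parse.urlsplit` exposes the query string and fragment identifier as separate fields. The two URLs come from the examples above. Decoding "+" as a space with `unquote_plus` follows the form-style convention and is shown here as one possible reading of the query.

```python
from urllib.parse import urlsplit, unquote_plus

with_query = urlsplit("http://foldoc.org/?Uniform+Resource+Locator")
with_fragment = urlsplit("http://w3.org/default.html#Introduction")

# The text after "?" is the query string.
print(with_query.query)
print(unquote_plus(with_query.query))

# The text after "#" is the fragment identifier; the path excludes it.
print(with_fragment.path)
print(with_fragment.fragment)
```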
Only alphanumerics, reserved characters (:/?#"%+) used for their reserved purposes and "$", "-", "_", ".", "&", "+" are safe and may be transmitted unencoded. Other characters are encoded as a "%" followed by two hexadecimal digits. Space may also be encoded as "+". Standard SGML "&name;" character entity encodings (e.g. "&eacute;") are also accepted when URLs are embedded in HTML. The terminating semicolon may be omitted if the entity name is followed by a non-letter character.
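The percent-encoding rule can be tried out with Python's standard `urllib.parse` functions, and entity decoding with `html.unescape`, as sketched below. Note the assumptions: Python's `quote` follows the later RFC 3986 set of unreserved characters and encodes non-ASCII text as UTF-8, so its idea of "safe" does not match the list above exactly. The sample strings are invented for illustration.

```python
import html
from urllib.parse import quote, quote_plus, unquote

name = "top secret.txt"

# A space becomes "%" followed by two hexadecimal digits (0x20).
encoded = quote(name)
print(encoded)

# In query strings a space may instead be written as "+".
plus_encoded = quote_plus(name)
print(plus_encoded)

# Decoding reverses the "%xx" escapes.
print(unquote(encoded))

# A URL embedded in HTML may carry character entities, which an HTML
# parser resolves before the URL itself is interpreted.
href_in_html = "http://example.org/?name=caf&eacute;&amp;lang=fr"
print(html.unescape(href_in_html))
```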
See the authoritative W3C URL specification (http://w3.org/hypertext/WWW/Addressing/Addressing.html).