Several HTML elements, most notably the A element, may contain an attribute which takes a URL as its value. URLs, Uniform Resource Locators, are addresses of Web documents. More generally, URLs can be used on the Web to refer to "objects" on the Web or in other information systems.
The general syntax of absolute URLs is the following:
scheme://host:port/path/filename
where the scheme is, for example, one of the following:
http | a Web document (to be accessed using the Hypertext Transfer Protocol, HTTP) |
ftp | a resource to be retrieved using FTP (File Transfer Protocol), usually a file in a so-called FTP server |
file | a file on a particular computer; a file URL is hardly useful on the Web |
gopher | a file in a Gopher server |
mailto | electronic mail address |
news | a newsgroup or an article in Usenet news |
telnet | for starting an interactive session via the Telnet protocol (which is part of TCP/IP) |
The host part is an Internet host name such as www.hut.fi (or sometimes a numerical TCP/IP address). Notice that typically, but not necessarily, Web servers have domain names starting with www. The :port part specifies the port number.
Warning: Although many browsers allow you to omit the part http:// when specifying the URL of a document to be visited, you must not omit it when writing a normal URL into an HTML document. (Otherwise browsers will try to interpret it as a relative URL.)
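The general pattern and the warning above can be illustrated with a small sketch using Python's standard urllib.parse module. The URLs and file names below are hypothetical examples, not addresses taken from this document, and the sketch only shows how one common library splits and resolves URLs; browsers may differ in detail.

```python
from urllib.parse import urlsplit, urljoin

# Hypothetical URL following the pattern scheme://host:port/path/filename
url = "http://www.hut.fi:80/some/path/file.html"

parts = urlsplit(url)
scheme = parts.scheme    # the part before "://"
host = parts.hostname    # the Internet host name
port = parts.port        # the number after ":" (None if omitted)
path = parts.path        # path and filename together
print(scheme, host, port, path)

# A reference written without "http://" has no scheme, so it is
# resolved as a relative URL against the containing document's address.
base = "http://www.hut.fi/docs/index.html"  # hypothetical document location
resolved = urljoin(base, "www.example.org/page.html")
print(resolved)
```

The second part shows why the scheme must not be omitted in HTML: the host-like text simply becomes part of a path under the base document's directory.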
Actually, this pattern is mainly for Web documents, i.e. http URLs. For other URLs, simplifications and special interpretations are applied. For example, a mailto URL is just of the form mailto:address where address is a normal Internet E-mail address like [email protected] (as specified in RFC 822).
Please notice that appending anything to the E-mail address in a mailto URL is nonstandard and may result in lost mail without anyone noticing! (See also the discussion of mailto: URLs in the description of the A element.)
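As a rough illustration of the advice above, the following sketch builds a mailto URL from a bare address and refuses input that has something appended to it. The function name plain_mailto and the address user@example.org are hypothetical, and the check is only a simple sanity test, not a full RFC 822 address parser.

```python
def plain_mailto(address):
    """Return 'mailto:' + address, refusing anything beyond a bare address."""
    # Rough sanity check only (assumption): exactly one "@", and no "?"
    # or whitespace, which would indicate something appended to the address.
    if address.count("@") != 1 or any(c in address for c in "? \t\n"):
        raise ValueError("expected a bare E-mail address: %r" % address)
    return "mailto:" + address

print(plain_mailto("user@example.org"))
```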
An http URL can also carry a fragment identifier. The complete reference then consists of an absolute URL, the # sign and a name (which refers to a location within the document specified by the absolute URL). See the description of the A element for more information.
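The structure of such a reference can be sketched with urldefrag from Python's standard library, which separates the absolute URL from the name after the # sign. The URL and the name "intro" are hypothetical.

```python
from urllib.parse import urldefrag

# Hypothetical reference: absolute URL + "#" + name
reference = "http://www.hut.fi/docs/guide.html#intro"

document_url, name = urldefrag(reference)
print(document_url)  # the absolute URL of the document
print(name)          # the location within that document
```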
It is safest to enclose URLs in quotes when writing them as attribute values in HTML.
For an overview of URLs, see W3C material on addressing.
As regards the technical specifications of the syntax of URLs, see RFC 1738 (absolute URLs) and RFC 1808 (relative URLs).
In particular, the specifications say that within a URL only a limited set of characters can be used as such:
the letters A to Z and a to z,
the digits 0 to 9,
the characters $-_.+!*'(), (the comma included),
the characters ;/?:@=&# provided that they are used in the special meaning reserved for them in the RFCs mentioned above.
Other characters must be encoded. (The characters ;/?:@=&# must also be encoded if they are not used in their special meaning.) This encoding (which is defined by URL specifications, not HTML specifications) consists of the percent sign followed by two hexadecimal digits representing the character's code position. For example, tilde (~) should be presented as %7E and space as %20. (Violating the rules is much more likely to cause problems in the latter case than in the former.)
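The encoding rule can be sketched as a small hand-written function that follows the character list given above. The name percent_encode and the sample path are hypothetical, and the use of UTF-8 for characters outside ASCII is an assumption of this sketch rather than something the text specifies. The standard function urllib.parse.quote follows newer rules and, in recent Python versions, leaves the tilde unencoded, so it is used here only for decoding.

```python
import string
from urllib.parse import unquote

# Characters usable "as such" according to the list above.
ALWAYS_SAFE = string.ascii_letters + string.digits + "$-_.+!*'(),"
# Characters allowed only when used in their reserved special meaning.
RESERVED = ";/?:@=&#"

def percent_encode(text, keep_reserved=True):
    """Encode every character outside the allowed set as %XX."""
    allowed = set(ALWAYS_SAFE + (RESERVED if keep_reserved else ""))
    out = []
    for byte in text.encode("utf-8"):  # assumption: UTF-8 for non-ASCII
        ch = chr(byte)
        if byte < 128 and ch in allowed:
            out.append(ch)
        else:
            out.append("%{:02X}".format(byte))
    return "".join(out)

encoded = percent_encode("~user/my file.html")
print(encoded)
print(unquote(encoded))  # decoding restores the original text
```

Passing keep_reserved=False corresponds to the case where a reserved character such as / is not used in its special meaning and must therefore be encoded as well.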
When a URL occurs as an attribute value in HTML, there is another complication caused by the & character, which may have a special use in query form submissions. In principle, that character should be escaped as &amp;amp; or as &amp;#38; (there is a footnote in the HTML 2.0 specification about this), and browsers should process it so that the actual URL passed to the processing CGI script has that notation replaced by a plain & character. (Notice that it must not be encoded with the percent notation described above. This is a confusing issue, and CGI scripts should really be written so that the semicolon ; and not the ampersand & is used as field separator.)
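The escaping described above can be sketched with Python's standard html module. The query URL and the link text are hypothetical; html.escape also escapes the characters <, > and quotes, which goes slightly beyond what the text discusses.

```python
import html

# Hypothetical query URL with "&" as field separator
url = "http://www.example.org/cgi-bin/search?name=x&lang=en"

# Escape "&" for use inside a quoted HTML attribute value.
escaped = html.escape(url, quote=True)
link = '<a href="{}">search</a>'.format(escaped)
print(link)

# A browser is expected to reverse the escaping before requesting the URL.
restored = html.unescape(escaped)
print(restored)
```

The URL that reaches the server is the restored form with a plain & character; only the copy written into the HTML document carries the escape.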