Creating a Sitemap is desirable, though not required. For Yandex it must be a page (or pages) containing links to all of the site's documents. For Google, Yahoo!, MSN, etc. it must be written with XML tags. The rules for creating Sitemaps are described at http://www.sitemaps.org/ru/protocol.php and in the help files for Google's Webmaster Tools. The resulting file should resemble the following:
In the example above we describe just two pages, index.html and page_1.html:
1. <?xml version="1.0" encoding="UTF-8"?> - the XML declaration,
which states the XML version and the encoding. Be sure to save the file
with the .xml extension and in UTF-8 encoding.
2. The rest of the Sitemap resides between the <urlset> and </urlset>
tags. The opening tag also carries the address of the schema against which
the map is validated (in this example xmlns="http://www.sitemaps.org/schemas/sitemap/0.9").
3. Between the opening and closing tags <url> and </url> goes
all the information about one individual page of our site (a page
we want search engines to index):
a) <loc>http://BoBa.net/index.html</loc>
- location - required parameter; must contain the full URL of
the page.
b) <lastmod>2008-04-27</lastmod> - last
modified - the date the file was last modified.
Optional parameter, but supplying even an approximate value
may help robots work more efficiently with new and/or updated files
on your site. Pay attention to the date format. I've used YYYY-MM-DD.
You may represent the date in other ways, as long as they follow the W3C recommendations:
Complete date plus hours and minutes, + time zone designator:
YYYY-MM-DDThh:mmTZD - 1997-07-16T19:20+04:00;
Complete date plus hours, minutes and seconds, + time zone designator:
YYYY-MM-DDThh:mm:ssTZD - 1997-07-16T19:20:30+04:00;
Complete date plus hours, minutes, seconds and a decimal fraction
of a second, + time zone designator:
YYYY-MM-DDThh:mm:ss.sTZD - 1997-07-16T19:20:30.45+04:00;
where:
YYYY = four-digit year;
MM = two-digit month (04 = April);
DD = two-digit day of month (01 through 31);
hh = two digits of hour (00 through 23, am/pm not allowed);
mm = two digits of minute (00 through 59);
ss = two digits of second (00 through 59);
s = one or more digits representing a decimal fraction of a second;
TZD = time zone designator (Z or +hh:mm or -hh:mm).
c) <changefreq></changefreq> - optional parameter; indicates how
often the page is likely to change. Search engine robots may ignore it.
Valid values are:
always
hourly
daily
weekly
monthly
yearly
never;
d) <priority></priority> - optional parameter with a valid range
of values from 0.0 to 1.0; defines the priority of a page relative to other
pages of the same site. The default value is 0.5.
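The structure described in points 1 to 3 can also be produced by a script instead of by hand. What follows is only a minimal sketch in Python using the standard library; the function names build_sitemap and format_lastmod are made up for this illustration, and the changefreq and priority values in the sample data, as well as the full URL of page_1.html, are assumptions chosen only to show the tags.

```python
import xml.etree.ElementTree as ET
from datetime import date, datetime, timedelta, timezone

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly",
                    "monthly", "yearly", "never"}


def format_lastmod(value):
    """Format a date or a timezone-aware datetime in the W3C style above."""
    if isinstance(value, datetime):  # checked first: datetime subclasses date
        if value.tzinfo is None:
            raise ValueError("a datetime needs a time zone designator")
        return value.isoformat(timespec="seconds")  # YYYY-MM-DDThh:mm:ssTZD
    return value.isoformat()  # plain date -> YYYY-MM-DD


def build_sitemap(pages):
    """Return the Sitemap as UTF-8 bytes, XML declaration included.

    `pages` is a list of dicts: "loc" is required; "lastmod",
    "changefreq" and "priority" are optional.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        if "lastmod" in page:
            ET.SubElement(url, "lastmod").text = format_lastmod(page["lastmod"])
        if "changefreq" in page:
            if page["changefreq"] not in VALID_CHANGEFREQ:
                raise ValueError(f"invalid changefreq: {page['changefreq']!r}")
            ET.SubElement(url, "changefreq").text = page["changefreq"]
        if "priority" in page:
            priority = float(page["priority"])
            if not 0.0 <= priority <= 1.0:
                raise ValueError("priority must be between 0.0 and 1.0")
            ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    # Sample data: changefreq and priority are illustrative assumptions.
    pages = [
        {"loc": "http://BoBa.net/index.html", "lastmod": date(2008, 4, 27),
         "changefreq": "monthly", "priority": 0.8},
        {"loc": "http://BoBa.net/page_1.html"},
    ]
    print(build_sitemap(pages).decode("utf-8"))
```

Writing the returned bytes to a file named sitemap.xml in binary mode keeps the UTF-8 encoding intact. The xml_declaration argument of ET.tostring requires Python 3.8 or newer.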
*You can add the address of your Sitemap to the robots.txt file:
Sitemap: http://BoBa.net/sitemap.xml
And that line can be all that the file contains.
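If robots.txt already holds other rules, the Sitemap line can be appended by a script rather than by hand. This is a minimal sketch, assuming the standard "Sitemap: <full URL>" directive; the function name add_sitemap_directive is hypothetical, and the function works on the file's text, so reading and writing the file is left to the caller.

```python
def add_sitemap_directive(robots_txt, sitemap_url):
    """Return robots.txt text containing a Sitemap line for sitemap_url.

    The text is returned unchanged if the same directive is already there.
    """
    lines = robots_txt.splitlines()
    for line in lines:
        name, sep, value = line.partition(":")
        # The directive name is matched case-insensitively, the URL exactly.
        if sep and name.strip().lower() == "sitemap" and value.strip() == sitemap_url:
            return robots_txt
    lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # An empty robots.txt ends up containing only the Sitemap line.
    print(add_sitemap_directive("", "http://BoBa.net/sitemap.xml"), end="")
```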
Sitemap ping. A Sitemap can be submitted to the following search
engines through an HTTP request typed into the address bar of your
browser (after the equals sign, type the full URL of your Sitemap,
for example, once again, http://BoBa.net/sitemap.xml):
Google: http://google.com/webmasters/sitemaps/ping?sitemap=full_url_of_your_sitemap
Yahoo!: http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap=
Live Search (MSN): http://webmaster.live.com/ping.aspx?siteMap=
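The ping addresses above can be assembled by a short script. The sketch below only builds the URLs and sends no requests; the endpoints are copied from the list above as-is and may no longer be in service. The function name build_ping_urls is made up for this illustration, and percent-encoding the Sitemap address follows the sitemaps.org advice to URL-encode everything after the equals sign.

```python
from urllib.parse import quote

# Endpoints exactly as listed in the text; each ends with the equals sign.
PING_ENDPOINTS = {
    "Google": "http://google.com/webmasters/sitemaps/ping?sitemap=",
    "Yahoo!": "http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap=",
    "Live Search (MSN)": "http://webmaster.live.com/ping.aspx?siteMap=",
}


def build_ping_urls(sitemap_url):
    """Return a dict mapping each search engine to its complete ping URL."""
    encoded = quote(sitemap_url, safe="")  # percent-encode ':' and '/' as well
    return {engine: prefix + encoded for engine, prefix in PING_ENDPOINTS.items()}


if __name__ == "__main__":
    for engine, url in build_ping_urls("http://BoBa.net/sitemap.xml").items():
        print(f"{engine}: {url}")
```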