Google's Sitemap Protocol
Google has taken its latest step towards turning the Internet into a semantic web with the release of its Sitemap Protocol.
"The Sitemap Protocol allows you to inform search engine crawlers about URLs on your Web sites that are available for crawling. A Sitemap consists of a list of URLs and may also contain additional information about those URLs, such as when they were last modified, how frequently they change, etc." - Google
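To make that description concrete, here is a minimal sketch in Python of what producing such a file might look like. It assumes the element names (urlset, url, loc, lastmod, changefreq) and the namespace of the protocol as later standardized at sitemaps.org, which may differ from the namespace in Google's initial release. The function name build_sitemap and the example.com URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace of the protocol as standardized at sitemaps.org.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as a string for a list of entry dicts.

    Each entry needs a "loc" key; "lastmod" and "changefreq" are optional.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]
        # Only emit the optional hints the site actually knows.
        for tag in ("lastmod", "changefreq"):
            if tag in entry:
                ET.SubElement(url, tag).text = entry[tag]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages on a hypothetical site.
example_entries = [
    {"loc": "http://www.example.com/", "lastmod": "2005-06-03", "changefreq": "daily"},
    {"loc": "http://www.example.com/about.html"},
]
sitemap_xml = build_sitemap(example_entries)
print(sitemap_xml)
```

Only loc is required for each URL in this sketch; the other fields are hints that a crawler is free to use or ignore.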
Benefits to Google
One of the main benefits of the Sitemap Protocol is that it could let search engines such as Google see recently created or modified URLs that are only reachable from behind a form. This so-called "invisible net" is currently unavailable to most search engines and may contain information that is not available anywhere else on the Internet.
Another reason that Google is implementing its Sitemap Protocol is to be able to find updated pages faster. This could mean that updated pages are indexed by Google on the same day as their release, instead of a month later in some cases. Although the same could be achieved by Google crawling every page on a site every day, that is nearly impossible in practice, and the Sitemap Protocol offers an alternative.
I can see that with the Sitemap Protocol, Google could have two types of crawl in the future:
- Full Crawl - The typical Googlebot crawl over the whole site, looking at the links and doing the PageRank thing
- Quick Update - Google fetches only the new and updated pages but does not run all of its algorithms, thus saving processor time and bandwidth while staying up to date.
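The "Quick Update" idea can be sketched in a few lines of Python. This is purely an illustration of my speculation above, not a description of how Google actually crawls. The function name urls_needing_recrawl is hypothetical, and the sketch assumes that lastmod values are plain YYYY-MM-DD dates.

```python
from datetime import date


def urls_needing_recrawl(entries, last_crawl):
    """Return URLs whose lastmod date is later than the last crawl.

    entries is a list of (loc, lastmod) pairs, where lastmod is a
    "YYYY-MM-DD" string or None. Pages with no lastmod are skipped here
    and left for the next full crawl (an assumption of this sketch).
    """
    changed = []
    for loc, lastmod in entries:
        if lastmod is None:
            continue
        if date.fromisoformat(lastmod) > last_crawl:
            changed.append(loc)
    return changed


# Hypothetical sitemap contents and last crawl date.
entries = [
    ("http://www.example.com/", "2005-06-03"),
    ("http://www.example.com/old.html", "2005-01-15"),
    ("http://www.example.com/unknown.html", None),
]
to_fetch = urls_needing_recrawl(entries, last_crawl=date(2005, 6, 1))
print(to_fetch)
```

With this data only the home page has changed since the last crawl, so only that one URL would need to be fetched instead of all three.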
Benefits to websites
The first possible benefit to websites is the saving of bandwidth. If Google crawled more often to achieve the same result as it would with the Sitemap Protocol, a lot of a website's bandwidth would be wasted.
The Google Sitemap Protocol seems to be of more benefit to larger sites than to small ones. Google states that "Using this protocol does not guarantee that your Web pages will be included in search indexes. In addition, using this protocol may not influence the way your pages are ranked by a search engine." The main benefit of Sitemaps to websites is the possibility of getting updated pages into Google's index more quickly.
RSS Feed Replacement
Google is not the only one that stands to benefit from its Sitemap Protocol, as there are many alternative uses for Sitemap files. One possible application is as an RSS feed replacement. Many RSS feeds are now used to tell visitors with RSS readers that posts have been added to a forum or that articles have been added to a website. An alternative to an RSS reader could use the Sitemap files to find the most recently updated pages and use parts of each updated page to provide a similar service to an RSS feed.
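As a rough sketch of how such a reader might work, the Python below parses a Sitemap file and returns the most recently modified URLs, newest first. It assumes the sitemaps.org namespace and date-only lastmod values in a single consistent format, so that sorting them as strings gives chronological order. The function name latest_updates and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace of the protocol as standardized at sitemaps.org.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def latest_updates(sitemap_xml, limit=10):
    """Return up to `limit` (loc, lastmod) pairs, most recent first.

    URLs without a lastmod value are ignored, because there is no way
    to tell when they changed.
    """
    root = ET.fromstring(sitemap_xml)
    items = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if loc and lastmod:
            items.append((lastmod.strip(), loc.strip()))
    # ISO-style dates in one consistent format sort correctly as strings.
    items.sort(reverse=True)
    return [(loc, lastmod) for lastmod, loc in items[:limit]]


# Hypothetical Sitemap file contents.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/a</loc><lastmod>2005-06-01</lastmod></url>
  <url><loc>http://www.example.com/b</loc><lastmod>2005-06-03</lastmod></url>
  <url><loc>http://www.example.com/c</loc></url>
</urlset>"""

for loc, lastmod in latest_updates(SAMPLE, limit=2):
    print(lastmod, loc)
```

A real reader would still have to fetch each page to build a title and summary, since the Sitemap file itself carries only the URL and the optional hints.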