Posts tagged google
Google Caffeine: Google’s New Search Engine Indexing System
Jun 15th
Google has announced its new indexing system, called Caffeine. Google says the Caffeine indexing system is much faster than its older system. One advantage is that news stories, blog posts and forum posts are indexed better and faster, allowing users to find links to relevant content much sooner after it is published than was possible before.
Google Caffeine provides 50 percent fresher results for web searches than its last indexing system. According to Google, this is the largest collection of web content Google has ever offered.
Why did Google create a new indexing system instead of using the current index?
Content on the web is continually expanding. This content might be text, images, videos, news or real-time updates. Today’s web pages are also richer and more complex. From the searcher’s perspective, it is important to get the latest relevant content. Publishers need their content to appear in search results shortly after they post it. To meet these expectations, Google implemented its new search indexing system.
The image below shows the difference between the old indexing system and the new Caffeine.

Before comparing the old index and Caffeine, let’s look at how search engines crawl the web and store the data found there. When you search Google, you’re not searching the live web; you’re running your query against Google’s database, or index, much like the register in a library. If we want to find a specific book, we first go to the librarian or search the library’s index. Once we know where the book is shelved, we go to that rack to get it. This index, just like a card catalog in a library, helps you find specific content. Google does the same for you: it keeps an index of the web, which shows the path to the data relevant to your search query.
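To make the card catalog idea concrete, here is a minimal sketch of an inverted index in Python. It is only an illustration of the general technique, not how Google actually stores its index; the page URLs, the text and the function names `build_index` and `search` are all made up for this example.

```python
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start with the pages for the first word, then narrow down.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical crawled pages; URLs and text are invented for illustration.
pages = {
    "example.com/a": "google announces caffeine indexing system",
    "example.com/b": "library card catalog index",
}
index = build_index(pages)
print(search(index, "caffeine indexing"))
```

The search never touches the pages themselves; it only looks words up in the index, which is the same reason a library visit starts at the catalog rather than at the shelves.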
Google’s old indexing system had several layers, and some layers refreshed at a faster rate than others. The main layer updated every couple of weeks, which means a significant delay between when content was published and when it could be found. News, for example, though highly relevant shortly after it happens, tends to become history rather than news in a very short time. We live in a world of instant gratification and get news immediately on our cell phones or computers; a week’s delay isn’t really acceptable.
With Caffeine, Google analyzes the web in small portions and updates its search index rapidly, continuously and globally. If the “bot” finds new information on existing pages, it is added directly to the index. So for both searchers and publishers, if you need fresher information, here you have it. Publishers can also publish their hot information and get it prioritized for indexing, making it appear in searches much faster.
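The difference between rebuilding a whole layer and updating in small portions can be sketched as follows. This is a toy model under our own assumptions, not Google’s implementation: the class name `IncrementalIndex` and its methods are hypothetical, and a real system would do far more than split text on spaces.

```python
from collections import defaultdict


class IncrementalIndex:
    """Toy index that is updated one page at a time."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> URLs containing it
        self.page_words = {}              # URL -> words last indexed

    def update_page(self, url, text):
        """Re-index a single page without rebuilding everything else."""
        new_words = set(text.lower().split())
        old_words = self.page_words.get(url, set())
        for word in old_words - new_words:
            self.postings[word].discard(url)  # word was removed from the page
        for word in new_words - old_words:
            self.postings[word].add(url)      # word is new on the page
        self.page_words[url] = new_words

    def lookup(self, word):
        return set(self.postings.get(word.lower(), set()))


idx = IncrementalIndex()
idx.update_page("example.com/news", "election results pending")
# The page changes; only this page is re-processed.
idx.update_page("example.com/news", "election results announced")
print(idx.lookup("announced"))
print(idx.lookup("pending"))
```

As soon as `update_page` returns, the changed page is searchable under its new words, which is the property that makes fresher results possible.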
According to Google, Caffeine processes hundreds of thousands of pages in parallel every second. So the new indexing system is much busier than the old one, getting content indexed and available faster.
Google says: “We’ve built Caffeine with the future in mind”. Its idea is to build a faster and more comprehensive search engine that delivers more relevant results. Google engineers are still working on this system, and there will be more improvements in the coming months. We will have to wait to see what those are.
Determine the best way to automate Sitemaps
Nov 16th
In simple terms, a sitemap (or site map) is a list of all the pages in your website. Sitemaps provide two benefits: easier navigation (for visitors of your site) and better visibility by search engines.
With the rise of modern SEO techniques the importance of the sitemap has been growing. Sitemaps are the best way to inform search engines about changes on your website.
As a development company, we always apply current SEO techniques on our customers’ websites to ensure that they get top ranking on search engines. Not only on Google, but on Yahoo, Bing, Ask.com and others as well.
Including sitemaps is one of the important tasks we perform when developing sites for our clients. This is done either manually or dynamically, according to the customer’s needs.
Typically a good site with lots of content changes regularly. In this case it is expensive and tedious to continually update the sitemap. For this reason, we feel that in some cases it is important to be able to auto-generate a sitemap.
We evaluated some sitemap auto-generating tools; the following are some of the solutions we like:
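As a rough idea of what auto-generation involves, here is a minimal Python sketch that builds a sitemap in the sitemaps.org XML format from a list of pages. The function name `build_sitemap` and the example URLs and dates are our own assumptions; in practice the page list would come from your CMS or database.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod  # date as YYYY-MM-DD
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages; replace with the real list of URLs on your site.
pages = [
    ("http://www.example.com/", "2009-11-16"),
    ("http://www.example.com/blog/", "2009-11-10"),
]
print(build_sitemap(pages))
```

Running something like this on every content change (or on a schedule) keeps the sitemap in step with the site without anyone editing it by hand.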
- Google XML Sitemaps [ http://wordpress.org/extend/plugins/google-sitemap-generator/ ]
- It is a plugin for WordPress sites, which automatically updates the sitemap and notifies all major search engines every time you create a new post/content.
- Addme.com [ http://www.addme.com/ror-sitemap-generator.htm ]
- An online tool that generates sitemaps in the ROR (Resources of a Resource) XML format.
- Google Site Map Generator [ http://www.neuroticweb.com/recursos/sitemap/ ]
- Another simple online tool to generate site maps.
- PHPClasses.org [ http://www.phpclasses.org/browse/package/2612.html ]
- A class library which can be integrated with PHP sites.
Selecting a sitemap generation mechanism depends on several factors, such as the nature of the site and the server-side technology used. Making the correct decision comes down to your SEO experience and your web development team.
Contact us if you would like help generating a sitemap for your site, or have general web marketing, SEO, or web design questions.
RDFa vs microformats for Google Rich Snippets
Nov 16th
We are in the process of implementing Google Rich Snippets on a customer’s web site, specifically to provide ratings for travel vendors and travel destinations, which will appear in Google search results as rich snippets. These will be delivered in CakePHP. We are also considering using it for product ratings on e-commerce sites delivered in Ruby on Rails. We may even consider adding it to WordPress sites as a way of rating the content described on the page.
Of course, the immediate concern was which technology to use: microformats or RDFa? Not that it should be a big concern, but we have to support whichever technology we implement.
Searching for microformats vs RDFa brought up a ton of articles and in-depth debates on the subject.
Understandably so, since microformats appear to be more widely adopted, and RDFa is developed by the W3C.
I think Google’s approach is a good one: they support both, so we can use whichever standard we prefer.
As a development company, we inherit a lot of work started by other development companies, so we will also support both. If we are adding them to a site from scratch, however, we will choose RDFa.
Initially I thought the opposite, because microformats seemed more intuitive and easier to implement. After more research, however, it seemed that RDFa would win out in the long run and be more flexible. No one can know for sure, though, and I guess the best solution for everyone would be for the two to merge into a single standard.
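To show how close the two approaches are in practice, here is a small Python sketch that renders the same review both ways. The helper names are hypothetical, and the class and property names are the ones we understand the hReview microformat and Google’s data-vocabulary.org RDFa vocabulary to use; treat them as assumptions and check them against Google’s current rich snippets documentation before relying on them.

```python
from html import escape


def hreview_markup(item, reviewer, rating):
    """Render a review with hReview microformat class names (assumed)."""
    return (
        '<div class="hreview">'
        f'<span class="item"><span class="fn">{escape(item)}</span></span> '
        'reviewed by '
        f'<span class="reviewer vcard"><span class="fn">{escape(reviewer)}</span></span>, '
        f'rating: <span class="rating">{escape(str(rating))}</span>'
        '</div>'
    )


def rdfa_markup(item, reviewer, rating):
    """Render the same review with RDFa properties (vocabulary assumed)."""
    return (
        '<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Review">'
        f'<span property="v:itemreviewed">{escape(item)}</span> '
        'reviewed by '
        f'<span property="v:reviewer">{escape(reviewer)}</span>, '
        f'rating: <span property="v:rating">{escape(str(rating))}</span>'
        '</div>'
    )


# Made-up review data for illustration.
print(hreview_markup("Sunny Beach Hotel", "Jane Doe", 4.5))
print(rdfa_markup("Sunny Beach Hotel", "Jane Doe", 4.5))
```

Microformats reuse the `class` attribute with agreed names, while RDFa declares a vocabulary and uses `typeof` and `property`. Keeping both behind helpers like these would let us support inherited sites in either format.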
For more detail on the subject, Evan Prodromou has a good write-up: RDFa_vs_microformats
