ITrain - International Association of Information Technology Trainers

Lots of Linux

Google knows Linux will crunch its data


4,000 Linux Servers Used as Backbone of Google Search Engine

by Dave Murphy
ISSN 1535-3613

Dave Murphy, DGL President & ITrain founder

The high-end search engine Google has set up 4,000 PC servers running Red Hat Linux, and it plans to expand the system to a total of 6,000 servers later this year. I think this is the largest Linux installation in the world. Red Hat Linux is practically free, compared with approximately $1,000 for the software to run a Windows NT server and even more for Windows 2000. Hardware costs are also lower because Linux doesn't have significant hardware requirements.

The choice to use Linux rather than Windows NT/2000 will save Google over $6 million this year in software cost alone (6,000 servers at roughly $1,000 or more each). Overall, I estimate the savings will be more than double that, because Linux is cheaper to buy, installs more quickly, and requires less periodic hands-on system maintenance.
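
As a rough check on the $6 million figure, the arithmetic can be sketched as follows. The server counts and the approximately $1,000 per-server Windows NT software cost come from the article; treating the Linux software cost as zero, and the function name itself, are assumptions made for illustration.

```python
def software_savings(servers, windows_cost_per_server=1000, linux_cost_per_server=0):
    """Estimate software savings in dollars from running Linux instead of Windows.

    The default costs are assumptions: about $1,000 per Windows NT server
    (the article's figure) and $0 for Linux ("practically free").
    """
    return servers * (windows_cost_per_server - linux_cost_per_server)


current = software_savings(4000)   # servers installed today
planned = software_savings(6000)   # servers planned for later this year

print(f"4,000 servers: ${current:,}")
print(f"6,000 servers: ${planned:,}")
```

Windows 2000 costs more per server than Windows NT, so under these assumptions the 6,000-server result is a lower bound on the software savings.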

"The hypertext analysis is computationally expensive," said Sergey Brin, founder and president of Google.com. "We need to have an efficient system for doing that. That's why we use a lot of cheap PCs. It's a cheaper platform. The dollar per MIPS is better for PCs."

The Linux systems will be used to rank the importance of submitted webpages by counting how many referential links to that page exist and the importance of the referential pages. The system will also conduct a hypertext analysis to determine where keywords are located on submitted pages.
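
The link-counting idea can be illustrated with a small sketch. This is not Google's actual code: it is a generic iterative link-ranking scheme, and the function name, the damping value of 0.85, the iteration count, and the four-page example are all assumptions chosen for illustration. Each page passes a share of its own score to the pages it links to, so a link from an important page counts for more than a link from an obscure one.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Score pages by inbound links, weighted by the score of the linking page.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative assumptions, not published settings.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with every page equally important.
    scores = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score among the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outbound links spreads its score evenly.
                share = damping * scores[page] / n
                for target in pages:
                    new_scores[target] += share
        scores = new_scores
    return scores


# Hypothetical four-page web: a, b, and d all link to c; c links back to a.
example_links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = rank_pages(example_links)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this example, page c ranks highest because three pages link to it, and page a ranks above b and d because its single inbound link comes from the most important page.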

This work is computationally intensive. The search equation used to index the web has 500 million variables and 2 million terms, and it is computed about once a month, producing about 1 TB of data to index 300 million webpages. One terabyte (TB) is the equivalent of 1,024 gigabytes (2^40 bytes).
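
To put those figures in perspective, the average amount of index data per webpage can be worked out directly. The inputs are the article's approximate numbers, so the result is only an order-of-magnitude estimate; the variable names are illustrative.

```python
TERABYTE = 2 ** 40             # bytes in 1 TB, as defined in the article
pages_indexed = 300_000_000    # webpages indexed, per the article

# Average index data per page, assuming the 1 TB is spread evenly.
bytes_per_page = TERABYTE / pages_indexed

print(f"{bytes_per_page:,.0f} bytes per page "
      f"(about {bytes_per_page / 1024:.1f} KB)")
```

Under these assumptions the index holds a few kilobytes of data per webpage.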

Google has in-house talent to maintain the Linux servers, and it values the ability to look at the source code of the operating system and applications to correct problems as they appear. Linux allows the Google staff to be less reliant on external vendors.

Call for Comments

What do you think? Leave your comments on the message center.

ITINFO is an electronic publication of Damar Group, Ltd., publisher of Training Express computer learning guides. Comments and submissions to info@dgl.com.

Previous issues are on our website at http://dgl.com/itinfo/.

updated May 31, 2000
http://dgl.com/itinfo/2000/it000531.html

Copyright © 2000, Damar Group, Ltd., All Rights Reserved