

Google Code Search

Author:      icklenewt
Submitted:      31-Oct-2006 18:49:44
Imported From:      zZine (original author: icklenewt)


Google Code Search is a great new feature from Google that gives programmers the ability to search through publicly accessible code. This kind of search is not new, however: sites such as Koders and Krugle are already widely used by many developers.

On Google Code Search, users can restrict their search by language, file type/name, license or package as well as perform advanced searches using regular expressions. It is this extra regex functionality that takes Google Code Search one step ahead of the others!
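To give a feel for what a regex search restricted by file name looks like, here is a minimal local sketch. It does not use Google Code Search itself: the `CORPUS` dictionary, the `search` function and the file names are all made up for illustration, and the "index" is just an in-memory mapping of path to source text.

```python
import fnmatch
import re

# Hypothetical in-memory "corpus": file path -> source text.
CORPUS = {
    "src/net.c": "int main(void) {\n    char buf[16];\n    strcpy(buf, input);\n}\n",
    "src/util.py": "def copy(a):\n    return list(a)\n",
    "lib/str.c": "/* safe copy */\nstrncpy(dst, src, n);\n",
}


def search(corpus, pattern, file_glob="*"):
    """Return (path, line_number, line) for every line matching the regex.

    Only files whose path matches file_glob are searched.
    """
    regex = re.compile(pattern)
    hits = []
    for path in sorted(corpus):
        if not fnmatch.fnmatch(path, file_glob):
            continue
        for number, line in enumerate(corpus[path].splitlines(), start=1):
            if regex.search(line):
                hits.append((path, number, line.strip()))
    return hits


if __name__ == "__main__":
    # Find calls to strcpy, but only in C files.
    for hit in search(CORPUS, r"\bstrcpy\s*\(", file_glob="*.c"):
        print(hit)
```

The regular expression is what makes the query precise: a plain keyword search for "copy" would also return `strncpy` and the Python helper, while the pattern above only matches actual `strcpy(` calls.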

How Does it Work?
The code search works like any other search engine. Like the main Google search, it is a 'crawler-based' search engine; the crawlers are sometimes called 'spiders'. This kind of search engine uses a software agent that visits websites, reads the information on each page, and follows links both to other pages within the site and to new sites. All this information is stored and indexed in a central datastore. The crawlers revisit sites regularly to pick up any changes that have occurred.
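The visit, read and follow-links loop described above can be sketched in a few lines. This is only an illustration of the general idea, not how Google's crawler is built: the `LINKS` graph stands in for the web, `crawl` is a hypothetical helper, and no network access takes place.

```python
from collections import deque

# Hypothetical link graph standing in for the web: page -> pages it links to.
LINKS = {
    "a.example/": ["a.example/src", "b.example/"],
    "a.example/src": ["a.example/"],
    "b.example/": ["b.example/code"],
    "b.example/code": [],
}


def crawl(start, fetch_links):
    """Breadth-first crawl: visit each page once, in discovery order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)  # a real crawler would store and index the page here
        for link in fetch_links(page):
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return order


if __name__ == "__main__":
    for page in crawl("a.example/", lambda page: LINKS.get(page, [])):
        print(page)
```

Starting from one page, the crawl reaches a second site purely by following a link, which is how new sites end up in the index without anyone submitting them.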

Google Code Search specifically indexes publicly available source code hosted in archives (.tar.gz, .tar.bz2, .tar, and .zip) and in CVS and Subversion repositories throughout the internet.
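As a rough idea of what reading source code out of an archive involves, the sketch below builds a small .tar.gz in memory and then pulls out the files that look like source code. The file names, the `SOURCE_SUFFIXES` list and both helper functions are assumptions made for the example; how Google actually processes archives is not described here.

```python
import io
import tarfile

# Assumption for this sketch: which file endings count as "source code".
SOURCE_SUFFIXES = (".c", ".h", ".py")


def build_sample_archive():
    """Create a small .tar.gz in memory so the example needs no files on disk."""
    files = [
        ("pkg/main.c", "int main(void) { return 0; }\n"),
        ("pkg/README", "docs\n"),
        ("pkg/tool.py", "print('hi')\n"),
    ]
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, text in files:
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer


def source_files(fileobj):
    """Map each source file inside the archive to its text."""
    found = {}
    # "r:*" lets tarfile detect the compression (.tar, .tar.gz, .tar.bz2).
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.endswith(SOURCE_SUFFIXES):
                found[member.name] = archive.extractfile(member).read().decode("utf-8")
    return found


if __name__ == "__main__":
    for name, text in sorted(source_files(build_sample_archive()).items()):
        print(name, len(text))
```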

Security Threat?
Although the tool is clearly beneficial to developers, security professionals are warning that it also makes it easier for attackers to find program flaws to exploit. Because the search reaches much deeper into the places where code is published, vulnerabilities can be found more quickly and easily.

Not only can well-known vulnerabilities be searched for, but many searches of the comments within code will return useful results for an attacker. Examples include searching for combinations of 'todo', 'security' and 'vulnerable'.
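The same kind of search is just as useful to a developer auditing their own code before release. The sketch below scans C-style comments for the keywords mentioned above. It is deliberately naive: the `COMMENT` pattern, the `flagged_comments` helper and the `SAMPLE` snippet are invented for illustration, and the pattern does not understand string literals.

```python
import re

KEYWORDS = ("todo", "security", "vulnerable")

# Naive pattern for C-style comments: /* ... */ blocks and // line comments.
COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def flagged_comments(source):
    """Return every comment that mentions one of the keywords."""
    hits = []
    for match in COMMENT.finditer(source):
        text = match.group(0)
        lowered = text.lower()
        if any(word in lowered for word in KEYWORDS):
            hits.append(" ".join(text.split()))  # collapse whitespace
    return hits


SAMPLE = """\
/* TODO: validate length, this is vulnerable to overflow */
int copy(char *dst, const char *src) {
    // fast path
    return strcpy(dst, src) != 0;
}
"""

if __name__ == "__main__":
    for comment in flagged_comments(SAMPLE):
        print(comment)
```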

Aiding Open Source?
Security may be threatened by this tool, but it may also be an aid to Open Source. Open Source can be argued to be more secure because more people review the available code and submit bug reports and fixes. As such, Google Code Search may prove to benefit the Open Source community more than it harms it.

Overview
Google Code Search may be used to find vulnerabilities in a program for dishonest ends, but it may also be used for reviewing software security and implementing necessary patches. This could be a great aid for the Open Source community, provided developers are quick to react. Developers are urged to be more vigilant in checking their code for vulnerabilities before releasing it to the public eye. Comments deserve attention as well: don't advertise known security problems! Users of Open Source software are also advised to check regularly for patches and upgrades, as these will most likely contain important security updates.

Security issues aside, Google Code Search really is a great tool for developers! No doubt most developers reading this have had times where an area is barely documented at all, or the documentation available just isn't clear enough. Code search enables you to better search for useful examples in real working code - for some of us this can be an extremely beneficial way to understand new concepts and functions.

For any developers out there who are concerned about their code being searchable via Google and other search engines, do not despair! You can prevent your site, or specifically your code, from being searched by appropriately using a robots.txt file. Find out how from robotstxt.org. If your code is hosted on a site where you cannot change the robots.txt file, you can put a robots.txt in the root directory of your code package (either within an archive, or repository).
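As a small illustration, the robots.txt below asks all crawlers to stay out of a /src/ directory, and Python's standard `urllib.robotparser` module is used to check how a well-behaved crawler would interpret it. The rules, the `ExampleBot` user agent, the `is_allowed` helper and the example.com URLs are made up for this sketch; see robotstxt.org for the full format.

```python
from urllib.robotparser import RobotFileParser

# Example rules: every crawler is asked to stay out of /src/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /src/
"""


def is_allowed(robots_text, user_agent, url):
    """Check whether the given robots.txt rules permit fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://www.example.com/src/main.c",
                "http://www.example.com/index.html"):
        print(url, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

Keep in mind that robots.txt is a request, not an access control: it only keeps out crawlers that choose to honour it.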

This article was originally published by CyberArmy.net in the CyberArmy Library.
