Looking beyond the surface – Exploring Deep Web

I have been reading about the Deep Web for around 5 years, and the concept continues to fascinate me. Facebook, Wikipedia and the like form just 4% of the World Wide Web. The remaining 96%, tens of trillions of pages not reachable by any search engine, forms the Deep Web or the Invisible Web. The content may range from boring statistics to the sale of human organs on the black market. In fact, in October 2013, the FBI shut down Silk Road, a popular online black market where everything from ammunition and drugs to assassins could be bought.

The concept behind the Deep Web is not as dark as it seems. The reason is simple: Google, Bing and other search engines use crawlers to traverse the web. Crawlers follow links from one page to another, so they can collate all the static pages. Pages that are generated on the fly in response to some input, such as a query typed into a search form, are not captured. Around 54% of websites are databases and are thus not captured.
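To make the crawler behaviour concrete, here is a minimal sketch in Python. It is not how Google or Bing actually work. It only illustrates the link-following idea on a made-up, in-memory site. The `TOY_SITE` pages, the `LinkExtractor` class and the `crawl` function are all assumptions invented for this illustration.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical toy site: path -> HTML. No network access is involved.
TOY_SITE = {
    "/": '<a href="/about">About</a> <a href="/search">Search</a>',
    "/about": '<a href="/">Home</a>',
    "/search": '<form action="/results"><input name="q"></form>',
    # Generated only when a user submits the form; no page links to it.
    "/results?q=climate": "<p>Rows pulled from a database</p>",
}


def crawl(site, start="/"):
    """Breadth-first traversal that follows hyperlinks only."""
    seen = {start}
    queue = deque([start])
    while queue:
        path = queue.popleft()
        parser = LinkExtractor()
        parser.feed(site.get(path, ""))
        for link in parser.links:
            if link in site and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


surface = crawl(TOY_SITE)
deep = set(TOY_SITE) - surface
print("Reached by the crawler:", sorted(surface))
print("Never reached:", sorted(deep))
```

The crawler reaches the search form but never submits it, so the results page behind the form stays out of its index. That is the same reason database-backed pages end up in the Deep Web.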

Other pages are available only on intranets or private networks, so crawlers never reach them either.

Then there is a hidden part of the web reached through Tor, which requires specialised software to access. People use it so that their web activity cannot be traced. It runs on a relay system that bounces traffic among different Tor-enabled computers around the world.

Well, that’s about the Deep Web. Let’s look at its importance.

  • A search engine that can crawl the entire Web can be used for Big Data analysis, providing more accurate information on climate, finances etc.
  • The deep web contains 550 billion documents, compared to one billion on the surface web.
  • Deep web content is highly relevant to every information need and market.
  • 95% of the content on the deep web is freely accessible information, not subject to fees or subscriptions.

Companies are doing their best to mine this treasure trove of information and are coming up with new search methods for it.


One thought on “Looking beyond the surface – Exploring Deep Web”

  1. Shodan the search engine of connected objects
    Made in the USA, Shodan is a search engine that can identify all kinds of connected objects (webcams, automation tools, robots, hydro, IT companies …) and can also be used to take control of them. Some will say it is an application that makes you aware of the risks of hacking. Others will say it encourages hackers.
    The year-old site known as Shodan makes it easy to locate internet-facing SCADA, or supervisory control and data acquisition, systems used to control equipment at gasoline refineries, power plants and other industrial facilities. As white-hat hacker and Errata Security CEO Robert Graham explains, the search engine can also be used to identify systems with known vulnerabilities.
    If you are interested, I have posted an article about Shodan that you can read here:
