THE IMPORTANCE OF BUG TESTING 

                           - Editorial by dethy 

                        _________________________________
                
                          1. Software Development Stages
                            i. What defines alpha?
                           ii. What defines beta?
                          iii. What defines stable?

                          2. Why bug test?
                           i. Importance to Client
                          ii. Importance to Programmer
        
                          3. Development Goals
                           i. Software testing vendor's goals
                          ii. Public's goal as bug testers

                          4. Software Testing Strategies
                             i. Functional Prototypes
                            ii. Designing Test Sets
                           iii. Defect Testing
                            iv. Acceptance Testing
                             v. Structural Prototyping
                           vi. Signs to observe

                          5. Bug Discovery ?
                           i. Alerting the Vendor
                          ii. Alerting Clients

                          6. Final Note
                        _________________________________

 1. Software Development Stages

  This whitepaper discusses the importance of bug testing with respect to client
  and vendor environments. Responsibilities fall on both sides of product
  development, and it is necessary to understand the reasons behind practising
  secure coding and ethical loyalty.

  In the real world, systems (hardware or software) often go through two stages
  of release testing:

   * Alpha (in-house)
   * Beta  (out-of-house)

 What defines alpha ?

  The term 'alpha' comes from the first letter of the Greek alphabet. It was
  adopted into computer terminology in the early 1960s to describe a
  product-cycle checkpoint, first at IBM, and went on to become a standard
  throughout the computer industry.

  This first phase of testing a product/system is a part of the software
  development process. The alpha stage includes unit testing, component
  testing, and system testing, but the term originally referred to the
  feasibility and manufacturability evaluation done before any commitment to
  design and development.

 What defines beta ?

  The term 'beta' comes from the second letter of the Greek alphabet, again
  with IBM using this terminology to categorise product development.

  In software development, a beta test is the second phase of software testing,
  in which a sampling of the intended audience tries the product out. Beta
  testing can be considered "pre-release testing" of a product/system that may
  still contain code flaws and logic errors throughout the program. In recent
  years, beta versions of software have been distributed to a wide audience on
  the World Wide Web, partly to give the program a "real-world" test and partly
  to improve functionality by letting clients voice their approval,
  dissatisfaction, or comments about the product/system.

  A product may often stay in the beta stage for several years even though it
  could be considered stable; it is the vendor's choice to keep the product in
  beta until the remaining rare bugs have been ironed out.

  An advisory may be released after testing a beta release of a product. As a
  general rule of thumb, most advisories result from testing more stable
  applications, but it is known that in some cases an advisory for a beta
  release is necessary. Sometimes it is the only way to get vendors to patch
  the security risks that a bug discovery has exposed; additionally, beta
  releases may stay in that phase for months or years, as discussed above.
  Releasing advisories for alpha products is somewhat nonsensical. Alpha
  releases are known to be highly unstable and should be run only with caution;
  most bugs are found at this stage, so publicly releasing bug alerts at this
  point is of little value.

 What defines stable ?

  Stable is the final outcome of the developed product. Once the beta product
  has been judged fit for the task, with most of the known bugs fixed or
  patched, and it successfully fulfils all functional requirements, it is then
  termed 'stable'.

  Product testing has worked out most (if not all) flaws, and as much code
  optimisation as possible has been applied to the end software application.
  This is the accepted version of the product, capable of handling data
  correctly as needed. This phase aims to satisfy the needs of
  customers/clients as consumers and users of the product.

 2. Why bug test ?

  It is often debated why people must test programs. If every user were
  ethical we would not have to, but the world is not entirely full of ethical
  people who ensure that only correct data is fed into a system; that is why
  safe practices need to be developed. The only way for this to take place is
  through bug testing. Bug testing affects both the client and the programmer,
  and each has different needs and wants in terms of its importance.

   * Importance to Client
   * Importance to Programmer

  From the client's perspective, a stable program that is guaranteed to perform
  its desired task is a reflection not only of the program but also of the
  company itself. Poor products cast the company in a poor light; that is why
  a product needs to be made solid through thorough bug testing before
  manufacturing takes place.

  Management is not always aware of product flaws, and company directors tend
  to assume that every function works smoothly without any defects. However,
  experience shows that no product/system can be deemed completely secure
  beyond dispute. There will always be bugs in a program; whether they are
  found or not is another question. In open source software it is much easier
  to spot bugs and code flaws, and active security checks by the public help
  create a much more stable and operable program. This, in my view, is one of
  the reasons why Microsoft (c) products fail consistently when it comes to
  testing: they are not open source, and it is therefore much harder to create
  a secure and flexible program without the aid of the programming community
  to help optimise the code.

  For the client, the purchaser of the software, a reliable program is without
  doubt a key aspect of performing daily tasks successfully. If the program
  were vulnerable to overflows, lacked input checks, or even lacked encryption,
  it would quickly become known for its instability, and product sales would
  drop dramatically. Customers would purchase alternative products that perform
  the same task and have been carefully checked by multiple tests, as will be
  seen in the testing section of this document.

  There is a high level of ethics involved when a programmer is contracted to
  develop a program. The programmer is at the top of the chain of
  responsibility for testing and coding a proficient software application.
  He/she is responsible for ensuring that all functions of the program work,
  and work efficiently; code optimisation should be at its peak, with security
  functions in check. Better programs are known to have been thoroughly tested,
  with all sorts of data sets properly dealt with from within the program.
  Operating systems like Linux are tested every day by programmers and hackers
  alike. Yes, security problems do exist in this environment, but most have now
  been patched or fixed, pushing it towards being one of the most stable
  systems currently around.
  
  Sloppy programmers will not care about ethics, and will simply code the
  program to function minimally with all its client-side requirements
  implemented. Some programmers deem financial security more important than
  ethical security - be careful whom you contract to fulfil your programming
  requirements.

 3. Development Goals

  Programmers should adopt goals that ensure software quality, but the customer
  also has a responsibility to inform the programmer once a bug has been found.

 Software testing vendor's goals

  The primary goal of a programmer is to actually complete a working program
  that serves the client's requirements. Once this stage has been reached, the
  more advanced and less well-known practices should then be put into effect,
  with added functionality such as:

   * security features
   * help support
   * contact addresses

  Added security features are a must, and help make code quality evident
  within a program. The use of secure functions and methodologies should make
  itself known at this stage. This is where the gap between sloppy and aware
  programmers becomes apparent. All programs should aim for a level of code
  quality by using the secure function calls available in their programming
  language, which helps create a more reliable and flexible program. Of course,
  one of the only certain ways to determine a program's reliability is through
  testing. Testing depends on rapid feedback and on the evolving nature of the
  program under test; this is where clients/customers come into the picture.

 Public's goal as bug testers

  Although programmers bear the most responsibility for code reliability,
  clients and customers alike need to be prepared to communicate with software
  engineers if a bug or flaw is observed in a program. If the observed output
  differs from what is expected, it's time to get in contact by means of a bug
  discussion list, email, phone - whatever, but be sure to advise the correct
  people. Especially if the bug could lead to increased privileges, it is most
  important to inform the product vendor before the public knows about it. This
  gives the vendor time to write patches/advisories for their clients before
  the flaw can be used to do damage to their products.

  Testing software is always a step in the right direction. Effective bug
  testing by customers/clients will force the programmer to improve code
  quality and security in future products. That is why we must tolerate and
  thank the software task forces out there that make software vulnerabilities
  public; one such bug advocacy forum is BUGTRAQ, http://www.securityfocus.com.

  When reporting a bug, always be sure you can reproduce it, always include a
  detailed description of *exactly* how the bug was found, and state the type
  of system that you tested the software application on. The more information
  the better, but be sure not to obscure or obfuscate the description - get as
  many of the basic facts down as possible. In particular, segmentation faults
  generally cause core dumps (a memory image of the terminated process, written
  when any of a variety of errors occurs), which hold vast amounts of
  information that help the programmer locate where the bug took place.
  Remember, full disclosure is bliss.

 4. Software Testing Strategies

  Developing a program or system effectively requires it to be thoroughly
  thought out before any raw code is actually written down. One of the most
  important methods of establishing functional requirements is a prototype.
  A prototype may consist of a storyboard, which is a sequence of screens
  showing the end-user a typical scenario of using the program/system.

 Functional prototypes

  This is one of the most useful methods for making sure the programmer
  understands just what a program is intended to do. A functional prototype is
  a very limited version of the final program; it gives some idea of the
  appearance of the final product, but with a lot of functions missing. Showing
  a simple storyboard to a client or bug tester is necessary, as they will be
  able to comment on whether the 'observed output' of running the program
  matches what is expected for a given input. This will also force the
  programmer to think through many of the details of what the program is meant
  to do.

 Designing Test Sets

  Creating workable and effective sets of tests is intellectually challenging.
  Testing can almost never be exhaustive, and some programming flaws may go
  undetected even after very stringent testing. In the real 'commercial' world,
  a significant source of program defects is people running tests and not
  checking the results carefully. That is, programmers actually run the tests
  but do not take enough care in reviewing the results to see that the tests
  revealed unexpected flaws in the program.

  Tests must be convincing and must demonstrate successful performance of the
  program. In a commercial setting there are many methodologies for producing
  a designed set of tests. One of the first things that should be tested is the
  main function of the program. This means deciding on a set of tests that
  enable you (the programmer) to see whether the code achieves its desired
  outcome.

  All conditions of the program need to be checked without exception, including:

   * case, loop, and if-then-else structures
   * boundary conditions [Ex. the pseudocode: IF $i<100 THEN .. - make sure that
                         the values 99, 100, and 101 for $i are properly dealt with]
   * all parts of the code [Ex. by designing a rigorous set of tests]
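
  The boundary check in the list above can be sketched in Python. This is only
  an illustration: the names LIMIT, below_limit(), boundary_values(), and
  run_boundary_tests() are made up here, and the routine simply mirrors the
  pseudocode "IF $i<100 THEN ..".

```python
# Hypothetical routine under test; it mirrors the pseudocode "IF $i<100 THEN ..".
LIMIT = 100


def below_limit(i: int) -> bool:
    """Return True when i would take the THEN branch of the check."""
    return i < LIMIT


def boundary_values(limit: int) -> list:
    """Values just below, exactly on, and just above the boundary."""
    return [limit - 1, limit, limit + 1]


def run_boundary_tests() -> dict:
    """Exercise the check on each side of the boundary and record the branch."""
    return {i: below_limit(i) for i in boundary_values(LIMIT)}


if __name__ == "__main__":
    for value, took_then in run_boundary_tests().items():
        print(value, "THEN" if took_then else "ELSE")
```

  Testing all three values matters because an off-by-one mistake (for example
  writing <= instead of <) changes the result only for the value 100 itself.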

  Naturally, several tests in a set will exercise the same parts of the
  program. Grouping inputs that the program should treat in the same way, and
  testing a representative of each group, is known as 'equivalence
  partitioning'; although some overlap may seem redundant, it is standard in
  economical testing. Perhaps part of the code works in one scenario but not in
  another - this needs to be carefully checked.

  The first thing a programmer needs to understand is that testing will
  demonstrate the presence of bugs, but it will not demonstrate the absence of
  bugs. Semantic errors fall into this category, that is, errors in the logic
  of the program that the compiler or interpreter is unable to help you with.

  Testing falls into two broad categories:
 
   * defect testing
   * acceptance testing

 Defect Testing

  This type of test tries to detect all the defects the program may have. All
  parts of the program should be tested, and if the programmer feels that one
  part of the code may not properly deal with unexpected input, more rigorous
  tests should be performed on that area of the code. One key point to remember
  is that "nobody knows a program better than the programmer himself" - the
  programmer will know which area of the program is most likely defective, so
  a designed set of tests should be run against it before a beta release is
  produced. Stemming from defect testing is 'regression testing'.

  Regression testing is the process of testing changes to a program to make
  sure that the older functionality still works with the newly implemented
  changes. Regression testing is a normal part of the program development
  process and, in the commercial world, is performed by code testing
  specialists. Test department coders develop test scenarios and exercises that
  will test new units of code after they have been written. These test cases
  form what becomes the test bucket. Before a new version of a software product
  is released, the old test cases are run against the new version to make sure
  that all the old capabilities still work. They might not, because changing or
  adding code can easily introduce errors into code that was not intended to be
  changed, and these errors will then distort test results. Repeated regression
  testing is a must!
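
  As a rough sketch of a test bucket being re-run against a new version, the
  Python below keeps old test cases as (name, input, expected output) triples.
  Everything in it is invented for illustration: greet_v1() and greet_v2() are
  hypothetical old and new releases of one routine, and the reply strings are
  assumptions, not the behaviour of any real mail server.

```python
from typing import Callable, List, Tuple

# A test case is a (name, input, expected output) triple kept from earlier releases.
TestCase = Tuple[str, str, str]

TEST_BUCKET: List[TestCase] = [
    ("plain hostname", "mail.example.org", "250 mail.example.org"),
    ("upper-case hostname", "MAIL.EXAMPLE.ORG", "250 mail.example.org"),
    ("empty hostname", "", "501 hostname required"),
]


def greet_v1(hostname: str) -> str:
    """Old release: the behaviour the bucket was written against."""
    if not hostname:
        return "501 hostname required"
    return "250 " + hostname.lower()


def greet_v2(hostname: str) -> str:
    """New release: adds whitespace trimming, but the lower-casing was lost."""
    hostname = hostname.strip()
    if not hostname:
        return "501 hostname required"
    return "250 " + hostname


def run_bucket(func: Callable[[str], str], bucket: List[TestCase]) -> List[str]:
    """Return the names of the test cases whose output no longer matches."""
    return [name for name, arg, expected in bucket if func(arg) != expected]


if __name__ == "__main__":
    print("v1 regressions:", run_bucket(greet_v1, TEST_BUCKET))
    print("v2 regressions:", run_bucket(greet_v2, TEST_BUCKET))
```

  The new feature in greet_v2() works, yet the old "upper-case hostname" case
  fails; only re-running the whole bucket reveals that an untouched capability
  was broken.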

 Acceptance Testing

  In conjunction with defect testing comes acceptance testing. This means
  running an agreed set of tests with an agreed output. These should
  demonstrate that the code does an agreed task well enough to convince both
  the programmer and the client. In the commercial world, the acceptance tests
  are part of the contract, defining what the customer insists on before any
  payment for the software is made.

 Structural Prototyping
 
  Prototyping of this nature is relatively simple. A structural prototype is
  a stripped-down version of a program that shows the structure, in skeleton
  form, of the complete version. All major aspects of the code are written, but
  routines and subprograms are written only as stubs, that is,
  comments/statements within the program that show the programmer that the
  actual routine has been called or executed.
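
  A structural prototype of this kind might look like the Python skeleton
  below. It is a sketch only: the routine names (read_config, open_listener,
  serve) and the file name server.conf are hypothetical, and each stub does
  nothing except record that it was reached.

```python
calls = []  # records which stubs were reached, in call order


def read_config(path):
    """Stub: the real routine would parse the file at `path`."""
    calls.append("read_config")
    return {}


def open_listener(config):
    """Stub: the real routine would open a network socket."""
    calls.append("open_listener")
    return None


def serve(listener):
    """Stub: the real routine would accept and handle connections."""
    calls.append("serve")


def main():
    """The skeleton: the overall structure is final, the routines are not."""
    calls.clear()
    config = read_config("server.conf")
    listener = open_listener(config)
    serve(listener)
    return list(calls)


if __name__ == "__main__":
    print(" -> ".join(main()))
```

  Running the skeleton confirms that control flows through the program in the
  intended order before any of the real routines have been written.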

  Effective code that is easily interpreted by the programmer and other
  developers, and that allows further extension of the program with ease, has
  three classic characteristics:

   * understandability
   * adaptability
   * cohesion

  Understandability means that programs that are easier to understand are
  considered better designed than ones that do the same task but are harder to
  understand. A key to developing stable code is a good functional prototype
  that allows the general idea of the program to be observed before coding
  takes place. It is also worth noting that better code is clear and neatly
  presented - that is, spaced out where necessary, with comments throughout the
  program to let the reader understand what is going on internally.

  Adaptability means how easy it is to modify areas of the code to perform
  alternate tasks. This is directly linked to understandability: the more
  understandable the code, the easier it is to adapt.

  Cohesion means that a routine or subprogram does one clear task, apparent to
  both the reader and the programmer. A well-defined task should give a clear
  indication of what the program is intended to do; this includes well-chosen
  names for variables, constants, headers etc. As small as this concept may
  seem, it allows any coder to pick up the source and quickly scan through it
  to understand what the program is about.
 
 Signs to observe

  Whether you are checking the source for bugs or testing the binary/executable
  file for flaws, all of the above tests need to be considered and exercised.
  Bugs most commonly present themselves at boundary conditions. When designing
  a set of tests, it cannot be stressed enough that boundaries need to be
  checked on either side of their 'walls'. Another flaw that should be checked
  for before releasing a beta of a product is the current malpractice in
  handling format control strings, such as %s, which leads to format string
  bugs.

  The programmer must employ capable input routines/parameters to deal
  correctly with user-supplied input, ensuring that all possible scenarios have
  been considered before adopting the most suitable code to perform the given
  command. This includes the choice of functions themselves: avoid getenv(),
  strcpy(), and sprintf() wherever possible, in exchange for more secure
  alternatives like strncpy() or snprintf(); the 'n' refers to the maximum
  number of bytes allowed to be written to the buffer. Avoid the common
  mistakes sloppy programmers make when reading user-supplied environment
  variables from the terminal or environment. Establish your own method of
  setting or checking the environment, and make it unsusceptible to malformed
  data that could possibly lead to unexpected outcomes, such as spawning
  a shell - a definite security risk, and one that is often observed in many
  UNIX environments. (Early ZGV [console graphics viewer] programs were
  repeatedly victim to getenv("HOME") problems of this nature.)
  
  Another way of building on acceptance testing to expose flaws is to use the
  proper kind of data but send an excessive amount of it to a particular input:
  sending 1024 bytes to a 512-byte buffer will cause an overflow if the length
  is not checked. An acceptance test that sends 256 bytes would be deemed
  acceptable and would pass; the 1024-byte input would not.
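
  The 256/512/1024-byte scenario can be sketched in Python as follows. Python
  itself does not overflow buffers the way C does, so this is an illustration
  of the length check and of the test, not of the overflow; BUFFER_SIZE,
  InputTooLong, store_input(), and probe() are all hypothetical names.

```python
BUFFER_SIZE = 512  # capacity of the hypothetical input buffer, in bytes


class InputTooLong(ValueError):
    """Raised instead of letting oversized input spill past the buffer."""


def store_input(data: bytes, size: int = BUFFER_SIZE) -> bytearray:
    """Copy data into a fixed-size buffer, refusing anything that does not fit."""
    if len(data) > size:
        raise InputTooLong(f"{len(data)} bytes offered, buffer holds {size}")
    buffer = bytearray(size)
    buffer[:len(data)] = data
    return buffer


def probe(lengths):
    """Feed inputs of the given lengths and report which were accepted."""
    results = {}
    for n in lengths:
        try:
            store_input(b"A" * n)
            results[n] = "accepted"
        except InputTooLong:
            results[n] = "rejected"
    return results


if __name__ == "__main__":
    for length, outcome in probe([256, 512, 513, 1024]).items():
        print(length, outcome)
```

  The acceptance-sized input (256 bytes) says nothing about the check; only
  the inputs at and beyond the boundary (512, 513, 1024) show whether the
  routine enforces its limit.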

  Sometimes a program appears to have lost efficiency in terms of speed or
  data processing; this may be directly linked to a heap or stack overflow
  caused by corrupted data being entered. It is at this stage that the bug
  tester needs to conduct vital tests for the presence of bugs.

  Let's take a real-life example of a program in which I exposed a flaw not
  long ago.
  -   WinSMTPD mailer/pop3d daemon. Version 1.06f and 2.X.

  In acceptance testing, everything in this program worked well. All the
  desired tasks of the program were fulfilled and the smtpd/pop3d servers
  performed their tasks efficiently. Now, here is where defect testing comes
  into play.
  
  First, to start an SMTP transaction, the client needs to send a 'HELO %s'
  command, where the format specifier "%s" stands for your hostname. WinSMTPD
  only allows a fixed buffer of 170 bytes before the expected output becomes
  unexpected. On sending 150 bytes after the HELO keyword, the program paused
  noticeably before proceeding to function as normal. This suggests one of two
  possibilities:

  1. The program has been coded poorly in terms of speed,  OR
  2. The program does not check boundaries when excessive data is entered.

  As it turns out, WinSMTPD was vulnerable to a stack overflow when sent 170+
  bytes in the HELO field. The unexpected output from the program was:

        WINSMTP caused a general protection fault
        in module WINSMTP.EXE at 0003:00002359.
        Registers:
        EAX=461e0001 CS=42e7 EIP=00002359 EFLGS=00000246
        EBX=00807fe0 SS=4207 ESP=00007e36 EBP=00004141
        ECX=00010283 DS=4207 ESI=0000544c FS=05c7
        EDX=58600000 ES=461e EDI=00001547 GS=0000
        Bytes at CS:EIP:
        cb 49 73 49 63 6f 6e 69 63 00 00 58 4c 6f 63 00 
        Stack dump:
        41414141 41414141 41414141 41414141 41414141 41414141 
        41414141 41414141 41414141 41414141 41414141 41414141 
        41414141 41414141 41414141 41414141

  Obviously this isn't what the programmer had in mind for an SMTP transaction.
  The 41414141 that appears on the stack is the hexadecimal ASCII value of "A"
  (0x41) repeated, which is what I had filled the buffer with. From this
  general protection fault, we as bug testers and programmers are able to
  ascertain that in this 16-bit program (judged by the leading 0's within the
  registers) the input has overwritten the EBP register (+4 bytes for EIP),
  and as ethical programmers/bug testers that's all we need to know to fix or
  patch this bug. If there were, say, an unethical hacker out there, loading
  the stack with malicious data could effectively allow arbitrary code to be
  executed from the stack, and anything is possible from there. This is why it
  is important to test for bugs, and especially to check the boundaries and
  the data that the client/user is allowed to input.

  Although I approve of people writing 'proof of concept' exploits to expose
  the existence of a bug in a program, as I am a firm believer in full
  disclosure and an advocate of open source, it is neither ethical nor
  advisable to run these scripts without the direct consent of the people whose
  systems you are exploiting. (POC exploits are necessary in whitehat security
  firms to prove and demonstrate a code flaw.)
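
  Reading the fault output is easier once the hexadecimal values are turned
  back into characters. The short Python sketch below does that for one word
  of the stack dump and for the EBP value shown above; decode_hex_ascii() is
  a helper name invented for this illustration.

```python
STACK_WORD = "41414141"  # one word from the stack dump above
EBP_VALUE = "00004141"   # EBP as shown in the register dump above


def decode_hex_ascii(hex_string: str) -> str:
    """Interpret a hex string as ASCII bytes, dropping leading NUL bytes."""
    return bytes.fromhex(hex_string).lstrip(b"\x00").decode("ascii")


if __name__ == "__main__":
    print("stack word:", decode_hex_ascii(STACK_WORD))
    print("EBP:       ", decode_hex_ascii(EBP_VALUE))
```

  Both values decode to runs of the letter A, which ties the register
  contents directly back to the test data that was sent.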

  Data sets and tests fed to the program/system ultimately result in system
  calls executed by active processes. These processes belong to different
  kinds of programs (Ex. programs that run as daemons and those that do not),
  programs that vary widely in size and complexity, and programs with
  different purposes. Spawns or fork()s by applications are therefore put to
  the test when the maximum process limit is exhausted by various
  resource-depleting exploits; this too needs to be prepared for when writing
  a heavily used program. Normal input data can be "synthetic" or "live".
  Synthetic traces are collected in production environments by running
  a prepared script, often called a driver program; the program options are
  chosen solely for the purpose of exercising the program (acceptance
  testing), and not to meet any real user's requests. Live traces are collected
  during normal usage of a production computer system (manual code testing;
  boundary testing). Both methods are often put to use when testing software
  applications en masse.

 5. Bug Discovery ?

  So, you think you've found a bug? Then read on; here's what to do next.

 Alerting the vendor

  If the client or user has stumbled on a logical error or security
  vulnerability within the tested (beta/stable) product, it is then necessary
  to report the bug immediately to the vendor. More on this was discussed in
  the 'Development Goals' section, but a practical advisory layout was not
  shown there. The bug report should include most, if not all, of the
  following information, generally in brief form:

   * bug synopsis  (brief paragraph explaining the vulnerability)
   * description   (the sequential steps taken to reproduce the bug)
   * attachments   (any relevant materials, such as: core dumps, message logs)
   * environment   (system specifications and conditions used to test the bug)
   * contact info  (how the vendor can contact you for further comments/queries)
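
  One possible way to keep a report complete is to hold the fields listed
  above in a small structure and check them before sending. The Python below
  is a sketch under that assumption; the BugReport class, its method names,
  and the sample values are hypothetical and not part of any vendor's format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BugReport:
    synopsis: str            # brief paragraph explaining the vulnerability
    description: List[str]   # sequential steps taken to reproduce the bug
    environment: str         # system specifications used to test the bug
    contact: str             # how the vendor can reach the reporter
    attachments: List[str] = field(default_factory=list)  # core dumps, logs

    def missing_fields(self) -> List[str]:
        """Names of required fields that were left empty."""
        required = {
            "synopsis": self.synopsis,
            "description": self.description,
            "environment": self.environment,
            "contact": self.contact,
        }
        return [name for name, value in required.items() if not value]

    def render(self) -> str:
        """Lay the report out as plain text, one numbered line per step."""
        steps = "\n".join(
            f"  {n}. {step}" for n, step in enumerate(self.description, 1)
        )
        files = ", ".join(self.attachments) or "none"
        return (
            f"Synopsis:    {self.synopsis}\n"
            f"Description:\n{steps}\n"
            f"Attachments: {files}\n"
            f"Environment: {self.environment}\n"
            f"Contact:     {self.contact}\n"
        )


if __name__ == "__main__":
    report = BugReport(
        synopsis="Over-long HELO argument crashes the daemon",
        description=["Connect to the SMTP port", "Send HELO with 200 characters"],
        environment="hypothetical test host",
        contact="tester@example.org",
    )
    print(report.missing_fields() or report.render())
```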

 Alerting clients  

  If the vendor has accepted the reported bug as a risk or vulnerability that
  could lead to such things as network/software penetration, increased
  privileges, or excessive system resource usage, the vendor should then issue
  its own advisory publicly, through mailing lists, the vendor's URL, and/or
  e-mail. It is now the responsibility of the programmer/manufacturer to give
  the client sure-fire advice on patching their software/system so that the
  vulnerability is eliminated. The advisory issued after such an event should
  include the following information:
 
   * Date     (date of advisory release)
   * Affected systems (listing of the environments/settings in which the bug may occur)
   * Description (similar to the client's description, but with more technical inside info)
   * Patch    (URL of patch or description of how to correct the bug)
   * Contact  (how clients can contact the vendor for more info: phone, e-mail, URL)
  
  Having the above line of communication creates a much friendlier atmosphere
  between users and vendors, which in effect helps move software development
  towards a more stable and reliable community - one that excels in safe
  security practices.

 6. Final Note

  I made a generic resource kit named reskit.tgz earlier this year. Basically
  it is just 7 skeletal template scripts coded in Perl for various purposes of
  testing network services in a Linux/Unix environment, such as malformed HTTP
  'GET' requests, multiple thread connections, random data streaming, an ICMP
  error generator etc. It is mainly used as a research and development kit to
  help spot bugs more easily, particularly in server/router
  applications/software; feel free to expand on the scripts.
  These scripts can be downloaded in tarball form from:
  http://dethy.synnergy.net/reskit.tar

 Comments
  Main editorial by dethy [ dethy@synnergy.net | www.synnergy.net ]