
coding.txt

Posted Feb 25, 2000
Authored by Mixter | Site mixter.void.ru

Coding in C - a summary of some popular mistakes. Most of them are not detected at compile time with all warnings enabled, which makes them very nasty and hard to detect.

SHA-256 | 737d50616c03d55f8e032bb3348892b062e5ced53d2c378786dbda33ef725c28


Coding in C - a summary of some popular mistakes
==================================================

Security papers - mixtersecurity.tripod.com/papers.html


Introduction

First of all, in this paper, I'm not going to talk at length about
secure and fail-safe coding practice, nor about security code auditing.
These are interesting topics, for which many good papers and FAQs are
available, but my goal here is just to introduce you to some common but
nasty errors, not all of which are related to security. They are errors
that I made in the past, or that I noticed could easily be made. Most
of them are not detected at compile time even with all warnings enabled,
which makes them very nasty and hard to detect. If you have experience in
writing security relevant (e.g. suid) programs and check them properly, you
will probably notice them as well sooner or later. But it is the "little
things" in debugging and re-coding that cost time, so maybe this summary
will help you detect these errors faster, or prevent them, and so save your
time for improving the stability, performance and security of your code.
I'm going to start with some really simple things, and will then try
to cover some errors which can be trickier to spot.

Many of the widespread program bugs that cause unexpected behavior are
caused by the arithmetic notation of C, which can seem ambiguous to people
who aren't used to the language. Once you have gained some thorough
experience with them, the comparison operations and numbering schemes in C
actually turn out to be very practical and useful. For example, it is
important to realize that indexing in C starts with 0, not 1. array[0] is
the first item, and a pointer to the array points directly at the beginning
of that first item, not before it, since in C pointers are "real" pointers,
i.e. the actual addresses the machine uses internally. If you allocate
space with "int array[10];", you are allocating exactly 10 items, of which
the first one is array[0] and the last one is array[9]. There is no
array[10]; that location lies just past the end of the array and belongs
to whatever the compiler placed there. It can easily be overwritten by
using bad loops to parse data, e.g.: for(i=0;i<=10;i++) array[i] = ...
Note that this writes to array[10], which is undefined behavior and may
silently corrupt a neighboring variable. Character strings have a related
problem: a string needs one extra byte for its terminating binary zero, so
"char buf[10];" holds at most 9 characters plus the zero in buf[9]. If the
terminating zero is overwritten or never stored, most string functions can
no longer identify the bounds of the string, and will read memory after the
end of the array until the next binary zero in memory is found. This can
result in undefined output, and can be hard to track down in the source.
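
As a minimal sketch of the loop bound discussed above, the two helpers below
stay inside their buffers. The names fill_ints, copy_string and ARRAY_LEN are
made up for this illustration and do not come from any library.

```c
#include <stddef.h>
#include <string.h>

#define ARRAY_LEN 10   /* hypothetical size, as in "int array[10];" */

/* Store `value` in each of the `len` items of `array`.
 * The loop condition is i < len, so the last index written is len - 1. */
void fill_ints(int *array, size_t len, int value)
{
    size_t i;

    for (i = 0; i < len; i++)
        array[i] = value;
}

/* Copy `src` into `dst`, which has room for `dst_size` bytes in total.
 * At most dst_size - 1 characters are copied, so the terminating zero
 * always fits inside the buffer. Returns the number of characters copied. */
size_t copy_string(char *dst, size_t dst_size, const char *src)
{
    size_t n;

    if (dst_size == 0)
        return 0;

    n = strlen(src);
    if (n > dst_size - 1)
        n = dst_size - 1;

    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

A caller would write "int array[ARRAY_LEN]; fill_ints(array, ARRAY_LEN, 0);"
and "copy_string(buf, sizeof(buf), input);", the latter only where buf is a
real array in scope and not a pointer.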

Format errors are another source of trouble. Most functions that take
variable arguments (see stdarg(3)) also use a format string, specified in
the code, to decide how to interpret those arguments. Common mistakes are
made when printing signed and unsigned variables with the wrong conversion.
Unsigned variables cannot represent values smaller than zero; in exchange,
their largest value is about twice as large as that of their signed
equivalent. Subtracting one from zero makes an unsigned variable represent
the largest value possible (Ex.: 0x00000000 - 0x1 = 0xffffffff for a 32-bit
unsigned int). This makes bad conversions a dangerous thing. For a signed
int, use %d. For a signed long, use %ld. For an unsigned int use %u, for an
unsigned long %lu, for the hex representation of an unsigned value %x, and
so on. This is explained in detail on the printf(3) manpage.
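
The following sketch pairs each argument type with its matching conversion
and shows the unsigned wrap-around. The function names zero_minus_one and
describe_values are invented for this example; snprintf() is assumed to be
available (it is part of C99).

```c
#include <limits.h>
#include <stdio.h>

/* Subtracting one from an unsigned zero wraps around to the largest
 * value the type can hold (UINT_MAX). */
unsigned int zero_minus_one(void)
{
    unsigned int u = 0;

    u = u - 1;
    return u;
}

/* Format one value of each type with the conversion that matches it:
 * %d for int, %u and %x for unsigned int, %ld for long, %lu for
 * unsigned long. Returns what snprintf() returns. */
int describe_values(char *out, size_t size,
                    int s, unsigned int u, long l, unsigned long ul)
{
    return snprintf(out, size, "%d %u %x %ld %lu", s, u, u, l, ul);
}
```

Passing the result of zero_minus_one() to a %d conversion instead of %u is
exactly the kind of mismatch described above, and what gets printed then
depends on the platform.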

Another important thing you can get wrong is sizeof. Sizeof is an operator,
not a function, and it is evaluated by the compiler in ways that you might
not expect. Mind that sizeof yields the full size of an object only when
the compiler knows that size at compile time. Using sizeof on a character
buffer like "char buf[1234]; ... sizeof(buf);" returns the value 1234.
However, if a buffer or array is dynamically allocated, or if it is passed
as an argument into the function in which sizeof is used, all that function
has is a pointer, and sizeof will NOT and cannot report the size of the
memory that the pointer references. Instead, it returns the size that the
pointer itself occupies in memory. For example, try compiling and running
this program, which prints the size of a pointer (typically 4 or 8), not 100:
#include <stdio.h>
void p(char *buf) { printf("%zu\n", sizeof(buf)); }
int main(void) { char buf[100]; p(buf); return 0; }
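
One common way around this, sketched here under the assumption that the
caller still has the real array in scope, is to take sizeof at the call site
and pass the length along explicitly. ARRAY_COUNT, size_seen_by_callee and
clear_buffer are names made up for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Number of items in a real array. This only works where the array
 * itself is in scope, never on a pointer parameter. */
#define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Inside the callee, `buf` is just a pointer, so this returns
 * sizeof(char *) no matter how large the caller's buffer is. */
size_t size_seen_by_callee(char *buf)
{
    return sizeof(buf);
}

/* The callee cannot find out the buffer size on its own, so the
 * caller passes it in as a separate argument. */
void clear_buffer(char *buf, size_t len)
{
    memset(buf, 0, len);
}
```

A call then looks like "clear_buffer(buf, sizeof(buf));" in the function
that declared "char buf[100];".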

Generally, expressions whose meaning is ambiguous to the reader, but which
are not recognized as such, can be one of the biggest problems when tracking
down persistent bugs or unwanted behavior. A good trivial example is the
negation (the '!' operator). This is a logical, not an arithmetic operation,
and it only differentiates between zero and non-zero. !1 is exactly the same
as !100 or !-100, namely zero, and only !0 is one. This means that in some
cases you cannot rely on this expression, e.g. when checking if 'i' is
smaller than or equal to zero, it would be simply wrong to use
"if (!i) ...", because that test is true only when i is exactly zero.
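
To make the difference concrete, here is a small sketch with three helper
functions whose names are chosen only for this example.

```c
/* !i is 1 only when i is exactly zero, and 0 for every other value,
 * including negative ones. */
int is_zero(int i)
{
    return !i;
}

/* Checking for "zero or negative" needs a real comparison. */
int is_zero_or_negative(int i)
{
    return i <= 0;
}

/* Applying '!' twice turns any non-zero value into 1 and keeps 0 as 0. */
int as_boolean(int i)
{
    return !!i;
}
```

For i = -100, is_zero() returns 0 while is_zero_or_negative() returns 1,
which is the case that "if (!i)" gets wrong.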

Besides these things, it is very important to closely study the behavior
of any library or system functions that one uses frequently. Most of the time,
the documentation for functions is accurate, but the detailed behavior of
a function in all situations, as well as its conformance to established
standards, can be important. For example, the select() call waits for a
change of status on a specified set of file descriptors, such as sockets.
However, this can often be a change that does not necessarily indicate an
established connection. I've seen programs that wait for a socket to become
ready for reading and then assume a connection is established. However, the
socket could have returned an error or been disconnected again immediately.
Doing a getpeername() operation is recommended in this case. What I'm trying
to say is that functions, especially system calls, provide a documented
behavior, which however is not always reliable, and that the documented
exceptions can and do occur; those exceptions should always be handled in
advance in a stable program.
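
A rough sketch of that check on a POSIX system follows. wait_readable and
has_peer are hypothetical helper names; the sketch only covers a single
descriptor and does not handle select() being interrupted by a signal.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Wait up to `seconds` for `fd` to become readable.
 * Returns 1 if it is readable, 0 on timeout, -1 if select() failed. */
int wait_readable(int fd, int seconds)
{
    fd_set readfds;
    struct timeval tv;
    int rc;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = seconds;
    tv.tv_usec = 0;

    rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}

/* "Readable" does not prove that a connection exists, so ask the
 * kernel for the peer address. Returns 1 if the socket has a peer,
 * 0 if getpeername() failed (e.g. with ENOTCONN). */
int has_peer(int fd)
{
    struct sockaddr_storage peer;
    socklen_t len = sizeof(peer);

    return getpeername(fd, (struct sockaddr *)&peer, &len) == 0;
}
```

A program would call has_peer() after wait_readable() reports 1 and before
treating the socket as connected, and would still check the return value of
the following read() or recv().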

System events and signal handling represent another whole category of
problems. In fact, a program that has to deal with all external events and
signals, or uses many of them internally, e.g. for multithreading, is
beyond the scope of a single paper. Only practical programming and testing
experience can help you to develop good skills in handling these events.
One interesting example is the use of alarm calls/timers and signals. The
easiest way to handle timeouts is to set up a signal handler for SIGALRM,
then use siginterrupt() to change the restart behavior of blocking
functions, so that they return -1 (with errno set to EINTR) when the handler
returns from an alarm event. A more complex way of alarm handling is to make
use of setjmp/longjmp, which can save an execution context and later return
to it from anywhere in a program. While they are easy to use, these
functions are actually very complex in their behavior and should be used
with care. One thing I have experienced is that, due to alarm handlers,
programs can be brought back into functions that restart on timeouts (the
default for blocking functions on many systems), and therefore repeat
actions that the programmer never intended to perform more than once. The
more sophisticated the functions a programmer uses, the harder it is to
find and track down errors.

As a last piece of advice for writing good code, using compiler checks is
recommended, such as -Wimplicit, -Wall, -ansi, -pedantic, as well as always
using prototypes, or even designing a concept before writing a more
complicated program. It is also a good experience to try to port a program
to many other platforms, as further errors might be revealed on other
systems. If security and stability are very important, for example in a
suid application, or when writing a big server program, it is recommended to
use parser generators (yacc, bison, etc., which are, however, probably
some of the most complex programs to use ever), or lint-based automatic code
checking systems (e.g. lclint), which provide source code checking
at any desired strictness level.

_______________________________________________________________________________

Mixter <mixter@newyorkoffice.com>
http://mixtersecurity.tripod.com