texto.txt

Posted Dec 21, 1999

tags | encryption, steganography
SHA-256 | b131f795620fdfa15e400f49478d73bb70a5b71edba23ab2a31ab4d74329f6b8

------------------------------------
TEXTO - Text steganography
------------------------------------

Texto is a rudimentary text steganography program which transforms
uuencoded or pgp ascii-armoured ascii data into English sentences.

This program was written to facilitate the exchange of binary
data, especially encrypted data. Why is this necessary? People or
programs may be reading your mail. Recent events in the US congress may
_require_ Internet Service Providers to monitor incoming mail and determine
whether or not it is "obscene" or lives up to particular parochial moral
standards. Since they can't scan the contents of an encrypted message,
and probably don't have time to manually look at each uuencoded message,
such emails will probably go into the bit bucket. This program's output
is hopefully close enough to normal English text that it will slip by
any kind of automated scanning.

Texto text files look like something between mad libs and bad poetry
(although they do sometimes contain deep cosmic truths), and should be close
enough to normal English to get past simple-minded mail scanners and to
entertain readers of talk.bizarre.

Texto works just like a simple substitution cipher: each of the 64 ascii
symbols used by pgp ascii armour or uuencode is replaced by an English word.
Not all of the words in the resulting English are significant, only those
nouns, verbs, adjectives, and adverbs used to fill in the preset sentence
structures. Punctuation and "connecting" words (or any other words not in
the dictionary) are ignored.
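
The substitution described above can be sketched in a few lines of Python.
This is only an illustration, not texto's actual code. It assumes the
64-symbol alphabet of pgp ascii armour (uuencode uses a different set), it
uses made-up placeholder words instead of the real "words" file, and the
sentence template is hypothetical. It also ignores newlines, padding, and
header lines, which a real tool has to deal with.

```python
import re
import string

# Assumption: the 64-symbol radix-64 alphabet used by pgp ascii armour.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# Hypothetical 64-word dictionary built from 8 x 8 nonsense syllables.
# The real "words" file holds English words grouped by part of speech.
ONSETS = ["ba", "de", "fi", "go", "ku", "la", "me", "ni"]
CODAS = ["b", "d", "k", "l", "m", "n", "r", "t"]
WORDS = [onset + coda for onset in ONSETS for coda in CODAS]

WORD_FOR = dict(zip(ALPHABET, WORDS))
SYMBOL_FOR = {word: symbol for symbol, word in WORD_FOR.items()}


def encode(data):
    """Replace each symbol with its word and drop the words into sentences."""
    words = [WORD_FOR[c] for c in data]
    sentences = []
    for i in range(0, len(words), 2):
        pair = words[i:i + 2]
        if len(pair) == 2:
            sentences.append("The {} and the {}.".format(*pair))
        else:
            sentences.append("The {}.".format(*pair))
    return " ".join(sentences)


def decode(text):
    """Keep only dictionary words; punctuation and connecting words are ignored."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return "".join(SYMBOL_FOR[t] for t in tokens if t in SYMBOL_FOR)
```

Note that decode() never looks at the sentence template, only at the
dictionary, so the connecting words "the" and "and" simply fall through.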

The obvious main drawback to using this program: the resulting text
is larger than the original data by a factor of 10. This is bad to the
point of uselessness if you need to send a 5MB uuencoded file. What
are some possible solutions to this problem? Using shorter words would
yield only minimal improvement as most of the words are pretty short now,
and you would still need the same number of English words. The best
solution I can think of is to use more words, one for every 2 symbols
instead of a one-to-one symbol to word mapping. This requires 4096 words
for each part of speech, (finding that many adverbs will be a real challenge),
but search speed shouldn't become a big factor when transforming text to data,
since texto uses a hash table for the words and their lengths in order to
minimize search times. The net result would probably be an average expansion
by ~5x instead of ~10x, which is significant enough to warrant trying it.
Changing the code will be easy; the hard part is typing in the dictionaries.
Look for this feature in texto 2.0 coming Real Soon to a net near you.
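
The arithmetic behind the 4096-word idea can be checked with a short Python
sketch. The function names are hypothetical, and the figure of about 10
output characters per dictionary word (the word plus its share of connecting
words and punctuation) is an assumption read off the ~10x expansion quoted
above, not a measurement.

```python
ALPHABET_SIZE = 64


def words_needed(symbols_per_word):
    """Dictionary size required so one word can stand for this many symbols."""
    return ALPHABET_SIZE ** symbols_per_word


def pair_to_index(first, second):
    """Map two symbol values (0..63 each) to one dictionary index (0..4095)."""
    return first * ALPHABET_SIZE + second


def index_to_pair(index):
    """Inverse of pair_to_index."""
    return divmod(index, ALPHABET_SIZE)


def expansion_factor(chars_per_word, symbols_per_word):
    """Output characters per input symbol, given the assumed word cost."""
    return chars_per_word / symbols_per_word
```

With the assumed cost of 10 characters per word, expansion_factor(10, 1)
gives 10.0 and expansion_factor(10, 2) gives 5.0, provided the words in the
bigger dictionary are not much longer on average.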

Since words are occasionally pluralized and/or gerundized (-ing), and
they're not all regular verbs/nouns, there are plenty of strange spelling
mistakes. While normally I despise misspelled words, they add a nice
human touch to the repetitive text, and add to the feeling that who/whatever
wrote the text was quite clearly out of his/her/its mind.


Usage:
------

texto msgfile > engfile       - Transforms the contents of msgfile into
                                English text and places results in "engfile".
                                msgfile must be a uuencoded or pgp
                                ascii-armoured text file.

texto -p engfile > pgpfile    - Takes English text from engfile and produces
            OR                  a pgp ascii-armoured text file, which will
texto -p engfile | pgp -f       be readable by pgp if the original message
                                file was. Alternatively, the output from
                                texto can be piped directly into pgp.

texto -u engfile > uufile     - Takes English text from engfile and produces
            OR                  a uuencoded file, which will be readable by
texto -u engfile | uudecode     uudecode if the original message file was.
                                Alternatively, the output from texto can
                                be piped directly into uudecode.
                                NOTE that uudecoding the results will always
                                produce a file called "texto.out" mode 644,
                                unless you redirect texto's output into a
                                file and hand edit that file.
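
The hand edit mentioned in the NOTE amounts to changing the first line of
the uuencoded file, which has the form "begin <mode> <filename>". A minimal
Python sketch of that edit follows. The function name is hypothetical and
this helper is not part of texto; it assumes the uuencoded text has already
been redirected into a file and read into a string.

```python
def rename_uu_header(uu_text, filename, mode="644"):
    """Rewrite the 'begin <mode> <filename>' line of uuencoded text."""
    lines = uu_text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.startswith("begin "):
            # Preserve the original line ending, if there was one.
            newline = "\n" if line.endswith("\n") else ""
            lines[i] = "begin {} {}{}".format(mode, filename, newline)
            break  # only the first header line is touched
    return "".join(lines)
```

For example, passing the contents of "uufile" along with the name and mode
the sender told you about gives text that uudecode will unpack under that
name instead of "texto.out".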

Installation:
-------------

This program has only been tested on IRIX 4.0.5, Linux kernel 1.0.x,
and Solaris 2.3. To build it, just type "make"; on SGIs, build it with the
command "make sgi". If you're on a Solaris machine or any other machine
whose uuencode uses spaces instead of ` characters, uncomment the
"DEFINES" line in the makefile.


Rolling your own:
-----------------

The usually-correct English sentence structures are found in the file
"structs", which is basically a file of mad lib-type "fill in the blank"
sentences. Feel free to add your own, but be really, really careful not to
use any word that appears in the "words" file. You're safe if you use words
that you see elsewhere in the "structs" file. Using varying "structs" files
could at least annoy mail scanners. Using different "words" files as
well should totally defeat them.

The 64 verbs, 64 adjectives, 64 adverbs, 64 places, and 64 things
which are used to fill in the blanks are in the "words" file. Again, feel
free to add your own, but again, be careful. Don't use words that end in
"s" or "ing" (they'll get chopped), and don't use words that are already in
there (you can double check with the command "sort words | uniq -d"). The
order of the words in each section of the file is also significant, so for
example rearranging the nouns will change the result.
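
The two rules above (no duplicates, nothing ending in "s" or "ing") can be
checked mechanically. The following Python sketch is a hypothetical helper,
not something shipped with texto. It takes a plain list of words, because
the exact layout of the "words" file is not described here, and it compares
words case-insensitively, which is an assumption ("sort words | uniq -d" is
case-sensitive).

```python
from collections import Counter


def check_words(words):
    """Return a list of complaints about a candidate word list."""
    problems = []
    counts = Counter(word.lower() for word in words)
    for word, count in sorted(counts.items()):
        if count > 1:
            problems.append("duplicate: " + word)
        if word.endswith("s") or word.endswith("ing"):
            problems.append("ends in s/ing: " + word)
    return problems
```

An empty result means the list passes both checks. It says nothing about
the ordering of the words, which still has to match on both ends.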

If you use a modified "words" file, the person on the other end of
your communication must of course be using the same one, or the transformation
will fail miserably. The "structs" file is totally irrelevant however, and
can be modified to suit your taste or literary style, so long as it doesn't
conflict with the "words" file as mentioned above. The structs file is
not used in "decoding" text, so two people can still communicate whether
or not they have the same "structs" file.

BUGS
----

uuencoded files lose the mode and filename information, which is a bummer.
Always writing to stdout may not be the best way to go.
The text produced by texto'ing a uuencoded file can be _really_ repetitive.
The 64-word dictionaries thing vs. the 4096-word ones, as mentioned above.
Texto is a dorky name, but it sort of rhymes with stego.
Please report any other bugs or fixes to kmaher@ucsd.edu

LICENSE
-------

Copying, modifications, improvements, etc. are highly encouraged, just
let me know so I can incorporate them.

All rites reversed.

AUTHOR
------

Kevin Maher
kmaher@ucsd.edu
Underware Software Production Ltd. Inc. etc.
"Covering your ass since 1981"
