Disappearing Cryptography
Information Hiding: Steganography & Watermarking
By Peter Wayner
MORGAN KAUFMANN
Copyright © 2009 Peter Wayner
All rights reserved.
ISBN: 978-0-08-092270-6
Chapter One
Framing Information
On its face, information in computers seems perfectly defined and certain. A bank account either has $1,432,442 or it has $8.32. The weather is either going to be 73 degrees or 74 degrees. The meeting is either going to be at 4 pm or 4:30 pm. Computers deal only with numbers and numbers are very definite.
Life isn't so easy. Advertisers and electronic gadget manufacturers like to pretend that digital data is perfect and immutable, freezing life in a crystalline mathematical amber; but the natural world is filled with noise and numbers that can only begin to approximate what is happening. The digital information comes with much more precision than the world may provide.
Numbers themselves are strange beasts. All of their certainty can be scrambled by arithmetic, equations and numerical parlor tricks designed to mislead and misdirect. Statisticians brag about lying with numbers. Car dealers and accountants can hide a lifetime of sins in a balance sheet. Encryption can make one batch of numbers look like another with a snap of the fingers.
Language itself is often beyond the grasp of rational thought. Writers dance around topics and thoughts, relying on nuance, inflection, allusion, metaphor, and dozens of other rhetorical techniques to deliver a message. None of these tools are perfect and people seem to find a way to argue about the definition of the word "is".
This book describes how to hide information by exploiting this uncertainty and imperfection. This book is about how to take words, sounds, and images and hide them in digital data so they look like other words, sounds, or images. It is about converting secrets into innocuous noise so that the secrets disappear in the ocean of bits flowing through the Net. It describes how to make data mimic other data to disguise its origins and obscure its destination. It is about submerging a conversation in a flow of noise so that no one can know if a conversation exists at all. It is about taking your being, dissolving it into nothingness, and then pulling it out of the nothingness so it can live again.
Traditional cryptography succeeds by locking up a message in a mathematical safe. Hiding the information so it can't be found is a similar but distinct process usually called steganography. There are many historical examples, including hidden compartments, mechanical systems like microdots, and burst transmissions, that make the message hard to find. Other techniques, like encoding the message in the first letters of words, disguise the content and make it look like something else. All of these have been used again and again.
Digital information offers wonderful opportunities to not only hide information, but also to develop a general theoretical framework for hiding the data. It is possible to describe general algorithms and make some statements about how hard it will be for someone who doesn't know the key to find the data. Some algorithms offer a good model of their strength. Others offer none.
Some of the algorithms for hiding information use keys that control how they behave. Some of the algorithms in this book hide information in such a way that it is impossible to recover the information without knowing the key. That sounds like cryptography, even though it is accomplished at the same time as cloaking the information in a masquerade.
Is it better to think of these algorithms as "cryptography" or as "steganography"? Drawing a line between the two is both arbitrary and dangerously confusing. Most good cryptographic tools also produce data that looks almost perfectly random. You might say that they are trying to hide the information by disguising it as random noise. On the other hand, many steganographic algorithms are not trivial to break even after you learn that there is hidden data to find. Placing an algorithm in one camp often means forgetting why it could exist in the other. The best solution is to think of this book as a collection of tools for massaging data. Each tool offers some amount of misdirection and some amount of security. The user can combine a number of different tools to achieve their end.
The book is published under the title of "Disappearing Cryptography" for the reason that few people knew about the word "steganography" when it appeared. I have kept the title for many of the same practical reasons, but this doesn't mean that the title is just a cute mechanism for giving the buyer a cover text they can use to judge the book. Simply thinking of these algorithms as tools for disguising information is a mistake. Some offer cryptographic security at the same time as an effective disguise. Some are deeply intertwined with cryptographic algorithms, while others act independently. Some are difficult to break without the key while others offer only basic protection. Trying to classify the algorithms purely as steganography or cryptography imposes only limitations. It may be digital information, but that doesn't mean there aren't an infinite number of forms, shapes, and appearances the information may assume.
1.0.1 Reasons for Secrecy
There are many different reasons for using the techniques in this book and some are scurrilous. There is little doubt that the Four Horsemen of the Infocalypse (the drug dealers, the terrorists, the child pornographers, and the money launderers) will find a way to use the tools to their benefit in the same way that they've employed telephones, cars, airplanes, prescription drugs, box cutters, knives, libraries, video cameras and many other common, everyday items. There's no need to explain how people can hide behind the veils of anonymity and secrecy to commit heinous crimes.
But these tools and technologies can also protect the weak. In this book's defense, here's a list of some possible good uses:
1. So you can seek counseling about deeply personal problems like suicide.
2. So you can inform colleagues and friends about a problem with odor or personal hygiene.
3. So you can meet potential romantic partners without danger.
4. So you can play roles and act out different identities for fun.
5. So you can explore job possibilities without revealing where you currently work and potentially losing your job.
6. So you can turn a person in to the authorities anonymously without fear of recrimination.
7. So you can leak information to the press about gross injustice or unlawful behavior.
8. So you can take part in a contentious political debate about, say, abortion, without losing the friendship of those who happen to be on the other side of the debate.
9. So you can protect your personal information from being exploited by terrorists, drug dealers, child pornographers and money launderers.
10. So the police can communicate with undercover agents infiltrating the gangs of bad people.
There are many other reasons, but I'm surprised that government officials don't recognize how necessary these freedoms are to the world. Much of government functions through back-corridor bargaining and power games. Anonymous communication is a standard part of this level of politics. I often believe that all governments would grind to a halt if information was as strictly controlled as some would like it to be. No one would get any work done. They would just spend hours arguing about who should and should not have access to information.
The Central Intelligence Agency, for instance, has been criticized for missing the collapse of the former Soviet Union. They continued to issue pessimistic assessments of a burgeoning Soviet military while the country imploded. Some blame greed, power, and politics. I blame the sheer inefficiency of keeping information secret. Spymaster Bob can't share the secret data he got from Spymaster Fred because everything is compartmentalized. When people can't get new or solid information, they fall back on their basic prejudices, which in this case meant believing that the Soviet Union was a burgeoning empire. There will always be a need for covert analysis for some problems, but it will usually be much more inefficient than overt analysis.
Anonymous dissemination of information is a grease for the squeaky wheel of society. As long as people question its validity and recognize that its source is not willing to stand behind the text, then everyone should be able to function with the information. When it comes right down to it, anonymous information is just information. It's just a torrent of bits, not a bullet, a bomb or a broadside. Sharing information generally helps society pursue the interests of justice.
Secret communication is essential for security. The police and the defense department are not the only people who need the ability to protect their schedules, plans, and business affairs. The algorithms in this book are like locks on doors and cars. Giving this power to everyone gives everyone the power to protect themselves against crime and abuse. The police do not need to be everywhere because people can protect themselves.
For all of these reasons and many more, these algorithms are powerful tools for the protection of people and their personal data.
1.0.2 How It Is Done
There are a number of different ways to hide information. All of them offer some stealth, but not all of them are as strong as the others. Some provide startling mimicry with some help from the user. Others are largely automatic. Some can be combined with others to provide multiple layers of security. All of them exploit some bit of randomness, some bit of uncertainty, or some bit of unspecified state in a file. Here is an abstract list of the techniques used in this book:
Use the Noise: The simplest technique is to replace the noise in an image or sound file with your message. The digital file consists of numbers that represent the intensity of light or sound at a particular point of time or space. Often these numbers are computed with extra precision that can't be detected effectively by humans. For instance, one spot in a picture might have 220 units of blue on a scale that runs between 0 and 255 total units. An average eye would not notice if that one spot was converted to having 219 units of blue. If this process is done systematically, it is possible to hide large volumes of information just below the threshold of perception. A digital photo-CD image has 2048 by 3072 pixels that each contain 24 bits of information about the colors of the image. 756k of data can be hidden in the three least significant bits for each color of each pixel. That's probably more than the text of this book. The human eye would not be able to detect the subtle variations but a computer could reconstruct them all.
Spread the Information Out: Some of the more sophisticated mechanisms spread the information over a number of pixels or moments in the sound file. This diffusion protects the data and also makes it less susceptible to detection, either by humans looking at the information or by computers looking for statistical profiles. Many of the techniques that fall into this category came from the radio communication arena where the engineers first created them to cut down on interference, reduce jamming, and add some secrecy. Adapting them to digital communications is not difficult. Spreading the information out often increases the resilience to destruction by either random or malicious forces. The spreading algorithms often distribute the information in such a way that not all of the bits are required to reassemble the original data. If some parts get destroyed, the message still gets through.
Many of these spreading techniques hide information in the noise of an image or sound file, but there is no reason why they can't be used with other forms of data as well.
Adopt a Statistical Profile: Data often falls into a pattern and computers often try to make decisions about data by looking at the pattern. English text, for instance, uses the letter 'p' far more often than the letter 'q' and this information can be useful for breaking ciphers. If data can be reformulated so it adopts the statistical profile of the English language, then a computer program minding ps and qs will be fooled.
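One way to make data adopt a letter-frequency profile, offered here as an assumption rather than as the method this excerpt describes, is to run the hidden bits through a Huffman decoder: frequent letters have short codes, so random-looking bits come out as letters in roughly the chosen proportions. The sketch below uses a small, made-up frequency table (`FREQ`) and hypothetical function names (`build_codes`, `mimic`, `recover`). The output matches only single-letter frequencies and is not readable English.

```python
import heapq
from itertools import count

# Hypothetical, rounded letter weights for illustration only.
FREQ = {"e": 12, "t": 9, "a": 8, "o": 8, "n": 7, "s": 6, "p": 2, "q": 1}


def build_codes(freq: dict[str, int]) -> dict[str, str]:
    """Build a Huffman code: more frequent symbols get shorter bit strings."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(weight, next(tiebreak), {sym: ""}) for sym, weight in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w0, _, codes0 = heapq.heappop(heap)
        w1, _, codes1 = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in codes0.items()}
        merged.update({sym: "1" + code for sym, code in codes1.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]


def mimic(bits: str, codes: dict[str, str]) -> str:
    """Turn a bit string into letters by Huffman-decoding it."""
    decode = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in decode:
            out.append(decode[current])
            current = ""
    # Pad an unfinished final code with zeros until it names a letter.
    while current and current not in decode:
        current += "0"
    if current:
        out.append(decode[current])
    return "".join(out)


def recover(text: str, codes: dict[str, str], n_bits: int) -> str:
    """Invert `mimic` by Huffman-encoding the letters, dropping any padding."""
    return "".join(codes[ch] for ch in text)[:n_bits]
```

The receiver needs the same frequency table and the original bit count to call `recover`; in this sketch both are assumed to be shared in advance.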
(Continues...)
Excerpted from Disappearing Cryptography by Peter Wayner Copyright © 2009 by Peter Wayner. Excerpted by permission of MORGAN KAUFMANN. All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.