
NCSC-TR-005-1.txt

Posted Aug 17, 1999

NCSC Technical Report 005 Volume 1/5: Inference and Aggregation Issues In Secure Database Management Systems

tags | paper
MD5 | 722c2aa5ce3de2bfc1887f4b3374e243

NCSC TECHNICAL REPORT - 005

Volume 1/5

Library No. S-243,039

FOREWORD

This report is the first of five companion documents to the Trusted Database Management System
Interpretation of the Trusted Computer System Evaluation Criteria. The companion documents
address topics that are important to the design and development of secure database management
systems, and are written for database vendors, system designers, evaluators, and researchers. This
report addresses inference and aggregation issues in secure database management systems.



ACKNOWLEDGMENTS

The National Computer Security Center extends special recognition to the authors of this
document. The initial version was written by Victoria Ashby and Sushil Jajodia of the MITRE
Corporation. The final version was written by Gary Smith, Stan Wisseman, and David Wichers of
Arca Systems, Inc.

The documents in this series were produced under the guidance of Shawn P. O'Brien of the
National Security Agency, LouAnna Notargiacomo and Barbara Blaustein of the MITRE
Corporation, and David Wichers of Arca Systems, Inc.

We wish to thank the members of the information security community who enthusiastically gave
of their time and technical expertise in reviewing these documents and in providing valuable
comments and suggestions.

TABLE OF CONTENTS

1.0 INTRODUCTION

1.1 BACKGROUND AND PURPOSE

1.2 SCOPE

1.3 INTRODUCTION TO INFERENCE AND AGGREGATION

1.4 AUDIENCES OF THIS DOCUMENT

1.5 ORGANIZATION OF THIS DOCUMENT

2.0 BACKGROUND

2.1 INFERENCE AND AGGREGATION - DEFINED AND EXPLAINED

2.1.1 Inference

2.1.1.1 Inference Methods

2.1.1.2 Inference Information

2.1.1.3 Related Disciplines

2.1.2 Aggregation

2.1.2.1 Data Association Examples

2.1.2.2 Cardinal Aggregation Examples

2.2 SECURITY TERMINOLOGY

2.3 THE TRADITIONAL SECURITY ANALYSIS PARADIGM

2.4 THE DATABASE SECURITY ENGINEERING PROCESS

2.5 THE REQUIREMENTS FOR PROTECTING AGAINST INFERENCE AND AGGREGATION

2.5.1 The Operational Requirement

2.5.2 The TCSEC Requirement

3.0 AN INFERENCE AND AGGREGATION FRAMEWORK

3.1 DETAILED EXAMPLE

3.2 THE FRAMEWORK

3.2.1 Information Protection Requirements and Resulting Vulnerabilities

3.2.2 Classification Rules

3.2.3 Vulnerabilities from Classification Rules

3.2.4 Database Design

3.2.5 Vulnerabilities from Database Design

3.2.6 Database Instantiation and Vulnerabilities

3.3 OVERVIEW OF RESEARCH

4.0 APPROACHES FOR INFERENCE CONTROL

4.1 FORMALIZATIONS

4.1.1 Database Partitioning

4.1.2 Set Theory

4.1.3 Classical Information Theory

4.1.4 Functional and Multivalued Dependencies

4.1.5 Sphere of Influence

4.1.6 The Fact Model

4.1.7 Inference Secure

4.1.8 An Inference Paradigm

4.1.9 Imprecise Inference Control

4.2 DATABASE DESIGN TECHNIQUES

4.2.1 Inference Prevention Methods

4.2.1.1 Appropriate Labeling of Data

4.2.1.2 Appropriate Labeling Integrity Constraints

4.2.2 Inference Detection Methods

4.3 RUN-TIME MECHANISMS

4.3.1 Query Response Modification

4.3.1.1 Constraint Processors

4.3.1.2 Database Inference Controller

4.3.1.3 Secondary Path Monitoring

4.3.1.4 MLS System Restrictions

4.3.2 Polyinstantiation

4.3.3 Auditing Facility

4.3.4 Snapshot Facility

4.4 DATABASE DESIGN TOOLS

4.4.1 DISSECT

4.4.2 AERIE

4.4.3 Security Constraint Design Tool

4.4.4 Hypersemantic Data Model and Language

5.0 APPROACHES FOR AGGREGATION

5.1 DATA ASSOCIATION

5.1.1 Formalizations

5.1.1.1 Lin's Work

5.1.1.2 Cuppens' Method

5.1.2 Techniques

5.2 CARDINAL AGGREGATION

5.2.1 Techniques

5.2.1.1 Single Level DBMS Techniques

5.2.1.1.1 Database Design and Use of DAC

5.2.1.1.2 Use of Views

5.2.1.1.3 Restricting Ad Hoc Queries

5.2.1.1.4 Use of Audit

5.2.1.1.5 Store all Data System-High

5.2.1.2 Multilevel DBMS Techniques

5.2.1.2.1 Use of Labeled Views

5.2.1.2.2 Start with All Data System-High

5.2.1.2.3 Make Some Data High

5.2.2 Mechanisms

5.2.2.1 Aggregation Constraints

5.2.2.2 Aggregation Detection

6.0 SUMMARY

REFERENCES

LIST OF FIGURES

2.1: TRADITIONAL SECURITY ANALYSIS PARADIGM

2.2: DATABASE SECURITY ENGINEERING PROCESS

3.1: ENTITIES AND ASSOCIATIONS

3.2: ATC ATTRIBUTES

3.3: INFERENCE AND AGGREGATION FRAMEWORK

3.4: ATC EXAMPLE DATABASE DESIGN

4.1: PARTITIONING OF THE DATABASE FOR A USER U

4.2: DATABASE WITHOUT AN INFERENCE PROBLEM

4.3: DATABASE WITH AN INFERENCE PROBLEM

4.4: QUERY PROCESSOR

4.5: DBIC SYSTEM CONFIGURATION

4.6: GENERAL-PURPOSE DATABASE MANAGEMENT SYSTEM MODEL

4.7: DATABASE MANAGEMENT SYSTEM MODEL AT THE U.S. BUREAU OF THE CENSUS

5.1: ATC COMPANY DATA ASSOCIATION EXAMPLES

LIST OF TABLES

3.1: TAXONOMY OF PROTECTION REQUIREMENTS AND CLASSIFICATION RULES

3.2: FORMAL DEFINITIONS

3.3: RESEARCH SUMMARY (PART 1)

3.3: RESEARCH SUMMARY (PART 2)

SECTION 1

INTRODUCTION

This document is the first volume in the series of companion documents to the Trusted Database
Management System Interpretation of the Trusted Computer System Evaluation Criteria [TDI 91;
DoD 85]. This document examines inference and aggregation issues in secure database
management systems and summarizes the research to date in these areas.

1.1 BACKGROUND AND PURPOSE

In 1991 the National Computer Security Center published the Trusted Database Management
System Interpretation (TDI) of the Trusted Computer System Evaluation Criteria (TCSEC). The
TDI, however, does not address many topics that are important to the design and development of
secure database management systems (DBMSs). These topics (such as inference, aggregation, and
database integrity) are being addressed by ongoing research and development. Since specific
techniques in these topic areas had not yet gained broad acceptance, the topics were considered
inappropriate for inclusion in the TDI.

The TDI is being supplemented by a series of companion documents to address these issues
specific to secure DBMSs. Each companion document focuses on one topic by describing the
problem, discussing the issues, and summarizing the research that has been done to date. The intent
of the series is to make it clear to DBMS vendors, system designers, evaluators, and researchers
what the issues are, the current approaches, their pros and cons, how they relate to a TCSEC/TDI
evaluation, and what specific areas require additional research. Although some guidance may be
presented, nothing contained within these documents should be interpreted as criteria.

These documents assume the reader understands basic DBMS concepts and relational database
terminology. A security background sufficient to use the TDI and TCSEC is also assumed;
however, fundamentals are discussed whenever a common understanding is important to the
discussion.

1.2 SCOPE

This document addresses inference and aggregation issues in secure DBMSs. It is the first of five
volumes in the series of TDI companion documents, which includes the following documents:

· Inference and Aggregation Issues in Secure Database Management Systems

· Entity and Referential Integrity Issues in Multilevel Secure Database Management
Systems [Entity 96]

· Polyinstantiation Issues in Multilevel Secure Database Management Systems [Poly 96]

· Auditing Issues in Secure Database Management Systems [Audit 96]

· Discretionary Access Control Issues in High Assurance Secure Database Management
Systems [DAC 96]

This series of documents uses terminology from the relational model to provide a common basis
for understanding the concepts presented. This does not mean that these concepts do not apply to
other database models and modeling paradigms.

1.3 INTRODUCTION TO INFERENCE AND AGGREGATION

Many definitions of inference and aggregation have been suggested in the literature. In fact, one of
the challenges to understanding inference and aggregation is that there are different (but sometimes
similar) notions of what they are. We will discuss the differences and similarities at length in later
sections. At this point, the following general definitions are sufficient:

Inference or the inference problem is that of users deducing (or inferring) higher level
information based upon lower, visible data [Morgenstern 88].

Aggregation or the aggregation problem occurs when classifying and protecting collections
of data that have a higher security level than any of the elements that comprise the aggregate
[Denning 87].

Inference and aggregation problems are not new, nor are they only applicable to DBMSs.
However, the problems are exacerbated in multilevel secure (MLS) DBMSs that label and enforce
a mandatory access control (MAC) policy on DBMS objects [Denning 86a].

Inference and aggregation differ from other security problems since the leakage of information is
not a result of the penetration of a security mechanism but rather it is based on the very nature of
the information and the semantics of the application being automated [Morgenstern 88].

1.4 AUDIENCES OF THIS DOCUMENT

This document is targeted at four primary audiences: the security research community, database
application developers/system integrators, trusted product vendors, and product evaluators.
Inference and aggregation are problems based on the value and semantics of data for the part of the
enterprise being automated. Understanding inference and aggregation is important for all of the
audience categories, but the audience for whom this document is most relevant includes those
involved in designing and engineering automated information systems (AIS)--in particular, the
database application developers/system integrators. In general, this document is intended to
present a basis for understanding the issues surrounding inference and aggregation especially in
the context of MLS DBMSs. Implemented approaches and ongoing research are examined.
Members of the specific audiences should expect to get the following from this document:

Researcher

This document describes the basic issues associated with inference and aggregation. Important
research contributions are discussed as various topics are examined. By presenting current theory
and debate, this discussion will help the research community understand the scope of the issue and
highlight approaches and alternative solutions to inference and aggregation problems. For
additional relevant work, the researcher should consult two associated TDI companion documents:
Polyinstantiation Issues in Multilevel Secure Database Management Systems [Poly 96] and Entity
and Referential Integrity Issues in Multilevel Secure Database Management Systems [Entity 96].

Database Application Developer/System Integrator

This document highlights the need for analysis of the application-dependent semantics of an
application domain in order to understand the impact of inference and aggregation on MLS
applications. It describes the basic issues and current research into providing effective solutions.

Trusted Product Vendor

This document describes the types of mechanisms from the research literature that can be
incorporated into products to assist application developers with enforcing aggregation-based
security policies. Research on tools to support inference analysis provides the basis for additional
tools that might be provided with a DBMS product.

Evaluator

This document presents an understanding of inference and aggregation issues and how they relate
to the evaluation of a candidate MLS DBMS implementation.

1.5 ORGANIZATION OF THIS DOCUMENT

The organization of the remainder of this document is as follows:

· Section 2 discusses several first principles that form the foundation for reasoning about
inference and aggregation. Included in Section 2 are subsections that define and explain
many aspects of inference and aggregation, present relevant security terminology (e.g.,
comparing inference to direct disclosure), and describe two fundamental processes: the
traditional security analysis paradigm and the database security engineering process.

· Section 3 presents a framework for reasoning about inference and aggregation based on
the two processes described in Section 2. Section 3 also introduces a detailed example that
is used throughout the document to illustrate the material. Section 3 concludes with a
matrix that shows how each research effort discussed in Sections 4 and 5 relates to the
framework.

· Section 4 summarizes research efforts on inference.

· Section 5 summarizes research on aggregation.

· Section 6 summarizes the contents of this document.

SECTION 2

BACKGROUND

Inference and aggregation are different but related problems. In order to understand what they are,
what problems they embody, and their relationship, it is essential to revisit several first principles.
In Section 2.1, we give further descriptions and discuss inference and aggregation in some detail,
including examples. Some important terminology and security concepts are covered in Section 2.2.
The next two sections describe fundamental processes that form the basis for reasoning about
inference and aggregation: the traditional security analysis paradigm (Section 2.3) and the database
security engineering process (Section 2.4). The final section summarizes the operational and
TCSEC requirements for inference and aggregation.

Although much of what is discussed in this section may strike some readers as intuitively
obvious, it is these principles that are most often forgotten and that lead to misunderstandings.
Even experienced researchers and practitioners should review this section.

2.1 INFERENCE AND AGGREGATION - DEFINED AND EXPLAINED

This section provides further definitions and explanations of inference and aggregation problems.
Many examples are given to illustrate the relationship and complexity of these problems.

2.1.1 Inference

Inference, the inferring of new information from known information, is an everyday activity of
most humans. In fact, inference is a topic of interest for several disciplines, each from its own
perspective. In the context of this document:

Inference addresses the problem of preventing authorized users of an AIS from inferring
unauthorized information.

Preventing unauthorized inferences is complicated both by the variety of ways in which humans
infer new information and by the wide variety of information they use to make inferences.

Some researchers characterize the inference problem in terms of inference channels [Morgenstern
88; Meadows 88a]. SRI's research identifies three types of inference channels based on the degree
to which HIGH data may be inferred from LOW data [Garvey 94]:

· Deductive Inference Channel. The most restrictive channel, requiring a formal deductive
proof (described by propositional or first-order logic) showing that HIGH data can be
derived from LOW data.

· Abductive Inference Channel. A less restrictive channel where a deductive proof could be
completed assuming certain LOW-level axioms.

· Probabilistic Inference Channel. This channel occurs when it is possible to determine that
the likelihood of an unauthorized inference is greater than an acceptable limit.
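As a rough illustration of these three channel types, the toy sketch below treats a deductive channel as forward chaining over LOW facts alone, an abductive channel as the same chaining after assuming plausible LOW-level axioms, and a probabilistic channel as a likelihood exceeding an acceptable limit. The facts, rules, and the 0.5 threshold are all hypothetical and invented for this example; they do not come from [Garvey 94].

```python
# Toy stand-ins for the three channel types from [Garvey 94]; all data hypothetical.

def deductive_channel(low_facts, rules, high_fact):
    """HIGH fact derivable from LOW facts alone (simple forward chaining)."""
    known = set(low_facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return high_fact in known

def abductive_channel(low_facts, rules, axioms, high_fact):
    """HIGH fact derivable once certain LOW-level axioms are assumed."""
    return deductive_channel(set(low_facts) | set(axioms), rules, high_fact)

def probabilistic_channel(p_high_given_low, limit=0.5):
    """Channel exists when the inference likelihood exceeds an acceptable limit."""
    return p_high_given_low > limit

rules = [(("ship_in_port", "cargo_loaded"), "mission_imminent")]
print(deductive_channel({"ship_in_port"}, rules, "mission_imminent"))   # False
print(abductive_channel({"ship_in_port"}, rules, {"cargo_loaded"},
                        "mission_imminent"))                            # True
print(probabilistic_channel(0.8))                                       # True
```

The point of the sketch is only the ordering of restrictiveness: the deductive channel fails without the assumed axiom, the abductive channel succeeds with it, and the probabilistic channel needs no proof at all, only a likelihood estimate.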

These descriptions of inference channels combine the method by which the inference is made and
the information used to make the inference. Both of these areas of complexity are discussed further
in the following paragraphs. Section 2.1.1.3 identifies related disciplines and how they address
inference.

2.1.1.1 Inference Methods

One of the factors that makes the inference problem so difficult is that there are many methods by
which a human can infer information. The following list [Ford 90] illustrates the variety of
inference strategies that can be used:

· Inference by Deductive Reasoning. New information is inferred using well-formed rules,
either through classical or non-classical logic deduction. Classical deductions are based on
basic rules of logic (e.g., the assertions "A is true" and "if A then B" deduce "B is true").
Non-classical logic-based deduction includes probabilistic reasoning, fuzzy reasoning,
non-monotonic reasoning, default reasoning, temporal logic, dynamic logic, and modal
logic-based reasoning.

· Inference by Inductive Reasoning. Well-formed rules are used to infer hypotheses from
observed examples.

· Inference by Analogical Reasoning. Statements such as "X is like Y" are used to infer
properties of X given the properties of Y.

· Inference by Heuristic Reasoning. New information is deduced using various heuristics.
Heuristics are not well defined and may be no more than a "rule of thumb."

· Inference by Semantic Association. The association between entities is inferred from
knowledge of the entities themselves.

· Inferred Existence. One can infer the existence of entity Y from certain information (e.g.,
from the information "John lives in Boston," one can infer that "there is some entity called
John").

· Statistical Inference. Information about an individual entity is inferred from various
statistics computed on a set of entities.

2.1.1.2 Inference Information

The other factor that makes preventing unauthorized inferences within a DBMS a difficult problem
is the wide variety of information that can be used to make an inference. The following are some
types of information that can be used to make inferences:

· Schema metadata including relation and attribute names

· Other metadata such as constraints (e.g., value constraints or other constraints that may be
implemented through triggers or procedures)

· Data stored in the relations (i.e., the value of the data)

· Statistical data derived from the database

· Other derived data that can be determined from data in the database

· The existence or absence of data (as opposed to the value of the data)

· Data disappearing (e.g., upgraded) or changing value over time

· Semantics of the data that are not stored in the database but are known within the
application domain

· Specialized information about the application domain (both process and data) not stored
in the database

· Common knowledge, and

· Common sense

Denning [Denning 86a] and many other researchers adopt a so-called "closed-world assumption"
that limits inference to the database. With this limitation, both the information used to make the
inference and the inferred information must be represented in the database. The first seven types
of information described above encompass information that would be included using this closed-
world assumption. Unfortunately, in the real world in which a DBMS must operate, significant
amounts of the remaining types of information will be used by an adversary to attempt to infer
unauthorized information.

2.1.1.3 Related Disciplines

Inference in the context of this document focuses on the information contained in a DBMS and
preventing inference of unauthorized information by authorized users of the DBMS. Inference is
also a key issue in several other disciplines including:

· Operations Security (OPSEC). OPSEC is concerned with denying external adversaries the
ability to infer classified information by observing unclassified indicators. The key
difference is that the adversary is external to the organization, while this document
addresses individuals who are internal to the organization (i.e., DBMS users).

· Artificial Intelligence (AI). The ability to infer new facts from existing knowledge is a
positive goal of several of the sub-disciplines of AI. The AI research community is looking
for better ways to represent knowledge and approaches for inferring new information.

· Uncertainty Management. The inverse of preventing inferences is the need for
individuals, such as intelligence analysts, to infer information about an adversary based
upon evidence that can be collected. There are several approaches to evaluating evidence
in an effort to quantify one's confidence in the evidence and the resulting inferred
information [Schum 87].

2.1.2 Aggregation

Like the word inference, the word aggregation has multiple meanings in the information
technology community. An aggregation, or aggregate, is a collection of separate data items. In the
database context, data aggregation normally has a positive connotation. Aggregation is an
important and desirable side effect of organizing information into a database. Information has a
greater value when it is gathered together, made easily accessible, and structured in a way that
shows relationships between pieces of data [National Academy Press 83]. In some sense, every
database is an aggregation of data.

The term aggregation has a different, more common use in the commercial database management
community, where it usually refers to the use of aggregate operators. The five standard operations
are sum, average, count, maximum, and minimum. Others such as standard deviation or variance
may also be found in query languages [Ullman 88]. These operators are applied to columns of a
relation to obtain a single quantity. They allow conclusions to be drawn from the mass of data
present in a database.
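These operators map directly onto SQL. A minimal sketch using Python's built-in sqlite3 module follows; the table schema and data are invented purely for illustration.

```python
# The five standard aggregate operators applied to a throwaway in-memory table.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE emp (name TEXT, salary INTEGER)")
con.executemany("INSERT INTO emp VALUES (?, ?)",
                [("alice", 50000), ("bob", 62000), ("carol", 71000)])

# Each operator collapses the salary column to a single quantity.
row = con.execute(
    "SELECT SUM(salary), AVG(salary), COUNT(*), MAX(salary), MIN(salary) FROM emp"
).fetchone()
print(row)  # (183000, 61000.0, 3, 71000, 50000)
```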

The aggregate operators in commercial DBMSs relate closely to the operations performed on
statistical databases. Statistical databases are used to answer queries dealing with sums, averages,
and medians, and use aggregation to hide the separate data items found in individual records. The
concern in a statistical database is for security in the sense of privacy, not classification. Security
in statistical databases has a large body of literature and will only be discussed briefly in this
document.
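The privacy concern in statistical databases can be seen in a classic two-query sketch (all names and salaries below are hypothetical): each permitted query returns only an average over several records, yet the two answers together reveal one individual's value.

```python
# Two "safe" aggregate queries combine to disclose carol's individual salary.
salaries = {"alice": 50_000, "bob": 62_000, "carol": 71_000, "dave": 58_000}

def avg_salary(exclude=()):
    """Stand-in for a statistical query that only ever returns an average."""
    rows = [v for k, v in salaries.items() if k not in exclude]
    return sum(rows) / len(rows)

n = len(salaries)
total_all = avg_salary() * n                           # AVG over everyone
total_rest = avg_salary(exclude=("carol",)) * (n - 1)  # AVG over everyone but carol
print(round(total_all - total_rest))                   # 71000: carol's salary
```

Attacks of this kind are why the statistical database literature restricts the query sets themselves, not just access to individual records.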

In the context of this document:

Aggregation involves the situation where a collection of data items is explicitly considered to
be classified at a higher level than the classification assigned to the individual data elements
[Lunt 89].

Denning has identified two classes of the aggregation problem [Denning 87]. We describe them
here in terms of entities and their attributes.

· Attribute Associations. Associations between instances of different data items, of which
there are two sub-classes:

Associations between Entities. Associations between instances of attributes of different
entities.

Associations between Attributes of an Entity. Associations between instances of different
attributes of an entity.

· Size-based Record Associations. Associations among multiple instances of the same entity.

We will use the current terminology to represent these classes of aggregation: Data Association
[Jajodia 92] for Attribute Associations (both Inter- and Intra-Entity) and Cardinal Aggregation
[Hinke 88] for Size-based Record Associations.

The difference between data association and cardinal aggregation is important. Data association
refers to aggregates of instances of different data objects; cardinal aggregation refers to aggregates
of instances of the same type of entity. Distinguishing between data association and cardinal
aggregation as two types of aggregation is also important since data association can be effectively
controlled using database design techniques with labeled data [Denning 87]. On the other hand,
effective mechanisms to enforce cardinal aggregation policies are still a research issue.

2.1.2.1 Data Association Examples

A traditional example of intra-entity data association involves salary: the identity of an employee
and the actual number that represents the employee's salary may both be unclassified, but the
association between a particular employee and the employee's salary is considered sensitive. This
is an instance of intra-entity data association as defined above: employee name (or other
identifying attributes such as employee number) and salary are both attributes of the employee
entity.

An example of inter-entity data association is the following: the identities of commercial
companies and government agencies are unclassified, but the association of a particular company
as the contractor for a particular government agency may be considered classified. In this case,
the data association is between two entities, the commercial company and the government agency.
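One way to express such a policy is as a classification rule over the set of attributes a query would reveal. The sketch below is a hypothetical illustration of the salary example; the attribute names and the two-level labeling are invented here, not taken from the report.

```python
# Hypothetical classification rule: employee_name and salary are each
# unclassified alone, but their association is sensitive.

def label_for(attributes):
    """Return the label for a result exposing the given set of attributes."""
    attrs = set(attributes)
    if {"employee_name", "salary"} <= attrs:   # intra-entity association
        return "SENSITIVE"
    return "UNCLASSIFIED"

print(label_for(["employee_name"]))            # UNCLASSIFIED
print(label_for(["salary"]))                   # UNCLASSIFIED
print(label_for(["employee_name", "salary"]))  # SENSITIVE
```

Because the rule depends only on which attributes appear together, a database design can enforce it statically, for example by storing the two attributes in separately labeled relations; this is why data association is considered controllable by design techniques.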

2.1.2.2 Cardinal Aggregation Examples

The traditional example of cardinal aggregation is the intelligence agency telephone book:
individual entries are unclassified but the total book is classified [Lunt 89]. The challenge with this
example is understanding what information the agency is trying to protect with the policy. Is it the
name and number of staff personnel (and therefore capability) of each (or some) organizational
element? The total staffing size of the agency? Identification of special employees? Special skills?
A clear understanding of what information needs to be protected must precede a cardinal
aggregation policy.

Another example of a cardinal aggregation policy relates to an Army troop list, that is, a list of
Army organizations [Smith 91]. Information about each Army organization is unclassified (even
if the organization's mission is classified), but the list as a whole is classified. Getting someone to
determine how many organizations can be aggregated and still remain unclassified information is
difficult. If the intent is to protect the Army's war fighting capability, then aggregating the combat
arms units (as opposed to the support and civilian-based organizations) will directly disclose that
information. Of course, one has to know about the war fighting capabilities of each type of combat
arms unit, which is unclassified information and can be obtained externally to the troop list.
Unfortunately, there may be other ways to infer the unauthorized information. For example, if one
can determine the tactical communications capability (i.e., aggregation of communications units),
can one then infer the war fighting capability? And what about any number of other aggregates of
units where there is a reasonable understanding of their relationship to war fighting capability? It
quickly becomes a difficult problem to describe, let alone to provide mechanisms to solve this type
of aggregation problem. Yet keeping information about individual units unclassified is essential
for operational efficiency.

A final example is network management data: information about outages of individual circuits is
not sensitive, but the aggregation of information about all circuit outages for a network is sensitive.
In this case, one can postulate that this cardinal aggregation policy is attempting to protect the
identification of vulnerabilities of the network from an adversary that might want to disrupt or
render the network inoperative. With information about only one or two circuit outages, an
adversary cannot mount an effective attack to bring the network down. On the other hand, if the
adversary knows all the circuit outages, the network may have a temporary single point of failure,
a vulnerability that the adversary can exploit. It is operationally impractical to treat information
about each circuit as sensitive, yet in the face of a credible threat, it is important to protect the
aggregate information.
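A naive run-time mechanism for such a policy might simply cap the number of sensitive records any single query can return. The sketch below is hypothetical (the limit, schema, and data are invented) and is included mainly to show why this style of mechanism falls short.

```python
# Hypothetical per-query cap for a cardinal aggregation policy on circuit outages.
OUTAGE_LIMIT = 2

outages = [{"circuit": c, "status": "down"} for c in ("A1", "B7", "C3", "D9")]

def query_outages(circuits):
    """Refuse any single query whose result would exceed the aggregate limit."""
    hits = [r for r in outages if r["circuit"] in circuits]
    if len(hits) > OUTAGE_LIMIT:
        raise PermissionError("aggregate exceeds cardinal aggregation limit")
    return hits

print(len(query_outages(["A1", "B7"])))   # 2: within the limit, allowed
try:
    query_outages(["A1", "B7", "C3"])     # 3 records: refused
except PermissionError as exc:
    print(exc)
```

A per-query cap like this is defeated by a sequence of small queries whose results are combined outside the system, which is one illustration of why effective cardinal aggregation mechanisms remain a research issue.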

All of these example cardinal aggregation policies were established as a trade-off between the need
to protect sensitive information and the need for operational efficiency. To some researchers,
cardinal aggregation policies are considered fundamentally unsound [Meadows 91]. Thus, it
should not be surprising that effective mechanisms to implement what some consider an unsound
policy elude the community.

2.2 SECURITY TERMINOLOGY

This section discusses several security-relevant terms and concepts. We distinguish between two
ways in which information can be revealed:

· Direct Disclosure. Classified information is directly revealed to unauthorized individuals.

· Inference. Classified information may be deduced by unauthorized individuals with some
level of confidence often less than 100%.

The primary distinction, then, between direct disclosure and inference is whether the information
is revealed (direct disclosure) or must be deduced (inference). For example, if a user at LOW can
execute a query to a DBMS and receive a response where the data displayed to the LOW user is
actually classified HIGH (i.e., based on an explicit classification rule), then this is a direct
disclosure problem not an inference problem. In most instances of inference, the adversary has
some uncertainty with respect to the accuracy of the inference of the unauthorized information.

For the purposes of this document, we do not distinguish between where the HIGH data that is
directly disclosed or inferred is located (i.e., in the database or not). The important consideration
is that HIGH data was directly disclosed or inferred. This consideration is independent of whether
the HIGH data is actually stored in the database or it exists somewhere else in the application
domain of discourse.

Data association policies are stated to prevent direct disclosure of classified information related to
an association between data elements. In most cases, cardinal aggregation policies are stated to
prevent inferences that could be made from the aggregate data.

The discussion above refers generically to "unauthorized" individuals. In fact, we must distinguish
between two types of unauthorized access to information:

· based on authorized access to classified information (i.e., a label-based policy); and

· based on informal "need-to-know" policies.

"Need-to-know" is a well-known concept and is defined in the DoD Information Security
Regulation [5100.1-R 86] as:

· "A determination made by a possessor of classified information that a prospective
recipient, in the interest of national security, has a requirement for access to, or knowledge,
or possession of the classified information in order to accomplish lawful and authorized
government purposes."

There are two distinct types of need-to-know: "informal" (sometimes called discretionary), and
"formal" need-to-know. With informal need-to-know, individuals may have the clearance (i.e., are
considered trustworthy to properly protect the information) but not the job-related need to actually
access the information. Informal need-to-know is enforced in the people and paper world by the
individual who physically controls access to a classified document. In an AIS, informal need-to-
know is normally enforced by discretionary access control (DAC) mechanisms as described in the
TCSEC. Formal need-to-know is enforced through compartments (or other release markings) to
which individuals are formally granted access. Mandatory access control (MAC) categories are
used in an AIS to enforce access based on formal need-to-know.
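The two enforcement styles can be contrasted in a small sketch. The levels, compartment names, and user names below are illustrative only: formal need-to-know appears as a MAC dominance check over a level plus a set of categories, and informal need-to-know as a simple DAC access list.

```python
# Hypothetical labels: formal need-to-know via MAC categories,
# informal need-to-know via a DAC access control list.
LEVELS = {"UNCLASSIFIED": 0, "SECRET": 1, "TOP SECRET": 2}

def mac_dominates(subject, obj):
    """Subject may read object iff its level dominates and it holds every category."""
    s_level, s_cats = subject
    o_level, o_cats = obj
    return LEVELS[s_level] >= LEVELS[o_level] and set(o_cats) <= set(s_cats)

def dac_allows(acl, user):
    """Informal need-to-know: the information owner lists permitted users."""
    return user in acl

doc = ("SECRET", {"CRYPTO"})
print(mac_dominates(("TOP SECRET", {"CRYPTO"}), doc))  # True
print(mac_dominates(("TOP SECRET", set()), doc))       # False: lacks the compartment
print(dac_allows({"alice", "bob"}, "carol"))           # False: cleared or not, no need-to-know
```

The second check shows the force of formal need-to-know: even a TOP SECRET clearance does not dominate a SECRET object whose compartment the subject has not been granted.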

Direct disclosure is normally only thought of as being a problem when classified information is
revealed to an individual without the appropriate clearance, including formal access to
compartments (i.e., formal need-to-know). Given that U.S. Government information is classified
based on laws or executive orders, direct disclosure of classified information is punishable by
imprisonment. On the other hand, disclosing information in violation of an informal need-to-know
policy is normally not considered a compromise of classified information, since the individual has
the appropriate clearance to see the information and is trusted to properly protect it. Thus, a
violation of an informal need-to-know policy may result in an employee being fired or sued, but
not imprisoned.

Although an authorized user can attempt to infer information for which the user does not have an
informal need-to-know, research efforts to date in inference and aggregation have dealt exclusively
with direct disclosure or inference based on compromise of classified information, not on
violations of informal need-to-know policies. As such, this document focuses almost exclusively
on protecting classified information against inference and aggregation threats from individuals
who are not cleared to receive this information.

2.3 THE TRADITIONAL SECURITY ANALYSIS PARADIGM

To better understand inference and aggregation and how they relate to each other, they need to be
discussed in the context of the traditional security analysis paradigm shown in Figure 2.1. A brief
description of the activities in the paradigm follows:

· Identify information protection requirements. The protection requirements traditionally
addressed by the security community relate to confidentiality. However, information
protection requirements for integrity and availability should also be specified. Although
the inference and aggregation problems strictly address confidentiality protection
requirements, solutions to these problems typically adversely impact the ability to meet the
integrity and availability requirements.

· Identify vulnerabilities. Given a set of protection requirements, one must identify the
vulnerabilities that could be exploited to prevent those requirements from being satisfied
by the system. As the next section shows, vulnerabilities need to be assessed continuously
throughout the design and engineering process.

· Identify threats. Given identified vulnerabilities, one must identify credible threat agents
that can exploit those vulnerabilities. For inference and aggregation, the primary threat is
the insider: an authorized user of the system who is authorized access to some, but not all,
information in the system. Malicious code is a secondary threat.

· Conduct a risk analysis. The risk analysis must take into account the likelihood of a threat
agent actually exploiting a vulnerability. The risk analysis considers tradeoffs between
effectiveness, cost, and the ability to meet other requirements (e.g., integrity and
availability) to determine appropriate controls.

· Select controls. Based on the risk analysis, controls are selected to eliminate or reduce a
threat agent's ability to exploit a vulnerability. Note that some controls are selected to
prevent, while others are selected to reduce or detect, a threat agent's ability to exploit the
vulnerability. This distinction is important because most cases dealing with inference
involve determining, during the risk analysis, the likelihood of an individual inferring
unauthorized information given a particular set of controls. The goal is to establish
controls that provide an acceptable level of risk at an affordable cost, where cost is cast in
terms of both money and adverse impact on functionality, ease of use, and performance.



Figure 2.1: Traditional Security Analysis Paradigm

Details of how this process is related to inference and aggregation are discussed in the next section.

2.4 THE DATABASE SECURITY ENGINEERING PROCESS

The process of implementing an AIS with a database that meets the specified security requirements
is shown in Figure 2.2. The process should begin with a clear understanding of what information
needs to be protected, in this case from unauthorized disclosure. Without a clear
understanding of what must be protected, secure database design becomes difficult.
Unfortunately, many inference and aggregation problems are stated at the database design level
(e.g., with sensitivity labels on tables) without a description of what information is to be protected
from disclosure.

Given a precise statement of the information to protect, there are two distinct aspects of
establishing rules for implementing those confidentiality protection requirements:

· classification rules must be established that specify the confidentiality properties of the
information, and

· access control rules must be established that specify which authorized users can access
information based on the classification rules.



Figure 2.2: Database Security Engineering Process

Classification rules should address all information that needs to be protected, not just information
that must be protected for national security reasons (i.e., Confidential, Secret, and Top Secret)
based on Executive Order 12356. For example, the Computer Security Act of 1987 requires that
unclassified but sensitive information receive confidentiality protection. Categories of unclassified
but sensitive information include, but are not limited to, personnel (i.e., subject to the Privacy Act
of 1974), financial, acquisition contracting, and investigations. Certain other information is
classified under the Atomic Energy Act of 1954 as Restricted Data or Formerly Restricted Data.

Confidentiality protection requirements are also applicable to non-government organizations. For
example, proprietary information of many types may be considered "bet your company" data and
needs to be protected to ensure the viability of the company.

Given a set of classification rules, the database designer must provide a database design which
labels data objects and correctly implements the classification rules. The database instantiation
represents the database as it is populated with data and used.

Access control rules are generically described as MAC for access based on the military
classification system (including formal need-to-know) and DAC for access based on informal
need-to-know. However, to effectively implement the protection requirements, one must explicitly
identify the application-dependent information which identifies which authorized users, or classes
of users (e.g., roles), are authorized access to each type of information identified in the
classification rules.

The access control rules must be incorporated into the application system design. Although the
access control rules can be implemented through mechanisms in the operating system or
application software, inference and aggregation concerns are primarily addressed by DBMS access
controls.

Classification rules are normally found in a Classification Guideline as required by the DoD
Information Security Program Regulation [DoD 5200.1-R]. Generic access control rules are
documented in the system security policy as required by the TCSEC. The instantiation of MAC
and DAC with application-dependent access control rules that identify specific users and data is
usually documented in a Security Concept of Operations.

The classification rules for an application domain need to be complete. That is, all information to
be contained in the supporting information system should be explicitly addressed.

2.5 THE REQUIREMENTS FOR PROTECTING AGAINST INFERENCE AND
AGGREGATION

In this section, we identify both operational and TCSEC requirements related to protecting against
inference and aggregation. Operational requirements, identified in Section 2.5.1, include not only
the need to protect against inference and aggregation but also operational considerations, or
tradeoffs, that impact decisions on how, or to what extent, inference and aggregation protections
are implemented. Section 2.5.2 discusses the lack of direct TCSEC requirements for protecting
against inference and aggregation.

2.5.1 The Operational Requirement

There are at least four operational requirements related to inference and aggregation protection.

1. The enterprise's information must be properly protected from disclosure either directly or
through inference.

2. Employees (i.e., users) must have access to the information they need to do their jobs.

3. The enterprise must conduct its operations in an efficient and cost-effective manner.

4. The enterprise must interact efficiently with external organizations, which may involve
mostly unclassified information.

Unfortunately, these operational requirements often are in conflict. Over-classifying data to meet
the first requirement may reduce availability of information to certain employees (conflicts with
second requirement). Cost-effective operations (the third requirement) may demand granting the
lowest clearances possible to employees, which may conflict with the established classification
rules to protect information (first requirement). Finally, the need to protect information (first
requirement) may conflict with the efficient interface with other organizations (fourth
requirement). This last conflict is the genesis of most cardinal aggregation policies.

2.5.2 The TCSEC Requirement

There are no requirements in the TCSEC, TDI, or Trusted Network Interpretation (TNI) [TNI 87]
that directly address inference and aggregation issues. Inference and aggregation are properties
intrinsic to data values and to the structure of a database (schema, metadata). Inference and
aggregation are based on an interpretation of the enterprise's security policy and on specific
requirements of the application domain being automated. Thus, inference and aggregation are
best addressed by the system developer, acquisition agency, and certification and accreditation
authorities for each system, rather than during a TCSEC/TDI evaluation.

Because inference is often characterized as an inference channel, there is sometimes confusion as
to the relationship between inference and covert channels. Inference and aggregation are possible
because of the existence of a set of specific known data values and a set of logical implication rules.
Knowledgeable database administrators or information security officers may be able to apply
inferential engines to the analysis of a populated database to uncover a set of potential unauthorized
inferences which they can then protect.

However, new inferences may be introduced through error on the part of users authorized to update
the database. It is important to note that the introduction of inference or aggregation is a data value-
based problem (including metadata that represents the database design), and not one of DBMS
(i.e., the product) design or implementation. Unlike Trojan horses or covert channels, values
needed to support an inferential attack are introduced consciously by benign actions of users, and
not clandestinely by malicious implanted code or by flaws in a DBMS trusted computing base
(TCB) design or implementation. In this sense, inference problems are different from covert
channel issues since all the data used to make the inference are at LOW, and there is no need for
an active agent at HIGH to signal information from HIGH to LOW [Jajodia 92]. Note also that
inference will still produce the same result, independent of whether the compromised (inferred)
data remains in (or ever existed in) the database at HIGH, as opposed to the need for HIGH data
to be present in order for it to be signaled to LOW by a covert channel.

SECTION 3

AN INFERENCE AND AGGREGATION FRAMEWORK

The purpose of this section is to present a framework to reason about inference and aggregation.
Section 3.1 begins with a description of a detailed example application domain that will be used to
illustrate parts of the framework. Section 3.2 describes the framework using instances from the
detailed example for illustration. Section 3.3 presents a matrix which summarizes how the research
efforts described in Sections 4 and 5 relate to the framework.

3.1 DETAILED EXAMPLE

This section introduces a detailed example used to illustrate the concepts presented in this
document. We use a mythical company, the Acme Technology Company or ATC. ATC contracts
with many government agencies and commercial companies to provide technology-oriented
products and services. ATC also conducts extensive applied research for its clients in several areas
of technology. ATC needs extensive support for its business operations by AISs that must represent
information about:

· entities of interest in the application domain, and

· functional or business processes used in the application domain.

Although there are other security considerations that should be addressed in designing functional
processes, for the purpose of this document, we will only address the process view as it relates to
access control.

We are most interested in understanding and modeling the information about entities. In
particular, we are concerned with identifying the entities, associations, and attributes of interest to
ATC.

Entities (real objects or concepts of interest):

· Employees ATC employs a wide variety of personnel from clerks to very specialized
engineers.

· Education Data about the education of each ATC employee is retained.

· Research Projects Although ATC has thousands of projects that cover a wide variety of
purposes, research projects represent a special category of projects, especially from a
security standpoint.

· Critical Technologies Certain technologies are particularly sensitive.

· Clients Information about the organization that sponsors a project.

Associations (relationships between instances of the entities). Figure 3.1 shows the following
associations (in italics) between entities:

· Employees work-on Projects.

· Employees have Education.

· A Client sponsors Projects.

· Some Projects use Critical Technologies.

Figure 3.1: Entities and Associations

Figure 3.2.: ATC Attributes

Attributes (characteristics or properties of an entity). The addition of attributes adds more detail
and, therefore, realistic complexity to the example. Figure 3.2 shows selected attributes for each
ATC entity.
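As an illustrative aid only, the ATC entities and associations of Figure 3.1 can be sketched as data structures. The attribute names are taken from the examples used later in this section (name, number, salary, home address and phone, degree, major, description, target platform); everything else about the representation is an assumption of this sketch, not part of the ATC example itself:

```python
# Hypothetical sketch of the ATC entities and associations (Figure 3.1).
# Attribute names are drawn from the examples in this document; the class
# layout itself is an assumption made for illustration.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Education:
    degree: str
    major: str

@dataclass
class Employee:
    name: str
    number: int
    salary: int
    home_address: str
    home_phone: str
    education: List[Education] = field(default_factory=list)  # Employees have Education

@dataclass
class CriticalTechnology:
    name: str

@dataclass
class Client:
    name: str

@dataclass
class ResearchProject:
    description: str
    target_platform: str
    sponsor: Client                                                # a Client sponsors Projects
    uses: List[CriticalTechnology] = field(default_factory=list)  # some Projects use Critical Technologies
    staff: List[Employee] = field(default_factory=list)           # Employees work-on Projects
```

This skeleton is referred to informally when the classification and design examples below speak of entities, attributes, and associations.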

3.2 THE FRAMEWORK

The framework, shown in Figure 3.3, is an extension to the database security engineering process
described in Section 2.4. The framework provides the context to discuss how inference and
aggregation are applicable at each step in the development process. Given that inference and
aggregation are inherently application-dependent, examples, based on the ATC example in the
previous section, are given to illustrate the concepts. This framework with examples provides a
basis to illustrate and compare the proposed approaches described in Sections 4 and 5.

3.2.1 Information Protection Requirements and Resulting Vulnerabilities

There are at least three types of information protection requirements: confidentiality, integrity, and
availability. Inference and aggregation relate exclusively to the confidentiality objective of
security. Therefore, although integrity and availability protection requirements are also important,
this document will only address confidentiality requirements.

Given a set of information confidentiality protection requirements, there are two vulnerabilities
that must be addressed by appropriate controls: direct disclosure and inference. Both of these
vulnerabilities were discussed in Section 2.2. Aggregation is not yet a problem, because the
protection requirements address specific facts or information that must be protected as opposed to
a protection requirement for an aggregate of data.

3.2.2 Classification Rules

Classification rules for information are stated such that direct disclosure will be prevented, and the
ability to infer unauthorized information is reduced to an acceptable level. Preventing direct
disclosure is fairly straightforward: state classification rules for entities, attributes, and/or
associations and then restrict access based on user clearance and data classification. Unfortunately,
in practice, protection requirements can be so complex that ensuring that a set of classification rules
is correct (i.e., consistent and complete) can be difficult.

Each of the following types of classification rules from Figure 3.3 is described in the following
paragraphs:

· entity;

· attribute;

· data association; and

· cardinal aggregation.



Figure 3.3: Inference and Aggregation Framework

Table 3.1 summarizes this section by showing a taxonomy of types of confidentiality protection
requirements along with examples of protection requirements and supporting classification rules
for the ATC example. The ATC classification rules from Table 3.1 are used to illustrate the types
of classification rules described in this section. The types of protection requirements are a synthesis
of those found in papers by Smith, Pernul, and Burns, [Smith 90a; Pernul 93; Burns 92] and are
meant to be illustrative of the primary types of protection requirements but not necessarily an
exhaustive taxonomy.



Entity Classification

1a. Existence of all instances of an entity

Example protection requirement:

· the fact that ATC is involved with Critical Technologies must be protected at LOW

Example classification rule:

· the existence of all instances of the Critical Technologies entity is classified LOW

1b. Existence of selected instances of an entity

Example protection requirement:

· the fact that a Research Project involves Critical Technologies must be protected at
HIGH

Example classification rules:

· the existence of Research Projects that use Critical Technologies is classified HIGH

· the existence of instances of all other entities is UNCLASSIFIED

2a. Identity of all instances of an entity

Example protection requirements:

· the identity of Critical Technologies must be protected at HIGH

· the identity of Clients of Research Projects must be protected at LOW (this is a
requirement of the client organization)

Example classification rules:

· the value of all instances (i.e., all attributes) of the Critical Technologies entity is
classified HIGH

· the value of all instances (i.e., all attributes) of the Client entity is classified LOW

2b. Identity of selected instances of an entity

Example protection requirement:

· selected Research Projects will be protected at LOW based on the contents of Project
Description as determined by the classification authority

Example classification rules:

· all attributes of the Research Project entity are classified LOW when the Description
attribute contains LOW information

· the identity of instances of all other entities is UNCLASSIFIED

Attribute Classification

3. Existence of an attribute (could also be of selected instances of an attribute)

Example protection requirement:

· the fact that specific Target Platforms are identified for Research Projects is to be
protected at HIGH

Example classification rules:

· the existence of the Target Platform attribute of Research Project is classified HIGH

· all instances of the Target Platform attribute are classified HIGH

· the existence of instances of all other attributes is UNCLASSIFIED except privacy
information

4. Value of an attribute

Example protection requirement:

· privacy information for employees (i.e., home address and phone number) is to be
protected at LOW

Example classification rule:

· the value of the privacy information attributes of employees is classified LOW

Data Association

5. The association between two entities or two attributes of different entities

Example protection requirement:

· the association between a specific project and the sponsoring client must be protected
at HIGH

Example classification rules:

· the association between the Project entity and the Client entity is HIGH

· the association between all other entities is UNCLASSIFIED

6. The association between two attributes of the same entity

Example protection requirement:

· the association between name and salary is to be protected at LOW

Example classification rule:

· the association between the Salary attribute of the Employee entity and Employee
identifying attributes (name, number) is LOW

Cardinal Aggregation

7. A collection of instances of the same entity

Example protection requirement:

· the identity of Critical Technologies must be protected at HIGH

Example classification rules:

· the education entity and all its attributes for all employees are UNCLASSIFIED

· the collection of the Degree and Major attributes for all employees is LOW

Table 3.1: Taxonomy of Protection Requirements and Classification Rules

Entity classification rules can address the need to protect both the identity as well as the existence
of instances of entities of interest. In some operational environments, the fact that an entity (or
some selected instances of an entity) exists would disclose, directly or through an inference,
classified information. In many other operational environments, knowing that something classified
exists is not a problem, but one must not be able to learn the classified information (i.e., its identity
or value). A good example is a compendium of documents including some that are classified. The
reader can know that the documents exist without having the clearance to read them. Entity
classification rules from the ATC example include:

· The existence of all instances of the Critical Technologies entity is classified LOW.

· The values of all instances (i.e., all attributes) of the Critical Technologies entity are
classified HIGH.

· All attributes of the Research Project entity are classified LOW when the Description
attribute contains LOW information.

Attribute classification rules are similar to entity classification, except that they address specific
attributes of an entity. Attribute classification rules from the ATC example include:

· The existence of the Target Platform attribute of Research Project is classified HIGH.

· Instances of the Employee entity are UNCLASSIFIED, but the value (not existence) of
two attributes, Home Address and Home Phone#, are classified LOW.

Data Association classification rules are stated when a relationship between entities or attributes is
considered more sensitive than information about the entities or attributes. This type of
classification rule implements inter- and intra-entity aggregation as discussed in Section 2. The
ATC example includes these data association rules:

· The association between a specific project (classified LOW) and the sponsoring client
(classified LOW) must be protected at HIGH.

· The association between an employee and the employee's salary is to be protected at
LOW.

These three types of classification rules (entity, attribute, and data association), along with
appropriate access control rules (e.g., the military classification system) will define the basis for
preventing direct disclosure. Direct disclosure can be prevented if the classification rules correctly
implement the protection requirements (i.e., are consistent and complete) and if the AIS correctly
implements the rules.
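The enforcement step described above can be sketched with the two-level LOW/HIGH labels used throughout this section. The relation contents below are invented for illustration; the point is only that a read returns a tuple when, and only when, the user's clearance dominates the tuple's label:

```python
# Illustrative sketch: tuple-level labeling with reads filtered by dominance.
# The sample values are invented; only the filtering rule matters.

ORDER = {"UNCLASSIFIED": 0, "LOW": 1, "HIGH": 2}

# Labeled relations: each tuple carries its access class.
critical_tech = [
    {"name": "technology-1", "label": "HIGH"},
    {"name": "technology-2", "label": "HIGH"},
]
clients = [{"name": "client-1", "label": "LOW"}]

def select(relation, clearance):
    """Return only tuples whose label is dominated by the user's clearance."""
    return [t for t in relation if ORDER[t["label"]] <= ORDER[clearance]]

print(len(select(critical_tech, "LOW")))   # 0: values are HIGH, no direct disclosure
print(len(select(clients, "LOW")))         # 1: Client values are LOW
```

This mechanism prevents direct disclosure but, as the next paragraphs explain, does nothing by itself about inference from the tuples a user is allowed to see.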

On the other hand, eliminating inferences is difficult, if not impossible, in that an individual's
capability to infer information is based on their already acquired knowledge and the innovative
ways in which information can be combined [Morgenstern 88]. Much of the knowledge that may
be used to infer unauthorized information is normally outside of the system in the form of domain-
specific knowledge or even common sense. At one end of the spectrum, one can ensure that no
unauthorized inferences are possible by classifying all information at HIGH and then giving all
users a HIGH clearance. Although this approach solves the inference problem (since every user is
authorized to access all information in the system), we now have introduced new vulnerabilities
(e.g., users have clearances they should not have, and therefore access to information they should
not see) and potentially decreased the operational efficiency of the organization (e.g., having to
manually downgrade a large percentage of the information produced by the system).

Establishing classification rules that will reduce unwanted inferences involves a constant trade-off
between the desire to raise the classification of data to reduce inferences and the desire to maintain
or even lower the classification of data to improve operational efficiency without directly
disclosing classified data to those not cleared to see it. This process of evaluating the relative
advantages of over-classifying information to prevent inferences is part of the risk analysis activity
identified in Section 2.3.

At this point, a fourth type of classification rule, cardinal aggregation, is often added. Cardinal
aggregation rules address the classification of collections of instances of the same entity. Often,
these rules are established to reduce inference. At this point in the framework we can conclude the
high-level relationships between inference and aggregation are:

· inference is a vulnerability, i.e., authorized users can deduce information not authorized
by the protection requirements, and

· aggregation is a class of classification policies that are stated as one form of control, in
many cases, to reduce the inference vulnerability.

The ATC example includes a cardinal aggregation classification rule:

· the education entity and each of its attributes for all employees are individually
UNCLASSIFIED, but

· the collection of the Degree and Major attributes of all instances of the education entity is
LOW.

The ATC cardinal aggregation classification rule illustrates the difficulty of enforcing the intent of
the rule, even though we have specified the information to be protected (i.e., the identity of Critical
Technologies must be protected at HIGH). The difficulty is to determine how many employees'
educational information has to be aggregated to allow an inference. Is it all employees on a
particular project? Ten employees? Ten percent of the employees? All employees with blue eyes?
To have an effective implementation of a cardinal aggregation classification rule, someone has to
state explicit criteria for which collections must be considered HIGH. Unfortunately, cardinal
aggregation classification rules introduce a new vulnerability, as discussed in the next section.
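The "explicit criteria" question raised above can be made concrete with a count-based sketch. The threshold value below is an arbitrary assumption, standing in for exactly the criterion the text says someone must state; the education records are invented:

```python
# Illustrative sketch of a count-based cardinal aggregation control.
# N stands in for the explicit criterion the classification authority must
# state; the value 10 is an arbitrary assumption for this example.

ORDER = {"UNCLASSIFIED": 0, "LOW": 1, "HIGH": 2}
AGG_LABEL = "LOW"   # label the cardinal aggregation rule assigns to the collection
N = 10              # assumed threshold defining "the collection"

# Individually UNCLASSIFIED education records (invented sample data).
education = [{"employee": i, "degree": "PhD", "major": "EE"} for i in range(50)]

def query_education(rows, requested_ids, clearance):
    result = [r for r in rows if r["employee"] in requested_ids]
    # Each record is UNCLASSIFIED, but a collection of N or more is treated
    # as classified at AGG_LABEL and withheld from lower-cleared users.
    if len(result) >= N and ORDER[clearance] < ORDER[AGG_LABEL]:
        raise PermissionError("collection of Degree/Major records exceeds threshold")
    return result

print(len(query_education(education, set(range(5)), "UNCLASSIFIED")))   # 5: below threshold
print(len(query_education(education, set(range(20)), "LOW")))           # 20: user cleared for aggregate
```

The sketch also exposes the weakness the text goes on to discuss: nothing stops an UNCLASSIFIED user from assembling the aggregate through many small queries.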

At this point it is appropriate to distinguish between an aggregation policy, such as those discussed
above, and a Chinese Wall policy. Brewer and Nash introduced an important commercial security
policy called the Chinese Wall policy [Brewer 89]. They described the policy as follows:

It can most easily be visualized as the code of practice that must be followed by a market
analyst working for a financial institution providing corporate business services. Such an
analyst must uphold the confidentiality of information provided to him by his firm's clients;
this means that he cannot advise corporations where he has insider knowledge of the plans,
status, or standing of a competitor. However, the analyst is free to advise corporations which
are not in competition with each other, and also draw on general market information.

This type of policy seems to involve aggregates of information and therefore is sometimes
considered an "aggregation problem." In fact, the Chinese Wall policy does not meet the definition
of an aggregation problem; there is no notion of some information being sensitive with the
aggregate being more sensitive. The Chinese Wall policy is an access control policy where the
access control rule is not based just on the sensitivity of the information, but is based on the
information already accessed. The situation can be considered similar to having compartmented
data and subjects which can be cleared for exactly one compartment. The choice of which
compartment the subject is cleared for is effectively made when the first compartmented data is
viewed.
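The history-based character of the Chinese Wall rule, as contrasted above with sensitivity-based aggregation policies, can be sketched briefly. The company and conflict-class names are invented for illustration:

```python
# Illustrative sketch of the Chinese Wall access rule [Brewer 89]: access is
# denied not by sensitivity label but by what the analyst has already read.
# Company and conflict-of-interest class names are invented.

CONFLICT_CLASS = {"Oil-A": "oil", "Oil-B": "oil", "Bank-X": "banking"}

def chinese_wall_read(history, company):
    """Allow reading a company's data unless the analyst has already read
    data of a *different* company in the same conflict-of-interest class."""
    for seen in history:
        if CONFLICT_CLASS[seen] == CONFLICT_CLASS[company] and seen != company:
            return False
    history.add(company)
    return True

h = set()
print(chinese_wall_read(h, "Oil-A"))   # True: first access fixes the choice
print(chinese_wall_read(h, "Bank-X"))  # True: different conflict class
print(chinese_wall_read(h, "Oil-B"))   # False: competitor of Oil-A already read
```

The third call illustrates the point made above: the effective "compartment" is chosen by the first access, not by any label on the data.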

3.2.3 Vulnerabilities from Classification Rules

Figure 3.3 identifies three residual vulnerabilities given a set of classification rules designed to
prevent direct disclosure and reduce inference (i.e., the vulnerabilities of information
confidentiality protection requirements):

· Flawed Rules that allow Direct Disclosure. With complex operational environments, the
classification rules can be flawed with respect to how faithfully they implement the
information confidentiality protection requirements. They can be inconsistent, incomplete,
or simply incorrect, such that they allow direct disclosure.

· Flawed Rules that allow an Unintended Inference. Although the entire set of classification
rules is intended to prevent unwanted inferences, a complex set of classification rules may
leave opportunities for unintended inferences. Again, this may be caused by rules that are
inconsistent, incomplete, or incorrect.

· Cardinal Aggregation. If a cardinal aggregation rule has been stated that classifies an
aggregate as LOW but individual elements as UNCLASSIFIED, then a vulnerability has
been added, since UNCLASSIFIED users may be able to assemble aggregates in an
unintended or undetectable way.
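The first two vulnerabilities stem from inconsistent or incomplete rule sets, and the simplest form of inconsistency, two rules assigning different labels to the same object, can be checked mechanically. The rule representation below (glob-style patterns over entity.attribute names) is an assumption made purely for illustration:

```python
# Illustrative sketch: flagging classification rules that assign two different
# labels to the same data object. The pattern-based rule representation is an
# assumption of this sketch, not a form used by any particular system.
import fnmatch

rules = [
    ("Critical Technologies.*", "HIGH"),   # value of all attributes is HIGH
    ("Client.*", "LOW"),
    ("Client.name", "UNCLASSIFIED"),       # deliberately inconsistent with the rule above
]

def labels_for(obj, rules):
    """All labels any rule assigns to the named object."""
    return {lvl for pat, lvl in rules if fnmatch.fnmatch(obj, pat)}

def inconsistencies(objects, rules):
    """Objects that receive more than one label from the rule set."""
    return {o: labels_for(o, rules) for o in objects
            if len(labels_for(o, rules)) > 1}

objs = ["Critical Technologies.name", "Client.name", "Client.address"]
print(inconsistencies(objs, rules))   # flags Client.name (two different labels)
```

Completeness (no object left unlabeled) and unintended inference are harder to check and, as the research surveyed in Sections 4 and 5 shows, generally require application-specific knowledge.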

3.2.4 Database Design

Database designers and application developers attempt to implement classification rules by
accurately representing and attaching security labels to database objects. The Relational Model
[Codd 70], one of several data models, is the model of choice for existing commercial MLS
products. Figure 3.4 shows a relational model representation of the ATC example using tuple-level
labeling.

The following are examples of how each type of classification rule can be implemented in a
database design:

· The rule, "the existence of all instances of the Critical Technologies entity is classified
LOW," is implemented by classifying the Critical Technologies table metadata at LOW.
This is done by the database designer.

· The rule, "the value of all instances of the Critical Technologies entity is classified
HIGH," is implemented by labeling each tuple in the Critical Technologies table at HIGH.
This is enforced by the user entering the data.

· The rule, "all attributes of the Research Project entity are classified LOW when the
Description attribute contains LOW information," is implemented by having tuples
labeled UNCLASSIFIED or LOW based on the content of the Description attribute. This
is enforced by the user entering the data.

· The rule, "the existence of Research Projects that use Critical Technologies is classified
HIGH," is implemented by classifying each Research Project tuple at HIGH when the
project uses a Critical Technology. This is enforced by the user entering the data.

· The rule, "the existence of the Target Platform attribute of Research Project is classified
HIGH," is implemented by decomposing the Target Platform attribute (which would
normally be an attribute in the Research Project table) into a separate table, Project-Target.
All tuples in Project-Target are classified HIGH. Note that the metadata can also be
classified HIGH to prevent obtaining the knowledge that projects have a target platform
that could contribute to unauthorized inferences.

· The rule, "all instances of the Target Platform attribute are classified HIGH," from an
implementation standpoint, is redundant with the previous rule. Thus, this rule requires no
changes to the database design.

· The rules for data association between the Project and Client entities are, "the value of all
instances (i.e., all attributes) of the Client entity is classified LOW," "the value of all
instances of the Project entity (except those that use Critical Technologies) are classified
LOW," and "the association between the Project entity and the Client entity is HIGH."
These rules are implemented by labeling tuples (and metadata) in the Client table as LOW,
labeling tuples in the Project table as LOW (except those that are labeled HIGH because
they use Critical Technologies), and constructing an additional table, Client-Project, to
represent the data association between Clients and Projects. The Client-Project table
(metadata and tuples) is classified HIGH. Note that in the absence of a data association
rule, the association between Project and Client tables would be represented by a foreign
key from the Project table to the Client table.



Figure 3.4: ATC Example Database Design

· The data association rule, "the association between the Salary attribute of the Employee
entity and Employee identifying attributes (name and number) is LOW," is implemented by
decomposing the Salary attribute into a separate table, Employee-Salary. All tuples in the
Employee-Salary table are labeled LOW.

· The cardinal aggregation rule established by the rules, "the education entity and all its
attributes for all employees are UNCLASSIFIED" and "the collection of the Degree and
Major attributes of the education entity for all employees is LOW," is implemented by
labeling all tuples in the Employee table as UNCLASSIFIED and requiring some
(unspecified) mechanism to control access to multiple records.
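The decompositions described in the bullets above can be sketched as a small labeled schema: base Employee tuples stay UNCLASSIFIED while the sensitive Salary and Target Platform attributes are split into separately labeled tables. The table names follow the bullets; the sample tuples and values are invented:

```python
# Illustrative sketch of the decompositions described above: sensitive
# attributes are moved into separate tables that carry their own labels.
# Sample tuples are invented for illustration.

ORDER = {"UNCLASSIFIED": 0, "LOW": 1, "HIGH": 2}

# Base Employee tuples are UNCLASSIFIED; the Salary association and the
# Target Platform attribute live in separately labeled tables so that their
# association and existence, respectively, can be protected.
employee        = [({"number": 1, "name": "Smith"}, "UNCLASSIFIED")]
employee_salary = [({"number": 1, "salary": 90000}, "LOW")]
project_target  = [({"project": "project-1", "platform": "platform-1"}, "HIGH")]

def visible(table, clearance):
    """Tuples whose label is dominated by the user's clearance."""
    return [row for row, label in table if ORDER[label] <= ORDER[clearance]]

# An UNCLASSIFIED user sees employees but neither salaries nor targets;
# a LOW user additionally sees salaries, but still no target platforms.
print(len(visible(employee, "UNCLASSIFIED")))        # 1
print(len(visible(employee_salary, "UNCLASSIFIED"))) # 0
print(len(visible(project_target, "LOW")))           # 0
```

The separate tables are what make it possible to label the association (Employee-Salary) and the attribute's existence (Project-Target) independently of the base entities.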

3.2.5 Vulnerabilities from Database Design

Given a database design that attempts to faithfully represent the established classification rules,
four vulnerabilities are identified in Table 3.3:

· Faithful Representation of Flawed Rules A database design that faithfully implements a
flawed set of classification rules will not fix the flawed rules.

· Flawed Design Allowing Direct Disclosure A flawed design may allow a direct disclosure
at the schema level for uniformly classified data objects or at the content level for data
objects where instances may be classified at one of several levels.

· Flawed Design Allowing Unintended Inference The flawed design may allow unintended
inferences at the schema level for uniformly classified data objects, or at the content level
for data objects where instances may be classified at one of several levels.

· Cardinal Aggregation Labeled LOW If the data is labeled LOW, then a vulnerability
exists, since LOW users may be able to access aggregates in an unintended way.

These vulnerabilities are related to the values, application semantics, and the security labels
assigned to data objects, not to the functionality (beyond enforcing a MAC policy) or assurance
provided by the TCB. The TCSEC class of a TCB that enforces MAC has no impact on countering
these vulnerabilities. An A1 TCB is just as effective (or ineffective) as a B1 TCB in countering
these vulnerabilities.

3.2.6 Database Instantiation and Vulnerabilities

For completeness, Table 3.3 shows the database instantiation as the final activity. Databases are
populated with data, and users use the database to conduct their job tasks. The database
administrator or security officer is tasked with ensuring the system is operating in a secure manner.
Assuming clearances are properly assigned to users, two vulnerabilities are identified:

· Inappropriate Labeling Allowing Disclosure Current MLS systems, including DBMSs,
depend on the user being logged in at the access class that correctly represents the
classification of the data being added to the database. If a user with a HIGH clearance
enters HIGH data when the user's current access class is LOW, then HIGH data has been
directly disclosed, since users with only a LOW clearance can access the HIGH data that
is incorrectly marked LOW.

· Inappropriate Labeling Allowing Unintended Inference In a similar manner, a user with a
HIGH clearance may enter data at LOW that will allow other users with only a LOW
clearance to infer HIGH data.

The TCSEC class of a TCB that enforces MAC also has no impact on countering these
vulnerabilities.

3.3 OVERVIEW OF RESEARCH

This section provides a brief overview of the research that will be discussed in Sections 4 and 5.
Each of the papers and research efforts on inference and aggregation addresses one or more parts of
the framework. Table 3.2 lists formal approaches to defining the problems. Table 3.3 maps
inference and aggregation research efforts against the framework in three categories (techniques,
mechanisms, and analysis tools). As Table 3.3 shows, much of the research has been in developing
techniques and designing mechanisms. There has also been limited work in developing tools to
help address inference and aggregation. The remainder of the table shows which vulnerabilities
each of the analysis tools and mechanisms attempt to address.



Inference

Aggregation

Database Partitioning (Denning)

Set Theory (Goguen and Meseguer)

INFER Function (Denning and Morgenstern)

FD and MVD (Su and Ozsoyoglu)

Sphere of Influence (Morgenstern)

Fact Model (Cordingley and Sowerbutts)

Inference Secure (Lin)

Inference Secure (Marks)

Imprecise Inference Control (Hale)

Security Algebra (Lin)

Modal Logic Framework (Cuppens)

Table 3.2: Formal Definitions



Techniques

Mechanisms

Analysis Tools

Classification Rules

Entity and Attribute

Direct Association

Database Design
(Denning, Lunt)

Cardinal
Aggregation

Classification Rule Vulnerabilities

Direct Disclosure

Unintended
inference

Cardinal
Aggregation Rules

Faithfully
implements flawed
classification rules

Table 3.3: Research Summary (Part 1)



Techniques

Mechanisms

Analysis Tools

Data Design Vulnerabilities

Direct Disclosure at
schema level

Secondary Path Analysis
(Hinke, Binns)

Direct disclosure at
content level

Unintended
inference at schema
level

Basic Security Principle
(Meadows)

Semantic Data Models
(Buczkowski, Hinke,
Smith)

Classification
Constraints (Akl)

Constraint Satisfaction
(Morgenstern)

Integrity Constraints
(Denning)

Secondary Path Analysis
(Hinke, Binns)

DISSECT
(Garvey et al.)

AERIE
(Hinke and Delugach)

Constraint Processor
(Thuraisingham et al.)

Hyper Semantic Model
(Marks et al.)

Unintended
inference at content
level

Cardinal
Aggregation-LOW
Data

Separate Structures
(Lunt)

Use of Views (Wilson)

Restrict Ad hoc Queries

Use of Audit

Some or All Data HIGH

Aggregation Constraints
(Haigh and Stachour)

Aggregation Detection
(Motro et al.)

Database Instantiation Vulnerabilities

Labeling allows
disclosure

Constraint Processor
(Thuraisingham et al.)

DB Inference Controller
(Buczkowski)

Secondary Path
Monitoring (Binns)

Polyinstantiation
(Denning)

Auditing Facility
(Haigh and Stachour)

Snapshot Facility
(Jajodia)

Labeling allows
unintended inference

Table 3.3: Research Summary (Part 2)

SECTION 4

APPROACHES FOR INFERENCE CONTROL

Providing a solution to the inference problem is beyond the capability of currently available MLS
database management systems. It is also recognized that a general solution to the inference
problem is not possible given the inability to account for external knowledge and human reasoning
[Binns 94a]. To provide partial solutions, the inference problem must be bounded.

Section 4.1 identifies several efforts by researchers to formally define the inference problem. These
approaches are summarized to provide different perspectives on the problem. Implementing
theoretical inference solutions has proven to be difficult. Section 4.2 describes database design
techniques that could be used to prevent inference problems from occurring. Mechanisms for
inference prevention and detection during run-time are described in Section 4.3. Finally, Section
4.4 describes several tools that assist the database designer to detect and avoid inference channels.

4.1 FORMALIZATIONS

Researchers have proposed numerous ways to characterize or define the inference problem. This
section presents several formal definitions from the ongoing research into the discovery of
fundamental laws that determine whether the potential for undesirable inferences exists.

4.1.1 Database Partitioning

Database partitioning is used by Denning to define the inference problem [Denning 86a]. For each
user U, the data in the database can be partitioned into two sets: a VISIBLE set and an INVISIBLE
set, as in Figure 4.1. The user U is allowed to access only elements from the VISIBLE set, and
is not allowed to know what elements are in the INVISIBLE set. (Note that if VISIBLE is null, the
user has no access to database information and, therefore, the problem is outside the domain of the
DBMS. In what follows, we assume that VISIBLE is not null.) Let KNOWN denote the set of data
elements that is known to U. The set KNOWN is constructed by the user either as a result of
previous queries or as a result of some external knowledge that U possesses. There is no inference
problem if the intersection of the two sets INVISIBLE and KNOWN is empty, as in Figure 4.2.
There is an inference problem if the two sets INVISIBLE and KNOWN intersect, as in Figure 4.3.



Figure 4.1: Partitioning of the Database for a User U



Figure 4.2: Database without an Inference Problem



Figure 4.3: Database with an Inference Problem
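The partitioning test above can be sketched in a few lines of Python. The data elements and
set contents are hypothetical; only the set relationships come from the definition.

```python
# Sketch of Denning's partitioning model [Denning 86a].
database = {"d1", "d2", "d3", "d4"}

visible = {"d1", "d2"}            # elements user U may access
invisible = database - visible    # elements U must not learn about

# KNOWN: results of U's previous queries plus external knowledge.
known = {"d1", "d2", "d3"}        # "d3" comes from outside the DBMS

def has_inference_problem(known, invisible):
    """An inference problem exists iff KNOWN and INVISIBLE intersect."""
    return bool(known & invisible)

print(has_inference_problem(known, invisible))  # True: U knows "d3"
```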

4.1.2 Set Theory

Another formulation of the inference problem has been given by Goguen and Meseguer [Goguen
84]. Consider a database in which each data item is given an access class, and suppose that the set
of access classes is partially ordered. Define the relation > as follows: Given data items x and y,
relation x > y is said to hold if it is possible to infer y from x. The relation > is reflexive and
transitive. A set S is said to be inferentially closed if whenever x is in S and x > y holds, then y
belongs to S as well. Now, for an access class L, let E(L) denote the set consisting of all possible
responses that are classified at access classes dominated by L. There is an inference problem if E(L)
is not inferentially closed.

Goguen and Meseguer do not set forth any one candidate for the relation >; they merely require
that it be reflexive and transitive, and say that it will probably be generated according to some set
of rules of inference (for example, first order logic, statistical inference, monotonic logic, and
knowledge-based inference). They do note, however, that for most inference systems of interest,
determining that A > b (where A is a set of facts and b is a fact) is at best semidecidable; that is,
there is an algorithm that will give the answer in a finite amount of time if A > b, but otherwise
may never halt.
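The closure condition can be made concrete with a small Python sketch. The inference relation
here is a finite, hand-written set of pairs, which sidesteps the semidecidability noted above;
the rules and access classes are hypothetical.

```python
# Sketch of the Goguen-Meseguer formulation [Goguen 84].
# Each pair (x, y) means "y is inferable from x"; reflexive pairs are
# implicit, and the closure computation supplies transitivity.
infer = {("a", "b"), ("b", "c")}

def closure(s, infer):
    """Smallest inferentially closed superset of s."""
    s = set(s)
    changed = True
    while changed:
        changed = False
        for x, y in infer:
            if x in s and y not in s:
                s.add(y)
                changed = True
    return s

# E(L): all responses classified at access classes dominated by L.
e_low = {"a", "b"}
# There is an inference problem iff E(L) is not inferentially closed.
print(closure(e_low, infer) == e_low)  # False: "c" is inferable from "b"
```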

4.1.3 Classical Information Theory

Yet another definition that uses classical information theory has been given by Denning and
Morgenstern [Denning 86a]. Given two data items x and y, let H(y) denote the uncertainty of y, and
let Hx (y) denote the uncertainty of y given x (where uncertainty is defined in the usual
information-theoretic way). Then, the reduction in uncertainty of y given x is defined as follows:

INFER (x > y) = (H(y) - Hx (y)) / H(y)
The value of INFER (x > y) is between 0 and 1. If the value is 0, then it is impossible to infer any
information about y from x. If the value is between 0 and 1, then y becomes somewhat more likely
given x. If the value is 1, then y can be inferred given x.

This formulation is especially useful since it shows that inference is not an absolute problem, but
a relative one. It provides a way to quantify the bandwidth of the illegal information flow. On the
other hand, Denning and Morgenstern point out its serious drawbacks [Denning 86a]:

1. It is difficult, if not impossible, to determine the value of Hx (y).

2. It does not take into account the computational complexity that is required to draw the
inference.

To illustrate the second point, they give the following example from cryptography: With few
exceptions, the original text can be inferred from the encrypted text by trying all possible keys (so
INFER (x > y) = 1 in this case); however, it is hardly practical to do so.
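The INFER measure can be computed directly for a small example. The joint distribution below
is hypothetical; the entropy and conditional-entropy computations follow the standard
information-theoretic definitions used by Denning and Morgenstern.

```python
import math

# Sketch of the Denning-Morgenstern INFER measure [Denning 86a],
# computed over a small hypothetical joint distribution p(x, y).
joint = {("x1", "y1"): 0.50, ("x1", "y2"): 0.00,
         ("x2", "y1"): 0.25, ("x2", "y2"): 0.25}

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def infer(joint):
    """INFER(x > y) = (H(y) - Hx(y)) / H(y)."""
    py, px = {}, {}
    for (x, y), p in joint.items():
        py[y] = py.get(y, 0) + p
        px[x] = px.get(x, 0) + p
    h_y = entropy(py.values())
    # Conditional entropy Hx(y) = sum over x of p(x) * H(y | x).
    h_y_given_x = 0.0
    for x, pxv in px.items():
        if pxv > 0:
            cond = [joint[(xx, y)] / pxv for (xx, y) in joint if xx == x]
            h_y_given_x += pxv * entropy(cond)
    return (h_y - h_y_given_x) / h_y

print(infer(joint))  # strictly between 0 and 1: x narrows y somewhat
```

Here knowing x rules out one value of y half the time, so the value falls strictly between
0 and 1, illustrating that inference is a relative rather than absolute problem.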

4.1.4 Functional and Multivalued Dependencies

Su and Ozsoyoglu have studied inference problems that arise because of the functional and
multivalued dependencies that are constraints over the attributes of a relation [Su 86,87,90].
Functional dependencies are defined below; the reader is referred to Date for a definition of
multivalued dependencies [Date 86].

Let R be a relation scheme defined over a set of attributes U, and let X and Y be subsets of U. A
functional dependency X > Y is said to hold in R if no relation r that is a current instance of R
contains two tuples with the same values for X but different values for Y; that is, given any pair of
tuples t and s of r, t[X] = s[X] implies t[Y] = s[Y].

Su and Ozsoyoglu illustrate how inference problems can arise if certain functional dependencies
are known to the low-level users. If attributes are assigned security labels in a manner that is
consistent with the functional dependencies, then these inference threats can be eliminated. This
process is formalized by Su and Ozsoyoglu.
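The dependency test itself is mechanical, as the following Python sketch shows. The relation
instance and attribute names are hypothetical; only the definition of a functional dependency
is taken from the text above.

```python
def fd_holds(r, x_attrs, y_attrs):
    """True iff no two tuples of r agree on X but differ on Y."""
    seen = {}
    for t in r:
        key = tuple(t[a] for a in x_attrs)
        val = tuple(t[a] for a in y_attrs)
        if key in seen and seen[key] != val:
            return False
        seen[key] = val
    return True

r = [{"Emp": "e1", "Dept": "d1", "Mgr": "m1"},
     {"Emp": "e2", "Dept": "d1", "Mgr": "m1"},
     {"Emp": "e3", "Dept": "d2", "Mgr": "m2"}]

print(fd_holds(r, ["Dept"], ["Mgr"]))  # True: Dept > Mgr holds
# If Dept is labeled LOW while Mgr is meant to be protected, a low-level
# user who knows this FD can recover Mgr values from Dept values.
```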

4.1.5 Sphere of Influence

Morgenstern expands upon the INFER function (described in Section 4.1.3) to propose a
theoretical foundation for inference [Morgenstern 87,88]. He proposes a framework for the
analysis of logical inferences and for the determination of logical inference channels. Morgenstern
introduces the concept of a sphere of influence (SOI) and inference channel.

The SOI delimits the scope of possible inferences given some base data, called the core, which may
consist of entities, attributes, relationships, and constraints. Specifically, the sphere of influence
relative to some core, i.e., SOI (core), is the set of all information that can be inferred from the core.
The SOI is defined in terms of the INFER function. The SOI models the process by which a user's
knowledge of an application can give rise to inferences about additional information. The SOI
utilizes a forward chaining inference process from the given core to determine the scope of such
inferable data.

Morgenstern's concept of inference channel serves to isolate the lower level data which could give
rise to inferences about higher level data H. The computation of an inference channel is a
backward-chaining inference process from some resultant data H to determine all information
which contributes to upward inferences about H. An inference channel exists if information about
some set of data H may be inferred from data in another set C which is at a lower level than H, or
is incomparable relative to H in the classification lattice.
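The forward-chaining computation of an SOI can be sketched as a fixed-point iteration. The
rules below (premise sets yielding a conclusion) and the core items are hypothetical stand-ins
for the entities, attributes, and constraints Morgenstern describes.

```python
# Sketch of a sphere-of-influence computation via forward chaining.
# Each rule is (set of premises, conclusion): if all premises are
# inferable, the conclusion is inferable as well.
rules = [({"project", "client-city"}, "client"),
         ({"client", "contract"}, "budget")]

def soi(core, rules):
    """All information inferable from the core (forward chaining)."""
    inferred = set(core)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= inferred and conclusion not in inferred:
                inferred.add(conclusion)
                changed = True
    return inferred

core = {"project", "client-city", "contract"}
print(soi(core, rules))  # the core plus "client" and then "budget"
```

An inference channel computation would run the complementary backward chase from some
high-level item H toward the lower-level data that contributes to it.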

4.1.6 The Fact Model

Sowerbutts and Cordingley provide a formal definition of inference based on an abstract data
model called the Fact Model [Cordingley 89, 90; Sowerbutts 90]. They identify two aspects of the
inference problem: static inference and dynamic inference. Static inference is knowledge of the
database together with authorized facts that can be used to determine unauthorized facts. Dynamic
inference is where database operations can be used together with knowledge of the structure of the
database to determine unauthorized facts. The Fact Model addresses static inference.

The Fact Model is a model of knowledge described as a set of facts and a set of constraints. The
authors provide a language for specifying application-dependent facts and constraints. The Fact
Model provides a formal statement of the standard constraints, such as functional and multivalued
dependencies, and defines explicitly the inference threat associated with each [Cordingley 89]. The
relationship between facts and subfacts is defined by construction rules. An INFER function is
defined on facts, constraints, and construction rules. Given an arbitrary subset of facts and
constraints, the INFER function identifies all the facts that are implicit in the given subset, and
hence can be inferred by a user. At this point the classification attribute can be added for each fact.

This approach forms the basis for assigning classifications such that any facts inferable from a
given subset of facts whose classifications are dominated by a classification level k also have a
classification level dominated by k. The authors call this a Static Inferentially Secure Database.

The Fact Model also is extended to represent operations such that a Dynamic Inferentially Secure
Database is defined. The Fact Model was developed as an abstraction of the relational model but
can be used to describe any conventional database structure such as network or hierarchical.

4.1.7 Inference Secure

Lin introduces the concept of navigational inference as the process of accessing HIGH data via
navigating through legally accessible LOW data [Lin 92]. Lin states that inference problems exist
if the security classifications of data are inconsistent with some database structures. He identifies
three such inference problems:

· Logical Inference Problems The security classification of a theorem (a derivable formula)
in a formal theory should be consistent with its proof. If not, then logical inference
problems arise.

· Algebraic Inference Problems The security classification of "algebraically derivable data"
should be consistent with its relational algebraic structure. If not, algebraic inference
problems arise.

· Navigational Inference Problems The security classification of data should be consistent
with its navigational operators (which generate navigational paths). If not, navigational
inference problems arise.

Lin defines a multilevel data model as inference secure if:

(1) it is a Bell-LaPadula Model (secure under MAC), and

(2) it is navigational inference free.

Lin provides further descriptions and theorems to support the definitions [Lin 92].

4.1.8 An Inference Paradigm

Marks defines database inference as [Marks 94b]:

Inference in a database is said to occur if, by retrieving a set of tuples {T} having attributes
{A} from the database, it is possible to specify a set of tuples {T'}, having attributes {A'},
where {T'} is not a subset of {T} or {A'} is not equal to {A}.

The definition may be stated as an inference rule: IF ({T}, {A}) THEN ({T'}, {A'}), which may
be denoted as ({T}, {A}) > ({T'}, {A'}). However, it may be possible to retrieve a set of tuples
{T} and the associated attributes {A} from a database and to reason, using data and attributes
outside the database, to arrive at a set of tuples {T'} and attributes {A'} that are again within the
database. That is, it is possible to form a chain of reasoning, ({T}, {A}) > ({T1}, {A1}) > ({T2},
{A2}) >... > ({T'}, {A'}) where some of the tuples and/or attributes are outside the database.
When some of the data or attributes are outside of the database, the chain of reasoning cannot be
followed by the database system. Fortunately it is not necessary to actually follow such a chain of
reasoning in order to control the inference threat. If the database contains the endpoints of the chain
(({T}, {A}) and ({T'}, {A'})) there will exist what is referred to in logic systems as a material
implication relating the two sets.

If a material implication exists such that ({T}, {A}) > ({T'}, {A'}) where ({T}, {A}) is classified
LOW and ({T'}, {A'}) is classified HIGH, then the database can offer no assurance that there does
not exist some chain of inference, using outside knowledge, that can connect the two, enabling a
LOW user to infer HIGH information. If, however, ({T}, {A}) not > ({T'}, {A'}) within the
database, then it can be guaranteed that no chain of inference, using outside knowledge or not,
exists which connects the two sets. That is, the absence of a material implication between two sets
of data is sufficient to guarantee the absence of any chain of reasoning between the sets of data. It
is not necessary for inference control, however, since material implications may be coincidental,
and not related to any reasoning process. These arguments may be reduced to:

Limitations on Database Inference: ({T}, {A}) > ({T'}, {A'}) is an inference rule capable
of being controlled by the database if and only if all the tuples {T} and {T'} are in the database
and all the properties {A} and {A'} are attributes in the database.
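Within the database, a material implication between two attribute sets can be approximated by
checking whether the stored tuples ever contradict it. The relation, labels, and attribute
names below are hypothetical; the LOW-to-HIGH filter reflects Marks' concern that only
implications with both endpoints in the database can be controlled.

```python
# Sketch of detecting controllable LOW-to-HIGH material implications,
# approximated here as single-attribute determinations over stored tuples.
labels = {"Dept": "LOW", "Mission": "HIGH"}
r = [{"Dept": "d1", "Mission": "recon"},
     {"Dept": "d2", "Mission": "logistics"}]

def low_to_high_implications(r, labels):
    """Attribute pairs (a, b) where a's value determines b's value in
    every stored tuple, a is LOW and b is HIGH: candidates for control."""
    attrs = list(labels)
    found = []
    for a in attrs:
        for b in attrs:
            if a == b or labels[a] != "LOW" or labels[b] != "HIGH":
                continue
            mapping = {}
            if all(mapping.setdefault(t[a], t[b]) == t[b] for t in r):
                found.append((a, b))
    return found

print(low_to_high_implications(r, labels))  # [('Dept', 'Mission')]
```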

4.1.9 Imprecise Inference Control

Hale, Threet, and Shenoi introduce a powerful, yet practical, formalism for modeling and
controlling imprecise functional dependency (FD) based inference in relational database systems
[Hale 94]. The existence of an imprecise FD implies that if some tuple components satisfy certain
equivalencies, then other tuple components must exist and their values must be equivalent.
Imprecise FDs can specify constraints on precise and imprecise data. Examples of imprecise FDs
are: "engineers have starting salaries about 40K" and "approximately equal qualifications and
more or less equal experience demand similar salary."

The authors formally define an imprecise FD and, from it, an imprecise inference channel.
Informally, an imprecise inference channel is a chain of
imprecise FDs. To control and ultimately eliminate compromising imprecise inferences, it is
necessary to specify a set of imprecise inference channels considered to be secure. This
compromise specification set is defined by the database administrator. Suspect channels in a
database are compared with these secure channels. A potential security compromise exists when a
suspect channel allows the inference of information which is not coarser than information inferred
by any secure channel. Potentially compromising imprecise inference channels are eliminated by
hiding relation attributes or by "clouding" data manifested by imprecise FDs in the compromising
channels.

4.2 DATABASE DESIGN TECHNIQUES

The problem of deciding how to label multilevel database objects - data, schemata, and constraints
- should not be a problem for cleared database designers and users [Millen 91]. They ought to know
the classification of whatever they are entering into the database. Unfortunately, many inferences
are not simple ones, and the number and complexity of potential inferences can be quite large.
This raises the question of how well a person can anticipate such inferences in order to classify
the data [Morgenstern 88]. The following subsections examine techniques for avoiding inference
problems in the first place, and for detecting inference channels during database design.

4.2.1 Inference Prevention Methods

Ideally, if all unauthorized disclosures are to be prevented, the Basic Security Principle [Meadows
88a, 88b] should be followed:

Basic Security Principle for Multilevel Databases - The security class of a data item should
dominate the security classes of all data affecting it.

The reason for the Basic Security Principle is clear: if the value of a data item can be affected by
data at levels not dominated by its own level, information can flow into the data item from data at
other levels.
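Checking the Basic Security Principle over a known derivation structure is straightforward.
The labeled items and the "affected by" relation below are hypothetical; the dominance check
uses a simple total order of levels.

```python
# Sketch of a Basic Security Principle check: each derived item's class
# must dominate the classes of all items affecting it.
LEVELS = {"UNCLASSIFIED": 0, "LOW": 1, "HIGH": 2}

label = {"salary": "HIGH", "grade": "LOW", "bonus": "LOW"}
affected_by = {"bonus": ["salary", "grade"]}   # bonus computed from both

def bsp_violations(label, affected_by):
    """Pairs (item, source) where the item's class fails to dominate."""
    return [(item, src) for item, srcs in affected_by.items()
            for src in srcs
            if LEVELS[label[item]] < LEVELS[label[src]]]

print(bsp_violations(label, affected_by))
# [('bonus', 'salary')]: bonus is labeled LOW but depends on HIGH salary
```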

The task of predicting or detecting all inference problems appears to be very difficult. However,
many of these problems can easily be prevented by careful consideration of the data items on which
a data item is predicated. In practice, unfortunately, two problems with the Basic Security Principle
have been identified:

1. The number of potential inferences can be quite large.

2. All possible inferences cannot be anticipated.

The combination of these problems makes the task of appropriately classifying data complex.
However, several techniques that have been developed for dealing with inferences are discussed
below. If information x permits disclosure of information y, one way to prevent this disclosure is
to reclassify all or part of information x such that it is no longer possible to derive y from the
disclosed subset of x. There are two approaches to dealing with violations of this type: reclassify
either the data or the constraints appropriately. These approaches are discussed next.

4.2.1.1 Appropriate Labeling of Data

One of the approaches suggested to handle the inference problem is to design the multilevel
database in such a way that the Basic Security Principle is maintained [Binns 92a; Burns 92; Hinke
92; Garvey 93; Smith 90b; Thuraisingham 90c]. Security constraints are processed during
multilevel database design and subsequently the schemas are assigned appropriate security levels.
Several researchers have proposed that semantic database models be used to detect (and then
prevent) some inference problems [Buczkowski 90; Hinke 88; Smith 90b, 91]. Conventional data
models (such as hierarchical, network, and relational data models) use overly simple data
structures (such as trees, graphs, or tables) to model an application environment. Semantic
database models, on the other hand, attempt to capture more of the meaning of the data by
providing a richer set of modeling constructs. Since integrity and secrecy constraints can be
expressed naturally in semantic database models [Smith 90b], they can be used to detect inference
problems during the database design phase.

Classification (or secrecy) constraints can also be used to eliminate inference problems [Akl 87].
Classification constraints are rules that are used to assign security levels to data as they are entered
into the database. Classification constraints are required to be consistent (meaning that each value
is assigned a unique security level) and complete (meaning that rules assign each value a security
level). Inconsistent classification constraints indicate potential inference problems, and incomplete
classification constraints point to incomplete labeling rules. Two different methods have been
proposed for determining the consistency and completeness of the classification constraints. One
is based on logic programming [Denning 86b], and the other is based on computational geometry
[Akl 87].
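The consistency and completeness conditions can be checked by exhaustive evaluation when the
rules are expressed as predicates, as in this sketch. The rules and values are hypothetical;
the two published methods cited above use logic programming and computational geometry rather
than brute-force enumeration.

```python
# Sketch of checking classification constraints for consistency (no
# value receives two levels) and completeness (every value receives one).
def check_rules(rules, values):
    """rules: list of (predicate, level). Returns (inconsistent, incomplete)."""
    inconsistent, incomplete = [], []
    for v in values:
        levels = {level for pred, level in rules if pred(v)}
        if len(levels) > 1:
            inconsistent.append(v)
        elif not levels:
            incomplete.append(v)
    return inconsistent, incomplete

rules = [(lambda v: v["tech"] == "critical", "HIGH"),
         (lambda v: v["tech"] != "critical", "LOW")]
values = [{"tech": "critical"}, {"tech": "standard"}]

print(check_rules(rules, values))  # ([], []): consistent and complete
```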

Morgenstern views the overall process of classifying a database as a constraint satisfaction
problem of the following form [Morgenstern 88]. Each potential inference which involves one or
more relations or objects (data) from the database serves as a constraint. It limits the classifications
which can be assigned to an object given the classification(s) of the other object(s). In some cases,
the constraint may uniquely determine the classification of the remaining data object. Morgenstern
defines safety for inference: a data object O, and its assigned classification label, are said to be safe
for inference if there is no upward inference possible from O. That is, object O cannot be used to
infer information about other data objects at higher or incomparable levels of the classification
lattice. A classification level is safe for inference if all the data objects having that classification
are safe.

The techniques described in this section rely on increasing the classification of one or more
elements in an offending path following its detection. Note that in practice, however, it is not
always possible to solve inference problems by raising the classification level of data that cause
inference violations. Some data may be public knowledge, and therefore, cannot be reclassified. In
some cases, certain data must be made public due to legal requirements or national policy. Binns
also points out that an upgraded element may open apparently new inference channels [Binns 94b].

4.2.1.2 Appropriate Labeling Integrity Constraints

There are two ways of handling an explicit constraint defined over several data fields. One is to
leave the constraint UNCLASSIFIED, but to identify the LOW data that, together with the
constraint, will allow users to infer information about the HIGH data, and to reclassify that LOW
data upwards. This approach satisfies the Basic Security Principle, but at the cost of reducing data
availability to low-level users. The other approach is to classify the constraint itself at the least
upper bound of the security classes of the data over which it is defined. This approach satisfies the
Basic Security Principle and allows low-level users access to data, but at the possible cost of data
correctness; data entered by a low-level user may not satisfy the constraint. This problem can be
partially solved by allowing polyinstantiation to take place [Denning 87] and is briefly discussed
in Section 4.3.2. When data at a LOW security level is entered, it is not checked against the
constraint. However, the LOW data does not replace the HIGH data; both are stored in the
database. Such an approach will guarantee correctness from the point of view of the high user.
Inference controls mean that complete data correctness is not always possible for LOW users.

Since implicit constraints are always available to users whether or not they are allowed access to
them in the database, it is useless to attempt to reduce inferences by classifying implicit constraints.
Therefore, the data over which the constraints are defined must be uniformly classified. Since the data
over which the constraints are defined may be overlapping, it is difficult to assign security levels
in a consistent manner while limiting the damage to data availability. For example, Su has shown
that the problem of classifying data so that data availability is maximized, and the security class of
each key in an implicit functional dependency dominates the classes of all data depending on that
key, is NP-complete [Su 86] (see also [Su 87,90]).

4.2.2 Inference Detection Methods

Inference through secondary path analysis can be characterized as multiple paths between two
attributes such that the paths' security levels are inconsistent [Binns 92a]. Given a large enough
database, it is unlikely that a database designer can be aware of all possible interactions between
tables. Alternative, and hence secondary, paths may exist which ultimately relate two attributes in a
manner unforeseen by the designer. If the classification of the secondary path is different from the
primary path, then a classification inconsistency exists. Classification inconsistency is not
acceptable regardless of which path is conceived to be "primary". Binns describes an approach that
aids the database designer by identifying potential aggregates and inference paths automatically.
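Secondary-path detection reduces to enumerating paths between two attributes in a schema graph
and comparing path classifications. The edges, labels, and least-upper-bound rule below are
hypothetical simplifications of the approach described by Binns.

```python
# Sketch of secondary-path analysis: edges (joins/foreign keys) carry
# classifications; two paths between the same attributes whose least
# upper bounds differ signal a classification inconsistency.
LEVELS = {"LOW": 0, "HIGH": 1}
edges = {("Emp", "Dept"): "LOW", ("Dept", "Bldg"): "LOW",
         ("Emp", "Phone"): "LOW", ("Phone", "Bldg"): "HIGH"}

def paths(src, dst, edges, seen=()):
    """Yield (node list, path classification) for all simple paths."""
    if src == dst:
        yield [src], "LOW"        # empty path: bottom of the lattice
        return
    for (a, b), lvl in edges.items():
        nxt = b if a == src else a if b == src else None
        if nxt is None or nxt in seen:
            continue
        for tail, tlvl in paths(nxt, dst, edges, seen + (src,)):
            lub = lvl if LEVELS[lvl] >= LEVELS[tlvl] else tlvl
            yield [src] + tail, lub

found = list(paths("Emp", "Bldg", edges))
levels = {lvl for _, lvl in found}
print(len(levels) > 1)  # True: Emp-Dept-Bldg is LOW, Emp-Phone-Bldg is HIGH
```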

The use of conceptual structures to handle the inference problem was first proposed by Hinke
[Hinke 88]. A semantic network can be regarded as a collection of nodes connected via links. The
nodes represent entities, events, and concepts. The links represent relationships between the nodes.
Numerous variations of semantic networks have been proposed, including the conceptual graphs
of Sowa, and are called conceptual structures [Sowa 84]. Conceptual structures are useful in that
they can represent and reason about the real world in a manner similar to human reasoning. The database security research
community became interested in conceptual structures because they can be used to represent a
multilevel application [Thuraisingham 91].

Hinke showed how inference may be detected by traversing alternate paths between two nodes in
the graph. Graph traversal in semantic nets corresponds to limited inference. The classification of
implied links found by graph traversal is determined by the classifications of the traversed links.
The technique enables detection of simple inferences via the transitivity property, but does not
enable the detection of more complex inferences. Others since have also shown how to represent
multilevel database applications using conceptual structures developed for knowledge-based
system applications and subsequently reasoning about the application using deduction techniques
[Thuraisingham 90f, 91; Delugach 92, 93; Hinke 92; Garvey 92].

Auxiliary semantic nets can be used to express security constraints. Smith suggests extensions to
Urban's semantic data model to represent multilevel applications [Smith 90b; Urban 90]. Smith
states that eventually the representation could be translated into statements of a logic programming
system that could then be used to detect inferences.

4.3 RUN-TIME MECHANISMS

The previous section contained many different methods of controlling inferences through careful
database design. The methods surveyed in Section 4.2 are, for the most part, application dependent.
For example, it was seen that using proper database design, using the designer's knowledge about
the data, and classifying data at an appropriate level all help with inference control. These activities
occur outside a DBMS and, therefore, are not within its sphere of control.

However, certain run-time mechanisms can be provided to help with other approaches. These
mechanisms are listed below and discussed in the remainder of the section.

1. Query response modification. Users can build reservoirs of knowledge from responses
that they receive by querying the database and it is from this reservoir of knowledge that
they infer unauthorized information. Two methods to address this problem are 1) to
restrict queries by modifying the query appropriately during the query evaluation phase,
and 2) to modify query responses (see Section 4.3.1).

2. Polyinstantiation. Some inference threats arise due to key integrity, which is a basic
integrity requirement of the relational model, and these violations can be eliminated if the
DBMS allows the use of polyinstantiation (see Section 4.3.2).

3. Audit Facility. Since auditing can be used to detect inferences, a DBMS should provide
an adequate audit facility so user activities can be monitored and analyzed for possible
inference violations (see Section 4.3.3).

4. Snapshot Facility. There are situations in which it is desirable to release some HIGH
information to LOW users for reasons of convenience or necessity (as in the case of the
U.S. Bureau of the Census described in Section 4.3.4). In these cases, snapshots can be
used effectively to eliminate or minimize the harm done by release of HIGH information.
Thus, it is desirable that multilevel secure DBMSs provide a snapshot facility as well.

4.3.1 Query Response Modification

To block an inference channel during a session, user queries can be restricted or the query results
modified. The value of query restriction as a control is diminished by the finding that, through
certain sequences of seemingly innocuous unrestricted queries called trackers, the answers to
restricted queries can always be deduced [Denning 78]. Therefore, response modification, which introduces
uncertainty so as to reduce the risk of disclosure, has been the focus of much run-time inference
control research.
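As an illustration of response modification in a statistical setting, the following sketch suppresses counts derived from too few records and rounds the rest to introduce uncertainty. The threshold and rounding interval are illustrative assumptions, not values drawn from the literature cited above.

```python
# Hedged sketch of response modification: rather than refusing a restricted
# statistical query outright (which trackers can defeat), the system answers
# only in a disclosure-limited form.
def modified_count(records, predicate, threshold=3, rounding=5):
    """Answer COUNT(predicate), suppressed or rounded for disclosure control."""
    exact = sum(1 for r in records if predicate(r))
    if exact < threshold:
        return None  # suppress: the answer would identify too few individuals
    # round to a multiple of `rounding` to introduce uncertainty
    return rounding * round(exact / rounding)

people = [{"dept": "A", "salary": s} for s in (30, 40, 50, 60, 70, 80, 90)]
print(modified_count(people, lambda r: r["salary"] > 45))   # rounded count
print(modified_count(people, lambda r: r["salary"] > 85))   # suppressed
```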

4.3.1.1 Constraint Processors

Some MITRE research focuses on handling inferences during query processing [Keefe 89;
Thuraisingham 87, 88, 90a]. The query modification technique is illustrated in Figure 4.4 below
[Ford 90].

MITRE has augmented Hewlett Packard's Allbase with a logic-based Constraint Processor and
knowledge base in their current prototyping effort [Thuraisingham 94]. Security constraints
contained in the knowledge base are utilized by the Constraint Processor to modify queries to
prevent security violations through inference. Factors used by the Constraint Processor for query
modification and response sanitization include 1) the security constraints, 2) the previously fetched
tuples, and 3) external knowledge.
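A minimal sketch of constraint-based query modification in the spirit of the description above: content-dependent security constraints classify certain rows above the user's level, and the user's query is rewritten to exclude them. The constraint representation and example data are assumptions for illustration only.

```python
# Hedged sketch: security constraints drive query modification so that a
# LOW user's query never touches rows a constraint classifies above LOW.
def modify_query(user_level, query_pred, constraints):
    """Conjoin the negation of every constraint above the user's level."""
    blocked = [c["pred"] for c in constraints if c["level"] > user_level]
    return lambda row: query_pred(row) and not any(p(row) for p in blocked)

# Assumed constraint: flights to destination 'X' are HIGH (level 2).
constraints = [{"level": 2, "pred": lambda r: r["dest"] == "X"}]
flights = [{"id": 1, "dest": "X"}, {"id": 2, "dest": "Y"}]

safe = modify_query(1, lambda r: True, constraints)  # user is LOW (level 1)
print([f["id"] for f in flights if safe(f)])         # [2]
```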



Figure 4.4: Query Processor

An Update Constraint Processor was developed to accurately label data as it is loaded into
the database during database update or design. This alleviates the need for the Constraint Processor
to process all constraint types at query time, which could impact query response time. The Update Constraint
Processor is designed to operate with the Constraint Processor to accurately label data associated
with the simple and context-dependent constraints defined for an application.

The prototyping effort has a goal of extending from a centralized to a distributed environment with
trusted DBMSs interconnected via a trusted network. MITRE implemented modules to augment
the Constraint Processors to process security constraints in a distributed environment during the
query and update operations.

In addition to the citations above, results of the MITRE-based research on the inference problem,
including previous prototyping efforts, have been presented at various security conferences and
documented in MITRE technical reports [Thuraisingham 89a, 89b, 90b-f; Collins 91].

4.3.1.2 Database Inference Controller

The Ford Aerospace proposed Database Inference Controller (DBIC) was a knowledge-based tool
or set of tools used by the System Security Officer (SSO) to detect and correct logical inferences
[Buczkowski 89a, 90]. The DBIC system configuration is given in Figure 4.5 [Buczkowski 89b]:



Figure 4.5: DBIC System Configuration

Rules would cause the Inference Engine to take the following actions:

1. Determine which queries (along with relevant data) are presented to the user during
knowledge elicitation.

2. Check consistency and completeness of data provided by the user.

3. Initiate procedures to calculate the probability of inference and identify violations.

4. Determine how the Knowledge Base subsystem is updated with the data received from
the user or procedure output.

5. Recommend classification changes to parameters to correct inference problems.

The DBIC was based on Probabilistic Knowledge Modeling to identify inference, creating and
using a Probabilistic Inference Network built on and integrated with a semantic model of the target
database. The Probabilistic Inference Network is a structure that identifies the logical dependencies
of classified parameters on aggregates of objects at lower classifications.

The Ford Aerospace research culminated in a top-level DBIC design [Buczkowski 89b]. One of
the conclusions of the effort was that aggregation is a special case of the inference problem.

4.3.1.3 Secondary Path Monitoring

Section 4.2.3 describes secondary path analysis as an approach for the database designer to detect
potential inference paths. Binns also notes that removing all potential inference paths is not always
feasible and that run-time analysis based on the potential inference channels detected but not
removed during design may be necessary [Binns 92a].

Path definition yields the attributes, relations, and order of traversal used to substantiate the path.
Thus, the definition indicates what the path is. Given the potentially offending inference path,
Binns proposes that run-time analysis could monitor its existence. Each time a relation in the
primary or secondary path is modified, the run-time analysis would fire a trigger to detect if a
classification inconsistency exists. Triggers that monitor the existence of an inference path must
dominate both the primary and secondary paths, or run as a trusted process. The run-time
mechanism can flag an active inference channel to the SSO.

4.3.1.4 MLS System Restrictions

Inference problems can arise if queries are conditioned on data that are supposed to be invisible to
the user. Inference problems of this type can be eliminated by ensuring that all data used in
evaluating a query are dominated by the level of the user. In general, it is difficult to determine
when a given operation can be replaced by an operation defined over data classified at lower
security levels. Thus, it is necessary to develop algorithms for determining when an operation at a
HIGH security level is equivalent to an operation at a LOW security level. Some work has been
done on this problem by Denning [Denning 85].

There is a simpler way to deal with this type of inference problem. In certain implementations, the
system will be unable to access the higher level data, and the query will not pose an inference
problem. For example, a multilevel system rated at class B1 or higher implements a mandatory
security policy that includes the simple security condition. Since only those relations which are
dominated by the user's clearance level (UNCLASSIFIED in this case) will be available to a
subject processing this query, such queries will not result in an inference violation based on access
to higher level data.
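The filtering rule described above can be sketched directly: relations are visible to a query only if the subject's clearance dominates their labels. The three-level linear lattice and the example schema are illustrative assumptions.

```python
# Sketch of simple-security filtering: a query is evaluated only over
# relations whose labels the issuing subject's clearance dominates.
LEVELS = {"UNCLASSIFIED": 0, "LOW": 1, "HIGH": 2}

def dominates(clearance: str, label: str) -> bool:
    """True if `clearance` dominates `label` in a totally ordered lattice."""
    return LEVELS[clearance] >= LEVELS[label]

def visible_relations(schema: dict, clearance: str) -> list:
    """Relations a subject at `clearance` may use when evaluating a query."""
    return [name for name, label in schema.items() if dominates(clearance, label)]

schema = {"EMPLOYEE": "UNCLASSIFIED", "SALARY": "LOW", "PROJECT_CLIENT": "HIGH"}
print(visible_relations(schema, "UNCLASSIFIED"))  # ['EMPLOYEE']
```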

4.3.2 Polyinstantiation

A solution to inference violations that occur because of key integrity enforcement has been
proposed by Denning et al. [Denning 87]. Suppose a user operating at the LOW security level
wants to enter a tuple in a relation in which the data are classified at either the tuple or element
granularity. Whenever a tuple with the same key at a higher security level already exists in the
relation, both tuples are allowed to exist, but the security level is treated as part of the key. Thus,
the tuple entered at the LOW security level and the tuple entered at the higher security level will
always have different keys, since the keys will have different security levels. A multilevel relation
is said to be polyinstantiated when it contains two or more tuples with the same apparent primary
key values. The TDI companion document on Polyinstantiation [Poly 96] reviews this topic in
greater depth.
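The key-handling rule above can be sketched as follows: the security level is treated as part of the primary key, so a LOW insert whose apparent key is already used at HIGH creates a second tuple rather than being rejected (a rejection would signal the HIGH tuple's existence). The relation layout is an assumption for illustration.

```python
# Minimal sketch of polyinstantiation: keys are (apparent_key, level) pairs,
# so the same apparent key may recur once per security level.
def insert(relation, key, level, data):
    """Insert keyed on (key, level); reject duplicates only within a level."""
    if (key, level) in relation:
        raise KeyError("duplicate key at the same level")
    relation[(key, level)] = data

rel = {}
insert(rel, "SS Hidden", "HIGH", {"mission": "spy"})
insert(rel, "SS Hidden", "LOW", {"mission": "survey"})  # allowed: polyinstantiated
print(sorted(rel))  # both tuples coexist under distinct (key, level) keys
```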

Garvey et al. suggest that development of tools for automatic inference channel detection and
cover story generation using polyinstantiation be a long-term goal, to provide automated
assistance for System Security Officers [Garvey 91c].

4.3.3 Auditing Facility

In some systems, auditing is used to control inferences [Haigh 90]. For instance, a history can be
kept of all queries made by a user. Whenever the user makes a query, the history is analyzed to
determine whether the response to this query, when correlated with the responses to earlier queries,
could result in an inference violation. If a violation arises, the systems can take appropriate action
(for example, abort the query).
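The history-based check described above can be sketched as follows: before answering, the system combines the attributes a query would reveal with the user's recorded history and aborts if the combination would complete a designated sensitive association. The "covering set" test and the example association are simplifying assumptions.

```python
# Hedged sketch of audit-history inference checking.
SENSITIVE = {frozenset({"name", "salary"})}  # association that must not leak

def check_and_record(history, user, attrs):
    """Return True and record the query, or False (abort) on a violation."""
    combined = history.get(user, set()) | set(attrs)
    if any(s <= combined for s in SENSITIVE):
        return False  # answering would complete a sensitive association
    history[user] = combined
    return True

h = {}
print(check_and_record(h, "u1", ["name", "dept"]))  # True: no violation yet
print(check_and_record(h, "u1", ["salary"]))        # False: would combine
```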

The LOCK Data Views (LDV) project represents an early attempt at data-level monitoring of
inference [Haigh 90, 91]. The project's goal is to design and build an MLS DBMS hosted on
the Logical Coprocessor Kernel (LOCK). The LOCK system is being developed to satisfy the
TCSEC A1 criteria. It includes a type manager that allows assured pipelines to be built to enforce
application-specific extensions to the system security policy enforced by LOCK. The word
"pipeline" is used because the type manager supports both encapsulation and the security policy
by funneling an operation through subjects in different domains, each of which has access only to
objects of designated types.

The LDV project has used the type manager and assured pipeline feature to address DBMS-
specific issues not covered by the LOCK security policy. This basic security policy is extended by
incorporation of an explicit classification policy. The policy covers name-dependent, content-
dependent, and context-dependent classification policies, as well as inference control [Stachour
90]. LDV maintains a set of history files that keeps track of which files have been accessed in
answering queries from LOW users. As soon as enough files have been accessed by LOW users to
compromise LOW data, LDV upgrades the clearance of the querying process, the classification
of the query result, or the classification of the file being queried. The LDV
project considers aggregation and inference as aspects of the same problem.

Unlike previous methods that seek to prevent inference problems from arising (by reclassifying
data or metadata), auditing seeks to detect inference problems when they are about to occur and to
take preventive action.

There is a side benefit of this approach: it may deter many inference attacks by threatening
discovery of violations. There are three disadvantages of this approach:

1. It may be too cumbersome to be useful in practical situations.

2. It can detect only a very limited class of inferences (since it is based on the hypothesis that
a violation can always be detected by analyzing the audit records for abnormal behavior).

3. The length of time for which history files are kept will impact the amount of audit data
recorded.

Collusion between users where each user retrieves some of the information remains a difficult
threat to counter and is not addressed by this method.

4.3.4 Snapshot Facility

There are situations where it is possible to allow limited inferences [Jajodia 92]. These methods
are useful in cases in which the inference bandwidth is so small that these violations do not pose
any threat. A snapshot can be created based on a sanitized joined relation for LOW users. Although
this snapshot cannot be updated automatically without opening a signaling channel, it can be kept
updated by periodic recreation by a trusted subject. In off-line, static databases, the "snapshot" or
"sanitized file" is an important technique for controlling inferences. In particular, this approach has
been used quite effectively by the United States Bureau of the Census [Cox 86,87; Denning 82].

The Bureau of the Census has the responsibility of releasing collected data for public (or
unrestricted) use; however, the Bureau must do so in such a way that it is not possible to identify
any individual or organization by use of the released data. To achieve both goals, the Bureau uses
a conceptual model that is different from the on-line, general purpose DBMS model [Cox 88].

In contrast to the general-purpose DBMS model, which is shown in Figure 4.6, the Bureau
constructs a separate database that is designed for public use. This public database is the sanitized version of
the Bureau database (see Figure 4.7).



Figure 4.7: Database Management System Model at the U. S. Bureau of the Census

To sanitize the database for public use, the Bureau starts with a microdata file, which typically
consists of a randomly selected sample from the database. All microrecords are first stripped of all
identifiers (such as names, exact addresses, or social security numbers of individuals), and then
different inference control methods are applied to reduce the risk of inference to an acceptable
level. Some of these methods are listed below:

1. Subsampling-A second sample is chosen from the microdata to reduce further the
probability of possible identification of individuals. Also, records having extreme values
are eliminated.

2. Compression-The microrecords are merged together (e.g., income is reported in five-year
intervals rather than every year).

3. Perturbation of data-Noise is added to data (by rounding off numerical values or by
actually inserting bogus records).

4. Slicing-The microdata file is divided into several subfiles.

While these methods work well in static databases (especially for statistical accesses), these
methods must be used with great care in general-purpose DBMSs. Since these methods perturb
the data by adding noise, they have an adverse effect on the integrity of data. Moreover, these
methods may be too time-consuming to apply dynamically to on-line databases (in which both
statistical and non-statistical accesses must be provided).
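The sanitization steps listed above can be sketched as a small pipeline: strip identifiers, subsample, drop extreme values, and compress a numeric field into intervals. Field names, the sampling rate, the cap on extreme values, and the interval width are all illustrative assumptions.

```python
# Hedged sketch of Census-style microdata sanitization.
import random

def sanitize(records, sample_rate=0.5, income_cap=200, interval=25, seed=1):
    rng = random.Random(seed)
    out = []
    for r in records:
        if rng.random() > sample_rate:   # subsampling
            continue
        if r["income"] > income_cap:     # eliminate records with extreme values
            continue
        out.append({
            # identifiers (e.g., name) are stripped entirely
            "region": r["region"],
            # compression: income reported only as an interval bucket
            "income_band": interval * (r["income"] // interval),
        })
    return out

raw = [{"name": "A", "region": "N", "income": 37},
       {"name": "B", "region": "S", "income": 512}]
print(sanitize(raw, sample_rate=1.0))
```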

4.4 DATABASE DESIGN TOOLS

It is a difficult task for the trusted database designer to label data so that the labels accurately reflect
the actual sensitivity of the data and adequately protect the information from inference. This
section describes tools for detecting potential inference paths. Action can then be taken to
minimize or eliminate these inference paths. The tools are based on the techniques described in
Section 4.2.2.

4.4.1 DISSECT

SRI International (SRI) has produced a prototype tool for detecting specific types of inference
channels in a multilevel database. DISSECT (Database Inference System Security Tool) primarily
detects inferences where two relations are connected by multiple paths with different security
classifications. Once these multiple paths are detected, DISSECT constructs the security constraint
formulas that, if satisfied, would eliminate the problem. However, it is up to the user of DISSECT
to ascertain whether the paths represent relations that are semantically similar enough that a
lower-classification path compromises a higher path. Selected constraints can then be satisfied by manually
editing the schema or by an automatic upgrading procedure. A limited set of external information
and assumptions are used.

DISSECT accepts as input a multilevel database schema provided in the form of MSQL relation
definitions (MSQL is the query language of the SeaView prototype MLS DBMS developed by SRI
[Lunt 88]). DISSECT is interactive and allows the user to edit relevant features of the schema.
DISSECT can discover a subclass of deductive inference channels (defined in Section 2.1.1) called
compositional channels and, with the help of user-supplied type declarations, type-overlap
channels:

· Compositional channels - A relationship can be inferred between any pair of entities that
are connected by a sequence of foreign-key relationships. An inference channel may exist
if two relations connected by a high foreign-key relationship are also connected by a path
of one or more low foreign-key relationships. If the two paths between the relations
represent essentially the same relationship, this is certainly an inference channel, since the
relationship is explicitly classified high, but is accessible via an alternative path to a low
user.

· Type-overlap channels - Attributes whose types overlap are joinable. A type-overlap
relationship occurs between two attributes when the two attributes have been declared to
be of the same or overlapping types. This corresponds to the case in which a theorem
proved using a high axiom is also provable from all low axioms. This may not be a
problem, since consequences of high data are not necessarily high.

These channels can be eliminated (by changing attribute classifications) or ignored (e.g., because
they are not really problems, are unavoidable, or should be deferred to data-level handling) at the
discretion of the user. Channels can be eliminated by editing the schema to raise or lower some
attribute classifications. DISSECT also provides an automated means of upgrading attribute
classifications to eliminate all inference channels chosen for elimination. User-specified costs of
upgrading attributes express preferences of how channels should be eliminated.
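A compositional-channel check in the style described above can be sketched by treating relations as nodes and foreign-key relationships as labeled edges: a channel is flagged when two relations linked by a HIGH relationship are also connected by a path of all-LOW relationships. The tiny schema is an assumption, not DISSECT's actual input format.

```python
# Hedged sketch of compositional-channel detection via breadth-first search.
from collections import deque

def low_path_exists(edges, a, b):
    """BFS over LOW foreign-key edges only, treating them as undirected."""
    adj = {}
    for (u, v, lvl) in edges:
        if lvl == "LOW":
            adj.setdefault(u, []).append(v)
            adj.setdefault(v, []).append(u)
    seen, queue = {a}, deque([a])
    while queue:
        n = queue.popleft()
        if n == b:
            return True
        for m in adj.get(n, []):
            if m not in seen:
                seen.add(m)
                queue.append(m)
    return False

def compositional_channels(edges):
    """HIGH relationships that are also reachable via an all-LOW path."""
    return [(u, v) for (u, v, lvl) in edges
            if lvl == "HIGH" and low_path_exists(edges, u, v)]

edges = [("PROJECT", "CLIENT", "HIGH"),     # the classified relationship
         ("PROJECT", "CONTRACT", "LOW"),
         ("CONTRACT", "CLIENT", "LOW")]     # a low alternative path
print(compositional_channels(edges))        # [('PROJECT', 'CLIENT')]
```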

The SRI DISSECT research effort has been completed and documented [Garvey 94]. The
DISSECT prototype has demonstrated its ability to detect and to optimally upgrade some attributes
to automatically eliminate compositional and type-overlap inference channels. Results of the
DISSECT-based research have been presented at various security conferences and workshops
[Stickel 94; Lunt 94; Qian 92, 93a, 93b, 94; Garvey 91a, 91b, 91c, 93; Binns 92a].

4.4.2 AERIE

The University of Alabama in Huntsville has developed an inference model called AERIE
(Activities, Entities, and Relationships' Inference Effects). The project is examining approaches to
inference modeling and detection using inference targets (sensitive data that are to be protected
from unauthorized disclosure through inference). The approach requires that the specific
information that is to be protected be named, and inference paths to this information will be sought.
The approach describes the database schema, the data, and the sensitive target information using
conceptual graphs. It utilizes a knowledge-base of domain-specific information that can be
expressed as a conceptual graph to be used in inference analysis.

A precursor to AERIE was the TRW inference engine analysis and the effort to discover inference
rules from fundamental relationships among data that pertain to a domain [Hinke 90]. In TRW's
Advanced Secure DBMS project, a tool was proposed to identify potential schema-level inference
paths, and queries could be run periodically to detect the existence of data-level inference channels.
One of the results of the effort was that the specific rules required to find the classified relationship
did not have to be explicitly stated. The AERIE research is based on these results and on conceptual
graph structures. AERIE extends the inference techniques suggested by Thuraisingham
[Thuraisingham 90f] by classifying inference targets and using a two-phase inference approach.
Two supporting prototype tools were developed: an inference detection tool called Merlin to
validate the AERIE algorithms; and a rule-based, automated inference database generator called
Genie to facilitate the development of a large, coherent set of inference relevant instances.

Merlin uses functional dependencies to detect second paths with schema-level data. Merlin adapts
a transitive association algorithm for testing a relational database decomposition for the lossless
join property [Elmasri 94]. The Merlin adapted algorithm is used in the detection of second path
inferences within schema-level data and includes subtype relationships between attributes within
the schema. Merlin has demonstrated identification of a secondary path between two attributes and
is able to categorize secondary paths based on schema classification. However, it is recognized that
legitimate inferences are not being detected by Merlin. Future plans of the AERIE project are to 1)
move beyond functional dependency-based inference detection, 2) integrate path detection with
path finding, and 3) adapt Merlin to handle catalog and instance levels of data.
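The functional-dependency flavor of second-path detection can be loosely sketched with the standard attribute-closure computation from relational design texts: a second path from X to Y exists if Y appears in the closure of X under the LOW-visible dependencies even though the direct X-to-Y association is classified HIGH. This is not Merlin's lossless-join-based algorithm; the FD set and labels are assumptions.

```python
# Hedged sketch: second-path detection via attribute closure over LOW FDs.
def closure(attrs, fds):
    """Attribute closure of `attrs` under functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

low_fds = [({"emp"}, {"dept"}), ({"dept"}, {"manager"})]
# The direct emp -> manager association is classified HIGH, but:
print("manager" in closure({"emp"}, low_fds))  # True: a second path exists
```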

Genie was developed to facilitate the creation of an inference benchmark database for the testing
of inference detection tools. The database created by Genie is to contain inferences that represent
some coherent activity in some microworld with the potential to grow to a size representative of a
real database. Genie is encoded with rules to simulate activities of a microworld, providing the
basis for the generation of coherent data.

Results of the AERIE-based research have been presented at various conferences and workshops
[Hinke 92, 93a, 93b, 94; Delugach 92, 93].

4.4.3 Security Constraint Design Tool

MITRE developed a tool for use during database design using an Informix DBMS [Thuraisingham
94]. The tool provides guidance in assigning minimum security classifications to the database
schema as well as establishing minimum security classifications for the database schema of an
existing non-secure database migrating to an MLS environment. The design tool examines simple,
associative-based, logical and level-based security constraints and determines a minimum
classification level for database attributes. This information can be utilized by the database
designer in the design of an MLS database schema.

4.4.4 Hypersemantic Data Model and Language

Section 4 began with the recognition that a general solution to the inference problem is not
possible. The techniques, mechanisms, and tools described in Sections 4.2, 4.3, and 4.4 are partial
solutions that complement each other. Unfortunately, the approaches are highly specialized. No
single tool can handle all types of inference problems. Marks proposes that what is needed is a
uniform representation scheme that can be transformed into specific representation schemes
without much complexity [Marks 94a]. While this approach is not a database design tool in and
of itself, once the transformation is done, various inference analysis approaches can be applied. For this objective, a
hypersemantic data model called Multilevel Knowledge Data Model (MKDM), a graphical
representation scheme called GRAPHICAL-MKDM, and a specification language called
Multilevel Knowledge Data Language (MKDL) have all been developed.

Future research could include:

1. Developing inference analysis tools for MKDM.

2. Developing transformations to other representations so that current inference analysis
tools can be applied.

3. Testing tools with examples.

SECTION 5

APPROACHES FOR AGGREGATION

There is no general approach or mechanism for dealing with general aggregation problems, only
specific measures taken to protect specifically defined aggregates. This section does not describe
how to identify sensitive aggregates. Rather, it presents approaches for protecting certain classes
of aggregation problems. A discussion on identifying aggregates is not presented because
aggregates can be defined in almost any arbitrary way in order to protect almost any fact. As such,
identifying an aggregate is totally up to the owner of the data. Once an aggregate which requires
protection has been identified, it hopefully can be classified as either one of the two sub-classes of
aggregation problems defined below and an appropriate approach defined in Section 5.1 or 5.2 can
be used to protect that aggregate. If an aggregate does not cleanly fit into one of the two
aggregation sub-classes, it may be a combination of the two forms of aggregation or another
problem altogether, such as access control or inference. If an aggregate exhibits both aspects of the
aggregation problem, it needs to be broken down into its constituent parts and each part protected
appropriately.

This section describes the various approaches that have been identified for dealing with specific
classes of aggregation problems. Some of these approaches can be used now while others are still
ongoing research. As stated in Section 1, the aggregation problem occurs when each individual
data item is less sensitive than the collection (or aggregation) of all the data items. This definition
distinguishes the aggregation problem from the inference problem which uses LOW data to infer
HIGH data, rather than a collection of LOW data being considered HIGH because of the quantity
of, or relationships between, the data. Aggregation has been divided into two sub-problems
[Jajodia 92]:

· Data Association - associations between instances of different data items.

· Cardinal Aggregation - the associations among multiple instances of the same entity.

5.1 DATA ASSOCIATION

This section presents the research to date on the data association problem. It specifically describes
the work being done to formalize the description of the data association problem and possible
solutions, and then presents the primary technique that has been identified for solving specific
instances of this problem. No tools for detecting data association problems or mechanisms which
require changes to standard DBMSs are presented as none have been developed or proposed.

5.1.1 Formalizations

Researchers have attempted to apply mathematical approaches to the data aggregation problem,
especially in the context of MLS DBMS implementations. This section discusses the approaches
that have been proposed to date.

5.1.1.1 Lin's Work

Lin applies a security algebra defined over classifications and extended over relational database
operations to mathematically express the aggregation problem [Lin 89a]. The security algebra
works with classification labels under the constraints of Denning's classification lattice algebra
[Denning 76]. Lin defines a homomorphism to relate the classification of a relational operation's
output to the individual classifications of the operation's operands. The reader needs to pay close
attention to the algebra's notation, as Lin's analysis requires comparisons between purely label-
based operations on classification labels under the lattice algebra (called `algebraic classification')
and the real-world security class of data (the classification of operations on data). The algebra of
label manipulation is fully intuitive to readers familiar with Bell-LaPadula [Bell 76]: the label of a
process that can read a set of data needs to dominate the least upper bound of the labels on its
individual inputs; the label on a process that creates a set of outputs needs to be dominated by the
greatest lower bound of the labels on the individual outputs.

Lin observes that an aggregation problem exists when this dominance constraint fails to hold as an
equality -- i.e., when the real-world classification of a set of inputs fails to exactly equal the least
upper bound of their individual classifications. He allows for the traditional case (the classification
of the set strictly dominates the classification of its members), but also for the case where the real-
world classification of an operation is strictly dominated by the least upper bound of the inputs to
the operation, as in the results of certain aggregate functions (`built-in' operators like SUM,
AVERAGE, etc.). This algebra is then used to formally define the `aggregation problem for a
database'.

Lin deals with the data-association aggregation problem, and not with the cardinality aggregation
problem, and defines an aggregation to be `simple' if it is a set of data whose real-world
classification strictly dominates the algebraic classifications of each of its constituent elements
with the further constraint that no proper subset of the set has this property (i.e., the classification
of any proper subset equals the least upper bound of its constituents' classifications). Since under
this definition, the removal of any element(s) from a simple aggregate suffices to `solve' the
aggregation problem, Lin presents an algorithm that requires the SSO to identify sets of elements
that can be removed from a simple aggregate, and reclassified at a strictly dominating level in order
to ensure that none of the resulting collections could result in disclosure of a simple aggregate to
inadequately cleared users [Lin 89b]. Using this technique, it is up to the owner of the data to
decide which sets of data are to be more highly protected and which are to be left at their original
classification. Lin extends this work somewhat in [Lin 90a, 91]. An example problem that his
technique would address would be if we had the following simple aggregates:

· A, B, C

· A, C, F

· D, F, Y, Z

· E, F, G, H

where a `simple aggregation problem' is as defined above. Given these four simple aggregates,
Lin's technique would identify that A or C, and F were important elements that must either be
protected, or all other elements need to be protected in order to protect the aggregates. For example,
if elements A and F were marked HIGH, and all other elements were marked LOW then no user
can compile all the elements of these aggregates without accessing HIGH data. Lin's method
would identify that C was equivalent to A such that it could be protected instead. Conversely,
access to A and F or C and F could be granted at LOW while all other data was protected at HIGH
and the aggregates would be equivalently protected. It is up to the owner of the data to decide
whether it is more important to make available the largest number of elements, or certain critical
or more interesting elements, while protecting enough other elements to ensure that an aggregate
cannot be collected together solely at the LOW level.
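The reasoning in the example above can be checked mechanically: with the four simple aggregates given, upgrading {A, F} (or, equivalently, {C, F}) to HIGH leaves no aggregate collectible entirely at LOW. This is a verification sketch of the example, not Lin's algorithm itself.

```python
# Verify that a chosen HIGH set blocks every simple aggregate from being
# assembled solely from LOW elements.
aggregates = [{"A", "B", "C"},
              {"A", "C", "F"},
              {"D", "F", "Y", "Z"},
              {"E", "F", "G", "H"}]

def protects(high, aggs):
    """True if every aggregate contains at least one HIGH element."""
    return all(agg & high for agg in aggs)

print(protects({"A", "F"}, aggregates))  # True
print(protects({"C", "F"}, aggregates))  # True: C is equivalent to A here
print(protects({"F"}, aggregates))       # False: {A, B, C} stays all-LOW
```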

Lin goes on to define an important homomorphism, H, to relate the operations of relational algebra
and security classification algebra. It is essential that this mapping be a homomorphism, and that
it satisfy commutative mapping diagrams (i.e., that the mapping under H into a classification of a
relational operation on two operands equal the algebraic classification of the operands' mapped
labels in the label-combining algebra -- e.g., the classification of a join equals the least upper bound
of the classifications of the join's arguments). Aggregation problems exist when equality fails. Lin
notes that there is a need to consider aggregation problems that could arise if the homomorphism
were non-commutative, indicating that this will be addressed in a forthcoming paper [Lin 90b, 91].
Lin extends the work on simple aggregation problems to towers of aggregation problems [Lin 91].
The real-world classification of a tower strictly dominates the least upper bound of the
classifications of the tower's components, but there exist proper subsets of the tower that are
simple aggregation problems. That is, given

· Individual elements A, B, and C may be UNCLASSIFIED

· A, B may be a LOW simple aggregation problem

· A, B, C may be a HIGH tower aggregation problem

Lin would solve <A, B> by classifying, say, B as LOW and replacing it with an UNCLASSIFIED
element E (e.g., a cover story). This would reduce the HIGH tower aggregation problem to <A, E,
C>, which may not be a problem or which may now be a simple aggregation problem.

Lin also discusses the case of statistical aggregation problems where the application of an
aggregate function (built-in operator) produces an output whose classification is strictly dominated
by the least upper bound of its inputs [Lin 91].

5.1.1.2 Cuppens' Method

Cuppens has defined a formal method for addressing the aggregation problem which he refers to
as a modal logic framework [Cuppens 91]. This logic is based on modeling what a user knows as
compared to what a user has permission to know. This logic framework can be used to model a
security policy with aggregation and then prove whether a particular specification is secure or not.
It does not detect aggregation in a database management system but rather determines whether a
particular specification properly protects data as specified by a security policy. An aggregate must
be predefined as part of the security policy in order for the technique to determine whether a
particular specification will properly protect the data as required by the security policy.

5.1.2 Techniques

The data association problem concerns a sensitive association between two or more distinct, less
sensitive data elements. Denning [Denning 86a] and Lunt [Lunt 89] have both proposed fairly
straightforward solutions to this problem. They point out that since a quantifiable
entity (the sensitive relationship) requires more protection than the individual elements, database
design and access controls can be used to control the data association problem. The basic solution
to this problem is for the individual data items to be protected as normal, but the ability to associate
the items together is removed from each item's relation. A separate relation is then created which
describes the relationship between the items and is protected appropriately.

Assuming an MLS DBMS is available, this new relation is classified at the proper level for the
aggregate. Because the data items cannot be aggregated without reference to the higher level
relation, only users cleared for the level of the aggregate will be able to view the aggregate as an
entity. Less cleared users will be able to access the individual elements at their proper
classification. This approach has the additional advantage that the mandatory access control
mechanism will ensure that the aggregate is properly labeled at the least upper bound of the
accessed data, which in this case is the new relation's higher classification.
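The database design described above can be sketched abstractly. The following Python fragment is a hypothetical model of an MLS DBMS applying the simple security rule to the separate-relation design; the table names and level encoding are assumptions for illustration.

```python
# Hypothetical sketch of the separate-relation design for data
# association.  Each entity table carries its own label; the sensitive
# association is stored in its own relation labeled at the level of the
# aggregate.
UNCLASSIFIED, LOW, HIGH = 0, 1, 2

tables = {
    "project":        {"label": UNCLASSIFIED},  # project names
    "client":         {"label": LOW},           # client names
    "project_client": {"label": HIGH},          # the sensitive association
}

def readable(table, clearance):
    """Simple security rule: a user may read a table only if the user's
    clearance dominates the table's label."""
    return clearance >= tables[table]["label"]
```

A LOW-cleared user can thus read project and client names individually, but only a HIGH-cleared user can read the relation that associates them.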

Using our fictional company ATC, we see two examples of data association. In the ATC database,
Project names are UNCLASSIFIED and Client names are LOW, but the association between a
Project and its sponsoring Client is classified HIGH. This example of inter-entity data association
is properly protected by placing each entity, as well as the relationship between them, in separate
tables that can be independently labeled with the appropriate classification. An example of intra-
entity data association is the fact that all information about an employee is UNCLASSIFIED,
except for the employee's salary, which must be protected at LOW. To properly protect the
association between the employee and the employee's salary, the salary field is stored in a table
separate from the rest of the employee data and protected at LOW. These two associations and
their solutions are shown as follows:



Figure 5.1: ATC Company Data Association Examples

If an MLS database is not available, then these more sensitive relationships will have to be
removed from the database or protected using discretionary access controls (DAC). The SSO can
define roles or individual users that are associated with each type of data and then each user or role
can be restricted to access only the appropriate tables. The restrictions can be in the form of DAC
on the underlying base tables, as well as providing predefined views that are restricted to the
appropriate users or roles. For the example above, to use DAC, one would protect the Client-
Project and Employee-Salary tables with DAC controls that restricted access to only those users
who should be able to see this information. DAC may be a reasonable solution for mitigating the
data association problem in some cases, but, given the inherent flaws of DAC, it may not provide
sufficient assurance that the data is properly separated depending on the sensitivity of the
aggregate. Such a solution can be used to separate categories of data (e.g., budget, sales, or cost
data). However, if aggregating all the data within a specific category is an issue, this solution will
not, for example, prevent a budget clerk from accessing all of the budget data. Accessing all the
data within a particular category is a cardinal aggregation issue, and approaches to this problem are
discussed in the next section.

The fundamental technique, then, for solving aggregation problems in the sub-class called data
association is first to identify whether a more sensitive aggregate is truly a data association
problem, and if so, to construct the database so that the more sensitive relationship is protected.
This protection implies removing the associating information from each entity's table and defining
a separate table which associates the entities. The table that relates the two entities must then be appropriately
protected at the level of the aggregate. This solution protects each entity at the appropriate level
through careful database design without the need to overclassify any individual element of the data.

5.2 CARDINAL AGGREGATION

The previous section presented a fairly straightforward technique for solving the data association
problem. Unfortunately, similarly straightforward techniques have not yet been identified that can
address the cardinal aggregation problem in general. A number of techniques and mechanisms have been
proposed to help address the cardinal aggregation problem. This section presents those that have
been identified. No tools or formal methods have yet been developed for dealing with the cardinal
aggregation problem.

5.2.1 Techniques

This section describes those techniques that rely solely on standard capabilities available in the
typical commercial DBMS as well as those that would be expected to be present in Multilevel
Secure (MLS) DBMSs.

The following techniques each address a certain aspect of the cardinal aggregation problem. They
vary in complexity, effectiveness, and side effects or drawbacks. Remember that aggregation is a
concern only when the security policy for a system specifically indicates that a collection of less
sensitive items is considered more sensitive when aggregated together. If the policy does not
indicate this is so, then no aggregates have been defined which must be dealt with. Once an
aggregate has been identified, then the most appropriate approach for dealing with that particular
aggregate will depend on the nature of the aggregate. The number and interrelationships between
aggregates will also affect which techniques would be most appropriate. To date, most techniques
do not consider interrelationships between aggregates, nor the complexity those interrelationships
introduce when proposing a solution to the cardinal aggregation problem. Such interrelationships can
make it more difficult to provide acceptable solutions which balance the need to protect sensitive
aggregates while still providing access to the less sensitive individual elements.

5.2.1.1 Single Level DBMS Techniques

Cardinal aggregation control is a problem in databases used in commercial applications as well as
databases used in secure, system-high applications. In both these environments, techniques used
for cardinal aggregation control have been developed in an ad hoc fashion and not uniformly
applied. Most of these techniques are administrative and physical; examples are control and review
of hard copy output and segregation of data across multiple machines. The following techniques
rely on the use of DBMS security mechanisms and can be applied using almost any commercial
DBMS.

5.2.1.1.1 Database Design and Use of DAC

One of the simplest techniques to minimize cardinal aggregation in single level databases is to
divide data that could aggregate into separate structures [Lunt 89]. Each of these separate structures
could then be protected with discretionary controls that restrict individual users from aggregating
enough information to construct an aggregate whose size was considered more sensitive than the
individual elements. For example, if there were 1000 data elements which were divided into 10
blocks of 100, and an aggregate of over 300 was considered sensitive, then each user could be
limited to accessing at most three of the 10 data blocks. The particular three blocks may be chosen
arbitrarily or according to each user's needs. Such a structure would mitigate the cardinal aggregation problem to some degree, but it
has the significant drawback that users with a legitimate need to access certain individual elements
may be prevented from doing so, depending on whether the element resides in the part they are
permitted to see or in the part from which they are restricted. It is also difficult to administer such
a structure if the amount of data rapidly changes over time, or the number of users changes to any
significant degree.

If the data in an aggregate can be distinguished in some way, such as budget, sales, and cost
data, it becomes easier to decide how to divide access to the aggregate among sets of users.
However, this is a data association problem, not a cardinal aggregation problem, since cardinal
aggregation only deals with multiple instances of the same entity. A single aggregate can have both
data association and cardinal aggregation aspects to it. As in the budget, sales, and cost example
just discussed, the distinction between the elements is a data association problem, while the
collection of a sufficient number of elements within a single category is a cardinal aggregation
problem. Such an aggregate needs to be broken into its constituent components and then each
aspect of the problem dealt with using its respective technique.

An inherent vulnerability in any solution relying on DAC is that any user with access to data can
make that data available to others because of the limitations of DAC in controlling data
propagation. As such, this type of solution may not be sufficiently strong to protect the data, and
it carries the inherent drawback of restricting access to data from users with a legitimate need
for it.

5.2.1.1.2 Use of Views

Views can be used to provide access to specific portions of the database. Use of a view can then
be granted to individuals who might not have the right to access the underlying base relations.
Views can then be used to more precisely control what a user sees when accessing the database.
These controls can help protect aggregates in a number of ways. Summary and grouping data can
be presented which shield the user from accessing individual elements. Depending on the nature
of the aggregate, such data may not be considered more sensitive than the individual elements,
while the individual elements when seen together may be more sensitive. For example, a view may
allow a user to ask how many instances of a particular element exist, or the average element value, and this
might be acceptable to reveal, as long as the values of all the individual elements are not revealed.
Another view may show individual element values but restrict the user to only a certain number of
elements in any given query. Both techniques may help protect the aggregate.

However, it is typically difficult to be sure that views provide sufficient protection, depending on
the number of views available, their complexity, and the size and volatility of the database. Normal
views do not keep track of what each user has accessed over time so the user could assemble an
aggregate by issuing numerous queries which each access only a few elements at a time. Group
functions which attempt to hide the underlying data using sums, averages, counts, etc. may be
subject to statistical inference attacks described in Section 4. Views should be able to protect data
from the casual viewer but a persistent attacker may be able to circumvent the restrictions placed
by views.
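As an illustration of a summary view, the following hypothetical sketch uses Python's sqlite3 module; the table, view, and column names are assumptions. Note that SQLite provides no GRANT statement, so the shielding of the base table from less privileged users is notional here; in a real DBMS the view, but not the base table, would be granted to them.

```python
# Hypothetical sqlite3 sketch of a summary view: the view exposes only
# the count and the average salary, not individual salary values.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [("a", 100), ("b", 200), ("c", 300)])
conn.execute("""CREATE VIEW salary_summary AS
                SELECT COUNT(*) AS n, AVG(salary) AS avg_salary
                FROM employee""")

n, avg_salary = conn.execute(
    "SELECT n, avg_salary FROM salary_summary").fetchone()
```

A second view could similarly cap the number of rows returned per query, though, as the preceding paragraph notes, neither protection withstands a persistent attacker issuing many small queries over time.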

5.2.1.1.3 Restricting Ad Hoc Queries

One desirable feature of a relational DBMS is that it gives users the ability to formulate ad hoc
queries against any database object. However, if all users are allowed unrestricted ad hoc queries,
the SSO may not be able to pre-identify the likely aggregation results and therefore may not be able
to define controls appropriately. The decision to allow ad hoc queries is thus a tradeoff between
convenience and security.

If ad hoc queries are restricted, then the administrator can rely on protecting access to specific
views as just described, thereby relying on the access controls they provide to limit each user's
access to only that data they need to perform their assigned tasks. It may not be necessary to restrict
ad hoc queries from all users. If certain users are cleared for all the aggregates contained on the
system anyway, it may be perfectly reasonable to allow them to execute ad hoc queries. In this
way, this capability is not arbitrarily restricted from all users. In fact, most systems would not really
restrict this capability from every user, because at least one user (the database administrator)
typically needs to be able to define new views and issue arbitrary queries.

5.2.1.1.4 Use of Audit

Another approach relies on the use of auditing tools to track the accesses of each user to elements
of an aggregate. Assuming sufficient data has been audited, an administrator may be able to review
the data and detect whether a user is trying to collect an abnormally large number of values of a
particular data element (e.g., too many employee phone numbers). Hopefully this audit analysis
can be done by automated means, but few automated audit analysis (i.e., intrusion detection) tools
currently exist. More typically, these analyses are done manually, long after the event, by an SSO
on a set schedule. The audit trail analyses look for outputs that contain aggregates more sensitive
than their component data elements. Based on these results, the administrator can then either
restructure the database or apply other controls that would prevent the same problem in the future.

This approach has a number of significant drawbacks, including 1) it attempts to detect, not
prevent, an aggregate from being pulled together, 2) today's audit analysis tools are typically
difficult and time consuming to use, 3) the likelihood that an administrator would have the time or
ability to detect a non-obvious aggregation attempt is questionable, and 4) a determined adversary can
easily avoid detection through slow and careful gathering of data.
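A minimal sketch of such an after-the-fact audit review, assuming the audit trail can be reduced to (user, element) read records, might look as follows; the threshold value and function name are arbitrary assumptions for the example.

```python
# Hypothetical audit-review sketch: flag any user who has retrieved
# more distinct values of the monitored element than a threshold
# allows (e.g., "too many employee phone numbers").
from collections import defaultdict

THRESHOLD = 50   # administrator-chosen limit (assumed)

def flag_collectors(audit_trail, threshold=THRESHOLD):
    """audit_trail: iterable of (user, element_id) read records."""
    seen = defaultdict(set)
    for user, element_id in audit_trail:
        seen[user].add(element_id)   # count distinct elements only
    return {user for user, ids in seen.items() if len(ids) > threshold}
```

As the drawbacks above suggest, a scan of this kind only detects accumulation after the fact, and an adversary who gathers data slowly across review periods may never be flagged.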

5.2.1.1.5 Store all Data System-High

One of the simplest solutions to the aggregation problem, but one with significant drawbacks, is to
store all the data in a single level (system-high) database which is classified at the same level as
the most sensitive aggregate in the database. By definition then, all users of the database must be
cleared for all the data, thus eliminating the possibility of exposing an aggregate to an individual
who is not cleared to see such an aggregate. In essence, this solution proposes removing from the
system all users who should not access all of the data. This has the significant drawback
that all the less sensitive data, which may be the bulk of the data, is overclassified, possibly highly
so, and inaccessible to individuals who should otherwise be able to access the data. Thus, although
this addresses the security concerns related to the aggregation problem, it is not normally a
practical alternative, depending on the nature of the data. For a particular dataset, however, if only a
small portion of the data is normally classified less than this system-high level, then this may be a
practical solution.

If this type of solution is used, the less sensitive data must be reviewed and regraded to a
lower level before dissemination. This review can be automated for specifically formatted, tightly
controlled data, but is typically reviewed manually. If done manually, then the output is typically
hardcopy, rather than in electronic form. Depending on the nature and volume of the data, this review
task can be extremely burdensome to perform, sufficiently so that this solution is impractical in
many situations.

5.2.1.2 Multilevel DBMS Techniques

There are some additional techniques for addressing the aggregation problem by specifically
labeling the sensitivity of data (e.g., UNCLASSIFIED, LOW, HIGH) and using this labeling
capability, and the ability to control access based on labels, to help control the cardinal aggregation
problem. These techniques build upon the techniques described in the previous section and are
discussed below.

5.2.1.2.1 Use of Labeled Views

Wilson proposed that views in a multilevel system be used to properly label aggregates with their
sensitivity level [Wilson 88]. Views are predefined queries which are labeled. They allow a user
to access certain portions of the database in a well-known, controlled manner. Views can be
preconstructed to access portions of an aggregate. If a view allows access to a small portion of an
aggregate then the result that is returned can be labeled at the level of the data accessed. However,
if certain views access large portions of an aggregate, then the result returned should be labeled at
the level of the aggregate, which may be higher than the sensitivity of the individual elements
accessed by the view. This can be accomplished by labeling the sensitivity of the view definition
at the level of the aggregate. This will cause the results returned to be marked at least at the level
of the aggregate and prevent users with insufficient clearance from accessing this view. The idea
here is to use views to define aggregates and have them properly label aggregates whenever they
are accessed.
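The labeling rule can be sketched as follows. This hypothetical Python fragment illustrates marking a view's result at the least upper bound of the view definition's label and the labels of the data accessed; the view names and levels are assumptions for the example.

```python
# Hypothetical sketch of labeled views [Wilson 88]: a result is marked
# at the LUB of the view's label and the labels of the data accessed,
# so a view covering a large portion of an aggregate stamps its output
# at the aggregate's level.
UNCLASSIFIED, LOW, HIGH = 0, 1, 2

views = {
    "one_phone_number": {"label": UNCLASSIFIED},  # small slice of aggregate
    "whole_phone_book": {"label": HIGH},          # covers the aggregate
}

def result_label(view, data_labels):
    """LUB of the view definition's label and the accessed data labels."""
    return max([views[view]["label"], *data_labels])

def may_read(view, data_labels, clearance):
    """Simple security: clearance must dominate the result's label."""
    return clearance >= result_label(view, data_labels)
```

Labeling the whole-aggregate view HIGH both marks its results correctly for cleared users and denies the view to users with insufficient clearance.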

This technique relies on the fact that users cannot construct their own queries to get around the
classification constraints provided by predefined views. If ad hoc queries are permitted, they
should be labeled at system-high, independent of the user's session level, to prevent unauthorized
access to aggregates. This means that ad hoc queries by lower cleared users will have to be
reviewed and downgraded before being displayed to those users. Based on audit trail analyses, any
ad hoc queries that show up frequently could be converted into standard queries with an
appropriate label. Use of labeled views and query restriction appears to be a workable strategy in
some applications, but has some significant drawbacks. If users rarely need to execute ad hoc
queries, this solution may be reasonable. However, if ad hoc queries are normally necessary to
accomplish the tasks at hand, then the review and downgrade process could quickly become
onerous, depending on the nature and volume of the data being reviewed.

This technique will not prevent a user from accumulating sufficient information on their own to
assemble an aggregate of sufficient size to be considered more sensitive than the individual
elements. What it does do is automatically label aggregates at the appropriate level for those users
that are cleared to see the aggregate. It also make it more difficult for an uncleared user to do so,
and auditing may be able to detect the accumulation of data in an aggregate that a user is not cleared
for.

This technique may also draw attention to the fact that a sensitive aggregate exists which may be
inappropriate for some operational environments.

5.2.1.2.2 Start with All Data System-High

Another approach is to start with all data on the system stored at system-high and then downgrade
individual elements as necessary to satisfy the access needs of less cleared users. This will cause
the data that is needed to migrate down to its actual level into the hands of individual users while
the rest of the data remains at system high. The security advantage here is that a review process is
automatically invoked each time a user requires access to an element which is part of a sensitive
aggregate and that the data is only given to that particular user, rather than all users at that level.
This allows the reviewer to monitor the necessity for and volume of data granted to a particular
user. This is similar to the paper world concept, where you can call the operator for some
organization and request an individual phone number, but you cannot gain direct access to the
entire phone book. This allows the operator to verify that you are a legitimate caller (e.g., you at
least know the employee's name and where they work).

This approach can work in certain situations, as in the phone book example above, but clearly could
be infeasible when the volume of requests to downgrade data is high.

5.2.1.2.3 Make Some Data High

Similar to the technique of using discretionary access controls to separate data into chunks to
ensure that an aggregate of sufficient size cannot be pulled together, sensitivity labeling can also
be used to separate data such that it cannot be aggregated beyond an allowable size. In this case, a
sufficient portion of the aggregate is marked at a higher sensitivity level (e.g., the level of the
aggregate) in order to prevent or make it more difficult to create an aggregate which exceeds the
allowed size at the lower level. If such an aggregate were pulled together, it would be marked at
the higher level since it would require the inclusion of some of the higher level data. This has the
advantage that an aggregate of sufficient size would most likely be labeled at the appropriate level,
but the disadvantage that some, but not all, of the data would be inaccessible to users at the
appropriate level of the data.

5.2.2 Mechanisms

The previous section described techniques that could be used to reduce the cardinal aggregation
problem. These techniques relied on database design techniques and mechanisms available in
standard single and multilevel DBMSs. This section describes new mechanisms that have been
proposed to address the cardinal aggregation problem which require some degree of modification
to standard DBMS mechanisms.

In proposed or prototype database implementations using MLS DBMSs, data aggregation control
problems that have arisen have been handled by application-specific ad hoc controls [Doncaster
90]. In addition, other control strategies, which are still ad hoc but are less application-specific,
have been proposed. These are described below.

5.2.2.1 Aggregation Constraints

Section 4.3.3 described the Lock Data Views (LDV) approach for dealing with the inference
aspects of classification consistency problems. The following describes the LDV approach for
dealing with aggregation through the use of aggregation constraints. Aggregation constraints are
defined using extended Structured Query Language (SQL) [ANSI 92, 94], and are enforced after
the query results have been built. Aggregation constraints are similar to classification constraints,
except the assigned security levels are applied only to the entire aggregate, and not to the individual
elements [Denning 86a]. The module responsible for this examination may reclassify the query
result based upon the constraints [Stachour 90]. The aggregation levels are then used to restrict how
much data can be released to a given user. The constraints are enforced by building a rule-based
classification scheme into the pipelines. In addition, a history-recording mechanism is consulted
and updated whenever a query response is prepared by the DBMS [Haigh 90].
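A hypothetical sketch of post-query reclassification in the spirit of aggregation constraints follows; the constraint representation and the element-typed result format are assumptions made for illustration, not the actual LDV design.

```python
# Hypothetical sketch of enforcing an aggregation constraint after the
# query result has been built: each constraint names an element type,
# a count threshold, and the level of the larger aggregate.
UNCLASSIFIED, LOW, HIGH = 0, 1, 2

# (element_type, max_count_at_base_level, level_of_larger_aggregate)
constraints = [("phone_number", 10, HIGH)]

def classify_result(rows, base_level=UNCLASSIFIED):
    """rows: (element_type, value) tuples of an already-built result.
    Reclassify the whole result if any constraint's threshold is
    exceeded; individual elements keep their own levels."""
    level = base_level
    for element_type, limit, agg_level in constraints:
        count = sum(1 for etype, _ in rows if etype == element_type)
        if count > limit:
            level = max(level, agg_level)
    return level
```

Note that, as in LDV, the level applies to the entire aggregate in the response, not to the individual elements, and a history mechanism would be needed to catch accumulation across queries.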

5.2.2.2 Aggregation Detection

The mechanisms just described attempt to control aggregates that are accessed by a single query.
In order to completely protect an aggregate, a solution ultimately needs to track which data
elements have been accessed by a particular user and then grant or deny access to additional data
based on all previous data that user has seen. In this way, the system can ensure that the user cannot
collect sufficient information to compile an aggregate which is more sensitive than the user's
clearance. However, the difficulty of tracking all the data that any particular user has seen has
seemed insurmountable.

Motro, Marks, and Jajodia have proposed an accounting based approach for tracking all
information that a user has accessed previously [Motro 94]. This information is then used to make
access decisions which ensure that the user cannot assemble an aggregate using multiple queries.
This method uses views to define secrets that can be disclosed to a limited degree, and associates
a threshold value with each such view, which they call sensitive concepts. If any view incorporates
a sensitive concept, this broader view is protected as well.

Access to sensitive concepts is then tracked for each user. In fact, to avoid overcounting due to
multiple accesses to the same data, the scheme keeps track of the actual tuples that have been
disclosed, rather than just the number of tuples disclosed from a particular sensitive concept. In this
way, the method can accurately determine exactly what a particular user has seen before, in order
to determine whether a subsequent query is allowed, or should be denied because a threshold would
be exceeded. The method also determines whether other related data is equivalent to the data
protected by a particular sensitive concept and controls access to this related data as well. For
example, if in a particular database all the employees in Division A are located in Building 1, and
the set of employees in Building 1 is a sensitive concept, then the set of employees in Division A
would have to be protected as well.

The method described is limited to databases that are single relations and to queries and sensitive
concepts that are projection-selection views, where all selections are conjunctions of simple
clauses of the form attribute = value. The authors conjecture that this method can be extended to
overcome these limitations.
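The accounting scheme can be sketched as follows. This hypothetical Python fragment tracks the distinct tuples disclosed per user and sensitive concept; the function name and threshold-checking rule are illustrative assumptions based on the description above.

```python
# Hypothetical sketch of the accounting approach [Motro 94]: remember
# the exact tuples released to each user for each sensitive concept,
# so repeated accesses to the same tuple are not double-counted.
disclosed = {}   # (user, concept) -> set of tuples already released

def request(user, concept, tuples, threshold):
    """Grant the query only if the cumulative distinct disclosure for
    this user and concept would stay within the concept's threshold."""
    seen = disclosed.setdefault((user, concept), set())
    if len(seen | set(tuples)) > threshold:
        return False          # deny: threshold would be exceeded
    seen.update(tuples)
    return True
```

Because the set union ignores tuples the user has already seen, re-requesting previously disclosed data costs nothing against the threshold, matching the overcounting-avoidance described above.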

SECTION 6

SUMMARY

Finding effective approaches for dealing with inference and aggregation continues to be a
significant challenge for the security community. This report provides a survey of the state of the
art of various aspects of the inference and aggregation problems. It defines and characterizes
inference and aggregation as they relate to multilevel secure database systems and describes
methods that have been developed for resolving some of these problems.

Inference addresses users deducing (or inferring) higher level information based upon lower,
visible data [Morgenstern 88]. The aggregation problem occurs when classifying and protecting
collections of data that have a higher security level than any of the elements that comprise the
aggregate [Denning 87].

Inference and aggregation problems are not new, nor are they only applicable to database
management systems (DBMSs). However, the problems are exacerbated in multilevel DBMSs that
label and enforce a mandatory access control policy on DBMS objects [Denning 86a].

Inference and aggregation differ from other security problems since the leakage of information is
not a result of the penetration of a security mechanism but rather it is based on the very nature of
the information and the semantics of the application being automated [Morgenstern 88]. As a
result, inference and aggregation are not amenable to complete solution through mechanisms in the
DBMS. They inevitably have to be addressed in the context of the semantics of the data.

Inference and aggregation are different but related problems. Each of them can in turn take many
different forms. Researchers have addressed different aspects of the problems, making it difficult
to relate the various research efforts. In order to understand more precisely what inference and
aggregation are, and to provide a framework for understanding partial solutions, it is essential to
revisit several first principles related to security concepts and processes. These concepts, described in
Section 2, include:

· Identifying methods and types of information that can be used to make inferences.

· Characterizing two classes of aggregation problems: data association and cardinal
aggregation.

· Distinguishing between direct disclosure and inference.

· Understanding the relevance of the traditional security analysis paradigm.

· Identifying the steps in a database security engineering process.

The framework presented for reasoning about inference and aggregation identified types of
classification rules and vulnerabilities at each step of the database security engineering process.
This approach allows one to:

· Establish the relationship among information protection requirements, implementing
classification rules, database design, and database instantiation.

· Distinguish between the types of classification rules (entity, attribute, data association, and
cardinal aggregation).

· Distinguish between direct disclosure and inference vulnerabilities.

· Identify specific vulnerabilities that are introduced at each step in the database security
engineering process.

· Differentiate between inference as a vulnerability and cardinal aggregation as both a class
of policies and as vulnerabilities.

Utilizing this framework, Sections 4 and 5 describe research efforts and other approaches that
provide techniques, mechanisms, and analysis tools to address aspects of inference and aggregation.

Eliminating or reducing unauthorized inference remains a difficult problem because of the diversity
of methods and types of information that can be used to make inferences. The inference problem
remains one that is dependent on understanding the value and semantics of data for the part of the
enterprise being automated. The techniques, mechanisms, and tools described in Section 4 provide
support for identifying inference vulnerabilities during database design.

The data association class of aggregation policies can be solved today through database design
techniques described in Section 5. Effective mechanisms to enforce cardinal aggregation policies
remain a topic for research.

REFERENCES

[Akl 87] S. G. Akl and D. E. Denning. Checking Classification Constraints for
Consistency and Completeness. In Proceedings of the IEEE Symposium on
Security and Privacy, pp. 196-201, April 1987.

[ANSI 92] ANSI. Database Language SQL2. In ANSI X3.135-1992, American National
Standards Institute, New York, December 1992.

[ANSI 94] ANSI. SQL3 - Working Draft. Draft Version, Available From X3 Secretariat,
Computer and Business Equipment Manufacturers Association (CBEMA), 1250
Eye St. N.W., Suite 200 Washington, D.C. 20005-3992, September 1994.

[Audit 96] National Computer Security Center, Auditing Issues in Secure Database
Management Systems, NCSC Technical Report-005, Volume 4/5, May 1996.

[Bell 76] D. E. Bell and L. J. LaPadula. Secure Computer Systems: Unified Exposition and
Multics Interpretation, The MITRE Corporation, March 1976.

[Binns 91] L. J. Binns. Inference Through Polyinstantiation. In Proceedings of the Fourth
Rome Laboratory Database Security Workshop, 1991.

[Binns 92a] L. J. Binns. Inference through Secondary Path Analysis. In Proceedings of the
Sixth IFIP 11.3 WG on Database Security, August 1992.

[Binns 92b] L. J. Binns. Inference and Cover Stories. In Proceedings of the Sixth IFIP 11.3
WG on Database Security, August 1992.

[Binns 94a] L. J. Binns. Inference: A Research Agenda. In L. Notargiacomo, editor,
Research Directions in Database Security V, June 1994.

[Binns 94b] L. J. Binns. Implementation Considerations for Inference Detection: Intended
vs. Actual Classification. In IFIP Transactions A (Computer Science and
Technology), A-47:297-306, 1994.

[Brewer 89] D. F. Brewer and M. J. Nash, The Chinese Wall Security Policy, In Proceedings
of the 1989 IEEE Symposium on Security and Privacy, May 1989.

[Buczkowski 89a] L. J. Buczkowski and E. L. Perry. Database Inference Controller. Interim
Technical Report, Ford Aerospace, February 1989.

[Buczkowski 89b] L. J. Buczkowski and E. L. Perry. Database Inference Controller Draft
Top-Level Design. Ford Aerospace, July 1989.

[Buczkowski 90] L. J. Buczkowski. Database Inference Controller. In Database Security, III:
Status and Prospects, ed. D. L. Spooner and C. Landwehr, pp. 311-322, North-
Holland, Amsterdam, 1990.

[Burns 92] R. K. Burns. A Conceptual Model for Multilevel Database Design. In
Proceedings of the 5th Rome Laboratory Database Security Workshop, 1992.

[Codd 70] E. F. Codd. A Relational Model of Data for Large Shared Data Banks. In
Communications of the ACM, June 1970, pp. 377-387.

[Collins 91] M. Collins, W. Ford, J. O'Keefe, and M. B. Thuraisingham. The Inference
Problem in Multilevel Secure Database Management Systems. In Proceedings
of the 3rd Rome Laboratory Database Security Workshop, 1991.

[Cordingley 89] S. Cordingley and B. J. Sowerbutts. The Fact Database - A Formal Model
of the Inference Problem for Multi-Level Secure Databases. Plessey Report No:
72/89/R420/U, November 1989.

[Cordingley 90] S. Cordingley and B. J. Sowerbutts. A Formal Model of Dynamic Inference
for Multi-Level Secure Databases. Plessey Report No: 72/90/R078/U, March
1990.

[Cox 86] L. S. Cox, S. McDonald, and D. Nelson. Confidentiality Issues at the United
States Bureau of the Census. In Journal of Official Statistics, Vol. 2, No. 2, pp.
135-160, 1986.

[Cox 87] L. S. Cox. Practices of the Bureau of the Census with the Disclosure of
Anonymized Microdata. In Forum der Bundesstatistik, pp. 26-42, 1987.

[Cox 88] L. S. Cox. Modeling and Controlling User Inference. In Database Security:
Status and Prospects, ed. Carl E. Landwehr, North-Holland, Amsterdam, pp.
167-171, 1988.

[Cuppens 91] F. Cuppens. A modal logic framework to solve aggregation problems. In
Proceedings of the 5th IFIP 11.3 WG on Database Security, September 1991.

[DAC 96] National Computer Security Center, Discretionary Access Control Issues in
High Assurance Secure Database Management Systems, NCSC Technical
Report-005, Volume 5/5, May 1996.

[Date 86] C. J. Date. An Introduction to Database Systems, Vol. I, 4th Edition,
Addison-Wesley, Reading, MA, 1986.

[Delugach 92] H. S. Delugach and T. Hinke. Aerie: Database inference modeling and detection
using conceptual graphs. In Proceedings of the 7th Annual Workshop on
Conceptual Graphs. New Mexico State University, July 1992.

[Delugach 93] H. S. Delugach, T. Hinke, and A. Chandrasekhar. Applying conceptual graphs
for inference detection using second path analysis. In Proceedings of
ICCS 1993, Intl. Conf. on Conceptual Structures, pages 188-197, Laval University,
Quebec City, Canada, August 4-7, 1993.

[Denning 76] D. E. Denning. A Lattice Model of Secure Information Flow. In Communications
of the ACM, Vol. 19, No. 5, pp. 236-243, May 1976.

[Denning 78] D. E. Denning, P. J. Denning, and M. D. Schwartz. The Tracker: A Threat to
Statistical Database Security. In ACM Transactions on Database Systems, Vol. 4,
No. 1, pp. 7-18, March 1978.

[Denning 82] D. E. Denning. Cryptography and Data Security, Addison-Wesley, Reading,
MA, 1982.

[Denning 85] D. E. Denning. Commutative Filters for Reducing Inference Threats in
Multilevel Database Systems. In Proceedings of the IEEE Symposium on
Security and Privacy, pp. 134-146, April 1985.

[Denning 86a] D. E. Denning. A Preliminary Note on the Inference Problem in Multilevel
Database Management Systems. In Proceedings of the National Computer
Security Center Invitational Workshop on Database Security, June 1986.

[Denning 86b] D. E. Denning and M. Morgenstern. Military Database Technology Study: AI
Techniques for Security and Reliability, SRI Technical Report, August 1986.

[Denning 87] D. E. Denning, S. G. Akl, M. Heckman, T. F. Lunt, M. Morgenstern, P. G.
Neumann, and R. R. Schell. Views for Multilevel Database Security. In IEEE
Transactions on Software Engineering, Vol. 13, No. 2, pp. 129-140, February
1987.

[DoD 85] Department of Defense Trusted Computer System Evaluation Criteria, DoD
5200.28-STD, Washington, DC, December 1985.

[Doncaster 90] S. Doncaster, M. Endsley, and G. Factor. Rehosting Existing Command and
Control Systems into an MLS Environment. In Proceedings of the 6th Annual
Computer Security Applications Conference, 1990.

[Elmasri 94] R. Elmasri, and S. B. Navathe. Fundamentals of Database Systems, Second
Edition. The Benjamin/Cummings Publishing Company, Inc., Redwood City,
CA, 1994.

[Entity 96] National Computer Security Center, Entity and Referential Integrity Issues in
Multilevel Secure Database Management Systems, NCSC Technical Report-
005, Volume 2/5, May 1996.

[5100.1-R 86] Department of Defense, Information Security Program Regulation 5100.1- R.
June 1986.

[Ford 90] W. R. Ford, J. O'Keefe, and M. B. Thuraisingham. Database Inference
Controller: An Overview. Technical Report MTR 10963 Vol. 1. The MITRE
Corporation. August 1990.

[Garvey 91a] T. D. Garvey, T. F. Lunt, and M. E. Stickel. Abductive and approximate
reasoning models for characterizing inference channels. In Proceedings of the
Fourth Workshop on the Foundations of Computer Security, June 1991.

[Garvey 91b] T. D. Garvey, T. F. Lunt, and M. E. Stickel. Characterizing and Reasoning about
Inference Channels. In Proceedings of the Fourth Rome Laboratory Database
Security Workshop, 1991.

[Garvey 91c] T. D. Garvey and T. F. Lunt. Cover Stories for Database Security. In
Proceedings of the 5th IFIP 11.3 WG on Database Security, November 1991.

[Garvey 93] T. D. Garvey, T. F. Lunt, X. Qian, and M. E. Stickel. Toward a tool to detect and
eliminate inference problems in the design of multilevel databases. In M. B.
Thuraisingham and C. E. Landwehr, editors, Database Security, VI: Status and
Prospects, pages 149-167. North-Holland, 1993.

[Garvey 94] T. D. Garvey, T. F. Lunt, X. Qian, and M. E. Stickel. Inference Channel
Detection and Elimination in Knowledge-Based Systems. Final Report ECU
2528, SRI International, October 1994.

[Goguen 84] J. A. Goguen and J. Meseguer. Unwinding and Inference Control. In
Proceedings of the IEEE Symposium on Security and Privacy, pp. 75-86, April
1984.

[Haigh 90] J. T. Haigh, R. C. O'Brien, P. D. Stachour, and D. L. Toups. The LDV Approach
to Security. In Database Security, III: Status and Prospects, ed. D. L. Spooner
and C. Landwehr, North-Holland, Amsterdam, pp. 323-339, 1990.

[Haigh 91] J. T. Haigh, R. C. O'Brien, and D. J. Thomsen. The LDV secure relational
DBMS model. In S. Jajodia and C. Landwehr, editors, Database Security IV:
Status and Prospects, pages 265-279. North-Holland, 1991.

[Hale 94] J. Hale, J. Threet, and S. Shenoi. A Practical Formalism for Imprecise Inference
Control. In Proceedings for the 8th IFIP 11.3 WG on Database Security, August
1994.

[Hinke 88] T. H. Hinke. Inference Aggregation Detection in Database Management
Systems. In Proceedings of the IEEE Symposium on Research in Security and
Privacy, pp. 96-106, April 1988.

[Hinke 90] T. H. Hinke. Database Inference Engine Design Approach. In Database
Security, II: Status and Prospects, Proceedings for the IFIP 11.3 WG on
Database Security, Kingston, Ontario, Canada, October 1988, C. E. Landwehr,
ed., North-Holland, 1990.

[Hinke 92] T. H. Hinke and H. Delugach. AERIE: Database inference modeling and
detection for databases. In Proceedings 6th IFIP 11.3 WG on Database Security,
August 1992.

[Hinke 93a] T. H. Hinke, H. Delugach, and A. Chandrasekhar. Layered Knowledge Chunks
for Database Inference. In Proceedings for the 7th IFIP 11.3 WG on Database
Security, Lake Guntersville State Park Lodge, Alabama, September 1993.

[Hinke 93b] T. H. Hinke and H. Delugach. AERIE: Database inference modeling and
detection for databases. In M. B. Thuraisingham and C. E. Landwehr, editors,
Database Security, VI: Status and Prospects: Proceedings for the Sixth IFIP
11.3 WG on Database Security, Vancouver, Canada, August 1992. Elsevier
Science Publishers, 1993.

[Hinke 94] T. H. Hinke and H. Delugach. Inference Detection in Databases. In Proceedings
of the Sixth Rome Laboratory Database Security Workshop, June 1994.

[Jajodia 92] S. Jajodia. Aggregation and Inference Problems in Multilevel Secure Systems.
In Proceedings of the 5th Rome Laboratory Database Security Workshop, 1992.

[Keefe 89] T. F. Keefe, M. B. Thuraisingham, and W. T. Tsai. Secure Query Processing
Strategies. In IEEE Computer, Vol. 22, No. 3, pp. 63-70, March 1989.

[Lin 89a] T. Y. Lin. Commutative Security Algebra and Aggregation. In Proceedings of
the Second RADC Database Security Workshop, 1989.

[Lin 89b] T. Y. Lin, L. Kerschberg, and R. Trueblood. Security Algebra and Formal
Models. In Proceedings of the 3rd IFIP 11.3 WG on Database Security,
September 1989.

[Lin 90a] T. Y. Lin. Database, Aggregation, and Security Algebra. In Proceedings of the
4th IFIP 11.3 WG on Database Security, September 1990.

[Lin 90b] T. Y. Lin. Messages and Noncommutative Aggregation. In Proceedings of the
3rd RADC Database Security Workshop, 1990.

[Lin 91] T. Y. Lin. Multilevel Database and Aggregated Security Algebra. In Database
Security, IV: Status and Prospects, ed. S. Jajodia and C. E. Landwehr, North-
Holland, Amsterdam, pp. 325-350, 1991.

[Lin 92] T. Y. Lin. Inference Secure Multilevel Databases. In Proceedings of the Sixth
IFIP 11.3 WG on Database Security, August 1992.

[Lunt 88] T. F. Lunt, R. R. Schell, W. R. Shockley, M. Heckman, and D. Warren. Toward
a multilevel relational data language. In Proceedings of the Fourth Aerospace
Computer Security Applications Conference, December 1988.

[Lunt 89] T. F. Lunt, Aggregation and Inference: Facts and Fallacies. In Proceedings of the
IEEE Symposium on Security and Privacy, May 1989.

[Lunt 94] T. F. Lunt. The Inference Problem: A Practical Solution, In Proceedings of the
17th NCSC Conference, pages 507-509. Baltimore, MD, October 1994.

[Marks 94a] D. Marks, L. Binns, and M. B. Thuraisingham. Hypersemantic Data Modeling
for Inference Analysis. In Proceedings for the 8th IFIP 11.3 WG on Database
Security, August 1994.

[Marks 94b] D. Marks. An Inference Paradigm. In Proceedings of the 17th NCSC
Conference, pages 497-506. Baltimore, MD, October 1994.

[Meadows 88a] C. Meadows and S. Jajodia. Integrity Versus Security in Multi-Level Secure
Databases. In Database Security: Status and Prospects, ed. Carl E. Landwehr,
North-Holland, Amsterdam, pp. 89-101,1988.

[Meadows 88b] C. Meadows and S. Jajodia. Maintaining Correctness, Availability, and
Unambiguity in Trusted Database Management Systems. In Proceedings of the
4th Aerospace Computer Security Applications Conference, pp. 106-110,
December 1988.

[Meadows 91] C. Meadows, Aggregation Problems: A Position Paper, In Proceedings of the
Third RADC Database Security Workshop, May 1991.

[Millen 91] J. Millen. Honesty is the Best Policy. In Proceedings of the 3rd Rome
Laboratory Database Security Workshop, 1991.

[Morgenstern 87] M. Morgenstern. Security and Inference in Multilevel Database and
Knowledge-Base Systems. In Proceedings of the ACM SIGMOD International
Conference on Management of Data, pp. 357-373, May 1987.

[Morgenstern 88] M. Morgenstern. Controlling Logical Inference in Multilevel Database
Systems. In Proceedings of the IEEE Symposium on Security and Privacy, pp.
245-255, April 1988.

[Motro 94] A. Motro, D. G. Marks, and S. Jajodia. Aggregation in Relational Databases:
Controlled Disclosure of Sensitive Information. In Proceedings of 3rd European
Symposium on Research in Computer Security (ESORICS), pp. 431-445,
Brighton, UK, November 1994.

[National Academy Press 83] Multilevel Data Management Security, National Academy Press,
Washington, DC (FOR OFFICIAL USE ONLY), 1983.

[Pernul 93] G. Pernul, W. Winiwarter, and A. M. Tjoa, The Deductive Filter Approach to
MLS Database Prototyping, Proceedings of the Tenth Annual Computer
Security Applications Conference, December 1993.

[Poly 96] National Computer Security Center, Polyinstantiation Issues in Multilevel
Secure Database Management Systems, NCSC Technical Report-005, Volume
3/5, May 1996.

[Qian 92] X. Qian. Integrity, secrecy, and inference channels. In Proceedings of the Fifth
Rome Laboratory Database Security Workshop, October 1992.

[Qian 93a] X. Qian. A model-theoretic semantics of the multilevel secure relational model.
Technical Report SRI-CSL-93-06, Computer Science Laboratory, SRI
International, November 1993.

[Qian 93b] X. Qian, M. E. Stickel, P. D. Karp, T. F. Lunt, and T. D. Garvey. Detection and
elimination of inference channels in multilevel relational database systems. In
Proceedings of the 1993 IEEE Symposium on Security and Privacy, pages 196-205,
Oakland, California, May 1993.

[Qian 94] X. Qian. Inference channel-free integrity constraints in multilevel relational
databases. In Proceedings of the 1994 IEEE Symposium on Security and
Privacy, pages 158-167, Oakland, California, May 1994.

[Schum 87] D. A. Schum, Evidence and Inference for the Intelligence Analyst, University
Press of America, Lanham, MD, 1987.

[Smith 90a] G. Smith, A Taxonomy of Security-Relevant Knowledge, In Proceedings of the
13th National Computer Security Conference, Washington, D.C., October 1990.

[Smith 90b] G. Smith. Modeling Security-Relevant Data Semantics. In Proceedings of the
1990 IEEE Symposium on Security and Privacy, Oakland, California, May 1990.

[Smith 91] G. Smith. Thoughts on the Aggregation Problem. In Proceedings of the 3rd
RADC Database Security Workshop, May 1991.

[Sowa 84] J. Sowa. Conceptual Structures: Information Processing in Minds and
Machines, Reading, MA, Addison Wesley, 1984.

[Sowerbutts 90] B. Sowerbutts and S. Cordingley. Data Base Architectonics and Inferential
Security. In Proceedings 4th IFIP 11.3 WG on Database Security, 1990.

[Stachour 90] P. Stachour, and B. Thuraisingham. Design of LDV: A Multilevel Secure
Relational Database Management System. In IEEE Transactions on Knowledge
and Data Engineering, Vol. 2., No. 2, pp. 190-209, June 1990.

[Stickel 94] M. E. Stickel. Elimination of Inference Channels by Optimal Upgrading. In
Proceedings of the 1994 IEEE Symposium on Security and Privacy, pages 168-
174, Oakland, California, May 1994.

[Su 86] T. Su. Inferences in Database. Ph.D. Dissertation, Department of Computer
Engineering and Science, Case Western Reserve University, August 1986.

[Su 87] T. Su and G. Ozsoyogiu. Data Dependencies and Inference Control in Multilevel
Relational Database Systems. In Proceedings of the IEEE Symposium on
Security and Privacy, pp. 202-211, April 1987.

[Su 90] T. Su and G. Ozsoyoglu. Multivalued Dependency Inferences in Multilevel
Relational Database Systems. In Database Security III: Status and Prospects,
eds. D. L. Spooner and C. Landwehr, pp. 293-300, North-Holland, Amsterdam,
1990.

[TDI 91] Trusted Database Management System Interpretation of the Trusted Computer
System Evaluation Criteria, NCSC-TG-021, National Computer Security
Center, April 1991.

[TNI 87] Trusted Network Interpretation of the Trusted Computer System Evaluation
Criteria, NCSC-TG-005, National Computer Security Center, July 1987.

[Thuraisingham 87] M. B. Thuraisingham. Security Checking in Relational Database Management
Systems Augmented with Inference Engines. In Computers & Security, Vol. 6,
pp. 479-492, 1987.

[Thuraisingham 88] M. B. Thuraisingham, W. Tsai, and T. Keefe. Secure Query Processing Using
AI Techniques. In Proceedings of the Hawaii International Conference on
Systems Sciences, January 1988.

[Thuraisingham 89a] M. B. Thuraisingham. Security checking with Prolog Extensions. In
Proceedings of the 2nd RADC Database Security Workshop, Franconia, NH,
May 1989.

[Thuraisingham 89b] M. B. Thuraisingham. Secure Query Processing in Intelligent Database
Management Systems. In Proceedings of the 5th Computer Security
Applications Conference. Tucson, AZ, December 1989.

[Thuraisingham 90a] M. B. Thuraisingham. Towards the Design of a Secure Data/Knowledge
Base Management System. In Data and Knowledge Engineering Journal, Vol.
5, No. 1, March 1990.

[Thuraisingham 90b] M. B. Thuraisingham. The Inference Problem in Database Security. In IEEE
CIPHER. 1990.

[Thuraisingham 90c] M. B. Thuraisingham. Recursion Theoretic Properties of the Inference
Problem in Database Security. MTP-291, The MITRE Corporation, Bedford,
MA, May 1990.

[Thuraisingham 90d] M. B. Thuraisingham. Novel Approaches to Handling the Inference Problem
in Database Security. In Proceedings of the 3rd RADC Database Security
Workshop, Castile, NY, June 1990.

[Thuraisingham 90e] M. B. Thuraisingham. A Nonmonotonic Typed Multilevel Logic for
Multilevel Data/Knowledge Base Management Systems. Technical Report MTR
10935, The MITRE Corporation, June 1990.

[Thuraisingham 90f] M. B. Thuraisingham. The use of conceptual structures for handling the
inference problem. Technical Report M90-55, The MITRE Corporation,
Bedford, MA, August 1990.

[Thuraisingham 91] M. B. Thuraisingham. The use of conceptual structures for handling the
inference problem. In Proceedings 5th IFIP 11.3 WG on Database Security,
1991.

[Thuraisingham 92] M. B. Thuraisingham. Knowledge-Based Inference Control in a Multilevel
Secure Database Management System. In Proceedings of the 15th National
Computer Security Conference, October 1992.

[Thuraisingham 94] M. B. Thuraisingham, M. Collins, and H. Rubinovitz. Database Inference
Control. In Proceedings of the 6th Rome Laboratory Database Security
Workshop, 1994.

[Ullman 88] J. D. Ullman. Principles of Database and Knowledge-Base Systems, Volume I,
Computer Science Press, Rockville, MD, 1988.

[Urban 90] S. Urban and L. Delcambre. Constraint Analysis for Specifying Perspectives of
Class Objects. In Proceedings of the 5th IEEE International Conference on Data
Engineering, Los Angeles, CA, December 1990.

[Wilson 88] J. Wilson. Views as the Security Objects in a Multilevel Secure Database
Management System. In Proceedings of the IEEE Symposium on Security and
Privacy, Oakland, CA. April 1988.
[Figures from the body of the report, reproduced here without their original layout.
They depict: (1) a risk analysis flow relating information protection requirements,
threats, vulnerabilities, and controls (access control rules, classification rules,
system design, DBMS access controls, database design, and database instantiation);
(2) an example entity-relationship diagram relating Employees, Research Projects,
Clients, Education, and Critical Technologies; (3) an example multilevel relational
schema (Employee, Employee-Salary, Research Projects, Project-Target, Client,
Client-Project, Critical Technologies, Education) with metadata and tuples labeled
[U], [L], and [H], including an inter-entity association between Project and Client;
(4) VISIBLE/INVISIBLE/KNOWN data states; (5) an inference engine architecture
(user interface, query translator, inference engine, knowledge base, rules base,
DBMS, and database); and (6) the inference measure
INFER(x -> y) = (H(y) - H_x(y)) / H(y).]
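The inference measure INFER(x -> y) = (H(y) - H_x(y)) / H(y) that appears in the
figure above quantifies the fraction of the uncertainty about y that is removed by
knowing x: 0 means knowledge of x reveals nothing about y, and 1 means x determines
y completely. As a minimal illustrative sketch (not from the report itself), the
measure can be computed over sample data, taking H to be Shannon entropy and H_x(y)
the conditional entropy of y given x; the function names are our own:

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy H (in bits) of a sequence of outcomes."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def conditional_entropy(xs, ys):
    """H_x(y): entropy of y remaining within each observed value of x,
    weighted by how often that value of x occurs."""
    n = len(xs)
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(x, []).append(y)
    return sum((len(g) / n) * entropy(g) for g in groups.values())

def infer(xs, ys):
    """INFER(x -> y) = (H(y) - H_x(y)) / H(y).
    Returns 0.0 when y carries no uncertainty to begin with."""
    h_y = entropy(ys)
    if h_y == 0:
        return 0.0
    return (h_y - conditional_entropy(xs, ys)) / h_y
```

For example, if every value of x maps to exactly one value of y, the measure is 1.0
(a complete inference channel); if x and y are statistically independent, it is 0.0.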
Keith F. Brewster

Acting Chief, Partnerships and Processes
May 1996
