# Center for Computational Science and Advanced Distributed Simulation (C^{2}SDS)

ANNUAL REPORT

September 29, 2000 - January 15, 2002

Continued


**1.3 Knowledge Based Systems (ADS-KBS) Task **


Task Coordinator: Richard Al>


**OVERVIEW **


**Objectives and Significance:**

The main objective for the fourth year of this project was to further our knowledge of, and to attack problems related to, software issues in general and specific applications in advanced distributed simulation in particular. Our foci are: Fuzzy Quantities Estimates based on Uncertain Information; Extension of a Computer Security Model Based on Uncertain/Partial Information; Design of Unmanned Vehicle Controller; Spoken Language Dialogue in a Distributed Environment; a System for Image Matching for Subject Identification; Extending Temporal Query Languages to Handle Imprecise Time Intervals; Data Mining and Knowledge Discovery in Database Systems and for Classification and Prediction Rules in Large Data Base Systems; Techniques and Applications of Fuzzy Set Theory to Difference and Functional Equations and their Utilization in Modeling Diverse Systems; Concurrency Control; Validation of Authentic Reasoning using Expert Systems; and Applications to Blood Analysis and to Quality versus Sample Size to meet Quality Goals. We show how the following tools bear on the above problems: Fuzzy Logic, Fuzzy Neural Networks, Genetic Algorithms, Image Analysis, and Data Base Management Tools.

Our efforts bear on how future software may be written. In Estimating Fuzzy Quantities we present a general method for generating fuzzy information. In our Computer Security Model we extend our previous investigations of an access control security model to one that allows us to obtain the probability of hostility of a user in a system based on a set of available fuzzy values. Our Unmanned Vehicle Controller project investigates methods to design controllers that respond to changes in the environment in real time, and develops a minimal set of rules to guide the vehicle efficiently with minimal computing requirements. The Spoken Language Project develops a methodology and software development environment for creating distributed speech applications. The Image Matching Project is developing an image-matching tool for investigative functions. Our Temporal Query Language project considers techniques for handling imprecision in temporal databases. Data Mining and Knowledge Discovery is concerned with the discovery of useful patterns that can be extracted from databases. The Data Mining Project for Classification and Prediction Rules in Large Data Bases investigates classification-modeling algorithms. In Concurrency Control we extend previous work and the use of Navigational Transaction Language to indicate how transactions perform on a spectrum of objects. Several projects consider the application of fuzzy set theory to Blood Analysis, to meeting Quality Goals, and to the Validation of Authentic Reasoning.

**Army Relevance: **

Combat conditions provide very uncertain environments, and first approximations of neural nets and fuzzy systems may be too rough. Combining neural nets and fuzzy rules offers the advantage of a system with learning capabilities while preserving a human-like type of reasoning. In security systems, determining whether a remote user is permitted access is often based on uncertain information, and we present methods to generate fuzzy information. For decision making under uncertainty, we develop methods to compare fuzzy sets or quantities arising from them. Computer security is a prevalent problem for which we offer a flexible approach to software accessibility for users whose characteristics are only partially known. When dealing with large databases, concurrency control techniques allow a large number of simultaneous users. We provide some approaches that will help the army to better utilize and organize its databases. Fuzzy Concurrency Control provides additional insight into how to control the concurrency problem. These techniques are also applied to determining the correct level of carbon dioxide in the blood, an important factor when troops are exposed to toxicity. Sampling is also crucial in any activity involving production, especially large-scale production, and the techniques apply to quality control. When decisions on the battlefield must be made quickly and often without full information, our methods for determining the larger of two continuous fuzzy sets with unbounded support will assist. We also present methods to validate authentic reasoning for expert systems development that incorporates partially conflicting views of the experts. Our techniques for difference and functional equations will be useful in inventory analysis, learning models, system theory, genetics, and ecological models. Our methods are also applicable to distributed computer networks where systems allow general access for trustworthy users.

Temporal databases are extremely useful in many army applications such as virtual training exercises. Our methods enhance the query languages and give them the capability to deal with imprecise time intervals, significantly increasing their usefulness and applicability. Algorithms for efficient data distribution and handling are provided. Data mining in general, and prediction and classification modeling in particular, have numerous applications, such as course of action selection by producing a model based on data accumulated from previous war scenarios. The results of the unmanned vehicle controller design can be generalized and applied to automatic control systems in general. The Spoken Language dialogue in a distributed environment will support a broad range of speech applications.


**Accomplishments: **

Imprecise and conflicting data were input into classical neural nets and the effects were analyzed. We developed a framework for a security model using fuzzy sets to represent uncertain/incomplete information. We also devised a method for estimating fuzzy quantities, such as a user's level of hostility, which themselves depend on a set of fuzzy factors. We investigated the problem of handling time imprecision in temporal databases and designed three models for the representation of imprecise time intervals. We analyzed the underlying logic and important properties of each model. Extensions to existing query constructs at both the transaction level and the operator level have been developed. Our models and extensions enrich the flexibility of temporal databases and can be used to help users obtain more meaningful replies to their temporal queries. We have implemented the ID3 algorithm, a popular classification algorithm that uses a decision tree method to generate classification rules from a training data set, and evaluated its performance with several crisp data sets. We have investigated the underlying physics model for vehicle navigation and obstacle avoidance and defined a preliminary set of relevant parameters. We have also devised a genetic algorithm based method for generating the navigation rules. A strategy for software access involving uncertain quantities such as expected losses, user hostility, and allowable damage amounts was developed. The performance of protocols for multi-user database access has been shown to depend heavily on how transactions and subtransactions are formed. We study these protocols when not all of the information is available.

**Data Mining for Classification and Prediction Rules in Large Data Base Systems **

M. Beheshti, A. Berrached, O. Sirisaengtaksin (UHD)

## Research Objectives and Significance

The objective of this project is to investigate classification modeling algorithms in large data base systems. The aim of classification modeling is to construct a classification model based on the properties of a subset of known objects from the database. The model so constructed can then be used to classify other objects in the database, make quality predictions about future objects based on their class properties, and in some instances aid in the decision-making process. Currently, the main focus of this project is to develop an algorithm to construct a classification model when the data in the training set exhibit conflicting properties.

## Army Relevance

Data mining in general, and prediction and classification modeling in particular, have numerous applications. Such algorithms can, for instance, be used to aid decision making regarding warfare course of action selection by producing a model based on data accumulated from previous war scenarios.

## Methodology

Classification and prediction modeling consist of constructing classification/prediction models based on the properties of a subset of objects whose class label is known. This subset of objects constitutes what is called the training set. The modeling algorithm extracts classification/prediction rules from the training set, and predictions and classifications of other objects are made on the basis of those rules. Clearly, the predictive accuracy and robustness of the model depend to a great extent on the ability of the modeling algorithm to extract all relevant rules from the training set. When the training set exhibits conflicting properties, which is very common if the training set is a representative sample from a real database, it becomes difficult to generalize those properties into crisp rules. We plan to use results from rough set and fuzzy set theories to construct "fuzzy classification/prediction models" that are capable of extracting classification/prediction rules when properties of the training set are conflicting.

## Accomplishments

We have reviewed the literature on data mining in general and have studied a number of classification/prediction modeling algorithms. We have implemented the ID3 algorithm, a popular classification algorithm that uses a decision tree method to generate classification rules from a training data set, and evaluated its performance with several crisp data sets. Based on the basic ID3 algorithm, we have devised a method for extracting classification rules from conflicting class properties. Our preliminary study of the ID3 algorithm has also uncovered a number of its limitations and related open problems, including (1) dealing with missing/unknown data values, (2) dealing with continuous attribute values, (3) evaluation methods for measuring the algorithm's classification accuracy, and (4) learning methods to adapt classification rules to changing data. Subsequently, we have also investigated a number of methods for dealing with missing attribute values and for adapting the ID3 algorithm to deal with continuous data.
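For readers unfamiliar with ID3, the following is a minimal sketch of the basic algorithm on crisp data: at each node it picks the attribute with the highest information gain and recurses. The toy training set, the attribute names, and the majority-vote fallback for conflicting examples are illustrative assumptions; they are not the project's implementation or its method for conflicting class properties.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the examples on `attribute`."""
    total = len(rows)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Attribute with the largest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


def id3(rows, labels, attributes):
    """Build a decision tree as nested dicts; leaves are class labels."""
    if len(set(labels)) == 1:
        return labels[0]
    if not attributes:
        # Conflicting examples remain: fall back to the majority class
        # (an assumption of this sketch, not the report's method).
        return Counter(labels).most_common(1)[0][0]
    best = best_attribute(rows, labels, attributes)
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in {row[best] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        tree[best][value] = id3([r for r, _ in subset],
                                [l for _, l in subset],
                                remaining)
    return tree


# Hypothetical toy training set (not data from the report).
ROWS = [
    {"terrain": "open", "visibility": "clear"},
    {"terrain": "open", "visibility": "fog"},
    {"terrain": "urban", "visibility": "clear"},
    {"terrain": "urban", "visibility": "fog"},
]
LABELS = ["advance", "advance", "hold", "hold"]

TREE = id3(ROWS, LABELS, ["terrain", "visibility"])
```

In this toy set the class depends only on `terrain`, so the tree has a single split. Handling continuous or missing attribute values, two of the limitations listed above, would require extensions that this sketch does not attempt.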

**Design of Unmanned Vehicle Controller**

**R. Al. Beheshti, A. Berrached, and A. de Korvin (UHD) **

**Research Objectives and Significance **

The main objective of this project is to investigate methods to design an unmanned vehicle controller. The controller should be capable of navigating an unmanned vehicle from an initial position to a target position while avoiding any obstacles in its path. One of the main requirements in this design is that the controller should be capable of responding to changes in the environment (e.g. moving obstacles) in real time. Our aim is therefore to develop a minimal set of rules to guide the unmanned vehicle efficiently with minimal computing requirements. The methodology developed in the project can be generalized and applied to automatic control systems in general.

**Army Relevance**

The design of unmanned vehicle controllers is relevant in numerous army as well as industrial applications.

**Methodology **

The project consists of two phases. The first phase consists of developing a set of rules to navigate the vehicle through obstacles to its target. The physics model of the system will be used to define the relevant parameters and their effects on the vehicle navigation requirements. Relying on the physics model alone, however, would render the controller too complex, requiring too much computing power to satisfy the real-time response requirement. Fuzzy set theory will be used at this stage to abstract out the complexities of the physics model and generate a minimal set of high-level fuzzy rules.

The second phase consists of fine-tuning the original set of fuzzy rules developed in the first phase to meet the design requirements. A methodology will be developed that allows the controller to adjust and fine-tune its rules from a training set of obstacle avoidance and target reaching scenarios.

**Accomplishments **

We have investigated the underlying physics model for vehicle navigation and obstacle avoidance and defined a preliminary set of relevant parameters. We have also devised a genetic algorithm based method for generating the navigation rules. During Spring 2000, we designed and built a preliminary simulation program with a graphical user interface for testing and verification purposes. Finally, we have described the basic ideas underlying the genetic algorithm based rule generator, and how it is used for automatic vehicle navigation, in a paper that was accepted for publication at the International Conference on Information Technology (IconIT 2001).
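The report does not give the details of the genetic algorithm, so the following is only a generic sketch of how a rule table might be evolved from training scenarios. The condition names, actions, scenarios, fitness function, and all parameter values are hypothetical, and the rules here are crisp rather than fuzzy to keep the example short.

```python
import random

# Hypothetical rule antecedents and consequents (not from the report).
CONDITIONS = ["obstacle_left", "obstacle_ahead", "obstacle_right", "path_clear"]
ACTIONS = ["steer_left", "go_straight", "steer_right"]

# Hypothetical training scenarios: (observed condition, desired action).
SCENARIOS = [
    ("obstacle_left", "steer_right"),
    ("obstacle_ahead", "steer_left"),
    ("obstacle_ahead", "steer_left"),
    ("obstacle_ahead", "steer_right"),  # deliberately conflicting example
    ("obstacle_right", "steer_left"),
    ("path_clear", "go_straight"),
]


def fitness(rules, scenarios):
    """Number of scenarios in which the rule table picks the desired action."""
    table = dict(zip(CONDITIONS, rules))
    return sum(1 for condition, wanted in scenarios if table[condition] == wanted)


def evolve(scenarios, population_size=20, generations=30,
           mutation_rate=0.1, seed=0):
    """Evolve one action per condition; returns (best rule table, history)."""
    rng = random.Random(seed)
    population = [[rng.choice(ACTIONS) for _ in CONDITIONS]
                  for _ in range(population_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda r: fitness(r, scenarios), reverse=True)
        history.append(fitness(population[0], scenarios))
        elite = population[: population_size // 2]  # survivors kept unchanged
        children = []
        while len(elite) + len(children) < population_size:
            mother, father = rng.sample(elite, 2)
            cut = rng.randrange(1, len(CONDITIONS))  # one-point crossover
            child = mother[:cut] + father[cut:]
            child = [rng.choice(ACTIONS) if rng.random() < mutation_rate else gene
                     for gene in child]
            children.append(child)
        population = elite + children
    best = max(population, key=lambda r: fitness(r, scenarios))
    return dict(zip(CONDITIONS, best)), history


BEST_RULES, HISTORY = evolve(SCENARIOS)
```

Because the elite half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next. A fuzzy version would encode membership-function parameters or rule weights in the chromosome and score each candidate in a navigation simulation instead of a lookup.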

**Design and Evaluation of Data Distribution Algorithms in the HLA **


**R. Al. Beheshti, A. Berrached, O. Sirisaengtaksin, and A. de Korvin (UHD) **

M. Bassiouni (UCF)

**Research Objectives and Significance: **

The High Level Architecture (HLA) provides a set of services to facilitate the explicit control of __data distribution__. However, the performance of HLA data distribution depends on the actual algorithm used to implement those services. The objectives of this project are (1) to examine various approaches to __relevance filtering__ in the context of the HLA and devise alternative algorithms that can improve their efficiency and filtering effectiveness, and (2) to develop better performance analysis methodologies to fully determine the performance characteristics of the various data distribution approaches.

**Army Relevance: **

The HLA, designated as the standard architecture for distributed simulation systems, has been supported by DMSO since its inception. Efficient and effective data distribution within this standard architecture is crucial to its scalability and future applicability. This project addresses issues that directly affect efficient and effective data distribution in the HLA.

**Methodology: **

We developed simulation programs in C/C++ to investigate the performance of various approaches to relevance filtering in the context of the HLA. We studied the fixed-grid approach, which has been used in a number of current HLA implementations. This is the simplest approach, incurs the least overhead, and is highly scalable. However, it is the least effective in terms of its filtering performance. Because the grid layout is predefined statically and remains fixed throughout the simulation, it results in inefficient utilization of multicast groups and relatively low reductions of irrelevant data traffic. Based on these results we devised a multi-resolution grid-based approach that allows the grid layout to be reconfigured to match the distribution of object attribute values in the routing space. Our evaluation study shows that this approach achieves better filtering, especially when the number of multicast groups is relatively limited. The filtering improvement is achieved at the cost of dynamic grid reconfiguration. One of the main conclusions from our study is that in the context of large-scale simulations (large number of federates and large number of entities) it is difficult to devise one single optimal algorithm that satisfies the needs and capabilities of all federates. We formulated a hierarchical scheme that allows filtering to be done in a sequence of stages with increasing levels of accuracy as data moves through the network hierarchy from sender to receiver. An important aspect of this approach is that it provides a framework for partitioning a large-scale federation into a hierarchy of smaller "sub-federations" and using different algorithms at different levels of the hierarchy according to the specific needs and capabilities of each "sub-federation".
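The trade-off in the fixed-grid approach can be illustrated with a short sketch. It assumes a two-dimensional routing space, axis-aligned update and subscription regions, and one multicast group per grid cell; the `Region` type, function names, and coordinates are illustrative and are not taken from the project's C/C++ simulators.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Axis-aligned rectangle in a 2-D routing space (hypothetical type)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float


def cells_for(region, cell_size):
    """Grid cells (column, row) overlapped by a region.

    Each cell stands for one multicast group in this sketch.
    """
    cols = range(int(region.x_min // cell_size), int(region.x_max // cell_size) + 1)
    rows = range(int(region.y_min // cell_size), int(region.y_max // cell_size) + 1)
    return {(c, r) for c in cols for r in rows}


def regions_overlap(a, b):
    """Exact test: do the two rectangles intersect?"""
    return (a.x_min <= b.x_max and b.x_min <= a.x_max
            and a.y_min <= b.y_max and b.y_min <= a.y_max)


def delivered(update, subscription, cell_size):
    """Grid-based test: data is delivered if the regions share any cell."""
    return bool(cells_for(update, cell_size) & cells_for(subscription, cell_size))


# Hypothetical regions that do not intersect but sit in the same coarse cell.
UPDATE = Region(1, 2, 1, 2)
SUBSCRIPTION = Region(7, 8, 7, 8)

COARSE = delivered(UPDATE, SUBSCRIPTION, cell_size=10)  # irrelevant data is sent
FINE = delivered(UPDATE, SUBSCRIPTION, cell_size=5)     # filtered out
```

With the coarse grid the two regions fall in the same cell, so the subscriber receives data it does not need. A finer grid filters that traffic out but consumes more multicast groups, which is the tension a reconfigurable grid resolution is meant to manage.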

**Accomplishments: **

The following tasks have been accomplished:

1. Developed simulation programs in C/C++ to investigate various approaches to relevance filtering in distributed simulation systems.
2. Devised new approaches that promise to improve the performance of data distribution management in High Level Architecture (HLA) distributed simulations. These include a hierarchical grid-based approach and a dynamic/variable resolution grid-based approach.
3. Extended the original hierarchical approach to incorporate clustering of objects that are in close proximity in the virtual world on simulation hosts that are physically close. Such clustering methods greatly reduce traffic over the WAN. In particular..
4. Evaluated the performance of the new approaches under various simulation parameters and compared it to the traditional fixed-grid approach that is employed in current implementations of the HLA.
5. Investigated methods to incorporate a simulation object's *fidelity* requirements in the data distribution services of the HLA. We define *fidelity* for an object as the minimum rate at which the object needs to receive data updates to maintain an accurate and consistent view of the "virtual world".

**Computer Security Model Based On Uncertain/Partial Information **


**R. Al. Beheshti, A. Berrached, and A. de Korvin (UHD) **

**Research Objectives and Significance: **

Distributed systems provide tremendous new opportunities and benefits to their users but also raise new challenges. Because of the many different components, functions, resources, and users and the tight coupling between the cooperating systems, security in distributed systems is more difficult to achieve than in regular computer networks.

Previously, we developed an access control security model that determines whether a user is permitted to access and perform particular operations on particular data sets based on the user's level of hostility and the sensitivity level of the data affected by the requested service. However, information such as a user's level of hostility tends to be difficult to represent, since it depends on several attributes which are themselves "fuzzy". The main objective of this project is to extend our previous work and develop a methodology that allows us to obtain the probability of hostility of a user in a system based on a set of available fuzzy values.


**Army Relevance:**

This work may have applications to security issues on distributed computer networks where the systems allow general access for trustworthy users. Methods developed in this project can be applied to other applications.

**Methodology: **

Previously, we developed an access control security model that determines whether a user is permitted to access and perform particular operations on particular data sets in the context of a distributed system. Given the level of hostility of a user in a distributed system, and the sensitivity level of the data affected by the requested service, the local host/security guard is called upon to evaluate whether such a request can be safely granted.

In general, information such as expected losses, user hostility, and allowable damage amounts is very difficult to assess precisely in numerical terms, so it is natural to express it in the form of fuzzy sets. In linguistic terms, a user can be described as very hostile, somewhat hostile, or not hostile, and the amount of damage can be expressed as very high, low, or very low. As a fuzzy expression, somewhat hostile, for instance, can be written as:

*Ph = .9/.2 + .8/.1*

where the supports (i.e. .2 and .1) are the probabilities of the user being hostile and the .9 and .8 are the membership values. Expected loss can also be expressed in a similar fashion, with the supports being expressed in dollar units for example.

We establish a procedure to determine whether a user *xi* should be allowed to perform the operation *oj* on data *dk* as follows:

(1) Find the expected loss *Eijk* by evaluating the fuzzy product *EWjk* × *Phi*, where *EWjk* is the worst loss expected from performing operation *oj* on data *dk*, and *Phi* is the estimated hostility level of user *xi*.

(2) Compare *Eijk* with the organization's damage tolerance t (the amount of damage the organization can tolerate) by constructing a maximizing set *M* from the fuzzy sets *Eijk* and t.

(3) Compute *Eijk* ∩ *M* and t ∩ *M*, then compare the two sets. If the greatest membership value of *Eijk* ∩ *M* is greater than the greatest membership value of t ∩ *M*, then permission is denied, since the expected loss from allowing user *xi* to perform operation *oj* on data set *dk* is larger than the amount of damage the organization can tolerate. On the other hand, if the greatest membership value of *Eijk* ∩ *M* is less than the greatest membership value of t ∩ *M*, then user *xi* is allowed to perform operation *oj* on data *dk*.
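The three steps above can be sketched for discrete fuzzy sets stored as `{support value: membership}` dictionaries. Several details are assumptions of this sketch rather than statements of the report: the product in step (1) is computed with the extension principle, the maximizing set uses the linear form M(x) = x / x_max over the combined (positive) supports, ties are treated as a denial, and the dollar figures are invented.

```python
def fuzzy_product(a, b):
    """Product of two discrete fuzzy sets via the extension principle.

    Each set maps a support value to its membership grade.
    """
    result = {}
    for x, mu_x in a.items():
        for y, mu_y in b.items():
            z = round(x * y, 6)  # rounding keeps float keys stable
            result[z] = max(result.get(z, 0.0), min(mu_x, mu_y))
    return result


def maximizing_set(*fuzzy_sets):
    """Linear maximizing set over the union of supports (assumed form).

    Assumes all support values are positive.
    """
    support = {x for fs in fuzzy_sets for x in fs}
    largest = max(support)
    return {x: x / largest for x in support}


def height_of_intersection(fuzzy_set, other):
    """Greatest membership value of the min-intersection of two sets."""
    return max(min(mu, other.get(x, 0.0)) for x, mu in fuzzy_set.items())


def access_allowed(worst_loss, hostility, tolerance):
    """Steps (1) to (3): allow only if the expected loss ranks below tolerance."""
    expected_loss = fuzzy_product(worst_loss, hostility)        # step (1)
    m = maximizing_set(expected_loss, tolerance)                # step (2)
    loss_height = height_of_intersection(expected_loss, m)      # step (3)
    tolerance_height = height_of_intersection(tolerance, m)
    return loss_height < tolerance_height  # ties are denied (assumption)


# "Somewhat hostile" as written in the text: Ph = .9/.2 + .8/.1
SOMEWHAT_HOSTILE = {0.2: 0.9, 0.1: 0.8}
# Hypothetical values, in dollars.
NOT_HOSTILE = {0.01: 1.0}
WORST_LOSS = {1000: 1.0, 5000: 0.6}
TOLERANCE = {300: 1.0, 400: 0.7}

DENIED_CASE = access_allowed(WORST_LOSS, SOMEWHAT_HOSTILE, TOLERANCE)
ALLOWED_CASE = access_allowed(WORST_LOSS, NOT_HOSTILE, TOLERANCE)
```

For the somewhat hostile user the expected loss reaches supports well above the tolerance, so its intersection with the maximizing set has the larger height and the request is refused. For the hypothetical non-hostile user the ordering is reversed and access is granted.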

However, the effectiveness of the model depends to a great extent on how accurately one can estimate such fuzzy quantities as a user's level of hostility and the expected worst loss.

It is clear that a user's level of hostility (*Ph*) depends on a number of attributes of the user which are themselves "fuzzy" (e.g. whether the user is attempting the access from a remote or local host, whether the user is attempting to access the local host from a friendly or hostile organization/country, the level of trustworthiness of the host, how closely the requested operation resembles the user's previous access habits, etc.). The traditional method used to estimate such a fuzzy quantity is to compute the fuzzy value of each of the attributes on which it depends on a normalized scale and take their (weighted) average. Though simple and computationally efficient, this method is not likely to produce accurate estimates, since the average function is only one of an infinite number of possible relationships. In this project, we developed a method for determining the relationship between a user's level of hostility and the set of factors that affect it. The relationship is determined from a training set (Ak, Phk) where each member of the set Ak is a fuzzy set of user attributes that are deemed to affect the user's level of hostility, and Phk is the set of corresponding user hostility levels.


**Accomplishments: **

We developed a framework for a security model using fuzzy sets to represent uncertain/incomplete information. The model allows a local host to determine access permission based on estimates of a user's level of hostility and the expected worst loss that can be caused by granting such permission. We have also devised a method for establishing a fuzzy relation between a "fuzzy" quantity (such as a user's level of hostility) and the attributes on which it depends. The important feature of this method is that the fuzzy relation it establishes is the __maximal__ relation, in the sense that it represents the strongest correspondence between the target quantity and its dependents. We have also devised a method for approximating fuzzy relations when an exact maximal solution cannot be found using the above method. This part of the work has been published in the proceedings of the IPMU2000 (Information Processing and Management of Uncertainty) conference. We are currently conducting a performance evaluation of this algorithm and comparing it to Neural Net performance for selected applications. We have also extended this methodology to another relevant application, namely military threat analysis in the context of target selection on the battlefield.

**Threat Analysis Using Fuzzy Set Equations **

**R. Al. Beheshti, A. Berrached, and A. de Korvin (UHD) **

**Research Objectives and Significance: **

One of the important factors in realistically simulating individual vehicles in the virtual battlefield is the targeting behavior of vehicles. Target analysis and selection involve several factors such as target detection, target identification, and threat analysis. The main objective of this project is to devise a threat analysis algorithm based on fuzzy set theory. Fuzzy set theory has the capability of expressing ambiguous and complex situations where some or all of the information available for evaluating the threat level of a set of targets is uncertain or not precisely known, as is often the case in real-life situations.

**Army Relevance:**

This work is directly relevant to Army applications, as all battlefield simulators, such as the Modular Semi-Automated Forces (ModSAF) system, use one algorithm or another for threat analysis and target selection.

**Methodology: **

The threat posed by various targets is a function of a variety of circumstances and factors. Based on a review of the literature, we have identified nine factors involved in threat analysis: aggregate threat assessment, near count threat, target's effective range, target firing status, aspect angle, relative elevation of target, target movement, target type, and sector of fire. The traditional method used to estimate the threat level of a target is to compute the fuzzy value of each of those factors on a normalized scale and take their (weighted) average. Though simple and computationally efficient, this method is not likely to produce accurate estimates, since the average function is only one of an infinite number of possible relationships. In this project, we developed an algorithm for determining the relationship between a target's threat level and the set of factors that affect it. The relationship is determined from a training set (Factork, Thk) where each member of the set Factork is one of the factors that affect the target's threat level, and Thk is the corresponding fuzzy set of threat levels. The problem can then be stated as follows: given a training set of pairs of fuzzy sets *(Factor1, Th1), (Factor2, Th2), ..., (FactorN, ThN)*, estimate *R* such that the system of fuzzy equations *Factork ∘ R = Thk* is satisfied for all *k = 1, 2, ..., N*, where ∘ is the sup-min composition operator.
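For finite universes, a system of sup-min equations of this form has a well-known greatest solution whenever any solution exists: the element-wise minimum over k of the α-composition of each input with its output, where α(a, b) = 1 if a ≤ b and b otherwise. The sketch below illustrates that textbook construction; it may or may not coincide with the algorithm devised in this project, and the membership vectors are invented.

```python
def alpha(a, b):
    """Goedel implication: the largest r with min(a, r) <= b."""
    return 1.0 if a <= b else b


def sup_min(a, relation):
    """Sup-min composition of a fuzzy set `a` with a fuzzy relation."""
    n_cols = len(relation[0])
    return [max(min(a[i], relation[i][j]) for i in range(len(a)))
            for j in range(n_cols)]


def maximal_relation(pairs):
    """Candidate greatest relation R for a[k] o R = b[k], over all pairs."""
    n_rows = len(pairs[0][0])
    n_cols = len(pairs[0][1])
    return [[min(alpha(a[i], b[j]) for a, b in pairs) for j in range(n_cols)]
            for i in range(n_rows)]


def is_exact(pairs, relation, tol=1e-9):
    """True when the relation reproduces every training output."""
    return all(
        all(abs(p - q) <= tol for p, q in zip(sup_min(a, relation), b))
        for a, b in pairs
    )


# Hypothetical training pairs: (factor memberships, threat-level memberships).
SOLVABLE = [
    ([1.0, 0.3], [0.3, 0.9]),
    ([0.4, 1.0], [0.7, 0.4]),
]
UNSOLVABLE = [
    ([1.0, 0.3], [0.2, 0.9]),
    ([0.4, 1.0], [0.7, 0.4]),
]

R_HAT = maximal_relation(SOLVABLE)
```

When `is_exact` returns `False` for the candidate, as it does for the second training set, the system has no exact solution and some approximation is needed.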

**Accomplishments: **

We have devised a method for establishing a fuzzy relation between a "fuzzy" quantity, such as a target's threat level, and the factors that affect it. The important feature of this method is that the fuzzy relation it establishes is the __maximal__ relation, in the sense that it represents the strongest correspondence between the target quantity and its dependent factors. This algorithm can be viewed as a learning algorithm: as more precise information is obtained, the relation can be fine-tuned to better fit the new training set. We have also devised a method for approximating fuzzy relations when an exact maximal solution cannot be found using the above method. We are currently conducting a performance evaluation of this algorithm and comparing it to Neural Net performance for selected applications.

**Modeling Dust Behavior Using Fuzzy Controllers**

**O. Sirisaengtaksin (UHD) **

**Research Objective and Significance:**

The objective of this project is to integrate a physically based model with fuzzy controllers to visualize dust behavior. The project focuses on realistic simulation of the dust generated by a fast-traveling vehicle, so that virtual environments involving moving vehicles can simulate physically realistic dust behavior.

**Army Relevance:**

Many virtual environments and distributed interactive simulations include trucks, armored vehicles, and other moving objects. However, simulations of objects traveling on an unpaved road typically do not generate dust. Simulating physically realistic, complex dust behavior is useful in interactive graphics applications, such as those used for education, entertainment, and training. The results of this project can provide a realistic virtual environment for simulating a moving vehicle together with the dust it raises.

**Methodology:**

We will develop a physically based model for dust behavior. The model will utilize particle systems, rigid-particle dynamics, and fluid dynamics. Fuzzy controllers will be integrated for ease of computation and for fast simulation. OpenGL will be used to generate the graphics for the simulated model.

**Physical model for dust behaviors **

**Fluid dynamics**

We have investigated the physical problem of airflow around a moving vehicle using separated flow. The main feature of the flow is the turbulence at the boundary of, and behind, the moving vehicle. The model under study uses the elliptic nature of the flow to describe pressure effects:

[equation not recoverable]

The fluid-dynamics results provide the laminar flow around the vehicle, where the velocity **V** and pressure *p* at any point in the flow volume are calculated.

The air velocity can be described using Prandtl's boundary layer theory as follows:

[equation not recoverable]

where *u* is a random vector, and *L* is the size of the vehicle (*L*car, *W*car, *H*car).

**Dust-particle dynamics**

The dynamics of a dust particle consist of three stages: generation, movement, and extinction. A dust particle is generated with an initial position and velocity. Once a dust particle enters the air, its motion follows Newton's laws. The external forces include gravity and the air-friction drag caused by the turbulent airflow around the vehicle and by the surrounding wind. The drag force caused by vehicle movement can be described by

[equation not recoverable]

where *V*p is the current velocity of the particle, *V*air is the current velocity of the fluid at the point of the particle relative to the car, and *V*car is the current velocity of the car.
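The three stages of a dust particle's life (generation, movement, extinction) can be sketched with a simple time-stepping loop. The drag law used in the project is not reproduced in this text, so the sketch assumes a linear drag proportional to the difference between the air velocity and the particle velocity, a constant air velocity, and invented parameter values; it is an illustration of the particle-system idea, not the project's model.

```python
from dataclasses import dataclass

GRAVITY = (0.0, 0.0, -9.81)  # m/s^2, with z pointing up


@dataclass
class Particle:
    position: tuple
    velocity: tuple
    age: float = 0.0


def step(particle, air_velocity, drag_coefficient, mass, dt):
    """Advance one particle by dt with a semi-implicit Euler step.

    Assumed drag law: F = drag_coefficient * (V_air - V_p).
    """
    drag = tuple(drag_coefficient * (va - vp)
                 for va, vp in zip(air_velocity, particle.velocity))
    accel = tuple(g + d / mass for g, d in zip(GRAVITY, drag))
    velocity = tuple(v + a * dt for v, a in zip(particle.velocity, accel))
    position = tuple(p + v * dt for p, v in zip(particle.position, velocity))
    return Particle(position, velocity, particle.age + dt)


def is_extinct(particle, max_age):
    """Extinction: the particle has landed or outlived its lifetime."""
    return particle.position[2] <= 0.0 or particle.age >= max_age


def simulate(particle, air_velocity, drag_coefficient=0.5, mass=1.0,
             dt=0.01, max_age=5.0):
    """Generation is the initial state; movement runs until extinction."""
    path = [particle]
    while not is_extinct(path[-1], max_age):
        path.append(step(path[-1], air_velocity, drag_coefficient, mass, dt))
    return path


# Hypothetical particle kicked upward just above the ground, in a 3 m/s air
# stream along +x standing in for the flow behind the vehicle.
PATH = simulate(Particle((0.0, 0.0, 0.1), (0.0, 0.0, 2.0)),
                air_velocity=(3.0, 0.0, 0.0))
```

In a fuller model the air velocity would be sampled from the flow field at the particle's position at every step, which is exactly the per-particle cost that a fuzzy controller approximating the dynamics could reduce.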

**Accomplishments:**

We have developed a physical model of dust-particle dynamics. This model has been used to develop a fuzzy controller that mimics the mathematical model of dust-particle dynamics. We are in the process of developing a simulation using FuzzyCLIPS and OpenGL to test our results.

Last updated or reviewed on 10/19/10