Human behavioral simulator for cognitive decision-making

Info

Publication number: 20100112528
Type: Application
Filed: Jul 9, 2009
Publication Date: May 6, 2010
Applicant:
Inventors: Tyson Alan Griffin (Oviedo, FL), Kevin Robert Geib (Orlando, FL), Lisa Solbra Ouakil (Orlando, FL), Robert Thomson McCormack (Oviedo, FL), Asuncion Lacson Simmonds (Winter Springs, FL), Courtney Kathleen McNamara (Oviedo, FL)
Application Number: 12/460,172

Abstract

A simulator is disclosed for training a responder to react in situations that have the potential for the use of force when encountering a main character, such as a person, whom a police officer might encounter during a routine stop and search. The simulator is equipped with natural language recognition and processing such that the main character reacts to voice commands from the trainee. In addition, feedback from the weapon and the weapon aimpoint alters the behavior of the main character. The dynamic scenario architecture is responsive to the language and weapon and interacts with the main character to affect the main character's behavior. The main character is equipped with a behavioral model to vary behavior in a given scenario.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

Pursuant to 35 U.S.C. 119 (e), this application claims priority to the filing dates of: U.S. Provisional Patent Application Ser. No. 61/198,304 filed on Oct. 31, 2008.

BACKGROUND

This invention relates to the field of simulator based training and more particularly to use of adaptive emulation simulation.

Judgment in the use of force is a necessary part of a training curriculum for law enforcement professionals. The training exposes a law enforcement professional to situations that require a measured response to a spectrum of reactions. Typically, responses include the use of verbal authority and physical force, up to and including deadly force.

In the exercise of judgment, the law enforcement professional typically encounters numerous situations and personality types leading to numerous potentially different reactions and responses. Generally, such situations require training in situational awareness and decision making under stress. Examples of these situations include, but are not limited to vehicle stops, domestic disputes, airport security situations, school shootings, bank robberies, drug enforcement actions, arrests, and similar situations.

A total training curriculum for a law enforcement professional typically consists of classroom instruction, weapon qualification, scenario based training, and on-the-job training. Scenario based training further comprises role playing, force-on-force, and simulation based training (SBT). A law enforcement simulation based training system incorporates the following components: (1) one or more computers, (2) one or more media players, which can play either video or computer generated graphics, (3) one or more screens that display interactive scenarios, and (4) one or more specially configured weapons.

Central to the use of interactive scenarios is trainee interaction with the screen images. The state-of-the-art in Law Enforcement SBT has progressed to using video clips and computer generated image files, but the clips and files do not respond in real-time to voice and sequential actions of the trainee (i.e., not dynamic). “Dynamic” is defined within this specification as providing changing scene architecture and realistic character reaction to the verbal and physical actions of the person being trained through the use of the simulator. It is also within the contemplation of the definition of “Dynamic” to include interaction or feedback between the scene architecture and the character reaction. “Verbal and physical actions” are those actions that can be processed by the simulator computer such as trainee speech, weapon aimpoint and trigger pull, and may include physical movements, including haptics.

Computer generated imagery (CGI) is the rendering of three dimensional geometrical representations of simulated or real world environments. The goal of computer generated imagery is to provide realistic scenes with characters that are lifelike. Typically, lifelike characters include expressive faces and human realistic motions. Expressive faces include gaze control, blinking, facial control and speech with lip synchronization. As used herein the “main character” is the character being confronted by the trainee (responder) and reacting to the trainee, as will be specifically described hereinbelow. A behavioral model defines a character's state within a scenario. The behavioral model affects the scenario main character and manifests itself as the outcome of cumulative factors affecting the facial expressions, verbal responses, bodily movements and attitude of the scenario object to different stimuli including the trainee.

The advantages of CGI are numerous. CGI eliminates the time, expense, and limitations associated with filming video scenarios. It greatly enhances variability of scenarios. CGI allows the user to uniquely define the training environment including lighting, props and sound effects. It can accelerate the process of developing and editing scenarios to meet changing requirements. It can be used to represent precise aim point and bullet impact location. It also provides for improved after action review (AAR) capability because the AAR scenario can be viewed from any angle within the three dimensional scene representations.

Therefore, there is a need for a training simulator with CGI having one or more characters that realistically and dynamically respond to the input of a trainee, and to the scenario environment.

There is a further need to integrate speech recognition technology to trigger and expose the trainee (responder) to different stimuli based upon the trainee's response and the reactions of the scenario “main character.”

There is an additional need to incorporate a behavioral model into the main character in order to improve human characteristics and create a more realistic encounter. The human characteristics are defined as showing emotion, demonstration of compliant, non-compliant and pre-assault indicators, mood changes, and reaction to speech, weapon position and trigger-pull. In addition, there is a need to integrate the latest weapon and control technology such as conducted energy devices (CEDs), exemplified by the TASER® X26.

These and other features and advantages of the present invention may be better understood by considering the following description of certain preferred embodiments. In the course of this description, reference will frequently be made to the attached drawings.

DRAWINGS

Referring now to the drawings wherein like elements are numbered alike in the several FIGURES:

FIG. 1 is a diagram of an embodiment showing the cooperation among physical components.

FIG. 2 is a functional diagram of an embodiment showing the cooperation among the components.

FIG. 3 is an exemplary flowchart showing the incorporation of behavioral states into a training scenario.

SUMMARY

A simulator is disclosed for training a responder to react in situations that have a potential for use of force when encountering a main character in a scene rendering. The main character has the ability to exhibit a spectrum of behavior. The simulator comprises a natural language interface for processing voice input from the responder, a weapon tracking system having a weapon aimpoint input, a trigger-pull input and at least one simulated weapon. The weapon tracking system provides tracking aimpoint and trigger-pull from each simulated weapon. Also included is a logic module comprising a component interface, and an image generator for the main character and the scene rendering. The image generator is responsive to the logic module. The logic module is responsive to the natural language interface, and the weapon tracking system. The image generator is responsive to the natural language interface, and the weapon tracking system and the logic module, for the main character and the scene rendering to react to the voice input, the aimpoint and the trigger-pull input. The natural language interface, the logic module, the computer image generator and the weapon tracking system are for training a responder to react in situations that have the potential for the use of force when encountering a character having the ability to exhibit a spectrum of behavior.

A computer program product embodied on a tangible computer readable storage medium for training a responder equipped with a weapon to react in situations that have the potential for the use of force when encountering a main character is disclosed. The computer program comprises recognizing natural language for processing speech input from the responder, processing weapon aimpoint and trigger-pull data from the weapon, processing dynamic scenario architecture responsive to the natural language speech recognition and further responsive to the weapon tracking system, determining the main character behavioral states, associating a behavioral state with the main character. The main character reacts to speech input from the responder, to the weapon aimpoint and the trigger-pull data from the responder, and to the scene rendering.

DESCRIPTION

Referring to FIG. 1, an exemplary embodiment of a use of force simulator (hereafter “simulator”) is shown generally at 10. Components include one or more computers 12 to run image generator, speech recognition, and weapon tracking software, with the computer(s) having typical input and output devices (i.e., human/machine interface devices). A wireless headset 14 with a power supply and USB cable for audio input communicates with the speech recognition computer. A sound system device (not shown) outputs voice and scenario sounds and receives input from the speech and tracking computer. A projector 16 with a power supply cable receives images from the image generator and projects the images onto a projection screen 18. An infrared (IR) sensitive CCD camera 20 images the projection surface and provides video data to the computer PCI (Peripheral Component Interconnect) capture board. Simulated training weapons 22a,b,c instrumented with near-IR laser emitter and driver circuitry image a laser spot on the projection screen 18 when pointed at the virtual scene. The PCI capture board interfaces with a weapon tracking application to determine laser spot location and thus provides aimpoint tracking and trigger-pull detection. The training weapons include, but are not limited to, conducted energy devices (CEDs) 22a, side arms 22b, and OC/Pepper Spray 22c, each having a laser emitter, microcontroller and driver circuitry, and power supply. In addition, various power strips, cables and communication hubs are used to connect the system components. A video camera 24 records the trainee for after action playback and review.

Referring to FIG. 2, an exemplary block diagram showing the overall software-data relationship is illustrated at 50. The main portions of the software-logic are a scenario file 52, a plug-in 54, instructor operator station (IOS) software 56, weapon tracking system 58, speech recognition system 60, voice recorder 62, and video recorder 64. In the exemplary embodiment, the system includes scenario software for the scenario functionality.

The scenario software authoring package used is DI-GUY, which creates and renders the 3D virtual environment. DI-GUY Scenario is a well known commercial-off-the-shelf 3D simulation tool, and it is within the contemplation of the disclosure to use alternative image generation software, or to develop a new scenario engine rendering tool. DI-GUY Scenario provides specific elements to implement a basic CGI based human avatar behavioral model. DI-GUY Scenario provides for additional computer programming and implementation to modify the scenario files that provide system scenarios including computer generated images. The scenario files are a portion of the dynamic scenario architecture and include software and data that provide functionality for characters, sounds, signals, views, variables including behavior, paths, event handlers, and time and scene objects.

As is well known in the art, all files may reside in modules, “module” being a generic term that defines an item of hardware (firmware) or specific computer software code that resides on one or more magnetic or other recording media for the provision of logic or data to a computer.

The Plug-in 54 is a portion of the dynamic scenario architecture and is the logic module that interfaces with the image generator 52 to modify the characteristics of the 3D virtual environment and to produce the disclosed embodiment. The Plug-in includes software and data that provide an interface for weapons tracking, speech recognition, voice recording, video recording and the image generation software.
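The interfacing role of the Plug-in can be pictured with a minimal sketch. It is not the disclosed plug-in code: the class name `PlugIn`, its method names, and the idea of passing the image generator as a plain callable are all assumptions made for illustration.

```python
class PlugIn:
    """Hypothetical logic module that forwards component inputs to a scene."""

    def __init__(self, image_generator):
        # image_generator: any callable taking (source, payload); an assumption.
        self.image_generator = image_generator
        self.log = []  # simple event log, e.g. for later review

    def on_speech(self, signal):
        """Forward an interpreted speech signal."""
        self._forward("speech", signal)

    def on_weapon(self, aimpoint, trigger_pulled):
        """Forward weapon aimpoint and trigger-pull data."""
        self._forward("weapon", {"aimpoint": aimpoint, "trigger": trigger_pulled})

    def _forward(self, source, payload):
        # Record the input, then hand it to the image generator.
        self.log.append((source, payload))
        self.image_generator(source, payload)
```

Keeping one forwarding path means every input that reaches the scene is also logged, which is one plausible way to support event logging and after action review.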

Instructor Operator Station

An Instructor Operator Station (IOS) provides the instructor user interface for scenario control and override, scenario status, event logging, instructor notes, after action review (AAR) and system startup and shutdown. The IOS user interface is displayed on a computer monitor.

Image Generator

The image generator 52 provides computer generated scenes (e.g. park, schoolhouse, and alley) with props, and one or more characters. Characters have expressive faces, human realistic motions, and “react” to speech, weapon position, and weapon trigger pull. Character reaction is a function of the behavioral model and is described herein below.

In the illustrated embodiment, the behavioral model residing in or cooperating with the image generator defines the behavioral states of the main character. The behavioral model is based upon programmed logic, but it is within the contemplation of the embodiment that the behavioral model could be based on more complex and rigorous computational human behavioral models and cognitive architectures.

Referring to FIG. 3, an exemplary behavioral model is shown at 100. There are 4 levels of behavior 110, 112, 114, 116 that can further be refined into more levels/sub-levels if needed. Each behavioral state is defined by a set of character actions, facial expressions, and verbal responses, which determine the appearance of compliance/non-compliance of scenario main characters. In addition, behavioral state partially determines a scenario end path 118, 120, 122, 126, 128, 130.

Signals activate script files which are programs. Conditional logic is inherent in the design of scenarios, and affects the progression and outcome of a particular scenario.

Behavioral state is controlled by a variable that is set or changed in one of several ways:

- 1. by a script 100 that sets and modifies the variable, Mood, based on percentages
- 2. by signals activated by the Instructor Operator Station IOS 56 (See FIG. 2)
- 3. by signals activated by specific verbal commands or combinations of verbal commands 60 (See FIG. 2.)
- 4. by signals activated by position and activation of weapons 58 (See FIG. 2.)

Behavioral state (1) 110 is most compliant; the character will obey commands and respond to inquiries, and will select compliant end paths (such as surrender or calmly walk away). Behavioral state (4) 116 is the most non-compliant; the character will respond negatively to inquiries and commands, show pre-assault indicators, and select non-compliant end paths. Behavioral state (2) 112 leans towards compliance and behavioral state (3) 114 leans toward non-compliance.
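The four-level behavioral model described above can be illustrated with a minimal sketch. It is not the disclosed scenario script: the names `BehavioralState`, `pick_initial_state`, `apply_signal` and `is_compliant_end_path`, the example percentages, and the clamped level shift per signal are assumptions made for illustration only.

```python
import random
from enum import IntEnum


class BehavioralState(IntEnum):
    """Four behavioral levels; 1 is most compliant, 4 is most non-compliant."""
    COMPLIANT = 1
    LEANS_COMPLIANT = 2
    LEANS_NON_COMPLIANT = 3
    NON_COMPLIANT = 4


def pick_initial_state(percentages, rng=None):
    """Choose a starting Mood from per-state percentages (hypothetical values)."""
    rng = rng or random.Random()
    states = list(percentages)
    weights = [percentages[s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


def apply_signal(state, delta):
    """Shift the Mood by delta levels, clamped to the range 1..4.

    A signal may come from the IOS, a recognized verbal command, or
    weapon position and activation; the size of delta is an assumption.
    """
    level = min(max(int(state) + delta, 1), 4)
    return BehavioralState(level)


def is_compliant_end_path(state):
    """In this sketch, states 1 and 2 select compliant end paths."""
    return state <= BehavioralState.LEANS_COMPLIANT
```

For example, `pick_initial_state({BehavioralState.COMPLIANT: 40, BehavioralState.NON_COMPLIANT: 60})` would start the character non-compliant in roughly 60 percent of runs, and a later `apply_signal(state, -1)` could model a verbal command that calms the character by one level.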

Natural Language Interface

The natural language interface includes the hardware, software and data files that enable the main character to process trainee speech and output speech and sounds. The software comprises a speaker independent speech recognition system using domain specific data and allows any trainee/user/speaker to freely converse with the CGI main characters presented by the system, and recorded speech to allow CGI main characters to verbally act and react with the trainee (responder). It is within the contemplation of the embodiment that instead of recorded speech, synthesized speech could provide character verbal responses. Digital Audio capability includes both synthesized and recorded speech.

In the exemplary embodiment, speech recognition is effected by a statistical language model based on recorded speech, obtained from law enforcement training exercises. A word spotting technique provided by a recognizer application is also used. It is within the contemplation of the disclosure to implement natural language processing software that includes a parser and/or intelligent agent to allow a more natural interaction between the system and human speakers.

The speaker independent speech recognizer is a commercial-off-the-shelf product, Nuance™ V8.5. This product is a hidden Markov model (HMM) based system that implements hierarchical sets of HMMs in order to statistically model continuous human speech. Tools provided by this product allow creation of custom speech models and applications. Speech recognition using hidden Markov techniques is well known and is disclosed in U.S. Pat. No. 6,529,866, “Speech Recognition System and associated methods”, issued on Mar. 4, 2003, to the Department of the Navy, which is herein incorporated by reference in its entirety. As is well known in the art, other speech recognizers, including those based on other technologies, are available and within the contemplation of the embodiment.

A domain specific statistical language model (SLM) is used to model the phraseology flexibly to permit interaction with a simulated character. Initial corpora for the model were developed through discussion with law enforcement subject matter experts, and then iteratively improved based on transcribed speech obtained from a series of law enforcement training exercises during field evaluation. A language interpretation feature provided in the speech recognition system is used to identify points of interest within an utterance. The corpora are divided into categories to minimize issues with ambiguity in interpretation. The corpora cover vocabulary a trainee might say to request a specific action such as “put your hands up.” They also cover vocabulary a trainee might use to obtain information such as “what is your name?” as well as to issue warnings such as “you're under arrest.”

An application performs recognition and provides interpreted results to the Plug-in so that the main character responds to the modeled set of commands and inquiries. The application is designed to “listen” to the trainee continuously through the exercise and process individual utterances by implementing end pointing methods in the speech recognition system that monitor changes in the signal to noise ratio to detect the points when a phrase begins and when it ends. Individual utterances are then processed as said in order to be interactive with the character.
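The end pointing idea can be sketched as follows. This is an illustration only and not the method used inside the commercial recognizer: the function names, the 10 dB threshold, the two-frame hangover, and the assumption that a noise energy estimate is already available are all hypothetical.

```python
import math


def frame_snr_db(frame_energy, noise_energy):
    """Signal-to-noise ratio of one frame in decibels (energies must be > 0)."""
    return 10.0 * math.log10(frame_energy / noise_energy)


def find_utterances(frame_energies, noise_energy, threshold_db=10.0, hangover=2):
    """Return (start, end) frame index pairs, with end exclusive.

    An utterance begins at the first frame whose SNR reaches threshold_db
    and ends once `hangover` consecutive frames fall below it.
    All parameter values here are illustrative assumptions.
    """
    utterances = []
    start = None
    quiet = 0
    for i, energy in enumerate(frame_energies):
        loud = frame_snr_db(energy, noise_energy) >= threshold_db
        if loud:
            if start is None:
                start = i  # phrase begins
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= hangover:
                # phrase ended at the first of the trailing quiet frames
                utterances.append((start, i - hangover + 1))
                start = None
                quiet = 0
    if start is not None:
        # input ended while a phrase was still open
        utterances.append((start, len(frame_energies) - quiet))
    return utterances
```

The hangover keeps a brief pause inside a phrase from splitting it into two utterances; each returned span would then be passed to the recognizer as one utterance.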

Weapon Tracking System

Referring again to FIG. 1 the weapon tracking system consists of one or more simulated lethal and non-lethal weapons 22a,b,c (e.g. firearm, conducted energy device or CED such as the TASER X26 device, and OC/pepper spray) that are un-tethered to the rest of the system. Alternately, one or more devices can be tethered to the system. Weapon position and trigger-pull are detected by the CCD camera 20 and/or other sensing devices, for example a trigger limit switch, or a recoil sensor, which interface with the image generator (through the Plug-in) allowing scene elements, in particular, a main character, to respond to weapon position and trigger-pull. Weapons provide varying force options for use-of-force training. Characters, scenes, and props respond to the type of weapon used by the trainee.

It is within the contemplation of the embodiment to utilize continuous aimpoint tracking and trigger pull to affect the reaction of the characters, scenes, and props. Aimpoint tracking is well known, and is disclosed in U.S. Pat. No. 5,213,503, “Team Trainer”, issued May 25, 1993, to the Department of the Navy, which is herein incorporated by reference in its entirety, and in U.S. Pat. No. 5,215,465, “Infrared Spot Tracker”, issued Jun. 1, 1993, to the Department of the Navy, which is herein incorporated by reference in its entirety.
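Locating the laser spot in a camera frame can be sketched in a few lines. This is an assumed approach (thresholding followed by an intensity-weighted centroid) and not the method of the referenced patents or of the disclosed tracking application; the function names, the threshold of 200, and the simple linear scaling to screen coordinates are hypothetical.

```python
def find_laser_spot(frame, threshold=200):
    """Return the (x, y) intensity-weighted centroid of pixels >= threshold.

    `frame` is a list of rows of 0-255 intensities from the IR camera.
    Returns None when no pixel is bright enough (no spot visible).
    """
    total = 0
    sum_x = 0
    sum_y = 0
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value >= threshold:
                total += value
                sum_x += x * value
                sum_y += y * value
    if total == 0:
        return None
    return (sum_x / total, sum_y / total)


def camera_to_screen(spot, cam_size, screen_size):
    """Scale camera pixel coordinates to screen coordinates.

    Assumes the camera view exactly covers the screen; a real system
    would need a calibration step, which is not modeled here.
    """
    (x, y), (cw, ch), (sw, sh) = spot, cam_size, screen_size
    return (x * sw / cw, y * sh / ch)
```

The resulting screen coordinate is the aimpoint that would be passed on so that the main character can react to where the weapon is pointed.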

Video Capture

The video recorder 64 (see FIG. 2) interfaces with the video camera 24 (see FIG. 1) and provides real-time playback and recording of the user/trainee during scenario runtime. The voice recorder 62 (see FIG. 2) interfaces with the microphone 14 (see FIG. 1) and provides real-time recording of the user/trainee during scenario runtime.

During after action review (AAR) the plug-in component interface 54 (see FIG. 2) synchronizes the video recorder's playback of the user/trainee with playback of the trainee voice recording. The plug-in component interface provides the after action review capability by synchronizing playback of the recorded 3-dimensional scenario environment, the video recording of the user/trainee, and the sound recording of the user/trainee. During after action review the operator can control scenario playback time, move the viewpoint within the 3-dimensional scenario environment, and jump to scenario events based on time. The recorded scenario data can be saved to magnetic media, or other typical computer storage devices. The plug-in component interface provides the capability to load a previously stored scenario dataset for after action review.
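The synchronized playback and event jumping described for after action review can be sketched with one shared clock. The class `ReviewTimeline`, its methods, and the example event labels are assumptions for illustration; the disclosure does not specify how the recordings are kept in step.

```python
import bisect


class ReviewTimeline:
    """Keeps scenario, video and audio playback on one shared clock."""

    def __init__(self, events):
        # events: iterable of (time_seconds, label); stored sorted by time.
        self.events = sorted(events)
        self.times = [t for t, _ in self.events]
        self.position = 0.0

    def seek(self, seconds):
        """Move all three recordings to the same playback time."""
        self.position = max(0.0, seconds)
        return {"scenario": self.position,
                "video": self.position,
                "audio": self.position}

    def jump_to_next_event(self):
        """Seek to the first logged event after the current position."""
        i = bisect.bisect_right(self.times, self.position)
        if i == len(self.events):
            return None  # no later event
        time, label = self.events[i]
        self.seek(time)
        return label
```

Because every recording is positioned from the same `position` value, moving the viewpoint or jumping between events cannot leave the trainee video or audio out of step with the 3-dimensional scenario in this sketch.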

Instructor Operator Station (IOS)

Referring again to FIG. 2, the instructor operator station (IOS) 56 provides scenario configuration, selection and control capabilities. This station allows an instructor/operator to control scenario elements including main character behavior, and provides scenario override capabilities. The interactions are among the scenario file, the plug-in, the instructor operator station software, the speech recognition system software, the voice recorder, the video recorder, and the weapon tracking software. The scenario file is affected by both the instructor operator station (IOS) and the plug-in. Commands from either of these systems can be triggered at any time. The IOS is affected by parameters in the scenario file.

Scenario Development

Prerequisite knowledge of the system image generator, scenario and user guides is helpful for development of scenarios for the disclosed embodiment. If using DI-GUY, experience with Perl Script, the main programming language used with DI-GUY, and C++ is helpful.

Referring to FIG. 3, generally, each scenario has multiple end branches that can force the trainee to make final decisions and to take force action. The number of end branches in each scenario varies.

As the term has been used within this description “main character(s)” are human avatar(s) with whom the trainee (responder) interacts. Scenario character instances are defined by the Scenario File. The human character's appearance can vary based on race, gender, and age. In order for avatars to appear human, the character's lips could be in sync with character verbal responses, the main character preferably blinks naturally, and preferably the main character realistically responds to trainee commands.

The Plug-in

The plug-in also controls several scenario elements behind the scenes. It is responsible for controlling the inputs from weapon tracking, speech recognition, voice recording, and video recording into the image generator software. Variables and scripts used by the scenario are read by the plug-in, in order to force events and effects to occur.

System response to speech recognition is controlled throughout the scenario. Speech input triggers signals that are relevant to the scenario at a given time. These particular relevant signals are contemporaneously available on the IOS. If the trainee speaks a command that is not currently available, the command will be ignored. If the trainee's speech input is not producing recognizable results, the signals on the IOS can be used as a back up. By triggering the signal associated with the trainee command/question, the main character will respond as if the speech input was recognized or correctly processed by the speech recognizer. A slot map is used by the speech recognition system to map trainee speech to signals used in the scenario.
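The slot map and the handling of unavailable commands can be sketched as a simple lookup. The signal names, the `SLOT_MAP` contents, and the exact-phrase matching are assumptions for illustration; the disclosed recognizer uses a statistical language model and interpretation features rather than literal string matching.

```python
# Hypothetical slot map: interpreted phrase -> scenario signal name.
SLOT_MAP = {
    "put your hands up": "SIG_HANDS_UP",
    "what is your name": "SIG_ASK_NAME",
    "you're under arrest": "SIG_ARREST_WARNING",
}


def normalize(utterance):
    """Lowercase and strip surrounding spaces and punctuation for lookup."""
    return utterance.lower().strip(" .?!")


def signal_for_speech(utterance, active_signals):
    """Map recognized speech to a signal, or None if not currently available."""
    signal = SLOT_MAP.get(normalize(utterance))
    if signal in active_signals:
        return signal
    return None  # unrecognized or unavailable commands are ignored


def signal_from_ios(signal, active_signals):
    """Instructor backup: trigger an available signal directly from the IOS."""
    return signal if signal in active_signals else None
```

With `active_signals = {"SIG_HANDS_UP", "SIG_ASK_NAME"}`, the phrase "Put your hands up!" yields `SIG_HANDS_UP`, while "You're under arrest." is ignored because that signal is not relevant at that point in the scenario; the instructor could still trigger any active signal through `signal_from_ios`.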

A trainee's dialogue is recorded by the voice recorder. This dialogue is saved to an audio file that is associated with the AAR. When an AAR session is running, the associated audio will play in time-sync with the scenario. The audio file allows the trainee and instructor to hear the commands given by the trainee and the main character(s)' response to commands.

During scenario play, a video camera records all the movements of the trainee using the video recorder. This footage is saved for use in AAR. The video recorder playback allows the trainee and instructor to review the physical presence and actions of the trainee.

As can be appreciated the embodiment provides a high level of interactivity between trainees and the simulator through its natural language interface. This disclosure provides realistic use-of-force situations, not just shoot/don't shoot situations. This simulator provides the full spectrum of use-of-force situations allowing a realistic range of control, including verbal commands through the use of lethal force. Communication skills and cognitive decision-making skills relating to use-of-force situations can be taught using this simulator, eliminating or drastically reducing training time with role players.

The disclosed simulator system embodiment incorporates all of the following in one system: natural language interface, weapon tracking, and computer generated imagery with human realistic characters having compliance and non-compliance behaviors, scenario generation and after action review capabilities. A particular advantage is that a responder is trained in using various degrees of force in response to non-compliant behaviors by being provided multiple simulated weapons that have varying kinds of simulated output.

While preferred embodiments have been shown and described, various modifications and substitutions may be made thereto without departing from the spirit and scope of the present invention. Accordingly, it is to be understood that the present invention has been described by way of illustrations and not limitation.

Claims

1. A simulator for training a responder to react in situations that have a potential for use of force when encountering a main character in a scene rendering, the main character having the ability to exhibit a spectrum of behavior, the simulator comprising:

a. a natural language interface for processing voice input from the responder;

b. a weapon tracking system having a weapon aimpoint input, a trigger-pull input and at least one simulated weapon, the weapon tracking system being for tracking aimpoint and trigger-pull from each simulated weapon;

c. a logic module comprising a component interface;

d. an image generator for the main character and the scene rendering, the image generator being responsive to the logic module;

e. the logic module being further responsive to the natural language interface, and the weapon tracking system;

f. wherein the image generator is responsive to the natural language interface, and the weapon tracking system and the logic module, for the main character and the scene rendering to react to the voice input, the aimpoint and the trigger-pull input; and

g. wherein the natural language interface, the logic module, the computer image generator and the weapon tracking system are for training a responder to react in situations that have the potential for the use of force when encountering a character having the ability to exhibit a spectrum of behavior.

2. The simulator of claim 1 further comprising an instructor operator station having an output device wherein the logic module and the image generator are responsive to signals from the instructor operator station.

3. The simulator of claim 1 having a natural language speech input which is visible on the instructor operator station output device, and input from the instructor operator station overrides the natural language input.

4. The simulator of claim 1 further comprising a digital audio capability wherein the main character includes voice output that reacts to voice input from the responder.

5. The simulator of claim 1 wherein the natural language interface comprises:

a. a speaker independent speech recognition system using domain specific corpora for natural language understanding to provide for the trainee to freely converse with the main character presented by the computer image generator;

b. digitized audio for the main character to verbally converse with the trainee; and

c. wherein the main character and the trainee communicate in a natural language.

6. The simulator of claim 5 wherein the weapon tracking system comprises, a plurality of simulated weapons, each weapon simulating a different force level from the other weapons for training the responder to respond with varying levels of force.

7. The simulator of claim 1 wherein the image generator includes a dynamic scenario architecture module comprising, a behavioral model for affecting verbal and physical responses of the main character.

8. The simulator of claim 7 wherein the behavioral model includes a probabilistic model for predetermined outcomes.

9. A computer program product embodied on a tangible computer readable storage medium for training a responder equipped with a weapon to react in situations that have the potential for the use of force when encountering a main character, the computer program comprising: recognizing natural language for processing speech input from the responder, processing weapon aimpoint and trigger-pull data from the weapon, processing dynamic scenario architecture responsive to the natural language speech recognition and further responsive to the weapon tracking system, determining the main character behavioral states, associating a behavioral state with the main character, and wherein the main character reacts to speech input from the responder, to the weapon aimpoint and the trigger-pull data from the responder, and to the scene rendering.

10. A simulator system for training a responder equipped with a weapon means to react in situations having the potential for the use of force when encountering a main character, the computer program comprising:

a. means for natural language recognition and process of speech input from the responder;

b. means for weapon aimpoint and trigger-pull data processing from the weapon;

c. means for processing dynamic scenario architecture responsive to the means for natural language recognition and process, and further responsive to the means for processing weapon aimpoint and trigger-pull;

d. means for determining character behavioral states, associating a behavioral state with the main character, and

e. wherein the main character reacts to the speech input from the responder, to the means for weapon aimpoint and triggering data from the responder, and to the scene rendering.

11. The simulator system of claim 10 comprising a means for after action review.