Method for networked interactive control of displayed information
One or more users may command their respective window applications on an interactive display networked to client computers using laser pointers and/or voice commands. Users' voices are associated with a particular laser pointer pattern. A sequence of computer decisions checks each laser pointer command so as to correctly associate respective users with their commands and application windows. The invention performs speech recognition of the user's voice command. If the command is recognized, the invention performs the speech-recognized command as a window operation. The location of all unique light patterns is broadcast to all networked client computers.
The present application is a divisional application of and claims priority from related, co-pending, and commonly assigned U.S. patent application Ser. No. 10/100,339 filed on Mar. 18, 2002, entitled “Apparatus and Method for a Multiple-User Interface to Interactive Information Displays” also by Sakunthala Gnanamgari and Jacqueline Smith. Accordingly, U.S. patent application Ser. No. 10/100,339 is herein incorporated by reference.
STATEMENT OF GOVERNMENT INTEREST
The invention described herein may be manufactured and used by or for the Government for governmental purposes without the payment of any royalty thereon.
BACKGROUND OF THE INVENTION
This invention relates to the interactive control of large information displays and, specifically, to the remote interactive control of such information displays by multiple, simultaneous users.
The standard configuration of the desktop computer introduced in the 1970s consisted of a monitor for visual display, and a keyboard and mouse for inputs. Displays of computer desktops were traditionally controlled via the keyboard and mouse. The development of light pens and touch screens for direct interaction with the desktop monitor provided an alternate means of interaction with the desktop computer system. These tethered means of interaction for the human user constrained the number of people who could view the information to the single user and a small audience. The need to share the displayed information with a larger audience led to the use of large screen displays and video projection equipment with the desktop monitor.
The display of computer desktops onto a translucent screen via rear projection has become prevalent since the 1990s. The resulting magnified desktop display allows a larger audience to view information at meetings, lectures, and classroom settings. Manufacturers of video projector equipment have refined their high resolution projectors so as to offer resolutions of 1280×1024 pixels and to make them available at moderate cost.
Notwithstanding this development in technology, human user interaction is still constrained to desktop based, tethered control of the application windows on the large, wall-based display. The earlier use of light pens to interact with the desktop monitor may have influenced the idea of using a laser pointer as an input device for activating window menus and elements and as an electronic grease pencil. The introduction of laser pointers as an alternative input device to the mouse and keyboard has allowed human users to interact in an untethered mode.
To detect and track a moving beam of laser light on a wall-based display area, wide-angle lens cameras positioned behind the translucent screen are used to capture a rapidly moving circular laser beam. This basic imaging capability motivated the idea to use a laser pointer as an input device to replace the traditional desktop keyboard and mouse, in conjunction with the replacement of the desktop monitor by the large projected display wall.
The first display areas were limited to the resolution and physical area of the screen. To achieve an increased display area, one needs to combine multiple displays together to create a larger contiguous display area that can be treated as a single screen for interaction. X-Windows (a UNIX based windows protocol) based software such as X-MetaX has allowed for the seamless horizontal tiling of multiple screens to form a single continuous display of the computer desktop. The X-Windows capability improved upon the display of separately horizontally tiled windows that were not contiguous. This represents the current state-of-the-art of the Air Force Research Laboratory (AFRL) Interactive DataWall.
The AFRL Interactive Data Wall art consists of single, one-at-a-time usage of a laser pointer as an input device with a video projection display screen. In essence, it uses a laser pointer as a substitute for a keyboard or mouse. An approach to tracking a single laser pointer has been disclosed in a U.S. patent application Ser. No. 09/453,258 entitled “Display Pointer Tracking Device” by Sweed. This approach is hardware based and is limited to the tracking of an unmodified laser pointer output, typically a circular focussed spot as seen by the human eye when projected on a screen surface. Single laser pointer implementations foreclose the possibility of multiple persons interacting simultaneously with a large display, as it has only one laser spot that is tracked on the basis of laser beam intensity.
The use of the aforementioned Interactive DataWall is still limited to a single user, i.e. only one person at a time can manipulate the computer desktop as projected onto the screen. There is no way to allow more than one person to simultaneously access the display system using the Display Pointer Tracking Device in Sweed, which is based strictly on intensity detection. Large display systems are designed to project and display the computer desktop in a larger format than is possible on a standard computer monitor.
The AFRL Interactive Data Wall is limited to single user interaction with the display wall using that user's laser pointer and voice commands. The Interactive Data Wall uses a “Display Pointer Tracking Device” developed by Sweed (U.S. patent application Ser. No. 09/453,258), which is hardware based and tracks the laser pointer output on the basis of laser beam intensity.
There exists a patent for a teaching installation for learning and practicing the use of fire-fighting equipment (Deshoux/U.S. Pat. No. 6,129,552). This invention involves a large display screen that shows varying fire sequences controlled by a computer. The user interacts with the display by using four fire extinguishers fitted with laser pointers. The optical sensors identify the point on the display where the laser image is focused. The computer can determine which of the four lasers is being used; however, it is not specified that multiple users can operate the invention simultaneously.
There exists a patent for a method and display control system for accentuating (Nguyen/U.S. Pat. No. 5,682,181). In this invention, the user can draw on a display by using a hand-held light wand. This light is picked up by a CCD camera aimed at the display. The accentuation drawn by the user can be displayed in different colors. It appears that this invention is intended for use by a single user and not multiple simultaneous users. In the computer input system and method of using the same (Hauck/U.S. Pat. No. 5,515,079), the input light source is that of a hand-held lamp. Aside from that, it is very similar to Nguyen's patent.
A similar invention, an information presentation apparatus and information display apparatus (Arita/U.S. Pat. No. 5,835,078), allows multiple users to interact with a display using multiple laser pointers. The inventors claim that the pointers could be distinguished from each other by using laser pointers with varying wavelengths or even varying shapes. However, this particular patent does not incorporate the integration of voice commands with the users' laser pointers.
The unconstrained pointing interface for natural human interaction with a display-based computer system (Kahn/U.S. Pat. No. 5,793,361) may also facilitate multiple users (without voice commands). In this case, the laser pointer image detector is located within the laser pointer.
There also exists a method and apparatus for detecting the location of a light source (Barrus/U.S. Pat. No. 5,914,783). In this invention, the user can draw on a display by using a laser pointer. This light is not picked up by a CCD camera like the other patents. Instead, pixel mirrors are sequentially switched to reflect light from a corresponding on-screen pixel to a detector in an order which permits identifying the on-screen pixel illuminated by the spot of laser light. It appears that this invention is intended for use by a single user and not multiple simultaneous users. The multi-scan type display system with pointer function (Ogino/U.S. Pat. No. 5,517,210) is similar to Barrus/U.S. Pat. No. 5,914,783 in that it facilitates use of one laser pointer. The pointer position is handled mainly with circuitry as opposed to image processing software.
In view of the above, it would therefore be desirable to have an apparatus which expands the single user capability of the AFRL Interactive Data Wall to at least four independent users with those users being distinguished by their selected laser patterns. It would be further desirable to enable multiple users to work collectively by their simultaneous access of an information display in collaborative and team applications where such simultaneous access is provided by each user's respective laser pointer patterns and voice commands.
Applications for such an apparatus would include education, corporate and professional training environments, and planning and decision making applications where multiple users interact with a large amount of data. Other markets would include financial trading, budget preparation and analysis for organizations, product planning and marketing decisions. Advanced versions of such an apparatus could provide a solution for large network management for telecommunications, electric power, and corporate networking areas. These applications involve the use of geographic, educational curriculum, and information presentation displays, supplemented by supporting information and images, with multiple users trying to interact with the display medium. Managing this myriad of information types and formats is unwieldy today, and leads to solutions which are, at best, compromises.
OBJECTS AND SUMMARY OF THE INVENTION
Therefore, one object of the present invention is to provide a method for interactive control of large information display systems.
Another object of the present invention is to provide a method for the untethered, remote and collaborative interaction with and control over large information display systems.
Yet another object of the present invention is to provide a method for interactive control of large display systems that utilizes a user's voice commands, laser pointer, or traditional keyboard and mouse command inputs.
Still another object of the present invention is to provide a method for simultaneous collaboration by multiple users employing means and methods for identifying specific users' voices, laser pointer inputs, keyboard entries and mouse manipulations so as to distinguish any such input commands among respective users.
Briefly stated, this invention relates to the untethered multiple user interaction of large information displays using laser pointers coordinated with voice commands. A projection system projects application windows onto a large information display. One or more users may command their respective window applications using laser pointers and/or voice commands. A registration program assigns a unique identification to each user that associates a particular user's voice and a particular laser pointer pattern chosen by that user, with that particular user. Cameras scan the information display and process the composite of the application windows and any laser pointer images thereon. A sequence of computer decisions checks each laser pointer command so as to correctly associate respective users with their commands and application windows. Users may speak voice commands. The system will then perform speech recognition of the user's voice command. If the command is recognized, the system performs the speech-recognized command as a window operation.
According to an embodiment of the invention, method for interactive control of displayed information, comprising configuring a network connection between at least one client computer and an information display controller where at least one client computer is assigned to one enrolled user; where an information display controller cooperates between at least one client computer and the information display; displaying a separate application window on the information display representative of client computer application window on client computer's desktop; logging the enrolled user into his assigned client computer where the step of logging further comprises verifying the enrolled user's voice and where each enrolled user chooses a unique light pattern to project onto the information display; associating the enrolled user's identification, voice and said unique light pattern; projecting unique light pattern onto the information display; scanning the composite image of the information display and each of the unique light patterns; digitizing the composite image; identifying and locating each of the unique light patterns; broadcasting the shape and location of each of the unique light patterns to each of the client computers; determining, within the client computer, whether the unique light pattern is associated with that user assigned to that client computer where, if the unique light pattern is associated with that user assigned to that client computer, then determining the user's desired mode of interaction between the unique light pattern and the displayed application window and performing the user's desired operation on the displayed application window; otherwise, if the unique light pattern is not associated with that user assigned to that client computer then ignore the user's desired operation on the displayed application window.
According to a feature of the invention, method for verifying a user comprises vocally entering a user's identification; determining whether user is enrolled and if user is enrolled, then determining whether the user's vocal utterance matches his previously obtained voiceprint; if the user's vocal utterance matches his previously obtained voiceprint, then the user is notified that he is verified, otherwise an attempt is made to confirm user's identification; if user's identification is confirmed, then the user's last name is verified; if user's last name is verified, then the user's company code is verified; if user's company code is verified, then the user is notified that he is verified. However, if it is determined that the user is not enrolled, then an attempt to confirm the user's identification is made. If the user's identification is not confirmed, the step of verifying the user is repeated until user becomes enrolled. But if the user is enrolled but his vocal utterance does not match his previously obtained voiceprint, then an attempt to confirm the user's identification is again made. If the user's identification cannot be confirmed, then determining if the user is enrolled is again repeated.
The above and other objects, features and advantages of the present invention will be apparent from the following description read in conjunction with the accompanying figures, in which like reference numerals designate the same elements.
BRIEF DESCRIPTION OF THE DRAWINGS
Referring to
In this particular embodiment, four projectors 104 project images onto the information display 100. While projectors 104 may be on either side of the display wall 100, in the preferred embodiment, projectors 104 are on the side of the information display 100 opposite the users. Projectors 104 project a collective image that is being generated by display controller 116. Display controller 116 provides network access between users and their respective client machines 114 and displays the desktop environments of all client machines onto information display 100.
Cameras 102 receive the image that is displayed on information display 100 by a combination of computer-projected and user-generated, laser pointer projected images 112. While cameras 102 may be on either side of information display 100, in the preferred embodiment, the cameras 102 are on the side of information display 100 opposite the users. Frame grabbers 118 digitize the image that is received by cameras 102. Display controller 116 performs image processing and analysis of sequential images retained and transmitted by frame grabbers 118.
Based on the detection of laser pointer 108 projected image 112, by cameras 102, the position of laser pointer 108 projected image 112 is obtained. Its coordinates relative to information display 100 are converted into mouse coordinates and used to simulate mouse movements. Display controller 116 performs image detection of unknown projected image 112 pattern and uses image processing software (such as commercial off-the-shelf software named HALCON) for pattern identification so as to match projected image 112 pattern to a known template. Image processing software outputs the corresponding template number if the cameras 102 detect any features that match any of the known projected image 112 patterns along with the spatial coordinate (i.e., x,y) locations of the projected image 112 pattern detected. The location and shape are then sent to all client computers 114 for interpretation and execution of the user's application window 110 activities.
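By way of illustration only, the conversion of a detected camera position into mouse coordinates, and the packaging of the template number and location for the client computers, might be sketched in Python as follows. The sketch assumes that the camera frame covers the information display exactly, with no lens distortion or offset; the names `PatternReport`, `camera_to_display`, and `make_report`, and the resolutions used, are hypothetical and are not taken from the described system or from HALCON.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatternReport:
    """Hypothetical message: matched template number plus display location."""
    template_number: int
    x: int
    y: int


def camera_to_display(cam_x, cam_y, cam_size, display_size):
    """Scale a camera pixel position to display (mouse) coordinates.

    Assumes the camera frame covers the display exactly; a real
    installation would need calibration for distortion and offset.
    """
    cam_w, cam_h = cam_size
    disp_w, disp_h = display_size
    x = int(cam_x * disp_w / cam_w)
    y = int(cam_y * disp_h / cam_h)
    # Clamp so the result is always a valid pixel index on the display.
    return min(max(x, 0), disp_w - 1), min(max(y, 0), disp_h - 1)


def make_report(template_number, cam_x, cam_y, cam_size, display_size):
    """Build the report that would be sent to every client computer."""
    x, y = camera_to_display(cam_x, cam_y, cam_size, display_size)
    return PatternReport(template_number, x, y)
```

For example, under these assumptions a spot found at the center of a 640×480 camera frame maps to the center of a 1280×1024 display.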
The user's application window 110 receives the projected image 112 shape and location information from display controller 116. If the shape information matches the shape assigned to that user's current laser pointer 108 projected image 112, then the user-specified action is executed based on the laser pointer's 108 mode (pointing or drawing). Based on the detected location of laser pointer projected image 112, the display controller 116 modifies the image that is transmitted to the projector 104. If the projected image 112 is not the one chosen by the user, then the command is ignored.
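The client-side decision just described, acting only on the pattern assigned to the local user and ignoring all others, might be sketched as follows. This is a minimal illustration; the function name `handle_broadcast`, the mode strings, and the returned tuples are hypothetical choices made for the example.

```python
def handle_broadcast(assigned_template, mode, template_number, x, y):
    """Return the action a client would take for one broadcast, or None.

    assigned_template: template number of the pattern chosen by this
    client's user. mode: "pointing" or "drawing" (hypothetical labels).
    """
    if template_number != assigned_template:
        # Pattern belongs to another user: the command is ignored.
        return None
    if mode == "drawing":
        return ("draw", x, y)
    if mode == "pointing":
        return ("point", x, y)
    raise ValueError(f"unknown mode: {mode!r}")
```

Because every client receives every broadcast, this filtering is what keeps one user's pointer from acting on another user's application window.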
Each user may also interact with his applications by using voice commands through his wireless headset/microphone 106. In some applications, laser pointer 108 projected images 112 are combined with voice commands to issue commands that require some action based on location information. One such example is: “Draw a circle here.” Another example is: “Draw line from here to there.”
These voice commands use words like “here” and “there” to describe locations. These locations are supplied by the display processor 116 when it detects user's laser pointer 108 projected image 112.
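One possible way to pair the words “here” and “there” in a recognized command with detected pointer locations is sketched below. It assumes that the locations arrive in the same order in which the words are spoken; the function name `bind_locations` and this ordering rule are assumptions made for illustration and are not stated in the description.

```python
import re

# Matches the location words as whole words, case-insensitively.
DEICTIC = re.compile(r"\b(here|there)\b", re.IGNORECASE)


def bind_locations(command, pointer_positions):
    """Pair each 'here'/'there' in the command with a pointer position.

    pointer_positions is a list of (x, y) tuples, assumed to be in the
    order in which the corresponding words were spoken.
    """
    words = DEICTIC.findall(command)
    if len(words) > len(pointer_positions):
        raise ValueError("not enough pointer positions for the command")
    return [(w.lower(), pos) for w, pos in zip(words, pointer_positions)]
```

With this sketch, “Draw line from here to there” combined with two detected positions yields one position for each endpoint of the line.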
Referring to
Four (4) users are enrolled 202. Each user must cooperatively enroll (if they have not previously enrolled) by speaking specific utterances to create a speech model of that user's vocal characteristics. Voice verification identifies an individual user by his biometric voice pattern. The relationship between a user, the username, password, and their specific voice pattern is known to the system upon completion of the enrollment process. When a user wants to register using voice verification (assuming that the user is already enrolled), the user will utilize the wireless microphone/headset or handheld microphone (see 114,
Four (4) new VNC Viewer connections to the four (4) remote client computers are opened 204, which contain all of the applications to be displayed on the information display (see 100,
Users log in to their respective client computers (see 114,
Several parameters must be associated 210 with any one user to facilitate that user's control of and access to each respective application window (see 110,
Users point/project 212 laser pointer (see 108,
HALCON image processing software matches 218 the projected image (see 112,
Client computers (see 114,
The two modes of interacting with an application window (see 110,
Functionality of Display Controller
Referring to
The laser pointer (see 108,
A threshold function is then performed. The threshold function is performed in accordance with known image processing techniques. Thus, the bitmap format image can be reduced to a gray scale having, for example, 256 different shades. The threshold function is given a minimum grayscale value. These values can be programmer-defined variables that depend upon programmer requirements. Thus, the light that results from the laser hitting the screen will fall between the minimum gray scale value and the maximum gray scale value of the bitmap image which is given to the threshold function. Concurrently, the remainder of the image will be below the minimum gray scale value given to the threshold function. Again, the threshold function operates as a filter and will convert the laser image (which again falls between the minimum and maximum gray scale values) to a completely white (for example) area while the remainder of the image can be black (for example). The “GetArea” function may return, for example, the number of pixels that occupy the area that was previously image processed to be white. A “GetCoordinates” function returns (x,y) coordinates that correspond to the image processed white area. As the white area extends over a number of coordinates, the exact (x,y) coordinates within that area to be returned can be determined based upon user preference. The center of gravity of the image processed white area, for example, may be used. A “GetShape” function is also used to distinguish the shape of the laser pointer image pattern. The “PatternMatching” function compares the acquired shape of the laser pointer (see 108,
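The threshold, area, and center-of-gravity steps might be sketched in plain Python as follows. This is an illustration under stated assumptions, not the HALCON implementation: the image is taken to be a list of rows of gray values from 0 to 255, and the functions `threshold`, `get_area`, and `get_coordinates` are hypothetical stand-ins that merely echo the “GetArea” and “GetCoordinates” functions named in the text.

```python
def threshold(image, min_gray, max_gray=255):
    """Return a binary mask: True where min_gray <= pixel <= max_gray."""
    return [[min_gray <= p <= max_gray for p in row] for row in image]


def get_area(mask):
    """Number of pixels that the threshold step turned white."""
    return sum(sum(row) for row in mask)


def get_coordinates(mask):
    """Center of gravity (x, y) of the white pixels, or None if none exist."""
    area = get_area(mask)
    if area == 0:
        return None
    sum_x = sum(x for row in mask for x, on in enumerate(row) if on)
    sum_y = sum(y for y, row in enumerate(mask) for on in row if on)
    return sum_x / area, sum_y / area
```

The center of gravity is only one of the choices the text allows for the returned coordinates; any other point within the white area could be substituted.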
The “ExitMode” variable is next obtained 306. The “ExitMode” variable is evaluated 308 to determine whether program functions should be terminated. If the “ExitMode” variable is evaluated as “True”, then program function is terminated. Otherwise, if the “ExitMode” variable is “False”, then processing proceeds to an evaluation of the image area 310.
Image area is evaluated 310 to determine whether or not it is equal to zero. The image area was previously determined from the “GetArea” function. If the image area is evaluated to be equal to zero 310, then processing proceeds to measuring the amount of time 312 between two “clicks” (i.e. illuminations) of the laser pointer (see 108,
A determination is made 310 as to the image-processed area of the projected image (see 112,
The previous time measurement 312 is next evaluated 314 to determine whether or not it is less than a predetermined value, “Delta”. If the previous time measurement 312 is determined to be less than a predetermined value “Delta”, it is then determined that a “LaserClick” has taken place 318. If the previous time measurement 312 is determined to be greater than a predetermined value “Delta”, it is determined that a “LaserClick” has not taken place.
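The “Delta” comparison can be illustrated with a short sketch. The timestamps are assumed to be in seconds, and the case where the elapsed time exactly equals “Delta”, which the text does not address, is treated here as no click; the function name `is_laser_click` is hypothetical.

```python
def is_laser_click(previous_illumination, current_illumination, delta):
    """Decide whether two illuminations of the pointer form a LaserClick.

    Both timestamps are in seconds. An elapsed time exactly equal to
    delta is treated as no click, which is an assumption of this sketch.
    """
    elapsed = current_illumination - previous_illumination
    if elapsed < 0:
        raise ValueError("illumination times must be in chronological order")
    return elapsed < delta
```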
The laser pointer (see 108,
Functionality of Client Computers
Referring to
Client computer (see 114,
The “ExitMode” variable is evaluated 408 to determine whether program functions should be terminated. If the “ExitMode” variable is evaluated 408 as “True”, program function is terminated. Otherwise, if the “ExitMode” variable is evaluated 408 as “False”, processing proceeds.
The current value of the “GetClickMode” variable is next determined 410. The current value of the “DrawMode” is determined 412. A cursor is moved on the screen 414 (which corresponds to mouse movement) in accordance with the information obtained in the previous steps. Another check is performed 416 to detect the presence of a “LaserClick”.
The process of generating mouse events is described next. A “mouse event” causes the laser pointer (see 108,
If, for example, “ClickMode” does not correspond to values “1”, “2”, or “3”, then the “MoveMouse” function is performed 430. The “MoveMouse” function includes a “SetCursorPosition” function. The “SetCursorPosition” function relies upon the (x,y) coordinates of the laser pointer (see 108,
Once the “MoveMouse” function has been completed 430, the “ResetTime” function is performed 432. In the “ResetTime” function, the predetermined value “Delta” and a predetermined “Maximum” value are summed to generate a time value:
Time=Maximum+Delta
This sets “Time” to a value greater than delta to avoid the unwanted click events. Once the “ResetTime” function is performed 432, the process again waits for another broadcast 404. If “ClickMode” equals 3 426, then “SimulateMouseDownEvent” 428 will be performed.
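The effect of the “ResetTime” function can be checked with a brief sketch. It assumes that “Maximum” is a non-negative value and that a click is recognized only when the time value is less than “Delta”; the function names are hypothetical.

```python
def reset_time(maximum, delta):
    """Return the reset time value, Time = Maximum + Delta."""
    if maximum < 0:
        raise ValueError("maximum is assumed to be non-negative")
    return maximum + delta


def would_register_click(time_value, delta):
    """A click is recognized only when the time value is below delta."""
    return time_value < delta
```

Under these assumptions the reset value can never fall below “Delta”, so the comparison that follows a cursor move does not produce an unwanted click.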
Referring to
If it is determined that the coordinates did change 502, then a determination is made as to whether the program is in the drawing mode 506. If the program is in the drawing mode, the line width 508 and the color 512 will be set based on where the user “clicked” with his laser pointer (see 108,
If the program is not determined to be in the drawing mode 506, then a “SimulateLeftDownEvent” (which is explained above) is performed 510.
Processing then proceeds to performing a “MoveMouse” function 516. A “MoveMouse” function is performed 516 where the mouse cursor is moved to the laser pointer (see 108,
Functionality of Enrollment and Verification
Referring to
Referring now to
If the company code is verified 720, the user is told that his identity is verified 712. Processing returns to logging on the user 604. If the company code is rejected 720, the user is told that he could not be verified 722. Processing then terminates and the user is not logged on.
If it is determined that a user is not enrolled 704, an attempt to confirm the user's ID is made 706. The system recites the account name that it heard.
The user must respond to the system by saying that the account name is correct or that it is incorrect. If the user says that the account name is incorrect, the user enters his ID again 702. If the user says that the account name is correct, the user is told that he is not enrolled 708. Processing returns to asking the user if he wishes to enroll in the system 606. If the user says no, the program terminates. If the user says that he would like to enroll in the system, enrollment is initiated 608.
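The verification decisions set out in the summary and in this section might be condensed into a single function, sketched below. Each argument stands for the result of one check (enrollment lookup, voiceprint match, spoken confirmation of the ID, last name, company code); the `Outcome` names and the function `verify` are hypothetical, and how each check is actually performed is not modeled.

```python
from enum import Enum


class Outcome(Enum):
    VERIFIED = "verified"
    NOT_VERIFIED = "could not be verified"
    NOT_ENROLLED = "not enrolled"
    REENTER_ID = "re-enter ID"


def verify(enrolled, voice_matches, id_confirmed, last_name_ok, company_code_ok):
    """Return the outcome of one pass through the verification decisions."""
    if not enrolled:
        # The system recites the ID it heard and asks for confirmation.
        return Outcome.NOT_ENROLLED if id_confirmed else Outcome.REENTER_ID
    if voice_matches:
        return Outcome.VERIFIED
    if not id_confirmed:
        return Outcome.REENTER_ID
    # Fallback checks when the voiceprint does not match.
    if last_name_ok and company_code_ok:
        return Outcome.VERIFIED
    return Outcome.NOT_VERIFIED
```

An outcome of `REENTER_ID` corresponds to the user entering his ID again, and `NOT_ENROLLED` to the user being asked whether he wishes to enroll.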
Referring to
All of the following information is entered vocally by the user during enrollment. The process begins 800 when the user is prompted for his user ID (an alphanumerical string) 802. The enrollment process then prompts the user to enter his last name 804. The user is then asked for his company code 806. The enrollment software acquires a first test utterance 808, which records the user's voiceprint by requiring the user to count from one to seven (“1,2 . . . 7”). A second test utterance 810 is also acquired; this time, the letters A through G are spoken (“A,B . . . G”). These steps being completed, the enrollment software saves the user's new account and voiceprint 812.
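The order of the enrollment prompts might be sketched as follows. The sketch records each response as plain text in a dictionary; the creation of an actual voiceprint from the two test utterances is not modeled, and the names `PROMPTS`, `enroll`, and `ask` are hypothetical.

```python
# Prompts in the order described: ID, last name, company code, two utterances.
PROMPTS = (
    ("user_id", "Please say your user ID"),
    ("last_name", "Please say your last name"),
    ("company_code", "Please say your company code"),
    ("utterance_1", "Please count from one to seven"),
    ("utterance_2", "Please say the letters A through G"),
)


def enroll(ask, accounts):
    """Run the prompts in order and save the new account.

    ask(prompt) is a caller-supplied function that returns the user's
    response; accounts is a dict keyed by user ID.
    """
    record = {key: ask(prompt) for key, prompt in PROMPTS}
    accounts[record["user_id"]] = record
    return record
```

In use, `ask` would wrap whatever speech capture the system provides; for testing it can simply return canned strings.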
Alternate Explanation of the Exemplary Embodiment
Referring to
Each user must cooperatively enroll (see 608,
Any of the four (4) users may logon 904 to any of the four client computers (see 114,
Registration 902 next comprises user selection 908 of a specific laser pointer (see 108,
The present invention associates together the user identification from voice logon 904 and voice verification 906, and the respective projected image (see 112,
Laser pointer (see 108,
Laser Pointer Input
Two methods of interacting with an application window using the laser pointer (see 108,
The projected image (see 112,
The analog output of the cameras (see 102,
If it is determined that no match occurs 919, the logic flow returns to determining 910 whether an input has been received from a laser pointer (see 108,
The template number and location is broadcast 922 to all client computers (see 114,
Voice Command Input
If it is determined 910 that the user has spoken a voice command, speech contained in the input voice is recognized 926 in a recognition grammar created for that application window (see 110,
Having described preferred embodiments of the invention with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various changes and modifications may be effected therein by one skilled in the art without departing from the scope or spirit of the invention as defined in the appended claims.
Claims
1. Method for networked interactive control of displayed information, comprising the steps of:
- configuring a network connection between at least one client computer and an information display controller; said at least one client computer being assigned to one enrolled user; said information display controller cooperating between said at least one client computer and said information display;
- displaying a separate application window on said information display representative of client computer application window on said at least one client computer's desktop;
- logging said enrolled user into his said assigned client computer, said step of logging further comprising: verifying said enrolled user's voice, and each enrolled user's choosing a unique light pattern to project onto said information display;
- associating said enrolled user's identification, voice and said unique light pattern;
- projecting said unique light pattern onto said information display;
- scanning composite image of said information display and each of said unique light patterns;
- digitizing said composite image;
- identifying and locating each of said unique light patterns;
- broadcasting shape and location of each of said unique light patterns to each of said client computers;
- a first step of determining, within said client computer, whether said unique light pattern is associated with that user assigned to that client computer; IF said unique light pattern is associated with that user assigned to that client computer, THEN a second step of determining user's desired mode of interaction between said unique light pattern and said displayed application window; and performing user's desired operation on said displayed application window; OTHERWISE, IF said unique light pattern is not associated with that user assigned to that client computer, THEN ignoring user's desired operation on said displayed application window.
2. Method of claim 1 wherein said step of verifying further comprises:
- vocally entering user's identification; a first step of determining whether user is enrolled, IF user is enrolled, THEN a second step of determining whether user's vocal utterance matches his previously obtained voiceprint; IF user's vocal utterance matches his previously obtained voiceprint, THEN notify user he is verified; OTHERWISE, if user's vocal utterance does not match his previously obtained voiceprint THEN attempt to confirm user's identification; IF user's identification is confirmed, THEN verify user's last name; IF user's last name is verified, THEN verify user's company code; IF user's company code is verified, THEN notify user he is verified; OTHERWISE, if user's company code is not verified THEN notify user he could not be verified; OTHERWISE, IF user's last name cannot be verified, THEN notify user he could not be verified; OTHERWISE, IF user's identification cannot be confirmed, THEN repeat said step of verifying; OTHERWISE, IF user is not enrolled, THEN attempt to confirm user's identification; IF user's identification is confirmed, THEN notify user he is not enrolled; OTHERWISE, IF user's identification cannot be confirmed THEN repeat said step of verifying.
3. Method of claim 1 wherein said unique light patterns are selected from the group consisting of a cross pattern, an open circle pattern, an arrow pattern, and a solid circle pattern.
4. Method of claim 1, wherein said step of projecting said unique light pattern further comprises projection of light by laser pointer.
Type: Application
Filed: Mar 29, 2005
Publication Date: Aug 18, 2005
Inventors: Sakunthala Gnanamgari (Devon, PA), Jacqueline Smith (Lee Center, NY)
Application Number: 11/094,550