SYSTEM AND METHOD FOR USER MONITORING AND INTENT DETERMINATION
Sensing interfaces associated with a home entertainment system are used to automate a system response to events which occur in a viewing area associated with the home entertainment system. Data derived from such sensing interfaces may also be used to enhance the response readiness of one or more system components. Still further, user presence data derived from such sensing interfaces may be used to capture and report user viewing habits and/or preferences.
This application claims the benefit of, and is a continuation of, U.S. application Ser. No. 13/925,966, filed on Jun. 25, 2013, which application claims the benefit of, and is a continuation-in-part of, U.S. application Ser. No. 13/758,307, filed on Feb. 4, 2013, the disclosures of which are incorporated herein by reference in their entirety.
BACKGROUND
Home entertainment systems comprised of plural appliances and/or the controlling devices used to issue commands to such appliances may be provisioned with devices for detecting user presence and/or user interaction via methods such as gesture, spoken voice, facial recognition, spatial analysis, etc., as known in the art. Furthermore, growing use of personal communication devices such as smart phones, tablet computers, etc., may provide additional means for identification of user presence via detection of such personal communication devices on a local wireless network such as a WiFi network, a Bluetooth network, etc. While multiple media sources and multiple media rendering devices may be coupled in many of these home entertainment systems through a central routing appliance such as an AV receiver, set top box, smart TV, etc., no systems or methods currently exist for using user presence and/or user interaction detection alone or in conjunction with a central routing appliance to provide enhanced home entertainment system functionalities.
SUMMARY OF THE INVENTION
This invention relates generally to home entertainment systems and control methods therefor and, in particular, to enhanced functionalities for such home entertainment systems which are enabled by the availability of additional user-related input methods for such systems. For example, in one aspect of the invention sensing interfaces such as an image sensing interface (e.g., an interface associated with a camera), a sound sensing interface (e.g., an interface associated with a microphone), and/or an interface for sensing the presence of an RF device such as a smart phone may be used to fully or partially automate a system response to common events which may occur during a TV viewing session, such as a user or users leaving or entering the viewing area, a user answering a telephone call, the detection of a doorbell ringing or a baby monitor alarm, etc. In another aspect of the invention, data derived from such sensing interfaces may be utilized to enhance the responsiveness of one or more system components, for example by sensing when a user is reaching for a physical remote control unit or preparing a component to issue a voice or gesture command. In a yet further aspect of the invention, user presence data derived from such sensing interfaces may be used by a central routing appliance in conjunction with media stream information in order to capture and report user viewing habits and/or preferences.
A better understanding of the objects, advantages, features, properties and relationships of the invention will be obtained from the following detailed description and accompanying drawings which set forth illustrative embodiments and which are indicative of the various ways in which the principles of the invention may be employed.
For a better understanding of the various aspects of the invention, reference may be had to preferred embodiments shown in the attached drawings in which:
With reference to
Turning now to
For brevity, the discussions which follow will generally be with reference to the exemplary equipment configuration of
As illustrated in
To cause the smart TV appliance 200 to perform an action, appropriate programming instructions may be stored within the memory 302 (hereafter the “TV programming”) for execution by TV engine and media processor(s) 300. An exemplary architecture for such TV programming is presented in
Under the control of such TV programming, smart TV appliance 200 may for example receive an incoming AV media stream from one of the input ports 306,308 to be processed, buffered, separated into audio and video components, and routed to outputs 318, 320 for rendering on TV display screen 322 and loudspeaker(s) 324; may receive commands from remote control interface 316 which are decoded and acted upon, for example to select an input media stream, adjust audio volume, etc.; may manage a connection to the Internet through Ethernet or WiFi interface 310 to enable browsing for content, download of software updates, video telephony utilizing inputs from camera 122 and microphone 120; etc. Additionally, in accordance with the teachings herein, the exemplary TV programming may receive and process input signals from controlling device 204, camera 122 and/or microphone 120 in order to detect user presence, identify individual users, and/or receive user command input, as will be described hereafter. As will be appreciated, while in the illustrative embodiment the source of audio input signals may comprise a microphone 120 and associated interface 314 provisioned as part of a smart TV appliance 200, in alternative embodiments audio input signals may be captured by any other appliance in the system and forwarded to appliance 200 for processing, or may originate from a microphone provisioned in a controlling device such as remote control or smartphone 204, the output of which microphone may, by way of example, be digitized and/or processed by controlling device 204 and wirelessly forwarded to smart TV appliance 200 via remote control interface 326, WiFi interface 310, Bluetooth interface 328, or any other means as appropriate for a particular implementation. In an exemplary embodiment the user event processing module 436 of the TV programming of TV appliance 200 (hereafter “event processing”) may act as illustrated in the flowchart of
If the reported event is not a remote control command, at step 504 the event processing may next determine if the reported event constitutes an image change event reported by visual event detection module 432 in response to analysis of image data received from camera 122 via camera driver 422. Such image processing may utilize for example the techniques described in U.S. Pat. Nos. 5,534,917, 6,829,384, 8,274,535, WIPO (PCT) patent application publication WO2010/057683A1, or the like, and may for example periodically monitor an image comprising the field of view of camera 122 in order to initiate image analysis in response to detection of any variation in image data which exceeds a certain threshold value. If it is determined that the event is a report of a detected image change, at step 518 it is next determined if the event comprises the departure or imminent departure of a user from the viewing environment of TV 200. If so, various actions may be taken by event processing 436 as appropriate. By way of example, if the departing user is the sole viewer (or, in some embodiments, the primary user, e.g. the user who initiated the current viewing session, the user that is provided within the system with a higher priority relative to remaining users, or the like) the event processing may be adapted to issue a “pause” command to the source of the current media stream. Other actions may include, without limitation, activating the recording function of a DVR, logging off a Web site, etc., as appropriate for a particular embodiment and configuration. If such an action is to be taken, at step 520 in the illustrative embodiment the event processing may cause display of a request for confirmation on TV screen 322, e.g. “Would you like to pause this show? (Y/N).” If confirmed by the user at step 522, which confirmation may take the form of a gesture, spoken command, remote control input, etc., or, in those embodiments where the default is to take action, a timeout, at step 528 the indicated action may be executed. In this context it will be appreciated that in embodiments where voice or gesture responses are expected, the performance accuracy of audio and/or visual event detection modules 432,434 may be improved by indicating a range of possible responses (e.g., “yes” or “no” in this instance) to these modules in advance, thereby limiting the number of sound or gesture templates which need to be matched. Also, in some embodiments where voice input may be utilized, the sound level of TV audio output 320 may be temporarily lowered to reduce background noise. Thereafter, at step 530 data regarding the change in user, including user identity where this is determinable, for example via use of techniques such as described in U.S. Pat. Nos. 7,551,756, 7,702,599, or the like, may be conveyed to statistic gathering module 438 for logging, after which processing is complete.
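The departure handling described above can be illustrated with a minimal Python sketch. It is not taken from the disclosure: the names `ViewingSession`, `action_on_departure`, and `confirm` are hypothetical, and the sketch assumes that user identities are already available as strings and that a prompt timeout is reported as `None`.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class ViewingSession:
    """Hypothetical snapshot of who is in the viewing area."""
    viewers: Set[str]            # IDs of users currently detected
    primary_user: Optional[str]  # e.g. the user who started the session


def action_on_departure(session: ViewingSession, departing: str) -> Optional[str]:
    """Return the command to propose when `departing` leaves, or None."""
    remaining = session.viewers - {departing}
    # Propose a pause if nobody is left or if the primary user is leaving.
    if not remaining or departing == session.primary_user:
        return "pause"
    return None


def confirm(response: Optional[str], default_on_timeout: bool) -> bool:
    """Resolve a yes/no prompt; `response` is None when the prompt timed out."""
    if response is None:
        return default_on_timeout
    # Only a small set of expected answers is matched, mirroring the idea of
    # limiting the templates the detection modules must consider.
    return response.strip().lower() in {"yes", "y"}
```

In this sketch the timeout default is a parameter, which corresponds to embodiments in which the default is to take the action when no response is received.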
If the detected image change event is not a user departure, at step 524 the event processor may next determine if the reported event comprises the arrival of a new or additional user in the TV viewing environment and if so take appropriate action. By way of example, some embodiments may be adapted to allow a viewer to invoke a “private viewing” status which may cause the current content to be automatically muted, paused, switched, etc. in the event an additional user enters the viewing environment. In a still further embodiment, the entry of a user into a viewing environment may trigger an offer to resume playback of previously paused content; or in a multi-room, multi-device household in which appliances are networked together and equipped with viewer recognition, the entry of a user into one viewing environment may cause the event processor in that environment to query other event processors and/or statistic modules elsewhere in the household to determine if that user has recently departed another environment, and if so, offer to resume playback of a content stream which was previously paused in that other environment.
It is contemplated that the action(s) to be taken upon a user entering the viewing environment may be set to be specific to the identity of the arriving person, e.g., to be performed only when a specifically recognized individual or recognized type/category of individual, such as a child, enters the viewing environment. Similarly, the action(s) to be taken upon the arrival of a user to the viewing environment may be set to be specific to the identity and/or type of the currently viewing user or users. In such a case, an action to be executed may be one to inhibit the performance of any new actions by the currently viewing user or users, e.g., to inhibit a child from changing a channel that is currently being viewed to thereby allow the parent an opportunity to see what the child is currently watching. In the latter example, the inhibiting of any action(s) may be lifted after a given period of time expires, upon the new user again exiting the viewing area, upon the new user overriding the action (for example via a gesture, voice, further action, or the like—to the extent that user is authorized to perform such action), etc. In still further cases, the actions to be taken when a user enters the viewing area may be prioritized based upon the identities of the newly arriving user and the currently viewing user or users. Yet further, the actions to be taken by the system upon a detected “user entry event” can be used to provide one or more appliances within the system with desired states, e.g., to establish volume level settings, closed-captioning settings, commercial skipping settings, SAP setting, lighting level settings, and/or the like type of user preference settings without limitation. 
Thus, from the foregoing examples it will be appreciated that various combinations of actions may be specified for the system to take upon the detection of a “user entry event,” which actions may or may not consider user identities and/or user identity priorities, and, as such, the examples provided herein are not intended to be limiting in any form.
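One possible way to represent identity-specific and category-specific “user entry event” actions is a small prioritized rule table, sketched below in Python. The rule fields, the integer priority scheme, and the action names are assumptions made for illustration only; the disclosure does not prescribe a particular representation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EntryRule:
    """Hypothetical rule: an action tied to a user, a category, or neither."""
    action: str
    user_id: Optional[str] = None   # None matches any user
    category: Optional[str] = None  # e.g. "child"; None matches any category
    priority: int = 0               # higher value wins

    def matches(self, user_id: Optional[str], category: Optional[str]) -> bool:
        if self.user_id is not None and self.user_id != user_id:
            return False
        if self.category is not None and self.category != category:
            return False
        return True


def select_entry_action(rules: List[EntryRule],
                        user_id: Optional[str],
                        category: Optional[str]) -> Optional[str]:
    """Return the action of the highest-priority rule matching the new arrival."""
    candidates = [rule for rule in rules if rule.matches(user_id, category)]
    if not candidates:
        return None
    return max(candidates, key=lambda rule: rule.priority).action
```

A rule with neither a user nor a category acts as a default, so an unrecognized arrival still receives a defined response.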
By way of still further example, the actions to be taken by the system upon the detected arrival of a new user to the viewing area may cause the system to combine the favorite channel listings (such as shown in an electronic program guide) that have been established for the multiple viewers into a single listing or to provide a single listing that will include only those programs and/or channels that are commonly found within the favorite channel listings that have been established for each of the multiple viewers. Similarly, access to videos, games, programs, channels, or the like (such as set via use of a parental control chip) can be limited to only those videos, games, programs, channels, or the like that are commonly accessible to each of the multiple viewers. Yet further, to the extent preferences that have been established for the recognized users do not conflict, the system may take appropriate action(s) to establish within the system one or more of these preferences upon the detection of a “user entry event.”
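The two listing strategies described above, a combined listing and a listing restricted to commonly found channels, correspond to a set union and a set intersection. The following Python sketch assumes favorites are stored as ordered lists of channel names keyed by a user ID; the function name `combined_favorites` and that storage layout are illustrative assumptions.

```python
from typing import Dict, List


def combined_favorites(favorites: Dict[str, List[str]],
                       present: List[str],
                       common_only: bool) -> List[str]:
    """Merge the favorites of the users in `present`.

    common_only=False yields every channel any present user has listed;
    common_only=True yields only channels every present user has listed.
    Order follows first appearance so the listing stays stable.
    """
    lists = [favorites.get(user, []) for user in present]
    if not lists:
        return []
    if common_only:
        shared = set(lists[0]).intersection(*lists[1:])
        return [ch for ch in dict.fromkeys(lists[0]) if ch in shared]
    merged: List[str] = []
    for channels in lists:
        for ch in channels:
            if ch not in merged:
                merged.append(ch)
    return merged
```

The same intersection approach could be applied to the access-limiting case, with each user's list holding the content that user is permitted to access.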
It is for the above-noted purposes that at step 526 it may be determined if any such “user entry event” action preferences have been set and, if so, at step 528 the appropriate action(s) may be executed, after which data regarding the user arrival, including user identity where this is determinable, may be conveyed to statistic gathering module 438 for logging, and event processing is complete.
It will also be appreciated that, while the foregoing describes various actions that may be taken upon a detection of a “user entry event,” it is to be understood that like actions may be taken when the system is initially started in the presence of multiple viewers.
In the event that it is detected that a user responsible for a “user entry event” exits or is in the process of exiting from the viewing area, i.e., a “user exit event” is detected, the above described actions taken in connection with the “user entry event” can be automatically reversed (or the remaining user(s) can be prompted if a reversing action is to take place). For example, video that was paused can be resumed, electronic program guides/favorite channel listings can be restored, etc. In a like manner, a “user exit event” indicative of one or more users exiting or preparing to exit the viewing area can be used by the system to take actions that would remove consideration of the exiting user's preference, to place the system into a state that is appropriate for the remaining users, etc. As before, the actions to be taken by the system upon detection of a “user exit event” can be set so as to be specific to the identity of the exiting user or users, the identity of remaining user or users, and/or may be prioritized based upon the identities of the exiting user or users and the remaining user or users without limitation. It is again to these purposes that at step 518 it may be determined if any such “user exit event” action preferences have been set and, if so, at step 520 the appropriate action(s) may be executed, after which data regarding the user exiting, including user identity where this is determinable, may be conveyed to statistic gathering module 438 for logging, and event processing is complete.
If the detected image change event is not a user arrival or departure, at steps 532, 534 and 536 it may next be determined if the reported event comprises a detectable gesture, a detectable gesture in this context comprising a user action or motion which either by pre-programming or a learning process has been identified to visual event detection module 432 as having significance as user input. If the reported event is determined to be associated with an operational command function, for example “pause”, “mute”, “channel up”, etc., processing may continue at step 540 to execute the command as described previously. If the reported event is determined to be a preparatory gesture, appropriate anticipatory action may be taken at step 538. In this context, a preparatory gesture may comprise without limitation any preliminary motion or gesture by a user which may be interpreted as a possible indication of that user's intent to perform an action, for example standing up, reaching for or setting down a remote control device, beckoning an out of sight person to enter the room, picking up a phone, etc. Anticipatory actions may include for example pre-conditioning visual and/or audio detection modules 432,434 to favor certain subsets of templates for matching purposes; modifying an on-screen menu from a format optimized for gesture control to one optimized for navigation via a keyboarded device such as a remote control, or vice-versa; reducing or muting audio volume; signaling a remote control device to exit a quiescent state, to turn on or shut off its backlight, or the like; etc.
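The distinction between command gestures and preparatory gestures can be sketched as two lookup tables consulted in order. The gesture labels and anticipatory action names below are invented for illustration; in the disclosure such associations are established by pre-programming or a learning process, and the recognition itself is performed by the visual event detection module.

```python
from typing import Optional, Tuple

# Hypothetical gesture labels mapped to operational commands.
COMMAND_GESTURES = {
    "palm_out": "pause",
    "finger_to_lips": "mute",
    "swipe_up": "channel up",
}

# Hypothetical preparatory gestures mapped to anticipatory actions.
PREPARATORY_GESTURES = {
    "standing_up": "reduce_volume",
    "reaching_for_remote": "wake_remote",
    "picking_up_phone": "mute_audio",
}


def classify_gesture(gesture: str) -> Tuple[str, Optional[str]]:
    """Return (kind, action): a command to execute, an anticipatory
    action to take, or ("ignored", None) for an unrecognized gesture."""
    if gesture in COMMAND_GESTURES:
        return ("command", COMMAND_GESTURES[gesture])
    if gesture in PREPARATORY_GESTURES:
        return ("anticipatory", PREPARATORY_GESTURES[gesture])
    return ("ignored", None)
```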
If the reported event is not an image change event, at step 506 the event processing may next determine if the reported event constitutes a sound recognition event reported by audio event detection module 434. Speech or sound recognition by module 434 may utilize for example the techniques described in U.S. Pat. No. 7,603,276, WIPO (PCT) patent application publication WO2002/054382A1, or the like. If the event is a sound recognition event, at step 512 it may be determined if the decoded sound constitutes a voice command issued by a user. If so, processing may continue at step 540 to execute the desired command as described previously. If not a voice command, at step 514 it may be determined if the reported event constitutes a trigger sound. In this context a trigger sound may be an instance of a non-vocal audio signal received via microphone 120 which either by pre-programming or via a learning process has been assigned a command or an anticipatory action. By way of example without limitation, trigger sounds may include a phone or doorbell ringing, a baby monitor, smoke alarm, microwave chime, etc. If the reported sound event is a trigger sound, the appropriate action, such as muting the television, etc., may be taken at step 516. The system can also be programmed to recognize various sounds and/or spoken words/phrases as being indicative of a preparatory event whereupon one or more of the system components will be readied via an anticipatory action as described above.
By way of example, a spoken phrase such as “let's see what else is on” may be recognized as a preparatory event whereupon an anticipatory action may be executed to place a remote control device into a state wherein the remote control device is readied to receive input in anticipation of its use; or a spoken phrase such as “come here,” the sound of a doorbell, or the like may be recognized as a preparatory event whereupon an anticipatory action may be executed to ready the system to look for an anticipated event, e.g., a specific gesture, such as a user standing up, leaving the viewing area, etc., whereupon the appropriate response action to the sensed event that was anticipated, e.g., pausing the media, may be performed.
In the event that an anticipated event is not performed within a predetermined period of time, the system may execute, as needed, further actions, such as restorative action, to place the system into a state as desired, e.g., to return one or more components of the home entertainment system to a state where the component(s) is no longer looking for the occurrence of the anticipated event.
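The anticipate-then-respond-or-restore behavior can be modeled as a watch with a deadline. The Python sketch below is an assumption-laden illustration: the class name `AnticipationWatch`, the string event labels, and the use of caller-supplied timestamps in seconds are all hypothetical choices made so that the logic is easy to follow and test.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnticipationWatch:
    """Hypothetical watch armed by a preparatory event."""
    expected_event: str   # e.g. "user_standing_up"
    response_action: str  # e.g. "pause"
    deadline: float       # seconds, on the same clock as `now` below

    def handle(self, event: Optional[str], now: float) -> str:
        """Return the response action, "restore" once the deadline has
        passed, or "wait" while the anticipated event has not yet occurred."""
        if now > self.deadline:
            return "restore"
        if event == self.expected_event:
            return self.response_action
        return "wait"
```

For example, a watch armed on a doorbell sound with a 30 second deadline would yield "pause" if the user stands up at 10 seconds and "restore" if nothing relevant has happened by 31 seconds.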
If the reported event is not a sound recognition event, at step 508 the event processing may next determine if the reported event comprises a wireless device such as for example a smart phone, tablet computer, game controller, etc., joining into or dropping from a LAN or PAN associated with smart TV 200 or other appliance in the system. If such devices have been previously registered with the TV programming, such activity may be used to infer the presence or absence of particular users. Such information may then be processed in a similar manner to user image detection (e.g., processed as a user being added or departing) continuing at step 518 as previously described.
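Inferring presence from registered wireless devices amounts to a lookup from a device identifier to a user. The sketch below assumes the registry is a dictionary keyed by lower-case MAC address strings; the function name `presence_event` and the returned event labels are illustrative and are not defined by the disclosure.

```python
from typing import Dict, Optional, Tuple


def presence_event(registry: Dict[str, str],
                   mac: str,
                   joined: bool) -> Optional[Tuple[str, str]]:
    """Translate a device joining or dropping from the network into a
    user arrival or departure, or None if the device is not registered.

    `registry` maps lower-case MAC addresses to user IDs (an assumption).
    """
    user = registry.get(mac.lower())
    if user is None:
        return None
    return ("arrival" if joined else "departure", user)
```

The resulting tuple could then be fed into the same arrival and departure handling used for image-based detection.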
Finally, at step 510 the event processor may process any other event report activity as consistent with a particular embodiment. For example, in some embodiments users may be provisioned with personal remote control devices which embed user identification data in their command transmissions, such as described in co-pending U.S. patent application Ser. No. 13/225,635 “Controlling Devices Used to Provide an Adaptive User Interface,” of common ownership and incorporated by reference herein in its entirety, in which case user presence reporting events may be generated by remote control interface 420. In other embodiments technologies such as infrared body heat sensing such as proposed in the art for use in automatic unattended power-off applications may be further adapted for the purposes described herein. Additional sources of activity events may also include data received from other household equipment such as security, lighting, or HVAC control systems equipped with occupancy sensors; entryway cameras; driveway sensors, etc., where appropriate.
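The overall order in which event types are tested can be summarized in a small dispatcher. This Python sketch is a simplification under stated assumptions: events are plain dictionaries with a hypothetical `kind` field, and the returned strings are placeholder labels for the processing branches rather than names used in the disclosure.

```python
from typing import Any, Dict


def route_event(event: Dict[str, Any]) -> str:
    """Return a label for the branch that would handle `event`,
    testing event types in the order described in the text."""
    kind = event.get("kind")
    if kind == "remote_command":
        return "execute_command"
    if kind == "image_change":                        # step 504
        return "analyze_image_change"
    if kind == "sound_recognition":                   # step 506
        return "analyze_sound"
    if kind in ("device_joined", "device_dropped"):   # step 508
        return "infer_user_presence"
    return "other_event_processing"                   # step 510
```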
Statistic gathering module 438 may be adapted to report the data conveyed to it during the event processing steps described above to a centralized service, e.g., hosted on Internet connected server device 128, for aggregation and analysis of user viewing habits and preferences. Depending on the particular embodiment such reporting may be performed on an event-by-event basis, or alternatively the data may be accumulated and reported at predetermined time intervals or only upon receipt of a request from the server device.
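The three reporting options mentioned above (event-by-event, at predetermined intervals, or on server request) can be sketched as a small buffer with a mode switch. The class name `StatisticsBuffer`, the mode strings, and the convention of returning the batch to upload are assumptions for illustration; actual transmission to the server is deliberately left out.

```python
from typing import Any, List


class StatisticsBuffer:
    """Hypothetical buffer deciding when logged records should be uploaded."""

    MODES = {"per_event", "interval", "on_request"}

    def __init__(self, mode: str, interval_s: float = 3600.0, start: float = 0.0):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.interval_s = interval_s
        self._last_flush = start
        self._pending: List[Any] = []

    def _flush(self, now: float) -> List[Any]:
        batch, self._pending = self._pending, []
        self._last_flush = now
        return batch

    def log(self, record: Any, now: float) -> List[Any]:
        """Store `record`; return the records that should be uploaded now."""
        self._pending.append(record)
        if self.mode == "per_event":
            return self._flush(now)
        if self.mode == "interval" and now - self._last_flush >= self.interval_s:
            return self._flush(now)
        return []

    def server_request(self, now: float) -> List[Any]:
        """Return everything accumulated so far in response to a server request."""
        return self._flush(now)
```

Note that in this sketch the interval is only checked when a record is logged; a timer-driven flush would be an alternative design.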
By way of example, in an illustrative embodiment data reported to statistic gathering module 438 may be formatted into several different event record classes and types for uploading to server device 128. Exemplary record classes may comprise user events, e.g., as may be reported at step 530 of
With reference to Table 1 below, user event record types may include addition (i.e., arrival) of user to the viewing area and deletion (i.e., departure) of a user from the viewing area. As illustrated, each of these record types may include timestamp data and a user ID. The timestamp illustrated is suitable for use in applications where the server device is already aware of the geographical location of the reporting system, e.g., as a result of an initial setup procedure, by URL decoding, etc. In embodiments where this is not the case, the timestamp field may include additional data such as a time zone, zip code, etc., where required. User identity may be any item of data which serves to uniquely identify individual users to facilitate viewing habit and preference analysis at server 128. By way of example, user ID data may comprise identities explicitly assigned during a setup/configuration process; random numbers assigned by the system as each distinct user is initially detected; a hash value generated by a facial or voice recognition algorithm; a MAC address or serial number assigned to a smart phone or tablet computer; etc. as appropriate for a particular embodiment.
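As a rough illustration of a user event record with the fields described above (record type, timestamp, user ID, and optional location-related data), consider the following Python sketch. The field names, the "add"/"delete" type strings, and the use of an ISO 8601 timestamp string are assumptions; the exact layout of Table 1 is not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, Optional

USER_ADDED = "add"       # arrival of a user in the viewing area
USER_DELETED = "delete"  # departure of a user from the viewing area


@dataclass(frozen=True)
class UserEventRecord:
    """Hypothetical user event record for upload to the server."""
    record_type: str
    timestamp: str                   # e.g. "2013-11-19T20:15:00" (assumed format)
    user_id: str                     # any value that uniquely identifies a user
    time_zone: Optional[str] = None  # only needed if the server lacks location data

    def __post_init__(self) -> None:
        if self.record_type not in (USER_ADDED, USER_DELETED):
            raise ValueError(f"unknown record type: {self.record_type}")

    def to_dict(self) -> Dict[str, str]:
        """Serialize to a plain dictionary, omitting unset optional fields."""
        data = {
            "type": self.record_type,
            "timestamp": self.timestamp,
            "user_id": self.user_id,
        }
        if self.time_zone is not None:
            data["time_zone"] = self.time_zone
        return data
```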
Referring now to Table 2, appliance event records may comprise record types indicative of events reported to, functional commands issued to, and/or operations performed by various controlled appliances, e.g., as reported at step 548 of
Referring now to Table 3, content event record types may include without limitation a channel or track change information record type; a title information record type containing, for example, a show title retrieved from program guide data, DVD or video-on-demand title information, etc.; a metadata record type containing metadata values obtained from a DVD or CD, streaming video service, or the like; a content sample record type containing a sample clip of audio and/or video content for comparison against a content identification database; or in alternate embodiments any other data which may be utilized in determining the identity of a particular content stream. Each record type may comprise timestamp and source appliance fields as described above, together with a field containing identity data, which may comprise numeric, text, or binary data as necessary. As will be appreciated, additional record and/or field types may be utilized in other embodiments, as necessary to enable reliable identification of media content streams.
It will be appreciated that while the exemplary data structures of Tables 1 through 3 are presented herein using a tabular format for ease of reference, in practice these may be implemented in various forms using any convenient data representation, for example a structured database, XML file, cloud-based service, etc., as appropriate for a particular embodiment. Furthermore, it will also be appreciated that while the statistics gathering and recording functionality of the illustrative embodiment is implemented as part of the programming of an exemplary appliance 200, i.e., in software module 438, in other embodiments this functionality may be provisioned at a different location, for example in one of the other appliances forming part of an entertainment system, in a local PC, at a remote server or cable system headend, etc., or at any other convenient location to which the particular appliance programming may be capable of reporting user events.
While various concepts have been described in detail, it will be appreciated by those skilled in the art that various modifications and alternatives to those concepts could be developed in light of the overall teachings of the disclosure. For example, in alternate embodiments the steps of the methods described above may be advantageously performed in various other appliances as appropriate, e.g. an AV receiver or a cable/satellite STB. Further, in an interconnected system such as illustrated in
Further, while described in the context of functional modules and illustrated using block diagram format, it is to be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or a software module, or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an enabling understanding of the invention. Rather, the actual implementation of such modules would be well within the routine skill of an engineer, given the disclosure herein of the attributes, functionality, and inter-relationship of the various functional modules in the system. Therefore, a person skilled in the art, applying ordinary skill, will be able to practice the invention set forth in the claims without undue experimentation. It will be additionally appreciated that the particular concepts disclosed are meant to be illustrative only and not limiting as to the scope of the invention which is to be given the full breadth of the appended claims and any equivalents thereof.
All patents cited within this document are hereby incorporated by reference in their entirety.
Claims
1. A method for controlling a display device, comprising:
- receiving event data via an image sensing interface;
- determining from the received event data at least a number of and identity of viewers in a viewing area associated with the display device; and
- when it is determined from the received event data that at least the number of and identity of viewers in the viewing area associated with the display device has changed causing a command action to be executed whereupon a display state of the display device is caused to be correspondingly changed.
2. The method as recited in claim 1, wherein the display device comprises a television.
3. The method as recited in claim 1, wherein a cloud-based service performs the step of causing a command action to be executed whereupon a display state of the display device is caused to be correspondingly changed.
4. The method as recited in claim 1, wherein the state of the display device comprises a state in which media is made available via use of at least a profile associated with at least one identified viewer in the viewing area.
5. The method as recited in claim 4, wherein the profile comprises a favorites list.
6. The method as recited in claim 5, wherein a cloud-based service performs the step of causing a command action to be executed whereupon a display state of the display device is caused to be correspondingly changed.
7. The method as recited in claim 6, wherein the display device comprises a television.
8. The method as recited in claim 1, comprising changing the state of the display device such that the display device presents media that is commonly found in a favorites list of a plurality of identified viewers in the viewing area.
9. The method as recited in claim 8, wherein a cloud-based service performs the step of causing a command action to be executed whereupon a display state of the display device is caused to be correspondingly changed.
10. The method as recited in claim 9, wherein the display device comprises a television.
11. The method as recited in claim 1, wherein the state of the display device comprises a state associated with a preference linked to an identity of a new viewer entering the viewing area.
12. The method as recited in claim 11, wherein the preference comprises a closed-captioning preference.
13. The method as recited in claim 11, wherein the preference comprises a SAP preference.
14. The method as recited in claim 1, wherein the state of the display device comprises a state in which the display device is caused to render media associated with an identity of a new viewer entering the viewing area.
Type: Application
Filed: Nov 19, 2013
Publication Date: Aug 7, 2014
Applicant: UNIVERSAL ELECTRONICS INC. (Santa Ana, CA)
Inventors: Arsham Hatambeiki (Irvine, CA), Paul D. Arling (Irvine, CA)
Application Number: 14/083,554
International Classification: H04N 21/442 (20060101);