DEVICE CONTROL METHOD AND ELECTRIC DEVICE
A method is provided for controlling an operation of a target device using a plurality of input devices, including a speech input device. The method includes acquiring, from the speech input device, speech information, including i) environmental sound around the speech input device, and ii) a speech instruction indicating an operation to be performed on the target device. The method also includes calculating a level of noise included in the speech information, and recognizing the operation instruction. The method further includes informing a user of a second input device as a recommended input device based on the calculated noise level and the recognized operation instruction, wherein the second input device does not include a speech input device.
The present application is a continuation of U.S. application Ser. No. 14/743,704, filed Jun. 18, 2015, which claims the benefit of Japanese Patent Application No. 2015-050967, filed Mar. 13, 2015, and Japanese Patent Application No. 2014-135899, filed Jul. 1, 2014. The entire disclosure of each of the above-identified applications, including the specification, drawings, and claims, is incorporated herein by reference in its entirety.
TECHNICAL FIELD

The present disclosure relates to a method for controlling an electric device whose operation can be controlled using multiple input devices, and such an electric device.
DESCRIPTION OF THE RELATED ART

With the development of speech recognition technology, speech recognition accuracy has improved significantly in recent years. Accordingly, device management systems in which various types of devices are operated by speech have been considered. It is expected that users will be able to control the various types of devices by uttering desired operations toward the devices, without having to perform troublesome button operations.
SUMMARY

However, such systems still have many matters to be considered and have to be further improved for commercialization.
In one general aspect, the techniques disclosed here feature a method for controlling an operation of a target device using a plurality of input devices. The method comprises: receiving from one of the plurality of the input devices a first operation instruction issued to the target device, with a first data format; recognizing the first operation instruction and the first data format; determining that the one of the plurality of the input devices is a first input device corresponding to the first data format; and providing to a user of the target device a recommendation for a second input device, a type of the second input device being different from a type of the first input device, when it is determined that a type of the first operation instruction is identical to a type of a second operation instruction received from the second input device earlier than the reception of the first operation instruction.
These general and specific aspects may be implemented using a system, a method, and a computer program, and any combination of systems, methods, and computer programs.
According to the above aspect, the above system can be further improved.
It should be noted that general or specific embodiments may be implemented as a system, a method, an integrated circuit, a computer program, a storage medium, or any selective combination thereof.
Additional benefits and advantages of the disclosed embodiments will become apparent from the specification and drawings. The benefits and/or advantages may be individually obtained by the various embodiments and features of the specification and drawings, which need not all be provided in order to obtain one or more of such benefits and/or advantages.
The user can directly give an instruction to the target device by transmitting an operation command to the target device using speech input. Thus, an operation using speech input may be simpler than an operation performed on an input device such as a remote control to give an instruction to the target device. For example, the user can perform the desired operation more easily by speech input than by using a remote control to open the menu window and select the item corresponding to the desired operation, or to input a phrase to be searched for.
There are command reception devices capable of receiving both command input using a remote control and command input using a speech (e.g., Japanese Unexamined Patent Application Publication No. 2003-114698).
Japanese Unexamined Patent Application Publication No. 2003-114698 discloses a control device having a speech recognition function, a switch with which the user externally inputs various types of commands, data, or the like, a display device for displaying images, and a microphone for inputting a speech. This control device receives a command, data, or the like inputted by the user from the switch or microphone, processes the received command or the like, and outputs the result of the processing to the display device. If a command inputted using the switch can also be inputted using a speech through the microphone, the control device notifies the user that the command can be inputted using a speech.
However, each time a command which can be inputted using a speech is inputted using the switch, this technology notifies the user that the command can be inputted using a speech. Accordingly, even when the user thinks that the user can more easily perform an operation on the device by using the remote control and then does so, if the command for that operation can be inputted using a speech, the user receives a notification to that effect, thereby feeling annoyed. That is, this background technology cannot issue a notification to the user while distinguishing between an operation suitable for command input using a speech and an operation suitable for command input using the remote control.
As described above, the user can directly give an instruction to the target device by transmitting an operation command to the target device using speech input, and therefore an operation using speech input may be simpler than an operation performed on an input device such as a remote control. On the other hand, if the buttons of the remote control correspond one-to-one to operations such as power-on/off, changing the television channel, and controlling the television volume, the user may be able to perform the desired operation more easily by operating the buttons than by using speech input, as long as the user remembers the positions of the buttons. Accordingly, if the user can properly select between command input using a speech and command input using the remote control, he or she can more easily perform the desired operation on the target device.
In view of the foregoing, the inventors have conceived of the following modifications to improve the functions of a speech device operation system.
A method according to one embodiment of the present invention is a method for controlling an operation of a target device using a plurality of input devices. The method includes receiving, from one of the plurality of the input devices, a first operation instruction issued to the target device, with a first data format; recognizing the received first operation instruction and the first data format; determining that the one of the plurality of the input devices is a first input device corresponding to the recognized first data format; and providing to a user of the target device a recommendation for a second input device, a type of the second input device being different from a type of the first input device, when it is determined that a type of the first operation instruction is identical to a type of a second operation instruction that is received from the second input device earlier than the reception of the first operation instruction from the first input device. Since a second input device different from the first input device is presented to the user when the user performs an operation which is not suitable for the first input device, it is possible to notify the user that the user can more easily perform the desired operation on the target device. Further, the user can more easily perform the desired operation by using the second input device.
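Purely as an illustrative sketch of this aspect, and not as an implementation of the claimed method, the following Python fragment infers the first input device from the data format of an incoming instruction and recommends the device from which the same type of instruction was received earlier. The data formats, device names, and class name are hypothetical assumptions, not taken from the disclosure.

```python
# Hypothetical sketch of the recommendation flow described above.
# Data formats, device names, and operation types are illustrative only.

FORMAT_TO_DEVICE = {
    "speech/pcm": "speech_input_device",
    "ir/nec": "remote_controller",
    "http/json": "mobile_phone",
}

class RecommendationController:
    def __init__(self):
        # (operation_type, input_device) of the previously received instruction
        self.last_instruction = None

    def receive(self, data_format, operation_type):
        first_device = FORMAT_TO_DEVICE.get(data_format)        # determine the first input device
        recommendation = None
        if (self.last_instruction is not None
                and self.last_instruction[0] == operation_type   # same operation instruction type
                and self.last_instruction[1] != first_device):   # earlier instruction came from another device
            recommendation = self.last_instruction[1]            # recommend that second input device
        self.last_instruction = (operation_type, first_device)
        return first_device, recommendation

if __name__ == "__main__":
    ctrl = RecommendationController()
    print(ctrl.receive("ir/nec", "volume_up"))       # ('remote_controller', None)
    print(ctrl.receive("speech/pcm", "volume_up"))   # ('speech_input_device', 'remote_controller')
```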
In one embodiment, for example, the first input device may include a speech input device, a remote controller, or a mobile phone; the first data format may include a data format for communication using at least one of a speech input device, a remote controller, or a mobile phone; and each of the type of the first operation instruction and the type of the second operation instruction may include a control for turning power of the target device on or off, a volume control when the target device is a television set, and an airflow control when the target device is an air conditioner.
In one embodiment, for example, the method may further include determining whether the type of the first operation instruction is identical to the type of the second operation instruction, when a second time at which the second operation instruction is received from the second input device falls within a predetermined time period prior to a first time at which the first operation instruction is received from the first input device. Thus, if the user performs the same operation multiple times within a predetermined period of time, the second input device suitable for the operation can be presented to the user.
In one embodiment, for example, the method may further include determining whether the type of the first operation instruction is identical to the type of the second operation instruction, when the second operation instruction is received from the second input device immediately before a first time when the first operation instruction is received from the first input device. Thus, if the user performs the same operations consecutively, a second input device suitable for the operations can be presented to the user.
In one embodiment, for example, the method may further include storing recommended input device information about the type of the second input device suitable to the type of the first operation instruction; and determining that the type of the first operation instruction is suitable to the type of the second input device based on the recommended input device information. Thus, a second input device suitable for the operation performed by the user can be presented to the user.
In one embodiment, for example, the target device may be capable of receiving the first operation instruction from the second input device. Thus, the operation of the target device can be controlled using the multiple input devices.
In one embodiment, for example, the second input device may include at least one of a speech input device, a remote controller, or a mobile phone.
In one embodiment, for example, a sound stating that the second input device is the speech input device, the remote controller, or the mobile phone may be outputted toward the user. Thus, the user can recognize, through the sound, a second input device suitable for the operation.

In one embodiment, for example, the target device may be a television set, the second input device may include a remote controller, and the recommendation for the second input device may be provided to the user by displaying, on a display device of the television set, an image i) indicating an appearance of the remote controller and ii) highlighting an operation portion of the remote controller capable of performing the first operation instruction. Thus, the user can understand which part of the remote control serving as the second input device he or she should operate.
In one embodiment, for example, the method may further include storing time information indicating a time when each of operation instructions has been received; and calculating a receiving frequency of the same type of operation instruction as the first operation instruction in a predetermined period of time, on the basis of the time information, to present to the user the recommendation for the second input device on the basis of the calculated receiving frequency. Thus, if the user performs the same operation multiple times within the predetermined period of time, a second input device suitable for the operation can be presented to the user.
In one embodiment, for example, if the calculated receiving frequency is greater than or equal to a predetermined value, the recommendation for the second input device may be provided to the user. Thus, if the user performs the same operation a predetermined number of times or more within the predetermined period of time, a second input device suitable for the operation can be presented to the user.
In one embodiment, for example, the method may further include recognizing whether the first input device is the speech input device; recognizing whether the first operation instruction includes an operating range for the operation indicated by the first operation instruction; and, when the first input device is recognized to be the speech input device and the first operation instruction does not include the operating range, determining that the speech input device is not suitable for the type of the first operation instruction. By presenting a second input device different from the first input device to the user when there is no history information of the operation corresponding to the type of the first operation instruction, the user can avoid performing troublesome operations on the first input device.
An electric device according to one embodiment of the present disclosure is an electric device whose operation can be controlled using a plurality of input devices. The electric device includes a processor and a non-transitory memory storing a program which, when executed by the processor, causes the processor to: receive, from one of the plurality of the input devices, a first operation instruction issued to the electric device, with a first data format; recognize the received first operation instruction and the first data format; determine that the one of the plurality of the input devices is a first input device corresponding to the recognized first data format; and provide to a user of the electric device a recommendation for a second input device, a type of the second input device being different from a type of the first input device, when it is determined that a type of the first operation instruction is identical to a type of a second operation instruction that is received from the second input device earlier than the reception of the first operation instruction from the first input device.
Since a second input device different from the first input device is presented to the user when the user performs an operation which is not suitable for the first input device, the user can more easily perform the desired operation on the target device.
A computer program according to one embodiment of the present disclosure is stored on a non-transitory computer-readable recording medium and controls an electric device whose operation can be controlled using a plurality of input devices. The computer program causes a computer of the electric device to: receive, from one of the plurality of the input devices, a first operation instruction issued to the electric device, with a first data format; recognize the received first operation instruction and the first data format; determine that the one of the plurality of the input devices is a first input device corresponding to the recognized first data format; and provide to a user of the electric device a recommendation for a second input device, a type of the second input device being different from a type of the first input device, when it is determined that a type of the first operation instruction is identical to a type of a second operation instruction that is received from the second input device earlier than the reception of the first operation instruction from the first input device. Since a second input device different from the first input device is presented to the user when the user performs an operation which is not suitable for the first input device, the user can more easily perform the desired operation on the target device.
A method according to one embodiment of the present disclosure is a method for controlling an operation of a target device using a plurality of input devices including a speech input device. The method includes acquiring, from the speech input device, speech information including i) environmental sound around the speech input device and ii) a speech instruction indicating an operation instruction issued to the target device; calculating a level of noise included in the speech information; recognizing the operation instruction indicated by the speech instruction; recognizing a type of the operation instruction based on the recognition result of the operation instruction; and providing to a user of the target device a recommendation for a second input device on the basis of the calculated noise level and the recognized type of the operation instruction, wherein a type of the second input device does not include a speech type. Since a second input device different from the speech input device is presented to the user on the basis of the noise level, it is possible to avoid performing a different operation from the operation intended by the user.
In one embodiment, for example, the recommendation for the second input device may be determined based on i) the calculated noise level, ii) the recognized type of the operation instruction, and iii) recommended input device information, wherein the recommended input device information indicates an input device suitable to each operation instruction type. Thus, a second input device suitable for the environment in which noise is occurring can be presented to the user.
In one embodiment, for example, if the noise level is higher than or equal to a predetermined value, the recommendation for the second input device may be provided to the user. By presenting a second input device different from the speech input device to the user when the noise level is higher than the predetermined value, it is possible to avoid performing a different operation from the operation intended by the user.
A method according to one embodiment of the present disclosure is a method for controlling an operation of a target device using a plurality of input devices including a speech input device. The method includes acquiring, from the speech input device, speech information including an operation instruction issued to the target device; recognizing the operation instruction included in the speech information; recognizing a type of the operation instruction based on the recognition result of the operation instruction; calculating a likelihood of the recognized operation instruction; and providing to a user of the target device a recommendation for a second input device on the basis of the recognized likelihood and the recognized type of the operation instruction, wherein a type of the second input device does not include a speech type.
Since a second input device different from the speech input device is presented to the user on the basis of the likelihood, it is possible to avoid performing a different operation from the operation intended by the user.
In one embodiment, for example, the recommendation for the second input device may be determined based on i) the recognized likelihood, ii) the recognized type of the operation instruction, and iii) recommended input device information, wherein the recommended input device information indicates an input device suitable to each operation instruction type. Thus, a second input device suitable for a condition under which the likelihood of speech input is low can be presented to the user.
In one embodiment, for example, if the recognized likelihood is lower than a predetermined value, the recommendation for the second input device may be provided to the user. By presenting a second input device different from the speech input device to the user when the likelihood is lower than the predetermined value, it is possible to avoid performing a different operation from the operation intended by the user.
Now, an embodiment will be described with reference to the accompanying drawings. However, the embodiment described below is only illustrative. The numbers, shapes, elements, steps, the order of the steps, and the like described in the embodiment are also only illustrative and do not limit the technology of the present disclosure. Of the elements of the embodiment, elements which are not described in the independent claims representing the highest concept will be described as optional elements.
Overview of Services Provided

First, an overview of services provided by an information management system according to the present embodiment will be described.
The group 600 is, for example, a corporation, organization, or household and may have any size. The group 600 includes multiple devices 601 including first and second devices, and a home gateway 602. The devices 601 include devices which can be connected to the Internet (e.g., smartphone, personal computer (PC), and television) and devices which cannot be connected to the Internet by themselves (e.g., lighting system, washer, and refrigerator). The devices 601 may include devices which cannot be connected to the Internet by themselves but can be connected thereto through the home gateway 602. Users 6 use the devices 601 in the group 600.
The data center operating company 610 includes a cloud server 611. The cloud server 611 is a virtual server that cooperates with various devices through the Internet. The cloud server 611 mainly manages big data or the like, which is difficult to handle using a typical database management tool. The data center operating company 610 manages the data, manages the cloud server 611, and operates the data center that performs these tasks. Details of the operation performed by the data center operating company 610 will be described later.
The data center operating company 610 is not limited to a corporation which only manages the data or cloud server 611. For example, as shown in
The service provider 620 includes a server 621. The server 621 may have any size and may be, for example, a memory in a personal computer (PC). The service provider 620 need not necessarily include the server 621.
Further, the information management system need not necessarily include the home gateway 602. For example, if the cloud server 611 manages all the data, the information management system does not have to include the home gateway 602. There are also cases in which any device which cannot be connected to the Internet by itself does not exist, like cases in which all devices in the household are connected to the Internet.
Next, the flow of information in the information management system will be described.
First, the first and second devices in the group 600 transmit log information thereof to the cloud server 611 of the data center operating company 610. The cloud server 611 accumulates the log information of the first and second devices (an arrow 631 in
Subsequently, the cloud server 611 of the data center operating company 610 provides a predetermined amount of the accumulated log information to the service provider 620. The predetermined amount may be an amount obtained by compiling the information accumulated in the data center operating company 610 so that the information can be provided to the service provider 620, or may be an amount requested by the service provider 620. Further, the log information need not necessarily be provided in the predetermined amount, and the amount of the log information to be provided may be changed according to the situation. The log information is stored in the server 621 held by the service provider 620 as necessary (an arrow 632 in
The service provider 620 organizes the log information into information suitable for services to be provided to users and then provides the resulting information to the users. The users to which such information is provided may be the users 6, who use the devices 601, or may be external users 7. The method for providing the information to the users 6 or 7 may be, for example, to provide the information directly to the users 6 or 7 by the service provider 620 (arrows 633, 634 in
The users 6 may be the same as the external users 7 or may differ therefrom.
As shown in
The target devices 3 are electric devices whose operation can be controlled using multiple input devices including the speech input device 1. The target devices 3 include, for example, a television, a recorder, an air-conditioner, a lighting system, an audio system, a telephone, an intercommunication system, and the like.
In the present embodiment, one or both of the speech input device 1 and server 2 may be incorporated into each target device 3.
The speech input device 1 is, for example, a microphone incorporated in or connected to each target device 3, a microphone incorporated in a remote control included with each target device 3 or the like, a microphone incorporated in or connected to a mobile communication terminal, or a sound concentrating microphone placed in the house.
At least some of the elements of the speech input device 1 can be implemented by a microcomputer and a memory. For example, the speech detection unit 102, speech section cutout unit 103, and feature value calculation unit 104 can be implemented by a microcomputer and a memory. In this case, the microcomputer performs the above processes on the basis of a computer program read from the memory.
The device operation determination unit 204 determines an operation command from the result of the speech recognition on the basis of the device operation determination table 205. As shown in
At least some of the elements of the server 2 can be implemented by a microcomputer and a memory. For example, the speech recognition unit 202, speech recognition dictionary storage unit 203, device operation determination unit 204, and device operation determination table 205 can be implemented by a microcomputer and a memory. In this case, the microcomputer performs the above processes on the basis of a computer program read from the memory.
The communication unit 301 receives the operation command and input device ID transmitted by the server 2. The device control unit 302 recognizes an operation instruction indicated by the received operation command, as well as recognizes an input device which has received an operation related to the operation instruction from the user, on the basis of the received input device ID. The device control unit 302 then controls the operation of the target device 3 in accordance with the received operation command. The consecutive operation determination unit 303 determines whether the user has consecutively inputted operations, on the basis of the operation command storage table 304. As shown in
The recommended input device determination unit 305 determines a recommended input device on the basis of the recommended input device determination table 306. As shown in
The recommended input device position determination unit 307 determines the position of the recommended input device on the basis of information in a position information acquisition unit 401 and then determines whether to present the user with the position of the recommended input device. The operation method display unit 308 presents the user with the recommended input device and the position thereof on the basis of the determinations made by the recommended input device determination unit 305 and recommended input device position determination unit 307.
At least some of the elements of the target device 3 can be implemented by a microcomputer and a memory. For example, the device control unit 302, consecutive operation determination unit 303, operation command storage table 304, recommended input device determination unit 305, recommended input device determination table 306, and recommended input device position determination unit 307 can be implemented by a microcomputer and a memory. In this case, the microcomputer performs the above processes on the basis of a computer program read from the memory.
At least some of the elements of the input device 4 can be implemented by a microcomputer and a memory. For example, the position information acquisition unit 401 and input unit 402 can be implemented by a microcomputer and a memory. In this case, the microcomputer performs the above processes on the basis of a computer program read from the memory.
First, in step S001, the speech input device 1 calculates the feature value of a speech and transmits it to the server 2. In step S002, the server 2 performs a speech recognition process, that is, it converts the received speech feature value into character strings or word strings on the basis of the information in the speech recognition dictionary of the speech recognition dictionary storage unit 203. In step S003, the server 2 determines the type of a target device and a device operation intended by the user and transmits a corresponding operation command and input device ID to a corresponding target device 3. Details of the process in step S003 will be described later.
In step S004, the target device 3 determines whether it can actually perform the operation command transmitted in step S003. For example, if the target device 3, which is a television, receives an operation command for controlling the volume with the television powered off, it cannot perform the operation.
In step S005, the target device 3 determines whether the operation performed in step S004 is the same as the immediately preceding operation. Details of the process in step S005 will be described later. If the target device 3 determines in step S005 that operations have been performed consecutively, it performs a process in step S006.
In step S006, the target device 3 determines whether, with respect to the operation instruction inputted by the user, there are any other input devices 4 which are recommended over the speech input device 1. Details of the process in step S006 will be described later. If there are any input devices 4 which are recommended over the speech input device 1, the respective input devices 4 perform a process in step S007.
In step S007, the recommended input devices 4 determined in step S006 each determine the position thereof. Examples of the method for determining the position include the following: if each input device 4 includes a pressure sensor, it determines whether the user is holding the input device 4, based on whether there is a pressure; if each input device 4 includes a position sensor such as a GPS, RFID, or infrared ID, it determines the position using the position sensor; each input device 4 determines the position using the transmission/reception information of communication radio waves in a wireless LAN, Bluetooth, or the like; if the target device 3 includes a position acquisition unit equivalent to an input device 4, the relative positions of the input devices 4 and target device 3 are determined; and if the target device 3 includes a camera, the position of the user is estimated using camera information, and the relative positions of the input devices 4, user, and target device 3 are determined.
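As a rough, non-limiting sketch of one way the position determination approaches listed above might be combined, the following Python function estimates the distance between the user and an input device from whichever sensor information happens to be available. The field names and the radio propagation constants are illustrative assumptions, not part of this description.

```python
import math

# Hypothetical position estimate for one input device, using whichever
# of the sources listed above is available (field names are illustrative).
def estimate_distance_to_user(device_info, user_position):
    if device_info.get("pressure_detected"):           # pressure sensor: the device is being held
        return 0.0
    if "position" in device_info:                       # GPS / RFID / infrared ID position
        dx = device_info["position"][0] - user_position[0]
        dy = device_info["position"][1] - user_position[1]
        return math.hypot(dx, dy)
    if "rssi_dbm" in device_info:                        # rough range from radio signal strength
        # simple log-distance model: rssi = tx_power - 10 * n * log10(d)
        tx_power, n = -40.0, 2.0
        return 10 ** ((tx_power - device_info["rssi_dbm"]) / (10 * n))
    return float("inf")                                  # position unknown

print(estimate_distance_to_user({"rssi_dbm": -60.0}, (0.0, 0.0)))  # about 10 m under this model
```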
Note that appropriate positions may be selected as the relative positions of the user, input devices 4, and target device 3 in accordance with the device used or operation command. In step S008, a recommended input device which is most suitable for the operation command is shown to the user on the basis of the position information of the recommended input devices determined in step S007. Details of the process in step S008 will be described later.
First, in step S301, the server 2 receives the character strings or word strings, which are the result of the speech recognition. In step S302, the server 2 determines whether the result of the speech recognition includes an operation to be performed on a target device 3. Specifically, the server 2 determines whether the result of the speech recognition includes the type of a target device 3, such as a "television" or "air conditioner," or a phrase indicating an operation to be performed on a target device 3, such as "power on" or "increase the volume." If so determined, the process proceeds to step S303; if not so determined, the process proceeds to step S305.
In step S303, the server 2 calls an operation command corresponding to the result of the speech recognition on the basis of the device operation determination table in
In step S305, the speech operation system calls a function other than the target device operation function on the basis of the result of the speech recognition and then performs the called function. For example, if a speech “tell the weather” has been inputted by the user, the speech operation system calls a Q&A function rather than the device operation function. For example, in response to the speech “tell the weather” inputted by the user, the speech operation system makes a response “what area's weather do you want to know?” or a response “today's weather is sunny.”
In the device operation determination table, respective speech recognition results and corresponding operation commands are stored in the transverse (row) direction. Note that a device operation determination table may be generated for each target device 3. These operation commands represent the types of operations inputted by the user.
For example, if a speech “power on” is inputted to the speech input device 1, an operation command “C001” is called from the device operation determination table of
In the operation command table shown in
For example, when a speech “power on” is inputted to the speech input device 1, an operation command C001, as well as an input device ID I001 are called. Note that there may be operation commands which include fine adjustments such as “increase the volume by 1” and “decrease the volume by 1.”
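A minimal sketch of this table lookup might look as follows. Only the rows "power on" to C001 and "increase the volume" to C002, and the input device ID I001, are taken from this description; the C003 row and the substring matching rule are illustrative assumptions.

```python
# Illustrative device operation determination table, in the spirit of the
# C001/C002 operation commands and the I001 input device ID used above.
DEVICE_OPERATION_TABLE = {
    "power on": "C001",
    "increase the volume": "C002",
    "decrease the volume": "C003",   # assumed additional row
}
SPEECH_INPUT_DEVICE_ID = "I001"

def determine_operation(recognition_result):
    # Return (operation command, input device ID) when the recognized
    # word string matches a row of the table, otherwise None.
    for phrase, command in DEVICE_OPERATION_TABLE.items():
        if phrase in recognition_result:
            return command, SPEECH_INPUT_DEVICE_ID
    return None

print(determine_operation("power on"))              # ('C001', 'I001')
print(determine_operation("increase the volume"))   # ('C002', 'I001')
```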
In step S501, the consecutive operation determination unit 303 stores the operation command transmitted in step S304, in the operation number N002 of the operation command storage table shown in
If the respective operation commands stored in the operation numbers N001 and N002 are matched in step S503, the process proceeds to step S504. If the operation commands stored in the operation numbers N001 and N002 are not matched, the process proceeds to step S506.
In step S504, the consecutive operation determination unit 303 confirms that the operations are consecutive and then discards the operation command information stored in the operation numbers N001 and N002. The process proceeds to step S505. In step S505, the recommended input device determination unit 305 performs a recommended input device presentation determination process, ending the consecutive operation determination process.
If the consecutive operation determination unit 303 determines in step S502 or S503 that the operations are not consecutive, it performs step S506, in which it stores the operation command stored in the operation number N002 in the operation number N001.
Owing to the consecutive operation determination process, only when the user has performed operations consecutively using an input device other than a recommended input device set for each operation command, the recommended input device is presented to the user. Thus, for example, if the user who is viewing the television performs an operation “increase the volume by one stage,” to which an input device other than the speech input device is set as a recommended input device, using the speech input device only once, a message such as “use of the remote control is recommended in order to control the volume” does not appear on the screen. Accordingly, such a message is prevented from hindering the viewing of the television by the user. That is, such a message is prevented from appearing each time the user performs an operation using an input device other than the recommended input device, thereby reducing the annoyance of the user. Further, by presenting the recommended input device to the user, the user can learn selective use of an input device for each operation command.
For example, if the user inputs a speech “power on” as the first operation, an operation command C001 is stored in the operation number N002 in step S501. Then, in step S502, it is determined that no operation command is stored in the operation number N001. Accordingly, the process proceeds to step S506 to store the operation command C001 stored in the operation number N002, in the operation number N001.
Subsequently, if the user inputs a speech “increase the volume,” an operation command C002 is stored in the operation number N002 in step S501. In step S502, C001 is stored in the operation number N001. Accordingly, the process proceeds to step S503. In step S503, the operation numbers N001 and N002 are not matched, since C001 is stored in the operation number N001 and C002 is stored in the operation number N002. Accordingly, the process proceeds to step S506 to store C002 stored in the operation number N002, in the operation number N001.
Subsequently, if the user inputs a speech “increase the volume” consecutively, the same operations as those when inputting the second speech are performed until step S503. In step S503, the operation numbers N001 and N002 are matched, since C002 is stored in the operation number N001 and C002 is stored in the operation number N002. Accordingly, the process proceeds to step S504. In step S504, C002 stored in both the operation numbers N001 and N002 is discarded. The process proceeds to S505 to perform a recommended input device presentation determination process. Thus, the consecutive operation determination process ends.
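The walkthrough above can be condensed into the following minimal Python sketch of steps S501 to S506. The class name is hypothetical; the N001/N002 slots follow the operation command storage table described in this example.

```python
# Minimal sketch of the consecutive operation determination process
# (steps S501-S506), using the N001/N002 slots described above.
class ConsecutiveOperationDeterminer:
    def __init__(self):
        self.table = {"N001": None, "N002": None}

    def on_operation_command(self, command):
        self.table["N002"] = command                        # step S501
        if self.table["N001"] is None:                      # step S502: no previous command stored
            self.table["N001"] = self.table["N002"]         # step S506
            return False
        if self.table["N001"] == self.table["N002"]:        # step S503: same command twice
            self.table["N001"] = self.table["N002"] = None  # step S504: discard stored commands
            return True                                     # step S505: trigger the presentation determination
        self.table["N001"] = self.table["N002"]             # step S506
        return False

det = ConsecutiveOperationDeterminer()
print(det.on_operation_command("C001"))  # False (first operation, "power on")
print(det.on_operation_command("C002"))  # False (different operation, "increase the volume")
print(det.on_operation_command("C002"))  # True  (same operation performed consecutively)
```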
In the operation command storage table shown in
Note that the operation command storage table may include time information indicating the time when each operation command has been inputted.
In step S601, the recommended input device determination unit 305 receives the input device ID transmitted in step S304. In step S602, the recommended input device determination unit 305 determines whether the input device ID is I001 representing the speech input device to determine whether the user has operated the target device using speech input. If so determined, the process proceeds to step S603; if not so determined, the process ends. In step S603, the recommended input device determination unit 305 refers to information in the recommended input device determination table shown in
If the recommended input device ID is UI001 representing the speech input device, the recommended input device determination unit 305 can determine that it is appropriate that the user has inputted the speech to the speech input device to operate the device and that there is no need to recommend any other input device to the user. Accordingly, the recommended input device determination unit 305 ends the recommended input device presentation determination process without performing a recommended input device position determination process.
In contrast, if the recommended input device ID is not UI001 representing the speech input device, the process proceeds to step S604. In step S604, the recommended input device determination unit 305 determines that with respect to the device operation performed by the user, there is an input device other than the speech input device, and the recommended input device position determination unit 307 performs a recommended input device position determination process.
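A minimal sketch of steps S601 to S604 is shown below. Only I001 (the speech input device ID) and UI001 (the speech input device as a recommended input device) are taken from this description; the remaining table contents are illustrative assumptions.

```python
# Sketch of the recommended input device presentation determination
# (steps S601-S604). Table contents other than UI001 are illustrative.
RECOMMENDED_INPUT_DEVICE_TABLE = {
    "C001": ["UI001"],            # power on/off: speech input is already suitable
    "C002": ["UI002", "UI003"],   # volume control: e.g. remote control, touchscreen (assumed IDs)
}

def presentation_determination(input_device_id, operation_command):
    if input_device_id != "I001":                     # step S602: the operation was not a speech operation
        return None
    recommended = RECOMMENDED_INPUT_DEVICE_TABLE.get(operation_command, [])  # step S603
    if recommended == ["UI001"]:                      # speech input is the recommended device
        return None
    return recommended                                # step S604: go on to position determination

print(presentation_determination("I001", "C001"))  # None
print(presentation_determination("I001", "C002"))  # ['UI002', 'UI003']
```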
For example, if the user inputs a speech “increase the volume” corresponding to an operation command C002 twice consecutively, the recommended input device determination unit 305 receives the input device ID I001 representing the speech input device in step S304 each time. In step S602, the received input device ID is I001. Accordingly, the process proceeds to step S603. In step S603, an operation using the remote control or touchscreen is recommended over an operation using a speech with respect to C002 on the basis of the recommended input device determination table shown in
For example, assume that the user operates the television using multiple input devices such as the speech input device, remote control, and smartphone. While the user can more easily input a keyword by using speech input when searching for a program, the user can more easily operate the television by using a fast-feedback device such as the remote control when controlling the volume as desired. However, a user who cannot selectively use the multiple input devices inputs a speech "increase the volume" or a speech "decrease the volume" a number of times in an attempt to control the volume using speech input, and thus feels annoyed about speech input. In view of the foregoing, the recommended input device presentation determination process is performed. Thus, if input devices other than the speech input device are previously set as recommended input devices, the recommended input devices are presented to the user, and the user can know whether the inputted operation command is suitable for speech input. As a result, the user can learn which operation commands are suitable for speech input and which are suitable for the other input devices, and thus can selectively use the recommended input devices and reduce the annoyance associated with use of the multiple input devices.
In the recommended input device determination table, respective operation commands and corresponding recommended input device IDs are stored in the transverse (row) direction. In the recommended input device ID table, respective recommended input device IDs and corresponding input devices and input methods are stored in the transverse (row) direction.
In step S801, the operation method display unit 308 receives the recommended input device IDs and recommended input device position information transmitted in steps S006 and S007, respectively. If it has received no recommended input device IDs or recommended input device position information, the operation method display unit 308 repeats step S801 until it receives them. If the operation method display unit 308 determines in step S802 that a single recommended input device ID has been received in step S801, the process proceeds to step S804. If the operation method display unit 308 determines in step S802 that multiple recommended input device IDs have been received in step S801, the process proceeds to step S803. In step S803, the operation method display unit 308 determines, from among the multiple recommended input device IDs, the recommended input device which is closest to the user and decides on that device as the recommended input device to be presented to the user. Note that all the multiple recommended input devices may be presented to the user without performing steps S802 and S803.
In step S804, the operation method display unit 308 presents the recommended input device other than the speech input device to the user, for example, using a method shown in
In step S805, the operation method display unit 308 determines a method for presenting the position information of the recommended input device to the user on the basis of information as to whether to present the position of the recommended input device and the presentation method information corresponding to the distance between the user and recommended input device described in a table shown in
For example, if the remote control and smartphone are recommended with respect to the operation instruction inputted by the user, the operation method display unit 308 receives the respective recommended input device IDs and the position information thereof in step S801. In step S802, the number of recommended input device IDs, which correspond to the remote control and smartphone, is two. Accordingly, the process proceeds to step S803. If the distance between the remote control and user is 50 cm and the distance between the smartphone and user is 80 cm, the operation method display unit 308 determines in step S803 that the remote control is closer to the user than the smartphone. In step S804, the operation method display unit 308 presents recommended input device information such as “remote control is recommended” to the user on the basis of the determination in step S803. In step S805, the remote control is distant from the user by 50 cm. Accordingly, the process proceeds to step S806. The operation method display unit 308 presents recommended input device position information to the user by causing the remote control to emit a sound.
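The example above might be sketched as follows. The 0.3 m and 1.0 m thresholds follow the presentation method determination table discussed below, while the concrete presentation actions and the function name are illustrative assumptions.

```python
# Sketch of the presentation step (S801-S806): pick the recommended device
# closest to the user and choose how to announce its position by distance.
def present_recommendation(candidates):
    # candidates: list of (device_name, distance_to_user_in_metres)
    if not candidates:
        return None
    device, distance = min(candidates, key=lambda c: c[1])    # steps S802-S803: closest device
    message = f"{device} is recommended"                       # step S804
    if distance < 0.3:
        hint = "no position hint needed (device is at hand)"   # assumed behaviour
    elif distance < 1.0:
        hint = f"have the {device} emit a sound"                # as in the example (step S806)
    else:
        hint = f"show the location of the {device} on screen"  # assumed behaviour
    return message, hint

print(present_recommendation([("remote control", 0.5), ("smartphone", 0.8)]))
```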
If a recommended input device which is distant from the user is presented to the user, the user must take time and effort to move to search for or acquire the recommended input device. Thus, the user may feel annoyed about the use of the recommended input device. In view of the foregoing, if multiple recommended input devices are previously set, the operation method display unit 308 presents, to the user, a recommended input device which is closest to the user, of the recommended input devices on the basis of the positions of the user and recommended input devices. Thus, even if all the recommended input devices are distant from the user, the user can know the position of the recommended input device which is the closest to the user. As a result, the user no longer has to take the time and effort to move to search for or acquire the recommended input device which is distant from the user. Further, presenting a recommended input device to the user can assist the user in selectively using the speech input device and the multiple input devices including the remote control.
In the presentation method determination table, the respective distances D between the user and recommended input device and corresponding information as to whether to present the position information of the recommended input device and presentation method IDs are stored in the transverse (row) direction. Note that while the information as to whether to present the position information and the presentation method are changed at 0.3 m and 1.0 m in
Hereafter, the above processes will be described using specific examples. For example, assume that the user has issued a speech “increase the volume” toward the television. First, the speech input device 1 performs the feature value extraction process S001 on the speech signal issued by the user. In the speech recognition process S002, the server 2 converts the extracted speech feature value into character strings or word strings on the basis of the information in the speech recognition dictionary storage unit 203. The server 2 then performs the device operation determination process S003 on the basis of the resulting character strings or word strings. Thus, the server 2 determines that the speech issued by the user intends a “volume control” operation. Then, the target device 3 performs the device operation process S004 on the basis of the determination to increase the volume of the television. The target device 3 then performs the consecutive operation determination process S005. Specifically, the target device 3 determines whether volume control operation commands have been performed consecutively, determines that the volume control operation command is the first one, and stores the operation command in the operation command storage table 304, ending the process.
Subsequently, for example, the user feels that the single volume control operation is not sufficient and then issues another speech “increase the volume” toward the television. Then, the speech input device 1, server 2, and target device 3 perform similar processes to those which have been performed on the first speech until the consecutive operation determination process S005. Since the volume control operation command is stored in the operation number N001 of the operation command storage table 304, the target device 3 determines that the user has inputted the two volume control operation commands consecutively. That is, it determines that the user has inputted the same operation commands consecutively. Accordingly, the target device 3 performs the recommended input device presentation determination process S006. Specifically, it determines whether there are any recommended input devices other than the speech input device in controlling the volume. The target device 3 then determines that the remote control or smartphone is more suitable for controlling the volume than speech input, and the input devices 4 perform the recommended input device position determination process S007. As a result, the input devices 4 calculate the distance between the user and remote control, for example, as 25 cm and the distance between the user and smartphone, for example, as 50 cm. Thus, the target device 3 determines that the recommended input device to be presented to the user is the remote control. The target device 3 then performs the recommended input device presentation process S008 on the basis of the determination. Specifically, it causes the television to emit a sound stating “remote control operation is recommended” toward the user.
By presenting the recommended input device in this manner, it is possible to present a fast-feedback input device such as the remote control to the user with respect to a troublesome operation such as volume control, in which the user must input a speech "increase the volume" or a speech "decrease the volume" to the speech input device a number of times. Thus, the user can learn that the remote control or smartphone is recommended for controlling the volume. Such learning finally allows the user to select a recommended input device for each operation by himself or herself.
In the above example, when the user has inputted the same operation instructions consecutively using speeches, the target device 3 performs the recommended input device presentation determination process. Next, there will be described a process of performing the recommended input device presentation determination process on the basis of the frequency with which the same operation command has been inputted.
In step S901, the target device 3 stores the operation command transmitted in step S304 in an operation command storage table (history information) including time information as shown in
In step S902, the consecutive operation determination unit 303 calculates the frequency with which the same type of operation command as the current operation command has been inputted within the predetermined period of time (input frequency). The predetermined period of time may be any period of time and is set to, for example, 10 sec, 1 min, 30 min, or 1 h in accordance with the type of the target device. In the example of
In step S903, the consecutive operation determination unit 303 determines whether the input frequency is greater than or equal to a predetermined value. The predetermined value may be any value and is set to, for example, two. If the input frequency is greater than or equal to the predetermined value, the recommended input device determination unit 305 performs the recommended input device presentation determination process in step S904, and the operation method display unit 308 presents a recommended input device to the user on the basis of the determination. For example, the operation method display unit 308 presents the remote control or smartphone, which is an input device other than the speech input device, to the user as a recommended input device. If the input frequency is not greater than or equal to the predetermined value, the recommended input device determination unit 305 ends the process without performing the recommended input device presentation determination process.
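A rough sketch of steps S901 to S903 is given below, assuming an illustrative 60-second window and a threshold of two, in line with the example values mentioned above.

```python
import time

# Sketch of the frequency-based determination (steps S901-S903).
class FrequencyDeterminer:
    def __init__(self, window_seconds=60.0, threshold=2):
        self.window = window_seconds
        self.threshold = threshold
        self.history = []                        # (timestamp, operation_command)

    def on_operation_command(self, command, now=None):
        now = time.time() if now is None else now
        self.history.append((now, command))      # step S901: store with time information
        recent = [c for (t, c) in self.history
                  if c == command and now - t <= self.window]   # step S902: input frequency
        return len(recent) >= self.threshold     # step S903: recommend if the frequency is high

det = FrequencyDeterminer()
print(det.on_operation_command("C002", now=0.0))    # False
print(det.on_operation_command("C002", now=20.0))   # True -> present a recommended input device
```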
As seen above, if the user has controlled the volume of the television using speech input a number of times within a short period of time, it is possible to present a fast-feedback input device such as the remote control to the user with respect to the troublesome operation using speech input. Thus, the user can learn that the remote control or smartphone is recommended in controlling the volume. Such learning finally allows the user to select a recommended input device for each operation by himself or herself.
Next, there will be described a process of learning the operation of the target device which has changed in accordance with an operation command and controlling the target device on the basis of the learned operation.
In step S1001, the communication unit 301 of the target device 3 receives an operation command. In step S1002, the device control unit 302 controls the operation of the target device 3. For example, if the user inputs a speech stating “increase the volume by 10” when the volume of the television is 15, the device control unit 302 increases the volume of the television by 10, thereby setting the volume to 25. In step S1003, the device control unit 302 stores, as history information, the operation in which the television volume of 15 has been changed to 25. The operation may be stored in the operation command storage table 304 serving as history information or may be stored in history information which is different from the operation command storage table 304. For another example, if the user inputs a speech stating “change the volume to 25,” the device control unit 302 sets the volume of the television to 25. The device control unit 302 then stores, as history information, the operation in which the volume of the television has been changed to 25. By accumulating operations in this manner, it is possible to obtain a learning result indicating that “when increasing the volume, the volume is often changed to 25.” By controlling the device on the basis of such a learning result, it is possible to set the device in accordance with the preferences of the user. Further, since the history information includes specific numerals such as the amount of variation of volume, it is possible to accurately set the device in accordance with the preferences of the user.
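As a non-limiting illustration of this history-based learning, the following Python sketch records volume changes and derives the value the user most often ends up at. The class and method names are hypothetical.

```python
from collections import Counter

# Sketch of the history-learning idea (steps S1001-S1003): record each
# volume change and learn the value the user most often changes it to.
class VolumeHistory:
    def __init__(self):
        self.changes = []                        # (old_volume, new_volume)

    def record(self, old_volume, new_volume):
        self.changes.append((old_volume, new_volume))   # step S1003: store as history information

    def learned_target(self):
        # e.g. "when increasing the volume, the volume is often changed to 25"
        increases = [new for old, new in self.changes if new > old]
        if not increases:
            return None
        return Counter(increases).most_common(1)[0][0]

hist = VolumeHistory()
hist.record(15, 25)      # "increase the volume by 10" at volume 15
hist.record(18, 25)      # "change the volume to 25"
print(hist.learned_target())   # 25
```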
In step S1101, the communication unit 301 receives an operation command. In step S1102, the device control unit 302 determines whether it can operate the device in accordance with the operation command, which is based on speech input. For example, if a speech stating “increase the volume by 10” has been inputted, the device control unit 302 can operate the device, since the amount of volume to be increased is obvious. Accordingly, in step S1105, the device control unit 302 increases the volume of the television by 10.
On the other hand, if a speech stating “increase the volume” has been inputted, the device control unit 302 determines that it cannot operate the device, since the amount of volume to be increased is not obvious. Accordingly, the process proceeds to step S1103.
In step S1103, the device control unit 302 determines whether history information of operations corresponding to commands indicating “increase the volume” is accumulated. For example, the device control unit 302 determines whether corresponding multiple operations are accumulated. If so determined, the device control unit 302 operates the device in accordance with the operations in step S1105. For example, if operations in which the volume has been changed to 25 are accumulated, the device control unit 302 sets the volume of the television to 25.
If not so determined, the recommended input device determination unit 305 performs a recommended input device presentation determination process in step S1104, and the operation method display unit 308 presents a recommended input device to the user on the basis of the determination. Thus, the user, who may control the volume using speech input a number of times, can avoid performing such troublesome operations. The recommended input device presented is, for example, the remote control or smartphone, which is an input device other than the speech input device.
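The branching of steps S1101 to S1105 might be sketched as follows; the function name and argument layout are illustrative assumptions.

```python
# Sketch of steps S1101-S1105: operate when the speech command specifies
# the amount of change or when accumulated history supplies it, otherwise
# fall back to presenting a recommended input device.
def handle_volume_command(amount, learned_volume, current_volume):
    # amount: numeric change parsed from the speech, or None if absent
    if amount is not None:                       # step S1102: the command is unambiguous
        return ("set_volume", current_volume + amount)       # step S1105
    if learned_volume is not None:               # step S1103: history information is accumulated
        return ("set_volume", learned_volume)                # step S1105
    return ("recommend_other_input_device", None)            # step S1104

print(handle_volume_command(10, None, 15))      # ('set_volume', 25)
print(handle_volume_command(None, 25, 15))      # ('set_volume', 25)
print(handle_volume_command(None, None, 15))    # ('recommend_other_input_device', None)
```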
In step S1201, the communication unit 301 receives an operation command. In step S1202, the device control unit 302 determines whether the operation command, which is based on speech input, includes numerical information. The numerical information represents a specific numeral indicating the amount of variation by which the operation of the target device is changed. For example, if a speech stating “increase the volume by 10” has been inputted, the corresponding operation command includes the numeral of 10, by which the volume is to be increased. Accordingly, in step S1205, the device control unit 302 increases the volume of the television by 10.
In contrast, if the operation command does not include any numerical information, for example, if a speech stating “increase the volume” has been inputted, the process proceeds to step S1203. In step S1203, the device control unit 302 determines whether numerical history information satisfying a condition has been accumulated. For example, if the current volume of the television is 15 and history information indicating that the television volume of 15 was changed to 25 by increasing it by 10 has been accumulated, the device control unit 302 increases the volume of the television by 10 and sets it to 25 in step S1205.
In contrast, if the device control unit 302 determines that no history information satisfying the condition has been accumulated, the recommended input device determination unit 305 performs a recommended input device presentation determination process in step S1204, and the operation method display unit 308 presents a recommended input device to the user on the basis of the determination. Thus, the user, who would otherwise have to adjust the volume through repeated speech inputs, can avoid such troublesome operations. The recommended input device presented is, for example, the remote control or smartphone, which is an input device other than the speech input device.
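The sketch below illustrates S1201–S1205 under the assumption that each history record keeps the starting volume, so the condition “the current volume of the television is 15” can be matched. The function names are illustrative, and recommend_input_device reuses the helper from the earlier sketch.

```python
def handle_numeric_command(command, current_volume, records):
    if command.get("delta") is not None:
        # S1202/S1205: the command itself carries the numeral (e.g. +10).
        return current_volume + command["delta"]

    # S1203: look for history whose starting volume matches the current condition,
    # e.g. a record that 15 was changed to 25 by increasing it by 10.
    matches = [r for r in records if r["from"] == current_volume]
    if matches:
        return matches[-1]["to"]          # S1205: reuse the accumulated change.

    recommend_input_device(["remote control", "smartphone"])   # S1204
    return current_volume
```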
Next, a process of determining whether to present a recommended input device on the basis of the likelihood of the speech recognition will be described.
In step S1301, the speech recognition unit 202 converts a received speech feature value into character strings or word strings. In step S1302, the device operation determination unit 204 recognizes an operation command on the basis of the result of the speech recognition. Note that if the target device 3 has the functions of the speech recognition unit 202 and device operation determination unit 204, it performs these processes.
In step S1303, the device control unit 302 calculates the likelihood of the speech recognition. For example, the likelihood can be obtained by calculating the distance between the recognized speech and a language model serving as a reference. In this case, the shorter the calculated distance, the higher the likelihood.
Subsequently, in step S1304, the device control unit 302 determines whether the calculated likelihood is lower than a predetermined value. The predetermined value may be any value. If the likelihood is higher than or equal to the predetermined value, the device control unit 302 controls the device in accordance with the recognized operation command in step S1306.
In contrast, if the likelihood is lower than the predetermined value, the recommended input device determination unit 305 performs a recommended input device presentation determination process in step S1305, and the operation method display unit 308 presents a recommended input device to the user on the basis of the determination. The recommended input device presented is, for example, the remote control or smartphone, which is an input device other than the speech input device. If the likelihood is low, an operation based on the inputted speech is more likely to differ from the operation intended by the user. In this case, presenting a recommended input device makes it possible to avoid performing an operation different from the one intended by the user.
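A minimal sketch of S1301–S1306 follows, under a simplifying assumption: the “distance between the recognized speech and a language model” is abstracted as a non-negative score, and the likelihood is mapped so that a shorter distance gives a higher likelihood. The threshold, mapping, and function names are all assumptions for illustration.

```python
LIKELIHOOD_THRESHOLD = 0.6   # the predetermined value may be any value

def likelihood_from_distance(distance):
    # Assumed mapping: shorter distance -> higher likelihood, in (0, 1].
    return 1.0 / (1.0 + distance)

def decide_by_likelihood(distance, operate, recommend):
    likelihood = likelihood_from_distance(distance)           # S1303
    if likelihood >= LIKELIHOOD_THRESHOLD:                     # S1304
        operate()                                              # S1306: control the device
    else:
        recommend(["remote control", "smartphone"])            # S1305: present a recommendation
```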
Next, a process of determining whether to present a recommended input device on the basis of the noise level of an acquired speech will be described.
In step S1401, the speech recognition unit 202 acquires speech information including a speech feature value. This speech information includes the environmental sound around the speech input device and an operation instruction issued to the target device. In step S1402, the speech recognition unit 202 calculates the noise level of the speech information. For example, the noise level can be calculated by obtaining the signal-to-noise ratio (S/N) from a comparison of the sound pressures of the noise and the speech. In this case, the lower the S/N, the higher the noise level.
Subsequently, in step S1403, the speech recognition unit 202 determines whether the calculated noise level is higher than or equal to a predetermined value. The predetermined value may be any value. If the noise level is lower than the predetermined value, the device control unit 302 controls the target device in accordance with an operation command recognized from the speech information in step S1405.
In contrast, if the noise level is higher than or equal to the predetermined value, the recommended input device determination unit 305 performs a recommended input device presentation determination process in step S1404, and the operation method display unit 308 presents a recommended input device to the user on the basis of the determination. The recommended input device presented is, for example, the remote control or smartphone, which is an input device other than the speech input device. If the noise level is high, an operation based on the inputted speech is more likely to differ from the operation intended by the user. In this case, presenting a recommended input device makes it possible to avoid performing an operation different from the one intended by the user.
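The following sketch of S1401–S1405 assumes that noise and speech segments are available as sample arrays. S/N is computed from their RMS sound pressures; since a lower S/N means a higher noise level, a recommendation is triggered when S/N falls below a threshold, which is equivalent to “the noise level is higher than or equal to the predetermined value.” The threshold and function names are assumptions.

```python
import math

SNR_THRESHOLD_DB = 10.0   # the predetermined value may be any value

def rms(samples):
    # Root-mean-square sound pressure of a non-empty segment.
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def snr_db(speech_samples, noise_samples):
    # S1402: compare the sound pressures of the speech and the noise.
    return 20.0 * math.log10(rms(speech_samples) / rms(noise_samples))

def decide_by_noise(speech_samples, noise_samples, operate, recommend):
    if snr_db(speech_samples, noise_samples) >= SNR_THRESHOLD_DB:    # S1403
        operate()                                                    # S1405: control the device
    else:
        recommend(["remote control", "smartphone"])                  # S1404: present a recommendation
```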
Note that a recommended input device presentation determination process may be performed based on both the noise level and likelihood.
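As an illustrative combination of the two checks, the device could be operated only when both the noise condition and the likelihood condition pass, and a non-speech input device recommended otherwise. The thresholds and helper functions here reuse the hypothetical sketches above.

```python
def decide_combined(distance, speech_samples, noise_samples, operate, recommend):
    ok_noise = snr_db(speech_samples, noise_samples) >= SNR_THRESHOLD_DB
    ok_likelihood = likelihood_from_distance(distance) >= LIKELIHOOD_THRESHOLD
    if ok_noise and ok_likelihood:
        operate()
    else:
        recommend(["remote control", "smartphone"])
```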
The processes described in the present embodiment may be performed by hardware, by software, or by a combination thereof. A computer program for causing hardware or the like to perform those processes may be stored in a memory and then executed by a microcomputer. Such a computer program may be installed on the respective devices from a recording medium (semiconductor memory, optical disk, etc.) storing the computer program, or may be downloaded through an electrical communication line such as the Internet. Such a computer program may also be wirelessly installed on the respective devices.
For example, the technology described in the above aspect can be implemented in the following types of cloud services. However, these types of cloud services are only illustrative.
Service Type 1: Company's Own Data Center Type Cloud Services
In the present type, the service provider 620 operates and manages a data center (cloud server) 703. The service provider 620 also manages an operating system (OS) 702 and an application 701. The service provider 620 provides services using the OS 702 and application 701 (arrow 704).
Service Type 2: IaaS Cloud Services
In the present type, a data center operating company 610 operates and manages a data center (cloud server) 703. A service provider 620 manages an OS 702 and an application 701. The service provider 620 provides services using the OS 702 and application 701 (arrow 704).
Service Type 3: PaaS Cloud Services
In the present type, a data center operating company 610 manages an OS 702 and operates and manages a data center (cloud server) 703. A service provider 620 manages an application 701. The service provider 620 provides services using the OS 702 and application 701 (arrow 704).
Service Type 4: SaaS Cloud Services
In the present type, a data center operating company 610 manages an application 701 and an OS 702 and operates and manages a data center (cloud server) 703. A service provider 620 provides services using the OS 702 and application 701 (arrow 704).
As seen above, in any of these cloud service types, the service provider 620 provides services. The service provider or the data center operating company may develop an OS, an application, a database of big data, or the like on its own, or may outsource such development to a third party.
The technology of the present disclosure is particularly useful in the field of technologies that control the operation of a device using speech.
Claims
1. A method for controlling an operation of a target device using a plurality of input devices including a speech input device, the method comprising:
- acquiring, from the speech input device, speech information including i) environmental sound around the speech input device, and ii) a speech instruction indicating an operation to be performed on the target device;
- calculating a level of noise included in the speech information;
- recognizing the operation; and
- informing a user of a second input device as a recommended input device based on the calculated noise level and the recognized operation instruction,
- wherein the second input device does not include a speech input device.
2. The method according to claim 1,
- wherein the recommendation for the second input device is determined based on i) the calculated noise level, ii) the recognized operation, and iii) recommended input device information, and
- wherein the recommended input device information indicates an input device suitable to each operation to be performed on the target device.
3. The method according to claim 1,
- wherein if the noise level is higher than or equal to a predetermined value, the user is informed of the second input device as the recommended input device.
Type: Application
Filed: Jun 14, 2017
Publication Date: Oct 5, 2017
Applicant: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA (Torrance, CA)
Inventors: Mayu YOKOYA (Osaka), Katsuyoshi YAMAGAMI (Osaka), Yasunori ISHII (Osaka)
Application Number: 15/622,126