Patents Examined by Mark Villena
-
Patent number: 12292953
Abstract: Methods and systems may incorporate voice interaction and other audio interaction to facilitate access to prescription related information and processes. Particularly, voice/audio interactions may be utilized to achieve authentication to access prescription-related information and action capabilities. Additionally, voice/audio interactions may be utilized in performance of processes such as obtaining prescription refills and receiving reminders to consume prescription products.
Type: Grant
Filed: January 30, 2023
Date of Patent: May 6, 2025
Assignee: WALGREEN CO.
Inventors: Andrew David Schweinfurth, Julija Alegra Petkus, Gunjan Dhanesh Bhow
-
Patent number: 12288553
Abstract: A method of detecting a replay attack comprises: receiving an audio signal representing speech; identifying speech content present in at least a portion of the audio signal; obtaining information about a frequency spectrum of each portion of the audio signal for which speech content is identified; and, for each portion of the audio signal for which speech content is identified: retrieving information about an expected frequency spectrum of the audio signal; comparing the frequency spectrum of portions of the audio signal for which speech content is identified with the respective expected frequency spectrum; and determining that the audio signal may result from a replay attack if a measure of a difference between the frequency spectrum of the portions of the audio signal for which speech content is identified and the respective expected frequency spectrum exceeds a threshold level.
Type: Grant
Filed: July 31, 2019
Date of Patent: April 29, 2025
Assignee: Cirrus Logic Inc.
Inventors: John Paul Lesso, César Alonso, Earl Corban Vickers
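The core check this abstract describes, flagging a possible replay when the observed spectrum of speech-bearing audio deviates too far from an expected spectrum, can be sketched as follows. The distance measure (mean absolute difference) and the threshold value are illustrative assumptions, not the patented method:

```python
import numpy as np

def is_replay_attack(observed_spectrum, expected_spectrum, threshold=0.2):
    """Flag a possible replay attack when the frequency spectrum of a
    speech-bearing portion differs from the expected spectrum by more
    than a threshold. Mean absolute difference is an illustrative
    stand-in for the patent's unspecified difference measure."""
    observed = np.asarray(observed_spectrum, dtype=float)
    expected = np.asarray(expected_spectrum, dtype=float)
    difference = np.mean(np.abs(observed - expected))
    return bool(difference > threshold)

# A genuine recording: observed and expected spectra closely match.
genuine = is_replay_attack([1.0, 0.8, 0.5], [1.0, 0.8, 0.5])

# Replay through a loudspeaker typically attenuates low frequencies,
# so the observed spectrum diverges from the expected one.
replayed = is_replay_attack([0.2, 0.8, 0.5], [1.0, 0.8, 0.5])
```

In practice the check would run per-portion over only those frames where speech content was identified, as the claim specifies.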
-
Patent number: 12265788
Abstract: Methods, systems, apparatuses, and non-transitory computer-readable media are provided for providing answer data through multiple connected large language models. Operations may include receiving, through a graphical user interface associated with a local large language model having access to a first limited private dataset but not a second limited private dataset, an input from a user device, identifying, based on the input, an external large language model from among a plurality of external large language models, transmitting the input to the external large language model, receiving, from the external large language model, the answer data responsive to the input, generating, by the local large language model, response data based on the answer data, and outputting the response data at the user device.
Type: Grant
Filed: June 4, 2024
Date of Patent: April 1, 2025
Assignee: Curio XR
Inventor: Ethan Fieldman
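The routing flow described above, with a local model selecting an external model, forwarding the input, and wrapping the returned answer, might look roughly like this. The keyword-based routing rule and the stand-in lambda "models" are assumptions for illustration only:

```python
def route_and_answer(user_input, external_models, local_generate):
    """Identify an external model based on the input, transmit the
    input to it, and let the local model generate response data from
    the returned answer. `external_models` maps a topic keyword to a
    callable; keyword matching is an illustrative routing rule."""
    for topic, model in external_models.items():
        if topic in user_input.lower():
            answer = model(user_input)          # answer data from external LLM
            return local_generate(answer)        # local model wraps the answer
    return local_generate("No external model matched the request.")

# Plain functions stand in for the external large language models.
externals = {
    "billing": lambda q: "Your balance is $0.",
    "weather": lambda q: "Sunny, 22C.",
}
reply = route_and_answer("What is the weather today?", externals,
                         lambda ans: f"Assistant: {ans}")
```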
-
Patent number: 12253620
Abstract: An intelligent assistant records speech spoken by a first user and determines a self-selection score for the first user. The intelligent assistant sends the self-selection score to another intelligent assistant, and receives a remote-selection score for the first user from the other intelligent assistant. The intelligent assistant compares the self-selection score to the remote-selection score. If the self-selection score is greater than the remote-selection score, the intelligent assistant responds to the first user and blocks subsequent responses to all other users until a disengagement metric of the first user exceeds a blocking threshold. If the self-selection score is less than the remote-selection score, the intelligent assistant does not respond to the first user.
Type: Grant
Filed: September 27, 2021
Date of Patent: March 18, 2025
Assignee: Microsoft Technology Licensing, LLC
Inventors: Kazuhito Koishida, Alexander A. Popov, Uros Batricevic, Steven Nabil Bathiche
-
Patent number: 12254885
Abstract: Techniques are described herein for detecting and handling failures in other automated assistants. A method includes: executing a first automated assistant in an inactive state at least in part on a computing device operated by a user; while in the inactive state, determining, by the first automated assistant, that a second automated assistant failed to fulfill a request of the user; in response to determining that the second automated assistant failed to fulfill the request of the user, the first automated assistant processing cached audio data that captures a spoken utterance of the user comprising the request that the second automated assistant failed to fulfill, or features of the cached audio data, to determine a response that fulfills the request of the user; and providing, by the first automated assistant to the user, the response that fulfills the request of the user.
Type: Grant
Filed: January 13, 2023
Date of Patent: March 18, 2025
Assignee: GOOGLE LLC
Inventors: Victor Carbune, Matthew Sharifi
-
Patent number: 12249326
Abstract: At least one exemplary embodiment is directed to a method and device for voice operated control with learning. The method can include measuring a first sound received from a first microphone, measuring a second sound received from a second microphone, detecting a spoken voice based on an analysis of measurements taken at the first and second microphone, learning from the analysis when the user is speaking and a speaking level in noisy environments, training a decision unit from the learning to be robust to a detection of the spoken voice in the noisy environments, mixing the first sound and the second sound to produce a mixed signal, and controlling the production of the mixed signal based on the learning of one or more aspects of the spoken voice and ambient sounds in the noisy environments.
Type: Grant
Filed: September 23, 2021
Date of Patent: March 11, 2025
Assignee: ST Case1Tech, LLC
Inventors: John Usher, Steven Goldstein, Marc Boillot
-
Patent number: 12223959
Abstract: A method includes obtaining, at a first conference endpoint device, spoken command data representing a spoken command detected by the first conference endpoint device during a teleconference between the first conference endpoint device and a second conference endpoint device. The method further includes generating modified spoken command data by inserting a spoken phrase into the spoken command. The method further includes transmitting the modified spoken command data to a natural language service.
Type: Grant
Filed: October 31, 2023
Date of Patent: February 11, 2025
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Gregory Pelton, Kwan Truong, Cody Schnacker
-
Patent number: 12217742
Abstract: Embodiments are disclosed for generating full-band audio from narrowband audio using a GAN-based audio super resolution model. A method of generating full-band audio may include receiving narrow-band input audio data, upsampling the narrow-band input audio data to generate upsampled audio data, providing the upsampled audio data to an audio super resolution model, the audio super resolution model trained to perform bandwidth expansion from narrow-band to wide-band, and returning wide-band output audio data corresponding to the narrow-band input audio data.
Type: Grant
Filed: November 23, 2021
Date of Patent: February 4, 2025
Assignees: Adobe Inc., The Trustees of Princeton University
Inventors: Zeyu Jin, Jiaqi Su, Adam Finkelstein
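The pipeline in this abstract has two stages: first upsample the narrow-band signal to the target rate, then hand it to the trained super-resolution model. A sketch of the upsampling stage with a stand-in identity "model" (linear interpolation here is an assumption; the patent does not specify the upsampling method):

```python
import numpy as np

def upsample_linear(samples, factor):
    """Naive linear-interpolation upsampling of narrow-band samples to
    `factor` times the sample count; the GAN-based model would then
    restore the missing high-frequency content."""
    samples = np.asarray(samples, dtype=float)
    n = len(samples)
    old_idx = np.arange(n)
    new_idx = np.linspace(0, n - 1, n * factor)
    return np.interp(new_idx, old_idx, samples)

def super_resolve(upsampled, model=lambda x: x):
    """Placeholder for the trained bandwidth-expansion model
    (identity function here; a real model would be a trained GAN)."""
    return model(upsampled)

narrow = [0.0, 1.0, 0.0]
wide = super_resolve(upsample_linear(narrow, 2))  # 6 wide-band samples
```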
-
Patent number: 12216990
Abstract: A method for automatically generating clinical structured reports based on templates using voice recognition, the method implemented by a computing device and includes applying a natural language processing algorithm to a captured voice input from a client to identify one or more keywords or phrases. At least one of a plurality of clinical structured report templates are identified based on the identified one or more keywords or phrases correlated to medical examination data points associated with each of the templates. A clinical structured report is automatically prepared based on the identified clinical structured report templates without a non-voice input and without an explicit separate voice command directed to manage a report generation operation. The clinical structured report includes modifications to the clinical structured report template based on the identified one or more keywords or phrases. The clinical structured report is provided to the client.
Type: Grant
Filed: November 8, 2021
Date of Patent: February 4, 2025
Assignee: ClickView Corporation
Inventors: David A. Martinez, David K. Martinez
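The template-identification step, correlating keywords extracted from the dictation with the data points associated with each template, could be sketched like this. The overlap-count scoring and the template names are illustrative assumptions, not ClickView's system:

```python
def select_template(transcript_keywords, templates):
    """Score each clinical report template by how many of its
    associated data points match keywords identified in the voice
    input, and pick the best-scoring template (None if no overlap)."""
    def score(template):
        return len(set(template["data_points"]) & set(transcript_keywords))
    best = max(templates, key=score)
    return best if score(best) > 0 else None

# Hypothetical templates with their medical examination data points.
templates = [
    {"name": "cardiac_exam", "data_points": ["heart rate", "murmur", "rhythm"]},
    {"name": "knee_mri", "data_points": ["meniscus", "acl", "effusion"]},
]
chosen = select_template(["murmur", "rhythm"], templates)
```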
-
Patent number: 12217746
Abstract: A controller for a furniture drive includes an operating device which includes a speech controller. The speech controller includes a speech control subunit operatively connected to an adjustment drive, and a microphone interacting with the speech control subunit. The speech controller includes three speech control subunits arranged in the operating unit, with two of the speech control subunits forming actuators of adjustment functions and one of the speech control units forming an actuator of stopping the adjustment drive.
Type: Grant
Filed: April 9, 2019
Date of Patent: February 4, 2025
Assignee: Dewertokin Technology Group Co., Ltd
Inventor: Armin Hille
-
Patent number: 12211497
Abstract: Techniques for coordinating output of inferred content using various components and systems are described. A supplemental content system and a notification system may each receive inferred content to be output. When the supplemental content system or the notification system outputs the inferred content, the respective system stores a record of the output of the content in a historical output storage. Thereafter, when the other system is ready to output the inferred content, the other system may prevent the inferred content from being output based on the inferred content having already been output, as represented in the historical output storage.
Type: Grant
Filed: May 6, 2021
Date of Patent: January 28, 2025
Assignee: Amazon Technologies, Inc.
Inventors: Vinaya Nadig, Samarth Bhargava
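The coordination mechanism above is essentially deduplication through a shared history store: whichever system outputs first records the fact, and the other system checks the record before outputting. A minimal sketch (class and identifier names are illustrative):

```python
class HistoricalOutputStorage:
    """Shared record of inferred content that has already been
    output, consulted by both systems before they output anything."""
    def __init__(self):
        self._delivered = set()

    def record(self, content_id):
        self._delivered.add(content_id)

    def already_output(self, content_id):
        return content_id in self._delivered

def maybe_output(system_name, content_id, history):
    """Output the content only if no system has output it yet."""
    if history.already_output(content_id):
        return None  # suppress the duplicate
    history.record(content_id)
    return f"{system_name} outputs {content_id}"

history = HistoricalOutputStorage()
first = maybe_output("supplemental", "reorder-reminder", history)
second = maybe_output("notification", "reorder-reminder", history)
```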
-
Patent number: 12197812
Abstract: Systems, methods, and devices may generate speech files that reflect emotion of text-based content. An example process includes selecting a first text from a first source of text content and selecting a second text from a second source of text content. The first text and the second text are aggregated into an aggregated text, and the aggregated text includes a first emotion associated with content of the first text. The aggregated text also includes a second emotion associated with content of the second text. The aggregated text is converted into a speech stored in an audio file. The speech replicates human expression of the first emotion and of the second emotion.
Type: Grant
Filed: April 13, 2023
Date of Patent: January 14, 2025
Assignee: DISH Technologies L.L.C.
Inventor: John C. Calef, III
-
Patent number: 12174864
Abstract: Methods and apparatuses are described for automatic intelligent query suggestion for information retrieval applications. A server a) determines candidate intents associated with user input text received from a remote device, including applying a trained intent classification model to the user input text to predict candidate intents. The server b) calculates a likelihood value for each of the candidate intents. The server c) compiles a list of suggested queries based upon the candidate intents and associated likelihood values. The server d) identifies a subset of the list of suggested queries for display on the remote device. Upon detecting an update to the user input text at the remote device, the server repeats steps a) to d) using the updated user input text, or upon detecting a selection of one of the suggested queries at the remote device, the server retrieves content responsive to the selected query.
Type: Grant
Filed: September 28, 2023
Date of Patent: December 24, 2024
Assignee: FMR LLC
Inventors: Sachin Umrao, Manish Gupta, Matthew McGrath, Bibhash Chakrabarty, Sorin Roman
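Steps a) through d) of this abstract can be sketched as a single function that is re-run whenever the user's input text changes. The fake classifier and the query bank are stand-in assumptions for the trained intent model and the server's query store:

```python
def suggest_queries(user_text, classify, query_bank, top_k=2):
    """a)+b) predict candidate intents with likelihood values,
    c) compile suggested queries ranked by likelihood,
    d) return the subset to display on the remote device."""
    candidates = classify(user_text)                   # [(intent, likelihood)]
    ranked = sorted(candidates, key=lambda c: -c[1])   # highest likelihood first
    suggestions = [query_bank[intent] for intent, _ in ranked
                   if intent in query_bank]
    return suggestions[:top_k]

bank = {"balance": "What is my account balance?",
        "transfer": "How do I transfer funds?"}
fake_model = lambda text: [("balance", 0.7), ("transfer", 0.2)]
suggested = suggest_queries("how much money", fake_model, bank)
```

On each keystroke the server would call `suggest_queries` again with the updated text, matching the repeat-steps-a-to-d behavior the claim describes.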
-
Patent number: 12175972
Abstract: An audio input associated with a human utterance received at the audio input device is received from a respective one of a plurality of audio input devices. Each of the plurality of audio input devices is located in a corresponding physical location within the premises. The audio input is mapped to an intent. An audible verbal response associated with the intent is provided as audio output via a selected one or more of a plurality of audio output devices. Each of the plurality of audio output devices is located in an associated physical location within the premises.
Type: Grant
Filed: October 19, 2021
Date of Patent: December 24, 2024
Assignee: Josh.ai, Inc.
Inventors: Alex Nathan Capecelatro, Timothy Earl Gill, Edward John McKenna, Jr., Derek Murphy, Scott Lon Allen
-
Patent number: 12158839
Abstract: The disclosure provides a method and an apparatus for allocating memory, and an electronic device. Multiple frames of speech data are received and input to a neural network model. The neural network model is configured to ask for multiple data tensors when processing the multiple frames of speech data, and the multiple data tensors share a common memory.
Type: Grant
Filed: November 15, 2021
Date of Patent: December 3, 2024
Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Chao Tian, Lei Jia
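The tensor-sharing idea, with multiple tensor requests backed by one common memory region, can be illustrated with a toy allocator that grows a single buffer to the largest request and hands out views into it. This is an illustrative allocator sketch, not Baidu's implementation:

```python
class SharedMemoryPool:
    """Multiple data tensors requested by the model share one common
    backing buffer: the pool keeps a single buffer sized to the
    largest request and hands out views into it."""
    def __init__(self):
        self.buffer = bytearray(0)

    def request(self, num_bytes):
        if num_bytes > len(self.buffer):
            # Grow the shared buffer to satisfy the largest request.
            # Note: views handed out before a grow still reference
            # the old buffer; a real allocator would remap them.
            self.buffer = bytearray(num_bytes)
        return memoryview(self.buffer)[:num_bytes]

pool = SharedMemoryPool()
a = pool.request(64)    # first tensor: buffer grows to 64 bytes
b = pool.request(128)   # larger request: buffer grows to 128 bytes
c = pool.request(32)    # smaller request reuses the same buffer as b
```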
-
Patent number: 12147769
Abstract: A method for aligning a text with a media material, an apparatus, and a storage medium are provided. The method includes: determining a set of anchor points in the text; performing following operations i) to v) repeatedly until all anchor points are removed from the set of anchor points or all media materials are removed from a set of media materials: i) ranking the anchor points in the set of anchor points, ii) selecting a target anchor point from the set of anchor points based on the ranked anchor points in the set, iii) determining, from the set of media materials, a media material matching a text segment starting from the target anchor point, iv) removing the target anchor point, and v) removing the media material matching the text segment starting from the target anchor point; and aligning the text segments with respective media matching materials.
Type: Grant
Filed: November 30, 2021
Date of Patent: November 19, 2024
Assignees: BAIDU.COM TIMES TECHNOLOGY (BEIJING) CO., LTD., BAIDU USA LLC
Inventors: Yichen Hu, Xi Chen, Hao Tian
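The loop of operations i) to v) can be sketched directly. Simple sort order stands in for the unspecified ranking step, and a lookup table stands in for the matching step, so both are assumptions:

```python
def align(anchors, materials, matches):
    """Repeat i)-v) until the anchor set or the material set is
    empty: rank anchors, select the top anchor, match it to a media
    material, and remove both. `matches` (anchor -> material) stands
    in for the matching step; sorting stands in for ranking."""
    alignment = []
    anchors = sorted(anchors)                  # i) rank the anchor points
    materials = list(materials)
    while anchors and materials:
        target = anchors.pop(0)                # ii) select, iv) remove anchor
        material = matches.get(target)         # iii) match a media material
        if material in materials:
            materials.remove(material)         # v) remove the matched material
            alignment.append((target, material))
    return alignment

pairs = align(["a1", "a2"], ["clip1", "clip2"],
              {"a1": "clip2", "a2": "clip1"})
```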
-
Patent number: 12141207
Abstract: The present invention allows appropriate acquisition of focus points in a dialogue.
Type: Grant
Filed: August 14, 2019
Date of Patent: November 12, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Setsuo Yamada, Yoshiaki Noda, Takaaki Hasegawa
-
Patent number: 12128839
Abstract: An embodiment vehicle includes a communication device, an output device, an input device including a microphone, and a control device connected to each. The control device is configured to transmit speech data received through the microphone to a server through the communication device, to receive analysis information of the speech data from the server through the communication device in response to the transmission of the speech data, to identify whether the speech data corresponds to a first speech command registered in the server based on the analysis information, to control the output device to output information about target devices of a control setting of the vehicle based on the speech data not corresponding to the first speech command, and to set the speech data as a second speech command for controlling at least one of the target devices based on a reception of a user input through the input device.
Type: Grant
Filed: October 1, 2021
Date of Patent: October 29, 2024
Assignees: HYUNDAI MOTOR COMPANY, KIA CORPORATION
Inventors: Youngjae Park, Jaemin Joh
-
Patent number: 12118997
Abstract: A method and system for controlling response to a voice-command utterance. An example method includes a computing system that is associated with the first device carrying out operations upon the first device receiving the voice-command utterance. The operations include (a) making a determination of whether any of one or more second devices received the voice-command utterance before the first device received the voice-command utterance and (b) controlling whether the computing system will trigger an action in response to the first device receiving the voice-command utterance, with the controlling being based on the determination of whether any of the one or more second devices received the voice-command utterance before the first device received the voice-command utterance. In an example implementation, the action could be controlling operation of a control target such as one or more lights.
Type: Grant
Filed: May 16, 2023
Date of Patent: October 15, 2024
Assignee: Roku, Inc.
Inventors: Soren Riise, Frank Maker, Carl Sassenrath, Abhay Bhorkar
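The determination in operations (a) and (b) reduces to comparing when each device received the utterance and acting only if no other device heard it first. A minimal sketch, assuming the devices exchange receive timestamps (the abstract does not specify the mechanism):

```python
def should_trigger(first_device_time, other_device_times):
    """(a) determine whether any second device received the utterance
    before this device; (b) trigger the action only if none did.
    Times are receive timestamps in seconds."""
    return all(first_device_time <= t for t in other_device_times)

trigger = should_trigger(10.00, [10.05, 10.12])  # this device heard it first
defer = should_trigger(10.20, [10.05])           # another device was earlier
```

In the patent's example, `trigger` being true would let the computing system act on a control target such as the lights, while `defer` would suppress a duplicate response.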
-
Patent number: 12112744
Abstract: The disclosure provides a multimodal speech recognition method and system, and a computer-readable storage medium. The method includes calculating a first logarithmic mel-frequency spectral coefficient and a second logarithmic mel-frequency spectral coefficient when a target millimeter-wave signal and a target audio signal both contain speech information corresponding to a target user; inputting the first and the second logarithmic mel-frequency spectral coefficient into a fusion network to determine a target fusion feature, where the fusion network includes at least a calibration module and a mapping module, the calibration module is configured to perform mutual feature calibration on the target audio/millimeter-wave signals, and the mapping module is configured to fuse a calibrated millimeter-wave feature and a calibrated audio feature; and inputting the target fusion feature into a semantic feature network to determine a speech recognition result corresponding to the target user.
Type: Grant
Filed: March 2, 2022
Date of Patent: October 8, 2024
Assignee: Zhejiang University
Inventors: Feng Lin, Tiantian Liu, Ming Gao, Chao Wang, Zhongjie Ba, Jinsong Han, Wenyao Xu, Kui Ren