Speech To Image Patents (Class 704/235)
  • Patent number: 12035070
    Abstract: A system and method for facilitating communication between an assisted user (AU) and a hearing user (HU) includes receiving an HU voice signal as the AU and HU participate in a call using AU and HU communication devices, transcribing HU voice signal segments into verbatim caption segments, processing each verbatim caption segment to identify an intended communication (IC) intended by the HU upon uttering an associated one of the HU voice signal segments, for at least a portion of the HU voice signal segments (i) using an associated IC to generate an enhanced caption different than the associated verbatim caption, (ii) for each of a first subset of the HU voice signal segments, presenting the verbatim captions via the AU communication device display for consumption, and (iii) for each of a second subset of the HU voice signal segments, presenting enhanced captions via the AU communication device display for consumption.
    Type: Grant
    Filed: November 10, 2022
    Date of Patent: July 9, 2024
    Assignee: Ultratec, Inc.
    Inventors: Christopher Engelke, Kevin R. Colwell, Robert M. Engelke
  • Patent number: 12033615
    Abstract: The disclosure provides a method and an apparatus for recognizing a speech, an electronic device and a storage medium. A speech to be recognized is obtained. An acoustic feature of the speech to be recognized and a language feature of the speech to be recognized are obtained. The speech to be recognized is input into a pronunciation difference statistics model to generate a differential pronunciation pair corresponding to the speech to be recognized. The text information of the speech to be recognized is generated based on the differential pronunciation pair, the acoustic feature and the language feature.
    Type: Grant
    Filed: October 12, 2021
    Date of Patent: July 9, 2024
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Yinlou Zhao, Liao Zhang, Zhengxiang Jiang
  • Patent number: 12033616
    Abstract: A method for training a speech recognition model, a device and a storage medium, which relate to the field of computer technologies, and particularly to the fields of speech recognition technologies, deep learning technologies, or the like, are disclosed. The method for training a speech recognition model includes: obtaining a fusion probability of each of at least one candidate text corresponding to a speech based on an acoustic decoding model and a language model; selecting a preset number of one or more candidate texts based on the fusion probability of each of the at least one candidate text, and determining a predicted text based on the preset number of one or more candidate texts; and obtaining a loss function based on the predicted text and a standard text corresponding to the speech, and training the speech recognition model based on the loss function.
    Type: Grant
    Filed: January 10, 2022
    Date of Patent: July 9, 2024
    Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Junyao Shao, Xiaoyin Fu, Qiguang Zang, Zhijie Chen, Mingxin Liang, Huanxin Zheng, Sheng Qian
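The "fusion probability" in the abstract above resembles shallow fusion, in which acoustic-model and language-model scores are interpolated before ranking candidate texts. A minimal sketch in Python, assuming log-probability scores and a hypothetical `lm_weight` interpolation parameter (neither detail is specified in the abstract):

```python
def fusion_score(acoustic_logprob, lm_logprob, lm_weight=0.3):
    """Combine acoustic and language-model log-probabilities (shallow fusion)."""
    return acoustic_logprob + lm_weight * lm_logprob

def select_candidates(candidates, preset_number):
    """candidates: list of (text, acoustic_logprob, lm_logprob).
    Return the preset number of candidate texts ranked by fusion score."""
    ranked = sorted(candidates,
                    key=lambda c: fusion_score(c[1], c[2]),
                    reverse=True)
    return [text for text, _, _ in ranked[:preset_number]]

candidates = [
    ("recognize speech", -2.0, -1.0),
    ("wreck a nice beach", -1.8, -4.0),
]
print(select_candidates(candidates, 1))  # the acoustically close but unlikely text loses
```

The predicted text chosen this way would then be compared against the standard (reference) text to form the training loss.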
  • Patent number: 12033630
    Abstract: An information processing device includes an input unit, an extracting unit, an output unit, and a specifying unit. The input unit receives a voice operation. The extracting unit extracts a processing detail corresponding to the voice operation received by the input unit. When the processing detail corresponding to the voice operation received by the input unit cannot be specified, the output unit outputs response information for the user to make a selection of at least one processing detail from a plurality of processing details extracted by the extracting unit. The specifying unit specifies the processing detail selected from among the plurality of processing details contained in the response information as the processing detail corresponding to the voice operation received by the input unit.
    Type: Grant
    Filed: March 2, 2020
    Date of Patent: July 9, 2024
    Assignee: SONY GROUP CORPORATION
    Inventors: Yuhei Taki, Hiro Iwase, Kunihito Sawai, Masaki Takase, Akira Miyashita
  • Patent number: 12026133
    Abstract: A system for data space limitation includes an interface and a processor. The interface is configured to receive a query for a structured data set. The processor is configured to determine an ordered list of calculations to respond to the query; perform the calculations according to the ordered list until an allowed time required for interactivity is reached; and, in response to the allowed time being reached, provide the results of the calculations.
    Type: Grant
    Filed: June 24, 2021
    Date of Patent: July 2, 2024
    Assignee: Workday, Inc.
    Inventors: Viktor Brada, Peter Fedorocko, Filip Dousek, Hynek Walner
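The core mechanism above is a time-budgeted computation loop: run ordered calculations until the interactivity deadline, then return whatever has been computed. A minimal sketch, with the ordering and the calculations themselves as hypothetical stand-ins:

```python
import time

def answer_within_budget(calculations, allowed_seconds):
    """Run ordered calculations until the interactivity time budget
    is reached; return whatever results are available by then."""
    deadline = time.monotonic() + allowed_seconds
    results = []
    for calc in calculations:          # assumed already ordered by priority
        if time.monotonic() >= deadline:
            break
        results.append(calc())
    return results

# Hypothetical ordered calculations over a structured data set.
calcs = [lambda: sum(range(10)), lambda: max(range(10))]
print(answer_within_budget(calcs, allowed_seconds=0.5))
```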
  • Patent number: 12026197
    Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors and memory, receiving a first natural-language speech input indicative of a request for media, where the first natural-language speech input comprises a first search parameter; providing, by a digital assistant, a first media item identified based on the first search parameter. The method further includes, while providing the first media item, receiving a second natural-language speech input and determining whether the second input corresponds to a user intent of refining the request for media. The method further includes, in accordance with a determination that the second speech input corresponds to a user intent of refining the request for media: identifying, based on the first parameter and the second speech input, a second media item and providing the second media item.
    Type: Grant
    Filed: April 24, 2023
    Date of Patent: July 2, 2024
    Assignee: Apple Inc.
    Inventors: David Chance Graham, Cyrus Daniel Irani, Aimee Piercy, Thomas Alsina
  • Patent number: 12028176
    Abstract: A computer-implemented conferencing method is disclosed. A conference session between a user and one or more other conference participants is initiated via a computer conference application. An attribute-specific pronunciation of the user's name is determined via one or more attribute-specific-pronunciation machine-learning models previously trained based at least on one or more attributes of the one or more other conference participants. The attribute-specific pronunciation of the user's name is compared to a preferred pronunciation of the user's name via computer-pronunciation-comparison logic. Based on the attribute-specific pronunciation of the user's name being inconsistent with the preferred pronunciation of the user's name, a pronunciation learning protocol is automatically executed to convey, via the computer conference application, the preferred pronunciation of the user's name to the one or more other conference participants.
    Type: Grant
    Filed: June 25, 2021
    Date of Patent: July 2, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Mastafa Hamza Foufa, Romain Gabriel Paul Rey
  • Patent number: 12020578
    Abstract: Systems and methods for converting voice to text messages in an aircraft. The systems and methods transcribe voice messages between a member of the flight crew and Air Traffic Control (ATC) to provide ATC text messages, transcribe a voice-automatic terminal information service report (voice-ATIS) to provide an ATIS text report, determine flight context data based at least on an analysis of the ATC text messages, determine relevant ATIS data from the ATIS text report using the flight context data, and render a visual User Interface (UI) including at least some of the ATC text messages and at least some of the relevant ATIS data on the same ATC transcription page.
    Type: Grant
    Filed: March 17, 2022
    Date of Patent: June 25, 2024
    Assignee: HONEYWELL INTERNATIONAL INC.
    Inventors: Sivakumar Kanagarajan, Hariharan Saptharishi, Gobinathan Baladhandapani
  • Patent number: 12022160
    Abstract: A live streaming sharing system includes a first live streaming sharing apparatus, a server, and a second live streaming sharing apparatus. The first live streaming sharing apparatus receives a local live streaming instruction through a virtual reality (VR) display screen, obtains first live streaming data according to the local live streaming instruction, and transmits the first live streaming data to the server, so that the server transmits the first live streaming data to the second live streaming sharing apparatus, the first live streaming data being used by the second live streaming sharing apparatus to present first VR live streaming data. The present disclosure further provides a first live streaming sharing apparatus, a server, and a second live streaming sharing apparatus. The present disclosure achieves the effect of sharing VR content among a plurality of users, thus improving the interactivity and practicability of the solution.
    Type: Grant
    Filed: April 29, 2020
    Date of Patent: June 25, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Zhixuan Ding, Pinxian Li
  • Patent number: 12019976
    Abstract: Systems and methods disclosed relate to contextually tagging statements associated with calls. In particular, the contextual tagging is directed to training a call tagging model for predicting one or more categories associated with a statement for tagging. The disclosed technology generates training data for training the call tagging model based on a list of known phrases used in contacts in a contextual category and matching phrases and words in the list of known phrases against words and phrases used in statements in sample call transcripts. The call tagging model is fine-tuned by using sample statements that appear in contacts. Once trained, the call tagging model is used to determine a probability distribution of categories associated with statements in a contact and further determine contact-level category distributions using multi-dimensional vectors. The tagged contacts are used to determine contacts that are contextually similar to a given contact.
    Type: Grant
    Filed: December 13, 2022
    Date of Patent: June 25, 2024
    Assignee: Calabrio, Inc.
    Inventors: Dylan Morgan, Boris Chaplin, Kyle Smaagard, Chris Vanciu, Laura Cattaneo, Matt Matsui, Catherine Bullock
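The training-data generation step described above, matching known phrases per contextual category against transcript statements, is essentially weak labeling. A minimal sketch, with the phrase lists and statements as hypothetical examples:

```python
def label_statements(statements, known_phrases_by_category):
    """Generate weak training labels by matching known phrases for each
    contextual category against statements from sample call transcripts."""
    labeled = []
    for statement in statements:
        text = statement.lower()
        categories = [
            category
            for category, phrases in known_phrases_by_category.items()
            if any(phrase in text for phrase in phrases)
        ]
        labeled.append((statement, categories))
    return labeled

known = {
    "billing": ["charge on my account", "refund"],
    "cancellation": ["cancel my subscription"],
}
print(label_statements(
    ["I want a refund please", "How is the weather"], known))
```

The resulting (statement, categories) pairs would then serve as training examples for the call tagging model, which generalizes beyond exact phrase matches.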
  • Patent number: 12013852
    Abstract: Systems and methods are described for unified processing of indexed and streaming data. A system enables users to query indexed data or specify processing pipelines to be applied to streaming data. In some instances, a user may specify a query intended to be run against indexed data, but may specify criteria that includes not-yet-indexed data (e.g., a future time frame). The system may convert the query into a data processing pipeline applied to not-yet-indexed data, thus increasing the efficiency of the system. Similarly, in some instances, a user may specify a data processing pipeline to be applied to a data stream, but specify criteria including data items outside the data stream. For example, a user may wish to apply the pipeline retroactively, to data items that have already exited the data stream. The system can convert the pipeline into a query against indexed data to satisfy the user's processing requirements.
    Type: Grant
    Filed: March 27, 2023
    Date of Patent: June 18, 2024
    Assignee: Splunk Inc.
    Inventors: Joseph Gabriel Echeverria, Arthur Foelsche, Eric Sammer, Sarah Stanger
  • Patent number: 12014721
    Abstract: Methods and systems for voice-based identification of related products/services are provided. Exemplary systems may include a wireless communication-based tag reader that polls for a wireless transmission-based tag and reads information associated with the wireless transmission-based tag and a processor that executes instructions to identify a product/service associated with the wireless transmission-based tag, identify a plurality of products/services stored in a product/service database identified as related to the product/service associated with the wireless transmission-based tag, filter through the plurality of related products/services based on at least one voice-based parameter to identify a set of one or more related products/services, and generate a voice-based utterance based on the identified set of one or more related products/services.
    Type: Grant
    Filed: September 27, 2023
    Date of Patent: June 18, 2024
    Assignee: DIGIPRINT IP LLC
    Inventor: Avery Levy
  • Patent number: 12015865
    Abstract: A system for a user taking photographs or videos to help evoke genuine emotions of one or a plurality of live camera subjects, comprising a digital device with a processor, a display, and system memory, a software application with a graphical user interface installed on the digital device or a remote server, wherein said application is used to store, manage, access, and view metadata tagged prompts, a pre-loaded library of metadata tagged prompts, wherein said prompts being emotive phrase(s), direction(s), image(s), sound(s), and/or animation(s), wherein said application has a two-stage filter to successively refine the list of prompts to be presented to the user, and a prompt delivery system to convey prompts to one or a plurality of live camera subjects.
    Type: Grant
    Filed: June 4, 2022
    Date of Patent: June 18, 2024
    Inventor: Jeshurun de Rox
  • Patent number: 12014118
    Abstract: Systems and processes for operating an intelligent automated assistant to perform intelligent list reading are provided. In accordance with one example, a method includes, at an electronic device having one or more processors, receiving a first user input of a first input type, the first user input including a plurality of words; displaying, on the touch-sensitive display, the plurality of words; receiving a second user input of a second input type indicating a selection of a word of the plurality of words, the second input type different than the first input type; receiving a third user input; modifying the selected word based on the third user input to provide a modified one or more words; and displaying, on the touch-sensitive display, the modified one or more words.
    Type: Grant
    Filed: December 17, 2021
    Date of Patent: June 18, 2024
    Assignee: Apple Inc.
    Inventors: Thomas R. Gruber, Mohammed A. Tayyeb, Ron C. Santos, Madhusudan Chinthakunta
  • Patent number: 12008991
    Abstract: A user specifies a natural language command to a device. Software on the device generates contextual metadata about the user interface of the device, such as data about all visible elements of the user interface, and sends the contextual metadata along with the natural language command to a natural language understanding engine. The natural language understanding engine parses the natural language query using a stored grammar (e.g., a grammar provided by a maker of the device) and as a result of the parsing identifies information about the command (e.g., the user interface elements referenced by the command) and provides that information to the device. The device uses that provided information to respond to the command.
    Type: Grant
    Filed: May 27, 2021
    Date of Patent: June 11, 2024
    Assignee: SoundHound AI IP, LLC
    Inventors: Utku Yabas, Philipp Hubert, Karl Stahl
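The resolution step above, using contextual metadata about visible UI elements to ground a natural-language command, can be sketched as a simple token-overlap match. The element schema (`id`, `label`) and the matching heuristic are illustrative assumptions; the patent describes a grammar-based parse rather than this heuristic:

```python
def resolve_command(command, visible_elements):
    """Match a natural-language command against contextual metadata
    describing visible UI elements; return the best-matching element."""
    tokens = set(command.lower().split())
    best, best_overlap = None, 0
    for element in visible_elements:   # e.g. {"id": ..., "label": ...}
        overlap = len(tokens & set(element["label"].lower().split()))
        if overlap > best_overlap:
            best, best_overlap = element, overlap
    return best

elements = [
    {"id": "btn1", "label": "play next episode"},
    {"id": "btn2", "label": "add to watch list"},
]
print(resolve_command("play the next episode", elements)["id"])
```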
  • Patent number: 12008289
    Abstract: Methods and systems are provided for assisting operation of a vehicle using speech recognition and transcription using text-to-speech for transcription playback with variable emphasis. One method involves analyzing a transcription of an audio communication with respect to the vehicle to identify an operational term pertaining to a current operational context of the vehicle within the transcription, creating an indicator identifying the operational term within the transcription for emphasis when the operational term pertains to the current operational context of the vehicle, identifying a user-configured playback rate; and generating an audio reproduction of the transcription of the audio communication in accordance with the user-configured playback rate, wherein the operational term is selectively emphasized within the audio reproduction based on the indicator.
    Type: Grant
    Filed: August 27, 2021
    Date of Patent: June 11, 2024
    Assignee: HONEYWELL INTERNATIONAL INC.
    Inventors: Gobinathan Baladhandapani, Hariharan Saptharishi, Mahesh Kumar Sampath, Sivakumar Kanagarajan
  • Patent number: 12001614
    Abstract: This application is directed to a method for controlling user experience (UX) operations on an electronic device that executes an application. A touchless UX operation associated with the application has an initiation condition including at least detection of a presence and a gesture in a required proximity range with a required confidence level. The electronic device then determines from a first sensor signal the proximity of the presence with respect to the electronic device. In accordance with a determination that the determined proximity is in the required proximity range, the electronic device determines from a second sensor signal a gesture associated with the proximity of the presence and an associated confidence level of the determination of the gesture. In accordance with a determination that the determined gesture and associated confidence level satisfy the initiation condition, the electronic device initializes the touchless UX operation associated with the application.
    Type: Grant
    Filed: July 21, 2022
    Date of Patent: June 4, 2024
    Assignee: Google LLC
    Inventors: Ashton Udall, Andrew Christopher Felch, James Paul Tobin
  • Patent number: 12002451
    Abstract: Techniques for performing automatic speech recognition (ASR) are described. In some embodiments, an ASR component integrates contextual information from user profile data into audio encoding data to predict a token(s) corresponding to a spoken input. The user profile data may include personalized words, such as, contact names, device names, etc. The ASR component determines word embedding data using the personalized words. The ASR component is configured to apply attention to audio frames that are relevant to the personalized words based on processing the audio encoding data and the word embedding data.
    Type: Grant
    Filed: September 24, 2021
    Date of Patent: June 4, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Jing Liu, Feng-Ju Chang, Athanasios Mouchtaris, Martin Radfar, Maurizio Omologo, Siegfried Kunzmann
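The mechanism above, applying attention over audio frames with respect to personalized word embeddings, can be illustrated with toy dot-product attention. The two-dimensional vectors and the scoring function are purely illustrative; the actual encoder dimensions and attention variant are not given in the abstract:

```python
import math

def attention_scores(frame_vectors, word_vector):
    """Softmax attention of audio-frame encodings over one personalized
    word embedding, using dot-product scoring."""
    dots = [sum(f * w for f, w in zip(frame, word_vector))
            for frame in frame_vectors]
    m = max(dots)                      # subtract max for numerical stability
    exps = [math.exp(d - m) for d in dots]
    total = sum(exps)
    return [e / total for e in exps]

frames = [[0.9, 0.1], [0.1, 0.8], [0.5, 0.5]]   # toy audio-frame encodings
contact_name = [1.0, 0.0]                        # toy embedding of a contact name
weights = attention_scores(frames, contact_name)
print(weights.index(max(weights)))  # index of the frame most relevant to the word
```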
  • Patent number: 12002459
    Abstract: A baseline of behavior is determined for a population of users that interact with an autonomous agent that is configured to respond to natural language prompts of users of the population using respective user profiles for each user. An element associated with a first user that deviates more than a threshold amount from the baseline is identified. In response to the element deviating more than the threshold amount, the autonomous agent is caused to engage the first user unprompted regarding the element via an engagement that is configured to verify an association between the first user and the element. A response from the first user regarding the engagement is determined to verify the association between the first user and the element. A first user profile that corresponds to the first user is updated in response to determining that the first user response verifies the association.
    Type: Grant
    Filed: February 24, 2021
    Date of Patent: June 4, 2024
    Assignee: International Business Machines Corporation
    Inventors: Paul R. Bastide, Robert E. Loredo, Matthew E. Broomhall
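The threshold test above, an element deviating "more than a threshold amount from the baseline", can be sketched as a standard-deviation check over the population. The interaction metric and the threshold of two standard deviations are assumptions for illustration:

```python
import statistics

def deviates_from_baseline(population_values, user_value, threshold=2.0):
    """Flag a user element whose value deviates more than `threshold`
    population standard deviations from the population baseline."""
    mean = statistics.mean(population_values)
    stdev = statistics.pstdev(population_values)
    if stdev == 0:
        return user_value != mean
    return abs(user_value - mean) / stdev > threshold

# e.g. daily interactions with the assistant across the population
population = [4, 5, 6, 5, 4, 6, 5]
print(deviates_from_baseline(population, user_value=30))  # True: engage unprompted
```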
  • Patent number: 11996091
    Abstract: A mixed speech recognition method, a mixed speech recognition apparatus, and a computer-readable storage medium are provided. The mixed speech recognition method includes: monitoring speech input and detecting an enrollment speech and a mixed speech; acquiring speech features of a target speaker based on the enrollment speech; and determining speech belonging to the target speaker in the mixed speech based on the speech features of the target speaker. The enrollment speech includes preset speech information, and the mixed speech is non-enrollment speech inputted after the enrollment speech.
    Type: Grant
    Filed: August 10, 2020
    Date of Patent: May 28, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Jun Wang, Jie Chen, Dan Su, Dong Yu
  • Patent number: 11990138
    Abstract: Methods, systems, and computer-readable media for rapid event voice documentation are provided herein. The rapid event voice documentation system captures verbalized orders and actions and translates that unstructured voice data to structured, usable data for documentation. The voice data captured is tagged with metadata including the name and role of the speaker, a time stamp indicating a time the data was spoken, and a clinical concept identified in the data captured. The system automatically identifies orders (e.g., medications, labs and procedures, etc.), treatments, and assessments/findings that were verbalized during the rapid event to create structured data that is usable by a health information system and ready for documentation directly into an EHR. The system provides all of the captured data including orders, assessment documentation, vital signs and measurements, performed procedures, and treatments, and who performed each, available for viewing and interaction in real time.
    Type: Grant
    Filed: May 9, 2023
    Date of Patent: May 21, 2024
    Assignee: Cerner Innovation, Inc.
    Inventors: Allison Michelle Thilges, Neil Curtis Pfeiffer, Eslie Rolland Phillips, III, Geoffrey Harold Simmons
  • Patent number: 11989514
    Abstract: Disclosed herein are system, method, and computer program product embodiments for machine learning systems to process incoming call-center calls to provide communication summaries that capture effort levels of statements made during interactive communications. For a given call, the system receives a transcript as the input and generates a textual summary as the output. In order to improve a call summary and customize a summarization task to a call center domain, the technology disclosed herein may employ a classifier that predicts an effort level and attention score for individual utterances within a call transcript, ranks the attention scores and uses selected ones of the ranked utterances in the summary.
    Type: Grant
    Filed: March 18, 2022
    Date of Patent: May 21, 2024
    Assignee: Capital One Services, LLC
    Inventors: Aysu Ezen Can, Zachary S. Brown, Chris Symons
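The selection step above, ranking utterances by a classifier's attention score and keeping the top-ranked ones for the summary, can be sketched as follows. The precomputed attention scores stand in for the classifier's output, which the abstract does not specify further:

```python
def summarize(utterances, top_n=2):
    """Rank call utterances by a classifier's attention score and keep
    the top-ranked ones, restoring their original call order."""
    ranked = sorted(utterances, key=lambda u: u["attention"], reverse=True)
    selected = ranked[:top_n]
    selected.sort(key=lambda u: u["position"])  # restore call order
    return " ".join(u["text"] for u in selected)

call = [
    {"position": 0, "text": "Hello, thanks for calling.", "attention": 0.1},
    {"position": 1, "text": "I was double charged last month.", "attention": 0.9},
    {"position": 2, "text": "We will refund the extra charge.", "attention": 0.8},
]
print(summarize(call))
```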
  • Patent number: 11989229
    Abstract: Coordinating processing of audio queries is provided. A system receives a query. The system provides the query to a first digital assistant component and a second digital assistant component for processing. The system receives a first response to the query from the first digital assistant component, and a second response to the query from the second digital assistant component. The first digital assistant component can be authorized to access a database the second digital assistant component is prohibited from accessing. The system determines, based on a ranking decision function, to select the second response to the query from the second digital assistant component. The system provides, responsive to the selection, the second response from the second digital assistant to a computing device.
    Type: Grant
    Filed: April 8, 2019
    Date of Patent: May 21, 2024
    Assignee: GOOGLE LLC
    Inventors: Bo Wang, Smita Rai, Max Ohlendorf, Venkat Kotla, Chad Yoshikawa, Abhinav Taneja, Amit Agarwal, Chris Ramsdale, Chris Turkstra
  • Patent number: 11989980
    Abstract: An example method includes receiving, at a computing system, a first user input from a user interface during operation of a vehicle and responsive to receiving the first user input, determining a time of reception for the first user input. The method further includes receiving a first set of parameters from the vehicle that correspond to a first parameter identifier (PID). The method also includes determining a time of reception for each parameter, and based on the time of reception for the first user input and the time of reception for each parameter of the first set of parameters, determining a first temporal position for an indicator configured to represent the first user input on a graph of the parameters corresponding to the first PID. The method further includes displaying, on a display interface, the graph of the parameters corresponding to the first PID with the indicator in the first temporal position.
    Type: Grant
    Filed: April 5, 2022
    Date of Patent: May 21, 2024
    Assignee: Snap-on Incorporated
    Inventors: Joshua C. Covington, Patrick S. Merg, Jacob G. Foreman
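The temporal-position determination above, placing an indicator for a user input among timestamped parameter samples on the graph, amounts to finding where the input's reception time falls in the ordered sequence of parameter reception times. A minimal sketch under that assumption:

```python
import bisect

def indicator_position(input_time, parameter_times):
    """Determine the temporal position of an indicator for a user input
    on a graph of parameters, given each sample's reception time."""
    # Index on the graph's x-axis where the indicator should be drawn.
    return bisect.bisect_left(parameter_times, input_time)

times = [10.0, 10.5, 11.0, 11.5]   # reception times for one PID's parameters
print(indicator_position(10.7, times))  # indicator falls before the third sample
```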
  • Patent number: 11983674
    Abstract: Computerized systems are provided for automatically determining action items of an event, such as a meeting. The determined action items may be personalized to a particular user, such as a meeting attendee, and may include contextual information enabling the user to understand the action item. In particular, a personalized action item may be determined based in part from determining and utilizing particular factors in combination with an event dialog, such as an event speaker's language style; user role in an organization; historical patterns in communication; event purpose, name, or location; event participants, or other contextual information. Particular statements are evaluated to determine whether the statement likely is or is not an action item. Contextual information may be determined for action items, which then may be provided to the particular user during or following the event.
    Type: Grant
    Filed: January 7, 2020
    Date of Patent: May 14, 2024
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Sagi Hilleli, Tomer Hermelin, Ido Priness, Assaf Avihoo, Shlomi Maliah, Eleonora Shtotland, Tzoof Avny Brosh
  • Patent number: 11983554
    Abstract: Disclosed implementations relate to automating semantically-similar computing tasks across multiple contexts. In various implementations, an initial natural language input and a first plurality of actions performed using a first computer application may be used to generate a first task embedding and a first action embedding in action embedding space. An association between the first task embedding and first action embedding may be stored. Later, subsequent natural language input may be used to generate a second task embedding that is then matched to the first task embedding. Based on the stored association, the first action embedding may be identified and processed using a selected domain model to select actions to be performed using a second computer application. The selected domain model may be trained to translate between an action space of the second computer application and the action embedding space.
    Type: Grant
    Filed: April 21, 2022
    Date of Patent: May 14, 2024
    Assignee: X DEVELOPMENT LLC
    Inventors: Rebecca Radkoff, David Andre
  • Patent number: 11978094
    Abstract: A pervasive advisor for major purchases and other expenditures may detect that a customer is contemplating a major purchase (e.g., through active listening). The advisor may assist the customer with the timing and manner of making the purchase in a way that is financially sensible in view of the customer's financial situation. A customer may be provided with dynamically-updated information in response to recent actions that may affect an approved loan amount and/or interest rate. Underwriting of a loan may be triggered based on the geo-location of the user. Financial advice may be provided to customers to help them meet their goals using information obtained from third party sources, such as purchase options based on particular goals. The pervasive advisor may thus intervene to assist with budgeting, financing, and timing of major expenditures based on the customer's location and on the customer's unique and changing circumstances.
    Type: Grant
    Filed: April 14, 2023
    Date of Patent: May 7, 2024
    Assignee: Wells Fargo Bank, N.A.
    Inventors: Balin K. Brandt, Laura Fisher, Marie Jeanette Floyd, Katherine J. McGee, Teresa Lynn Rench, Sruthi Vangala
  • Patent number: 11978443
    Abstract: Disclosed is a technique that enables reasonably smooth conversation even between people having different proficiency levels in a common language. Included are a voice recognition unit (14) that acquires the speech rate of a speaker and recognizes the speech content; and a call voice processing unit (12) that processes a part of the voice recognition result based on a comparison of the acquired speech rate with a reference speech rate, and transmits a video on which a text character image of the processed voice recognition result is superimposed to a communication terminal TM of the speaker.
    Type: Grant
    Filed: June 7, 2019
    Date of Patent: May 7, 2024
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Ryosuke Aoki, Munenori Koyasu, Naoki Oshima, Naoki Mukawa
  • Patent number: 11979273
    Abstract: In one example, a server system interfaces with a plurality of remotely-situated client entities to provide data communications services. The system uses processing circuitry to access an archive of voice data indicative of transcribed audio conversations between a client station and another station participating via the data communications services. Archived voice data includes keywords associated with at least one intent or at least one topic of the transcribed audio conversations. The system identifies keywords and/or identified contexts in a message (e.g., text-based message) received by a text-based virtual assistant, and correlates the text-based message with at least one intent or at least one topic by matching keywords from the archive of digital voice data with the identified keywords from the text-based message. The system may automatically configure the virtual assistant associated with the remotely-situated client entity to address the received text-based message, based on the correlation.
    Type: Grant
    Filed: May 27, 2021
    Date of Patent: May 7, 2024
    Assignee: 8x8, Inc.
    Inventors: Bryan R. Martin, Matt Taylor, Manu Mukerji
  • Patent number: 11978457
    Abstract: Methods for uniquely identifying respective participants in a teleconference involve obtaining components of the teleconference including an audio component, a video component, teleconference metadata, and transcription data; parsing the components into plural speech segments; tagging respective speech segments with speaker identification information; and diarizing the teleconference so as to label the respective speech segments.
    Type: Grant
    Filed: February 15, 2022
    Date of Patent: May 7, 2024
    Assignee: GONG.IO LTD
    Inventors: Shlomi Medalion, Omri Allouche, Rotem Eilaty
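The tagging step above, labeling parsed speech segments with speaker identities, can be sketched with a simple segment structure. The interval-to-name mapping stands in for teleconference metadata; how the real system derives speakers from audio, video, and metadata is not described in the abstract:

```python
from dataclasses import dataclass

@dataclass
class SpeechSegment:
    start: float          # seconds into the teleconference
    end: float
    text: str
    speaker: str = ""

def diarize(segments, speaker_metadata):
    """Tag each parsed speech segment with speaker identification drawn
    from a hypothetical mapping of active-speaker intervals to names."""
    for segment in segments:
        for (start, end), name in speaker_metadata.items():
            if start <= segment.start < end:
                segment.speaker = name
                break
    return segments

segments = [SpeechSegment(0.0, 3.5, "Hi everyone."),
            SpeechSegment(3.5, 7.0, "Hello, can you hear me?")]
metadata = {(0.0, 3.5): "Alice", (3.5, 8.0): "Bob"}
print([s.speaker for s in diarize(segments, metadata)])
```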
  • Patent number: 11973807
    Abstract: A connection procedure for data communications devices is implemented in a variety of embodiments. In one such embodiment, the procedure uses a first set of connection data for attempting to connect and upon failure to connect uses a second set of connection information in addition to the first set of connection information to attempt a connection. In another embodiment, a delay is implemented before transmitting the connection information and a subsequent delay is implemented to allow for additional connection information to be input and transmitted.
    Type: Grant
    Filed: June 23, 2022
    Date of Patent: April 30, 2024
    Assignee: 8x8, Inc.
    Inventor: Marc Petit-Huguenin
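The first embodiment above (attempt with a primary set of connection information, then retry with a fallback set merged in) can be sketched as follows. The `try_connect` callback and the dictionary representation of connection information are assumptions for illustration, not details from the patent.

```python
def connect_with_fallback(try_connect, primary_info, fallback_info):
    """Attempt a connection with a primary set of connection
    information; on failure, retry with the fallback set added
    on top of the primary set, as the abstract describes."""
    if try_connect(primary_info):
        return primary_info
    # Second attempt uses the fallback information *in addition to*
    # the first set of connection information.
    combined = {**primary_info, **fallback_info}
    if try_connect(combined):
        return combined
    return None
```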
  • Patent number: 11971911
    Abstract: Systems and methods for generating customized annotations of a medical record are provided. The system receives a medical record and processes it using a predictive model to identify evidence of a finding. The system then determines whether to have a recall enhancement or validation of a specific finding. Recall enhancement is used to tune or develop the predictive model, while validation is used to rapidly validate the evidence. The source document is provided to the user and feedback is requested. When asking for validation, the system also highlights the evidence already identified and requests the user to indicate if the evidence is valid for a particular finding. If recall enhancement is utilized, the source document is provided and the user is asked to find evidence in the document for a particular finding. The user may then highlight the evidence that supports the finding. The user may also annotate the evidence using free form text.
    Type: Grant
    Filed: August 2, 2022
    Date of Patent: April 30, 2024
    Assignee: Apixio, LLC
    Inventors: Darren Matthew Schulte, John O. Schneider, Robert Derward Rogers, Vishnuvyas Sethumadhavan
  • Patent number: 11971920
    Abstract: Disclosed is a method, performed by a computing device, for determining content associated with a voice signal. The method may include converting the voice signal and generating text information. The method may include determining a plurality of target word candidates. The method may include determining a target word among the plurality of target word candidates based on a comparison between the candidates and the generated text information. The method may also include determining content associated with the target word.
    Type: Grant
    Filed: July 26, 2023
    Date of Patent: April 30, 2024
    Assignee: ActionPower Corp.
    Inventors: Hyungwoo Kim, Seungho Kwak
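The comparison step above (choosing a target word among candidates by comparing them against the generated text) could be sketched with a simple string-similarity measure. This is a minimal illustration using Python's standard `difflib`, not the comparison the patent actually claims.

```python
import difflib

def pick_target_word(candidates, asr_text):
    """Choose the candidate most similar to any token of the
    generated text information (simple ratio comparison)."""
    tokens = asr_text.lower().split()

    def best_score(cand):
        return max(
            difflib.SequenceMatcher(None, cand.lower(), t).ratio()
            for t in tokens
        )

    return max(candidates, key=best_score)
```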
  • Patent number: 11960636
    Abstract: Examples of wearable systems and methods can use multiple inputs (e.g., gesture, head pose, eye gaze, voice, and/or environmental factors (e.g., location)) to determine a command that should be executed and objects in the three-dimensional (3D) environment that should be operated on. The multiple inputs can also be used by the wearable system to permit a user to interact with text, such as, e.g., composing, selecting, or editing text.
    Type: Grant
    Filed: December 22, 2021
    Date of Patent: April 16, 2024
    Assignee: MAGIC LEAP, INC.
    Inventors: James M. Powderly, Savannah Niles, Jennifer M. R. Devine, Adam C. Carlson, Jeffrey Scott Sommers, Praveen Babu J D, Ajoy Savio Fernandes, Anthony Robert Sheeder
  • Patent number: 11960694
    Abstract: Virtual assistants intelligently emulate a representative of a service provider by providing variable responses to user queries received via the virtual assistants. These variable responses may take the context of a user's query into account both when identifying an intent of a user's query and when identifying an appropriate response to the user's query.
    Type: Grant
    Filed: April 16, 2021
    Date of Patent: April 16, 2024
    Assignee: Verint Americas Inc.
    Inventors: Fred A. Brown, Tanya M. Miller, Mark Zartler
  • Patent number: 11961507
    Abstract: A transcription of a query for content discovery is generated, and a context of the query is identified, as well as a first plurality of candidate entities to which the query refers. A search is performed based on the context of the query and the first plurality of candidate entities, and results are generated for output. A transcription of a second voice query is then generated, and it is determined whether the second transcription includes a trigger term indicating a corrective query. If so, the context of the first query is retrieved. A second term of the second query similar to a term of the first query is identified, and a second plurality of candidate entities to which the second term refers is determined. A second search is performed based on the second plurality of candidates and the context, and new search results are generated for output.
    Type: Grant
    Filed: March 2, 2023
    Date of Patent: April 16, 2024
    Assignee: Rovi Guides, Inc.
    Inventors: Jeffry Copps Robert Jose, Sindhuja Chonat Sri
  • Patent number: 11962482
    Abstract: At least one high-quality image of a speaker is captured. A low network quality condition may be detected between a client device and a video service node. In response to detecting the low network quality condition, a data stream comprising changes to the high-quality image of the speaker needed to recreate a representation of the speaker is generated. Transmission of the video stream of the speaker between the client device of the speaker and the video service node is stopped and, simultaneously, transmission of the data stream is begun. A digital twin of the speaker is then generated for display at the client device based on the data stream and the high-quality image of the speaker.
    Type: Grant
    Filed: July 14, 2022
    Date of Patent: April 16, 2024
    Assignee: Rovi Guides, Inc.
    Inventors: Johan Kölhi, Anthony Friede
  • Patent number: 11960534
    Abstract: Coordinating processing of audio queries is provided. A system receives a query. The system provides the query to a first digital assistant component and a second digital assistant component for processing. The system receives a first response to the query from the first digital assistant component, and a second response to the query from the second digital assistant component. The first digital assistant component can be authorized to access a database the second digital assistant component is prohibited from accessing. The system determines, based on a ranking decision function, to select the second response to the query from the second digital assistant component. The system provides, responsive to the selection, the second response from the second digital assistant to a computing device.
    Type: Grant
    Filed: April 8, 2019
    Date of Patent: April 16, 2024
    Assignee: GOOGLE LLC
    Inventors: Bo Wang, Smita Rai, Max Ohlendorf, Venkat Kotla, Chad Yoshikawa, Abhinav Taneja, Amit Agarwal, Chris Ramsdale, Chris Turkstra
  • Patent number: 11955026
    Abstract: A method, computer program product, and computer system for public speaking guidance are provided. A processor retrieves speaker data regarding a speech made by a user. A processor separates the speaker data into one or more speaker modalities. A processor extracts one or more speaker features from the speaker data for the one or more speaker modalities. A processor generates a performance classification based on the one or more speaker features. A processor sends guidance regarding the speech to the user based on the performance classification.
    Type: Grant
    Filed: September 26, 2019
    Date of Patent: April 9, 2024
    Assignee: International Business Machines Corporation
    Inventors: Cheng-Fang Lin, Ching-Chun Liu, Ting-Chieh Yu, Yu-Siang Chen, Ryan Young
  • Patent number: 11955117
    Abstract: A system and method are provided for analyzing and reacting to interactions between entities using electronic communication channels. The method includes receiving, via the communications module, data captured from a conversational exchange between a first entity communicating with a second entity using an electronic communication channel. The method also includes analyzing the captured data to detect an indication that the first entity is or was distracted during the conversational exchange, is or was disinterested in a portion of the conversational exchange or missed the portion of the conversational exchange. The method also includes determining based on the indication an action to address the distraction during, disinterest in, or missing of, the portion of the conversational exchange; and providing, via the communications module, an automated message to at least one of the first entity and the second entity for executing the action.
    Type: Grant
    Filed: May 27, 2021
    Date of Patent: April 9, 2024
    Assignee: The Toronto-Dominion Bank
    Inventors: Bridget McDermid, Brian Bellwood, Natalie Thien Huong Cornwall, Jeffery David True, Ryan Wall, Stella Pui Kwan Chan, Venetia D'Souza, Christopher Michael Arthur Caravan, Pranavan Premathas, Sahifa Habib Qazi, Mah Noor Siddiqui, Joe Moghaizel, Jonathan K. Barnett
  • Patent number: 11955127
    Abstract: An embodiment extracts a set of designated entities and a set of relationships between designated entities from speech content of an audio feed of a plurality of participants of a current web conference using a machine learning model trained to classify parts of speech content. The embodiment generates a list of current action items based on the extracted set of designated entities and relationships between designated entities. The embodiment identifies a first current action item that is an updated version of an ongoing action item on a progress list of ongoing action items from past web conferences. The embodiment also identifies a second current action item that is unrelated to any of the ongoing action items on the progress list. The embodiment updates the progress list to include updates for the first current action item and by adding the second current action item.
    Type: Grant
    Filed: April 8, 2021
    Date of Patent: April 9, 2024
    Assignee: KYNDRYL, INC.
    Inventors: Muhammad Ammar Ahmed, Madiha Ijaz, Sreekrishnan Venkateswaran
  • Patent number: 11954223
    Abstract: A search index is generated from one or more data records, wherein the one or more data records have contents in a plurality of different fields. Field information of the one or more data records is stored in the search index as specialized indexed elements, wherein the specialized indexed elements overlap with other indexed elements of the one or more data records. A search query is received from a user allowed to access only a portion of the plurality of different fields. The search query is processed within the portion of the plurality of different fields using the search index including the specialized indexed elements.
    Type: Grant
    Filed: October 12, 2020
    Date of Patent: April 9, 2024
    Assignee: ServiceNow, Inc.
    Inventors: William Kimble Johnson, III, Raymond Lau, Benjamin Talcott Borchard
  • Patent number: 11947924
    Abstract: The present disclosure relates to systems and methods for providing subtitles for a video. The video's audio is transcribed to obtain caption text for the video. A first machine-trained model identifies sentences in the caption text. A second model identifies intra-sentence breaks within the sentences identified by the first machine-trained model. Based on the identified sentences and intra-sentence breaks, one or more words in the caption text are grouped into a clip caption to be displayed for a corresponding clip of the video.
    Type: Grant
    Filed: September 18, 2023
    Date of Patent: April 2, 2024
    Assignee: VoyagerX, Inc.
    Inventors: Hyeonsoo Oh, Sedong Nam
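The grouping step in the abstract above might look like the following sketch, assuming the two models' outputs are supplied as word-index break positions (all names and the length cap are illustrative, not from the patent).

```python
def group_into_clip_captions(words, break_positions, max_words=12):
    """Group caption-text words into clip captions, splitting at
    sentence / intra-sentence break indices and at a length cap."""
    captions, current = [], []
    breaks = set(break_positions)
    for i, word in enumerate(words):
        current.append(word)
        if i in breaks or len(current) >= max_words:
            captions.append(" ".join(current))
            current = []
    if current:
        captions.append(" ".join(current))
    return captions
```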
  • Patent number: 11948578
    Abstract: Systems, methods, devices and non-transitory, computer-readable storage mediums are disclosed for a wearable multimedia device and cloud computing platform with an application ecosystem for processing multimedia data captured by the wearable multimedia device. In an embodiment, a wearable multimedia device receives a first speech input from a user, including a first command to generate a message, and first content for inclusion in the message. The device determines second content for inclusion in the message based on the first content, and generates the message such that the message includes the first and second content. The device receives a second speech input from the user, including a second command to modify the message. In response, the device determines third content for inclusion in the message based on the first content and/or the second content, and modifies the message using the third content. The device transmits the modified message to a recipient.
    Type: Grant
    Filed: March 4, 2022
    Date of Patent: April 2, 2024
    Assignee: Humane, Inc.
    Inventors: Kenneth Luke Kocienda, Imran A. Chaudhri
  • Patent number: 11943492
    Abstract: A method includes that a media asset server receives an identifier and a new-language file of a target video and converts the new-language file into a new-language medium file. The media asset server finds a first index file based on the identifier of the target video, and obtains a second index file based on a storage address of the new-language medium file on the media asset server. The media asset server sends the new-language medium file and the second index file to a content delivery server. The content delivery server replaces the storage address of the new-language medium file on the media asset server in the second index file with a storage address of the new-language medium file on the content delivery server to obtain a third index file. The content delivery server generates a first URL of the target video.
    Type: Grant
    Filed: July 13, 2021
    Date of Patent: March 26, 2024
    Assignee: PETAL CLOUD TECHNOLOGY CO., LTD.
    Inventor: Wei Yan
  • Patent number: 11941345
    Abstract: A computer-implemented process is programmed to process a source input, determine text enhancements, and present the text enhancements to apply to the sentences dictated from the source input. A text processor may use machine-learning models to process an audio input to generate sentences in a presentable format. An audio input can be processed by an automatic speech recognition model to generate electronic text. The electronic text may be used to generate sentence structures using a normalization model. A comprehension model may be used to identify instructions associated with the sentence structures and generate sentences based on the instructions and the sentence structures. An enhancement model may be used to identify enhancements to apply to the sentences. The enhancements may be presented alongside sentences generated by the comprehension model to provide the user an option to select either the enhancements or the sentences.
    Type: Grant
    Filed: October 26, 2021
    Date of Patent: March 26, 2024
    Assignee: Grammarly, Inc.
    Inventors: Timo Mertens, Vipul Raheja, Chad Mills, Ihor Skliarevskyi, Ignat Blazhko, Robyn Perry, Nicholas Bern, Dhruv Kumar, Melissa Lopez
  • Patent number: 11942093
    Abstract: A system and method perform dubbing automatically for multiple languages at the same time, using speech-to-text transcription, language translation, and artificial intelligence engines to perform the actual dubbing in the voice likeness of the original speaker.
    Type: Grant
    Filed: March 5, 2020
    Date of Patent: March 26, 2024
    Assignee: SYNCWORDS LLC
    Inventors: Aleksandr Dubinsky, Taras Sereda
  • Patent number: 11935540
    Abstract: A method may include obtaining first audio data originating at a first device during a communication session between the first device and a second device. The method may also include obtaining an availability of revoiced transcription units in a transcription system and in response to establishment of the communication session, selecting, based on the availability of revoiced transcription units, a revoiced transcription unit instead of a non-revoiced transcription unit to generate a transcript of the first audio data. The method may also include obtaining revoiced audio generated by a revoicing of the first audio data by a captioning assistant and generating a transcription of the revoiced audio using an automatic speech recognition system. The method may further include in response to selecting the revoiced transcription unit, directing the transcription of the revoiced audio to the second device as the transcript of the first audio data.
    Type: Grant
    Filed: October 5, 2021
    Date of Patent: March 19, 2024
    Assignee: Sorenson IP Holdings, LLC
    Inventors: David Thomson, David Black, Jonathan Skaggs, Kenneth Boehme, Shane Roylance
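The selection described above (choosing a revoiced transcription unit based on availability at call establishment) can be illustrated as a routing decision. The `asr_transcribe` and `revoice` callbacks are hypothetical stand-ins for the patent's automatic speech recognition system and captioning-assistant revoicing.

```python
def route_call(revoiced_available: int, asr_transcribe, revoice):
    """Return a transcription function for a new communication
    session based on revoiced-unit availability."""
    if revoiced_available > 0:
        # Revoiced path: a captioning assistant re-speaks the call
        # audio, and ASR runs on the revoiced audio.
        return lambda audio: asr_transcribe(revoice(audio))
    # Non-revoiced path: ASR runs directly on the call audio.
    return asr_transcribe
```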
  • Patent number: 11936487
    Abstract: Systems and methods are provided herein for providing context to users who access video conferences late. This may be accomplished by a system receiving an audio segment of a video conference and generating a subtitle corresponding to the audio segment. The system may determine a summary relating to the audio segment and then display the subtitle, summary, and video conference on a device. The system allows a user, who accesses a video conference late, to quickly and accurately understand the current video conference discussion, improving the user's experience and increasing the productivity of the video conference.
    Type: Grant
    Filed: August 17, 2021
    Date of Patent: March 19, 2024
    Assignee: Rovi Guides, Inc.
    Inventors: Padmassri Chandrashekar, Daina Emmanuel
  • Patent number: 11929067
    Abstract: A security panel for controlling home automation devices via a voice assistant device is provided, in which the security panel includes a processor, a memory, a microphone, and a speaker. In one example implementation, the security panel is configured to receive a text input from a user, convert the text input into an audio format via a text-to-speech engine to generate a first voice command for controlling one or more home automation devices, and output the first voice command via the speaker of the security panel. The first voice command is received by the voice assistant device via its microphone, and the voice assistant device is configured to control the one or more home automation devices based on the first voice command.
    Type: Grant
    Filed: May 7, 2019
    Date of Patent: March 12, 2024
    Assignee: CARRIER CORPORATION
    Inventors: Pirammanayagam Nallaperumal, Vijayakumar Ummadisinghu, Srikanth Govindavaram