Patents by Inventor Shuangyu Chang

Shuangyu Chang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240403539
    Abstract: Solutions for custom display post processing (DPP) in speech recognition (SR) use a customized multi-stage DPP pipeline that transforms a stream of SR tokens from lexical form to display form. A first transformation stage of the DPP pipeline receives the stream of tokens, in turn, by an upstream filter, a base model stage, and a downstream filter, and transforms a first aspect of the stream of tokens (e.g., disfluency, inverse text normalization (ITN), capitalization, etc.) from lexical form into display form. The upstream filter and/or the downstream filter alter the stream of tokens to change the default behavior of the DPP pipeline into custom behavior. Additional transformation stages of the DPP pipeline perform further transforms, allowing for outputting final text in a display format that is customized for a specific user. This permits each user to efficiently leverage a common baseline DPP pipeline to produce a custom output.
    Type: Application
    Filed: July 3, 2024
    Publication date: December 5, 2024
    Inventors: Wei LIU, Padma VARADHARAJAN, Piyush BEHRE, Nicholas KIBRE, Edward C. LIN, Shuangyu CHANG, Che ZHAO, Khuram SHAHID, Heiko Willy RAHMEL
  • Patent number: 12087286
    Abstract: A computing system obtains features that have been extracted from an acoustic signal, where the acoustic signal comprises spoken words uttered by a user. The computing system performs automatic speech recognition (ASR) based upon the features and a language model (LM) generated based upon expanded pattern data. The expanded pattern data includes a name of an entity and a search term, where the entity belongs to a segment identified in a knowledge base. The search term has been included in queries for entities belonging to the segment. The computing system identifies a sequence of words corresponding to the features based upon results of the ASR. The computing system transmits computer-readable text to a search engine, where the text includes the sequence of words.
    Type: Grant
    Filed: May 6, 2021
    Date of Patent: September 10, 2024
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Ankur Gupta, Satarupa Guha, Rupeshkumar Rasiklal Mehta, Issac John Alphonso, Anastasios Anastasakos, Shuangyu Chang
  • Patent number: 12061861
    Abstract: Solutions for custom display post processing (DPP) in speech recognition (SR) use a customized multi-stage DPP pipeline that transforms a stream of SR tokens from lexical form to display form. A first transformation stage of the DPP pipeline receives the stream of tokens, in turn, by an upstream filter, a base model stage, and a downstream filter, and transforms a first aspect of the stream of tokens (e.g., disfluency, inverse text normalization (ITN), capitalization, etc.) from lexical form into display form. The upstream filter and/or the downstream filter alter the stream of tokens to change the default behavior of the DPP pipeline into custom behavior. Additional transformation stages of the DPP pipeline perform further transforms, allowing for outputting final text in a display format that is customized for a specific user. This permits each user to efficiently leverage a common baseline DPP pipeline to produce a custom output.
    Type: Grant
    Filed: July 26, 2022
    Date of Patent: August 13, 2024
    Assignee: Microsoft Technology Licensing, LLC.
    Inventors: Wei Liu, Padma Varadharajan, Piyush Behre, Nicholas Kibre, Edward C. Lin, Shuangyu Chang, Che Zhao, Khuram Shahid, Heiko Willy Rahmel
  • Patent number: 11947699
    Abstract: Embodiments are provided for securing data access to machine learning training data at a plurality of distributed computing devices. Electronic content including original data that corresponds to a preferred data security level is divided into a plurality of microsegments. The plurality of microsegments is restrictively distributed to a plurality of computing devices which apply transcription labels to the plurality of microsegments. The labeled microsegments are reconstructed into training data which is then used to train a machine learning model while facilitating an improvement in data security of the original data included with the training data from the reconstructed microsegments.
    Type: Grant
    Filed: April 30, 2021
    Date of Patent: April 2, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Hemant Malhotra, Xuedong Huang, Li Jiang, Ivo Jose Garcia Dos Santos, Dong Li, Shuangyu Chang
  • Publication number: 20240087572
    Abstract: Systems are configured to obtain streaming audio data comprising language utterances, continuously decode the streaming audio data in order to generate decoded streaming audio data and determine whether a linguistic boundary exists within an initial segment of decoded streaming audio data. When a linguistic boundary is determined to exist, the systems apply a punctuation at the linguistic boundary and output a first portion of the initial segment of the streaming audio data ending at the linguistic boundary while refraining from outputting a second portion of the initial segment which is located temporally subsequent to the first portion of the initial segment. Systems are also configured to delay the output until predetermined punctuation validation processes have been performed.
    Type: Application
    Filed: November 14, 2022
    Publication date: March 14, 2024
    Inventors: Sayan Dev PATHAK, Amit Kumar AGARWAL, Amy Parag SHAH, Sourish CHATTERJEE, Zoltan ROMOCSA, Christopher Hakan BASOGLU, Piyush BEHRE, Shuangyu CHANG, Emilian Yordanov STOIMENOV
  • Publication number: 20230409829
    Abstract: A data processing system implements obtaining a first textual content, segmenting the first textual content into a plurality of first segments, and providing each segment of the plurality of first segments to a first natural language processing (NLP) model to obtain a set of first readability scores for the plurality of first segments. The first NLP model is configured to analyze a textual input and to output a readability score representing a measurement of readability of the textual input. The system further implements aggregating the set of first segment readability scores to determine a first readability score for the first textual content, and perform at least one of causing the first readability score to be presented to a user or performing one or more actions on the first textual content based on the readability score.
    Type: Application
    Filed: July 5, 2023
    Publication date: December 21, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Sayan Dev PATHAK, Christopher Hakan BASOGLU, Amit AGARWAL, Shuangyu CHANG, Amy SHAH
  • Publication number: 20230351098
    Abstract: Solutions for custom display post processing (DPP) in speech recognition (SR) use a customized multi-stage DPP pipeline that transforms a stream of SR tokens from lexical form to display form. A first transformation stage of the DPP pipeline receives the stream of tokens, in turn, by an upstream filter, a base model stage, and a downstream filter, and transforms a first aspect of the stream of tokens (e.g., disfluency, inverse text normalization (ITN), capitalization, etc.) from lexical form into display form. The upstream filter and/or the downstream filter alter the stream of tokens to change the default behavior of the DPP pipeline into custom behavior. Additional transformation stages of the DPP pipeline perform further transforms, allowing for outputting final text in a display format that is customized for a specific user. This permits each user to efficiently leverage a common baseline DPP pipeline to produce a custom output.
    Type: Application
    Filed: July 26, 2022
    Publication date: November 2, 2023
    Inventors: Wei LIU, Padma VARADHARAJAN, Piyush BEHRE, Nicholas KIBRE, Edward C. LIN, Shuangyu CHANG, Che ZHAO, Khuram SHAHID, Heiko Willy RAHMEL
  • Publication number: 20230352009
    Abstract: Systems generate segments of spoken language utterances based on different sets of segmentation boundaries. The systems are also configured to generate one or more formatted segments by assigning a punctuation tags at segmentation boundaries and to generate one or more final sentences from the one or more segments.
    Type: Application
    Filed: April 29, 2022
    Publication date: November 2, 2023
    Inventors: Piyush BEHRE, Sharman W TAN, Shuangyu CHANG, Padma VARADHARAJAN, Sayan Dev PATHAK, Ravikant GUPTA
  • Publication number: 20230297606
    Abstract: Generally discussed herein are devices, systems, and methods for multi-lingual model generation. A method can include determining, for low-resource languages, respective a language similarity value indicating language similarity between each of the low-resource languages, clustering the low-resource languages into groups based on the respective language similarity value, aggregating training data of languages corresponding to a given group resulting in aggregated training data, and training a re-ranking language model based on the aggregated training data resulting in a trained re-ranking language model.
    Type: Application
    Filed: June 14, 2022
    Publication date: September 21, 2023
    Inventors: Li MIAO, Jian WU, Shuangyu CHANG, Piyush BEHRE, Sarangarajan PARTHASARATHY
  • Patent number: 11741302
    Abstract: A data processing system implements obtaining a first textual content, segmenting the first textual content into a plurality of first segments, and providing each segment of the plurality of first segments to a first natural language processing (NLP) model to obtain a set of first readability scores for the plurality of first segments. The first NLP model is configured to analyze a textual input and to output a readability score representing a measurement of readability of the textual input. The system further implements aggregating the set of first segment readability scores to determine a first readability score for the first textual content, and perform at least one of causing the first readability score to be presented to a user or performing one or more actions on the first textual content based on the readability score.
    Type: Grant
    Filed: May 18, 2022
    Date of Patent: August 29, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Sayan Dev Pathak, Christopher Hakan Basoglu, Amit Agarwal, Shuangyu Chang, Amy Shah
  • Patent number: 11676576
    Abstract: Systems and methods are provided for acquiring training data and building an organizational-based language model based on the training data. In organizational data is generated via one or more applications associated with an organization, the collected organizational data is aggregated and filtered into training data that is used for training an organizational-based language model for speech processing based on the training data.
    Type: Grant
    Filed: August 11, 2021
    Date of Patent: June 13, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ziad Al Bawab, Anand U Desai, Cem Aksoylar, Michael Levit, Xin Meng, Shuangyu Chang, Suyash Choudhury, Dhiresh Rawal, Tao Li, Rishi Girish, Marcus Jager, Ananth Rampura Sheshagiri Rao
  • Publication number: 20230153451
    Abstract: Embodiments are provided for securing data access to machine learning training data at a plurality of distributed computing devices. Electronic content including original data that corresponds to a preferred data security level is divided into a plurality of microsegments. The plurality of microsegments is restrictively distributed to a plurality of computing devices which apply transcription labels to the plurality of microsegments. The labeled microsegments are reconstructed into training data which is then used to train a machine learning model while facilitating an improvement in data security of the original data included with the training data from the reconstructed microsegments.
    Type: Application
    Filed: April 30, 2021
    Publication date: May 18, 2023
    Inventors: Hemant MALHOTRA, Xuedong HUANG, Li JIANG, Ivo Jose GARCIA DOS SANTOS, Dong LI, Shuangyu CHANG
  • Patent number: 11636854
    Abstract: A system includes acquisition of meeting data associated with a meeting, determination of a plurality of meeting participants based on the acquired meeting data, acquisition of e-mail data associated with each of the plurality of meeting participants, generation of a meeting language model based on the acquired e-mail data and the meeting data, and transcription of audio associated with the meeting based on the meeting language model.
    Type: Grant
    Filed: May 24, 2022
    Date of Patent: April 25, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ziad Al Bawab, Anand U Desai, Shuangyu Chang, Amit K Agarwal, Zoltan Romocsa, Christopher H Basoglu, Nathan E Wohlgemuth
  • Patent number: 11562738
    Abstract: A system includes acquisition of a domain grammar, determination of an interpolated grammar based on the domain grammar and a base grammar, determination of a delta domain grammar based on an augmented first grammar and the interpolated grammar, determination of an out-of-vocabulary class based on the domain grammar and the base grammar, insertion of the out-of-vocabulary class into a composed transducer composed of the augmented first grammar and one or more other transducers to generate an updated composed transducer, composition of the delta domain grammar and the updated composed transducer, and application of the composition of the delta domain grammar and the updated composed transducer to an output of an acoustic model.
    Type: Grant
    Filed: October 28, 2019
    Date of Patent: January 24, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Ziad Al Bawab, Anand U Desai, Shuangyu Chang, Amit K Agarwal, Zoltan Romocsa, Veljko Miljanic, Aadyot Bhatnagar, Hosam Khalil, Christopher Basoglu
  • Publication number: 20220358910
    Abstract: A computing system obtains features that have been extracted from an acoustic signal, where the acoustic signal comprises spoken words uttered by a user. The computing system performs automatic speech recognition (ASR) based upon the features and a language model (LM) generated based upon expanded pattern data. The expanded pattern data includes a name of an entity and a search term, where the entity belongs to a segment identified in a knowledge base. The search term has been included in queries for entities belonging to the segment. The computing system identifies a sequence of words corresponding to the features based upon results of the ASR. The computing system transmits computer-readable text to a search engine, where the text includes the sequence of words.
    Type: Application
    Filed: May 6, 2021
    Publication date: November 10, 2022
    Inventors: Ankur GUPTA, Satarupa GUHA, Rupeshkumar Rasiklal MEHTA, Issac John ALPHONSO, Anastasios ANASTASAKOS, Shuangyu CHANG
  • Publication number: 20220358912
    Abstract: A system includes acquisition of meeting data associated with a meeting, determination of a plurality of meeting participants based on the acquired meeting data, acquisition of e-mail data associated with each of the plurality of meeting participants, generation of a meeting language model based on the acquired e-mail data and the meeting data, and transcription of audio associated with the meeting based on the meeting language model.
    Type: Application
    Filed: May 24, 2022
    Publication date: November 10, 2022
    Inventors: Ziad AL BAWAB, Anand U. DESAI, Shuangyu CHANG, Amit K. AGARWAL, Zoltan ROMOCSA, Christopher H. BASOGLU, Nathan E. WOHLGEMUTH
  • Patent number: 11430433
    Abstract: A system includes acquisition of meeting data associated with a meeting, determination of a plurality of meeting participants based on the acquired meeting data, acquisition of e-mail data associated with each of the plurality of meeting participants, generation of a meeting language model based on the acquired e-mail data and the meeting data, and transcription of audio associated with the meeting based on the meeting language model.
    Type: Grant
    Filed: August 5, 2019
    Date of Patent: August 30, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Ziad Al Bawab, Anand U Desai, Shuangyu Chang, Amit K Agarwal, Zoltan Romocsa, Christopher H Basoglu, Nathan E Wohlgemuth
  • Patent number: 11348574
    Abstract: A system includes acquisition of meeting data associated with a meeting, determination of a plurality of meeting participants based on the acquired meeting data, acquisition of e-mail data associated with each of the plurality of meeting participants, generation of a meeting language model based on the acquired e-mail data and the meeting data, and transcription of audio associated with the meeting based on the meeting language model.
    Type: Grant
    Filed: August 5, 2019
    Date of Patent: May 31, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Ziad Al Bawab, Anand U Desai, Shuangyu Chang, Amit K Agarwal, Zoltan Romocsa, Christopher H Basoglu, Nathan E Wohlgemuth
  • Publication number: 20220013109
    Abstract: Provided is a system and method for acquiring training data and building an organizational-based language model based on the training data. In one example, the method may include collecting organizational data that is generated via one or more applications associated with an organization, aggregating the collected organizational data with previously collected organizational data to generate aggregated organizational training data, training an organizational-based language model for speech processing based on the aggregated organizational training data, and storing the trained organizational-based language model.
    Type: Application
    Filed: August 11, 2021
    Publication date: January 13, 2022
    Inventors: Ziad AL BAWAB, Anand U. DESAI, Cem AKSOYLAR, Michael LEVIT, Xin MENG, Shuangyu CHANG, Suyash CHOUDHURY, Dhiresh RAWAL, Tao LI, Rishi GIRISH, Marcus JAGER, Ananth Rampura SHESHAGIRI RAO
  • Patent number: 11120788
    Abstract: Provided is a system and method for acquiring training data and building an organizational-based language model based on the training data. In one example, the method may include collecting organizational data that is generated via one or more applications associated with an organization, aggregating the collected organizational data with previously collected organizational data to generate aggregated organizational training data, training an organizational-based language model for speech processing based on the aggregated organizational training data, and storing the trained organizational-based language model.
    Type: Grant
    Filed: June 27, 2019
    Date of Patent: September 14, 2021
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Ziad Al Bawab, Anand U Desai, Cem Aksoylar, Michael Levit, Xin Meng, Shuangyu Chang, Suyash Choudhury, Dhiresh Rawal, Tao Li, Rishi Girish, Marcus Jager, Ananth Rampura Sheshagiri Rao