Patents by Inventor Shuangyu Chang
Shuangyu Chang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240403539
Abstract: Solutions for custom display post processing (DPP) in speech recognition (SR) use a customized multi-stage DPP pipeline that transforms a stream of SR tokens from lexical form to display form. A first transformation stage of the DPP pipeline receives the stream of tokens, in turn, by an upstream filter, a base model stage, and a downstream filter, and transforms a first aspect of the stream of tokens (e.g., disfluency, inverse text normalization (ITN), capitalization, etc.) from lexical form into display form. The upstream filter and/or the downstream filter alter the stream of tokens to change the default behavior of the DPP pipeline into custom behavior. Additional transformation stages of the DPP pipeline perform further transforms, allowing for outputting final text in a display format that is customized for a specific user. This permits each user to efficiently leverage a common baseline DPP pipeline to produce a custom output.
Type: Application
Filed: July 3, 2024
Publication date: December 5, 2024
Inventors: Wei LIU, Padma VARADHARAJAN, Piyush BEHRE, Nicholas KIBRE, Edward C. LIN, Shuangyu CHANG, Che ZHAO, Khuram SHAHID, Heiko Willy RAHMEL
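The stage structure described in this abstract (upstream filter, base model stage, downstream filter, chained across several stages) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the application: the toy base steps (`remove_disfluency`, `toy_itn`, `capitalize_first`), the `protect` filter pair, and the `<KEEP>` placeholder are stand-ins chosen only to show how filters around an unchanged base stage could turn default behavior into custom behavior.

```python
from typing import Callable, List, Optional, Tuple

Tokens = List[str]
Step = Callable[[Tokens], Tokens]


def identity(tokens: Tokens) -> Tokens:
    """Default filter: pass the token stream through unchanged."""
    return list(tokens)


def make_stage(base: Step, upstream: Step = identity, downstream: Step = identity) -> Step:
    """Wrap a base step with an upstream and a downstream filter."""
    def stage(tokens: Tokens) -> Tokens:
        return downstream(base(upstream(tokens)))
    return stage


def run_pipeline(stages: List[Step], tokens: Tokens) -> Tokens:
    """Apply each transformation stage in order."""
    for stage in stages:
        tokens = stage(tokens)
    return tokens


# Toy base steps (assumptions, standing in for real models).
def remove_disfluency(tokens: Tokens) -> Tokens:
    return [t for t in tokens if t not in {"um", "uh"}]


NUMBER_WORDS = {"one": "1", "two": "2", "three": "3"}


def toy_itn(tokens: Tokens) -> Tokens:
    return [NUMBER_WORDS.get(t, t) for t in tokens]


def capitalize_first(tokens: Tokens) -> Tokens:
    return [tokens[0].capitalize()] + tokens[1:] if tokens else []


def protect(word: str, placeholder: str) -> Tuple[Step, Step]:
    """Hypothetical filter pair: hide a word from the base step, then restore it."""
    def upstream(tokens: Tokens) -> Tokens:
        return [placeholder if t == word else t for t in tokens]

    def downstream(tokens: Tokens) -> Tokens:
        return [word if t == placeholder else t for t in tokens]

    return upstream, downstream


def build_pipeline(keep_spelled_out: Optional[str] = None) -> List[Step]:
    """Baseline pipeline; optionally customize only the ITN stage via filters."""
    if keep_spelled_out is None:
        itn_stage = make_stage(toy_itn)
    else:
        up, down = protect(keep_spelled_out, "<KEEP>")
        itn_stage = make_stage(toy_itn, upstream=up, downstream=down)
    return [make_stage(remove_disfluency), itn_stage, make_stage(capitalize_first)]


lexical = "um i need two tickets for three people".split()
default_text = " ".join(run_pipeline(build_pipeline(), lexical))
custom_text = " ".join(run_pipeline(build_pipeline("two"), lexical))
print(default_text)  # I need 2 tickets for 3 people
print(custom_text)   # I need two tickets for 3 people
```

In this sketch the base steps are shared by both pipelines; only the filters differ, which is one way to read the abstract's point about reusing a common baseline.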
-
Patent number: 12087286
Abstract: A computing system obtains features that have been extracted from an acoustic signal, where the acoustic signal comprises spoken words uttered by a user. The computing system performs automatic speech recognition (ASR) based upon the features and a language model (LM) generated based upon expanded pattern data. The expanded pattern data includes a name of an entity and a search term, where the entity belongs to a segment identified in a knowledge base. The search term has been included in queries for entities belonging to the segment. The computing system identifies a sequence of words corresponding to the features based upon results of the ASR. The computing system transmits computer-readable text to a search engine, where the text includes the sequence of words.
Type: Grant
Filed: May 6, 2021
Date of Patent: September 10, 2024
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Ankur Gupta, Satarupa Guha, Rupeshkumar Rasiklal Mehta, Issac John Alphonso, Anastasios Anastasakos, Shuangyu Chang
-
Patent number: 12061861
Abstract: Solutions for custom display post processing (DPP) in speech recognition (SR) use a customized multi-stage DPP pipeline that transforms a stream of SR tokens from lexical form to display form. A first transformation stage of the DPP pipeline receives the stream of tokens, in turn, by an upstream filter, a base model stage, and a downstream filter, and transforms a first aspect of the stream of tokens (e.g., disfluency, inverse text normalization (ITN), capitalization, etc.) from lexical form into display form. The upstream filter and/or the downstream filter alter the stream of tokens to change the default behavior of the DPP pipeline into custom behavior. Additional transformation stages of the DPP pipeline perform further transforms, allowing for outputting final text in a display format that is customized for a specific user. This permits each user to efficiently leverage a common baseline DPP pipeline to produce a custom output.
Type: Grant
Filed: July 26, 2022
Date of Patent: August 13, 2024
Assignee: Microsoft Technology Licensing, LLC.
Inventors: Wei Liu, Padma Varadharajan, Piyush Behre, Nicholas Kibre, Edward C. Lin, Shuangyu Chang, Che Zhao, Khuram Shahid, Heiko Willy Rahmel
-
Patent number: 11947699
Abstract: Embodiments are provided for securing data access to machine learning training data at a plurality of distributed computing devices. Electronic content including original data that corresponds to a preferred data security level is divided into a plurality of microsegments. The plurality of microsegments is restrictively distributed to a plurality of computing devices which apply transcription labels to the plurality of microsegments. The labeled microsegments are reconstructed into training data which is then used to train a machine learning model while facilitating an improvement in data security of the original data included with the training data from the reconstructed microsegments.
Type: Grant
Filed: April 30, 2021
Date of Patent: April 2, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Hemant Malhotra, Xuedong Huang, Li Jiang, Ivo Jose Garcia Dos Santos, Dong Li, Shuangyu Chang
-
Publication number: 20240087572
Abstract: Systems are configured to obtain streaming audio data comprising language utterances, continuously decode the streaming audio data in order to generate decoded streaming audio data and determine whether a linguistic boundary exists within an initial segment of decoded streaming audio data. When a linguistic boundary is determined to exist, the systems apply a punctuation at the linguistic boundary and output a first portion of the initial segment of the streaming audio data ending at the linguistic boundary while refraining from outputting a second portion of the initial segment which is located temporally subsequent to the first portion of the initial segment. Systems are also configured to delay the output until predetermined punctuation validation processes have been performed.
Type: Application
Filed: November 14, 2022
Publication date: March 14, 2024
Inventors: Sayan Dev PATHAK, Amit Kumar AGARWAL, Amy Parag SHAH, Sourish CHATTERJEE, Zoltan ROMOCSA, Christopher Hakan BASOGLU, Piyush BEHRE, Shuangyu CHANG, Emilian Yordanov STOIMENOV
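The emit-and-hold behavior in this abstract (output the portion ending at a linguistic boundary, withhold what follows) can be sketched as a small buffer. This is an illustration only: `toy_boundary` and its `STARTERS` word list are hypothetical stand-ins for whatever boundary detection the application actually uses, a period is assumed as the applied punctuation, and the punctuation validation delay mentioned in the abstract is not modeled.

```python
from typing import Callable, List, Optional, Tuple

BoundaryFinder = Callable[[List[str]], Optional[int]]


def emit_up_to_boundary(
    tokens: List[str], find_boundary: BoundaryFinder
) -> Tuple[Optional[str], List[str]]:
    """Return (punctuated first portion, held-back tokens).

    If no boundary is found, nothing is emitted and all tokens stay held.
    """
    idx = find_boundary(tokens)
    if idx is None:
        return None, list(tokens)
    first, held = tokens[: idx + 1], tokens[idx + 1:]
    return " ".join(first) + ".", held


# Hypothetical detector: a boundary falls just before a clause-starting word.
STARTERS = {"then", "next", "so"}


def toy_boundary(tokens: List[str]) -> Optional[int]:
    # Start at index 1 so a starter at the head of the buffer never
    # produces an empty first portion.
    for i in range(1, len(tokens)):
        if tokens[i] in STARTERS:
            return i - 1
    return None


class StreamingPunctuator:
    """Accumulates decoded tokens and releases text only up to a boundary."""

    def __init__(self, find_boundary: BoundaryFinder) -> None:
        self.find_boundary = find_boundary
        self.buffer: List[str] = []

    def feed(self, new_tokens: List[str]) -> List[str]:
        self.buffer.extend(new_tokens)
        emitted: List[str] = []
        while True:
            text, held = emit_up_to_boundary(self.buffer, self.find_boundary)
            if text is None:
                break
            emitted.append(text)
            self.buffer = held
        return emitted


punctuator = StreamingPunctuator(toy_boundary)
first_out = punctuator.feed(["open", "the", "door"])    # no boundary yet
second_out = punctuator.feed(["then", "close", "it"])   # boundary before "then"
```

After the second call, the tokens following the boundary remain in `punctuator.buffer` until later input reveals another boundary.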
-
Publication number: 20230409829
Abstract: A data processing system implements obtaining a first textual content, segmenting the first textual content into a plurality of first segments, and providing each segment of the plurality of first segments to a first natural language processing (NLP) model to obtain a set of first readability scores for the plurality of first segments. The first NLP model is configured to analyze a textual input and to output a readability score representing a measurement of readability of the textual input. The system further implements aggregating the set of first segment readability scores to determine a first readability score for the first textual content, and performing at least one of causing the first readability score to be presented to a user or performing one or more actions on the first textual content based on the readability score.
Type: Application
Filed: July 5, 2023
Publication date: December 21, 2023
Applicant: Microsoft Technology Licensing, LLC
Inventors: Sayan Dev PATHAK, Christopher Hakan BASOGLU, Amit AGARWAL, Shuangyu CHANG, Amy SHAH
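The segment, score, aggregate flow in this abstract can be sketched as follows. The abstract does not say how segments are formed or how scores are combined, so two assumptions are made here for illustration: segments are sentences split on terminal punctuation, and the aggregate is a word-count-weighted mean. The scorer passed in is a hypothetical stand-in for the NLP model.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Split on whitespace that follows '.', '!' or '?' (an assumed segmentation)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def document_readability(text: str, score: Callable[[str], float]) -> float:
    """Score each segment, then combine with a word-count-weighted mean.

    The weighting scheme is an assumption, not taken from the abstract.
    """
    segments = split_sentences(text)
    if not segments:
        return 0.0
    weights = [len(s.split()) for s in segments]
    scores = [score(s) for s in segments]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


# Hypothetical scorer standing in for the NLP model: short sentences score 1.0.
def toy_score(segment: str) -> float:
    return 1.0 if len(segment.split()) <= 3 else 0.0


sample = "Short one. This sentence is a bit longer."
overall = document_readability(sample, toy_score)
print(overall)  # 0.25: two words scored 1.0, six words scored 0.0
```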
-
Publication number: 20230351098
Abstract: Solutions for custom display post processing (DPP) in speech recognition (SR) use a customized multi-stage DPP pipeline that transforms a stream of SR tokens from lexical form to display form. A first transformation stage of the DPP pipeline receives the stream of tokens, in turn, by an upstream filter, a base model stage, and a downstream filter, and transforms a first aspect of the stream of tokens (e.g., disfluency, inverse text normalization (ITN), capitalization, etc.) from lexical form into display form. The upstream filter and/or the downstream filter alter the stream of tokens to change the default behavior of the DPP pipeline into custom behavior. Additional transformation stages of the DPP pipeline perform further transforms, allowing for outputting final text in a display format that is customized for a specific user. This permits each user to efficiently leverage a common baseline DPP pipeline to produce a custom output.
Type: Application
Filed: July 26, 2022
Publication date: November 2, 2023
Inventors: Wei LIU, Padma VARADHARAJAN, Piyush BEHRE, Nicholas KIBRE, Edward C. LIN, Shuangyu CHANG, Che ZHAO, Khuram SHAHID, Heiko Willy RAHMEL
-
Publication number: 20230352009
Abstract: Systems generate segments of spoken language utterances based on different sets of segmentation boundaries. The systems are also configured to generate one or more formatted segments by assigning punctuation tags at segmentation boundaries and to generate one or more final sentences from the one or more segments.
Type: Application
Filed: April 29, 2022
Publication date: November 2, 2023
Inventors: Piyush BEHRE, Sharman W TAN, Shuangyu CHANG, Padma VARADHARAJAN, Sayan Dev PATHAK, Ravikant GUPTA
-
Publication number: 20230297606
Abstract: Generally discussed herein are devices, systems, and methods for multi-lingual model generation. A method can include determining, for low-resource languages, a respective language similarity value indicating language similarity between each of the low-resource languages, clustering the low-resource languages into groups based on the respective language similarity value, aggregating training data of languages corresponding to a given group resulting in aggregated training data, and training a re-ranking language model based on the aggregated training data resulting in a trained re-ranking language model.
Type: Application
Filed: June 14, 2022
Publication date: September 21, 2023
Inventors: Li MIAO, Jian WU, Shuangyu CHANG, Piyush BEHRE, Sarangarajan PARTHASARATHY
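The cluster-then-aggregate steps in this abstract can be sketched with a simple threshold grouping. The abstract does not specify the clustering method, so single-link grouping with a union-find structure is assumed here purely for illustration. The language identifiers, similarity values, and training strings are made-up placeholders, and training the re-ranking model is not shown.

```python
from typing import Dict, List, Tuple


def cluster_languages(
    languages: List[str],
    similarity: Dict[Tuple[str, str], float],
    threshold: float,
) -> List[List[str]]:
    """Group languages whose pairwise similarity meets the threshold (single-link)."""
    parent = {lang: lang for lang in languages}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), value in similarity.items():
        if value >= threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for lang in languages:
        groups.setdefault(find(lang), []).append(lang)
    return sorted(sorted(group) for group in groups.values())


def aggregate_training_data(
    groups: List[List[str]], data: Dict[str, List[str]]
) -> List[List[str]]:
    """Concatenate the training sentences of every language in each group."""
    return [[s for lang in group for s in data.get(lang, [])] for group in groups]


# Placeholder inputs (not real measurements).
languages = ["lang_a", "lang_b", "lang_c", "lang_d"]
similarity = {
    ("lang_a", "lang_b"): 0.9,
    ("lang_a", "lang_c"): 0.2,
    ("lang_b", "lang_c"): 0.1,
    ("lang_c", "lang_d"): 0.8,
}
data = {"lang_a": ["a1"], "lang_b": ["b1", "b2"], "lang_c": ["c1"], "lang_d": []}

groups = cluster_languages(languages, similarity, threshold=0.5)
pooled = aggregate_training_data(groups, data)
```

Each entry of `pooled` would then serve as the training set for one group's model.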
-
Patent number: 11741302
Abstract: A data processing system implements obtaining a first textual content, segmenting the first textual content into a plurality of first segments, and providing each segment of the plurality of first segments to a first natural language processing (NLP) model to obtain a set of first readability scores for the plurality of first segments. The first NLP model is configured to analyze a textual input and to output a readability score representing a measurement of readability of the textual input. The system further implements aggregating the set of first segment readability scores to determine a first readability score for the first textual content, and performing at least one of causing the first readability score to be presented to a user or performing one or more actions on the first textual content based on the readability score.
Type: Grant
Filed: May 18, 2022
Date of Patent: August 29, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Sayan Dev Pathak, Christopher Hakan Basoglu, Amit Agarwal, Shuangyu Chang, Amy Shah
-
Patent number: 11676576
Abstract: Systems and methods are provided for acquiring training data and building an organizational-based language model based on the training data. Organizational data is generated via one or more applications associated with an organization, and the collected organizational data is aggregated and filtered into training data that is used for training an organizational-based language model for speech processing based on the training data.
Type: Grant
Filed: August 11, 2021
Date of Patent: June 13, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Ziad Al Bawab, Anand U Desai, Cem Aksoylar, Michael Levit, Xin Meng, Shuangyu Chang, Suyash Choudhury, Dhiresh Rawal, Tao Li, Rishi Girish, Marcus Jager, Ananth Rampura Sheshagiri Rao
-
Publication number: 20230153451
Abstract: Embodiments are provided for securing data access to machine learning training data at a plurality of distributed computing devices. Electronic content including original data that corresponds to a preferred data security level is divided into a plurality of microsegments. The plurality of microsegments is restrictively distributed to a plurality of computing devices which apply transcription labels to the plurality of microsegments. The labeled microsegments are reconstructed into training data which is then used to train a machine learning model while facilitating an improvement in data security of the original data included with the training data from the reconstructed microsegments.
Type: Application
Filed: April 30, 2021
Publication date: May 18, 2023
Inventors: Hemant MALHOTRA, Xuedong HUANG, Li JIANG, Ivo Jose GARCIA DOS SANTOS, Dong LI, Shuangyu CHANG
-
Patent number: 11636854
Abstract: A system includes acquisition of meeting data associated with a meeting, determination of a plurality of meeting participants based on the acquired meeting data, acquisition of e-mail data associated with each of the plurality of meeting participants, generation of a meeting language model based on the acquired e-mail data and the meeting data, and transcription of audio associated with the meeting based on the meeting language model.
Type: Grant
Filed: May 24, 2022
Date of Patent: April 25, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Ziad Al Bawab, Anand U Desai, Shuangyu Chang, Amit K Agarwal, Zoltan Romocsa, Christopher H Basoglu, Nathan E Wohlgemuth
-
Patent number: 11562738
Abstract: A system includes acquisition of a domain grammar, determination of an interpolated grammar based on the domain grammar and a base grammar, determination of a delta domain grammar based on an augmented first grammar and the interpolated grammar, determination of an out-of-vocabulary class based on the domain grammar and the base grammar, insertion of the out-of-vocabulary class into a composed transducer composed of the augmented first grammar and one or more other transducers to generate an updated composed transducer, composition of the delta domain grammar and the updated composed transducer, and application of the composition of the delta domain grammar and the updated composed transducer to an output of an acoustic model.
Type: Grant
Filed: October 28, 2019
Date of Patent: January 24, 2023
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Ziad Al Bawab, Anand U Desai, Shuangyu Chang, Amit K Agarwal, Zoltan Romocsa, Veljko Miljanic, Aadyot Bhatnagar, Hosam Khalil, Christopher Basoglu
-
Publication number: 20220358910
Abstract: A computing system obtains features that have been extracted from an acoustic signal, where the acoustic signal comprises spoken words uttered by a user. The computing system performs automatic speech recognition (ASR) based upon the features and a language model (LM) generated based upon expanded pattern data. The expanded pattern data includes a name of an entity and a search term, where the entity belongs to a segment identified in a knowledge base. The search term has been included in queries for entities belonging to the segment. The computing system identifies a sequence of words corresponding to the features based upon results of the ASR. The computing system transmits computer-readable text to a search engine, where the text includes the sequence of words.
Type: Application
Filed: May 6, 2021
Publication date: November 10, 2022
Inventors: Ankur GUPTA, Satarupa GUHA, Rupeshkumar Rasiklal MEHTA, Issac John ALPHONSO, Anastasios ANASTASAKOS, Shuangyu CHANG
-
Publication number: 20220358912
Abstract: A system includes acquisition of meeting data associated with a meeting, determination of a plurality of meeting participants based on the acquired meeting data, acquisition of e-mail data associated with each of the plurality of meeting participants, generation of a meeting language model based on the acquired e-mail data and the meeting data, and transcription of audio associated with the meeting based on the meeting language model.
Type: Application
Filed: May 24, 2022
Publication date: November 10, 2022
Inventors: Ziad AL BAWAB, Anand U. DESAI, Shuangyu CHANG, Amit K. AGARWAL, Zoltan ROMOCSA, Christopher H. BASOGLU, Nathan E. WOHLGEMUTH
-
Patent number: 11430433
Abstract: A system includes acquisition of meeting data associated with a meeting, determination of a plurality of meeting participants based on the acquired meeting data, acquisition of e-mail data associated with each of the plurality of meeting participants, generation of a meeting language model based on the acquired e-mail data and the meeting data, and transcription of audio associated with the meeting based on the meeting language model.
Type: Grant
Filed: August 5, 2019
Date of Patent: August 30, 2022
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Ziad Al Bawab, Anand U Desai, Shuangyu Chang, Amit K Agarwal, Zoltan Romocsa, Christopher H Basoglu, Nathan E Wohlgemuth
-
Patent number: 11348574
Abstract: A system includes acquisition of meeting data associated with a meeting, determination of a plurality of meeting participants based on the acquired meeting data, acquisition of e-mail data associated with each of the plurality of meeting participants, generation of a meeting language model based on the acquired e-mail data and the meeting data, and transcription of audio associated with the meeting based on the meeting language model.
Type: Grant
Filed: August 5, 2019
Date of Patent: May 31, 2022
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Ziad Al Bawab, Anand U Desai, Shuangyu Chang, Amit K Agarwal, Zoltan Romocsa, Christopher H Basoglu, Nathan E Wohlgemuth
-
Publication number: 20220013109
Abstract: Provided is a system and method for acquiring training data and building an organizational-based language model based on the training data. In one example, the method may include collecting organizational data that is generated via one or more applications associated with an organization, aggregating the collected organizational data with previously collected organizational data to generate aggregated organizational training data, training an organizational-based language model for speech processing based on the aggregated organizational training data, and storing the trained organizational-based language model.
Type: Application
Filed: August 11, 2021
Publication date: January 13, 2022
Inventors: Ziad AL BAWAB, Anand U. DESAI, Cem AKSOYLAR, Michael LEVIT, Xin MENG, Shuangyu CHANG, Suyash CHOUDHURY, Dhiresh RAWAL, Tao LI, Rishi GIRISH, Marcus JAGER, Ananth Rampura SHESHAGIRI RAO
-
Patent number: 11120788
Abstract: Provided is a system and method for acquiring training data and building an organizational-based language model based on the training data. In one example, the method may include collecting organizational data that is generated via one or more applications associated with an organization, aggregating the collected organizational data with previously collected organizational data to generate aggregated organizational training data, training an organizational-based language model for speech processing based on the aggregated organizational training data, and storing the trained organizational-based language model.
Type: Grant
Filed: June 27, 2019
Date of Patent: September 14, 2021
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Ziad Al Bawab, Anand U Desai, Cem Aksoylar, Michael Levit, Xin Meng, Shuangyu Chang, Suyash Choudhury, Dhiresh Rawal, Tao Li, Rishi Girish, Marcus Jager, Ananth Rampura Sheshagiri Rao