Patents by Inventor Christopher Hakan BASOGLU

Christopher Hakan BASOGLU has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

MODALITY-SPECIFIC TRAFFIC CLASSIFICATION IN MODEL-AS-A-SERVICE PLATFORM

Publication number: 20260178407

Abstract: A model-as-a-service (MaaS) platform includes an intelligence layer that tracks modality-specific token utilization for a select customer assigned to use a first instance of a multimodal model. The first instance is instantiated within a supporting hardware architecture that allocates dedicated groups of processing resources to support different modality-specific processing pipelines. The intelligence layer uses the tracked utilization data to generate a predicted token ensemble ratio for the select customer and compares the predicted token ensemble to a compute ensemble ratio determined for each of two or more of other instances of the multimodal model. The intelligence layer re-assigns the select customer to a second instance of the multimodal model in response to determining that the predicted token ensemble ratio is more similar to the compute ensemble ratio of the second instance of the multimodal model than to the compute ensemble ratio of the first instance of the multimodal model.

Type: Application

Filed: December 20, 2024

Publication date: June 25, 2026

Inventors: Sanjay RAMANUJAN, Fnu SIDHARTHA, Rakesh KELKAR, Nitin GOYAL, Christopher Hakan BASOGLU
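The re-assignment rule in this abstract (move the customer to the instance whose compute ensemble ratio is most similar to the predicted token ensemble ratio) can be illustrated with a minimal sketch. It is not taken from the application: the Euclidean distance, the modality names, the instance names, and the function names below are all assumptions chosen for illustration, and the abstract does not say how similarity is measured.

```python
from math import sqrt

def normalize(counts):
    """Turn per-modality counts into a ratio vector that sums to 1."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def distance(a, b):
    """Euclidean distance between two ratio vectors (assumed similarity measure)."""
    keys = set(a) | set(b)
    return sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

def pick_instance(predicted_tokens, instances, current):
    """Return the instance whose compute ratio is closest to the predicted token ratio.

    The customer stays on `current` unless another instance is strictly closer.
    """
    token_ratio = normalize(predicted_tokens)
    best = current
    best_d = distance(token_ratio, normalize(instances[current]))
    for name, compute in instances.items():
        d = distance(token_ratio, normalize(compute))
        if d < best_d:
            best, best_d = name, d
    return best

# Hypothetical data: processing-resource groups per modality for two instances.
instances = {
    "instance-1": {"text": 6, "image": 1, "audio": 1},
    "instance-2": {"text": 2, "image": 5, "audio": 1},
}
# Hypothetical predicted token mix for one customer (image-heavy).
predicted = {"text": 300, "image": 600, "audio": 100}
new_assignment = pick_instance(predicted, instances, current="instance-1")
```

Here the image-heavy customer is moved to `instance-2`, whose resource split is closer to the predicted token mix.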
DYNAMIC MODALITY-SPECIFIC RESOURCE ALLOCATION IN MODEL-AS-A-SERVICE PLATFORM

Publication number: 20260178406

Abstract: A model-as-a-service (MaaS) platform includes a multimodal model instance that is executed by dedicated groups of processing resources allocated to support different modality-specific processing pipelines. The MaaS platform further includes an intelligence layer that tracks modality-specific token utilization over time; generates a modality-specific workload prediction for the multimodal model instance based on the tracked data; and generates a modality-specific latency prediction based on the modality-specific workload prediction. The MaaS platform further includes a resource allocation component that uses the modality-specific latency prediction to dynamically reallocate the processing resources among the dedicated groups supporting the different modality-specific processing pipelines.

Type: Application

Filed: December 20, 2024

Publication date: June 25, 2026

Inventors: Sanjay RAMANUJAN, Fnu SIDHARTHA, Rakesh KELKAR, Nitin GOYAL, Christopher Hakan BASOGLU

Processing part of a user input to produce an early response

Patent number: 12638911

Abstract: Techniques are provided for early processing of a part of a user input to produce a response to the entire or final user input. While the user input is being received, a partial user input, which is a part of the final user input, is processed to produce a response. The response is a candidate response for the final user input. After the final user input is received, and if the partial user input is determined to match or be equivalent to the final user input, the first response, which is already available, is provided to one or more output devices for presentation. If the final user input is determined to differ from the partial user input, the final user input is processed to produce a second response to the final user input, and the second response is provided for presentation. In some instances, multiple partial user inputs are received and processed.

Type: Grant

Filed: December 23, 2024

Date of Patent: May 26, 2026

Assignee: Microsoft Technology Licensing, LLC

Inventors: Chun Hin Nelson Siu, Hosam Adel Khalil, Ajoy Nandi, Carmen Quan, Denis Fisenko, Md Nizam Uddin Chy, Min Hu, Christopher Hakan Basoglu, Sayan Dev Pathak
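The control flow in this abstract (answer from the partial input early, then reuse that answer only if the final input turns out to be equivalent) can be sketched as follows. This is an illustrative sketch, not the patented implementation: the `equivalent` rule (match after case and whitespace normalization) and the `process` stub are assumptions.

```python
def equivalent(a, b):
    # Assumption: two inputs are equivalent if they match after
    # lowercasing and collapsing whitespace.
    return " ".join(a.lower().split()) == " ".join(b.lower().split())

def respond(final_input, partial_input, candidate, process):
    """Reuse the candidate response when the partial input matched the final input."""
    if equivalent(partial_input, final_input):
        return candidate           # first response, already available
    return process(final_input)    # second response, computed from the final input

calls = []

def process(text):
    """Stub for the (slow) response generator; records each call."""
    calls.append(text)
    return f"response to: {text}"

partial = "what is the weather in Seattle"
candidate = process(partial)  # computed while the user input is still arriving

early = respond("What is the weather in Seattle ", partial, candidate, process)
late = respond("what is the weather in Seattle tomorrow", partial, candidate, process)
```

In this sketch `early` reuses the candidate without a second call to `process`, while `late` triggers a fresh call because the final input differs from the partial one.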
LONG-FORM CONVERSATION SIMULATOR

Publication number: 20260094603

Abstract: A method for simulating a long-form conversation includes instructing a language model to simulate a conversation by generating dialog associated with a primary topic; receiving, from the language model, a short-form conversation transcript that includes the dialog, and dynamically extending the short-form conversation via a feedback loop that provides for identifying secondary topics based on entities referenced in the dialog; instructing the trained language model to generate additional dialog of the conversation associated with the secondary topics; receiving from the trained language model an extension of the dialog; and appending the extension to the previously-created dialog to create a long-form conversation transcript. The long-form conversation transcript may be synthesized into audio data that is usable to train a speech recognition model.

Type: Application

Filed: September 30, 2024

Publication date: April 2, 2026

Inventors: Amy SHAH, Bert CASPER, Sayan Dev PATHAK, Christopher Hakan BASOGLU, Alberto Alonso FLORES

AUTOMATED ARTIFICIAL INTELLIGENCE DRIVEN READABILITY SCORING TECHNIQUES

Publication number: 20260093915

Abstract: A data processing system implements obtaining a first textual content, segmenting the first textual content into a plurality of first segments, and providing each segment of the plurality of first segments to a first natural language processing (NLP) model to obtain a set of first readability scores for the plurality of first segments. The first NLP model is configured to analyze a textual input and to output a readability score representing a measurement of readability of the textual input. The system further implements aggregating the set of first segment readability scores to determine a first readability score for the first textual content, and perform at least one of causing the first readability score to be presented to a user or performing one or more actions on the first textual content based on the readability score.

Type: Application

Filed: December 10, 2025

Publication date: April 2, 2026

Applicant: Microsoft Technology Licensing, LLC

Inventors: Sayan Dev PATHAK, Christopher Hakan BASOGLU, Amit AGARWAL, Shuangyu CHANG, Amy SHAH
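The segment, score, and aggregate pipeline in this abstract can be sketched in a few lines. Everything specific here is an assumption: `toy_score` is only a stand-in for the NLP model (it rewards short words), the two-sentence segment size is arbitrary, and the mean is one possible aggregation that the abstract does not name.

```python
import re
from statistics import mean

def segment(text, max_sentences=2):
    """Split text into segments of at most `max_sentences` sentences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    return [" ".join(sentences[i:i + max_sentences])
            for i in range(0, len(sentences), max_sentences)]

def toy_score(segment_text):
    """Stand-in for the NLP model: shorter average word length scores higher (0 to 1)."""
    words = re.findall(r"[A-Za-z']+", segment_text)
    if not words:
        return 0.0
    avg_len = mean(len(w) for w in words)
    return max(0.0, min(1.0, 1.0 - (avg_len - 3.0) / 10.0))

def document_score(text, score=toy_score):
    """Score each segment, then aggregate (assumed: arithmetic mean)."""
    scores = [score(s) for s in segment(text)]
    overall = mean(scores) if scores else 0.0
    return overall, scores

text = ("The cat sat. It was warm. Notwithstanding considerable methodological "
        "heterogeneity, interpretability remained problematic.")
overall, scores = document_score(text)
```

The per-segment scores are kept alongside the aggregate so that a caller could act on individual low-scoring segments as well as on the document as a whole.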
CONTEXT-BASED PHONETIC CORRECTIONS FOR ENTITIES REFERENCED IN AUDIO TRANSCRIPTIONS

Publication number: 20260087256

Abstract: A method for correcting misrecognized entity names in audio transcriptions includes receiving a transcribed utterance including dialog of a conversation and obtaining conversation context data associated with the conversation to compile a contextually relevant entity list of entities with contextual relevance to the transcribed utterance. The method further includes providing a phonetic similarity model with an input that includes the transcribed utterance and an instruction to use the contextually relevant entity list to identify specific entities phonetically similar to the transcribed utterance; and receiving, from the phonetic similarity model, an output identifying one or more entity names from the contextually relevant entity list that has been determined to satisfy a phonetic similarity metric with the transcribed utterance.

Type: Application

Filed: September 25, 2024

Publication date: March 26, 2026

Inventors: Harini KESAVAMOORTHY, Karthik RAMAN, Sayan D PATHAK, Christopher Hakan BASOGLU, Manisha JAIN, Amy SHAH, Sharman W TAN, Piyush BEHRE

Automated artificial intelligence driven readability scoring techniques

Patent number: 12518094

Abstract: A data processing system implements obtaining a first textual content, segmenting the first textual content into a plurality of first segments, and providing each segment of the plurality of first segments to a first natural language processing (NLP) model to obtain a set of first readability scores for the plurality of first segments. The first NLP model is configured to analyze a textual input and to output a readability score representing a measurement of readability of the textual input. The system further implements aggregating the set of first segment readability scores to determine a first readability score for the first textual content, and perform at least one of causing the first readability score to be presented to a user or performing one or more actions on the first textual content based on the readability score.

Type: Grant

Filed: July 5, 2023

Date of Patent: January 6, 2026

Assignee: Microsoft Technology Licensing, LLC

Inventors: Sayan Dev Pathak, Christopher Hakan Basoglu, Amit Agarwal, Shuangyu Chang, Amy Shah

SYSTEMS AND METHODS FOR GPT GUIDED NEURAL PUNCTUATION FOR CONVERSATIONAL SPEECH

Publication number: 20250322832

Abstract: Some disclosed embodiments are directed to obtaining a decoded audio data including a spoken language utterance recognized in audio data and identifying a disfluency in the decoded audio data. Upon determining that correcting the disfluency would improve a readability score of the decoded audio data, the system generates a particular correction to correct the disfluency and applies the particular correction to the decoded audio data. Then, an updated decoded audio data is generated which reflects the particular correction. The updated decoded audio data has improved readability over the decoded audio data.

Type: Application

Filed: June 24, 2025

Publication date: October 16, 2025

Inventors: Sayan Dev PATHAK, Ayush VIKRAM, Zoltan ROMOCSA, Amy Parag SHAH, Piyush BEHRE, Sharman W. TAN, Amit Kumar AGARWAL, Christopher Hakan BASOGLU

Systems and methods for GPT guided neural punctuation for conversational speech

Patent number: 12374337

Abstract: Some disclosed embodiments are directed to obtaining a decoded audio data including a spoken language utterance recognized in audio data and identifying a disfluency in the decoded audio data. Upon determining that correcting the disfluency would improve a readability score of the decoded audio data, the system generates a particular correction to correct the disfluency and applies the particular correction to the decoded audio data. Then, an updated decoded audio data is generated which reflects the particular correction. The updated decoded audio data has improved readability over the decoded audio data.

Type: Grant

Filed: November 1, 2022

Date of Patent: July 29, 2025

Assignee: Microsoft Technology Licensing, LLC

Inventors: Sayan Dev Pathak, Ayush Vikram, Zoltan Romocsa, Amy Parag Shah, Piyush Behre, Sharman W Tan, Amit Kumar Agarwal, Christopher Hakan Basoglu

PROCESSING PART OF A USER INPUT TO PRODUCE AN EARLY RESPONSE

Publication number: 20250181136

Abstract: Techniques are provided for early processing of a part of a user input to produce a response to the entire or final user input. While the user input is being received, a partial user input, which is a part of the final user input, is processed to produce a response. The response is a candidate response for the final user input. After the final user input is received, and if the partial user input is determined to match or be equivalent to the final user input, the first response, which is already available, is provided to one or more output devices for presentation. If the final user input is determined to differ from the partial user input, the final user input is processed to produce a second response to the final user input, and the second response is provided for presentation. In some instances, multiple partial user inputs are received and processed.

Type: Application

Filed: December 23, 2024

Publication date: June 5, 2025

Applicant: Microsoft Technology Licensing, LLC

Inventors: Chun Hin Nelson SIU, Hosam Adel KHALIL, Ajoy NANDI, Carmen QUAN, Denis FISENKO, Md Nizam Uddin CHY, Min HU, Christopher Hakan BASOGLU, Sayan Dev PATHAK

Resource-Efficient and Time-Efficient Prompting of a Language Model to Invoke Functions

Publication number: 20250138909

Abstract: A technique sends a first prompt to a language model that specifies selector information. The selector information provides a summary of a group of functions that are capable of being invoked. The language model responds by choosing one or more functions from the group of functions. The technique then sends a second prompt to the language model that specifies more detailed information regarding just the function(s) that have been identified by the language model. The language model responds by providing invocation information for each of the functions, such as properly formatted API messages. The technique then invokes the function(s) based on the invocation information. The technique reduces the size of each prompt sent to the language model, which makes efficient use of resources and improves the quality of the language model's output results.

Type: Application

Filed: December 29, 2023

Publication date: May 1, 2025

Applicant: Microsoft Technology Licensing, LLC

Inventors: Girish Milind MAHAJAN, Sayan Dev PATHAK, Michael Anthony TAYLOR, Salman Mohammad QUAZI, Christopher Hakan BASOGLU, Prashanth SRIKANTHAN
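The two-prompt flow in this abstract (a short selector prompt listing function summaries, then a second prompt carrying detail only for the chosen functions) can be sketched as below. This is an illustration under stated assumptions, not the application's implementation: the function registry, the prompt wording, the JSON reply format, and `fake_model` (a canned stand-in for a language model) are all hypothetical.

```python
import json

# Hypothetical registry: a one-line summary plus a detailed schema per function.
FUNCTIONS = {
    "get_weather": {
        "summary": "Look up the current weather for a city.",
        "schema": {"city": "string", "units": "celsius | fahrenheit"},
        "impl": lambda city, units="celsius": f"weather({city}, {units})",
    },
    "create_event": {
        "summary": "Add an event to the calendar.",
        "schema": {"title": "string", "start_time": "ISO 8601 timestamp"},
        "impl": lambda title, start_time: f"event({title}, {start_time})",
    },
}

def selector_prompt(query):
    """First prompt: summaries only, so it stays short."""
    lines = [f"- {name}: {spec['summary']}" for name, spec in FUNCTIONS.items()]
    return "Choose functions for the request.\n" + "\n".join(lines) + f"\nRequest: {query}"

def detail_prompt(query, chosen):
    """Second prompt: full schemas, but only for the functions already chosen."""
    details = {name: FUNCTIONS[name]["schema"] for name in chosen}
    return f"Produce JSON calls using these schemas: {json.dumps(details)}\nRequest: {query}"

def run(query, model):
    chosen = json.loads(model(selector_prompt(query)))
    calls = json.loads(model(detail_prompt(query, chosen)))
    return [FUNCTIONS[c["name"]]["impl"](**c["arguments"]) for c in calls]

def fake_model(prompt):
    """Canned stand-in for a language model (assumption for this sketch)."""
    if prompt.startswith("Choose"):
        return '["get_weather"]'
    return '[{"name": "get_weather", "arguments": {"city": "Oslo"}}]'

results = run("What's the weather in Oslo?", fake_model)
```

The second prompt never carries the schema for `create_event`, which is where the prompt-size saving described in the abstract would come from.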
Reducing Latency by Processing Parts of a Language Model Query in Parallel

Publication number: 20250139136

Abstract: A technique partitions a user's original query into plural smaller component queries, each of which has a common part and an instance-specific part. The technique distributes the component queries to plural processor instances of a processor. The plural processor instances transform the respective component queries into query-component responses by acting in parallel, independent of each other. The technique generates a final response based on the query-component responses, e.g., by assembling the component-query responses into the final response. The technique reduces latency because the processor instances work on parts of the user's original query at the same time, rather than as a single stream of consecutive tokens. The plural processor instances have access to a shared cache memory, and utilize relevant data that has been computed in response to previous queries.

Type: Application

Filed: October 31, 2023

Publication date: May 1, 2025

Applicant: Microsoft Technology Licensing, LLC

Inventors: Sayan Dev PATHAK, Osama ABUELSOROUR, Christopher Hakan BASOGLU, Harini KESAVAMOORTHY, Girish Milind MAHAJAN, Salman Mohammad QUAZI, Valeriy Viktorovich KIRSHIN
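The partition, fan-out, and assemble steps in this abstract can be sketched with a thread pool. This is a simplified illustration, not the application's design: the `processor` stub stands in for a language-model instance, the newline-joined assembly is an assumption, and the shared cache memory mentioned in the abstract is not modeled.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(common, items):
    """Build component queries: a common part plus one instance-specific part each."""
    return [f"{common}\n{item}" for item in items]

def answer_in_parallel(common, items, processor, max_workers=4):
    """Send component queries to parallel workers and assemble the responses."""
    queries = partition(common, items)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map() yields results in input order, so responses line up with items.
        responses = list(pool.map(processor, queries))
    return "\n".join(responses)

def processor(query):
    """Stub processor instance: upper-cases the instance-specific (last) line."""
    return query.splitlines()[-1].upper()

final = answer_in_parallel(
    "Summarize the following section in one line:",
    ["section a", "section b", "section c"],
    processor,
)
```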
Reducing latency by processing parts of a language model query in parallel

Patent number: 12287816

Abstract: A technique partitions a user's original query into plural smaller component queries, each of which has a common part and an instance-specific part. The technique distributes the component queries to plural processor instances of a processor. The plural processor instances transform the respective component queries into query-component responses by acting in parallel, independent of each other. The technique generates a final response based on the query-component responses, e.g., by assembling the component-query responses into the final response. The technique reduces latency because the processor instances work on parts of the user's original query at the same time, rather than as a single stream of consecutive tokens. The plural processor instances have access to a shared cache memory, and utilize relevant data that has been computed in response to previous queries.

Type: Grant

Filed: October 31, 2023

Date of Patent: April 29, 2025

Assignee: Microsoft Technology Licensing, LLC

Inventors: Sayan Dev Pathak, Osama Abuelsorour, Christopher Hakan Basoglu, Harini Kesavamoorthy, Girish Milind Mahajan, Salman Mohammad Quazi, Valeriy Viktorovich Kirshin

SMART AUDIO SEGMENTATION USING LOOK-AHEAD BASED ACOUSTO-LINGUISTIC FEATURES

Publication number: 20250054491

Abstract: Systems and methods are provided for smart audio segmentation using look-ahead based acousto-linguistic features. For example, systems and methods are provided for obtaining audio, processing the audio, identifying a potential segmentation boundary within the audio, and determining whether to generate a segment break at the potential segmentation boundary. One or more look-ahead words occurring after the potential segmentation boundary are identified, wherein an acoustic segmentation score and a language segmentation score associated with the potential segmentation boundary and the one or more look-ahead words are generated. Systems then either refrain from generating a segment break at the potential segmentation boundary or generate the segment break at the potential segmentation boundary based on the acoustic and/or language segmentation score at least meeting or exceeding a segmentation score threshold.

Type: Application

Filed: December 22, 2021

Publication date: February 13, 2025

Inventors: Sayan Dev PATHAK, Hosam Adel KHALIL, Naveen PARIHAR, Piyush BEHRE, Shuangyu CHANG, Christopher Hakan BASOGLU, Sharman W TAN, Eva SHARMA, Jian WU, Yang LIU, Edward C LIN, Amit Kumar AGARWAL

Processing part of a user input to produce an early response

Patent number: 12216809

Abstract: Techniques are provided for early processing of a part of a user input to produce a response to the entire or final user input. While the user input is being received, a partial user input, which is a part of the final user input, is processed to produce a response. The response is a candidate response for the final user input. After the final user input is received, and if the partial user input is determined to match or be equivalent to the final user input, the first response, which is already available, is provided to one or more output devices for presentation. If the final user input is determined to differ from the partial user input, the final user input is processed to produce a second response to the final user input, and the second response is provided for presentation. In some instances, multiple partial user inputs are received and processed.

Type: Grant

Filed: June 30, 2021

Date of Patent: February 4, 2025

Assignee: Microsoft Technology Licensing, LLC

Inventors: Chun Hin Nelson Siu, Hosam Adel Khalil, Ajoy Nandi, Carmen Quan, Denis Fisenko, Md Nizam Uddin Chy, Min Hu, Christopher Hakan Basoglu, Sayan Dev Pathak

Constructing Prompt Information for Submission to a Language Model by Dynamically Compressing Source Information

Publication number: 20240394479

Abstract: A technique for interacting with a machine-trained language model uses dynamic prompt management. The technique includes: receiving an input query and creating prompt information that expresses the input query and targeted context information. The targeted context information is selected from candidate context information. Further, a part of the prompt information is formed by compressing source information by reducing a number of content units in the source information (where the source information includes the input query and/or the candidate context information). The method further includes submitting the prompt information to the machine-trained language model, and receiving a response from the machine-trained language model based on the prompt information. The technique has the overall effect of reducing the number of content units submitted to the language model, which, in turn, reduces the amount of resources and time required by the language model to process the input query.

Type: Application

Filed: June 19, 2023

Publication date: November 28, 2024

Applicant: Microsoft Technology Licensing, LLC

Inventors: Sayan Dev PATHAK, Harini KESAVAMOORTHY, Zoltan ROMOCSA, Christopher Hakan BASOGLU, Girish Milind MAHAJAN, Salman Mohammad QUAZI

Constructing Prompt Information for Submission to a Language Model by Dynamically Selecting from Context Information

Publication number: 20240394477

Abstract: A technique for interacting with a machine-trained language model uses dynamic prompt management. The technique includes: receiving an input query; accessing a state data store that provides candidate context information; partitioning the candidate context information into plural parts; selecting targeted context information from the candidate context information by determining a semantic relevance of the input query to each of the plural parts by performing vector-based analysis; creating prompt information that includes the input query and the targeted context information; submitting the prompt information to the machine-trained language model; and receiving a response from the machine-trained language model based on the prompt information. The technique has the overall effect of reducing the number of content units submitted to the language model, which, in turn, reduces the amount of resources and time required by the language model to process the input query.

Type: Application

Filed: June 19, 2023

Publication date: November 28, 2024

Applicant: Microsoft Technology Licensing, LLC

Inventors: Sayan Dev PATHAK, Harini KESAVAMOORTHY, Zoltan ROMOCSA, Christopher Hakan BASOGLU, Girish Milind MAHAJAN, Salman Mohammad QUAZI
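The selection step in this abstract (partition the candidate context, rank the parts by vector-based relevance to the query, and include only the best parts in the prompt) can be sketched as follows. The bag-of-words vectors are an assumption standing in for whatever embedding the application uses, and the prompt layout, `top_k` value, and sample text are hypothetical.

```python
import re
from collections import Counter
from math import sqrt

def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_prompt(query, parts, top_k=2):
    """Keep only the `top_k` context parts most similar to the query."""
    q = embed(query)
    ranked = sorted(parts, key=lambda p: cosine(q, embed(p)), reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Context:\n{context}\n\nQuery: {query}"

# Hypothetical candidate context, already partitioned into parts.
parts = [
    "The invoice for March was paid on time.",
    "Our office is closed on public holidays.",
    "The March invoice total was 420 dollars.",
]
prompt = build_prompt("What was the total on the March invoice?", parts)
```

The unrelated part about office holidays is left out of the prompt, which is the reduction in submitted content units that the abstract describes.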
SYSTEMS AND METHODS FOR GPT GUIDED NEURAL PUNCTUATION FOR CONVERSATIONAL SPEECH

Publication number: 20240144931

Abstract: Some disclosed embodiments are directed to obtaining a decoded audio data including a spoken language utterance recognized in audio data and identifying a disfluency in the decoded audio data. Upon determining that correcting the disfluency would improve a readability score of the decoded audio data, the system generates a particular correction to correct the disfluency and applies the particular correction to the decoded audio data. Then, an updated decoded audio data is generated which reflects the particular correction. The updated decoded audio data has improved readability over the decoded audio data.

Type: Application

Filed: November 1, 2022

Publication date: May 2, 2024

Inventors: Sayan Dev PATHAK, Ayush VIKRAM, Zoltan ROMOCSA, Amy Parag SHAH, Piyush BEHRE, Sharman W TAN, Amit Kumar AGARWAL, Christopher Hakan BASOGLU

SYSTEMS AND METHODS FOR SEMANTIC SEGMENTATION FOR SPEECH

Publication number: 20240087572

Abstract: Systems are configured to obtain streaming audio data comprising language utterances, continuously decode the streaming audio data in order to generate decoded streaming audio data and determine whether a linguistic boundary exists within an initial segment of decoded streaming audio data. When a linguistic boundary is determined to exist, the systems apply a punctuation at the linguistic boundary and output a first portion of the initial segment of the streaming audio data ending at the linguistic boundary while refraining from outputting a second portion of the initial segment which is located temporally subsequent to the first portion of the initial segment. Systems are also configured to delay the output until predetermined punctuation validation processes have been performed.

Type: Application

Filed: November 14, 2022

Publication date: March 14, 2024

Inventors: Sayan Dev PATHAK, Amit Kumar AGARWAL, Amy Parag SHAH, Sourish CHATTERJEE, Zoltan ROMOCSA, Christopher Hakan BASOGLU, Piyush BEHRE, Shuangyu CHANG, Emilian Yordanov STOIMENOV

User-perceived latency while maintaining accuracy

Patent number: 11929076

Abstract: Disclosed speech recognition techniques improve user-perceived latency while maintaining accuracy by: receiving an audio stream, in parallel, by a primary (e.g., accurate) speech recognition engine (SRE) and a secondary (e.g., fast) SRE; generating, with the primary SRE, a primary result; generating, with the secondary SRE, a secondary result; appending the secondary result to a word list; and merging the primary result into the secondary result in the word list. Combining output from the primary and secondary SREs into a single decoder as described herein improves user-perceived latency while maintaining or improving accuracy, among other advantages.

Type: Grant

Filed: December 1, 2022

Date of Patent: March 12, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Hosam Adel Khalil, Emilian Stoimenov, Christopher Hakan Basoglu, Kshitiz Kumar, Jian Wu
