Patents by Inventor Dimitrios Basile Dimitriadis
Dimitrios Basile Dimitriadis has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11875796Abstract: A computer implemented method includes receiving information streams on a meeting server from a set of multiple distributed devices included in a meeting, receiving audio signals representative of speech by at least two users in at least two of the information streams, receiving at least one video signal of at least one user in the information streams, associating a specific user with speech in the received audio signals as a function of the received audio and video signals, and generating a transcript of the meeting with an indication of the specific user associated with the speech.Type: GrantFiled: April 30, 2019Date of Patent: January 16, 2024Assignee: Microsoft Technology Licensing, LLCInventors: Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, William Isaac Hinthorn, Xuedong Huang
-
Publication number: 20230306313Abstract: Examples of ensemble knowledge transfer in collaborative learning include: receiving, at a primary node, from a plurality of remote nodes, a plurality of trained proxy machine learning (ML) models, wherein each proxy ML model is received from a different one of the plurality of remote nodes, and wherein each of the plurality of remote nodes is remote across a network from the primary node; training a primary ML model using the plurality of proxy ML models, wherein training the primary ML model comprises: for each of a plurality of training cases of a primary training dataset, weighting results from each of the proxy ML models based on at least a confidence of the respective proxy ML model regarding the training case.Type: ApplicationFiled: March 22, 2022Publication date: September 28, 2023Inventors: Dimitrios Basile DIMITRIADIS, Antonio Andre MONTEIRO MANOEL, Robert Alexander SIM, Yae Jee CHO
-
Publication number: 20230297777Abstract: A personalized natural language processing system tokenizes a plurality of sets of raw text data to generate a plurality of sets of tokenized text data for the plurality of users, respectively. The tokenized text data includes a sequence of tokens corresponding to the raw text data, the tokens at least identifying distinct words or portions of words in the raw text. The system appends predetermined user-specific tokens to the sets of tokenized text data from the users, respectively. Each predetermined user-specific token corresponds to one of the users. The system processes the sets of tokenized text data using the NLP model in accordance with the appended predetermined user-specific tokens to predict a personalized classification for the sets of tokenized text data from each of the users, and outputs the personalized classifications of the tokenized text data for each of the users.Type: ApplicationFiled: March 16, 2022Publication date: September 21, 2023Applicant: Microsoft Technology Licensing, LLCInventors: Dimitrios Basile DIMITRIADIS, Vaishnavi SHRIVASTAVA, Milad SHOKOUHI, Robert Alexander SIM, Fatemehsadat MIRESHGHALLAH
-
Patent number: 11468895Abstract: A computer implemented method includes receiving audio streams at a meeting server from two distributed devices that are streaming audio captured during an ad-hoc meeting between at least two users, comparing the received audio streams to determine that the received audio streams are representative of sound from the ad-hoc meeting, generating a meeting instance to process the audio streams in response to the comparing determining that the audio streams are representative of sound from the ad-hoc meeting, and processing the received audio streams to generate a transcript of the ad-hoc meeting.Type: GrantFiled: April 30, 2019Date of Patent: October 11, 2022Assignee: Microsoft Technology Licensing, LLCInventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
-
Patent number: 11445295Abstract: A system and method include reception of a first plurality of audio signals, generation of a second plurality of beamformed audio signals based on the first plurality of audio signals, each of the second plurality of beamformed audio signals associated with a respective one of a second plurality of beamformer directions, generation of a first TF mask for a first output channel based on the first plurality of audio signals, determination of a first beamformer direction associated with a first target sound source based on the first TF mask, generation of first features based on the first beamformer direction and the first plurality of audio signals, determination of a second TF mask based on the first features, and application of the second TF mask to one of the second plurality of beamformed audio signals associated with the first beamformer direction.Type: GrantFiled: November 17, 2020Date of Patent: September 13, 2022Assignee: Microsoft Technology Licensing, LLCInventors: Zhuo Chen, Changliang Liu, Takuya Yoshioka, Xiong Xiao, Hakan Erdogan, Dimitrios Basile Dimitriadis
-
Publication number: 20220230642Abstract: A computer implemented method processes audio streams recorded during a meeting by a plurality of distributed devices.Type: ApplicationFiled: April 4, 2022Publication date: July 21, 2022Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan ZENG, Lijuan QIN, William Isaac Hinthorn, Xuedong HUANG
-
Patent number: 11322148Abstract: A computer implemented method processes audio streams recorded during a meeting by a plurality of distributed devices.Type: GrantFiled: April 30, 2019Date of Patent: May 3, 2022Assignee: Microsoft Technology Licensing, LLCInventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
-
Patent number: 11257484Abstract: According to some embodiments, a multi-layer speech recognition transcript post processing system may include a data-driven, statistical layer associated with a trained automatic speech recognition model that selects an initial transcript. A rule-based layer may receive the initial transcript from the data-driven, statistical layer and execute at least one pre-determined rule to generate a first modified transcript. A machine learning approach layer may receive the first modified transcript from the rule-based layer and perform a neural model inference to create a second modified transcript. A human editor layer may receive the second modified transcript from the machine learning approach layer along with an adjustment from at least one human editor. The adjustment may create, in some embodiments, a final transcript that may be used to fine-tune the data-driven, statistical layer.Type: GrantFiled: August 21, 2019Date of Patent: February 22, 2022Assignee: Microsoft Technology Licensing, LLCInventors: Dimitrios Basile Dimitriadis, Xie Chen, Nanshan Zeng, Yu Shi, Liyang Lu
-
Publication number: 20210407516Abstract: A computer implemented method includes receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices, performing, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech, and providing the separated speech on a fixed number of separate output audio channels.Type: ApplicationFiled: September 13, 2021Publication date: December 30, 2021Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
-
Patent number: 11138980Abstract: A computer implemented method includes receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices, performing, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech, and providing the separated speech on a fixed number of separate output audio channels.Type: GrantFiled: April 30, 2019Date of Patent: October 5, 2021Assignee: Microsoft Technology Licensing, LLCInventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
-
Patent number: 11023690Abstract: Systems and methods for providing customized output based on a user preference in a distributed system are provided. In example embodiments, a meeting server or system receives audio streams from a plurality of distributed devices involved in an intelligent meeting. The meeting system identifies a user corresponding to a distributed device of the plurality of distributed devices and determines a preferred language of the user. A transcript from the received audio streams is generated. The meeting system translates the transcript into the preferred language of the user to form a translated transcript. The translated transcript is provided to the distributed device of the user.Type: GrantFiled: April 30, 2019Date of Patent: June 1, 2021Assignee: Microsoft Technology Licensing, LLCInventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
-
Publication number: 20210076129Abstract: A system and method include reception of a first plurality of audio signals, generation of a second plurality of beamformed audio signals based on the first plurality of audio signals, each of the second plurality of beamformed audio signals associated with a respective one of a second plurality of beamformer directions, generation of a first TF mask for a first output channel based on the first plurality of audio signals, determination of a first beamformer direction associated with a first target sound source based on the first TF mask, generation of first features based on the first beamformer direction and the first plurality of audio signals, determination of a second TF mask based on the first features, and application of the second TF mask to one of the second plurality of beamformed audio signals associated with the first beamformer direction.Type: ApplicationFiled: November 17, 2020Publication date: March 11, 2021Inventors: Zhuo CHEN, Changliang LIU, Takuya YOSHIOKA, Xiong XIAO, Hakan ERDOGAN, Dimitrios Basile DIMITRIADIS
-
Publication number: 20210056956Abstract: According to some embodiments, a multi-layer speech recognition transcript post processing system may include a data-driven, statistical layer associated with a trained automatic speech recognition model that selects an initial transcript. A rule-based layer may receive the initial transcript from the data-driven, statistical layer and execute at least one pre-determined rule to generate a first modified transcript. A machine learning approach layer may receive the first modified transcript from the rule-based layer and perform a neural model inference to create a second modified transcript. A human editor layer may receive the second modified transcript from the machine learning approach layer along with an adjustment from at least one human editor. The adjustment may create, in some embodiments, a final transcript that may be used to fine-tune the data-driven, statistical layer.Type: ApplicationFiled: August 21, 2019Publication date: February 25, 2021Inventors: Dimitrios Basile DIMITRIADIS, Xie CHEN, Nanshan ZENG, Yu SHI, Liyang LU
-
Patent number: 10856076Abstract: A system and method include reception of a first plurality of audio signals, generation of a second plurality of beamformed audio signals based on the first plurality of audio signals, each of the second plurality of beamformed audio signals associated with a respective one of a second plurality of beamformer directions, generation of a first TF mask for a first output channel based on the first plurality of audio signals, determination of a first beamformer direction associated with a first target sound source based on the first TF mask, generation of first features based on the first beamformer direction and the first plurality of audio signals, determination of a second TF mask based on the first features, and application of the second TF mask to one of the second plurality of beamformed audio signals associated with the first beamformer direction.Type: GrantFiled: April 5, 2019Date of Patent: December 1, 2020Assignee: MICROSOFT TECHNOLOGY LICENSING, LLCInventors: Zhuo Chen, Changliang Liu, Takuya Yoshioka, Xiong Xiao, Hakan Erdogan, Dimitrios Basile Dimitriadis
-
Publication number: 20200349949Abstract: A computer implemented method includes receiving audio streams at a meeting server from two distributed devices that are streaming audio captured during an ad-hoc meeting between at least two users, comparing the received audio streams to determine that the received audio streams are representative of sound from the ad-hoc meeting, generating a meeting instance to process the audio streams in response to the comparing determining that the audio streams are representative of sound from the ad-hoc meeting, and processing the received audio streams to generate a transcript of the ad-hoc meeting.Type: ApplicationFiled: April 30, 2019Publication date: November 5, 2020Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
-
Publication number: 20200351603Abstract: A computer implemented method includes receiving multiple channels of audio from three or more microphones detecting speech from a meeting of multiple users, localizing speech sources to determine an approximate direction of arrival of speech from a user, using a speech unmixing model to select two channels corresponding to a primary and a secondary microphone, and sending the two selected channels to a meeting server for generation of a speaker attributed meeting transcript.Type: ApplicationFiled: April 30, 2019Publication date: November 5, 2020Inventors: William Isaac Hinthorn, Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, Xuedong Huang
-
Publication number: 20200349230Abstract: Systems and methods for providing customized output based on a user preference in a distributed system are provided. In example embodiments, a meeting server or system receives audio streams from a plurality of distributed devices involved in an intelligent meeting. The meeting system identifies a user corresponding to a distributed device of the plurality of distributed devices and determines a preferred language of the user. A transcript from the received audio streams is generated. The meeting system translates the transcript into the preferred language of the user to form a translated transcript. The translated transcript is provided to the distributed device of the user.Type: ApplicationFiled: April 30, 2019Publication date: November 5, 2020Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
-
Publication number: 20200349950Abstract: A computer implemented method processes audio streams recorded during a meeting by a plurality of distributed devices.Type: ApplicationFiled: April 30, 2019Publication date: November 5, 2020Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
-
Publication number: 20200349954Abstract: A computer implemented method includes receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices, performing, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech, and providing the separated speech on a fixed number of separate output audio channels.Type: ApplicationFiled: April 30, 2019Publication date: November 5, 2020Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
-
Publication number: 20200349953Abstract: A computer implemented method includes receiving information streams on a meeting server from a set of multiple distributed devices included in a meeting, receiving audio signals representative of speech by at least two users in at least two of the information streams, receiving at least one video signal of at least one user in the information streams, associating a specific user with speech in the received audio signals as a function of the received audio and video signals, and generating a transcript of the meeting with an indication of the specific user associated with the speech.Type: ApplicationFiled: April 30, 2019Publication date: November 5, 2020Inventors: Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, William Isaac Hinthorn, Xuedong Huang