Patents by Inventor Dimitrios Basile Dimitriadis

Dimitrios Basile Dimitriadis has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11875796
    Abstract: A computer implemented method includes receiving information streams on a meeting server from a set of multiple distributed devices included in a meeting, receiving audio signals representative of speech by at least two users in at least two of the information streams, receiving at least one video signal of at least one user in the information streams, associating a specific user with speech in the received audio signals as a function of the received audio and video signals, and generating a transcript of the meeting with an indication of the specific user associated with the speech.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: January 16, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20230306313
    Abstract: Examples of ensemble knowledge transfer in collaborative learning include: receiving, at a primary node, from a plurality of remote nodes, a plurality of trained proxy machine learning (ML) models, wherein each proxy ML model is received from a different one of the plurality of remote nodes, and wherein each of the plurality of remote nodes is remote across a network from the primary node; training a primary ML model using the plurality of proxy ML models, wherein training the primary ML model comprises: for each of a plurality of training cases of a primary training dataset, weighting results from each of the proxy ML models based on at least a confidence of the respective proxy ML model regarding the training case.
    Type: Application
    Filed: March 22, 2022
    Publication date: September 28, 2023
    Inventors: Dimitrios Basile Dimitriadis, Antonio Andre Monteiro Manoel, Robert Alexander Sim, Yae Jee Cho
  • Publication number: 20230297777
    Abstract: A personalized natural language processing system tokenizes a plurality of sets of raw text data to generate a plurality of sets of tokenized text data for the plurality of users, respectively. The tokenized text data includes a sequence of tokens corresponding to the raw text data, the tokens at least identifying distinct words or portions of words in the raw text. The system appends predetermined user-specific tokens to the sets of tokenized text data from the users, respectively. Each predetermined user-specific token corresponds to one of the users. The system processes the sets of tokenized text data using the NLP model in accordance with the appended predetermined user-specific tokens to predict a personalized classification for the sets of tokenized text data from each of the users, and outputs the personalized classifications of the tokenized text data for each of the users.
    Type: Application
    Filed: March 16, 2022
    Publication date: September 21, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Dimitrios Basile Dimitriadis, Vaishnavi Shrivastava, Milad Shokouhi, Robert Alexander Sim, Fatemehsadat Mireshghallah
  • Patent number: 11468895
    Abstract: A computer implemented method includes receiving audio streams at a meeting server from two distributed devices that are streaming audio captured during an ad-hoc meeting between at least two users, comparing the received audio streams to determine that the received audio streams are representative of sound from the ad-hoc meeting, generating a meeting instance to process the audio streams in response to the comparing determining that the audio streams are representative of sound from the ad-hoc meeting, and processing the received audio streams to generate a transcript of the ad-hoc meeting.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: October 11, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Patent number: 11445295
    Abstract: A system and method include reception of a first plurality of audio signals, generation of a second plurality of beamformed audio signals based on the first plurality of audio signals, each of the second plurality of beamformed audio signals associated with a respective one of a second plurality of beamformer directions, generation of a first TF mask for a first output channel based on the first plurality of audio signals, determination of a first beamformer direction associated with a first target sound source based on the first TF mask, generation of first features based on the first beamformer direction and the first plurality of audio signals, determination of a second TF mask based on the first features, and application of the second TF mask to one of the second plurality of beamformed audio signals associated with the first beamformer direction.
    Type: Grant
    Filed: November 17, 2020
    Date of Patent: September 13, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Zhuo Chen, Changliang Liu, Takuya Yoshioka, Xiong Xiao, Hakan Erdogan, Dimitrios Basile Dimitriadis
  • Publication number: 20220230642
    Abstract: A computer implemented method processes audio streams recorded during a meeting by a plurality of distributed devices.
    Type: Application
    Filed: April 4, 2022
    Publication date: July 21, 2022
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Patent number: 11322148
    Abstract: A computer implemented method processes audio streams recorded during a meeting by a plurality of distributed devices.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: May 3, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Patent number: 11257484
    Abstract: According to some embodiments, a multi-layer speech recognition transcript post processing system may include a data-driven, statistical layer associated with a trained automatic speech recognition model that selects an initial transcript. A rule-based layer may receive the initial transcript from the data-driven, statistical layer and execute at least one pre-determined rule to generate a first modified transcript. A machine learning approach layer may receive the first modified transcript from the rule-based layer and perform a neural model inference to create a second modified transcript. A human editor layer may receive the second modified transcript from the machine learning approach layer along with an adjustment from at least one human editor. The adjustment may create, in some embodiments, a final transcript that may be used to fine-tune the data-driven, statistical layer.
    Type: Grant
    Filed: August 21, 2019
    Date of Patent: February 22, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dimitrios Basile Dimitriadis, Xie Chen, Nanshan Zeng, Yu Shi, Liyang Lu
  • Publication number: 20210407516
    Abstract: A computer implemented method includes receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices, performing, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech, and providing the separated speech on a fixed number of separate output audio channels.
    Type: Application
    Filed: September 13, 2021
    Publication date: December 30, 2021
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Patent number: 11138980
    Abstract: A computer implemented method includes receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices, performing, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech, and providing the separated speech on a fixed number of separate output audio channels.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: October 5, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Patent number: 11023690
    Abstract: Systems and methods for providing customized output based on a user preference in a distributed system are provided. In example embodiments, a meeting server or system receives audio streams from a plurality of distributed devices involved in an intelligent meeting. The meeting system identifies a user corresponding to a distributed device of the plurality of distributed devices and determines a preferred language of the user. A transcript from the received audio streams is generated. The meeting system translates the transcript into the preferred language of the user to form a translated transcript. The translated transcript is provided to the distributed device of the user.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: June 1, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20210076129
    Abstract: A system and method include reception of a first plurality of audio signals, generation of a second plurality of beamformed audio signals based on the first plurality of audio signals, each of the second plurality of beamformed audio signals associated with a respective one of a second plurality of beamformer directions, generation of a first TF mask for a first output channel based on the first plurality of audio signals, determination of a first beamformer direction associated with a first target sound source based on the first TF mask, generation of first features based on the first beamformer direction and the first plurality of audio signals, determination of a second TF mask based on the first features, and application of the second TF mask to one of the second plurality of beamformed audio signals associated with the first beamformer direction.
    Type: Application
    Filed: November 17, 2020
    Publication date: March 11, 2021
    Inventors: Zhuo Chen, Changliang Liu, Takuya Yoshioka, Xiong Xiao, Hakan Erdogan, Dimitrios Basile Dimitriadis
  • Publication number: 20210056956
    Abstract: According to some embodiments, a multi-layer speech recognition transcript post processing system may include a data-driven, statistical layer associated with a trained automatic speech recognition model that selects an initial transcript. A rule-based layer may receive the initial transcript from the data-driven, statistical layer and execute at least one pre-determined rule to generate a first modified transcript. A machine learning approach layer may receive the first modified transcript from the rule-based layer and perform a neural model inference to create a second modified transcript. A human editor layer may receive the second modified transcript from the machine learning approach layer along with an adjustment from at least one human editor. The adjustment may create, in some embodiments, a final transcript that may be used to fine-tune the data-driven, statistical layer.
    Type: Application
    Filed: August 21, 2019
    Publication date: February 25, 2021
    Inventors: Dimitrios Basile Dimitriadis, Xie Chen, Nanshan Zeng, Yu Shi, Liyang Lu
  • Patent number: 10856076
    Abstract: A system and method include reception of a first plurality of audio signals, generation of a second plurality of beamformed audio signals based on the first plurality of audio signals, each of the second plurality of beamformed audio signals associated with a respective one of a second plurality of beamformer directions, generation of a first TF mask for a first output channel based on the first plurality of audio signals, determination of a first beamformer direction associated with a first target sound source based on the first TF mask, generation of first features based on the first beamformer direction and the first plurality of audio signals, determination of a second TF mask based on the first features, and application of the second TF mask to one of the second plurality of beamformed audio signals associated with the first beamformer direction.
    Type: Grant
    Filed: April 5, 2019
    Date of Patent: December 1, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Zhuo Chen, Changliang Liu, Takuya Yoshioka, Xiong Xiao, Hakan Erdogan, Dimitrios Basile Dimitriadis
  • Publication number: 20200349949
    Abstract: A computer implemented method includes receiving audio streams at a meeting server from two distributed devices that are streaming audio captured during an ad-hoc meeting between at least two users, comparing the received audio streams to determine that the received audio streams are representative of sound from the ad-hoc meeting, generating a meeting instance to process the audio streams in response to the comparing determining that the audio streams are representative of sound from the ad-hoc meeting, and processing the received audio streams to generate a transcript of the ad-hoc meeting.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20200351603
    Abstract: A computer implemented method includes receiving multiple channels of audio from three or more microphones detecting speech from a meeting of multiple users, localizing speech sources to determine an approximate direction of arrival of speech from a user, using a speech unmixing model to select two channels corresponding to a primary and a secondary microphone, and sending the two selected channels to a meeting server for generation of a speaker attributed meeting transcript.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: William Isaac Hinthorn, Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, Xuedong Huang
  • Publication number: 20200349230
    Abstract: Systems and methods for providing customized output based on a user preference in a distributed system are provided. In example embodiments, a meeting server or system receives audio streams from a plurality of distributed devices involved in an intelligent meeting. The meeting system identifies a user corresponding to a distributed device of the plurality of distributed devices and determines a preferred language of the user. A transcript from the received audio streams is generated. The meeting system translates the transcript into the preferred language of the user to form a translated transcript. The translated transcript is provided to the distributed device of the user.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20200349950
    Abstract: A computer implemented method processes audio streams recorded during a meeting by a plurality of distributed devices.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20200349954
    Abstract: A computer implemented method includes receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices, performing, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech, and providing the separated speech on a fixed number of separate output audio channels.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20200349953
    Abstract: A computer implemented method includes receiving information streams on a meeting server from a set of multiple distributed devices included in a meeting, receiving audio signals representative of speech by at least two users in at least two of the information streams, receiving at least one video signal of at least one user in the information streams, associating a specific user with speech in the received audio signals as a function of the received audio and video signals, and generating a transcript of the meeting with an indication of the specific user associated with the speech.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, William Isaac Hinthorn, Xuedong Huang
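
Illustrative sketch: the abstract of publication 20230306313 above describes weighting the results of several proxy ML models, per training case, based on each model's confidence. The Python below is one possible reading of that step and is not taken from the application. The function name confidence_weighted_target is hypothetical, and using the highest softmax probability as the confidence measure is an assumption; the abstract does not say how confidence is computed.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_weighted_target(proxy_logits):
    """Blend the outputs of several proxy models for ONE training case.

    proxy_logits: one list of class scores per proxy model.
    Returns a soft target distribution over the classes.
    """
    probs = [softmax(scores) for scores in proxy_logits]

    # Assumption: a model's confidence is its highest class probability.
    confidences = [max(p) for p in probs]

    # Normalize the confidences so the per-model weights sum to 1.
    total_conf = sum(confidences)
    weights = [c / total_conf for c in confidences]

    n_classes = len(probs[0])
    return [
        sum(w * p[k] for w, p in zip(weights, probs))
        for k in range(n_classes)
    ]


if __name__ == "__main__":
    # One confident proxy and one undecided proxy, two classes.
    target = confidence_weighted_target([[10.0, 0.0], [0.0, 0.0]])
    print(target)
```

In this sketch the confident proxy pulls the blended target further toward its prediction than a plain average would. A primary model could then be trained against such soft targets, for example with a cross-entropy loss, though that choice is likewise an assumption.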