Patents by Inventor James R. Glass

James R. Glass has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11961513
    Abstract: A decoder includes a feature extraction circuit for calculating one or more feature vectors. An acoustic model circuit is coupled to receive the one or more feature vectors from the feature extraction circuit and to assign one or more likelihood values to them. A memory architecture that utilizes an on-chip state lattice and an off-chip memory for storing states of transition of the decoder is used to reduce reading and writing to the off-chip memory. The on-chip state lattice is populated with at least one of the states of transition stored in the off-chip memory. An on-chip word lattice is generated from a snapshot of the on-chip state lattice. The on-chip state lattice and the on-chip word lattice act as an on-chip cache to reduce reading and writing to the off-chip memory.
    Type: Grant
    Filed: July 29, 2021
    Date of Patent: April 16, 2024
    Assignee: Massachusetts Institute of Technology
    Inventors: Michael R. Price, James R. Glass, Anantha P. Chandrakasan
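The memory hierarchy this entry describes lends itself to a small simulation. The Python sketch below is purely illustrative (toy eviction policy, made-up names; nothing is taken from the patent's circuit design) and shows how an on-chip lattice acting as a cache absorbs repeated reads that would otherwise go to off-chip memory:

```python
# Toy model of the on-chip lattice cache described in patent 11961513.
# All names, structures, and the eviction policy are illustrative only.

class LatticeCache:
    """Small 'on-chip' cache in front of a large 'off-chip' state store."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.on_chip = {}      # state_id -> payload (the on-chip lattice)
        self.off_chip = {}     # backing store (models external DRAM)
        self.off_chip_reads = 0
        self.off_chip_writes = 0

    def write(self, state_id, payload):
        # Writes land in the on-chip lattice; evictions spill off-chip.
        if len(self.on_chip) >= self.capacity and state_id not in self.on_chip:
            victim, data = self.on_chip.popitem()
            self.off_chip[victim] = data
            self.off_chip_writes += 1
        self.on_chip[state_id] = payload

    def read(self, state_id):
        # On-chip hits avoid an off-chip access entirely.
        if state_id in self.on_chip:
            return self.on_chip[state_id]
        self.off_chip_reads += 1
        payload = self.off_chip[state_id]
        self.write(state_id, payload)      # repopulate the on-chip lattice
        return payload

cache = LatticeCache(capacity=4)
for t in range(8):                         # decoder emits states over time
    cache.write(t, {"score": -0.1 * t})
for t in (7, 7, 7, 6, 6):                  # recent states are re-read repeatedly
    cache.read(t)
# Most reads were served on-chip: 1 off-chip read for 5 lattice reads here.
print(cache.off_chip_reads, cache.off_chip_writes)
```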
  • Patent number: 11830478
    Abstract: A learning device calculates a feature for each data item in a pair of datasets, each of which combines two modalities among a plurality of modalities, using a model that receives data of the corresponding modality and outputs a feature obtained by mapping the received data into an embedding space. The learning device then selects, for each piece of target data (data of a first modality in the first dataset), similar data from among the data of a second modality included in the second dataset. The learning device further updates the parameters of the model such that the features of the data within each pair in the first and second datasets are similar to one another, and such that the feature of the data paired with the target data is similar to the feature of the data paired with the similar data.
    Type: Grant
    Filed: April 1, 2021
    Date of Patent: November 28, 2023
    Assignees: Nippon Telegraph and Telephone Corporation, Massachusetts Institute of Technology
    Inventors: Yasunori Ohishi, Akisato Kimura, Takahito Kawanishi, Kunio Kashino, James R. Glass, David Harwath
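A rough reading of this training objective in code, as a minimal numpy sketch: linear maps stand in for the modality encoders, and the three cosine terms mirror the pairwise constraints the abstract describes. All dimensions, dataset shapes, and the exact loss form are assumptions for illustration, not the patent's specification:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W):
    """Toy modality encoder: a linear map into the shared embedding space."""
    v = x @ W
    return v / np.linalg.norm(v)

def cos(a, b):
    return float(a @ b)  # inputs are unit vectors, so this is cosine similarity

# Hypothetical modalities: A (e.g. audio) is 32-dim, B (e.g. image) is 64-dim.
W_a = rng.normal(size=(32, 16))
W_b = rng.normal(size=(64, 16))

# Two datasets of (modality A, modality B) pairs.
ds1 = [(rng.normal(size=32), rng.normal(size=64)) for _ in range(8)]
ds2 = [(rng.normal(size=32), rng.normal(size=64)) for _ in range(8)]

# Target data: a first-modality item in the first dataset.
target_a, target_b = ds1[0]
za = encode(target_a, W_a)

# Select "similar data": the second-modality item in ds2 closest to the target.
j = int(np.argmax([cos(za, encode(b, W_b)) for _, b in ds2]))
similar_a, similar_b = ds2[j]

# Loss terms, to be minimized over W_a and W_b by gradient descent in a real
# system: intra-pair agreement in both datasets, plus agreement between the
# partner of the target and the partner of the similar data.
loss = (
    (1 - cos(za, encode(target_b, W_b)))
    + (1 - cos(encode(similar_a, W_a), encode(similar_b, W_b)))
    + (1 - cos(encode(target_b, W_b), encode(similar_a, W_a)))
)
print(f"toy loss: {loss:.3f}")
```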
  • Patent number: 11817081
    Abstract: A learning device calculates an image feature using a model (image encoder) that receives an image and outputs the image feature obtained by mapping the image into a latent space. The learning device calculates an audio feature using a model (audio encoder) that receives speech in a predetermined language and outputs the audio feature obtained by mapping the speech into the latent space, and that includes a neural network provided with a self-attention mechanism. The learning device updates the parameters of the models used by an image feature calculation unit and an audio feature calculation unit such that the image feature of a first image is similar to the audio feature of the speech corresponding to the first image.
    Type: Grant
    Filed: March 31, 2021
    Date of Patent: November 14, 2023
    Assignees: Nippon Telegraph and Telephone Corporation, Massachusetts Institute of Technology
    Inventors: Yasunori Ohishi, Akisato Kimura, Takahito Kawanishi, Kunio Kashino, James R. Glass, David Harwath
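As a rough illustration of the two-branch layout this abstract describes, the sketch below uses a single-head self-attention layer (numpy) for the audio branch and a linear map standing in for the image encoder; the dimensions, pooling, and similarity objective are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 16  # shared latent dimension (illustrative)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a sequence of audio frames."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[1]))
    return A @ V

def audio_encoder(frames, params):
    Wq, Wk, Wv = params
    H = self_attention(frames, Wq, Wk, Wv)
    z = H.mean(axis=0)                    # pool frames into one latent vector
    return z / np.linalg.norm(z)

def image_encoder(pixels, W):
    z = pixels @ W                        # toy stand-in for a vision network
    return z / np.linalg.norm(z)

params = tuple(rng.normal(size=(D, D)) for _ in range(3))
W_img = rng.normal(size=(256, D))

frames = rng.normal(size=(50, D))         # 50 audio frames of a spoken caption
pixels = rng.normal(size=256)             # flattened toy image

z_audio = audio_encoder(frames, params)
z_image = image_encoder(pixels, W_img)

# Training would push this similarity up for matched image/speech pairs
# and down for mismatched ones (e.g. with a contrastive or triplet loss).
print("image-speech similarity:", float(z_image @ z_audio))
```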
  • Publication number: 20230360642
    Abstract: One or more computer processors obtain an initial subnetwork at a target sparsity and an initial pruning mask from a pre-trained self-supervised learning (SSL) speech model. To finetune the initial subnetwork, the processors zero out the masked weights specified by the initial pruning mask, train a new subnetwork from the zeroed-out subnetwork, and prune the lowest-magnitude weights in the new subnetwork, regardless of network structure, to satisfy the target sparsity. The processors then classify an audio segment with the finetuned subnetwork.
    Type: Application
    Filed: May 9, 2022
    Publication date: November 9, 2023
    Inventors: Cheng-I Lai, Yang Zhang, Kaizhi Qian, Chuang Gan, James R. Glass, Alexander Haojan Liu
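The finetune-and-prune loop described here can be sketched compactly. In the toy below the unstructured magnitude pruning to a target sparsity is real, but the "training" step is only a placeholder update; the layer shape and number of rounds are arbitrary assumptions:

```python
import numpy as np

def prune_to_sparsity(weights, sparsity):
    """Zero the lowest-magnitude weights, ignoring structure, to hit `sparsity`."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)              # number of weights to remove
    if k == 0:
        return np.ones_like(weights, dtype=bool)
    threshold = np.partition(flat, k - 1)[k - 1]
    return np.abs(weights) > threshold         # True = keep

rng = np.random.default_rng(2)
target_sparsity = 0.5

# Stand-in for one weight matrix of a pre-trained SSL speech model.
W = rng.normal(size=(8, 8))
mask = prune_to_sparsity(W, target_sparsity)   # initial pruning mask

for round_ in range(3):
    W = W * mask                               # zero out the masked weights
    # "Train a new subnetwork": placeholder update standing in for finetuning.
    W = W + 0.01 * rng.normal(size=W.shape) * mask
    mask = prune_to_sparsity(W, target_sparsity)   # re-prune by magnitude

print("sparsity:", 1.0 - mask.mean())          # holds at the target sparsity
```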
  • Publication number: 20220319495
    Abstract: A learning device calculates a feature for each data item in a pair of datasets, each of which combines two modalities among a plurality of modalities, using a model that receives data of the corresponding modality and outputs a feature obtained by mapping the received data into an embedding space. The learning device then selects, for each piece of target data (data of a first modality in the first dataset), similar data from among the data of a second modality included in the second dataset. The learning device further updates the parameters of the model such that the features of the data within each pair in the first and second datasets are similar to one another, and such that the feature of the data paired with the target data is similar to the feature of the data paired with the similar data.
    Type: Application
    Filed: April 1, 2021
    Publication date: October 6, 2022
    Applicants: Nippon Telegraph and Telephone Corporation, Massachusetts Institute of Technology
    Inventors: Yasunori Ohishi, Akisato Kimura, Takahito Kawanishi, Kunio Kashino, James R. Glass, David Harwath
  • Publication number: 20220319493
    Abstract: A learning device calculates an image feature using a model (image encoder) that receives an image and outputs the image feature obtained by mapping the image into a latent space. The learning device calculates an audio feature using a model (audio encoder) that receives speech in a predetermined language and outputs the audio feature obtained by mapping the speech into the latent space, and that includes a neural network provided with a self-attention mechanism. The learning device updates the parameters of the models used by an image feature calculation unit and an audio feature calculation unit such that the image feature of a first image is similar to the audio feature of the speech corresponding to the first image.
    Type: Application
    Filed: March 31, 2021
    Publication date: October 6, 2022
    Applicants: Nippon Telegraph and Telephone Corporation, Massachusetts Institute of Technology
    Inventors: Yasunori Ohishi, Akisato Kimura, Takahito Kawanishi, Kunio Kashino, James R. Glass, David Harwath
  • Publication number: 20210358484
    Abstract: A decoder includes a feature extraction circuit for calculating one or more feature vectors. An acoustic model circuit is coupled to receive the one or more feature vectors from the feature extraction circuit and to assign one or more likelihood values to them. A memory architecture that utilizes an on-chip state lattice and an off-chip memory for storing states of transition of the decoder is used to reduce reading and writing to the off-chip memory. The on-chip state lattice is populated with at least one of the states of transition stored in the off-chip memory. An on-chip word lattice is generated from a snapshot of the on-chip state lattice. The on-chip state lattice and the on-chip word lattice act as an on-chip cache to reduce reading and writing to the off-chip memory.
    Type: Application
    Filed: July 29, 2021
    Publication date: November 18, 2021
    Inventors: Michael R. Price, James R. Glass, Anantha P. Chandrakasan
  • Patent number: 11107461
    Abstract: A decoder comprises a feature extraction circuit for calculating one or more feature vectors; an acoustic model circuit coupled to receive one or more feature vectors from said feature extraction circuit and assign one or more likelihood values to the one or more feature vectors; a memory for storing states of transition of the decoder; and a search circuit for receiving an input from said acoustic model circuit corresponding to the one or more likelihood values based upon the one or more feature vectors, and for choosing states of transition from the memory based on the input from said acoustic model.
    Type: Grant
    Filed: May 31, 2017
    Date of Patent: August 31, 2021
    Assignee: Massachusetts Institute of Technology
    Inventors: Michael R. Price, James R. Glass, Anantha P. Chandrakasan
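A software caricature of this pipeline, with a hand-written transition table standing in for the decoder's transition memory and a one-line toy in place of a real acoustic model; the beam search shows how likelihoods and stored transitions combine, but none of this reflects the patented circuit implementation:

```python
import numpy as np

rng = np.random.default_rng(3)

# Transition memory: state -> list of (next_state, label, transition_cost).
TRANSITIONS = {
    0: [(1, "a", 0.5), (2, "b", 0.7)],
    1: [(3, "c", 0.4)],
    2: [(3, "d", 0.3)],
    3: [],
}

def extract_features(samples, frame=4):
    """Toy feature extraction: one small vector per frame of audio."""
    frames = samples[: len(samples) // frame * frame].reshape(-1, frame)
    return frames.mean(axis=1, keepdims=True)

def acoustic_likelihood(feat, label):
    """Toy acoustic model: likelihood score for a label given a feature vector."""
    return -abs(float(feat[0]) - (ord(label) - ord("a")) * 0.1)

def decode(samples, beam=2):
    hyps = [(0.0, 0, [])]                      # (score, state, output labels)
    for feat in extract_features(samples):
        new = []
        for score, state, labels in hyps:
            for nxt, label, cost in TRANSITIONS[state]:
                s = score + acoustic_likelihood(feat, label) - cost
                new.append((s, nxt, labels + [label]))
        if not new:                            # no outgoing transitions left
            break
        hyps = sorted(new, key=lambda h: -h[0])[:beam]   # keep best in beam
    return max(hyps, key=lambda h: h[0])

samples = rng.normal(size=16)
score, state, labels = decode(samples)
print("best path:", labels, "score:", round(score, 3))
```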
  • Patent number: 10817509
    Abstract: A system for associating a string of natural language with items in a relational database includes a first subsystem having a pre-trained first artificial neural network configured to apply a semantic tag selected from a predefined set of semantic labels to a segment of a plurality of tokens representing the string of natural language. A second subsystem includes a second artificial neural network configured to convert the plurality of labeled tokens into a first multi-dimensional vector representing the string of natural language. A third subsystem is configured to rank the first multi-dimensional vector against a second multi-dimensional vector representing a plurality of items in the relational database.
    Type: Grant
    Filed: March 15, 2018
    Date of Patent: October 27, 2020
    Assignee: Massachusetts Institute of Technology
    Inventors: Mandy Barrett Korpusik, James R. Glass
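The three-subsystem flow might look like the following toy sketch, where a dictionary lookup stands in for the pre-trained tagger and random embeddings stand in for the trained networks; the vocabulary, label set, and database items are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
TAGS = ["Brand", "Quantity", "Food", "Other"]       # illustrative label set
VOCAB = {"two": 0, "slices": 1, "whole": 2, "wheat": 3, "toast": 4}
D = 8

# Subsystem 1 stand-in: a trained tagger would predict these labels per token.
def tag(tokens):
    lookup = {"two": "Quantity", "slices": "Quantity",
              "whole": "Food", "wheat": "Food", "toast": "Food"}
    return [(t, lookup.get(t, "Other")) for t in tokens]

# Subsystem 2: convert the labeled tokens into one query vector.
E_word = rng.normal(size=(len(VOCAB), D))
E_tag = rng.normal(size=(len(TAGS), D))

def embed(tagged):
    vecs = [E_word[VOCAB[w]] + E_tag[TAGS.index(t)] for w, t in tagged]
    v = np.mean(vecs, axis=0)
    return v / np.linalg.norm(v)

# Subsystem 3: rank database items by similarity to the query vector.
db_items = {name: rng.normal(size=D) for name in
            ["whole wheat bread", "white rice", "toast"]}
db_vecs = {k: v / np.linalg.norm(v) for k, v in db_items.items()}

q = embed(tag("two slices whole wheat toast".split()))
# With trained embeddings the matching item would rank first; here the
# vectors are random, so only the mechanics of the ranking are shown.
ranking = sorted(db_vecs, key=lambda k: -float(q @ db_vecs[k]))
print("ranked items:", ranking)
```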
  • Patent number: 10515292
    Abstract: An approach to joint acoustic and visual processing associates images with corresponding audio signals, for example, for the retrieval of images according to voice queries. A set of paired images and audio signals is processed without requiring transcription, segmentation, or annotation of either the images or the audio. This processing of the paired images and audio is used to determine parameters of an image processor and an audio processor, with the outputs of these processors being comparable to determine a similarity across acoustic and visual modalities. In some implementations, the image processor and the audio processor make use of deep neural networks. Further embodiments associate parts of images with corresponding parts of audio signals.
    Type: Grant
    Filed: June 15, 2017
    Date of Patent: December 24, 2019
    Assignee: Massachusetts Institute of Technology
    Inventors: David F. Harwath, James R. Glass
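For the retrieval use case mentioned in the abstract, a minimal sketch: assuming encoders already trained on paired data so that embeddings of matched images and speech lie close together, retrieval reduces to ranking image embeddings by similarity to a spoken-query embedding. Random unit vectors stand in for the trained embeddings here:

```python
import numpy as np

rng = np.random.default_rng(5)
D = 12

def unit(v):
    return v / np.linalg.norm(v)

# Stand-ins for image embeddings produced by a trained image processor.
image_embeddings = np.stack([unit(rng.normal(size=D)) for _ in range(100)])

def retrieve(query_audio_embedding, k=5):
    """Rank images by cross-modal similarity to a spoken query."""
    scores = image_embeddings @ query_audio_embedding
    top = np.argsort(-scores)[:k]
    return list(zip(top.tolist(), scores[top].round(3).tolist()))

query = unit(rng.normal(size=D))   # stand-in for an audio-processor output
print(retrieve(query))             # top-5 (image index, similarity) pairs
```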
  • Publication number: 20190147856
    Abstract: A decoder comprises a feature extraction circuit for calculating one or more feature vectors; an acoustic model circuit coupled to receive one or more feature vectors from said feature extraction circuit and assign one or more likelihood values to the one or more feature vectors; a memory for storing states of transition of the decoder; and a search circuit for receiving an input from said acoustic model circuit corresponding to the one or more likelihood values based upon the one or more feature vectors, and for choosing states of transition from the memory based on the input from said acoustic model.
    Type: Application
    Filed: May 31, 2017
    Publication date: May 16, 2019
    Inventors: Michael R. Price, James R. Glass, Anantha P. Chandrakasan
  • Publication number: 20180268023
    Abstract: A system for associating a string of natural language with items in a relational database includes a first subsystem having a pre-trained first artificial neural network configured to apply a semantic tag selected from a predefined set of semantic labels to a segment of a plurality of tokens representing the string of natural language. A second subsystem includes a second artificial neural network configured to convert the plurality of labeled tokens into a first multi-dimensional vector representing the string of natural language. A third subsystem is configured to rank the first multi-dimensional vector against a second multi-dimensional vector representing a plurality of items in the relational database.
    Type: Application
    Filed: March 15, 2018
    Publication date: September 20, 2018
    Inventors: Mandy Barrett Korpusik, James R. Glass
  • Publication number: 20180039859
    Abstract: An approach to joint acoustic and visual processing associates images with corresponding audio signals, for example, for the retrieval of images according to voice queries. A set of paired images and audio signals is processed without requiring transcription, segmentation, or annotation of either the images or the audio. This processing of the paired images and audio is used to determine parameters of an image processor and an audio processor, with the outputs of these processors being comparable to determine a similarity across acoustic and visual modalities. In some implementations, the image processor and the audio processor make use of deep neural networks. Further embodiments associate parts of images with corresponding parts of audio signals.
    Type: Application
    Filed: June 15, 2017
    Publication date: February 8, 2018
    Inventors: David F. Harwath, James R. Glass
  • Patent number: 8386264
    Abstract: A speech data retrieval apparatus (10) includes a speech database (1), a speech recognition unit (2), a confusion network creation unit (3), an inverted index table creation unit (4), a query input unit (6), a query conversion unit (7) and a label string check unit (8). The speech recognition unit (2) reads speech data from the speech database (1), performs speech recognition on the read speech data, and outputs the result of the speech recognition process as a lattice in which a phoneme, a syllable, or a word is the base unit. The confusion network creation unit (3) creates a confusion network based on the output lattice and outputs the result of the speech recognition process as the confusion network. The inverted index table creation unit (4) creates an inverted index table based on the output confusion network.
    Type: Grant
    Filed: April 11, 2008
    Date of Patent: February 26, 2013
    Assignees: Nippon Telegraph and Telephone Corporation, Massachusetts Institute of Technology
    Inventors: Takaaki Hori, I. Lee Hetherington, Timothy J. Hazen, James R. Glass
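The index construction described here translates naturally to code. A minimal sketch, with hand-written confusion networks and a single-label query for brevity (the abstract's query conversion unit would handle full query strings):

```python
from collections import defaultdict

# A confusion network per utterance: a list of slots, where each slot holds
# (label, posterior) alternatives produced by the recognizer.
confusion_networks = {
    "utt1": [[("hello", 0.9), ("yellow", 0.1)], [("world", 0.8), ("word", 0.2)]],
    "utt2": [[("hello", 0.6), ("hollow", 0.4)]],
}

# Inverted index table: label -> list of (utterance, slot position, posterior).
index = defaultdict(list)
for utt, slots in confusion_networks.items():
    for pos, slot in enumerate(slots):
        for label, post in slot:
            index[label].append((utt, pos, post))

def search(label):
    """Look up a single query label; hits are ranked by posterior."""
    return sorted(index.get(label, []), key=lambda hit: -hit[2])

print(search("hello"))   # [('utt1', 0, 0.9), ('utt2', 0, 0.6)]
```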
  • Publication number: 20100121642
    Abstract: A speech data retrieval apparatus (10) includes a speech database (1), a speech recognition unit (2), a confusion network creation unit (3), an inverted index table creation unit (4), a query input unit (6), a query conversion unit (7) and a label string check unit (8). The speech recognition unit (2) reads speech data from the speech database (1), performs speech recognition on the read speech data, and outputs the result of the speech recognition process as a lattice in which a phoneme, a syllable, or a word is the base unit. The confusion network creation unit (3) creates a confusion network based on the output lattice and outputs the result of the speech recognition process as the confusion network. The inverted index table creation unit (4) creates an inverted index table based on the output confusion network.
    Type: Application
    Filed: November 4, 2008
    Publication date: May 13, 2010
    Applicant: Massachusetts Institute of Technology
    Inventors: Takaaki Hori, I. Lee Hetherington, Timothy J. Hazen, James R. Glass
  • Publication number: 20080261045
    Abstract: The invention provides a method of making composite particles for efficient delivery of polyelectrolytes to a target. Composite particles are made by one of two methods: 1) first forming disperse polyelectrolyte condensates by mixing the polyelectrolyte with a condensing agent, and then combining the disperse polyelectrolyte condensates with particles so that the condensates bind to the surfaces of the particles; or 2) combining particles with a polyelectrolyte of opposite charge to form polyelectrolyte-coated particles, followed by a subsequent polyelectrolyte of opposite charge to form a composite particle. The invention includes composite particles, where each composite particle is comprised of a particle with the polyelectrolyte from one or more polyelectrolyte condensates bound to that particle. One advantage of these composite particles is that they permit polyelectrolytes to be delivered to a target more efficiently and in increased amounts, in comparison to the prior art.
    Type: Application
    Filed: January 18, 2008
    Publication date: October 23, 2008
    Inventors: James R. Glass, David Schultz, Steven J. Oldenburg
  • Publication number: 20030215817
    Abstract: Polynucleotides, polypeptides, kits and methods are provided related to genes regulated by the formation of fatty atherosclerotic lesions, and by administration of a dihydropyridine calcium antagonist, lercanidipine.
    Type: Application
    Filed: February 3, 2003
    Publication date: November 20, 2003
    Inventors: Amedeo Leonardi, Abraham Sartani, James R. Glass, J. Gregor Sutcliffe, Karl W. Hasel
  • Patent number: 5677276
    Abstract: The present invention provides novel conjugates of a synthetic polypeptide containing RGD or (dR)GD and a biodegradable polymer, hyaluronate. The conjugates are prepared by any one of three different methods provided by the present invention: (1) an epoxide method, (2) a sodium periodate method, and (3) a tresyl chloride method. The conjugates prepared by these methods are useful to aid in wound healing and tissue regeneration by providing a temporary matrix for tissue repair. The invention also provides novel RGD-peptides.
    Type: Grant
    Filed: June 5, 1995
    Date of Patent: October 14, 1997
    Assignee: La Jolla Cancer Research Foundation
    Inventors: Kenneth T. Dickerson, James R. Glass, Lin-Shu Liu, James W. Polarek, William S. Craig, Daniel G. Mullen, Soan Cheng
  • Patent number: 5625749
    Abstract: Phonetic recognition is provided by capturing the dynamical behavior and statistical dependencies of the acoustic attributes used to represent a subject speech waveform. A segment-based framework is employed. Temporal behavior is modelled explicitly by creating dynamic templates, called tracks, of the acoustic attributes used to represent the speech waveform, and by estimating the acoustic spatio-temporal correlation structure. An error model represents this estimation as the temporal and spatial correlations between the input speech waveform and the track-generated speech segment. Models incorporating these two components (track and error estimation) are created both for phonetic units and for phonetic transitions. Phonetic contextual influences are accounted for by merging context-dependent tracks and pooling error statistics over the different contexts. This allows for a large number of contextual models without compromising the robustness of the statistical parameter estimates.
    Type: Grant
    Filed: August 22, 1994
    Date of Patent: April 29, 1997
    Assignee: Massachusetts Institute of Technology
    Inventors: William D. Goldenthal, James R. Glass
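A numpy sketch of the track idea from this last entry: segments are time-normalized to a fixed length, a track is the mean attribute trajectory over training segments, and a new segment is scored by its deviation from the track. For brevity the error model here is diagonal, whereas the patent estimates the full spatio-temporal correlation structure; all dimensions and data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)

def resample(segment, length):
    """Time-normalize a variable-length segment to a fixed number of frames."""
    idx = np.linspace(0, len(segment) - 1, length)
    lo = np.floor(idx).astype(int)
    hi = np.ceil(idx).astype(int)
    w = idx - lo
    return (1 - w)[:, None] * segment[lo] + w[:, None] * segment[hi]

TRACK_LEN, DIM = 10, 4

# "Track": mean attribute trajectory for a phonetic unit, estimated from
# training segments; "error model": variance of deviations from the track.
train = [rng.normal(size=(rng.integers(8, 20), DIM)) + 0.5 for _ in range(30)]
normed = np.stack([resample(s, TRACK_LEN) for s in train])
track = normed.mean(axis=0)
error_var = normed.var(axis=0) + 1e-6

def score(segment):
    """Log-likelihood-style score of a segment under the track + error model."""
    e = resample(segment, TRACK_LEN) - track
    return float(-0.5 * np.sum(e**2 / error_var + np.log(error_var)))

test = rng.normal(size=(14, DIM)) + 0.5    # a 14-frame candidate segment
print("track score:", round(score(test), 2))
```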