Patents by Inventor Malcolm Slaney

Malcolm Slaney has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20190179850
    Abstract: A method of generating congruous metadata is provided. The method includes receiving a similarity measure between at least two multimedia objects. Each multimedia object has associated metadata. If the at least two multimedia objects are similar based on the similarity measure and a similarity threshold, the associated metadata of each of the multimedia objects are compared. Then, based on the comparison of the associated metadata of each of the at least two multimedia objects, the method further includes generating congruous metadata. The metadata may be, for example, tags.
    Type: Application
    Filed: February 21, 2019
    Publication date: June 13, 2019
    Inventors: Malcolm Slaney, Kilian Weinberger
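
    The tag-propagation idea above is straightforward to sketch. A minimal illustration in Python, assuming set-valued tag metadata and a similarity score computed upstream (the function name, threshold, and data layout are illustrative, not from the patent):

    ```python
    def generate_congruous_metadata(obj_a, obj_b, similarity, threshold=0.8):
        """Merge tag metadata of two multimedia objects deemed similar.

        `obj_a` and `obj_b` are dicts with a 'tags' set; `similarity` is a
        precomputed score in [0, 1]. Names and threshold are illustrative.
        """
        if similarity < threshold:
            return None  # not similar enough; leave the metadata alone
        # Compare the two tag sets and propagate tags each object is missing.
        shared = obj_a["tags"] & obj_b["tags"]
        congruous = obj_a["tags"] | obj_b["tags"]
        return {"shared": shared, "congruous": congruous}

    a = {"tags": {"beach", "sunset"}}
    b = {"tags": {"beach", "vacation"}}
    print(generate_congruous_metadata(a, b, similarity=0.9))
    ```
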
  • Patent number: 10317992
    Abstract: Improving accuracy in understanding and/or resolving references to visual elements in a visual context associated with a computerized conversational system is described. Techniques described herein leverage gaze input with gestures and/or speech input to improve spoken language understanding in computerized conversational systems. Leveraging gaze input and speech input improves spoken language understanding in conversational systems by improving the accuracy with which the system can resolve references—or interpret a user's intent—with respect to visual elements in a visual context. In at least one example, the techniques herein describe tracking gaze to generate gaze input, recognizing speech input, and extracting gaze features and lexical features from the user input. Based at least in part on the gaze features and lexical features, user utterances directed to visual elements in a visual context can be resolved.
    Type: Grant
    Filed: September 25, 2014
    Date of Patent: June 11, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Anna Prokofieva, Fethiye Asli Celikyilmaz, Dilek Z. Hakkani-Tur, Larry Heck, Malcolm Slaney
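
    A toy sketch of the fusion idea, assuming a gaze fixation point, element centers, and a fixed linear combination in place of the trained model the patent describes (all names, weights, and features are illustrative):

    ```python
    import math

    def resolve_reference(utterance, gaze_point, elements, w_gaze=0.5, w_lex=0.5):
        """Pick the visual element a user most likely refers to.

        `elements`: list of dicts with a 'label' string and a 'center' (x, y).
        The gaze feature decays with distance from the fixation point; the
        lexical feature is word overlap with the element label.
        """
        words = set(utterance.lower().split())

        def score(el):
            dist = math.dist(gaze_point, el["center"])
            gaze_feat = math.exp(-dist / 100.0)  # closer to fixation -> higher
            label_words = set(el["label"].lower().split())
            lex_feat = len(words & label_words) / max(len(label_words), 1)
            return w_gaze * gaze_feat + w_lex * lex_feat

        return max(elements, key=score)

    elements = [{"label": "red button", "center": (40, 40)},
                {"label": "blue button", "center": (400, 40)}]
    print(resolve_reference("click the red one", (60, 50), elements)["label"])
    ```
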
  • Patent number: 10216761
    Abstract: A method of generating congruous metadata is provided. The method includes receiving a similarity measure between at least two multimedia objects. Each multimedia object has associated metadata. If the at least two multimedia objects are similar based on the similarity measure and a similarity threshold, the associated metadata of each of the multimedia objects are compared. Then, based on the comparison of the associated metadata of each of the at least two multimedia objects, the method further includes generating congruous metadata. The metadata may be, for example, tags.
    Type: Grant
    Filed: March 4, 2008
    Date of Patent: February 26, 2019
    Assignee: Oath Inc.
    Inventors: Malcolm Slaney, Kilian Weinberger
  • Patent number: 10152517
    Abstract: The systems and methods described create a mathematical representation of each of the media objects for which user ratings are known. The mathematical representations take into account the subjective rating value assigned by a user to the respective media object and the user that assigned the rating value. The media object with the mathematical representation closest to that of the seed media object is then selected as the most similar media object to the seed media object. In an embodiment, the mathematical representation is a vector representation in which each user is a different dimension and each user's rating value is the magnitude of the vector in that dimension. Similarity between two songs is determined by identifying the closest vectors to that of the seed song. Closeness may be determined by subtracting one vector from the other (a Euclidean distance) or by calculating the dot product of each of the vectors with that of the seed media object.
    Type: Grant
    Filed: February 21, 2013
    Date of Patent: December 11, 2018
    Assignee: Excalibur IP, LLC
    Inventors: Malcolm Slaney, William White
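
    The vector construction lends itself to a short sketch. Here each user is a dimension and each rating a magnitude, with made-up data, and both closeness measures mentioned in the abstract are computed:

    ```python
    import numpy as np

    # Rows: songs; columns: users. Entry (i, j) is user j's rating of song i
    # (0 where unrated). The data are invented for illustration.
    ratings = np.array([
        [5, 3, 0, 1],   # seed song
        [4, 3, 0, 1],
        [1, 0, 5, 4],
    ], dtype=float)

    seed = ratings[0]
    # Closeness by subtraction (Euclidean distance): smaller is more similar.
    euclid = np.linalg.norm(ratings[1:] - seed, axis=1)
    # Closeness by dot product: larger is more similar.
    dots = ratings[1:] @ seed
    print("euclidean:", euclid)  # song 1 is nearest the seed
    print("dot:", dots)
    ```
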
  • Patent number: 9830351
    Abstract: Systems and methods for generating and playing a sequence of media objects based on a mood gradient are disclosed. A mood gradient is a sequence of items, in which each item is a media object with known characteristics or a representative set of characteristics of a media object, created or used by a user for a specific purpose. Given a mood gradient, one or more new media objects are selected for each item in the mood gradient based on the characteristics associated with that item. In this way, a sequence of new media objects is created, and the sequence exhibits a similar variation in media object characteristics. The mood gradient may be presented to a user, or created, via a display illustrating a three-dimensional space in which each dimension corresponds to a different characteristic. The mood gradient may be represented as a path through the three-dimensional space, and icons representing media objects are located within the space based on their characteristics.
    Type: Grant
    Filed: November 6, 2013
    Date of Patent: November 28, 2017
    Assignee: Yahoo! Inc.
    Inventors: William White, Malcolm Slaney
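
    A minimal sketch of selecting tracks along a mood gradient, assuming each media object is summarized by a 3-D characteristic vector (the dimensions, data, and nearest-neighbour selection rule are illustrative):

    ```python
    import numpy as np

    def playlist_from_gradient(gradient, library):
        """Select one library track per gradient item by nearest neighbour
        in a 3-D characteristic space (e.g. tempo, energy, valence).

        `gradient`: (n, 3) array of target characteristics per slot.
        `library`: dict of name -> (3,) characteristic vector.
        """
        names = list(library)
        feats = np.array([library[n] for n in names], dtype=float)
        chosen, used = [], set()
        for target in np.asarray(gradient, dtype=float):
            order = np.argsort(np.linalg.norm(feats - target, axis=1))
            pick = next(i for i in order if i not in used)  # nearest unused track
            used.add(pick)
            chosen.append(names[pick])
        return chosen

    library = {"calm": (60, 0.2, 0.5), "mid": (100, 0.5, 0.6), "peak": (140, 0.9, 0.8)}
    print(playlist_from_gradient(
        [(65, 0.25, 0.5), (95, 0.5, 0.6), (135, 0.85, 0.8)], library))
    ```
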
  • Publication number: 20170255650
    Abstract: Embodiments of the invention are directed to using image data and contextual data to determine information about a scene, based on one or more previously obtained images. Contextual data, such as the location of image capture, can be used to determine previously obtained images related to the contextual data and other location-related information, such as billboard locations. Even with low-resolution devices, such as cell phones, image attributes, such as a histogram or optically recognized characters, can be compared between the previously obtained images and the newly captured image. Attributes matching within a predefined threshold indicate matching images. Information on the content of matching previously obtained images can be provided back to the user who captured the new image. User profile data can refine the content information. The content information can also be used as search terms for additional searching or other processing.
    Type: Application
    Filed: May 17, 2017
    Publication date: September 7, 2017
    Inventors: Arun Ramanujapuram, Malcolm Slaney
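
    A rough sketch of the matching pipeline, assuming a prebuilt index of normalized histograms with capture locations; the 1 km prefilter and L1 histogram distance are illustrative stand-ins for the patent's thresholded attribute comparison:

    ```python
    import numpy as np

    def match_scene(new_hist, new_loc, index, radius_km=1.0, hist_threshold=0.2):
        """Find previously captured images whose histograms match a new photo.

        `index`: list of dicts with 'hist' (normalized grayscale histogram as
        a numpy array), 'loc' (lat, lon), and 'info' content metadata.
        """
        def rough_km(a, b):  # small-distance approximation near the equator
            return 111.0 * np.linalg.norm(np.subtract(a, b))

        matches = []
        for entry in index:
            if rough_km(new_loc, entry["loc"]) > radius_km:
                continue  # contextual data prunes the candidate set
            if np.abs(new_hist - entry["hist"]).sum() <= hist_threshold:
                matches.append(entry["info"])  # attributes match within threshold
        return matches
    ```
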
  • Patent number: 9665596
    Abstract: Embodiments of the invention are directed to using image data and contextual data to determine information about a scene, based on one or more previously obtained images. Contextual data, such as the location of image capture, can be used to determine previously obtained images related to the contextual data and other location-related information, such as billboard locations. Even with low-resolution devices, such as cell phones, image attributes, such as a histogram or optically recognized characters, can be compared between the previously obtained images and the newly captured image. Attributes matching within a predefined threshold indicate matching images. Information on the content of matching previously obtained images can be provided back to the user who captured the new image. User profile data can refine the content information. The content information can also be used as search terms for additional searching or other processing.
    Type: Grant
    Filed: October 4, 2016
    Date of Patent: May 30, 2017
    Assignee: Yahoo! Inc.
    Inventors: Arun Ramanujapuram, Malcolm Slaney
  • Patent number: 9639780
    Abstract: A system and method for improved classification. A first classifier is trained using a first process running on at least one computing device using a first set of training images relating to a class of images. A set of additional images is selected using the first classifier from a source of additional images accessible to the computing device. The first set of training images and the set of additional images are merged using the computing device to create a second set of training images. A second classifier is trained using a second process running on the computing device using the second set of training images. A set of unclassified images is then classified using the second classifier, thereby creating a set of classified images. The first classifier and the second classifier employ different classification methods.
    Type: Grant
    Filed: December 22, 2008
    Date of Patent: May 2, 2017
    Assignee: Excalibur IP, LLC
    Inventors: Marc Aurelio Ranzato, Kilian Quirin Weinberger, Eva Hoerster, Malcolm Slaney
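
    The two-stage training loop maps naturally onto scikit-learn. A hedged sketch with synthetic data, using logistic regression and k-nearest-neighbours as the two different classification methods (the patent does not specify these particular methods):

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)
    X_seed = rng.normal(0, 1, (40, 16))
    y_seed = (X_seed[:, 0] > 0).astype(int)      # synthetic labels
    X_pool = rng.normal(0, 1, (200, 16))         # stand-in for unlabeled web images

    # Stage 1: train a first classifier on the seed training set.
    first = LogisticRegression(max_iter=1000).fit(X_seed, y_seed)

    # Use it to select confidently labeled additional examples from the pool.
    conf = first.predict_proba(X_pool).max(axis=1)
    keep = conf > 0.9
    X_extra, y_extra = X_pool[keep], first.predict(X_pool[keep])

    # Stage 2: merge the sets and train a classifier of a *different* type.
    X2 = np.vstack([X_seed, X_extra])
    y2 = np.concatenate([y_seed, y_extra])
    second = KNeighborsClassifier(n_neighbors=5).fit(X2, y2)
    ```
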
  • Patent number: 9583105
    Abstract: Technologies described herein relate to modifying visual content for presentment on a display to facilitate improving the performance of an automatic speech recognition (ASR) system. The visual content is modified to move further apart elements that give rise to ambiguity from the perspective of the ASR system. The modification takes into consideration the accuracy of gaze tracking. When a user views an element in the modified visual content, the ASR system is customized as a function of the element being viewed by the user.
    Type: Grant
    Filed: June 6, 2014
    Date of Patent: February 28, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Andreas Stolcke, Geoffrey Zweig, Malcolm Slaney
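
    A toy sketch of the layout modification, using string similarity as a crude stand-in for whatever acoustic-confusability measure the ASR system would supply, and separating confusable elements by twice an assumed gaze-tracker error:

    ```python
    import difflib

    def spread_confusable(elements, gaze_error_px=80, min_sep=None):
        """Nudge confusable on-screen elements apart along the x axis.

        `elements`: list of dicts with 'name' and a mutable 'x' position.
        Two names count as confusable when their string similarity is high
        (a crude proxy for phonetic confusion). Elements end up separated
        by at least twice the gaze-tracker error, so a fixation resolves
        which one the user means.
        """
        min_sep = min_sep or 2 * gaze_error_px
        for i, a in enumerate(elements):
            for b in elements[i + 1:]:
                alike = difflib.SequenceMatcher(None, a["name"], b["name"]).ratio()
                if alike > 0.7 and abs(a["x"] - b["x"]) < min_sep:
                    b["x"] = a["x"] + min_sep  # move the later element away

    els = [{"name": "Best Buy", "x": 100}, {"name": "Best Way", "x": 130}]
    spread_confusable(els)
    print(els)
    ```
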
  • Publication number: 20170024414
    Abstract: Embodiments of the invention are directed to using image data and contextual data to determine information about a scene, based on one or more previously obtained images. Contextual data, such as the location of image capture, can be used to determine previously obtained images related to the contextual data and other location-related information, such as billboard locations. Even with low-resolution devices, such as cell phones, image attributes, such as a histogram or optically recognized characters, can be compared between the previously obtained images and the newly captured image. Attributes matching within a predefined threshold indicate matching images. Information on the content of matching previously obtained images can be provided back to the user who captured the new image. User profile data can refine the content information. The content information can also be used as search terms for additional searching or other processing.
    Type: Application
    Filed: October 4, 2016
    Publication date: January 26, 2017
    Inventors: Arun Ramanujapuram, Malcolm Slaney
  • Patent number: 9535988
    Abstract: Methods, apparatuses and systems directed to summarizing video or other multimedia content based on the blogging activities of one or more users. In one implementation, a user may select a video and one or more blogs that annotate various segments of the video. A user may specify one or more desired attributes of the blog or comment entries, which a system processes to create an edit decision list. The edit decision list can be used to generate a summarized version of the multimedia content that includes the content segments associated with blog entries that meet the desired attributes.
    Type: Grant
    Filed: December 21, 2007
    Date of Patent: January 3, 2017
    Assignee: Yahoo! Inc.
    Inventors: Steven Horowitz, Marc Davis, Malcolm Slaney
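
    Building the edit decision list can be sketched in a few lines, assuming blog entries carry a timestamp into the video and a set of tags (field names, padding, and the merge rule are illustrative):

    ```python
    def edit_decision_list(blog_entries, want_tags, pad_s=5.0):
        """Build an edit decision list from timestamped blog comments.

        `blog_entries`: list of dicts with 'time' (seconds into the video)
        and 'tags'. Entries whose tags meet the desired attributes each
        contribute a padded segment; overlapping segments are merged.
        """
        hits = sorted(e["time"] for e in blog_entries if want_tags & set(e["tags"]))
        edl = []
        for t in hits:
            start, end = max(0.0, t - pad_s), t + pad_s
            if edl and start <= edl[-1][1]:
                edl[-1] = (edl[-1][0], end)  # merge with the previous segment
            else:
                edl.append((start, end))
        return edl

    entries = [{"time": 12, "tags": ["goal"]}, {"time": 15, "tags": ["goal"]},
               {"time": 300, "tags": ["boring"]}]
    print(edit_decision_list(entries, {"goal"}))  # [(7.0, 20.0)]
    ```
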
  • Patent number: 9483499
    Abstract: Embodiments of the invention are directed to using image data and contextual data to determine information about a scene, based on one or more previously obtained images. Contextual data, such as the location of image capture, can be used to determine previously obtained images related to the contextual data and other location-related information, such as billboard locations. Even with low-resolution devices, such as cell phones, image attributes, such as a histogram or optically recognized characters, can be compared between the previously obtained images and the newly captured image. Attributes matching within a predefined threshold indicate matching images. Information on the content of matching previously obtained images can be provided back to the user who captured the new image. User profile data can refine the content information. The content information can also be used as search terms for additional searching or other processing.
    Type: Grant
    Filed: March 25, 2013
    Date of Patent: November 1, 2016
    Assignee: Yahoo! Inc.
    Inventors: Arun Ramanujapuram, Malcolm Slaney
  • Patent number: 9384214
    Abstract: A search engine determines a set of other images that are similar to a user-selected image, and presents those other images to the user. In determining whether two images are sufficiently similar to each other to merit presentation of one, the search engine determines a Euclidean distance between separate feature vectors that are associated with each of the images. Each such vector indicates diverse types of information known about the associated image. The types of information included within such a vector may include attributes that reflect visual characteristics that are visible in an image, verbal tags that have been associated with the image by users in a community of users, concepts derived from those tags, coordinates that reflect the geographic location of the camera when it produced the image, and concepts related to groups with which the image is associated.
    Type: Grant
    Filed: July 31, 2009
    Date of Patent: July 5, 2016
    Assignee: Yahoo! Inc.
    Inventors: Malcolm Slaney, Kilian Quirin Weinberger, Kaushal Kurapati, Sriram J. Sathish, Polly Ng
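
    A sketch of assembling and comparing such multi-feature vectors, with illustrative per-modality normalization and weighting (the patent does not prescribe these choices):

    ```python
    import numpy as np

    def image_vector(visual, tag_vec, geo, weights=(1.0, 1.0, 0.5)):
        """Concatenate normalized feature blocks into one vector.

        `visual`: visual-attribute features; `tag_vec`: bag-of-tags/concept
        features; `geo`: (lat, lon). Each block is L2-normalized and weighted
        so that no single modality dominates the Euclidean distance.
        """
        def unit(v):
            v = np.asarray(v, dtype=float)
            n = np.linalg.norm(v)
            return v / n if n else v

        blocks = [w * unit(b) for w, b in zip(weights, (visual, tag_vec, geo))]
        return np.concatenate(blocks)

    def most_similar(query_vec, candidates):
        """Rank candidates (iterable of (image_id, vector) pairs) by
        Euclidean distance to the query and return the nearest."""
        return min(candidates, key=lambda c: np.linalg.norm(query_vec - c[1]))
    ```
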
  • Patent number: 9324320
    Abstract: Pairs of feature vectors are obtained that represent speech. Some pairs represent two samples of speech from the same speaker, and other pairs represent two samples of speech from different speakers. A neural network feeds the two feature vectors in a sample pair into separate bottleneck layers whose input weight matrices are tied to one another. The neural network is trained using the feature vectors and an objective function that induces the network to classify whether the speech samples come from the same speaker. The weights of the tied weight matrix are then extracted and used to generate derived features for a speech processing system that can benefit from features transformed to better reflect speaker identity.
    Type: Grant
    Filed: October 2, 2014
    Date of Patent: April 26, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Andreas Stolcke, Malcolm Slaney, Sree Harsha Yella
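
    The tied-weight pair network is easy to sketch in PyTorch, where the weight tying falls out of reusing one Linear module for both inputs. Layer sizes and the binary cross-entropy objective are illustrative assumptions, not taken from the patent:

    ```python
    import torch
    import torch.nn as nn

    class TiedBottleneck(nn.Module):
        """Siamese network: one bottleneck layer whose weight matrix is
        shared (tied) across the two inputs of a pair, followed by a
        classifier that predicts same-speaker vs. different-speaker."""
        def __init__(self, feat_dim=40, bottleneck=13):
            super().__init__()
            self.bottleneck = nn.Linear(feat_dim, bottleneck)  # tied: reused below
            self.classifier = nn.Sequential(
                nn.Linear(2 * bottleneck, 64), nn.Tanh(), nn.Linear(64, 1))

        def forward(self, xa, xb):
            ha = torch.tanh(self.bottleneck(xa))  # the same weights...
            hb = torch.tanh(self.bottleneck(xb))  # ...applied to both samples
            return self.classifier(torch.cat([ha, hb], dim=-1))

    model = TiedBottleneck()
    loss_fn = nn.BCEWithLogitsLoss()              # same/different-speaker objective
    xa, xb = torch.randn(8, 40), torch.randn(8, 40)
    same = torch.randint(0, 2, (8, 1)).float()
    loss_fn(model(xa, xb), same).backward()
    # After training, model.bottleneck.weight is the tied matrix used to
    # derive speaker-discriminative features for a downstream system.
    ```
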
  • Publication number: 20160098987
    Abstract: Pairs of feature vectors are obtained that represent speech. Some pairs represent two samples of speech from the same speaker, and other pairs represent two samples of speech from different speakers. A neural network feeds the two feature vectors in a sample pair into separate bottleneck layers whose input weight matrices are tied to one another. The neural network is trained using the feature vectors and an objective function that induces the network to classify whether the speech samples come from the same speaker. The weights of the tied weight matrix are then extracted and used to generate derived features for a speech processing system that can benefit from features transformed to better reflect speaker identity.
    Type: Application
    Filed: October 2, 2014
    Publication date: April 7, 2016
    Inventors: Andreas Stolcke, Malcolm Slaney, Sree Harsha Yella
  • Publication number: 20160091967
    Abstract: Improving accuracy in understanding and/or resolving references to visual elements in a visual context associated with a computerized conversational system is described. Techniques described herein leverage gaze input with gestures and/or speech input to improve spoken language understanding in computerized conversational systems. Leveraging gaze input and speech input improves spoken language understanding in conversational systems by improving the accuracy with which the system can resolve references—or interpret a user's intent—with respect to visual elements in a visual context. In at least one example, the techniques herein describe tracking gaze to generate gaze input, recognizing speech input, and extracting gaze features and lexical features from the user input. Based at least in part on the gaze features and lexical features, user utterances directed to visual elements in a visual context can be resolved.
    Type: Application
    Filed: September 25, 2014
    Publication date: March 31, 2016
    Inventors: Anna Prokofieva, Fethiye Asli Celikyilmaz, Dilek Z. Hakkani-Tur, Larry Heck, Malcolm Slaney
  • Patent number: 9268812
    Abstract: Systems and methods for generating and playing a sequence of media objects based on a mood gradient are disclosed. A mood gradient is a sequence of items, in which each item is a media object with known characteristics or a representative set of characteristics of a media object, created or used by a user for a specific purpose. Given a mood gradient, one or more new media objects are selected for each item in the mood gradient based on the characteristics associated with that item. In this way, a sequence of new media objects is created, and the sequence exhibits a similar variation in media object characteristics. The mood gradient may be presented to a user, or created, via a display illustrating a three-dimensional space in which each dimension corresponds to a different characteristic. The mood gradient may be represented as a path through the three-dimensional space, and icons representing media objects are located within the space based on their characteristics.
    Type: Grant
    Filed: October 31, 2013
    Date of Patent: February 23, 2016
    Assignee: Yahoo! Inc.
    Inventors: William White, Malcolm Slaney
  • Publication number: 20150356971
    Abstract: Technologies described herein relate to modifying visual content for presentment on a display to facilitate improving the performance of an automatic speech recognition (ASR) system. The visual content is modified to move further apart elements that give rise to ambiguity from the perspective of the ASR system. The modification takes into consideration the accuracy of gaze tracking. When a user views an element in the modified visual content, the ASR system is customized as a function of the element being viewed by the user.
    Type: Application
    Filed: June 6, 2014
    Publication date: December 10, 2015
    Inventors: Andreas Stolcke, Geoffrey Zweig, Malcolm Slaney
  • Patent number: 9020042
    Abstract: A system and a corresponding method for temporal modification of audio signals, used to increase or reduce the playback rate of an audio and/or video file in a client-server environment. The system and method improve the efficiency of serving streaming media to a client so that the client can select an arbitrary time-speedup factor. The speedup system performs many of the pre-calculations once, at the server, so that bandwidth needs are reduced and the client's computational load is minimized. The final time-scale modification can either be done completely on the server, thus reducing the client's needs, or partly on the client's computer to minimize latency and to reduce the on-the-fly computational load on a server that serves multiple clients concurrently.
    Type: Grant
    Filed: January 26, 2011
    Date of Patent: April 28, 2015
    Assignee: International Business Machines Corporation
    Inventors: Arnon Amir, Malcolm Slaney
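
    Setting the client-server split aside, the core time-scale modification can be illustrated with a naive overlap-add stretcher: analysis frames are read at a hop scaled by the speedup factor and overlap-added at a fixed synthesis hop. This is a minimal single-machine stand-in, not the patented method:

    ```python
    import numpy as np

    def ola_time_scale(x, rate, frame=1024, synth_hop=256):
        """Naive overlap-add time-scale modification.

        Analysis frames are read at hop = synth_hop * rate and overlap-added
        at synth_hop, so rate > 1 speeds playback up while preserving pitch
        better than plain resampling would.
        """
        ana_hop = int(synth_hop * rate)
        win = np.hanning(frame)
        n_frames = max(1, (len(x) - frame) // ana_hop + 1)
        out = np.zeros(n_frames * synth_hop + frame)
        norm = np.zeros_like(out)
        for i in range(n_frames):
            a, s = i * ana_hop, i * synth_hop
            out[s:s + frame] += win * x[a:a + frame]
            norm[s:s + frame] += win
        return out / np.maximum(norm, 1e-8)   # compensate for window overlap

    audio = np.sin(2 * np.pi * 440 * np.arange(44100) / 44100)
    fast = ola_time_scale(audio, rate=1.5)    # roughly 1.5x speedup
    ```
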
  • Patent number: 8949872
    Abstract: Methods and systems for identifying multimedia content streaming through a television include retrieving an audio signal from the multimedia content selected for rendering at the television. The retrieved audio signal is partitioned into a plurality of segments of small intervals. A particular segment is analyzed to identify acoustic modulation and to generate a distinct vector for the segment based on that modulation; the vector defines a unique fingerprint of the particular segment of the audio signal. A content database on a server is queried using the vector of the particular segment to obtain content information for multimedia content that matches the fingerprint. The content information is used to identify the multimedia content, and its source, that matches the audio signal received for rendering.
    Type: Grant
    Filed: December 20, 2011
    Date of Patent: February 3, 2015
    Assignee: Yahoo! Inc.
    Inventors: Malcolm Slaney, Andres Hernandez Schafhauser
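
    One common way to realize such a fingerprint is to binarize band-energy changes between consecutive analysis frames; the sketch below uses that scheme as a stand-in for the patent's acoustic-modulation analysis (all parameters are illustrative):

    ```python
    import numpy as np

    def segment_fingerprint(segment, sr=8000, bands=16):
        """Reduce one short audio segment to a compact binary fingerprint.

        Energies in log-spaced frequency bands are compared between
        consecutive frames; the signs of the changes become the bit vector
        used to query the server's content database. `segment` must span
        at least two analysis frames.
        """
        frame, hop = 1024, 512
        edges = np.logspace(np.log10(100), np.log10(sr / 2), bands + 1)
        freqs = np.fft.rfftfreq(frame, 1 / sr)
        energies = []
        for start in range(0, len(segment) - frame, hop):
            spec = np.abs(np.fft.rfft(segment[start:start + frame] * np.hanning(frame)))
            energies.append([spec[(freqs >= lo) & (freqs < hi)].sum()
                             for lo, hi in zip(edges[:-1], edges[1:])])
        e = np.array(energies)
        bits = (np.diff(e, axis=0) > 0).astype(np.uint8)  # modulation sign bits
        return bits.flatten()

    # Matching segments yield (near-)identical bit vectors, so the server can
    # index fingerprints and look them up by approximate Hamming distance.
    ```
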