Patents by Inventor Raghavan Manmatha
Raghavan Manmatha has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12229179
Abstract: The present disclosure generally relates to systems and methods for searching media content. In some implementation examples, a search system receives an input query, generates a query embedding of the input query, and generates a bias mitigation transformation associated with a sensitive attribute. Based on the query embedding and the bias mitigation transformation, the search system generates a transformed query embedding that suppresses at least a portion of the query embedding related to the sensitive attribute. Using the transformed query embedding, the search system executes a similarity search in a media embedding model to identify one or more media embeddings that are similar to the transformed query embedding and transmits the one or more media embeddings.
Type: Grant
Filed: November 20, 2023
Date of Patent: February 18, 2025
Assignee: Amazon Technologies, Inc.
Inventors: Matthaeus Kleindessner, Christopher Michael Russell, Kailash Budhathoki, Ali Caner Turkmen, Siqi Deng, Varad Gunjal, Ashwin Swaminathan, Raghavan Manmatha, Hao Yang
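The abstract above does not say what form the bias mitigation transformation takes. As a minimal illustrative sketch, assuming the transformation is an orthogonal projection that removes a single "sensitive attribute" direction from the query embedding, and assuming cosine similarity for the search, the flow might look like the following. The function names, the toy media embeddings, and the choice of projection are all hypothetical and are not taken from the patent.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def suppress_direction(query, direction):
    """Remove the component of `query` that lies along `direction`.

    Assumption: an orthogonal projection stands in for the patent's
    unspecified bias mitigation transformation.
    """
    norm_sq = dot(direction, direction)
    if norm_sq == 0:
        return list(query)
    scale = dot(query, direction) / norm_sq
    return [q - scale * d for q, d in zip(query, direction)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / denom if denom else 0.0


def search(query, direction, media, top_k=2):
    """Rank media embeddings by similarity to the transformed query."""
    transformed = suppress_direction(query, direction)
    ranked = sorted(media, key=lambda name: cosine(transformed, media[name]),
                    reverse=True)
    return ranked[:top_k]


# Hypothetical toy data: the third axis plays the role of the sensitive attribute.
media = {"a": [1.0, 0.0, 0.0], "b": [0.0, 0.0, 1.0], "c": [1.0, 1.0, 0.0]}
results = search([1.0, 1.0, 5.0], [0.0, 0.0, 1.0], media)
```

In this toy data the untransformed query would be closest to "b", which lies entirely along the suppressed axis; after the projection the ranking is driven only by the remaining two axes.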
-
Publication number: 20240152510
Abstract: Representations of sets of descriptors of reference objects are stored in a repository, with individual descriptors including information about entities identified in the reference objects. In response to a request to extract content from a particular data object, a reference object which satisfies a similarity criterion with respect to the particular data object is identified from the repository using the descriptors. A structural comparison of the particular data object and the reference object is performed to determine an entity related to another entity identified in the particular data object.
Type: Application
Filed: December 18, 2023
Publication date: May 9, 2024
Applicant: Amazon Technologies, Inc.
Inventors: Srikar Appalaraju, Raghavan Manmatha, Bhargava Urala Kota
-
Patent number: 11893012
Abstract: Representations of sets of descriptors of reference objects are stored in a repository, with individual descriptors including information about entities identified in the reference objects. In response to a request to extract content from a particular data object, a reference object which satisfies a similarity criterion with respect to the particular data object is identified from the repository using the descriptors. A structural comparison of the particular data object and the reference object is performed to determine an entity related to another entity identified in the particular data object.
Type: Grant
Filed: May 28, 2021
Date of Patent: February 6, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Srikar Appalaraju, Raghavan Manmatha, Bhargava Urala Kota
-
Patent number: 11308354
Abstract: Techniques for recognizing text in an image are described. An exemplary method may include receiving a request to recognize text in an image; extracting features from the image and generating a visual feature sequence from the extracted features; performing selective contextual refinement at least one selective contextual refinement block of a stack of selective contextual refinement blocks to generate a text prediction by: generating a contextual feature map and combining the contextual feature map with the visual feature sequence into a visual feature space, and applying a selective decoder that utilizes a two-step attention on the visual feature space to generate a text prediction, wherein the two-step attention includes performing a 1-D self-attention computation to generate attentional features and decoding the attentional features to generate the text prediction; and outputting the generated text prediction.
Type: Grant
Filed: March 30, 2020
Date of Patent: April 19, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Ron Litman, Oron Anschel, Shahar Tsiper, Roee Litman, Shai Mazor, Jonathan Wu, Raghavan Manmatha
-
Patent number: 10659787
Abstract: Techniques are generally described for enhanced compression of video data. In various examples, the techniques may include receiving first video data representing a scene in an environment. The techniques may further include generating illumination map data representing illumination of the scene in the first video data. The techniques may further comprise generating reflectance map data representing a reflectance of at least one object in the first video data. In some examples, the techniques may include sending, to a second computing device, the illumination map data and the reflectance map data. The techniques may further include receiving second video data representing the scene. The techniques may include determining a first illumination difference between the second video data and the first video data. The techniques may comprise sending, to the second computing device, the first illumination difference.
Type: Grant
Filed: September 20, 2018
Date of Patent: May 19, 2020
Assignee: AMAZON TECHNOLOGIES, INC.
Inventors: Ilya Vladimirovich Brailovskiy, Raghavan Manmatha
-
Patent number: 10366313
Abstract: Tasks such as object classification from image data can take advantage of a deep learning process using convolutional neural networks. These networks can include a convolutional layer followed by an activation layer, or activation unit, among other potential layers. Improved accuracy can be obtained by using a generalized linear unit (GLU) as an activation unit in such a network, where a GLU is linear for both positive and negative inputs, and is defined by a positive slope, a negative slope, and a bias. These parameters can be learned for each channel or a block of channels, and stacking those types of activation units can further improve accuracy.
Type: Grant
Filed: February 12, 2018
Date of Patent: July 30, 2019
Assignee: A9.COM, INC.
Inventors: Son Dinh Tran, Raghavan Manmatha
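The abstract describes the generalized linear unit only by its three parameters, so the exact formula is not given here. A minimal sketch, assuming the bias is simply added after the slope is applied (one plausible reading, not necessarily the patented form), with per-channel parameters supplied as plain tuples rather than learned:

```python
def glu(x, pos_slope, neg_slope, bias):
    """Piecewise-linear activation: one slope for x >= 0, another for x < 0.

    Assumption: the bias is added to both branches after scaling.
    """
    slope = pos_slope if x >= 0 else neg_slope
    return slope * x + bias


def apply_per_channel(feature_map, params):
    """Apply a separately parameterized GLU to each channel.

    feature_map: list of channels, each a flat list of floats.
    params: one (pos_slope, neg_slope, bias) tuple per channel (hypothetical
    values here; in a real network these would be learned).
    """
    return [[glu(v, *p) for v in channel]
            for channel, p in zip(feature_map, params)]


activated = apply_per_channel(
    [[1.0, -1.0], [2.0, -2.0]],
    [(1.0, 0.5, 0.0), (2.0, 0.0, 1.0)],
)
```

With a positive slope of 1, a negative slope of 0, and a bias of 0, this form reduces to a ReLU, and a small fixed negative slope gives a leaky ReLU, which is why it can be seen as generalizing those units. Note that "GLU" here is the abstract's generalized linear unit, not the gated linear unit that shares the acronym.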
-
Publication number: 20180197049
Abstract: Tasks such as object classification from image data can take advantage of a deep learning process using convolutional neural networks. These networks can include a convolutional layer followed by an activation layer, or activation unit, among other potential layers. Improved accuracy can be obtained by using a generalized linear unit (GLU) as an activation unit in such a network, where a GLU is linear for both positive and negative inputs, and is defined by a positive slope, a negative slope, and a bias. These parameters can be learned for each channel or a block of channels, and stacking those types of activation units can further improve accuracy.
Type: Application
Filed: February 12, 2018
Publication date: July 12, 2018
Inventors: Son Dinh Tran, Raghavan Manmatha
-
Patent number: 10013633
Abstract: Various approaches enable a user to capture image information (e.g., still images or video) about an object of interest such as the sole of a shoe or other piece of footwear (e.g., a sandal) and receive information about items that are determined to match footwear based at least in part on the image information. For example, an image analyze service or other similar service can analyze the images to determine a type of shoe included within the images based at least in part on patterns of other distinguishing features of the sole of the shoe. The image analysis service can aggregate the results and can provide information about the results as a set of matches or results to be displayed to a user in response to a visual search query. The information can include, for example, descriptions, contact information, availability, location data, pricing information, and other such information.
Type: Grant
Filed: March 8, 2017
Date of Patent: July 3, 2018
Assignee: A9.COM, INC.
Inventors: Raghavan Manmatha, Wei-Hong Chuang
-
Patent number: 10007680
Abstract: Systems and approaches for searching a content collection corresponding to query content are provided. In particular, false positive match rates between the query content and the content collection may be reduced with a minimum content region test and/or a minimum features per scale test. For example, by correlating content descriptors of a content piece in the content collection with query descriptors of the query content, the content piece can be determined to match the query content when a particular region of the content piece and/or a particular region of a query descriptor have a proportionate size meeting or exceeding a specified minimum. Alternatively, or in addition, the false positive match rate between query content and a content piece can be reduced by comparing content descriptors and query descriptors of features at a plurality of scales. A content piece can be determined to match the query content according to descriptor proportion quotas for the plurality of scales.
Type: Grant
Filed: January 26, 2015
Date of Patent: June 26, 2018
Assignee: A9.COM, INC.
Inventors: Arnab Sanat Kumar Dhua, Sunil Ramesh, Max Delgadillo, Raghavan Manmatha
-
Patent number: 9892344
Abstract: Tasks such as object classification from image data can take advantage of a deep learning process using convolutional neural networks. These networks can include a convolutional layer followed by an activation layer, or activation unit, among other potential layers. Improved accuracy can be obtained by using a generalized linear unit (GLU) as an activation unit in such a network, where a GLU is linear for both positive and negative inputs, and is defined by a positive slope, a negative slope, and a bias. These parameters can be learned for each channel or a block of channels, and stacking those types of activation units can further improve accuracy.
Type: Grant
Filed: November 30, 2015
Date of Patent: February 13, 2018
Assignee: A9.COM, INC.
Inventors: Son Dinh Tran, Raghavan Manmatha
-
Patent number: 9721182
Abstract: A method, system and computer program product for encoding an image is provided. The image that needs to be represented is represented in the form of a Gaussian pyramid which is a scale-space representation of the image and includes several pyramid images. The feature points in the pyramid images are identified and a specified number of feature points are selected. The orientations of the selected feature points are obtained by using a set of orientation calculating algorithms. A patch is extracted around the feature point in the pyramid images based on the orientations of the feature point and the sampling factor of the pyramid image. The boundary patches in the pyramid images are extracted by padding the pyramid images with extra pixels. The feature vectors of the extracted patches are defined. These feature vectors are normalized so that the components in the feature vectors are less than a threshold.
Type: Grant
Filed: December 21, 2016
Date of Patent: August 1, 2017
Assignee: A9.com, Inc.
Inventors: Mark A. Ruzon, Raghavan Manmatha, Donald Tanguay
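The first step in the abstract, building a Gaussian pyramid, can be illustrated with a small sketch. It assumes a grayscale image stored as a list of rows, a separable 1-2-1 binomial kernel as a cheap stand-in for a Gaussian, edge replication at the borders, and a sampling factor of 2 between levels. None of these choices come from the patent, and the later steps (feature point selection, orientation, patch extraction, normalization) are not covered.

```python
def blur_row(row):
    """Smooth one row with a 1-2-1 binomial kernel, replicating the edges."""
    n = len(row)
    return [(row[max(i - 1, 0)] + 2 * row[i] + row[min(i + 1, n - 1)]) / 4.0
            for i in range(n)]


def blur(image):
    """Separable blur: smooth along rows, then along columns."""
    rows = [blur_row(r) for r in image]
    cols = [blur_row(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def gaussian_pyramid(image, levels):
    """Return up to `levels` images, each half the size of the previous one.

    Stops early once an image is too small to downsample further.
    """
    pyramid = [image]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        if len(prev) < 2 or len(prev[0]) < 2:
            break
        smoothed = blur(prev)
        # Keep every second row and column (assumed sampling factor of 2).
        pyramid.append([row[::2] for row in smoothed[::2]])
    return pyramid


# A constant 4x4 image stays constant at every level, which makes a handy sanity check.
flat = [[1.0] * 4 for _ in range(4)]
levels = gaussian_pyramid(flat, 5)
sizes = [(len(img), len(img[0])) for img in levels]
```

Blurring before downsampling is what makes this a scale-space representation rather than plain subsampling: each level suppresses detail finer than the next level can represent.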
-
Patent number: 9652838
Abstract: Various approaches enable a user to capture image information (e.g., still images or video) about an object of interest such as the sole of a shoe or other piece of footwear (e.g., a sandal) and receive information about items that are determined to match footwear based at least in part on the image information. For example, an image analyze service or other similar service can analyze the images to determine a type of shoe included within the images based at least in part on patterns of other distinguishing features of the sole of the shoe. The image analysis service can aggregate the results and can provide information about the results as a set of matches or results to be displayed to a user in response to a visual search query. The information can include, for example, descriptions, contact information, availability, location data, pricing information, and other such information.
Type: Grant
Filed: December 23, 2014
Date of Patent: May 16, 2017
Assignee: A9.com, Inc.
Inventors: Raghavan Manmatha, Wei-Hong Chuang
-
Publication number: 20170103282
Abstract: A method, system and computer program product for encoding an image is provided. The image that needs to be represented is represented in the form of a Gaussian pyramid which is a scale-space representation of the image and includes several pyramid images. The feature points in the pyramid images are identified and a specified number of feature points are selected. The orientations of the selected feature points are obtained by using a set of orientation calculating algorithms. A patch is extracted around the feature point in the pyramid images based on the orientations of the feature point and the sampling factor of the pyramid image. The boundary patches in the pyramid images are extracted by padding the pyramid images with extra pixels. The feature vectors of the extracted patches are defined. These feature vectors are normalized so that the components in the feature vectors are less than a threshold.
Type: Application
Filed: December 21, 2016
Publication date: April 13, 2017
Inventors: Mark A. Ruzon, Raghavan Manmatha, Donald Tanguay
-
Patent number: 9530069
Abstract: Various embodiments of the present invention relate to a method, system and computer program product for detecting and recognizing text in the images captured by cameras and scanners. First, a series of image-processing techniques is applied to detect text regions in the image. Subsequently, the detected text regions pass through different processing stages that reduce blurring and the negative effects of variable lighting. This results in the creation of multiple images that are versions of the same text region. Some of these multiple versions are sent to a character-recognition system. The resulting texts from each of the versions of the image sent to the character-recognition system are then combined to a single result, wherein the single result is detected text.
Type: Grant
Filed: February 3, 2015
Date of Patent: December 27, 2016
Assignee: A9.com, Inc.
Inventors: Raghavan Manmatha, Mark A. Ruzon
-
Patent number: 9530076
Abstract: A method, system and computer program product for encoding an image is provided. The image that needs to be represented is represented in the form of a Gaussian pyramid which is a scale-space representation of the image and includes several pyramid images. The feature points in the pyramid images are identified and a specified number of feature points are selected. The orientations of the selected feature points are obtained by using a set of orientation calculating algorithms. A patch is extracted around the feature point in the pyramid images based on the orientations of the feature point and the sampling factor of the pyramid image. The boundary patches in the pyramid images are extracted by padding the pyramid images with extra pixels. The feature vectors of the extracted patches are defined. These feature vectors are normalized so that the components in the feature vectors are less than a threshold.
Type: Grant
Filed: February 16, 2015
Date of Patent: December 27, 2016
Assignee: A9.com, Inc.
Inventors: Mark A. Ruzon, Raghavan Manmatha, Donald Tanguay
-
Patent number: 9104700
Abstract: Present invention relates to a method and system for automatic searching for information on a network in response to an image query sent by a user. The image query includes an image that is captured by using a mobile communications device with a camera. The image is processed to detect the text present in it. The detected text is then recognized using an OCR. Subsequently, the text is searched for matches in the corresponding domain database, selected from the various domain databases present in the network. Thereafter, selected matches and additional related information is sent to the user.
Type: Grant
Filed: January 27, 2014
Date of Patent: August 11, 2015
Assignee: A9.com, Inc.
Inventors: Gurumurthy D. Ramkumar, Raghavan Manmatha, Supratik Bhattacharyya, Gautam Bhargava, Mark A. Ruzon
-
Publication number: 20150213061
Abstract: Systems and approaches for searching a content collection corresponding to query content are provided. In particular, false positive match rates between the query content and the content collection may be reduced with a minimum content region test and/or a minimum features per scale test. For example, by correlating content descriptors of a content piece in the content collection with query descriptors of the query content, the content piece can be determined to match the query content when a particular region of the content piece and/or a particular region of a query descriptor have a proportionate size meeting or exceeding a specified minimum. Alternatively, or in addition, the false positive match rate between query content and a content piece can be reduced by comparing content descriptors and query descriptors of features at a plurality of scales. A content piece can be determined to match the query content according to descriptor proportion quotas for the plurality of scales.
Type: Application
Filed: January 26, 2015
Publication date: July 30, 2015
Inventors: Arnab Sanat Kumar Dhua, Sunil Ramesh, Max Delgadillo, Raghavan Manmatha
-
Publication number: 20150161480
Abstract: A method, system and computer program product for encoding an image is provided. The image that needs to be represented is represented in the form of a Gaussian pyramid which is a scale-space representation of the image and includes several pyramid images. The feature points in the pyramid images are identified and a specified number of feature points are selected. The orientations of the selected feature points are obtained by using a set of orientation calculating algorithms. A patch is extracted around the feature point in the pyramid images based on the orientations of the feature point and the sampling factor of the pyramid image. The boundary patches in the pyramid images are extracted by padding the pyramid images with extra pixels. The feature vectors of the extracted patches are defined. These feature vectors are normalized so that the components in the feature vectors are less than a threshold.
Type: Application
Filed: February 16, 2015
Publication date: June 11, 2015
Inventors: Mark A. Ruzon, Raghavan Manmatha, Donald Tanguay
-
Publication number: 20150154464
Abstract: Various embodiments of the present invention relate to a method, system and computer program product for detecting and recognizing text in the images captured by cameras and scanners. First, a series of image-processing techniques is applied to detect text regions in the image. Subsequently, the detected text regions pass through different processing stages that reduce blurring and the negative effects of variable lighting. This results in the creation of multiple images that are versions of the same text region. Some of these multiple versions are sent to a character-recognition system. The resulting texts from each of the versions of the image sent to the character-recognition system are then combined to a single result, wherein the single result is detected text.
Type: Application
Filed: February 3, 2015
Publication date: June 4, 2015
Inventors: Raghavan Manmatha, Mark A. Ruzon
-
Patent number: 8977072
Abstract: Various embodiments of the present invention relate to a method, system and computer program product for detecting and recognizing text in the images captured by cameras and scanners. First, a series of image-processing techniques is applied to detect text regions in the image. Subsequently, the detected text regions pass through different processing stages that reduce blurring and the negative effects of variable lighting. This results in the creation of multiple images that are versions of the same text region. Some of these multiple versions are sent to a character-recognition system. The resulting texts from each of the versions of the image sent to the character-recognition system are then combined to a single result, wherein the single result is detected text.
Type: Grant
Filed: December 13, 2012
Date of Patent: March 10, 2015
Assignee: A9.com, Inc.
Inventors: Raghavan Manmatha, Mark A. Ruzon