Patents by Inventor Dirk Ryan Padfield
Dirk Ryan Padfield has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250037700
Abstract: A method includes receiving a reference audio signal corresponding to reference speech spoken by a target speaker with atypical speech, and generating, by a speaker embedding network configured to receive the reference audio signal as input, a speaker embedding for the target speaker. The speaker embedding conveys speaker characteristics of the target speaker. The method also includes receiving a speech conversion request that includes input audio data corresponding to an utterance spoken by the target speaker associated with the atypical speech. The method also includes biasing, using the speaker embedding generated for the target speaker by the speaker embedding network, a speech conversion model to convert the input audio data corresponding to the utterance spoken by the target speaker associated with atypical speech into an output canonical representation of the utterance spoken by the target speaker.
Type: Application
Filed: October 17, 2024
Publication date: January 30, 2025
Applicant: Google LLC
Inventors: Fadi Biadsy, Dirk Ryan Padfield, Victoria Zayats
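One common way to bias a conversion model on speaker identity, as the abstract describes at a high level, is to condition every input frame on a fixed speaker embedding. The sketch below illustrates the simplest such conditioning scheme (concatenation); the shapes, function name, and the concatenation choice are illustrative assumptions, not the patented implementation.

```python
import numpy as np

def bias_with_speaker_embedding(frames, speaker_embedding):
    """Condition per-frame audio features on a fixed speaker embedding by
    concatenating the embedding to every frame (one common biasing scheme)."""
    n_frames = frames.shape[0]
    tiled = np.tile(speaker_embedding, (n_frames, 1))  # (T, E)
    return np.concatenate([frames, tiled], axis=1)     # (T, F + E)

# Hypothetical shapes: 100 frames of 80-dim features, 256-dim embedding.
frames = np.random.randn(100, 80)
embedding = np.random.randn(256)
biased = bias_with_speaker_embedding(frames, embedding)  # shape (100, 336)
```

The conversion model downstream would consume the biased frames, so the same acoustic input maps to different outputs for different reference speakers.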
-
Publication number: 20240428056
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing tasks. One of the methods includes obtaining a sequence of input tokens, where each token is selected from a vocabulary of tokens that includes text tokens and audio tokens, and wherein the sequence of input tokens includes tokens that describe a task to be performed and data for performing the task; generating a sequence of embeddings by embedding each token in the sequence of input tokens in an embedding space; and processing the sequence of embeddings using a language model neural network to generate a sequence of output tokens for the task, where each token is selected from the vocabulary.
Type: Application
Filed: June 21, 2024
Publication date: December 26, 2024
Inventors: Paul Kishan Rubenstein, Matthew Sharifi, Alexandru Tudor, Chulayuth Asawaroengchai, Duc Dung Nguyen, Marco Tagliasacchi, Neil Zeghidour, Zalán Borsos, Christian Frank, Dalia Salem Hassan Fahmy Elbadawy, Hannah Raphaelle Muckenhirn, Dirk Ryan Padfield, Damien Vincent, Evgeny Kharitonov, Michelle Dana Tadmor, Mihajlo Velimirovic, Feifan Chen, Victoria Zayats
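The key structural idea in this abstract is a single vocabulary spanning both text and audio tokens, embedded into one shared space before the language model runs. A minimal sketch of that shared-vocabulary embedding step follows; the vocabulary sizes, the id-offset convention for audio tokens, and the random table are all illustrative assumptions.

```python
import numpy as np

TEXT_VOCAB = 1000   # hypothetical number of text tokens
AUDIO_VOCAB = 500   # hypothetical number of audio (acoustic) tokens
EMBED_DIM = 64

# One embedding table covers both modalities: ids [0, TEXT_VOCAB) are text,
# ids [TEXT_VOCAB, TEXT_VOCAB + AUDIO_VOCAB) are audio tokens.
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(TEXT_VOCAB + AUDIO_VOCAB, EMBED_DIM))

def embed(token_ids):
    """Map a mixed sequence of text/audio token ids into the shared space."""
    return embedding_table[np.asarray(token_ids)]

# A task description (text ids) followed by task data (offset audio ids).
sequence = [5, 17, 42] + [TEXT_VOCAB + 3, TEXT_VOCAB + 99]
embeddings = embed(sequence)  # shape (5, 64)
```

Because both modalities share one table and one output vocabulary, the same language model can emit text tokens, audio tokens, or a mix, depending on the task stated in the prompt.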
-
Publication number: 20240420680
Abstract: Implementations relate to a multimodal translation application that can provide an abridged version of a translation through an audio interface of a computing device, while simultaneously providing a verbatim textual translation at a display interface of the computing device. The application can provide these different versions of the translation in certain circumstances when, for example, the rate of speech of a person speaking to a user is relatively high compared to a preferred rate of speech of the user. For example, a comparison between phonemes of an original language speech and a translated language speech can be performed to determine whether the ratio satisfies a threshold for providing an audible abridged translation. A determination to provide the abridged translation can additionally or alternatively be based on a determined language of the speaker.
Type: Application
Filed: June 19, 2023
Publication date: December 19, 2024
Inventors: Te I, Chris Kau, Jeffrey Robert Pitman, Robert Eric Genter, Qi Ge, Wolfgang Macherey, Dirk Ryan Padfield, Naveen Arivazhagan, Colin Cherry
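The threshold decision the abstract describes can be reduced to comparing an observed phoneme rate against a preferred rate. The helper below is a minimal sketch of that ratio test; the function name, the default threshold of 1.3, and the example numbers are assumptions for illustration only.

```python
def should_abridge(source_phonemes, source_seconds,
                   preferred_phonemes_per_sec, ratio_threshold=1.3):
    """Decide whether to play an abridged audio translation by comparing
    the speaker's phoneme rate against the listener's preferred rate."""
    observed_rate = source_phonemes / source_seconds
    return (observed_rate / preferred_phonemes_per_sec) > ratio_threshold

# Speaker produced 180 phonemes in 10 s (18/s) vs. a preferred 12/s:
# ratio 1.5 exceeds the threshold, so the audio version is abridged.
decision = should_abridge(180, 10.0, 12.0)  # → True
```

The verbatim text translation would still be rendered on screen regardless of this decision, per the abstract.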
-
Patent number: 12136410
Abstract: A method includes receiving a reference audio signal corresponding to reference speech spoken by a target speaker with atypical speech, and generating, by a speaker embedding network configured to receive the reference audio signal as input, a speaker embedding for the target speaker. The speaker embedding conveys speaker characteristics of the target speaker. The method also includes receiving a speech conversion request that includes input audio data corresponding to an utterance spoken by the target speaker associated with the atypical speech. The method also includes biasing, using the speaker embedding generated for the target speaker by the speaker embedding network, a speech conversion model to convert the input audio data corresponding to the utterance spoken by the target speaker associated with atypical speech into an output canonical representation of the utterance spoken by the target speaker.
Type: Grant
Filed: May 3, 2022
Date of Patent: November 5, 2024
Assignee: Google LLC
Inventors: Fadi Biadsy, Dirk Ryan Padfield, Victoria Zayats
-
Publication number: 20240265215
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, that facilitate generating stable real-time textual translations in a target language of an input audio data stream that is recorded in a source language. An audio stream that is recorded in a first language is obtained. A partial transcription of the audio can be generated at each time interval in a plurality of successive time intervals. Each partial transcription can be translated into a second language that is different from the first language. Each translated partial transcription can be input to a model that determines whether a portion of an input translated partial transcription is stable. Based on the input translated partial transcription, the model identifies a portion of the translated partial transcription that is predicted to be stable. This stable portion of the translated partial transcription is provided for display on a user device.
Type: Application
Filed: March 26, 2024
Publication date: August 8, 2024
Inventor: Dirk Ryan Padfield
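The abstract's stability model decides which prefix of a streaming translation is safe to display. A much simpler heuristic that conveys the same idea is to show only the word-level prefix shared by the last few partial translations; the sketch below uses that heuristic as a stand-in for the learned model described in the abstract.

```python
def stable_prefix(partial_translations):
    """Return the word-level prefix shared by successive partial
    translations -- a simple stand-in for a learned stability model."""
    token_lists = [t.split() for t in partial_translations]
    prefix = []
    for words in zip(*token_lists):
        if all(w == words[0] for w in words):
            prefix.append(words[0])
        else:
            break
    return " ".join(prefix)

partials = ["the cat", "the cat sat on", "the cat sat on the mat"]
print(stable_prefix(partials))  # → "the cat"
```

Displaying only the stable prefix avoids the flicker of retracting words that a later, longer translation would otherwise rewrite.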
-
Patent number: 11972226
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, that facilitate generating stable real-time textual translations in a target language of an input audio data stream that is recorded in a source language. An audio stream that is recorded in a first language is obtained. A partial transcription of the audio can be generated at each time interval in a plurality of successive time intervals. Each partial transcription can be translated into a second language that is different from the first language. Each translated partial transcription can be input to a model that determines whether a portion of an input translated partial transcription is stable. Based on the input translated partial transcription, the model identifies a portion of the translated partial transcription that is predicted to be stable. This stable portion of the translated partial transcription is provided for display on a user device.
Type: Grant
Filed: March 23, 2020
Date of Patent: April 30, 2024
Assignee: Google LLC
Inventor: Dirk Ryan Padfield
-
Publication number: 20230360632
Abstract: A method includes receiving a reference audio signal corresponding to reference speech spoken by a target speaker with atypical speech, and generating, by a speaker embedding network configured to receive the reference audio signal as input, a speaker embedding for the target speaker. The speaker embedding conveys speaker characteristics of the target speaker. The method also includes receiving a speech conversion request that includes input audio data corresponding to an utterance spoken by the target speaker associated with the atypical speech. The method also includes biasing, using the speaker embedding generated for the target speaker by the speaker embedding network, a speech conversion model to convert the input audio data corresponding to the utterance spoken by the target speaker associated with atypical speech into an output canonical representation of the utterance spoken by the target speaker.
Type: Application
Filed: May 3, 2022
Publication date: November 9, 2023
Applicant: Google LLC
Inventors: Fadi Biadsy, Dirk Ryan Padfield, Victoria Zayats
-
Publication number: 20230021824
Abstract: The technology provides an approach to train translation models that are robust to transcription errors and punctuation errors. The approach includes introducing errors from actual automatic speech recognition and automatic punctuation systems into the source side of the machine translation training data. A method for training a machine translation model includes performing automatic speech recognition on input source audio to generate a system transcript. The method aligns a human transcript of the source audio to the system transcript, including projecting system segmentation onto the human transcript. Then the method performs segment robustness training of a machine translation model according to the aligned human and system transcripts, and performs system robustness training of the machine translation model, e.g., by injecting token errors into training data.
Type: Application
Filed: July 7, 2022
Publication date: January 26, 2023
Applicant: Google LLC
Inventors: Dirk Ryan Padfield, Colin Andrew Cherry
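The "injecting token errors into training data" step can be illustrated with a toy noiser. The abstract's approach samples errors from real ASR and punctuation systems; the sketch below is a simplified stand-in that randomly drops or duplicates source tokens, with the error types, rate, and function name chosen purely for illustration.

```python
import random

def inject_token_errors(tokens, error_rate=0.1, seed=0):
    """Simulate ASR-style noise on the source side of MT training data by
    randomly deleting or duplicating tokens (a simplified stand-in for
    errors sampled from a real ASR/punctuation system)."""
    rng = random.Random(seed)
    noisy = []
    for tok in tokens:
        r = rng.random()
        if r < error_rate / 2:
            continue               # deletion error: drop the token
        noisy.append(tok)
        if r > 1 - error_rate / 2:
            noisy.append(tok)      # duplication error: repeat the token
    return noisy
```

Training the translation model on (noisy source, clean target) pairs built this way is what makes it tolerant of the same error patterns at inference time.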
-
Publication number: 20220121827
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, that facilitate generating stable real-time textual translations in a target language of an input audio data stream that is recorded in a source language. An audio stream that is recorded in a first language is obtained. A partial transcription of the audio can be generated at each time interval in a plurality of successive time intervals. Each partial transcription can be translated into a second language that is different from the first language. Each translated partial transcription can be input to a model that determines whether a portion of an input translated partial transcription is stable. Based on the input translated partial transcription, the model identifies a portion of the translated partial transcription that is predicted to be stable. This stable portion of the translated partial transcription is provided for display on a user device.
Type: Application
Filed: March 23, 2020
Publication date: April 21, 2022
Inventor: Dirk Ryan Padfield
-
Patent number: 10223812
Abstract: Systems and methods are described for providing an image validation module. The image validation module enables capture, enhancement, validation, and upload of a digital image to a networked computing service, applying criteria that correspond to image validation criteria used by the networked computing service. The image validation module may be executed on a mobile computing device, and may authenticate itself to the networked computing service to indicate that digital images have already been validated. The image validation module may provide feedback before, during, or after image capture to enable the capture of valid images, and may provide feedback before, during, or after image enhancement to allow issues that prevent a digital image from passing validation to be addressed.
Type: Grant
Filed: November 9, 2016
Date of Patent: March 5, 2019
Assignee: Amazon Technologies, Inc.
Inventors: Eric Paul Bennett, Thomas Lund Dideriksen, Brian Jackson, Gregory James Nyssen, Dirk Ryan Padfield
-
Patent number: 10062173
Abstract: Features are disclosed for processing composite images. Composite images may be received that include a common item such as a t-shirt with different graphics overlaid on the item. Features for detecting such composite images by comparing shape and color features of an uploaded image to previously detected composite images are described. Composite images including the common item may be grouped into clusters. The clustered images can then be processed as a group such as to separate the graphics from the underlying image and to make authorization determinations for inclusion in an online catalog system.
Type: Grant
Filed: December 9, 2015
Date of Patent: August 28, 2018
Assignee: Amazon Technologies, Inc.
Inventor: Dirk Ryan Padfield
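Comparing "shape and color features" between an upload and previously seen composites can be illustrated with the color half of the problem: build a coarse color signature per image and score overlap. The sketch below uses a per-channel histogram and histogram intersection; the bin count and the intersection measure are illustrative assumptions, not the patented feature set.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Coarse per-channel color histogram used as a simple image signature.
    `image` is an (H, W, 3) uint8 array."""
    hist = [np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
            for c in range(3)]
    hist = np.concatenate(hist).astype(float)
    return hist / hist.sum()

def similarity(sig_a, sig_b):
    """Histogram intersection: 1.0 means identical color distributions."""
    return float(np.minimum(sig_a, sig_b).sum())

black = np.zeros((16, 16, 3), dtype=np.uint8)
white = np.full((16, 16, 3), 255, dtype=np.uint8)
same = similarity(color_histogram(black), color_histogram(black))  # → 1.0
diff = similarity(color_histogram(black), color_histogram(white))  # → 0.0
```

Images whose signatures score above a threshold would land in the same cluster, letting the group be reviewed or decomposed once rather than per image.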
-
Publication number: 20180096497
Abstract: Systems and methods are described for providing an image validation module. The image validation module enables capture, enhancement, validation, and upload of a digital image to a networked computing service, applying criteria that correspond to image validation criteria used by the networked computing service. The image validation module may be executed on a mobile computing device, and may authenticate itself to the networked computing service to indicate that digital images have already been validated. The image validation module may provide feedback before, during, or after image capture to enable the capture of valid images, and may provide feedback before, during, or after image enhancement to allow issues that prevent a digital image from passing validation to be addressed.
Type: Application
Filed: November 9, 2016
Publication date: April 5, 2018
Inventors: Eric Paul Bennett, Thomas Lund Dideriksen, Brian Jackson, Gregory James Nyssen, Dirk Ryan Padfield
-
Patent number: 9898812
Abstract: Features are disclosed for processing composite images. Composite images may be received that include a common item such as a t-shirt with different graphics overlaid on the item. Features for determining the quality of composite images based on processing the image data are provided. Detection of a region that the overlaid graphic covers provides a targeted location for analyzing the underlying image. A quality metric may be determined based on whether, which, and how many features of the item shown in the underlying image are obscured or otherwise modified by the overlaid image.
Type: Grant
Filed: December 9, 2015
Date of Patent: February 20, 2018
Assignee: Amazon Technologies, Inc.
Inventor: Dirk Ryan Padfield
-
Patent number: 9792524
Abstract: Disclosed are various embodiments for improving optical character recognition approaches through the use of gap shifting. A text detection process is performed upon an image to detect a first region of text. A second region that is in line with the first region is shifted to reduce a gap between the first region and the second region, thereby creating a modified image. The text detection process is performed upon the modified image in order to detect text within the second region.
Type: Grant
Filed: July 22, 2015
Date of Patent: October 17, 2017
Assignee: Amazon Technologies, Inc.
Inventors: Wei You, Dirk Ryan Padfield, Gurumurthy Swaminathan
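The geometric core of gap shifting can be shown on bounding boxes alone: slide the second region toward the first until only a small gap remains, then rerun detection on the modified image. The sketch below works on horizontal (x_left, x_right) spans; the 5-pixel minimum gap and the function name are illustrative assumptions.

```python
def shift_region(first_box, second_box, min_gap=5):
    """Shift a second text region toward the first so their horizontal gap
    shrinks to `min_gap` pixels, as a pre-pass before re-running detection.
    Boxes are (x_left, x_right) spans on the same text line."""
    gap = second_box[0] - first_box[1]
    if gap <= min_gap:
        return second_box          # already close enough; leave it alone
    shift = gap - min_gap
    return (second_box[0] - shift, second_box[1] - shift)

# A 40-pixel gap between regions collapses to the 5-pixel minimum.
moved = shift_region((0, 100), (140, 200))  # → (105, 165)
```

After shifting, text that the detector previously missed because it sat isolated beyond a large gap becomes part of a contiguous line and is more likely to be detected.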
-
Patent number: 9324155
Abstract: Systems and methods for determining parameters for image analysis are provided. One method includes obtaining ultrasound data of an object, generating an image of the object, and identifying a region of interest in the image. The method also includes determining a plurality of spatially varying parameters for image analysis of the region of interest using prior information for one or more objects of interest, including prior location information for the one or more objects of interest, and wherein the plurality of spatially varying parameters are determined for a plurality of sections of the region of interest and different for at least some of the plurality of sections. The method further includes using the plurality of spatially varying parameters for performing image analysis of the region of interest in the image to determine the location of the one or more objects of interest.
Type: Grant
Filed: March 10, 2014
Date of Patent: April 26, 2016
Assignee: General Electric Company
Inventors: Paulo Ricardo Mendonca, Dirk Ryan Padfield, Chandan Kumar Mallappa Aladahalli, Shubao Liu, Theresa Rose Broniak
-
Publication number: 20150254866
Abstract: Systems and methods for determining parameters for image analysis are provided. One method includes obtaining ultrasound data of an object, generating an image of the object, and identifying a region of interest in the image. The method also includes determining a plurality of spatially varying parameters for image analysis of the region of interest using prior information for one or more objects of interest, including prior location information for the one or more objects of interest, and wherein the plurality of spatially varying parameters are determined for a plurality of sections of the region of interest and different for at least some of the plurality of sections. The method further includes using the plurality of spatially varying parameters for performing image analysis of the region of interest in the image to determine the location of the one or more objects of interest.
Type: Application
Filed: March 10, 2014
Publication date: September 10, 2015
Applicant: General Electric Company
Inventors: Paulo Ricardo Mendonca, Dirk Ryan Padfield, Chandan Kumar Mallappa Aladahalli, Shubao Liu, Theresa Rose Broniak
-
Patent number: 9070004
Abstract: The present techniques provide for the evaluation of cellular motion and/or cellular properties based on an analysis of motion for clusters of cells. In an exemplary technique, images of cells are acquired and the images are segmented into clusters. Motion data for each respective cluster is derived from the segmented data. The properties of each cluster can be used to evaluate cellular properties and/or cellular motion properties.
Type: Grant
Filed: May 3, 2012
Date of Patent: June 30, 2015
Assignee: General Electric Company
Inventors: Xiaofeng Liu, Dirk Ryan Padfield
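Once clusters have been segmented, the simplest motion data to derive per cluster is the displacement of its centroid between frames. The sketch below computes displacement vectors and speeds for matched centroids; the assumption that clusters are already matched across frames (and the function name) is illustrative, not taken from the patent.

```python
import numpy as np

def cluster_motion(centroids_t0, centroids_t1):
    """Per-cluster displacement vectors and speeds between two frames,
    given matched cluster centroids (matching is assumed done upstream)."""
    c0 = np.asarray(centroids_t0, dtype=float)
    c1 = np.asarray(centroids_t1, dtype=float)
    displacement = c1 - c0
    speed = np.linalg.norm(displacement, axis=1)
    return displacement, speed

# One cluster moves by (3, 4) pixels, the other stays put.
disp, speed = cluster_motion([[0, 0], [10, 10]], [[3, 4], [10, 10]])
# speeds: [5.0, 0.0]
```

Aggregating such per-cluster speeds and directions over a sequence is one way the motion properties of the cell population could then be evaluated.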
-
Patent number: 9042631
Abstract: The invention relates to a computer-implemented method and systems for cell-level FISH dot counting. FISH (fluorescence in situ hybridization) dot counting is the process of enumerating chromosomal abnormalities in cells, which can be used in diagnosis and cancer research. The method comprises in part overlaying images of a biological sample comprising a nuclear counterstain mask and a FISH binary mask. The FISH binary mask is extracted using a multi-level extended h-maxima or h-minima transform.
Type: Grant
Filed: March 25, 2013
Date of Patent: May 26, 2015
Assignee: General Electric Company
Inventors: Dirk Ryan Padfield, Anitti Eljas Seppo, Yousef Ahmed Al-Kofahi
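The dot-extraction step keeps bright spots that stand out from their surroundings by at least some height h. The sketch below is a simplified stand-in for the multi-level extended h-maxima transform named in the abstract: it approximates the local background with a wide minimum filter and keeps local maxima whose prominence reaches h. Filter sizes and the toy image are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def detect_dots(image, h):
    """Simplified dot detector: keep local maxima that rise at least `h`
    above the local background (approximated by a wide minimum filter).
    A stand-in for the extended h-maxima transform, not a faithful copy."""
    local_max = ndimage.maximum_filter(image, size=3) == image
    background = ndimage.minimum_filter(image, size=7)
    prominent = (image - background) >= h
    return local_max & prominent & (image > 0)

# Toy image: one bright FISH dot (10) and one dim artifact (2).
img = np.zeros((11, 11))
img[5, 5] = 10
img[2, 2] = 2
mask = detect_dots(img, h=5)  # only the bright dot survives
```

Intersecting such a dot mask with a nuclear counterstain mask, as the abstract describes, assigns each surviving dot to a cell so counts can be reported per nucleus.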
-
Publication number: 20140364739
Abstract: System for analyzing a vascular structure. The system includes an initialization module that is configured to analyze a slice of a VOI that includes a main vessel of the vascular structure to position first and second luminal models in the lumen. Each of the first and second luminal models represents at least a portion of a cross-sectional shape of the lumen and has a location and a dimension in the slice. The system also includes a tracking module that is configured to determine the locations and the dimensions of the first and second luminal models in subsequent slices. For a designated slice, the locations and the dimensions of the first and second luminal models of the designated slice are based on the locations and the dimensions of the first and second luminal models, respectively, in a prior slice and also the image data of the designated slice.
Type: Application
Filed: June 6, 2013
Publication date: December 11, 2014
Applicant: General Electric Company
Inventors: Shubao Liu, Paulo Ricardo Mendonca, Dirk Ryan Padfield
-
Patent number: 8845539
Abstract: Methods, systems and computer program products for estimating the gestational age of a fetus are provided. According to one embodiment, the method generates a component image from a segmented ultrasound image of a fetal head. The component image includes one or more edges. The method then identifies a third ventricle within the component image. The method estimates a length of a bi-parietal diameter, based at least in part on the orientation of the third ventricle. Thereafter, the method estimates the gestational age of the fetus.
Type: Grant
Filed: December 16, 2011
Date of Patent: September 30, 2014
Assignee: General Electric Company
Inventors: Pavan Kumar Veerabhadra Annangi, Jyotirmoy Banerjee, Dirk Ryan Padfield
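The final step of the pipeline, mapping a measured bi-parietal diameter to a gestational age, is typically a fitted regression against a growth chart. The sketch below shows that step's shape only: the linear form and the coefficients are purely illustrative placeholders, not a clinically validated chart and not the patent's model.

```python
def gestational_age_weeks(bpd_mm, coeffs=(9.5, 0.35)):
    """Map a bi-parietal diameter (mm) to an estimated gestational age
    (weeks) with a linear regression. The default coefficients are
    hypothetical placeholders for illustration, not clinical values."""
    intercept, slope = coeffs
    return intercept + slope * bpd_mm

# With the placeholder coefficients, a 60 mm BPD maps to 30.5 weeks.
age = gestational_age_weeks(60)  # → 30.5
```

In practice the coefficients (and often a higher-order form) would come from fitting published fetal biometry data, with the automatically measured BPD from the earlier steps as input.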