Patents by Inventor Kyle Ernewein

Kyle Ernewein has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240112090
    Abstract: Certain aspects of the present disclosure provide techniques for concurrently performing inferences using a machine learning model and optimizing parameters used in executing the machine learning model. An example method generally includes receiving a request to perform inferences on a data set using the machine learning model and performance metric targets for performance of the inferences. At least a first inference is performed on the data set using the machine learning model to meet a latency specified for generation of the first inference from receipt of the request. While performing the at least the first inference, operational parameters resulting in inference performance approaching the performance metric targets are identified based on the machine learning model and operational properties of the computing device. The identified operational parameters are applied to performance of subsequent inferences using the machine learning model.
    Type: Application
    Filed: December 13, 2023
    Publication date: April 4, 2024
    Inventors: Serag Gadelrab, James Lyall Esliger, Meghal Varia, Kyle Ernewein, Alwyn Dos Remedios, George Lee
  • Patent number: 11907810
    Abstract: Certain aspects of the present disclosure provide techniques for concurrently performing inferences using a machine learning model and optimizing parameters used in executing the machine learning model. An example method generally includes receiving a request to perform inferences on a data set using the machine learning model and performance metric targets for performance of the inferences. At least a first inference is performed on the data set using the machine learning model to meet a latency specified for generation of the first inference from receipt of the request. While performing the at least the first inference, operational parameters resulting in inference performance approaching the performance metric targets are identified based on the machine learning model and operational properties of the computing device. The identified operational parameters are applied to performance of subsequent inferences using the machine learning model.
    Type: Grant
    Filed: July 18, 2019
    Date of Patent: February 20, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Serag Gadelrab, James Esliger, Meghal Varia, Kyle Ernewein, Alwyn Dos Remedios, George Lee
  • Patent number: 11029745
    Abstract: Systems and methods are disclosed for controlling instantaneous current changes in parallel processors with arrays of parallel computing elements, such as neural processors. An exemplary method comprises monitoring the array of computing elements and determining a transition from a first activity level of the array to a second activity level, such as an idle-to-active or active-to-idle transition. Once a transition is determined, the array is selectively controlled to minimize the instantaneous current change resulting from the transition from the first activity level to the second.
    Type: Grant
    Filed: November 8, 2018
    Date of Patent: June 8, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: Kyle Ernewein, Jason Edward Podaima, Francisco Perez, John Daniels, Alex Miler, Jeffrey Gemar, Rexford Alan Hill, Haoping Xu
  • Publication number: 20210019652
    Abstract: Certain aspects of the present disclosure provide techniques for concurrently performing inferences using a machine learning model and optimizing parameters used in executing the machine learning model. An example method generally includes receiving a request to perform inferences on a data set using the machine learning model and performance metric targets for performance of the inferences. At least a first inference is performed on the data set using the machine learning model to meet a latency specified for generation of the first inference from receipt of the request. While performing the at least the first inference, operational parameters resulting in inference performance approaching the performance metric targets are identified based on the machine learning model and operational properties of the computing device. The identified operational parameters are applied to performance of subsequent inferences using the machine learning model.
    Type: Application
    Filed: July 18, 2019
    Publication date: January 21, 2021
    Inventors: Serag Gadelrab, James Esliger, Meghal Varia, Kyle Ernewein, Alwyn Dos Remedios, George Lee
  • Publication number: 20200195977
    Abstract: An exemplary method for intelligent compression defines a threshold value for a temperature reading generated by a temperature sensor. Data blocks received into the compression module are compressed according to either a first mode or a second mode, the selection of which is determined based on a comparison of the active level for the temperature reading to the defined threshold value. The first compression mode may be associated with a lossless compression algorithm while the second compression mode is associated with a lossy compression algorithm. Alternatively, both compression modes may be associated with a lossless compression algorithm; in that case, the first compression mode produces the received data blocks at a default high quality level setting while the second compression mode produces them at a reduced quality level setting.
    Type: Application
    Filed: February 26, 2020
    Publication date: June 18, 2020
    Inventors: Serag Gadelrab, Chinchuan Chiu, Moinul Khan, Kyle Ernewein, Tom Longo, Simon Booth, Meghal Varia, Milivoje Aleksic
  • Patent number: 10609418
    Abstract: An exemplary method for intelligent compression defines a threshold value for a temperature reading generated by a temperature sensor. Data blocks received into the compression module are compressed according to either a first mode or a second mode, the selection of which is determined based on a comparison of the active level for the temperature reading to the defined threshold value. The first compression mode may be associated with a lossless compression algorithm while the second compression mode is associated with a lossy compression algorithm. Alternatively, both compression modes may be associated with a lossless compression algorithm; in that case, the first compression mode produces the received data blocks at a default high quality level setting while the second compression mode produces them at a reduced quality level setting.
    Type: Grant
    Filed: April 18, 2017
    Date of Patent: March 31, 2020
    Assignee: QUALCOMM Incorporated
    Inventors: Serag Gadelrab, Chinchuan Chiu, Moinul Khan, Kyle Ernewein, Tom Longo, Simon Booth, Meghal Varia, Milivoje Aleksic
  • Publication number: 20200073470
    Abstract: Systems and methods are disclosed for controlling instantaneous current changes in parallel processors with arrays of parallel computing elements, such as neural processors. An exemplary method comprises monitoring the array of computing elements and determining a transition from a first activity level of the array to a second activity level, such as an idle-to-active or active-to-idle transition. Once a transition is determined, the array is selectively controlled to minimize the instantaneous current change resulting from the transition from the first activity level to the second.
    Type: Application
    Filed: November 8, 2018
    Publication date: March 5, 2020
    Inventors: Kyle Ernewein, Jason Edward Podaima, Francisco Perez, John Daniels, Alex Miler, Jeffrey Gemar, Rexford Alan Hill, Haoping Xu
  • Patent number: 10484685
    Abstract: An exemplary method for intelligent compression defines a threshold value for a key performance indicator. Based on the key performance indicator value, data blocks generated by a producer component may be scaled down to reduce power and/or bandwidth consumption before being compressed by a lossless compression module. The compressed data blocks are then stored in a memory component along with metadata that signals the scaling factor used prior to compression. Consumer components later retrieving the compressed data blocks from the memory component may decompress the data blocks and upscale them, if required, based on the scaling factor signaled by the metadata.
    Type: Grant
    Filed: April 18, 2017
    Date of Patent: November 19, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Serag Gadelrab, Chinchuan Chiu, Moinul Khan, Kyle Ernewein, Tom Longo, Simon Booth, Meghal Varia, Milivoje Aleksic, King-Chung Lai
  • Publication number: 20180302624
    Abstract: An exemplary method for intelligent compression defines a threshold value for a temperature reading generated by a temperature sensor. Data blocks received into the compression module are compressed according to either a first mode or a second mode, the selection of which is determined based on a comparison of the active level for the temperature reading to the defined threshold value. The first compression mode may be associated with a lossless compression algorithm while the second compression mode is associated with a lossy compression algorithm. Alternatively, both compression modes may be associated with a lossless compression algorithm; in that case, the first compression mode produces the received data blocks at a default high quality level setting while the second compression mode produces them at a reduced quality level setting.
    Type: Application
    Filed: April 18, 2017
    Publication date: October 18, 2018
    Inventors: Serag Gadelrab, Chinchuan Chiu, Moinul Khan, Kyle Ernewein, Tom Longo, Simon Booth, Meghal Varia, Milivoje Aleksic
  • Publication number: 20180302625
    Abstract: An exemplary method for intelligent compression defines a threshold value for a key performance indicator. Based on the key performance indicator value, data blocks generated by a producer component may be scaled down to reduce power and/or bandwidth consumption before being compressed by a lossless compression module. The compressed data blocks are then stored in a memory component along with metadata that signals the scaling factor used prior to compression. Consumer components later retrieving the compressed data blocks from the memory component may decompress the data blocks and upscale them, if required, based on the scaling factor signaled by the metadata.
    Type: Application
    Filed: April 18, 2017
    Publication date: October 18, 2018
    Inventors: Serag Gadelrab, Chinchuan Chiu, Moinul Khan, Kyle Ernewein, Tom Longo, Simon Booth, Meghal Varia, Milivoje Aleksic, King-Chung Lai
  • Publication number: 20180253236
    Abstract: A method and system for dynamic control of shared memory resources within a portable computing device (“PCD”) are disclosed. A limit request of an unacceptable deadline miss (“UDM”) engine of the portable computing device may be determined with a limit request sensor within the UDM engine. Next, a memory management unit modifies a shared memory resource arbitration policy in view of the limit request. By modifying the shared memory resource arbitration policy, the memory management unit may intelligently allocate resources to service translation requests that are separately queued based on whether they emanated from a flooding engine or a non-flooding engine.
    Type: Application
    Filed: March 2, 2017
    Publication date: September 6, 2018
    Inventors: Serag Gadelrab, Jason Edward Podaima, Kyle Ernewein, Meghal Varia
  • Patent number: 10067691
    Abstract: A method and system for dynamic control of shared memory resources within a portable computing device (“PCD”) are disclosed. A limit request of an unacceptable deadline miss (“UDM”) engine of the portable computing device may be determined with a limit request sensor within the UDM engine. Next, a memory management unit modifies a shared memory resource arbitration policy in view of the limit request. By modifying the shared memory resource arbitration policy, the memory management unit may intelligently allocate resources to service translation requests that are separately queued based on whether they emanated from a flooding engine or a non-flooding engine.
    Type: Grant
    Filed: March 2, 2017
    Date of Patent: September 4, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Serag Gadelrab, Jason Edward Podaima, Kyle Ernewein, Meghal Varia
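
As a loose illustration of the concurrent-tuning idea behind patent 11907810 (serve the first inference with safe defaults, identify operational parameters that approach the performance targets, then apply them to subsequent inferences), the following Python sketch uses an invented latency cost model; every function name and parameter here is hypothetical and not drawn from the patent itself:

```python
# Hypothetical sketch: names and the cost model are invented for illustration.

def run_inference(batch, params):
    # Stand-in for executing the machine learning model with the given
    # operational parameters.
    return [x * params["scale"] for x in batch]

def predicted_latency_ms(params):
    # Invented cost model: larger tiles amortize per-item overhead.
    return 100.0 / params["tile"]

def tune(candidates, target_ms):
    # Choose the candidate whose predicted latency approaches the
    # performance target most closely without exceeding it.
    feasible = [p for p in candidates if predicted_latency_ms(p) <= target_ms]
    return min(feasible, key=lambda p: target_ms - predicted_latency_ms(p))

default_params = {"tile": 1, "scale": 1}
candidates = [{"tile": t, "scale": 1} for t in (1, 2, 4, 8)]

first = run_inference([1, 2, 3], default_params)  # served immediately
best = tune(candidates, target_ms=30.0)           # tuned while serving
later = run_inference([4, 5], best)               # subsequent inferences
```

In the patented method the tuning happens concurrently with the first inference; the sketch runs it sequentially only to keep the control flow visible.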
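
The idle/active transition control described in patent 11029745 amounts to limiting how fast the number of active compute elements (and hence the current draw) may change. A minimal sketch, with the step limit and column model invented for illustration:

```python
def ramp_active_columns(current_active, target_active, max_step):
    """Yield intermediate activity levels so that no single step changes
    the number of active compute columns (a proxy for current draw) by
    more than max_step, limiting the instantaneous current change."""
    level = current_active
    while level != target_active:
        # Clamp each step to [-max_step, +max_step].
        step = max(-max_step, min(max_step, target_active - level))
        level += step
        yield level
```

For example, ramping an array from fully idle (0 columns) to 10 active columns with `max_step=3` activates columns in stages rather than all at once, which is the "selectively controlled" behavior the abstract describes.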
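
The temperature-driven mode selection in patent 10609418 reduces to comparing a sensor reading against a defined threshold and picking a compression mode accordingly. A hypothetical sketch (the threshold value and record layout are invented, not taken from the patent):

```python
def compress(block, temp_c, threshold_c=80.0):
    """Select a compression mode by comparing a temperature reading to a
    defined threshold: lossless at default quality below the threshold,
    lossy (or lossless at reduced quality) at or above it."""
    if temp_c < threshold_c:
        return {"mode": "lossless", "quality": "default", "data": block}
    return {"mode": "lossy", "quality": "reduced", "data": block}
```

The point of the scheme is that the same data path degrades gracefully under thermal pressure instead of throttling outright.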
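
The producer/consumer flow in patent 10484685 — downscale under KPI pressure, compress losslessly, record the scaling factor as metadata, upscale on retrieval — can be sketched as follows. The KPI threshold, the factor-of-2 decimation, and the nearest-neighbour upscale are all illustrative assumptions:

```python
import zlib  # stands in for the lossless compression module

def produce(samples, kpi, threshold):
    # Downscale (drop every other sample) only when the KPI exceeds the
    # threshold; the scaling factor is carried alongside as metadata.
    scale = 2 if kpi > threshold else 1
    return {"scale": scale, "payload": zlib.compress(bytes(samples[::scale]))}

def consume(record):
    data = zlib.decompress(record["payload"])
    if record["scale"] > 1:
        # Nearest-neighbour upscale driven by the signaled scaling factor.
        data = bytes(b for s in data for b in [s] * record["scale"])
    return data
```

Note that the downscaling step is lossy in general; the round trip only reproduces the input exactly when the dropped samples duplicate their neighbours.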
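
The arbitration-policy change in patent 10067691 — separate queues for flooding and non-flooding engines, with the policy shifting when a UDM engine raises a limit request — might look like this sketch. The round-robin default and strict non-flooding priority under a limit request are invented policy details, not the patent's claims:

```python
from collections import deque

def arbitrate(flooding, non_flooding, limit_request, slots):
    """Grant up to `slots` translation-request slots. Without a limit
    request, alternate between the flooding and non-flooding queues;
    when a UDM limit request is active, serve non-flooding requests
    first so deadline-sensitive engines are not starved."""
    granted = []
    for turn in range(slots):
        if limit_request:
            queue = non_flooding or flooding
        else:
            queue = (flooding, non_flooding)[turn % 2] or non_flooding or flooding
        if not queue:
            break
        granted.append(queue.popleft())
    return granted
```

Separately queueing the two traffic classes is what lets the memory management unit change allocation without inspecting individual requests.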