Patents by Inventor Kyle Ernewein
Kyle Ernewein has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240112090
Abstract: Certain aspects of the present disclosure provide techniques for concurrently performing inferences using a machine learning model and optimizing parameters used in executing the machine learning model. An example method generally includes receiving a request to perform inferences on a data set using the machine learning model, along with performance metric targets for performance of the inferences. At least a first inference is performed on the data set using the machine learning model to meet a latency specified for generation of the first inference from receipt of the request. While performing at least the first inference, operational parameters resulting in inference performance approaching the performance metric targets are identified based on the machine learning model and operational properties of the computing device. The identified operational parameters are applied to performance of subsequent inferences using the machine learning model.
Type: Application
Filed: December 13, 2023
Publication date: April 4, 2024
Inventors: Serag Gadelrab, James Lyall Esliger, Meghal Varia, Kyle Ernewein, Alwyn Dos Remedios, George Lee
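The abstract above describes serving a first inference immediately (to meet the specified latency) while a concurrent search identifies operational parameters that approach the performance targets, which are then applied to later inferences. A minimal Python sketch of that idea follows; all names, the candidate-parameter list, and the latency figures are hypothetical illustrations, not details from the patent:

```python
import threading

def run_inference(batch, params):
    # Stand-in for executing the model with the given operational parameters.
    return [x * params["scale"] for x in batch]

def find_best_params(candidates, target_latency_ms):
    # Stand-in tuning loop: pick the candidate whose estimated latency comes
    # closest to the target without exceeding it.
    feasible = [c for c in candidates if c["est_latency_ms"] <= target_latency_ms]
    return max(feasible, key=lambda c: c["est_latency_ms"]) if feasible else candidates[0]

class TuningInferenceEngine:
    """Serve the first inference with default parameters while a background
    thread searches for parameters approaching the performance targets."""

    def __init__(self, default_params, candidates, target_latency_ms):
        self.params = default_params
        self._tuner = threading.Thread(
            target=self._tune, args=(candidates, target_latency_ms))
        self._tuner.start()

    def _tune(self, candidates, target_latency_ms):
        # Runs concurrently with the first inference(s).
        self.params = find_best_params(candidates, target_latency_ms)

    def infer(self, batch):
        self._tuner.join(timeout=0)  # never block an inference on tuning
        return run_inference(batch, self.params)
```

The key design point mirrored from the abstract is that `infer` never waits on the tuner: early requests use default parameters, and later requests pick up the tuned ones once the search completes.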
-
Patent number: 11907810
Abstract: Certain aspects of the present disclosure provide techniques for concurrently performing inferences using a machine learning model and optimizing parameters used in executing the machine learning model. An example method generally includes receiving a request to perform inferences on a data set using the machine learning model, along with performance metric targets for performance of the inferences. At least a first inference is performed on the data set using the machine learning model to meet a latency specified for generation of the first inference from receipt of the request. While performing at least the first inference, operational parameters resulting in inference performance approaching the performance metric targets are identified based on the machine learning model and operational properties of the computing device. The identified operational parameters are applied to performance of subsequent inferences using the machine learning model.
Type: Grant
Filed: July 18, 2019
Date of Patent: February 20, 2024
Assignee: QUALCOMM Incorporated
Inventors: Serag Gadelrab, James Esliger, Meghal Varia, Kyle Ernewein, Alwyn Dos Remedios, George Lee
-
Patent number: 11029745
Abstract: Systems and methods are disclosed for controlling instantaneous current changes in parallel processors with arrays of parallel computing elements, such as neural processors. An exemplary method comprises monitoring the array of computing elements and determining a transition from a first activity level of the array to a second activity level, such as an idle-to-active or active-to-idle transition. Once a transition is determined, the array is selectively controlled to minimize the instantaneous current change resulting from the transition from the first activity level to the second activity level.
Type: Grant
Filed: November 8, 2018
Date of Patent: June 8, 2021
Assignee: QUALCOMM Incorporated
Inventors: Kyle Ernewein, Jason Edward Podaima, Francisco Perez, John Daniels, Alex Miler, Jeffrey Gemar, Rexford Alan Hill, Haoping Xu
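A common way to limit instantaneous current change (di/dt) during the idle-to-active and active-to-idle transitions described above is to stage the transition, enabling or disabling computing elements in bounded increments rather than all at once. The sketch below illustrates that general staging idea in Python; the class, the element counts, and the step limit are hypothetical, and the patent's actual control mechanism is not specified here:

```python
def ramp_schedule(current_active, target_active, max_step):
    """Yield intermediate active-element counts so no single step changes
    the count (a proxy for instantaneous current draw) by more than max_step."""
    n = current_active
    while n != target_active:
        step = max(-max_step, min(max_step, target_active - n))
        n += step
        yield n

class ArrayController:
    """Stage activity-level transitions of a computing-element array."""

    def __init__(self, max_step):
        self.active = 0          # array starts idle
        self.max_step = max_step

    def transition(self, target):
        # Idle-to-active or active-to-idle transition, spread over stages.
        stages = list(ramp_schedule(self.active, target, self.max_step))
        self.active = target
        return stages
```

For example, activating 256 elements with a step limit of 64 produces four stages instead of one abrupt jump, bounding the per-stage current change.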
-
Publication number: 20210019652
Abstract: Certain aspects of the present disclosure provide techniques for concurrently performing inferences using a machine learning model and optimizing parameters used in executing the machine learning model. An example method generally includes receiving a request to perform inferences on a data set using the machine learning model, along with performance metric targets for performance of the inferences. At least a first inference is performed on the data set using the machine learning model to meet a latency specified for generation of the first inference from receipt of the request. While performing at least the first inference, operational parameters resulting in inference performance approaching the performance metric targets are identified based on the machine learning model and operational properties of the computing device. The identified operational parameters are applied to performance of subsequent inferences using the machine learning model.
Type: Application
Filed: July 18, 2019
Publication date: January 21, 2021
Inventors: Serag Gadelrab, James Esliger, Meghal Varia, Kyle Ernewein, Alwyn Dos Remedios, George Lee
-
Publication number: 20200195977
Abstract: An exemplary method for intelligent compression defines a threshold value for a temperature reading generated by a temperature sensor. Data blocks received into the compression module are compressed according to either a first mode or a second mode, the selection of which is determined by comparing the active temperature reading to the defined threshold value. The first compression mode may be associated with a lossless compression algorithm while the second compression mode is associated with a lossy compression algorithm. Alternatively, both compression modes may be associated with a lossless compression algorithm; in that case, data blocks received under the first compression mode are produced at a default high quality level setting, while under the second compression mode they are produced at a reduced quality level setting.
Type: Application
Filed: February 26, 2020
Publication date: June 18, 2020
Inventors: Serag Gadelrab, Chinchuan Chiu, Moinul Khan, Kyle Ernewein, Tom Longo, Simon Booth, Meghal Varia, Milivoje Aleksic
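The selection logic in this abstract reduces to a simple policy: compare the temperature reading against the threshold, and switch between a default-quality mode and a reduced-quality (or lossy) mode accordingly. The Python sketch below illustrates that policy; the threshold value, mode names, and the masking used to stand in for lossy compression are all hypothetical, not taken from the patent:

```python
def select_mode(temp_c, threshold_c):
    # Below the threshold: first mode (lossless, default quality).
    # At or above it: second mode (lossy, or lossless at reduced quality),
    # trading fidelity for lower power/thermal load.
    return "lossless_default" if temp_c < threshold_c else "lossy_reduced"

def compress(block, temp_c, threshold_c=85.0):
    """Compress a block of byte values under the mode chosen for the
    current temperature reading (threshold value is illustrative)."""
    mode = select_mode(temp_c, threshold_c)
    if mode == "lossless_default":
        return {"mode": mode, "data": list(block)}  # e.g. a lossless codec here
    # Lossy path sketched as quantization: discard the low-order bits.
    return {"mode": mode, "data": [b & ~0x0F for b in block]}
```

The point of the pattern is that the mode decision is made per block at compression time, so the system adapts as the sensor reading crosses the threshold in either direction.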
-
Patent number: 10609418
Abstract: An exemplary method for intelligent compression defines a threshold value for a temperature reading generated by a temperature sensor. Data blocks received into the compression module are compressed according to either a first mode or a second mode, the selection of which is determined by comparing the active temperature reading to the defined threshold value. The first compression mode may be associated with a lossless compression algorithm while the second compression mode is associated with a lossy compression algorithm. Alternatively, both compression modes may be associated with a lossless compression algorithm; in that case, data blocks received under the first compression mode are produced at a default high quality level setting, while under the second compression mode they are produced at a reduced quality level setting.
Type: Grant
Filed: April 18, 2017
Date of Patent: March 31, 2020
Assignee: QUALCOMM Incorporated
Inventors: Serag Gadelrab, Chinchuan Chiu, Moinul Khan, Kyle Ernewein, Tom Longo, Simon Booth, Meghal Varia, Milivoje Aleksic
-
Publication number: 20200073470
Abstract: Systems and methods are disclosed for controlling instantaneous current changes in parallel processors with arrays of parallel computing elements, such as neural processors. An exemplary method comprises monitoring the array of computing elements and determining a transition from a first activity level of the array to a second activity level, such as an idle-to-active or active-to-idle transition. Once a transition is determined, the array is selectively controlled to minimize the instantaneous current change resulting from the transition from the first activity level to the second activity level.
Type: Application
Filed: November 8, 2018
Publication date: March 5, 2020
Inventors: Kyle Ernewein, Jason Edward Podaima, Francisco Perez, John Daniels, Alex Miler, Jeffrey Gemar, Rexford Alan Hill, Haoping Xu
-
Patent number: 10484685
Abstract: An exemplary method for intelligent compression defines a threshold value for a key performance indicator. Based on the key performance indicator value, data blocks generated by a producer component may be scaled down to reduce power and/or bandwidth consumption before being compressed by a lossless compression module. The compressed data blocks are then stored in a memory component along with metadata that signals the scaling factor applied prior to compression. Consumer components that later retrieve the compressed data blocks from the memory component may decompress the data blocks and upscale them, if required, based on the scaling factor signaled by the metadata.
Type: Grant
Filed: April 18, 2017
Date of Patent: November 19, 2019
Assignee: QUALCOMM Incorporated
Inventors: Serag Gadelrab, Chinchuan Chiu, Moinul Khan, Kyle Ernewein, Tom Longo, Simon Booth, Meghal Varia, Milivoje Aleksic, King-Chung Lai
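The producer/consumer flow described above — scale down when a key performance indicator warrants it, compress losslessly, record the scaling factor as metadata, and let the consumer decompress and upscale from that metadata — can be sketched in a few lines of Python. The KPI threshold, the decimation-based downscale, and the repetition-based upscale are hypothetical stand-ins; the patent does not prescribe these specifics:

```python
import zlib  # stands in for the lossless compression module

def producer_compress(block: bytes, kpi_value: float, kpi_threshold: float):
    """Scale the block down when the KPI exceeds its threshold, compress
    losslessly, and record the scaling factor as metadata."""
    scale = 2 if kpi_value > kpi_threshold else 1
    scaled = block[::scale]  # crude 1-D downscale by decimation
    return {"payload": zlib.compress(scaled), "scale": scale}

def consumer_decompress(packet):
    """Decompress, then upscale if the signalled metadata says to."""
    data = zlib.decompress(packet["payload"])
    if packet["scale"] > 1:
        # Upscale by sample repetition, guided by the metadata.
        data = bytes(b for byte in data for b in [byte] * packet["scale"])
    return data
```

Carrying the scaling factor in metadata is what decouples the two sides: any consumer can reconstruct a full-size block without knowing what KPI condition the producer saw at compression time.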
-
Publication number: 20180302624
Abstract: An exemplary method for intelligent compression defines a threshold value for a temperature reading generated by a temperature sensor. Data blocks received into the compression module are compressed according to either a first mode or a second mode, the selection of which is determined by comparing the active temperature reading to the defined threshold value. The first compression mode may be associated with a lossless compression algorithm while the second compression mode is associated with a lossy compression algorithm. Alternatively, both compression modes may be associated with a lossless compression algorithm; in that case, data blocks received under the first compression mode are produced at a default high quality level setting, while under the second compression mode they are produced at a reduced quality level setting.
Type: Application
Filed: April 18, 2017
Publication date: October 18, 2018
Inventors: Serag Gadelrab, Chinchuan Chiu, Moinul Khan, Kyle Ernewein, Tom Longo, Simon Booth, Meghal Varia, Milivoje Aleksic
-
Publication number: 20180302625
Abstract: An exemplary method for intelligent compression defines a threshold value for a key performance indicator. Based on the key performance indicator value, data blocks generated by a producer component may be scaled down to reduce power and/or bandwidth consumption before being compressed by a lossless compression module. The compressed data blocks are then stored in a memory component along with metadata that signals the scaling factor applied prior to compression. Consumer components that later retrieve the compressed data blocks from the memory component may decompress the data blocks and upscale them, if required, based on the scaling factor signaled by the metadata.
Type: Application
Filed: April 18, 2017
Publication date: October 18, 2018
Inventors: Serag Gadelrab, Chinchuan Chiu, Moinul Khan, Kyle Ernewein, Tom Longo, Simon Booth, Meghal Varia, Milivoje Aleksic, King-Chung Lai
-
Publication number: 20180253236
Abstract: A method and system for dynamic control of shared memory resources within a portable computing device ("PCD") are disclosed. A limit request of an unacceptable deadline miss ("UDM") engine of the portable computing device may be determined with a limit request sensor within the UDM engine. Next, a memory management unit modifies a shared memory resource arbitration policy in view of the limit request. By modifying the shared memory resource arbitration policy, the memory management unit may intelligently allocate resources to service translation requests that are separately queued based on whether they emanated from a flooding engine or a non-flooding engine.
Type: Application
Filed: March 2, 2017
Publication date: September 6, 2018
Inventors: Serag Gadelrab, Jason Edward Podaima, Kyle Ernewein, Meghal Varia
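The arbitration scheme in this abstract keeps separate queues for translation requests from flooding and non-flooding engines, and reshapes the grant policy when a UDM engine raises a limit request. A minimal Python sketch of that queue-priority shift follows; the class name, the two-queue model, and the simple priority flip are illustrative assumptions, not the patent's actual arbitration policy:

```python
from collections import deque

class TranslationArbiter:
    """Separate queues for flooding and non-flooding engines; a limit
    request from a UDM engine shifts grants toward the non-flooding queue."""

    def __init__(self):
        self.flooding = deque()
        self.non_flooding = deque()
        self.limit_requested = False  # set when a UDM limit request arrives

    def enqueue(self, request, flooding):
        (self.flooding if flooding else self.non_flooding).append(request)

    def grant(self):
        # Default order serves the flooding queue first; under a limit
        # request, the deadline-sensitive non-flooding queue is served first.
        first, second = ((self.non_flooding, self.flooding)
                         if self.limit_requested
                         else (self.flooding, self.non_flooding))
        for queue in (first, second):
            if queue:
                return queue.popleft()
        return None  # nothing pending
```

Separating the queues is what makes the policy change cheap: the arbiter only reorders which queue it drains first, rather than inspecting or reordering individual requests.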
-
Patent number: 10067691
Abstract: A method and system for dynamic control of shared memory resources within a portable computing device ("PCD") are disclosed. A limit request of an unacceptable deadline miss ("UDM") engine of the portable computing device may be determined with a limit request sensor within the UDM engine. Next, a memory management unit modifies a shared memory resource arbitration policy in view of the limit request. By modifying the shared memory resource arbitration policy, the memory management unit may intelligently allocate resources to service translation requests that are separately queued based on whether they emanated from a flooding engine or a non-flooding engine.
Type: Grant
Filed: March 2, 2017
Date of Patent: September 4, 2018
Assignee: QUALCOMM Incorporated
Inventors: Serag Gadelrab, Jason Edward Podaima, Kyle Ernewein, Meghal Varia