Patents by Inventor Chi-Keung Luk

Chi-Keung Luk has filed for patents to protect the inventions listed below. The listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230229978
    Abstract: A method includes training, using a first computing system having a first configuration, a first machine learning model having a machine learning model architecture, and training, using a second computing system having a different second configuration, a second machine learning model having the machine learning model architecture. The method also includes determining, for a shared training operation performed by both the first computing system and the second computing system, a similarity measure that represents a similarity between: a first training output generated by the first computing system during performance of the shared training operation during training of the first machine learning model; and a second training output generated by the second computing system during performance of the shared training operation during training of the second machine learning model.
    Type: Application
    Filed: January 6, 2023
    Publication date: July 20, 2023
    Applicant: Google LLC
    Inventors: Chi Keung Luk, Jose Americo Baiocchi Paredes, Russell Power, Mehmet Deveci
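
The comparison technique in the abstract above lends itself to a short illustration. Below is a minimal sketch, assuming each computing system logs one output tensor per shared training operation; the operation names, the cosine-similarity metric, and the 0.999 threshold are illustrative choices, not values specified by the application.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flattened training outputs."""
    a, b = a.ravel(), b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def compare_training_runs(outputs_a: dict, outputs_b: dict, threshold: float = 0.999):
    """Flag shared training operations whose outputs diverge between systems.

    outputs_a / outputs_b map an operation name to the tensor the
    corresponding computing system produced for that operation.
    """
    suspects = []
    for op in outputs_a.keys() & outputs_b.keys():   # shared operations only
        sim = cosine_similarity(outputs_a[op], outputs_b[op])
        if sim < threshold:
            suspects.append((op, sim))
    # The operation with the lowest similarity is the likeliest bug site.
    return sorted(suspects, key=lambda pair: pair[1])

# Example: identical ops agree; a diverging op is flagged.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
run_a = {"matmul_1": w @ w.T, "softmax_1": np.exp(w) / np.exp(w).sum()}
run_b = {"matmul_1": w @ w.T, "softmax_1": np.exp(w) / np.exp(w).sum() + 0.05}
print(compare_training_runs(run_a, run_b))
```
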
  • Patent number: 11556861
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for debugging correctness issues in training machine learning models. In one aspect, a method comprises training a first machine learning model using a first computing system having a first configuration; training a second machine learning model using a second computing system having a second configuration, wherein the second configuration of the second computing system is different than the first configuration of the first computing system; and determining, for each of a plurality of shared training operations that are performed by both the first computing system and the second computing system, a respective similarity measure that measures a similarity between: a first training output generated by the first computing system by performing the shared training operation, and a second training output generated by the second computing system by performing the shared training operation.
    Type: Grant
    Filed: May 6, 2019
    Date of Patent: January 17, 2023
    Assignee: Google LLC
    Inventors: Chi Keung Luk, Jose Americo Baiocchi Paredes, Russell Power, Mehmet Deveci
  • Publication number: 20200356905
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for debugging correctness issues in training machine learning models. In one aspect, a method comprises training a first machine learning model using a first computing system having a first configuration; training a second machine learning model using a second computing system having a second configuration, wherein the second configuration of the second computing system is different than the first configuration of the first computing system; and determining, for each of a plurality of shared training operations that are performed by both the first computing system and the second computing system, a respective similarity measure that measures a similarity between: a first training output generated by the first computing system by performing the shared training operation, and a second training output generated by the second computing system by performing the shared training operation.
    Type: Application
    Filed: May 6, 2019
    Publication date: November 12, 2020
    Inventors: Chi Keung Luk, Jose Americo Baiocchi Paredes, Russell Power, Mehmet Deveci
  • Publication number: 20100156888
    Abstract: Embodiments of a system, program product and method are presented to perform automatic partitioning of work between a host processor (e.g., a CPU) and at least one additional heterogeneous processing element (e.g., a GPU) through run-time adaptive mapping. The adaptive mapping may be performed by a dynamic compiler, based on projected execution times predicted by curve fitting based on actual execution times generated during a profile run of the program. Other embodiments are described and claimed.
    Type: Application
    Filed: December 23, 2008
    Publication date: June 24, 2010
    Inventors: Chi-Keung Luk, Paul Geoffrey Lowney
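
The adaptive-mapping idea above can be sketched in a few lines: fit curves to execution times measured during a profile run, then choose the CPU/GPU split whose projected completion time is lowest. The sample timings, the quadratic model, and the exhaustive search over split fractions below are illustrative assumptions, not details from the application.

```python
import numpy as np

# Profile-run samples: (work size, measured execution time) per device.
cpu_samples = [(1000, 0.9), (2000, 1.9), (4000, 4.1), (8000, 8.5)]
gpu_samples = [(1000, 0.4), (2000, 0.5), (4000, 0.8), (8000, 1.3)]

def fit_time_model(samples, degree=2):
    """Fit a polynomial curve to measured times; return a predictor."""
    sizes, times = zip(*samples)
    coeffs = np.polyfit(sizes, times, degree)
    return lambda n: float(np.polyval(coeffs, n))

cpu_time = fit_time_model(cpu_samples)
gpu_time = fit_time_model(gpu_samples)

def best_split(total_work, steps=100):
    """Pick the CPU fraction minimizing the slower device's projected time."""
    best = min(
        (max(cpu_time(f / steps * total_work),
             gpu_time((1 - f / steps) * total_work)), f / steps)
        for f in range(steps + 1)
    )
    return best[1]  # fraction of the work mapped to the CPU

frac = best_split(16000)
print(f"map {frac:.0%} of the work to the CPU, {1 - frac:.0%} to the GPU")
```
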
  • Patent number: 7343602
    Abstract: A processor capable of running multiple threads runs a program in one thread (called the “main” thread) and at least a portion of the same program in another thread (called the “pre-execution” thread). The program in the main thread includes instructions that cause the processor to start and stop pre-execution threads and direct the processor as to which part of the program is to be run through the pre-execution threads. Preferably, such instructions cause the pre-execution thread to run ahead of the main thread in program order. In that way, any cache miss conditions that are encountered by the pre-execution thread are resolved before the main thread requires that same data. Therefore, the main thread should encounter few or no cache miss conditions.
    Type: Grant
    Filed: December 18, 2001
    Date of Patent: March 11, 2008
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Chi-Keung Luk, Joel S. Emer
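
Hardware cache behavior cannot be reproduced from high-level code, so the sketch below only simulates the idea in the abstract above: a helper thread runs the load-generating part of the program ahead of the main thread so that "misses" are resolved early. The software cache and the latency constants are invented for illustration.

```python
import threading
import time

cache = {}                       # stands in for the hardware data cache

def load(addr):
    """Return the data at addr, paying a 'miss' penalty on first access."""
    if addr not in cache:
        time.sleep(0.01)         # simulated cache-miss latency
        cache[addr] = addr * 2
    return cache[addr]

def pre_execution_thread(addrs):
    # Runs only the loads of the program, so it pulls ahead of the main
    # thread and resolves misses before the main thread needs the data.
    for addr in addrs:
        load(addr)

def main_thread(addrs):
    total = 0
    for addr in addrs:
        value = load(addr)       # mostly hits, thanks to the helper
        time.sleep(0.002)        # the rest of the program's work
        total += value
    return total

addrs = list(range(100))
helper = threading.Thread(target=pre_execution_thread, args=(addrs,))
start = time.perf_counter()
helper.start()                   # the "start pre-execution thread" instruction
result = main_thread(addrs)
helper.join()                    # the "stop pre-execution thread" instruction
print(result, f"{time.perf_counter() - start:.2f}s")
```
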
  • Publication number: 20070234307
    Abstract: Methods and apparatus to inline conditional software instrumentation are disclosed. An example method comprises splitting a software instrumentation conditional analysis procedure for an application segment into an unconditional portion and a conditional portion, and inlining the unconditional portion.
    Type: Application
    Filed: March 6, 2006
    Publication date: October 4, 2007
    Inventors: Chi-Keung Luk, Robert Cohn
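
The transformation described above can be shown by hand on a toy example: a range check guarding expensive analysis is the unconditional portion and gets inlined at the instrumentation site, while the rarely taken analysis body stays out of line. The function names and the watched address range below are hypothetical.

```python
# Out-of-line analysis routine: called for EVERY memory access,
# even though the interesting work is guarded by a condition.
def analyze_access_slow(addr, watch_lo, watch_hi, hits):
    if watch_lo <= addr < watch_hi:      # conditional portion
        hits.append(addr)                # expensive analysis work

def run_unoptimized(trace, watch_lo, watch_hi):
    hits = []
    for addr in trace:
        analyze_access_slow(addr, watch_lo, watch_hi, hits)  # call overhead each time
    return hits

# After the transformation: the unconditional portion (the range test) is
# inlined at the instrumentation site; the conditional portion stays out
# of line and is reached only when the test succeeds.
def analyze_hit(addr, hits):             # conditional portion, rarely called
    hits.append(addr)

def run_inlined(trace, watch_lo, watch_hi):
    hits = []
    for addr in trace:
        if watch_lo <= addr < watch_hi:  # inlined unconditional test
            analyze_hit(addr, hits)
    return hits

trace = list(range(1_000_000))
assert run_unoptimized(trace, 10, 20) == run_inlined(trace, 10, 20)
```
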
  • Patent number: 7181723
    Abstract: Methods and an apparatus for stride profiling a software application are disclosed. An example system uses a hardware performance counter to report instruction addresses and data addresses associated with memory access instructions triggered by some event, such as a data cache miss. When the same instruction address is associated with more than one data address, the difference between the two data addresses is recorded. When two or more of these data address differences are recorded for the same instruction, the system determines a stride associated with the instruction to be the greatest common divisor of the two or more differences. This stride may be used by a compiler to optimize data cache prefetching. In addition, any overhead associated with monitoring addresses of data cache misses may be reduced by cycling between an inspection phase and a skipping phase. More data cache misses are monitored during the inspection phase than during the skipping phase.
    Type: Grant
    Filed: May 27, 2003
    Date of Patent: February 20, 2007
    Assignee: Intel Corporation
    Inventors: Chi-Keung Luk, Geoff Lowney
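
The greatest-common-divisor stride computation in the abstract above is straightforward to sketch. The sample (instruction address, data address) pairs below are invented, but the per-instruction GCD logic follows the abstract directly.

```python
from collections import defaultdict
from functools import reduce
from math import gcd

# Samples as reported by a hardware performance counter on some event,
# e.g., a data cache miss: (instruction address, data address) pairs.
samples = [
    (0x400a10, 0x7f0000), (0x400a10, 0x7f0040),
    (0x400a10, 0x7f00c0), (0x400a10, 0x7f0180),
    (0x400b20, 0x801000), (0x400b20, 0x801008),
]

def profile_strides(samples):
    """Estimate a stride per instruction as the GCD of the recorded
    data-address differences, as the abstract describes."""
    last_addr = {}
    diffs = defaultdict(list)
    for pc, addr in samples:
        if pc in last_addr:
            diffs[pc].append(abs(addr - last_addr[pc]))
        last_addr[pc] = addr
    return {
        pc: reduce(gcd, ds)
        for pc, ds in diffs.items()
        if len(ds) >= 2              # need two or more differences
    }

for pc, stride in profile_strides(samples).items():
    print(f"pc {pc:#x}: stride {stride} bytes")
```
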
  • Publication number: 20070006167
    Abstract: In one embodiment, the present invention includes a method for receiving a command to insert instrumentation code into a code segment, analyzing the code segment to determine an optimal location for the instrumentation code within the code segment, and inserting the instrumentation code at the optimal location to generate an instrumented code segment. The instrumented code segment may then be executed and may provide for improved performance over unoptimized instrumented code. Other embodiments are described and claimed.
    Type: Application
    Filed: May 31, 2005
    Publication date: January 4, 2007
    Inventors: Chi-Keung Luk, Ady Tal, Robert Cohn, Jonathan Beimel
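
The abstract above does not say how the "optimal location" is chosen, so the sketch below illustrates one plausible heuristic: insert the instrumentation where the fewest registers are live, minimizing spill code around it. The toy instruction format and the liveness-based cost are assumptions for illustration only.

```python
# Toy "code segment": each instruction lists the registers it reads/writes.
segment = [
    ("mov",   {"reads": set(),        "writes": {"r1"}}),
    ("add",   {"reads": {"r1", "r2"}, "writes": {"r3"}}),
    ("mul",   {"reads": {"r3"},       "writes": {"r4"}}),
    ("store", {"reads": {"r4"},       "writes": set()}),
]

def live_registers_at_each_point(segment):
    """Backward liveness scan; returns the live set before each instruction."""
    live = set()
    points = []
    for _, regs in reversed(segment):
        live = (live - regs["writes"]) | regs["reads"]
        points.append(set(live))
    return list(reversed(points))

def best_insertion_point(segment):
    # Fewest live registers means fewest spills around the instrumentation.
    liveness = live_registers_at_each_point(segment)
    return min(range(len(liveness)), key=lambda i: len(liveness[i]))

idx = best_insertion_point(segment)
print(f"insert instrumentation before instruction {idx} ({segment[idx][0]})")
```
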
  • Publication number: 20050050534
    Abstract: Methods and apparatus to pre-execute instructions on a single thread are disclosed. In an example method, at least one instruction associated with a latency condition is identified. A slice of instructions is identified. The slice of instructions is configured to generate a data address associated with the at least one instruction. At least one instruction slot in the single thread is identified. Code configured to execute the slice of instructions is generated within the at least one instruction slot.
    Type: Application
    Filed: September 2, 2003
    Publication date: March 3, 2005
    Inventors: Chi-Keung Luk, Paul Lowney
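
For a pointer-chasing loop, the slice that generates the next load's address is just the next-pointer dereference, and the sketch below shows it being pre-executed one step early. The prefetch stub stands in for a real prefetch instruction issued from a spare instruction slot; everything here is a conceptual simulation, not the patented mechanism itself.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def prefetch(obj):
    # Stands in for a hardware prefetch issued from an otherwise empty
    # instruction slot; a real implementation would emit a prefetch.
    pass

def sum_list(head):
    total = 0
    node = head
    while node is not None:
        nxt = node.next          # the slice: generates the next load's address
        if nxt is not None:
            prefetch(nxt)        # pre-execute the load from a spare slot
        total += node.value      # the latency-bound load
        node = nxt
    return total

# Build a small list and run the loop.
head = None
for v in reversed(range(10)):
    head = Node(v, head)
print(sum_list(head))
```
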
  • Publication number: 20040243981
    Abstract: Methods and an apparatus for stride profiling a software application are disclosed. An example system uses a hardware performance counter to report instruction addresses and data addresses associated with memory access instructions triggered by some event, such as a data cache miss. When the same instruction address is associated with more than one data address, the difference between the two data addresses is recorded. When two or more of these data address differences are recorded for the same instruction, the system determines a stride associated with the instruction to be the greatest common divisor of the two or more differences. This stride may be used by a compiler to optimize data cache prefetching. In addition, any overhead associated with monitoring addresses of data cache misses may be reduced by cycling between an inspection phase and a skipping phase. More data cache misses are monitored during the inspection phase than during the skipping phase.
    Type: Application
    Filed: May 27, 2003
    Publication date: December 2, 2004
    Inventors: Chi-Keung Luk, Geoff Lowney
  • Publication number: 20030084433
    Abstract: Executable code is modified to include prefetch instructions for certain loads. The targeted loads preferably include those loads for which a compiler cannot compute a stride (which represents the difference in memory addresses used in consecutive executions of a given load). Whether prefetch instructions should be included for such loads is determined preferably by running the code with a training data set which determines the frequency of strides for each subsequent execution of a load. If a stride occurs more than once for a load, then that load is prefetched by inserting a prefetch instruction into the executable code for that load. Further, a stride value is associated with the inserted prefetch. Preferably, the stride value is the most frequently occurring stride, which can be determined based on the results of the training data set. Alternatively, the stride can be computed during run-time by the code itself.
    Type: Application
    Filed: October 31, 2001
    Publication date: May 1, 2003
    Inventors: Chi-Keung Luk, Harish Patil, Robert Muth, Paul Geoffrey Lowney, Robert Cohn, Richard Weiss
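
Choosing the most frequent stride from a training run, as the abstract above describes, reduces to a small histogram computation. The load names and addresses below are invented; the "occurred more than once" test mirrors the abstract.

```python
from collections import Counter

# Addresses observed for each load during the training-data run.
training_addresses = {
    "load@0x4005c0": [0x1000, 0x1040, 0x1080, 0x10c0, 0x2000, 0x2040],
    "load@0x4007a8": [0x9000, 0x9abc, 0x8123],   # no repeating stride
}

def choose_prefetch_strides(training_addresses):
    """Pick, per load, the most frequent stride seen in the training run,
    provided it occurred more than once."""
    chosen = {}
    for load, addrs in training_addresses.items():
        if len(addrs) < 2:
            continue
        strides = Counter(b - a for a, b in zip(addrs, addrs[1:]))
        stride, count = strides.most_common(1)[0]
        if count > 1:                     # stride occurred more than once
            chosen[load] = stride         # insert a prefetch with this stride
    return chosen

for load, stride in choose_prefetch_strides(training_addresses).items():
    print(f"{load}: prefetch addr + {stride:#x}")
```
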
  • Publication number: 20020055964
    Abstract: A processor capable of running multiple threads runs a program in one thread (called the “main” thread) and at least a portion of the same program in another thread (called the “pre-execution” thread). The program in the main thread includes instructions that cause the processor to start and stop pre-execution threads and direct the processor as to which part of the program is to be run through the pre-execution threads. Preferably, such instructions cause the pre-execution thread to run ahead of the main thread in program order. In that way, any cache miss conditions that are encountered by the pre-execution thread are resolved before the main thread requires that same data. Therefore, the main thread should encounter few or no cache miss conditions.
    Type: Application
    Filed: December 18, 2001
    Publication date: May 9, 2002
    Inventors: Chi-Keung Luk, Joel S. Emer