Patents by Inventor Daniel A. Chimene
Daniel A. Chimene has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11940931Abstract: A turnstile OS primitive is provided that enables support for owner tracking and waiting. The turnstile primitive enables a common framework that can be adopted across multiple different types of synchronization primitives to provide a common service for priority boosting and wait queuing. A turnstile can also provide a mechanism to enable a turnstile to block on another turnstile, allowing multi-hop priority boosting within a chain of multiple blocking turnstiles.Type: GrantFiled: January 5, 2021Date of Patent: March 26, 2024Assignee: Apple Inc.Inventors: Jainam A. Shah, Jeremy C. Andrus, Daniel A. Chimene, Kushal Dalmia, Pierre Habouzit, James M. Magee, Marina Sadini, Daniel A. Steffen
-
Patent number: 11860796Abstract: Embodiments described herein provide techniques to manage drivers in a user space in a data processing system. One embodiment provides a data processing system configured perform operations, comprising discovering a hardware device communicatively coupled to the communication bus, launching a user space driver daemon, establishing an inter-process communication (IPC) link between a first proxy interface for the user space driver daemon and a second proxy interface for a server process in a kernel space, receiving, at the first proxy interface, an access right to enable access to a memory buffer in the kernel space, and relaying an access request for the memory buffer from the user space driver daemon via a third-party proxy interface to enable the user space driver daemon to access the memory buffer, the access request based on the access right.Type: GrantFiled: August 9, 2021Date of Patent: January 2, 2024Assignee: Apple Inc.Inventors: Jeremy C. Andrus, Joseph R. Auricchio, Russell A. Blaine, Daniel A. Chimene, Simon M. Douglas, Landon J. Fuller, Yevgen Goryachok, John K. Kim-Biggs, Arnold S. Liu, James M. Magee, Daniel A. Steffen, Roberto G. Yepez
-
Publication number: 20230067109Abstract: Embodiments include an asymmetric multiprocessing (AMP) system having two or more central processing unit (CPU) clusters of a first core type and a CPU cluster of a second core type. Some embodiments include determining a control effort for an active thread group, and assigning the thread group to a first performance island according to the control effort range of the first performance island. The first performance island can include a first CPU cluster of the first core type, where a second performance island includes a second CPU cluster of the first core type, where the second performance island corresponds to a different control effort range than the first performance island. Some embodiments include assigning the first CPU cluster as a preferred CPU cluster of the first thread group, and transmitting a first signal identifying the first CPU cluster as the preferred CPU cluster assigned to the first thread group.Type: ApplicationFiled: August 23, 2022Publication date: March 2, 2023Applicant: Apple Inc.Inventors: Bryan R. HINCH, John G. DORSEY, Ronit BANERJEE, Kushal DALMIA, Daniel A. CHIMENE, Jaidev P. PATWARDHAN
-
Patent number: 11579934Abstract: Systems and methods are disclosed for scheduling threads on a processor that has at least two different core types, such as an asymmetric multiprocessing system. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers for the thread group. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Deferred interrupts can be used to increase performance.Type: GrantFiled: March 22, 2021Date of Patent: February 14, 2023Assignee: Apple Inc.Inventors: Jeremy C. Andrus, John G. Dorsey, James M. Magee, Daniel A. Chimene, Cyril de la Cropte de Chanterac, Bryan R. Hinch, Aditya Venkataraman, Andrei Dorofeev, Nigel R. Gamble, Russell A. Blaine, Constantin Pistol, James S. Ismail
-
Publication number: 20230040310Abstract: Embodiments include an asymmetric multiprocessing (AMP) system having a first central processing unit (CPU) cluster comprising a first core type, and a second CPU cluster comprising a second core type, where the AMP system can update a thread metric for a first thread running on the first CPU cluster based at least on: a past shared resource overloaded metric of the first CPU cluster, and on-core metrics of the first thread. The on-core metrics of the first thread can indicate that first thread contributes to contention of the same shared resource corresponding to the past shared resource overloaded metric of the first CPU cluster. The AMP system can assign the first thread to a different CPU cluster while other threads of the same thread group remain assigned to the first CPU cluster. The thread metric can include a Matrix Extension (MX) thread flag or a Bus Interface Unit (BIU) thread flag.Type: ApplicationFiled: August 3, 2021Publication date: February 9, 2023Applicant: Apple Inc.Inventors: John G. DORSEY, Bryan R. HINCH, Ronit BANERJEE, Kushal DALMIA, Daniel A. CHIMENE, Jaidev P. PATWARDHAN
-
Patent number: 11422857Abstract: Embodiments described herein provide multi-level scheduling for threads in a data processing system. One embodiment provides a data processing system comprising one or more processors, a computer-readable memory coupled to the one or more processors, the computer-readable memory to store instructions which, when executed by the one or more processors, configure the one or more processors to receive execution threads for execution on the one or more processors, map the execution threads into a first plurality of buckets based at least in part on a quality of service class of the execution threads, schedule the first plurality of buckets for execution using a first scheduling algorithm, schedule a second plurality thread groups within the first plurality of buckets for execution using a second scheduling algorithm, and schedule a third plurality of threads within the second plurality of thread groups using a third scheduling algorithm.Type: GrantFiled: May 22, 2020Date of Patent: August 23, 2022Assignee: Apple Inc.Inventors: Kushal Dalmia, Jeremy C. Andrus, Daniel A. Chimene, Nigel R. Gamble, James M. Magee, Daniel A. Steffen
-
Patent number: 11360820Abstract: Systems and methods are disclosed for scheduling threads on an asymmetric multiprocessing system having multiple core types. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Metrics for workloads offloaded to co-processors can be tracked and integrated into metrics for the offloading thread group.Type: GrantFiled: June 2, 2018Date of Patent: June 14, 2022Assignee: Apple Inc.Inventors: John G. Dorsey, Daniel A. Chimene, Andrei Dorofeev, Bryan R. Hinch, Evan M. Hoke, Aditya Venkataraman
-
Patent number: 11231966Abstract: Systems and methods are disclosed for scheduling threads on an asymmetric multiprocessing system having multiple core types. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Metrics for workloads offloaded to co-processors can be tracked and integrated into metrics for the offloading thread group.Type: GrantFiled: September 28, 2018Date of Patent: January 25, 2022Assignee: Apple Inc.Inventors: John G. Dorsey, Daniel A. Chimene, Andrei Dorofeev, Bryan R. Hinch, Evan M. Hoke, Aditya Venkataraman
-
Publication number: 20210365389Abstract: Embodiments described herein provide techniques to manage drivers in a user space in a data processing system. One embodiment provides a data processing system configured perform operations, comprising discovering a hardware device communicatively coupled to the communication bus, launching a user space driver daemon, establishing an inter-process communication (IPC) link between a first proxy interface for the user space driver daemon and a second proxy interface for a server process in a kernel space, receiving, at the first proxy interface, an access right to enable access to a memory buffer in the kernel space, and relaying an access request for the memory buffer from the user space driver daemon via a third-party proxy interface to enable the user space driver daemon to access the memory buffer, the access request based on the access right.Type: ApplicationFiled: August 9, 2021Publication date: November 25, 2021Inventors: Jeremy C. ANDRUS, Joseph R. Auricchio, Russell A. BLAINE, Daniel A. CHIMENE, Simon M. DOUGLAS, Landon J. FULLER, Yevgen GORYACHOK, John K. KIM-BIGGS, Arnold S. LIU, James M. MAGEE, Daniel A. STEFFEN, Roberto G. YEPEZ
-
Publication number: 20210318909Abstract: Systems and methods are disclosed for scheduling threads on a processor that has at least two different core types, such as an asymmetric multiprocessing system. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers for the thread group. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Deferred interrupts can be used to increase performance.Type: ApplicationFiled: March 22, 2021Publication date: October 14, 2021Inventors: Jeremy C. Andrus, John G. Dorsey, James M. Magee, Daniel A. Chimene, Cyril de la Cropte de Chanterac, Bryan R. Hinch, Aditya Venkataraman, Andrei Dorofeev, Nigel R. Gamble, Russell A. Blaine, Constantin Pistol, James S. Ismail
-
Patent number: 11086800Abstract: Embodiments described herein provide techniques to manage drivers in a user space in a data processing system. One embodiment provides a data processing system configured perform operations, comprising discovering a hardware device communicatively coupled to the communication bus, launching a user space driver daemon, establishing an inter-process communication (IPC) link between a first proxy interface for the user space driver daemon and a second proxy interface for a server process in a kernel space, receiving, at the first proxy interface, an access right to enable access to a memory buffer in the kernel space, and relaying an access request for the memory buffer from the user space driver daemon via a third-party proxy interface to enable the user space driver daemon to access the memory buffer, the access request based on the access right.Type: GrantFiled: May 22, 2020Date of Patent: August 10, 2021Assignee: Apple Inc.Inventors: Jeremy C. Andrus, Joseph R. Auricchio, Russell A. Blaine, Daniel A. Chimene, Simon M. Douglas, Landon J. Fuller, Yevgen Goryachok, John K. Kim-Biggs, Arnold S. Liu, James M. Magee, Daniel A. Steffen, Roberto G. Yepez
-
Patent number: 11080095Abstract: Systems and methods are disclosed for scheduling threads on a processor that has at least two different core types, such as an asymmetric multiprocessing system. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers for the thread group. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Deferred interrupts can be used to increase performance.Type: GrantFiled: January 12, 2018Date of Patent: August 3, 2021Assignee: Apple Inc.Inventors: Jeremy C. Andrus, John G. Dorsey, James M. Magee, Daniel A. Chimene, Cyril de la Cropte de Chanterac, Bryan R. Hinch, Aditya Venkataraman, Andrei Dorofeev, Nigel R. Gamble, Russell A. Blaine, Constantin Pistol
-
Patent number: 11048562Abstract: Techniques are disclosed relating to efficiently handling execution of multiple threads to perform various actions. In some embodiments, an application instantiates a queue and a synchronization primitive. The queue maintains a set of work items to be operated on by a thread pool maintained by a kernel. The synchronization primitive controls access to the queue by a plurality of threads including threads of the thread pool. In such an embodiment, a first thread of the application enqueues a work item in the queue and issues a system call to the kernel to request that the kernel dispatch a thread of the thread pool to operate on the first work item. In various embodiments, the dispatched thread is executable to acquire the synchronization primitive, dequeue the work item, and operate on it.Type: GrantFiled: December 8, 2017Date of Patent: June 29, 2021Assignee: Apple Inc.Inventors: Daniel A. Steffen, Pierre Habouzit, Daniel A. Chimene, Jeremy C. Andrus, James M. Magee, Puja Gupta
-
Publication number: 20210157748Abstract: A turnstile OS primitive is provided that enables support for owner tracking and waiting. The turnstile primitive enables a common framework that can be adopted across multiple different types of synchronization primitives to provide a common service for priority boosting and wait queuing. A turnstile can also provide a mechanism to enable a turnstile to block on another turnstile, allowing multi-hop priority boosting within a chain of multiple blocking turnstiles.Type: ApplicationFiled: January 5, 2021Publication date: May 27, 2021Inventors: Jainam A. Shah, Jeremy C. Andrus, Daniel A. Chimene, Kushal Dalmia, Pierre Habouzit, James M. Magee, Marina Sadini, Daniel A. Steffen
-
Patent number: 10956220Abstract: Systems and methods are disclosed for scheduling threads on a processor that has at least two different core types, such as an asymmetric multiprocessing system. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers for the thread group. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Deferred interrupts can be used to increase performance.Type: GrantFiled: January 12, 2018Date of Patent: March 23, 2021Assignee: Apple Inc.Inventors: Jeremy C. Andrus, John G. Dorsey, James M. Magee, Daniel A. Chimene, Cyril de la Cropte de Chanterac, Bryan R. Hinch, Aditya Venkataraman, Andrei Dorofeev, Nigel R. Gamble, Russell A. Blaine, Constantin Pistol, James S. Ismail
-
Patent number: 10942850Abstract: A processing system can include a plurality of processing clusters. Each processing cluster can include a plurality of processor cores and a last level cache. Each processor core can include one or more dedicated caches and a plurality of counters. The plurality of counters may be configured to count different types of cache fills. The plurality of counters may be configured to count different types of cache fills, including at least one counter configured to count total cache fills and at least one counter configured to count off-cluster cache fills. Off-cluster cache fills can include at least one of cross-cluster cache fills and cache fills from system memory. The processing system can further include one or more controllers configured to control performance of one or more of the clusters, the processor cores, the fabric, and the memory responsive to cache fill metrics derived from the plurality of counters.Type: GrantFiled: July 16, 2019Date of Patent: March 9, 2021Assignee: Apple Inc.Inventors: John G. Dorsey, Aditya Venkataraman, Bryan R. Hinch, Daniel A. Chimene, Andrei Dorofeev, Constantin Pistol
-
Patent number: 10908954Abstract: In one embodiment, tasks executing on a data processing system can be associated with a Quality of Service (QoS) classification that is used to determine the priority values for multiple subsystems of the data processing system. The QoS classifications are propagated when tasks interact and the QoS classes are interpreted a multiple levels of the system to determine the priority values to set for the tasks. In one embodiment, one or more sensors coupled with the data processing system monitor a set of system conditions that are used in part to determine the priority values to set for a QoS class.Type: GrantFiled: February 27, 2017Date of Patent: February 2, 2021Assignee: Apple Inc.Inventors: Daniel A. Steffen, Matthew W. Wright, Russell A. Blaine, Daniel A. Chimene, Kevin J. Van Vechten, Thomas B. Duffy
-
Patent number: 10901920Abstract: One embodiment provides for a computer-implemented method comprising instantiating a synchronization primitive to control access to a resource, acquiring the synchronization primitive at a first thread, the first thread having a first priority, associating a turnstile with the synchronization primitive, setting an inheritor of the turnstile to the first thread, attempting to acquire the synchronization primitive at a second thread while the synchronization primitive is held by the first thread, the second thread having a second priority, adding the second thread to a wait queue of the turnstile; and in response to determining that the second priority is higher than the first priority, increasing the priority of the first thread to the second priority.Type: GrantFiled: April 10, 2019Date of Patent: January 26, 2021Assignee: Apple Inc.Inventors: Jainam A. Shah, Jeremy C. Andrus, Daniel A. Chimene, Kushal Dalmia, Pierre Habouzit, James M. Magee, Marina Sadini, Daniel A. Steffen
-
Publication number: 20210019258Abstract: A processing system can include a plurality of processing clusters. Each processing cluster can include a plurality of processor cores and a last level cache. Each processor core can include one or more dedicated caches and a plurality of counters. The plurality of counters may be configured to count different types of cache fills. The plurality of counters may be configured to count different types of cache fills, including at least one counter configured to count total cache fills and at least one counter configured to count off-cluster cache fills. Off-cluster cache fills can include at least one of cross-cluster cache fills and cache fills from system memory. The processing system can further include one or more controllers configured to control performance of one or more of the clusters, the processor cores, the fabric, and the memory responsive to cache fill metrics derived from the plurality of counters.Type: ApplicationFiled: July 16, 2019Publication date: January 21, 2021Inventors: John G. Dorsey, Aditya Venkataraman, Bryan R. Hinch, Daniel A. Chimene, Andrei Dorofeev, Constantin Pistol
-
Patent number: 10884811Abstract: Systems and methods are disclosed for scheduling threads on a processor that has at least two different core types, such as an asymmetric multiprocessing system. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers for the thread group. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Deferred interrupts can be used to increase performance.Type: GrantFiled: January 12, 2018Date of Patent: January 5, 2021Assignee: Apple Inc.Inventors: Jeremy C. Andrus, John G. Dorsey, James M. Magee, Daniel A. Chimene, Cyril de la Cropte de Chanterac, Bryan R. Hinch, Aditya Venkataraman, Andrei Dorofeev, Nigel R. Gamble, Russell A. Blaine, Constantin Pistol