Patents by Inventor Dario Suarez Gracia

Dario Suarez Gracia has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10360063
    Abstract: Various embodiments proactively balance workloads between a plurality of processing units of a multi-processor computing device by making work-stealing determinations based on operating state data. An embodiment method includes obtaining static characteristics data associated with each of a victim processor and one or more of a plurality of processing units that are ready to steal work items from the victim processor (work-ready processors), obtaining dynamic characteristics data for each of the processors, calculating priority values for each of the processors based on the obtained data, and transferring a number of work items assigned to the victim processor to a winning work-ready processor based on the calculated priority values. In some embodiments, the method may include acquiring control over a probabilistic lock for a shared data structure and updating the shared data structure to indicate the number of work items transferred to the winning work-ready processor.
    Type: Grant
    Filed: September 23, 2015
    Date of Patent: July 23, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Han Zhao, Dario Suárez Gracia, Tushar Kumar
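
The abstract for patent 10360063 describes a concrete decision loop: collect static and dynamic characteristics for the victim processor and each work-ready processor, compute a priority value per processor, and transfer work items to the winning (highest-priority) work-ready processor. A minimal Python sketch of that loop follows; the processor attributes, the weighting inside priority(), and the transfer fraction are illustrative assumptions, not the formula claimed in the patent.

```python
from dataclasses import dataclass

@dataclass
class Processor:
    name: str
    # Static characteristic (fixed per processor): peak frequency in GHz.
    peak_ghz: float
    # Dynamic characteristics (sampled at decision time).
    utilization: float      # 0.0 (idle) .. 1.0 (fully busy)
    temperature_c: float
    work_queue: list

def priority(p: Processor) -> float:
    """Toy priority: faster, cooler, and less busy processors score higher."""
    return p.peak_ghz * (1.0 - p.utilization) * max(0.0, (95.0 - p.temperature_c) / 95.0)

def steal_work(victim: Processor, work_ready: list[Processor], fraction: float = 0.5) -> None:
    """Transfer a share of the victim's queued work items to the winning work-ready processor."""
    if not victim.work_queue or not work_ready:
        return
    winner = max(work_ready, key=priority)
    n = max(1, int(len(victim.work_queue) * fraction))
    transferred, victim.work_queue = victim.work_queue[:n], victim.work_queue[n:]
    winner.work_queue.extend(transferred)
    print(f"moved {len(transferred)} items from {victim.name} to {winner.name}")

# Example: a busy big core sheds work to the best-scoring idle processor.
victim = Processor("big0", 2.8, utilization=0.95, temperature_c=80, work_queue=list(range(8)))
ready = [Processor("little0", 1.6, 0.10, 45, []), Processor("gpu0", 1.0, 0.05, 50, [])]
steal_work(victim, ready)
```
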
  • Patent number: 10114681
    Abstract: Embodiments include computing devices, systems, and methods identifying enhanced synchronization operation outcomes. A computing device may receive a first resource access request for a first resource of a computing device including a first requester identifier from a first computing element of the computing device. The computing device may also receive a second resource access request for the first resource including a second requester identifier from a second computing element of the computing device. The computing device may grant the first computing element access to the first resource based on the first resource access request, and return a response to the second computing element including the first requester identifier as a winner computing element identifier.
    Type: Grant
    Filed: March 30, 2016
    Date of Patent: October 30, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Dario Suarez Gracia, Gheorghe Cascaval, Han Zhao, Tushar Kumar, Aravind Natarajan, Arun Raman
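
The key outcome in patent 10114681's abstract is that the losing requester receives the winner's identifier rather than a bare failure. A toy Python illustration of that behavior is below; the class, the method names, and the use of a thread lock are assumptions made for the example and do not reflect the claimed hardware mechanism.

```python
import threading

class WinnerReportingLock:
    """Toy lock that, on a failed acquire, reports the identifier of the current holder
    (the 'winner'), loosely mirroring the enhanced synchronization outcome described above."""

    def __init__(self):
        self._lock = threading.Lock()
        self._holder = None

    def try_acquire(self, requester_id: str):
        """Return (granted, winner_id). On failure, winner_id names the current holder."""
        if self._lock.acquire(blocking=False):
            self._holder = requester_id
            return True, requester_id
        return False, self._holder

    def release(self, requester_id: str):
        if self._holder == requester_id:
            self._holder = None
            self._lock.release()

resource = WinnerReportingLock()
print(resource.try_acquire("elementA"))   # (True, 'elementA')  -> access granted
print(resource.try_acquire("elementB"))   # (False, 'elementA') -> loser learns who won
resource.release("elementA")
```
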
  • Patent number: 10031697
    Abstract: Methods, devices, and non-transitory processor-readable storage media for a computing device to merge concurrent writes from a plurality of processing units to a buffer associated with an application. An embodiment method executed by a processor may include identifying a plurality of concurrent requests to access the buffer that are sparse, disjoint, and write-only, configuring a write-set for each of the plurality of processing units, executing the plurality of concurrent requests to access the buffer using the write-sets, determining whether each of the plurality of concurrent requests to access the buffer is complete, obtaining a buffer index and data via the write-set of each of the plurality of processing units, and writing to the buffer using the received buffer index and data via the write-set of each of the plurality of processing units in response to determining that each of the plurality of concurrent requests to access the buffer is complete.
    Type: Grant
    Filed: January 19, 2016
    Date of Patent: July 24, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Tushar Kumar, Aravind Natarajan, Dario Suarez Gracia
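
Patent 10031697's abstract walks through a merge protocol: each processing unit records its sparse, disjoint, write-only updates in a private write-set, and the shared buffer is written only after all concurrent requests complete. A small Python simulation of that flow follows; the index assignments, data values, and use of threads are illustrative assumptions.

```python
# Each "processing unit" is simulated by a thread that records its sparse, disjoint,
# write-only updates in a private write-set instead of touching the shared buffer.
# Once every unit has finished, the write-sets are replayed into the buffer in one pass.
import threading

BUFFER_SIZE = 16
buffer = [0] * BUFFER_SIZE

def produce(unit_id: int, indices: list[int], write_set: dict[int, int]) -> None:
    """Record (buffer index -> value) pairs privately; no shared-buffer access here."""
    for i in indices:
        write_set[i] = unit_id * 100 + i

# Disjoint index sets, one per simulated processing unit.
assignments = {1: [0, 3, 7], 2: [1, 5, 9], 3: [2, 11, 15]}
write_sets = {uid: {} for uid in assignments}

threads = [threading.Thread(target=produce, args=(uid, idx, write_sets[uid]))
           for uid, idx in assignments.items()]
for t in threads: t.start()
for t in threads: t.join()          # all concurrent requests are complete

# Merge step: apply every write-set to the buffer.
for ws in write_sets.values():
    for index, value in ws.items():
        buffer[index] = value
print(buffer)
```
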
  • Publication number: 20170286182
    Abstract: Embodiments include computing devices, systems, and methods identifying enhanced synchronization operation outcomes. A computing device may receive a first resource access request for a first resource of a computing device including a first requester identifier from a first computing element of the computing device. The computing device may also receive a second resource access request for the first resource including a second requester identifier from a second computing element of the computing device. The computing device may grant the first computing element access to the first resource based on the first resource access request, and return a response to the second computing element including the first requester identifier as a winner computing element identifier.
    Type: Application
    Filed: March 30, 2016
    Publication date: October 5, 2017
    Inventors: Dario Suarez Gracia, Gheorghe Cascaval, Han Zhao, Tushar Kumar, Aravind Natarajan, Arun Raman
  • Patent number: 9740504
    Abstract: Aspects include apparatuses, systems, and methods for hardware acceleration for inline caches in dynamic languages. An inline cache may be initialized for an instance of a dynamic software operation. A call of an initialized instance of the dynamic software operation may be executed by an inline cache hardware accelerator. The inline cache may be checked to determine that its data is current. When the data is current, the initialized instance of the dynamic software operation may be executed using the related inline cache data. When the data is not current, a new inline cache may be initialized for the instance of the dynamic software operation, including the not current data of a previously initialized instance of the dynamic software operation. The inline cache hardware accelerator may include an inline cache memory, a coprocessor, and/or a functional unit of an inline cache pipeline connected to a processor pipeline.
    Type: Grant
    Filed: April 28, 2014
    Date of Patent: August 22, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Behnam Robatmili, Gheorghe Calin Cascaval, Madhukar Nagaraja Kedlaya, Dario Suarez Gracia
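
Patent 9740504 covers hardware support for inline caches. As a software-only analogy, the sketch below caches a dynamic property lookup keyed by an object's layout, reuses it while the cached data is current, and re-initializes it when it is not. The shape encoding and class names are assumptions for illustration, not the patented accelerator design.

```python
# Software analogy for the inline-cache idea: cache the result of a dynamic property
# lookup keyed by the object's "shape", check that the cached shape is still current on
# each call, and re-initialize the cache when it is not. The patented accelerator performs
# this check/dispatch in hardware; this example is Python-only.
class InlineCacheSite:
    def __init__(self, attr: str):
        self.attr = attr
        self.cached_shape = None      # shape the cache was initialized for
        self.cached_getter = None

    def load(self, obj):
        shape = tuple(sorted(vars(obj)))          # stand-in for an object layout/shape id
        if shape == self.cached_shape:            # cache data is current: fast path
            return self.cached_getter(obj)
        # Cache is stale or uninitialized: re-initialize for the new shape.
        self.cached_shape = shape
        self.cached_getter = lambda o, a=self.attr: getattr(o, a)
        return self.cached_getter(obj)

class Point:  # dynamic objects whose layout can vary per instance
    def __init__(self, x, y): self.x, self.y = x, y

site = InlineCacheSite("x")
print(site.load(Point(1, 2)))   # initializes the inline cache, returns 1
print(site.load(Point(3, 4)))   # same shape, served through the cached fast path
```
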
  • Patent number: 9733978
    Abstract: Various embodiments include methods for data management in a computing device utilizing a plurality of processing units. Embodiment methods may include generating a data transfer heuristic model based on measurements from a plurality of sample data transfers between a plurality of data storage units. The generated data transfer heuristic model may be used to calculate data transfer costs for each of a plurality of tasks. The calculated data transfer costs may be used to schedule execution of the plurality of tasks in an execution order on selected ones of the plurality of processing units. The data transfer heuristic model may be updated based on measurements of data transfers occurring during the executions of the plurality of tasks (e.g., time, power consumption, etc.). Code executing on the processing units may indicate to a runtime when certain data blocks are no longer needed and thus may be evicted and/or pre-fetched for others.
    Type: Grant
    Filed: August 27, 2015
    Date of Patent: August 15, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Dario Suarez Gracia, Tushar Kumar, Aravind Natarajan, Ravish Hastantram, Gheorghe Calin Cascaval, Han Zhao
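
Patent 9733978's abstract describes fitting a data transfer heuristic model from sample transfers and using it to price each task's data movement before scheduling. The Python sketch below fits a toy linear latency-plus-bandwidth model by least squares and orders tasks by predicted transfer cost; the model form, sample numbers, and cheapest-first ordering are assumptions, not the patented heuristic.

```python
# Fit time ~= latency + bytes / bandwidth from sample transfers, then use the model to
# price each task's data movement and pick an execution order. All numbers are invented.
samples = [  # (bytes transferred, measured milliseconds) between two storage units
    (1_000, 0.6), (10_000, 1.5), (100_000, 10.4), (1_000_000, 101.0),
]

# Least-squares fit of the linear model.
n = len(samples)
sx = sum(b for b, _ in samples); sy = sum(t for _, t in samples)
sxx = sum(b * b for b, _ in samples); sxy = sum(b * t for b, t in samples)
slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # ms per byte (inverse bandwidth)
latency = (sy - slope * sx) / n                     # fixed ms per transfer

def transfer_cost(bytes_needed: int) -> float:
    return latency + slope * bytes_needed

tasks = [("decode", 50_000), ("filter", 400_000), ("reduce", 2_000)]
# Schedule: cheapest predicted transfer first, as a simple stand-in for cost-aware ordering.
for name, nbytes in sorted(tasks, key=lambda t: transfer_cost(t[1])):
    print(f"{name}: predicted transfer cost {transfer_cost(nbytes):.2f} ms")
```
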
  • Publication number: 20170206035
    Abstract: Methods, devices, and non-transitory processor-readable storage media for a computing device to merge concurrent writes from a plurality of processing units to a buffer associated with an application. An embodiment method executed by a processor may include identifying a plurality of concurrent requests to access the buffer that are sparse, disjoint, and write-only, configuring a write-set for each of the plurality of processing units, executing the plurality of concurrent requests to access the buffer using the write-sets, determining whether each of the plurality of concurrent requests to access the buffer is complete, obtaining a buffer index and data via the write-set of each of the plurality of processing units, and writing to the buffer using the received buffer index and data via the write-set of each of the plurality of processing units in response to determining that each of the plurality of concurrent requests to access the buffer is complete.
    Type: Application
    Filed: January 19, 2016
    Publication date: July 20, 2017
    Inventors: Tushar Kumar, Aravind Natarajan, Dario Suarez Gracia
  • Patent number: 9710388
    Abstract: Aspects include computing devices, systems, and methods for hardware acceleration for inline caches in dynamic languages. An inline cache may be initialized for an instance of a dynamic software operation. A call of an initialized instance of the dynamic software operation may be executed by an inline cache hardware accelerator. The inline cache may be checked to determine that its data is current. When the data is current, the initialized instance of the dynamic software operation may be executed using the related inline cache data. When the data is not current, a new inline cache may be initialized for the instance of the dynamic software operation, including the not current data of a previously initialized instance of the dynamic software operation. The inline cache hardware accelerator may include an inline cache memory, a coprocessor, and/or a functional unit of an inline cache pipeline connected to a processor pipeline.
    Type: Grant
    Filed: April 28, 2014
    Date of Patent: July 18, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Behnam Robatmili, Gheorghe Calin Cascaval, Madhukar Nagaraja Kedlaya, Dario Suarez Gracia
  • Patent number: 9632569
    Abstract: Multi-processor computing device methods manage resource accesses by a signaling event manager signaling processor elements requesting access to a resource to wake up to access the resource when the resource is available or wait for an event when the resource is busy. Processor elements may enter a sleep state while awaiting access to the requested resource. When multiple elements are waiting for the resource, the processor element with a highest assigned priority is signaled to wake up when the resource is available without waking other elements. Priorities may be assigned to processor elements waiting for the resource based on a heuristic or parameter that may depend on a state of the computing device or the processor elements. A sleep duration may be estimated for a processor element waiting for a resource and the processor element may be removed from a scheduling queue or assigned another thread during the sleep duration.
    Type: Grant
    Filed: August 5, 2014
    Date of Patent: April 25, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Dario Suarez Gracia, Han Zhao, Pablo Montesinos Ortego, Gheorghe Calin Cascaval, James Xenidis
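
Patent 9632569's abstract centers on waking only the highest-priority waiter when a busy resource becomes free. The Python sketch below captures that policy with a priority heap of waiters; real processor elements would sleep on an event managed in hardware or by the OS, and all identifiers and priorities here are invented for the example.

```python
# Wake-one-highest-priority-waiter policy, using a heap of (negated priority, waiter id)
# entries. "Sleeping" and "waking" are modeled as pushing and popping heap entries.
import heapq

class SignalingEventManager:
    def __init__(self):
        self.busy = False
        self._waiters = []          # max-heap via negated priority

    def request(self, element_id: str, priority: int) -> bool:
        """Grant the resource if free; otherwise park the element as a waiter."""
        if not self.busy:
            self.busy = True
            return True
        heapq.heappush(self._waiters, (-priority, element_id))
        return False                # caller would enter a sleep state here

    def release(self):
        """Free the resource and wake only the highest-priority waiter, if any."""
        self.busy = False
        if self._waiters:
            _, element_id = heapq.heappop(self._waiters)
            self.busy = True        # resource handed directly to the woken element
            return element_id
        return None

mgr = SignalingEventManager()
print(mgr.request("core0", priority=1))   # True: resource was free
print(mgr.request("core1", priority=5))   # False: sleeps as a waiter
print(mgr.request("dsp0", priority=3))    # False: sleeps as a waiter
print(mgr.release())                      # 'core1' is the only element woken
```
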
  • Publication number: 20170083827
    Abstract: Embodiments include computing devices, apparatus, and methods implemented by the apparatus for accelerating machine learning on a computing device. Raw data may be received in the computing device from a raw data source device. The apparatus may identify key features as two dimensional matrices of the raw data such that the key features are mutually exclusive from each other. The key features may be translated into key feature vectors. The computing device may generate a feature vector from at least one of the key feature vectors. The computing device may receive a first partial output resulting from an execution of a basic linear algebra subprogram (BLAS) operation using the feature vector and a weight factor. The first partial output may be combined with a plurality of partial outputs to produce an output matrix. Receiving the raw data on the computing device may include receiving streaming raw data.
    Type: Application
    Filed: September 23, 2015
    Publication date: March 23, 2017
    Inventors: Behnam Robatmili, Matthew Leslie Badin, Dario Suárez Gracia, Gheorghe Calin Cascaval, Nayeem Islam
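
Publication 20170083827 describes extracting mutually exclusive two-dimensional key features from raw data, flattening them into key feature vectors, and combining partial outputs of a BLAS operation against a weight factor. A NumPy sketch of that data flow is below; the 4x4 tiling, the shapes, and the random weights are assumptions chosen only to make the pipeline concrete.

```python
# Slice raw data into non-overlapping 2-D key features, flatten them into key feature
# vectors, run a BLAS-style matrix-vector product (GEMV) against a weight matrix, and
# combine the partial outputs into an output matrix.
import numpy as np

rng = np.random.default_rng(0)
raw = rng.standard_normal((8, 8))             # stand-in for streamed raw data

# Key features: mutually exclusive 4x4 tiles of the raw data.
tiles = [raw[r:r+4, c:c+4] for r in (0, 4) for c in (0, 4)]
key_feature_vectors = [t.reshape(-1) for t in tiles]      # each tile flattened to length 16

weights = rng.standard_normal((3, 16))        # weight factor: 16 inputs -> 3 outputs

# Each GEMV produces a partial output; combining them yields the output matrix.
partial_outputs = [weights @ v for v in key_feature_vectors]
output = np.column_stack(partial_outputs)     # shape (3, number of key features)
print(output.shape)                           # (3, 4)
```
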
  • Publication number: 20170083364
    Abstract: Various embodiments proactively balance workloads between a plurality of processing units of a multi-processor computing device by making work-stealing determinations based on operating state data. An embodiment method includes obtaining static characteristics data associated with each of a victim processor and one or more of a plurality of processing units that are ready to steal work items from the victim processor (work-ready processors), obtaining dynamic characteristics data for each of the processors, calculating priority values for each of the processors based on the obtained data, and transferring a number of work items assigned to the victim processor to a winning work-ready processor based on the calculated priority values. In some embodiments, the method may include acquiring control over a probabilistic lock for a shared data structure and updating the shared data structure to indicate the number of work items transferred to the winning work-ready processor.
    Type: Application
    Filed: September 23, 2015
    Publication date: March 23, 2017
    Inventors: Han Zhao, Dario Suárez Gracia, Tushar Kumar
  • Publication number: 20170060633
    Abstract: Various embodiments include methods for data management in a computing device utilizing a plurality of processing units. Embodiment methods may include generating a data transfer heuristic model based on measurements from a plurality of sample data transfers between a plurality of data storage units. The generated data transfer heuristic model may be used to calculate data transfer costs for each of a plurality of tasks. The calculated data transfer costs may be used to schedule execution of the plurality of tasks in an execution order on selected ones of the plurality of processing units. The data transfer heuristic model may be updated based on measurements of data transfers occurring during the executions of the plurality of tasks (e.g., time, power consumption, etc.). Code executing on the processing units may indicate to a runtime when certain data blocks are no longer needed and thus may be evicted and/or pre-fetched for others.
    Type: Application
    Filed: August 27, 2015
    Publication date: March 2, 2017
    Inventors: Dario Suarez Gracia, Tushar Kumar, Aravind Natarajan, Ravish Hastantram, Gheorghe Calin Cascaval, Han Zhao
  • Patent number: 9501328
    Abstract: Embodiments include computing devices, systems, and methods for task-based handling of repetitive processes in parallel. At least one processor of the computing device, or a specialized hardware controller, may be configured to partition iterations of a repetitive process and assign the partitions to initialized tasks to be executed in parallel by a plurality of processor cores. Upon completing a task, remaining divisible partitions of the repetitive process of ongoing tasks may be subpartitioned and assigned to the ongoing task, and the completed task or a newly initialized task. Information about the iteration space for a repetitive process may be stored in a descriptor table, and status information for all partitions of a repetitive process stored in a status table. Each processor core may have an associated local table that tracks iteration execution of each task, and is synchronized with the status table.
    Type: Grant
    Filed: March 30, 2015
    Date of Patent: November 22, 2016
    Assignee: QUALCOMM Incorporated
    Inventors: Behnam Robatmili, Shaizeen Dilawarhusen Aga, Dario Suarez Gracia, Arun Raman, Aravind Natarajan, Gheorghe Calin Cascaval, Pablo Montesinos Ortego, Han Zhao
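
Patent 9501328's abstract describes subpartitioning the remaining iterations of an ongoing task when another task completes, so the idle task can pick up work. The sequential Python sketch below mimics that repartitioning step; the partition table, split rule, and simulated execution are stand-ins for the descriptor and status tables described in the abstract.

```python
# Iterations of a repetitive process are split into per-task partitions; when one task
# finishes, the largest remaining partition is subpartitioned and half is handed to the
# idle task. Ranges and the execution budget are illustrative.
def split_range(lo: int, hi: int) -> list[tuple[int, int]]:
    mid = (lo + hi) // 2
    return [(lo, mid), (mid, hi)]

partitions = {0: (0, 50), 1: (50, 100)}      # task id -> [start, end) of iterations

def run_some(task_id: int, budget: int) -> None:
    """Pretend to execute up to `budget` iterations of this task's partition."""
    lo, hi = partitions[task_id]
    partitions[task_id] = (min(lo + budget, hi), hi)

run_some(0, 50)                               # task 0 drains its partition and goes idle
if partitions[0][0] >= partitions[0][1]:
    # Find the ongoing task with the most remaining work and subpartition it.
    busy = max(partitions, key=lambda t: partitions[t][1] - partitions[t][0])
    keep, give = split_range(*partitions[busy])
    partitions[busy], partitions[0] = keep, give
print(partitions)                             # task 0 picked up half of task 1's remainder
```
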
  • Publication number: 20160292012
    Abstract: Embodiments include computing devices, systems, and methods for task-based handling of repetitive processes in parallel. At least one processor of the computing device, or a specialized hardware controller, may be configured to partition iterations of a repetitive process and assign the partitions to initialized tasks to be executed in parallel by a plurality of processor cores. Upon completing a task, remaining divisible partitions of the repetitive process of ongoing tasks may be subpartitioned and assigned to the ongoing task, and the completed task or a newly initialized task. Information about the iteration space for a repetitive process may be stored in a descriptor table, and status information for all partitions of a repetitive process stored in a status table. Each processor core may have an associated local table that tracks iteration execution of each task, and is synchronized with the status table.
    Type: Application
    Filed: March 30, 2015
    Publication date: October 6, 2016
    Inventors: Behnam ROBATMILI, Shaizeen Dilawarhusen Aga, Dario Suarez Gracia, Arun Raman, Aravind Natarajan, Gheorghe Calin Cascaval, Pablo Montesinos Ortego, Han Zhao
  • Publication number: 20160216969
    Abstract: Systems and methods for adaptively managing registers in an instruction processor are disclosed. The system identifies one or more registers with inoperable cells. An operand manager identifies a set of operable cells within the one or more registers with inoperable cells and determines if a present instruction will use an operand that can be supported by the set of operable cells. When the set of operable cells can support the operand, the operand manager generates an assignment which is communicated to a register file manager.
    Type: Application
    Filed: January 28, 2015
    Publication date: July 28, 2016
    Inventors: Dario Suarez Gracia, Behnam Robatmili
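
Publication 20160216969's operand manager asks whether an instruction's operand can be supported by the operable cells of a partially faulty register before generating an assignment. The Python sketch below makes that check concrete under simplifying assumptions (a 32-bit register, a fault bitmask, and a contiguous-low-bits rule), none of which are taken from the filing.

```python
# A register with some inoperable bit cells can still hold an operand if the operand's
# significant bits fit within the register's contiguous operable low bits.
REGISTER_WIDTH = 32

def operable_low_bits(fault_mask: int) -> int:
    """Count contiguous usable bits starting at bit 0 (a 1 in fault_mask = broken cell)."""
    for bit in range(REGISTER_WIDTH):
        if fault_mask & (1 << bit):
            return bit
    return REGISTER_WIDTH

def assign_register(operand: int, registers: dict[str, int]):
    """Return the name of a (possibly degraded) register that can hold the operand."""
    needed = max(1, operand.bit_length())
    for name, fault_mask in registers.items():
        if operable_low_bits(fault_mask) >= needed:
            return name              # assignment handed to the register file manager
    return None

registers = {"r7": 0xFFFF_0000,      # upper 16 cells inoperable, lower 16 usable
             "r4": 0x0000_0000}      # fully operable
print(assign_register(0x1234, registers))        # fits in 16 bits -> degraded r7 suffices
print(assign_register(0x0012_3456, registers))   # needs >16 bits -> only r4 qualifies
```
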
  • Publication number: 20160103612
    Abstract: Aspects include computing devices, systems, and methods for implementing monitoring of communications between components and a memory hierarchy of a computing device. The computing device may determine at least one identifying factor for identifying execution of the processor-executable code. A communication between the components and the memory hierarchy of the computing device may be monitored for at least one communication factor of a same type as the at least one identifying factor. A determination whether a value of the at least one identifying factor matches a value of the at least one communication factor may be made. The computing device may determine that the processor-executable code is executed in response to determining that the value of the at least one identifying factor matches the value of the at least one communication factor.
    Type: Application
    Filed: October 12, 2014
    Publication date: April 14, 2016
    Inventors: Mihai Christodorescu, Mastooreh Salajegheh, Dario Suarez Gracia
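
Publication 20160103612's abstract reduces to a matching step: derive an identifying factor for the code of interest, monitor memory-hierarchy traffic for a communication factor of the same type, and declare the code executed when the values match. The Python sketch below uses a hash over a characteristic address pattern as the factor; that choice, and the transaction format, are assumptions for illustration.

```python
# Derive an identifying factor for a piece of code (here, a digest of addresses it
# touches), watch a stream of memory transactions for factors of the same type, and
# report execution when the values match.
import hashlib

def factor_from_addresses(addresses: list[int]) -> str:
    """Identifying factor: a digest over the code's characteristic access pattern."""
    return hashlib.sha256(b"".join(a.to_bytes(8, "little") for a in addresses)).hexdigest()

# Factor derived ahead of time for the code we want to detect.
identifying_factor = factor_from_addresses([0x1000, 0x1008, 0x2000])

def monitor(transactions: list[list[int]]) -> bool:
    """Return True if any observed burst of memory traffic matches the identifying factor."""
    return any(factor_from_addresses(burst) == identifying_factor for burst in transactions)

observed = [[0x4000, 0x4010], [0x1000, 0x1008, 0x2000]]   # second burst matches
print(monitor(observed))   # True -> the monitored code is deemed to have executed
```
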
  • Publication number: 20160041852
    Abstract: Multi-processor computing device methods manage resource accesses by a signaling event manager signaling processor elements requesting access to a resource to wake up to access the resource when the resource is available or wait for an event when the resource is busy. Processor elements may enter a sleep state while awaiting access to the requested resource. When multiple elements are waiting for the resource, the processor element with a highest assigned priority is signaled to wake up when the resource is available without waking other elements. Priorities may be assigned to processor elements waiting for the resource based on a heuristic or parameter that may depend on a state of the computing device or the processor elements. A sleep duration may be estimated for a processor element waiting for a resource and the processor element may be removed from a scheduling queue or assigned another thread during the sleep duration.
    Type: Application
    Filed: August 5, 2014
    Publication date: February 11, 2016
    Inventors: Dario Suarez Gracia, Han Zhao, Pablo Montesinos Ortego, Gheorghe Calin Cascaval, James Xenidis
  • Publication number: 20150358810
    Abstract: Methods, non-transitory processor-readable storage media, devices, and systems for improving user experience, energy consumption, and performance of a mobile device by automatically configuring applications.
    Type: Application
    Filed: June 10, 2014
    Publication date: December 10, 2015
    Inventors: Hui Chao, Dario Suarez Gracia, Gheorghe Calin Cascaval
  • Publication number: 20150205720
    Abstract: Aspects include apparatuses, systems, and methods for hardware acceleration for inline caches in dynamic languages. An inline cache may be initialized for an instance of a dynamic software operation. A call of an initialized instance of the dynamic software operation may be executed by an inline cache hardware accelerator. The inline cache may be checked to determine that its data is current. When the data is current, the initialized instance of the dynamic software operation may be executed using the related inline cache data. When the data is not current, a new inline cache may be initialized for the instance of the dynamic software operation, including the not current data of a previously initialized instance of the dynamic software operation. The inline cache hardware accelerator may include an inline cache memory, a coprocessor, and/or a functional unit of an inline cache pipeline connected to a processor pipeline.
    Type: Application
    Filed: April 28, 2014
    Publication date: July 23, 2015
    Applicant: QUALCOMM Incorporated
    Inventors: Behnam Robatmili, Gheorghe Calin Cascaval, Madhukar Nagaraja Kedlaya, Dario Suarez Gracia
  • Publication number: 20150205726
    Abstract: Aspects include computing devices, systems, and methods for hardware acceleration for inline caches in dynamic languages. An inline cache may be initialized for an instance of a dynamic software operation. A call of an initialized instance of the dynamic software operation may be executed by an inline cache hardware accelerator. The inline cache may be checked to determine that its data is current. When the data is current, the initialized instance of the dynamic software operation may be executed using the related inline cache data. When the data is not current, a new inline cache may be initialized for the instance of the dynamic software operation, including the not current data of a previously initialized instance of the dynamic software operation. The inline cache hardware accelerator may include an inline cache memory, a coprocessor, and/or a functional unit of an inline cache pipeline connected to a processor pipeline.
    Type: Application
    Filed: April 28, 2014
    Publication date: July 23, 2015
    Applicant: QUALCOMM Incorporated
    Inventors: Behnam Robatmili, Gheorghe Calin Cascaval, Madhukar Nagaraja Kedlaya, Dario Suarez Gracia