Patents Examined by Jyoti Mehta
  • Patent number: 10908903
    Abstract: A system and method of implementing a wait state for a plurality of threads executing on a computer processor core of the processor. The processor is configured to execute instruction streams by the plurality of threads, wherein the plurality of threads includes a first thread and a set of remaining threads and determine that the first thread has entered a first wait state loop. The processor is also configured to determine that any of the set of remaining threads has not entered a corresponding wait state loop and remain by the first thread in the first wait state loop until each of the set of remaining threads has entered the corresponding wait state loop.
    Type: Grant
    Filed: September 27, 2017
    Date of Patent: February 2, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jonathan D. Bradbury, Fadi Y. Busaba, Mark S. Farrell, Charles W. Gainey, Jr., Dan F. Greiner, Lisa C. Heller, Jeffrey P. Kubala, Damian L. Osisek, Donald W. Schmidt, Timothy J. Slegel
  • Patent number: 10891136
    Abstract: A system to support data gathering for a machine learning (ML) operation comprises a memory unit configured to maintain data for the ML operation in a plurality of memory blocks each accessible via a memory address. The system further comprises an inference engine comprising a plurality of processing tiles each comprising one or more of an on-chip memory (OCM) configured to load and maintain data for local access by components in the processing tile. The system also comprises a core configured to program components of the processing tiles of the inference engine according to an instruction set architecture (ISA) and a data streaming engine configured to stream data between the memory unit and the OCMs of the processing tiles of the inference engine wherein data streaming engine is configured to perform a data gathering operation via a single data gathering instruction of the ISA at the same time.
    Type: Grant
    Filed: May 22, 2019
    Date of Patent: January 12, 2021
    Assignee: Marvell Asia Pte, Ltd.
    Inventor: Avinash Sodani
  • Patent number: 10885232
    Abstract: A computer-implemented method designs and manufactures a supporting structure for the packaging of a solid object. The supporting structure comprises a plurality of linear support elements that sustain the solid object inside a packaging container. The method includes: a) providing, as an input, a three-dimensional model of the solid object; b) computing a first cumulative linear mass density distribution of the solid object according to a first axis (x); and c) using said first cumulative linear mass density distribution to determine the positions, along said first axis, of linear support elements oriented transversally to said first axis. The resultant positions enable even distribution of the weight of the solid object among the linear support elements. A computer program product, computer-readable data-storage medium, and CAD system carry out such a method.
    Type: Grant
    Filed: June 7, 2018
    Date of Patent: January 5, 2021
    Assignee: Dassault Systemes
    Inventors: Pierre Pagliughi, Pascal Hebrard, Rohan Keswani, Patrick Merlat
  • Patent number: 10884744
    Abstract: Loop vectorization methods and apparatus are disclosed. An example method includes generating a first control mask for a set of iterations of a loop by evaluating a condition of the loop, wherein generating the first control mask includes setting a bit of the control mask to a first value when the condition indicates that an operation of the loop is to be executed, and setting the bit of the first control mask to a second value when the condition indicates that the operation of the loop is to be bypassed. The example method also includes compressing indexes corresponding to the first set of iterations of the loop according to the first control mask.
    Type: Grant
    Filed: August 18, 2017
    Date of Patent: January 5, 2021
    Assignee: Intel Corporation
    Inventors: Mikhail Plotnikov, Andrey Naraikin, Christopher J. Hughes
  • Patent number: 10877777
    Abstract: Systems and methods of enabling virtual calls in a single instruction multiple data (SIMD) environment may involve detecting a virtual call of a function and using a single dispatch of the function to invoke the virtual call for two or more channels of the virtual call. In one example, it is determined that the two or more channels share a common target address and a single dispatch of the function is conducted with respect to the common target address. The process may be iterated for additional channels of the virtual call that share a common target address.
    Type: Grant
    Filed: October 7, 2015
    Date of Patent: December 29, 2020
    Assignee: Intel Corporation
    Inventors: Wei-Yu Chen, Guei-Yuan Lueh, Subramaniam Maiyuran
  • Patent number: 10872058
    Abstract: Provided is a computer-implemented apparatus for processing data, having a digital chip having at least one part that is reconfigurable by a number N of configuration descriptions, with N?1, a determined configuration description from the number N for reconfiguring the reconfigurable part, and a providing unit for providing an identifier specific to the determined configuration description by using a number A of derivation parameters comprising the determined configuration description, with A?1, is proposed, wherein the part reconfigured with the determined configuration description) is set up to perform a cryptographic function on determined data by using the provided specific identifier to generate cryptographically processed data. This allows security-relevant functions to be implemented as configuration descriptions. This has the advantage that the security when processing data in digital chips is increased.
    Type: Grant
    Filed: January 13, 2020
    Date of Patent: December 22, 2020
    Inventors: Rainer Falk, Christian Peter Feist
  • Patent number: 10866805
    Abstract: An apparatus comprises processing circuitry to perform data processing and instruction decoding circuitry to decode instructions to control the processing circuitry to perform the data processing. The instruction decoding circuitry is responsive to a speculation barrier instruction to control the processing circuitry to prevent a subsequent operation, appearing in program order after the speculation barrier instruction, that has an address dependency on an earlier instruction preceding the speculation barrier instruction in the program order, from speculatively influencing allocations of entries in a cache. This provides protection against speculative cache-timing side-channel attacks.
    Type: Grant
    Filed: December 4, 2018
    Date of Patent: December 15, 2020
    Assignee: Arm Limited
    Inventors: Richard Roy Grisenthwaite, Giacomo Gabrielli, Matthew James Horsnell
  • Patent number: 10838728
    Abstract: Supplemental instruction dispatch may be used in some instances in a parallel slice processor to dispatch additional instructions, referred to as supplemental instructions, to supplemental instruction ports of execution slices and using primary instruction ports of one or more execution slices to supply one or more source operands for such supplemental instructions. In addition, in some instances, in lieu of or in addition to supplemental instruction dispatch, selective slice partitioning may be used to selectively partition groups of execution slices in a parallel slice processor based upon a threading mode within which such execution slices are executing.
    Type: Grant
    Filed: September 24, 2018
    Date of Patent: November 17, 2020
    Assignee: International Business Machines Corporation
    Inventors: Kurt A. Feiste, Christopher M. Mueller, Dung Q. Nguyen, Eula A. Tolentino, Tien T. Tran, Jing Zhang
  • Patent number: 10838720
    Abstract: Systems, methods, and apparatuses relating to multiple source blend operations are described. In one embodiment, a processor is to execute an instruction to: receive a first input operand of a first input vector, a second input operand of a second input vector, and a third input operand of a third input vector, compare each element from the first input vector to each corresponding element of the second input vector to produce a first comparison vector, compare each element from the first input vector to each corresponding element of the third input vector to produce a second comparison vector, compare each element from the second input vector to each corresponding element of the third input vector to produce a third comparison vector, determine a middle value for each element position of the input vectors from the comparison vectors, and output the middle values into same element positions in an output vector.
    Type: Grant
    Filed: September 23, 2016
    Date of Patent: November 17, 2020
    Assignee: Intel Corporation
    Inventors: Mikhail Plotnikov, Igor Ermolaev
  • Patent number: 10824585
    Abstract: An array processor includes a managing element having a load streaming unit coupled to multiple processing elements. The load streaming unit provides input data portions to each of a first subset of the processing elements and also receives output data from each of a second subset of the processing elements based on a comparatively sorted combination of the input data portions provided to the first subset of processing elements. Furthermore, each of processing elements is configurable by the managing element to compare input data portions received from either the load streaming unit or two or more of the other processing elements, wherein the input data portions are stored for processing in respective queues. Each processing unit is further configurable to select an input data portion to be output data based on the comparison, and in response to selecting the input data portion, remove a queue entry corresponding to the selected input data portion.
    Type: Grant
    Filed: September 14, 2018
    Date of Patent: November 3, 2020
    Assignee: International Business Machines Corporation
    Inventors: Ganesh Balakrishnan, Bartholomew Blaner, John J. Reilly, Jeffrey A. Stuecheli
  • Patent number: 10817291
    Abstract: Systems, methods, and apparatuses relating to swizzle operations and disable operations in a configurable spatial accelerator (CSA) are described. Certain embodiments herein provide for an encoding system for a specific set of swizzle primitives across a plurality of packed data elements in a CSA.
    Type: Grant
    Filed: March 30, 2019
    Date of Patent: October 27, 2020
    Assignee: Intel Corporation
    Inventors: Jesus Corbal, Rohan Sharma, Simon Steely, Jr., Chinmay Ashok, Kent D. Glossop, Dennis Bradford, Paul Caprioli, Louise Huot, Kermin ChoFleming, Barry Tannenbaum
  • Patent number: 10789202
    Abstract: A method is described. The method includes configuring a first instance of object code to execute on a processor. The processor has multiple cores and an internal network. The internal network is configured in a first configuration that enables a first number of the cores to be communicatively coupled. The method also includes configuring a second instance of the object code to execute on a second instance of the processor. A respective internal network of the second instance of the processor is configured in a second configuration that enables a different number of cores to be communicatively coupled, wherein, same positioned cores on the processor and the second instance of the processor have same network addresses for the first and second configurations. A processor is also described having an internal network designed to enable the above method.
    Type: Grant
    Filed: May 12, 2017
    Date of Patent: September 29, 2020
    Assignee: Google LLC
    Inventors: Jason Redgrave, Albert Meixner, Ji Kim, Ofer Shacham
  • Patent number: 10789069
    Abstract: Dynamically selecting a version of an instruction to be executed. Based on processing, a version of an instruction to be executed is selected. The selecting chooses the version from a plurality of versions of instructions. The plurality of versions of instructions including an architected version and another version different from the architected version. The version of the instruction selected for execution is executed.
    Type: Grant
    Filed: March 3, 2017
    Date of Patent: September 29, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Michael K. Gschwind
  • Patent number: 10776125
    Abstract: In an embodiment, at least one CPU processor and at least one coprocessor are included in a system. The CPU processor may issue operations to the coprocessor to perform, including load/store operations. The CPU processor may generate the addresses that are accessed by the coprocessor load/store operations, as well as executing its own CPU load/store operations. The CPU processor may include a memory ordering table configured to track at least one memory region within which there are outstanding coprocessor load/store memory operations that have not yet completed. The CPU processor may delay CPU load/store operations until the outstanding coprocessor load/store operations are complete. In this fashion, the proper ordering of CPU load/store operations and coprocessor load/store operations may be maintained.
    Type: Grant
    Filed: December 5, 2018
    Date of Patent: September 15, 2020
    Assignee: Apple Inc.
    Inventors: Aditya Kesiraju, Brett S. Feero, Nikhil Gupta
  • Patent number: 10776123
    Abstract: Systems, apparatuses, and methods for performing efficient processor pipeline flush recovery are disclosed. A processor core includes a retire queue for storing information of outstanding instructions. When the retire queue logic detects that a pipeline flush condition occurs, the logic creates one or more groups of entries in the retire queue. The logic begins the groups with an entry storing information for a youngest outstanding instruction, and creates other groups in a contiguous manner after creating this first group. The logic marks with a first indication a given group when the given group includes one or more instructions of a given type. The logic marks with a second indication the given group when the given group does not include an instruction of the given type. The logic sends to flush recovery logic information of one or more entries in only groups marked with the first indication.
    Type: Grant
    Filed: December 3, 2018
    Date of Patent: September 15, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Erik D. Swanson, Michael Estlick, Sneha V. Desai
  • Patent number: 10747533
    Abstract: An instruction defined to be a looping instruction is obtained and processed. A determination is made as to whether an obtained selected character is an expected selected character. Based on the obtained selected character being the expected selected character, an execution process is used that includes a sequence of operations to perform an operation, the sequence of operations replacing a loop and providing a non-looping sequence to perform the operation on up to a defined number of units of data. The sequence of operations is configured to repeat one or more times and to terminate based on the obtained selected character. Based on the obtained selected character being different than the expected selected character, an alternate execution process is chosen.
    Type: Grant
    Filed: May 2, 2019
    Date of Patent: August 18, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Michael K. Gschwind
  • Patent number: 10747532
    Abstract: An instruction defined to be a looping instruction is obtained and processed. A determination is made as to whether an obtained selected character is an expected selected character. Based on the obtained selected character being the expected selected character, an execution process is used that includes a sequence of operations to perform an operation, the sequence of operations replacing a loop and providing a non-looping sequence to perform the operation on up to a defined number of units of data. The sequence of operations is configured to repeat one or more times and to terminate based on the obtained selected character. Based on the obtained selected character being different than the expected selected character, an alternate execution process is chosen.
    Type: Grant
    Filed: May 2, 2019
    Date of Patent: August 18, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Michael K. Gschwind
  • Patent number: 10725780
    Abstract: Machine instructions, referred to herein as a long Convert from Zoned instruction (CDZT) and extended Convert from Zoned instruction (CXZT), are provided that read EBCDIC or ASCII data from memory, convert it to the appropriate decimal floating point format, and write it to a target floating point register or floating point register pair. Further, machine instructions, referred to herein as a long Convert to Zoned instruction (CZDT) and extended Convert to Zoned instruction (CZXT), are provided that convert a decimal floating point (DFP) operand in a source floating point register or floating point register pair to EBCDIC or ASCII data and store it to a target memory location.
    Type: Grant
    Filed: March 29, 2016
    Date of Patent: July 28, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Steven R. Carlough, Reid T. Copeland, Charles W. Gainey, Jr., Marcel Mitran, Eric M. Schwarz, Timothy J. Slegel
  • Patent number: 10719324
    Abstract: Machine instructions, referred to herein as a long Convert from Zoned instruction (CDZT) and extended Convert from Zoned instruction (CXZT), are provided that read EBCDIC or ASCII data from memory, convert it to the appropriate decimal floating point format, and write it to a target floating point register or floating point register pair. Further, machine instructions, referred to herein as a long Convert to Zoned instruction (CZDT) and extended Convert to Zoned instruction (CZXT), are provided that convert a decimal floating point (DFP) operand in a source floating point register or floating point register pair to EBCDIC or ASCII data and store it to a target memory location.
    Type: Grant
    Filed: March 28, 2016
    Date of Patent: July 21, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Steven R. Carlough, Reid T. Copeland, Charles W. Gainey, Jr., Marcel Mitran, Eric M. Schwarz, Timothy J. Slegel
  • Patent number: 10678541
    Abstract: An apparatus includes a decode unit to decode a permute instruction and a vector conflict instruction. A vector execution unit is coupled with the decode unit and includes a fully-connected interconnect. The fully-connected interconnect has at least four inputs to receive at least four corresponding data elements of at least one source vector. The fully-connected interconnect has at least four outputs. Each of the at least four inputs is coupled with each of the at least four outputs. The execution unit also includes a permute instruction execution logic coupled with the at least four outputs and operable to store a first vector result in response to the permute instruction. The execution unit also includes a vector conflict instruction execution logic coupled with the at least four outputs and operable to store a second vector result in a destination storage location in response to the vector conflict instruction.
    Type: Grant
    Filed: December 29, 2011
    Date of Patent: June 9, 2020
    Assignee: Intel Corporation
    Inventors: Andrew Thomas Forsyth, Dennis R. Bradford