Patents Examined by Jyoti Mehta
-
Patent number: 10725780Abstract: Machine instructions, referred to herein as a long Convert from Zoned instruction (CDZT) and extended Convert from Zoned instruction (CXZT), are provided that read EBCDIC or ASCII data from memory, convert it to the appropriate decimal floating point format, and write it to a target floating point register or floating point register pair. Further, machine instructions, referred to herein as a long Convert to Zoned instruction (CZDT) and extended Convert to Zoned instruction (CZXT), are provided that convert a decimal floating point (DFP) operand in a source floating point register or floating point register pair to EBCDIC or ASCII data and store it to a target memory location.Type: GrantFiled: March 29, 2016Date of Patent: July 28, 2020Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Steven R. Carlough, Reid T. Copeland, Charles W. Gainey, Jr., Marcel Mitran, Eric M. Schwarz, Timothy J. Slegel
-
Patent number: 10719324Abstract: Machine instructions, referred to herein as a long Convert from Zoned instruction (CDZT) and extended Convert from Zoned instruction (CXZT), are provided that read EBCDIC or ASCII data from memory, convert it to the appropriate decimal floating point format, and write it to a target floating point register or floating point register pair. Further, machine instructions, referred to herein as a long Convert to Zoned instruction (CZDT) and extended Convert to Zoned instruction (CZXT), are provided that convert a decimal floating point (DFP) operand in a source floating point register or floating point register pair to EBCDIC or ASCII data and store it to a target memory location.Type: GrantFiled: March 28, 2016Date of Patent: July 21, 2020Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Steven R. Carlough, Reid T. Copeland, Charles W. Gainey, Jr., Marcel Mitran, Eric M. Schwarz, Timothy J. Slegel
-
Patent number: 10678541Abstract: An apparatus includes a decode unit to decode a permute instruction and a vector conflict instruction. A vector execution unit is coupled with the decode unit and includes a fully-connected interconnect. The fully-connected interconnect has at least four inputs to receive at least four corresponding data elements of at least one source vector. The fully-connected interconnect has at least four outputs. Each of the at least four inputs is coupled with each of the at least four outputs. The execution unit also includes a permute instruction execution logic coupled with the at least four outputs and operable to store a first vector result in response to the permute instruction. The execution unit also includes a vector conflict instruction execution logic coupled with the at least four outputs and operable to store a second vector result in a destination storage location in response to the vector conflict instruction.Type: GrantFiled: December 29, 2011Date of Patent: June 9, 2020Assignee: Intel CorporationInventors: Andrew Thomas Forsyth, Dennis R. Bradford
-
Patent number: 10671401Abstract: An apparatus includes a scheduler circuit and a plurality of hardware engines. The scheduler circuit may be configured to (i) store a directed acyclic graph, (ii) parse the directed acyclic graph into a plurality of operators and (iii) schedule the operators in one or more data paths based on a readiness of the operators to be processed. The hardware engines may be (i) configured as a plurality of the data paths and (ii) configured to generate one or more output vectors by processing zero or more input vectors using the operators.Type: GrantFiled: August 16, 2019Date of Patent: June 2, 2020Assignee: Ambarella International LPInventors: Leslie D. Kohn, Robert C. Kunz
-
Patent number: 10664276Abstract: Technical solutions are described for a supervisory processor to pass an out-of-band communication to a target processor in a multiprocessor system. For example, a first processor in a multi-processor system includes a register configured to store a command from a second processor of the multi-processor system, and to store a response to the command from the second processor. The first processor determines that the second processor has issued the command for execution by the first processor based on a first portion of the register being set to a first state, which is a predetermined state. The first processor also, responsively, reads the command from the second processor by parsing a second portion of the register. The first processor includes executes the command and stores the response for the command in the register.Type: GrantFiled: September 28, 2016Date of Patent: May 26, 2020Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Raymond M. Higgs, Luke M. Hopkins, Mushfiq U. Saleheen, Gabriel M. Tarr
-
Patent number: 10649773Abstract: A system and method process atomic instructions. A processor system includes a load store unit (LSU), first and second registers, a memory interface, and a main memory. In response to a load link (LL) instruction, the LSU loads first data from memory into the first register and sets an LL bit (LLBIT) to indicate a sequence of atomic instructions is being executed. The LSU further loads second data from memory into the second register in response to a load (LD) instruction. The LSU places a value of the second register into the memory interface in response to a store conditional coupled (SCX) instruction. When the LLBIT is set and in response to a store (SC) instruction, the LSU places a value of the second register into the memory interface and commits the first and second register values in the memory interface into the main memory when the LLBIT is set.Type: GrantFiled: April 7, 2016Date of Patent: May 12, 2020Assignee: MIPS Tech, LLCInventors: Ranjit J. Rozario, Andrew F. Glew, Sanjay Patel, James Robinson, Sudhakar Ranganathan
-
Patent number: 10635399Abstract: A system, method, and device for stochastically processing data. There is an architect module operating on a processor configured to manage and control stochastic processing of data, a non-deterministic data pool module configured to provide a stream of non-deterministic values that are not derived from a function, a plurality of functionally equivalent data processing modules each configured to stochastically process data as called upon by the architect module, a data feed configured to feed a data set desired to be stochastically processed, and a structure memory module including a memory storage device and configured to provide sufficient information for the architect module to duplicate a predefined processing architecture and to record a utilized processing architecture.Type: GrantFiled: August 14, 2017Date of Patent: April 28, 2020Assignee: CASSY HOLDINGS LLCInventor: Patrick D. Ross
-
Patent number: 10613987Abstract: In some embodiments, a system includes an execution unit, a register file, an operand cache, and a predication control circuit. Operands identified by an instruction may be stored in the operand cache. One or more entries of the operand cache that store the operands may be marked as dirty. The predication control circuit may identify an instruction as having an unresolved predication state. Subsequent to initiating execution of the instruction, the predication control circuit may receive results of the at least one unresolved conditional instruction. In response to the results indicating the instruction has a known-to-execute predication state, the predication control circuit may initiate writing, in the operand cache, results of executing the instruction. In response to the results indicating the instruction has a known-not-to-execute predication state, the predication control circuit may prevent the results from executing the instruction from being written in the operand cache.Type: GrantFiled: September 23, 2016Date of Patent: April 7, 2020Assignee: Apple Inc.Inventors: Andrew M. Havlir, Terence M. Potter
-
Patent number: 10564966Abstract: A method of an aspect includes receiving a packed data operation mask shift instruction. The packed data operation mask shift instruction indicates a source having a packed data operation mask, indicates a shift count number of bits, and indicates a destination. The method further includes storing a result in the destination in response to the packed data operation mask shift instruction. The result includes a sequence of bits of the packed data operation mask that have been shifted by the shift count number of bits. Other methods, apparatus, systems, and instructions are disclosed.Type: GrantFiled: December 22, 2011Date of Patent: February 18, 2020Assignee: Intel CorporationInventors: Bret L. Toll, Robert Valentine, Jesus Corbal San Adrian, Elmoustapha Ould-Ahmed Vall, Mark J. Charney
-
Patent number: 10540179Abstract: A processor is configured to identify a branch instruction immediately followed by an architectural delay slot. A single bonded instruction comprising the branch instruction immediately followed by the architectural delay slot is created. The single bonded instruction is loaded into an instruction buffer.Type: GrantFiled: March 7, 2013Date of Patent: January 21, 2020Assignee: MIPS Tech, LLCInventors: Ranganathan Sudhakar, Parthiv Pota
-
Patent number: 10514997Abstract: Systems and methods associated with a multi-producer single consumer lock-free queue capable of accumulating traces is described herein. In a non-limiting embodiment, data is determined to be allocated, and a first head/tail pair indicating a location along a queue is received, the location indicating where a data bucket is able to be placed. A first data bucket to use for storing the data is determined, and the data is stored using the first data bucket. The first data bucket is then placed on the queue, and a first instruction to decrement a first reference count for the first head/tail pair is generated.Type: GrantFiled: September 27, 2016Date of Patent: December 24, 2019Assignee: EATON INTELLIGENT POWER LIMITEDInventor: Ronald Landheer
-
Patent number: 10503507Abstract: A method, computer readable medium, and system are disclosed for inline data inspection. The method includes the steps of receiving, by a load/store unit, a load instruction and obtaining, by an inspection circuit that is coupled to the load/store unit, data specified by the load instruction. Additional steps include determining that the data equals zero and transmitting the data and a predicate signal to the load/store unit, wherein the predicate signal indicates that the data equals zero. Alternative additional steps include computing a predicate value based on a comparison between the data and a threshold value and transmitting the data and the predicate value to the load/store unit, wherein the predicate value is asserted when the data is less than the threshold value and is negated when the data is not less than the threshold value.Type: GrantFiled: August 31, 2017Date of Patent: December 10, 2019Assignee: NVIDIA CorporationInventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
-
Patent number: 10474465Abstract: A pipelined run-to-completion processor executes a pop stack absolute instruction. The instruction includes an opcode, an absolute pointer value, a flag don't touch bit, and predicate bits. If a condition indicated by the predicate bits is not true, then the opcode operation is not performed. If the condition is true, then the stack of the processor is popped thereby generating an operand A. The absolute pointer value is used to identify a particular register of the stack, and the content of that particular register is an operand B. The arithmetic logic operation specified by the opcode is performed using operand A and operand B thereby generating a result, and the content of the particular register is replaced with the result. If the flag don't touch bit is set to a particular value, then the flag bits (carry flag and zero flag) are not affected by the instruction execution.Type: GrantFiled: May 1, 2014Date of Patent: November 12, 2019Assignee: Netronome Systems, Inc.Inventor: Gavin J. Stark
-
Patent number: 10467006Abstract: A processor includes a front end to decode an instruction and an allocator to assign the instruction to an execution unit to execute the instruction to permute vector data into a destination register for storing elements. The execution unit includes logic to compute an element count, logic to compute an index size, logic to compute a byte count, a temporary destination, an index from an index vector, an offset, logic to determine a subset of the temporary destination, and logic to store the subset in one element in the destination register.Type: GrantFiled: December 20, 2015Date of Patent: November 5, 2019Assignee: Intel CorporationInventors: Elmoustapha Ould-Ahmed-Vall, Nikita Astafev
-
Patent number: 10437599Abstract: A processor that reduces pipeline stall including a front end, a load queue, a scheduler, and a load buffer. The front end issues instructions while a first full indication is not provided, but otherwise stalls issuing instructions. The load queue stores issued load instruction entries including information needed to execute the issued load instruction. The load queue provides a second full indication when full. The scheduler dispatches issued instructions for execution except for stalled load instructions, such as when not yet been stored in the load queue. The load buffer transfers issued load instructions to the load queue when the load queue is not full. When the load queue is full, the load buffer temporarily buffers issued load instructions until the load queue is no longer full. The load buffer allows more accurate load queue full determination, and allows processing to continue even when the load queue is full.Type: GrantFiled: November 13, 2017Date of Patent: October 8, 2019Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.Inventor: Qianli Di
-
Patent number: 10437600Abstract: An apparatus includes a scheduler circuit and a plurality of hardware engines. The scheduler circuit may be configured to (i) store a directed acyclic graph, (ii) parse the directed acyclic graph into a plurality of operators and (iii) schedule the operators in one or more data paths based on a readiness of the operators to be processed. The hardware engines may be (i) configured as a plurality of the data paths and (ii) configured to generate one or more output vectors by processing zero or more input vectors using the operators.Type: GrantFiled: May 12, 2017Date of Patent: October 8, 2019Assignee: Ambarella, Inc.Inventors: Leslie D. Kohn, Robert C. Kunz
-
Patent number: 10430196Abstract: An arithmetic processing device includes: a branch prediction unit configured to predict a branch destination address and loop processing based on an address generated by an address generation unit; an instruction buffer unit configured to store an instruction of the address generated by the address generation unit; an instruction decoding unit configured to decode the instruction stored in the instruction buffer unit; and a loop buffer unit configured to store decoding results or decoding intermediate results of instructions of the predicted loop processing that are decoded by the instruction decoding unit and output the stored decoding results or decoding intermediate results a predetermined number of times in response to the loop processing, in which during a period when selecting the output of the loop buffer unit, operations of the address generation unit, the branch prediction unit, the instruction buffer unit, and the instruction decoding unit are stopped.Type: GrantFiled: May 22, 2017Date of Patent: October 1, 2019Assignee: FUJITSU LIMITEDInventors: Ryohei Okazaki, Norihito Gomyo, Yasunobu Akizuki, Takashi Suzuki
-
Patent number: 10430200Abstract: An integrated circuit can include a slave processor configured to execute instructions. The slave processor can be implemented in programmable circuitry of the integrated circuit. The integrated circuit also can include a processor coupled to the slave processor. The processor can be hardwired and configured to control operation of the slave processor.Type: GrantFiled: August 14, 2017Date of Patent: October 1, 2019Assignee: XILINX, INC.Inventors: Patrick Lysaght, Graham F. Schelle, Parimal Patel, Peter K. Ogden
-
Patent number: 10394678Abstract: A processor core includes a decode circuit to decode an instruction. The processor core further includes a monitor circuit, where the monitor circuit includes a data structure to store a plurality of entries for addresses that are being monitored by the monitor circuit and a triggered queue to store a plurality of addresses for which a triggering event occurred. The processor core further includes an execution circuit to execute the decoded instruction to dequeue an address from the triggered queue and return the dequeued address in response to a determination that the triggered queue is not empty.Type: GrantFiled: December 29, 2016Date of Patent: August 27, 2019Assignee: INTEL CORPORATIONInventors: Wim Heirman, Yves Vandriessche
-
Patent number: 10372447Abstract: An instruction defined to be a looping instruction is obtained and processed. A determination is made as to whether an obtained selected character is an expected selected character. Based on the obtained selected character being the expected selected character, an execution process is used that includes a sequence of operations to perform an operation, the sequence of operations replacing a loop and providing a non-looping sequence to perform the operation on up to a defined number of units of data. The sequence of operations is configured to repeat one or more times and to terminate based on the obtained selected character. Based on the obtained selected character being different than the expected selected character, an alternate execution process is chosen.Type: GrantFiled: December 7, 2018Date of Patent: August 6, 2019Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventor: Michael K. Gschwind