Patents by Inventor Emil TALPES
Emil TALPES has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20220083412
Abstract: A system for handling errors in a neural network includes a neural network processor for executing a neural network associated with use of a vehicle. The neural network processor includes an error detector configured to detect a data error associated with execution of the neural network and a neural network controller configured to receive a report of the data error from the error detector. In response to receiving the report, the neural network controller is further configured to signal that a pending result of the neural network is tainted without terminating execution of the neural network.
Type: Application
Filed: September 23, 2021
Publication date: March 17, 2022
Inventors: Christopher Hsiong, Emil Talpes, Debjit Das Sarma, Peter Bannon, Kevin Hurd, Benjamin Floering
-
Publication number: 20220050806
Abstract: A microprocessor system comprises a computational array and a hardware data formatter. The computational array includes a plurality of computation units that each operates on a corresponding value addressed from memory. The values operated by the computation units are synchronously provided together to the computational array as a group of values to be processed in parallel. The hardware data formatter is configured to gather the group of values, wherein the group of values includes a first subset of values located consecutively in memory and a second subset of values located consecutively in memory. The first subset of values is not required to be located consecutively in the memory from the second subset of values.
Type: Application
Filed: October 22, 2021
Publication date: February 17, 2022
Inventors: Emil Talpes, William McGee, Peter Joseph Bannon
-
Publication number: 20220027162
Abstract: Systems, apparatuses, and methods for compressing multiple instruction operations together into a single retire queue entry are disclosed. A processor includes at least a scheduler, a retire queue, one or more execution units, and control logic. When the control logic detects a given instruction operation being dispatched by the scheduler to an execution unit, the control logic determines if the given instruction operation meets one or more conditions for being compressed with one or more other instruction operations into a single retire queue entry. If the one or more conditions are met, two or more instruction operations are stored together in a single retire queue entry. By compressing multiple instruction operations together into an individual retire queue entry, the retire queue is able to be used more efficiently, and the processor can speculatively execute more instructions without the retire queue exhausting its supply of available entries.
Type: Application
Filed: October 8, 2021
Publication date: January 27, 2022
Inventors: Matthew T. Sobel, Joshua James Lindner, Neil N. Marketkar, Kai Troester, Emil Talpes, Ashok Tirupathy Venkatachar
-
Patent number: 11227029
Abstract: A microprocessor system comprises a matrix computational unit and a control unit. The matrix computational unit includes one or more processing elements. The control unit is configured to provide a matrix processor instruction to the matrix computational unit. The matrix processor instruction specifies a floating-point operand formatted with an exponent that has been biased with a specified bias.
Type: Grant
Filed: May 23, 2019
Date of Patent: January 18, 2022
Assignee: Tesla, Inc.
Inventors: Debjit Das Sarma, William McGee, Emil Talpes
-
Publication number: 20210357222
Abstract: A processor in a data processing system includes a master-shadow physical register file and a renaming unit. The master-shadow physical register file has a master storage coupled to shadow storage. The renaming unit is coupled to the master-shadow physical register file. Based on an occurrence of shadow transfer activation conditions verified by the renaming unit, data in the master storage is transferred from the master storage to the shadow storage for storage. Data is transferred from the shadow storage back to the master storage based on the occurrence of a shadow-to-master transfer event, which includes, for example, a flush of the master storage by the processor.
Type: Application
Filed: May 18, 2020
Publication date: November 18, 2021
Inventors: Arun A. NAIR, Ashok T. VENKATACHAR, Emil TALPES, Srikanth AREKAPUDI, Rajesh Kumar ARUNACHALAM
-
Patent number: 11157287
Abstract: A microprocessor system comprises a computational array and a hardware arbiter. The computational array includes a plurality of computation units. Each of the plurality of computation units operates on a corresponding value addressed from memory. The hardware arbiter is configured to control issuing of at least one memory request for one or more of the corresponding values addressed from the memory for the computation units. The hardware arbiter is also configured to schedule a control signal to be issued based on the issuing of the memory requests.
Type: Grant
Filed: March 13, 2018
Date of Patent: October 26, 2021
Assignee: Tesla, Inc.
Inventors: Emil Talpes, Peter Joseph Bannon, Kevin Altair Hurd
-
Patent number: 11157441
Abstract: A microprocessor system comprises a computational array and a hardware data formatter. The computational array includes a plurality of computation units that each operates on a corresponding value addressed from memory. The values operated by the computation units are synchronously provided together to the computational array as a group of values to be processed in parallel. The hardware data formatter is configured to gather the group of values, wherein the group of values includes a first subset of values located consecutively in memory and a second subset of values located consecutively in memory. The first subset of values is not required to be located consecutively in the memory from the second subset of values.
Type: Grant
Filed: March 13, 2018
Date of Patent: October 26, 2021
Assignee: Tesla, Inc.
Inventors: Emil Talpes, William McGee, Peter Joseph Bannon
-
Patent number: 11144324
Abstract: Systems, apparatuses, and methods for compressing multiple instruction operations together into a single retire queue entry are disclosed. A processor includes at least a scheduler, a retire queue, one or more execution units, and control logic. When the control logic detects a given instruction operation being dispatched by the scheduler to an execution unit, the control logic determines if the given instruction operation meets one or more conditions for being compressed with one or more other instruction operations into a single retire queue entry. If the one or more conditions are met, two or more instruction operations are stored together in a single retire queue entry. By compressing multiple instruction operations together into an individual retire queue entry, the retire queue is able to be used more efficiently, and the processor can speculatively execute more instructions without the retire queue exhausting its supply of available entries.
Type: Grant
Filed: September 27, 2019
Date of Patent: October 12, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: Matthew T. Sobel, Joshua James Lindner, Neil N. Marketkar, Kai Troester, Emil Talpes, Ashok Tirupathy Venkatachar
-
Publication number: 20210311737
Abstract: An arithmetic unit performs store-to-load forwarding based on predicted dependencies between store instructions and load instructions. In some embodiments, the arithmetic unit maintains a table of store instructions that are awaiting movement to a load/store unit of the instruction pipeline. In response to receiving a load instruction that is predicted to be dependent on a store instruction stored at the table, the arithmetic unit causes the data associated with the store instruction to be placed into the physical register targeted by the load instruction. In some embodiments, the arithmetic unit performs the forwarding by mapping the physical register targeted by the load instruction to the physical register where the data associated with the store instruction is located.
Type: Application
Filed: May 19, 2021
Publication date: October 7, 2021
Inventors: Gregory W. Smaus, Francesco Spadini, Matthew A. Rafacz, Michael Achenbach, Christopher J. Burke, Emil Talpes, Matthew M. Crum
-
Patent number: 11132245
Abstract: A system for handling errors in a neural network includes a neural network processor for executing a neural network associated with use of a vehicle. The neural network processor includes an error detector configured to detect a data error associated with execution of the neural network and a neural network controller configured to receive a report of the data error from the error detector. In response to receiving the report, the neural network controller is further configured to signal that a pending result of the neural network is tainted without terminating execution of the neural network.
Type: Grant
Filed: March 30, 2020
Date of Patent: September 28, 2021
Assignee: Tesla, Inc.
Inventors: Christopher Hsiong, Emil Talpes, Debjit Das Sarma, Peter Bannon, Kevin Hurd, Benjamin Floering
-
Patent number: 11036505
Abstract: An arithmetic unit performs store-to-load forwarding based on predicted dependencies between store instructions and load instructions. In some embodiments, the arithmetic unit maintains a table of store instructions that are awaiting movement to a load/store unit of the instruction pipeline. In response to receiving a load instruction that is predicted to be dependent on a store instruction stored at the table, the arithmetic unit causes the data associated with the store instruction to be placed into the physical register targeted by the load instruction. In some embodiments, the arithmetic unit performs the forwarding by mapping the physical register targeted by the load instruction to the physical register where the data associated with the store instruction is located.
Type: Grant
Filed: December 20, 2012
Date of Patent: June 15, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Gregory W. Smaus, Francesco Spadini, Matthew A. Rafacz, Michael Achenbach, Christopher J. Burke, Emil Talpes, Matthew M. Crum
-
Publication number: 20210096874
Abstract: Systems, apparatuses, and methods for compressing multiple instruction operations together into a single retire queue entry are disclosed. A processor includes at least a scheduler, a retire queue, one or more execution units, and control logic. When the control logic detects a given instruction operation being dispatched by the scheduler to an execution unit, the control logic determines if the given instruction operation meets one or more conditions for being compressed with one or more other instruction operations into a single retire queue entry. If the one or more conditions are met, two or more instruction operations are stored together in a single retire queue entry. By compressing multiple instruction operations together into an individual retire queue entry, the retire queue is able to be used more efficiently, and the processor can speculatively execute more instructions without the retire queue exhausting its supply of available entries.
Type: Application
Filed: September 27, 2019
Publication date: April 1, 2021
Inventors: Matthew T. Sobel, Joshua James Lindner, Neil N. Marketkar, Kai Troester, Emil Talpes, Ashok Tirupathy Venkatachar
-
Publication number: 20210048984
Abstract: Various embodiments of the disclosure relate to an accelerated mathematical engine. In certain embodiments, the accelerated mathematical engine is applied to image processing such that convolution of an image is accelerated by using a two-dimensional matrix processor comprising sub-circuits that include an ALU, output register and shadow register. This architecture supports a clocked, two-dimensional architecture in which image data and weights are multiplied in a synchronized manner to allow a large number of mathematical operations to be performed in parallel.
Type: Application
Filed: May 29, 2020
Publication date: February 18, 2021
Inventors: Peter Joseph Bannon, Kevin Altair Hurd, Emil Talpes
-
Publication number: 20200394095
Abstract: A system for handling errors in a neural network includes a neural network processor for executing a neural network associated with use of a vehicle. The neural network processor includes an error detector configured to detect a data error associated with execution of the neural network and a neural network controller configured to receive a report of the data error from the error detector. In response to receiving the report, the neural network controller is further configured to signal that a pending result of the neural network is tainted without terminating execution of the neural network.
Type: Application
Filed: March 30, 2020
Publication date: December 17, 2020
Inventors: Christopher Hsiong, Emil Talpes, Debjit Das Sarma, Peter Bannon, Kevin Hurd, Benjamin Floering
-
Publication number: 20200348909
Abstract: A microprocessor system comprises a matrix computational unit and a control unit. The matrix computational unit includes one or more processing elements. The control unit is configured to provide a matrix processor instruction to the matrix computational unit. The matrix processor instruction specifies a floating-point operand formatted with an exponent that has been biased with a specified bias.
Type: Application
Filed: May 23, 2019
Publication date: November 5, 2020
Inventors: Debjit Das Sarma, William McGee, Emil Talpes
-
Patent number: 10747844
Abstract: Presented are systems and methods that accelerate the convolution of an image and similar arithmetic operations by utilizing hardware-specific circuitry that enables a large number of operations to be performed in parallel across a large set of data. In various embodiments, arithmetic operations are further enhanced by reusing data and eliminating redundant steps of storing and fetching intermediate results from registers and memory when performing arithmetic operations.
Type: Grant
Filed: December 12, 2017
Date of Patent: August 18, 2020
Assignee: Tesla, Inc.
Inventors: Peter Joseph Bannon, William A McGee, Emil Talpes
-
Patent number: 10671349
Abstract: Various embodiments of the disclosure relate to an accelerated mathematical engine. In certain embodiments, the accelerated mathematical engine is applied to image processing such that convolution of an image is accelerated by using a two-dimensional matrix processor comprising sub-circuits that include an ALU, output register and shadow register. This architecture supports a clocked, two-dimensional architecture in which image data and weights are multiplied in a synchronized manner to allow a large number of mathematical operations to be performed in parallel.
Type: Grant
Filed: September 20, 2017
Date of Patent: June 2, 2020
Assignee: Tesla, Inc.
Inventors: Peter Joseph Bannon, Kevin Altair Hurd, Emil Talpes
-
Patent number: 10606678
Abstract: A system for handling errors in a neural network includes a neural network processor for executing a neural network associated with use of a vehicle. The neural network processor includes an error detector configured to detect a data error associated with execution of the neural network and a neural network controller configured to receive a report of the data error from the error detector. In response to receiving the report, the neural network controller is further configured to signal that a pending result of the neural network is tainted without terminating execution of the neural network.
Type: Grant
Filed: November 17, 2017
Date of Patent: March 31, 2020
Assignee: Tesla, Inc.
Inventors: Christopher Hsiong, Emil Talpes, Debjit Das Sarma, Peter Bannon, Kevin Hurd, Benjamin Floering
-
Publication number: 20190361699
Abstract: Systems, apparatuses, and methods for implementing a fastpath microcode sequencer are disclosed. A processor includes at least an instruction decode unit and first and second microcode units. For each received instruction, the instruction decode unit forwards the instruction to the first microcode unit if the instruction satisfies at least a first condition. In one implementation, the first condition is the instruction being classified as a frequently executed instruction. If a received instruction satisfies at least a second condition, the instruction decode unit forwards the received instruction to a second microcode unit. In one implementation, the first microcode unit is a smaller, faster structure than the second microcode unit. In one implementation, the second condition is the instruction being classified as an infrequently executed instruction.
Type: Application
Filed: May 22, 2018
Publication date: November 28, 2019
Inventors: Kai Troester, Magiting Talisayon, Hongwen Gao, Benjamin Floering, Emil Talpes
-
Patent number: 10416899
Abstract: In various embodiments, the present invention teaches a sequencer that identifies an address pointer of a first data block within a memory and a length of data that comprises that data block and is related to an input of a matrix processor. The sequencer then calculates, based on the block length, the input length, and a memory map, a block count representative of a number of data blocks that are to be retrieved from the memory. Using the address pointer, the sequencer may retrieve a number of data blocks from the memory in a number of cycles that depends on whether the data blocks are contiguous. In embodiments, based on the length of data, a formatter then maps the data blocks to the input of the matrix processor.
Type: Grant
Filed: June 5, 2018
Date of Patent: September 17, 2019
Assignee: Tesla, Inc.
Inventors: Peter Joseph Bannon, Kevin Altair Hurd, Emil Talpes
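
Illustration for patent 10416899: the abstract says the sequencer derives a block count from the block length and input length, and that the number of fetch cycles depends on whether the blocks are contiguous. The sketch below is only one possible reading of that description, not the patented method. The function names, the ceiling-division rule, and the "one cycle per contiguous run" cost model are all assumptions made for illustration; the memory map mentioned in the abstract is not modeled.

```python
def block_count(input_length: int, block_length: int) -> int:
    """Blocks needed to cover the input (assumed: ceiling division)."""
    if input_length < 0 or block_length <= 0:
        raise ValueError("input_length must be >= 0 and block_length > 0")
    return (input_length + block_length - 1) // block_length


def fetch_cycles(block_addresses: list[int], block_length: int) -> int:
    """Cycles to fetch the blocks under an assumed cost model:
    each run of back-to-back (contiguous) blocks costs one cycle."""
    if not block_addresses:
        return 0
    cycles = 1  # the first block always starts a new run
    for prev, cur in zip(block_addresses, block_addresses[1:]):
        if cur != prev + block_length:  # gap in memory: start a new run
            cycles += 1
    return cycles


if __name__ == "__main__":
    # Hypothetical numbers: a 100-value input split into 32-value blocks.
    n = block_count(100, 32)
    contiguous = [i * 32 for i in range(n)]
    scattered = [0, 32, 128, 160]
    print(n, fetch_cycles(contiguous, 32), fetch_cycles(scattered, 32))
```

Under this assumed model, fully contiguous blocks are fetched in a single cycle, and every break in contiguity adds one more.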