Scalar/vector Processor Interface Patents (Class 712/3)
-
Patent number: 12189564
Abstract: A data processing system for implementing operations that generate a dynamically-sized output is presented. The data processing system includes a reconfigurable processor that is configured to implement a first operation, a second operation, a recording unit, and a control unit. The first operation generates an output, wherein a size of the output is unknown during a configuration phase. The second operation receives the output of the first operation as an input. The recording unit generates control data that is indicative of the size of the output. The control unit provides the control data to the second operation, wherein the second operation processes the input based on the control data.
Type: Grant
Filed: February 14, 2023
Date of Patent: January 7, 2025
Assignee: SambaNova Systems, Inc.
Inventors: Abhishek Srivastava, Matthew Vilim, Raghu Prabhakar, Sankar Rachuru, Zhekun Zhang, Matheen Musaddiq, Apurv Vivek, Sitanshu Gupta, Ayesha Siddiqua
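The flow in this abstract can be illustrated with a minimal Python sketch. It assumes the first operation is a simple filter writing into a fixed-capacity buffer; the function names and the filter condition are hypothetical and are not taken from the patent.

```python
def first_operation(values, capacity):
    """Produce an output whose valid length is only known at run time."""
    buffer = [0] * capacity          # size fixed at "configuration" time
    count = 0                        # recorded control data: valid output size
    for v in values:
        if v > 0:                    # hypothetical data-dependent condition
            buffer[count] = v
            count += 1
    return buffer, count


def second_operation(buffer, valid_count):
    """Consume only the part of the input that the control data marks valid."""
    return sum(buffer[:valid_count])


buf, n = first_operation([3, -1, 4, -5], capacity=4)
total = second_operation(buf, n)
```

The recorded count plays the role of the control data: the consumer never has to guess how much of the fixed-size buffer is meaningful.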
-
Patent number: 12086594
Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.
Type: Grant
Filed: August 28, 2023
Date of Patent: September 10, 2024
Assignee: Intel Corporation
Inventors: Robert C. Valentine, Jesus Corbal San Adrian, Roger Espasa Sans, Robert D. Cavin, Bret L. Toll, Santiago Galan Duran, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Edward Thomas Grochowski, Jonathan Cannon Hall, Dennis R. Bradford, Elmoustapha Ould-Ahmed-Vall, James C Abel, Mark Charney, Seth Abraham, Suleyman Sair, Andrew Thomas Forsyth, Lisa Wu, Charles Yount
-
Patent number: 11756618
Abstract: A log structure is created in persistent memory using hardware support in a memory controller or software supported with additional instructions. Writes to persistent memory locations are streamed to the log and written to their corresponding memory location in the cache hierarchy. An added victim cache for persistent memory addresses catches cache evictions, which would corrupt open transactions. On the completion of a group of atomic persistent memory operations, the log is closed and the persistent values in the cache can be copied to their source persistent memory location and the log cleaned.
Type: Grant
Filed: June 28, 2021
Date of Patent: September 12, 2023
Inventors: Ellis Robinson Giles, Peter Joseph Varman
-
Patent number: 11740904
Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.
Type: Grant
Filed: November 11, 2021
Date of Patent: August 29, 2023
Assignee: Intel Corporation
Inventors: Robert C. Valentine, Jesus Corbal San Adrian, Roger Espasa Sans, Robert D. Cavin, Bret L. Toll, Santiago Galan Duran, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Edward Thomas Grochowski, Jonathan Cannon Hall, Dennis R. Bradford, Elmoustapha Ould-Ahmed-Vall, James C Abel, Mark Charney, Seth Abraham, Suleyman Sair, Andrew Thomas Forsyth, Lisa Wu, Charles Yount
-
Patent number: 11663118
Abstract: In some examples, a device includes a set of data storage elements, wherein each data storage element of the set of data storage elements is associated with a respective valid address vector, and wherein a bit flip in any bit of any of the valid address vectors leads to one of a set of invalid address vectors not associated with any of the set of data storage elements. The device also includes a decoder configured to receive a first address vector as part of a request and to check whether the first address vector corresponds to one of the valid address vectors or to one of the invalid address vectors. The decoder is also configured to select an associated data storage element in response to receiving the request and in response to determining that the first address vector corresponds to one of the valid address vectors.
Type: Grant
Filed: March 10, 2021
Date of Patent: May 30, 2023
Assignee: Infineon Technologies AG
Inventor: Jens Barrenscheen
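One way to obtain address vectors with the property described in this abstract is to append a parity bit, so that any two valid vectors differ in at least two bits. The Python sketch below assumes that encoding and 16 storage elements; the patent abstract does not state which code is used, and the names are hypothetical.

```python
def encode_address(index: int) -> int:
    """Append an even-parity bit to a 4-bit element index (assumed encoding)."""
    parity = bin(index).count("1") & 1
    return (index << 1) | parity


# Map every valid 5-bit address vector to its storage element index.
VALID = {encode_address(i): i for i in range(16)}


def decode(address_vector: int):
    """Return the selected element index, or None for an invalid vector."""
    return VALID.get(address_vector)
```

With this encoding every single-bit flip changes the parity of the vector, so the decoder rejects it instead of selecting the wrong element.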
-
Patent number: 11494592
Abstract: Systems, apparatuses, and methods for converting data to a tiling format when implementing convolutional neural networks are disclosed. A system includes at least a memory, a cache, a processor, and a plurality of compute units. The memory stores a first buffer and a second buffer in a linear format, where the first buffer stores convolutional filter data and the second buffer stores image data. The processor converts the first and second buffers from the linear format to third and fourth buffers, respectively, in a tiling format. The plurality of compute units load the tiling-formatted data from the third and fourth buffers in memory to the cache and then perform a convolutional filter operation on the tiling-formatted data. The system generates a classification of a first dataset based on a result of the convolutional filter operation.
Type: Grant
Filed: August 28, 2020
Date of Patent: November 8, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Song Zhang, Jiantan Liu, Hua Zhang, Min Yu
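A linear-to-tiled conversion of the kind mentioned in this abstract can be sketched in Python as follows. The sketch assumes a row-major 2D buffer whose dimensions are exact multiples of the tile size; the actual tile layout used in the patent is not given in the abstract, and the function name is hypothetical.

```python
def linear_to_tiled(buf, height, width, tile_h, tile_w):
    """Reorder a row-major buffer so each tile's elements are contiguous."""
    out = []
    for ty in range(0, height, tile_h):          # tile row origin
        for tx in range(0, width, tile_w):       # tile column origin
            for y in range(ty, ty + tile_h):     # rows inside the tile
                for x in range(tx, tx + tile_w): # columns inside the tile
                    out.append(buf[y * width + x])
    return out


tiled = linear_to_tiled(list(range(16)), height=4, width=4, tile_h=2, tile_w=2)
```

Keeping a tile contiguous means a compute unit can pull a whole tile into cache with adjacent accesses rather than strided ones.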
-
Patent number: 11397624
Abstract: A data processing system including a data processor which is operable to execute programs to perform data processing operations and in which execution threads executing a program to perform data processing operations may be grouped together into thread groups. The data processor comprises a cross-lane permutation circuit which is operable to perform processing for cross-lane instructions which require data to be permuted (copied or moved) between the threads of a thread group. The cross-lane permutation circuit has plural data lanes between which data may be permuted (moved or copied). The number of data lanes is fewer than the number of threads in a thread group.
Type: Grant
Filed: January 22, 2019
Date of Patent: July 26, 2022
Assignee: Arm Limited
Inventors: Luka Dejanovic, Mladen Wilder
-
Patent number: 11340904
Abstract: Disclosed herein are vector index registers in vector processors that each store multiple addresses for accessing multiple positions in vectors. It is known to use scalar index registers in vector processors to access multiple positions of vectors by changing the scalar index registers in vector operations. By using a vector indexing register for indexing positions of one or more operand vectors, the scalar index register can be replaced and at least the continual changing of the scalar index register can be avoided.
Type: Grant
Filed: May 20, 2019
Date of Patent: May 24, 2022
Assignee: Micron Technology, Inc.
Inventor: Steven Jeffrey Wallach
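The contrast drawn in this abstract between a scalar index register and a vector index register can be sketched in Python. Both functions are illustrative assumptions, not the patented hardware design.

```python
def access_with_scalar_index(operand, positions):
    """Scalar index register: the index is rewritten before every access."""
    result = []
    for p in positions:
        scalar_index = p             # register updated once per element
        result.append(operand[scalar_index])
    return result


def access_with_vector_index(operand, vector_index_register):
    """Vector index register: all positions are held at once."""
    return [operand[i] for i in vector_index_register]


data = [10, 20, 30, 40, 50]
picked = access_with_vector_index(data, [4, 0, 2])
```

Both paths return the same elements; the difference is that the vector index register is loaded once instead of being changed for each position.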
-
Patent number: 11226821
Abstract: A computer processor is provided that employs a plurality of operand storage elements that store operand data values and associated meta-data as unitary operand data elements as well as at least one functional unit that performs operations that produce and access the unitary operand data elements stored in the plurality of operand storage elements. The meta-data associated with a given operand data value as part of a unitary operand data element can specify type of the unitary operand data element (e.g., vector or scalar), elemental width and floating-point error flags. The meta-data can also be used to define special operand data values (e.g., Not-a-Result and None). The meta-data is useful in optimizing execution, such as in speculation and vectorized SIMD operations. The computer processor can also support a number of particular vector operations that are useful in optimizing execution of vectorized SIMD operations.
Type: Grant
Filed: September 10, 2019
Date of Patent: January 18, 2022
Assignee: Mill Computing, Inc.
Inventors: Roger Rawson Godard, Arthur David Kahlich, David Arthur Yost, Sebastien Paul Maurice Mirolo
-
Patent number: 11188330
Abstract: An apparatus comprises processing circuitry, a number of vector registers and a number of scalar registers. An instruction decoder is provided which supports decoding of a vector multiply-add instruction specifying at least one vector register and at least one scalar register. In response to the vector multiply-add instruction, the decoder controls the processing circuitry to perform a vector multiply-add operation in which each lane of processing generates a respective result data element corresponding to a sum or difference of a product value and an addend value, with the product value comprising the product of a respective data element of a first vector value and a multiplier value. In each lane of processing at least one of the multiplier value and the addend value is specified as a portion of a scalar value stored in a scalar register.
Type: Grant
Filed: August 14, 2017
Date of Patent: November 30, 2021
Assignee: ARM LIMITED
Inventors: Thomas Christopher Grocutt, François Christopher Jacques Botman
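The per-lane arithmetic described in this abstract can be sketched in Python. The sketch assumes the scalar register supplies the multiplier and a second vector supplies the addend, and that the difference form is addend minus product; the abstract permits other combinations, and the function name is hypothetical.

```python
def vector_multiply_add(vec_a, addend_vec, scalar, subtract=False):
    """Per lane: addend + a*scalar, or addend - a*scalar when subtract is set."""
    result = []
    for a, addend in zip(vec_a, addend_vec):
        product = a * scalar                 # multiplier taken from the scalar
        result.append(addend - product if subtract else addend + product)
    return result


summed = vector_multiply_add([1, 2, 3], [10, 10, 10], scalar=2)
differenced = vector_multiply_add([1, 2, 3], [10, 10, 10], scalar=2, subtract=True)
```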
-
Patent number: 10922267
Abstract: A computer processor is disclosed. The computer processor may comprise a vector unit comprising a vector register file comprising at least one register to hold a varying number of elements. The computer processor may further comprise processing logic configured to operate on the varying number of elements in the vector register file using one or more graphics processing instructions. The computer processor may be implemented as a monolithic integrated circuit.
Type: Grant
Filed: May 21, 2015
Date of Patent: February 16, 2021
Assignee: Optimum Semiconductor Technologies Inc.
Inventors: Mayan Moudgill, Gary J. Nacer, C. John Glossner, Arthur Joseph Hoane, Vitaly Kalashnikov, Sitij Agrawal
-
Patent number: 10871549
Abstract: An apparatus is disclosed for proximity detection using adaptive mutual coupling cancellation. In an example aspect, the apparatus includes at least two antennas, a wireless transceiver connected to the at least two antennas, and a mutual coupling cancellation module. The at least two antennas include a first antenna and a second antenna, which are mutually coupled electromagnetically. The second antenna includes two feed ports. The wireless transceiver is configured to transmit a radar transmit signal via the first antenna and receive two versions of a radar receive signal respectively via the two feed ports of the second antenna. The wireless transceiver is also configured to adjust a transmission parameter based on a decoupled signal. The transmission parameter varies based on a range to the object. The mutual coupling cancellation module is configured to generate the decoupled signal based on the two versions of the radar receive signal.
Type: Grant
Filed: May 18, 2018
Date of Patent: December 22, 2020
Assignee: QUALCOMM Incorporated
Inventors: Roberto Rimini, Anant Gupta
-
Patent number: 10846259
Abstract: A computer processor is disclosed. The computer processor comprises a vector unit comprising a vector register file comprising at least one vector register to hold a varying number of elements. The computer processor further comprises out-of-order issue logic that holds a pool of vector instructions, selects a vector instruction from the pool, and sends the vector instruction for execution. The vector instruction operates on the varying number of elements of the at least one vector register.
Type: Grant
Filed: May 21, 2015
Date of Patent: November 24, 2020
Assignee: Optimum Semiconductor Technologies Inc.
Inventors: Mayan Moudgill, Gary J. Nacer, C. John Glossner, Arthur Joseph Hoane, Murugappan Senthilvelan, Pablo Balzola
-
Patent number: 10776124
Abstract: Processing circuitry supports a first type of vector arithmetic instruction specifying at least a first input vector. When at least one exceptional condition is detected for an arithmetic operation performed for a first active data element of the first input vector in a predetermined sequence, the processing circuitry performs at least one response action. When the at least one exceptional condition is detected for a given active data element other than the first active data element in the predetermined sequence, the processing circuitry suppresses the at least one response action and stores element identifying information identifying which data element is the given active data element which triggered the exceptional condition. This can be useful for reducing the amount of hardware resource for tracking the occurrence of the exceptional conditions and/or supporting speculative execution of vector instructions.
Type: Grant
Filed: September 14, 2016
Date of Patent: September 15, 2020
Assignee: ARM Limited
Inventors: Giacomo Gabrielli, Nigel John Stephens
-
Patent number: 10713042
Abstract: An arithmetic processing device includes a memory that stores a first data and a second data, a plurality of arithmetic circuits, a first memory arranged for each of the arithmetic circuits and that stores a first predetermined row having the predetermined number of the first data stored in the memory, a second memory arranged for each of the arithmetic circuits and that stores a second predetermined row having a predetermined number of the second data stored in the memory, and a plurality of multiply-add arithmetic circuits arranged for each of the arithmetic circuits, a number of the multiply-add arithmetic circuits corresponding to the predetermined number, each of the multiply-add arithmetic circuits that obtains a third data by executing the operation using the first data and the second data based on a result of performing a row operation which is an operation of one row of the first data.
Type: Grant
Filed: June 22, 2018
Date of Patent: July 14, 2020
Assignee: FUJITSU LIMITED
Inventor: Masahiro Kuramoto
-
Patent number: 10642622
Abstract: Each of product-sum arithmetic units 501 to 503 acquires, from a register file 410, different pieces of first element data included in a first predetermined row of first data that forms a matrix; acquires, from a register file 420, same pieces of second element data included in a second predetermined row of second data that forms a matrix; performs a row portion operation that is an operation performed on the first data by an amount corresponding to a single row by performing a process of performing an operation using the acquired first element data and the second element data; and performs an operation by using the first data and the second data based on the result of the row portion operation.
Type: Grant
Filed: October 13, 2017
Date of Patent: May 5, 2020
Assignee: FUJITSU LIMITED
Inventor: Masahiro Kuramoto
-
Patent number: 10599745
Abstract: Aspects for vector operations in a neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector, wherein the first vector includes one or more first elements and the second vector includes one or more second elements. The aspects may further include one or more adders and a combiner. The one or more adders may be configured to respectively add each of the first elements to a corresponding one of the second elements to generate one or more addition results. The combiner may be configured to combine the one or more addition results into an output vector.
Type: Grant
Filed: October 26, 2018
Date of Patent: March 24, 2020
Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
Inventors: Jinhua Tao, Tian Zhi, Shaoli Liu, Tianshi Chen, Yunji Chen
-
Patent number: 10592582
Abstract: Aspects for vector operations in a neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector, wherein the first vector includes one or more first elements and the second vector includes one or more second elements. The aspects may further include one or more adders and a combiner. The one or more adders may be configured to respectively add each of the first elements to a corresponding one of the second elements to generate one or more addition results. The combiner may be configured to combine the one or more addition results into an output vector.
Type: Grant
Filed: October 26, 2018
Date of Patent: March 17, 2020
Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
Inventors: Jinhua Tao, Tian Zhi, Shaoli Liu, Tianshi Chen, Yunji Chen
-
Patent number: 10585973
Abstract: Aspects for vector operations in a neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector, wherein the first vector includes one or more first elements and the second vector includes one or more second elements. The aspects may further include one or more adders and a combiner. The one or more adders may be configured to respectively add each of the first elements to a corresponding one of the second elements to generate one or more addition results. The combiner may be configured to combine the one or more addition results into an output vector.
Type: Grant
Filed: October 26, 2018
Date of Patent: March 10, 2020
Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
Inventors: Jinhua Tao, Tian Zhi, Shaoli Liu, Tianshi Chen, Yunji Chen
-
Patent number: 10496404
Abstract: The present disclosure provides a data read-write scheduler and a reservation station for vector operations. The data read-write scheduler suspends the instruction execution by providing a read instruction cache module and a write instruction cache module and detecting conflict instructions based on the two modules. After the time is satisfied, instructions are re-executed, thereby solving the read-after-write conflict and the write-after-read conflict between instructions and guaranteeing that correct data are provided to a vector operations component. Therefore, the subject disclosure has more values for promotion and application.
Type: Grant
Filed: November 7, 2018
Date of Patent: December 3, 2019
Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
Inventors: Dong Han, Shaoli Liu, Yunji Chen, Tianshi Chen
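Conflict detection between pending reads and writes can be sketched in Python. This sketch assumes conflicts are detected by overlapping half-open address ranges held in two pending lists, which the abstract does not specify; all names are hypothetical.

```python
def ranges_overlap(a, b):
    """True when half-open address ranges (start, end) share any address."""
    return a[0] < b[1] and b[0] < a[1]


def must_suspend(kind, addr_range, pending_reads, pending_writes):
    """Decide whether a new instruction has to wait for earlier ones."""
    if kind == "read":    # read-after-write: wait for overlapping writes
        return any(ranges_overlap(addr_range, w) for w in pending_writes)
    if kind == "write":   # write-after-read: wait for overlapping reads
        return any(ranges_overlap(addr_range, r) for r in pending_reads)
    raise ValueError("kind must be 'read' or 'write'")


pending_reads = [(0, 64)]
pending_writes = [(128, 192)]
```

A suspended instruction would be retried once the conflicting entry leaves its pending list.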
-
Patent number: 10409596
Abstract: Disclosed is an apparatus comprising: a plurality of memory banks; and a controller for generating a plurality of lookup tables storing data, needed for vector arithmetic operations, copied from data stored in the plurality of memory banks, and generating vector data by reading the data in the generated lookup tables.
Type: Grant
Filed: November 17, 2015
Date of Patent: September 10, 2019
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Jung-uk Cho, Suk-jin Kim, Dong-kwan Suh
-
Patent number: 10361717
Abstract: A first error-detecting code (EDC) is computed based on a first segment of a block of information that is to be encoded, and a second EDC is computed based on at least a second segment of the block of information. The first EDC is masked with a first masking segment and the second EDC with a second masking segment to generate a first masked EDC and a second masked EDC. The first masking segment and the second masking segment are associated with a target receiver of the block of information. A codeword is generated based on a code and an input vector that includes the first segment, the first masked EDC, the second segment, and the second masked EDC. This type of coding could be useful to support early termination of blind detection at a decoder, for example.
Type: Grant
Filed: June 1, 2017
Date of Patent: July 23, 2019
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Yiqun Ge, Ran Zhang, Nan Cheng, Wuxian Shi
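The masking step in this abstract can be sketched in Python. The sketch assumes a 16-bit EDC taken from the low bits of `zlib.crc32` and XOR masking with receiver-specific values; the patent's actual EDC, masking operation, and channel code are not given in the abstract, and the final codeword generation is omitted.

```python
import zlib


def edc16(segment: bytes) -> int:
    """Stand-in 16-bit error-detecting code (low 16 bits of CRC-32)."""
    return zlib.crc32(segment) & 0xFFFF


def build_input_vector(seg1: bytes, seg2: bytes, mask1: int, mask2: int):
    """Input vector: first segment, masked EDC, second segment, masked EDC."""
    return [seg1, edc16(seg1) ^ mask1, seg2, edc16(seg2) ^ mask2]


def first_segment_is_mine(seg1: bytes, masked_edc1: int, my_mask1: int) -> bool:
    """Early check a receiver can run before processing the second segment."""
    return (edc16(seg1) ^ my_mask1) == masked_edc1


vec = build_input_vector(b"header", b"payload", mask1=0x1234, mask2=0x5678)
```

A receiver holding a different mask fails the first check and can stop early, which is the blind-detection benefit the abstract mentions.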
-
Patent number: 10318293
Abstract: A predication method for vector processors that minimizes the use of embedded predicate fields in most instructions by using separate condition code extensions. Dedicated predicate registers provide fine grain predication of vector instructions where each bit of a predicate register controls 8 bits of the vector data.
Type: Grant
Filed: July 9, 2014
Date of Patent: June 11, 2019
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Timothy Anderson, Duc Quang Bui, Joseph Zbiciak
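Byte-granular predication, where one predicate bit governs 8 bits of vector data, can be sketched in Python. The function name and the choice to leave unselected bytes unchanged are assumptions made for illustration.

```python
def predicated_write(dest: bytearray, src: bytes, predicate: int) -> None:
    """Write src byte i into dest only when predicate bit i is set."""
    for lane in range(len(src)):
        if (predicate >> lane) & 1:      # one predicate bit per 8 data bits
            dest[lane] = src[lane]


dest = bytearray(4)
predicated_write(dest, bytes([1, 2, 3, 4]), predicate=0b0101)
```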
-
Patent number: 10223002
Abstract: A compare and swap transaction can be issued by a master device to request a processing unit to select whether to write a swap data value to a storage location corresponding to a target address in dependence on whether a compare data value matches a target data value read from the storage location. The compare and swap data values are transported within a data field of the compare and swap transaction. The compare data value is packed into a first region of the data field in dependence on an offset portion of the target address and having a position within the data field corresponding to the position of the target data value within the storage location. This reduces latency and circuitry required at the processing unit for handling the compare and swap transaction.
Type: Grant
Filed: February 8, 2017
Date of Patent: March 5, 2019
Assignee: ARM Limited
Inventors: Phanindra Kumar Mannava, Bruce James Mathewson, Klas Magnus Bruce, Geoffray Matthieu Lacourba
-
Patent number: 10157061
Abstract: According to one embodiment, an occurrence of an instruction is fetched. The instruction's format specifies its only source operand from a single vector write mask register, and specifies as its destination a single general purpose register. In addition, the instruction's format includes a first field whose contents selects the single vector write mask register, and includes a second field whose contents selects the single general purpose register. The source operand is a write mask including a plurality of one bit vector write mask elements that correspond to different multi-bit data element positions within architectural vector registers. The method also includes, responsive to executing the single occurrence of the single instruction, storing data in the single general purpose register such that its contents represent either a first or second scalar constant based on whether the plurality of one bit vector write mask elements in the source operand are all zero.
Type: Grant
Filed: December 22, 2011
Date of Patent: December 18, 2018
Assignee: Intel Corporation
Inventors: Jesus Corbal, Matthew J. Craighead, Bret L. Toll, Andrew T. Forsyth
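The behavior in this abstract amounts to reducing a write mask to one of two scalar constants. The Python sketch below assumes a 16-element mask and the constants 1 and 0; the actual mask width and constants are not stated in the abstract, and the function name is hypothetical.

```python
def mask_to_scalar(write_mask: int, num_elements: int = 16,
                   if_all_zero: int = 1, otherwise: int = 0) -> int:
    """Value placed in the general purpose register for a given write mask."""
    active = write_mask & ((1 << num_elements) - 1)  # keep architectural bits
    return if_all_zero if active == 0 else otherwise
```

Scalar code can then branch on the register value, for example to skip a loop body when no vector lane is active.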
-
Patent number: 9990966
Abstract: Examples of the present disclosure provide apparatuses and methods for simulating access lines in a memory. An example method can include receiving a first bit-vector and a second bit-vector in a format associated with storing the first bit-vector in memory cells coupled to a first access line and a first number of sense lines and storing the second bit-vector in memory cells coupled to a second access line and the first number of sense lines. The method can include storing the first bit-vector in a number of memory cells coupled to the first access line and a second number of sense lines and storing the second bit-vector in a number of memory cells coupled to the first access line and a third number of sense lines, wherein a quantity of the first number of sense lines is less than a quantity of the second and third number of sense lines.
Type: Grant
Filed: July 10, 2017
Date of Patent: June 5, 2018
Assignee: Micron Technology, Inc.
Inventor: Jeremiah J. Willcock
-
Patent number: 9842046
Abstract: A method of an aspect includes receiving an instruction indicating a first source packed memory indices, a second source packed data operation mask, and a destination storage location. Memory indices of the packed memory indices are compared with one another. One or more sets of duplicate memory indices are identified. Data corresponding to each set of duplicate memory indices is loaded only once. The loaded data corresponding to each set of duplicate memory indices is replicated for each of the duplicate memory indices in the set. A packed data result is stored in the destination storage location in response to the instruction. The packed data result includes data elements from memory locations that are indicated by corresponding memory indices of the packed memory indices when not blocked by corresponding elements of the packed data operation mask.
Type: Grant
Filed: September 28, 2012
Date of Patent: December 12, 2017
Assignee: Intel Corporation
Inventors: Andrew T. Forsyth, Dennis R. Bradford, Jonathan C. Hall
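A masked gather that loads each duplicate index only once can be sketched in Python. The sketch assumes masked-off lanes keep the prior destination value and uses a dictionary to stand in for the hardware's duplicate detection; the names are hypothetical.

```python
def gather_dedup(memory, indices, mask, dest):
    """Masked gather that loads each distinct index from memory only once."""
    loaded = {}                  # index -> value already fetched
    loads = 0
    result = list(dest)
    for lane, (idx, enabled) in enumerate(zip(indices, mask)):
        if not enabled:
            continue             # blocked lane keeps its old value
        if idx not in loaded:
            loaded[idx] = memory[idx]
            loads += 1
        result[lane] = loaded[idx]   # replicate for duplicate indices
    return result, loads


memory = [10, 20, 30, 40]
result, loads = gather_dedup(memory, [1, 1, 3, 1], [1, 1, 1, 0], [0, 0, 0, 0])
```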
-
Patent number: 9830151
Abstract: An apparatus and method for performing vector index loads and stores. For example, one embodiment of a processor comprises: a vector index register to store a plurality of index values; a mask register to store a plurality of mask bits; a vector register to store a plurality of vector data elements loaded from memory; and vector index load logic to identify an index stored in the vector index register to be used for a load operation using an immediate value and to responsively combine the index with a base memory address to determine a memory address for the load operation, the vector index load logic to load vector data elements from the memory address to the vector register in accordance with the plurality of mask bits.
Type: Grant
Filed: December 23, 2014
Date of Patent: November 28, 2017
Assignee: INTEL CORPORATION
Inventors: Ashish Jha, Robert Valentine, Elmoustapha Ould-Ahmed-Vall
-
Patent number: 9740493
Abstract: Loop vectorization methods and apparatus are disclosed. An example method includes generating a first control mask for a set of iterations of a loop by evaluating a condition of the loop, wherein generating the first control mask includes setting a bit of the first control mask to a first value when the condition indicates that an operation of the loop is to be executed, and setting the bit of the first control mask to a second value when the condition indicates that the operation of the loop is to be bypassed. The example method also includes compressing indexes corresponding to the first set of iterations of the loop according to the first control mask.
Type: Grant
Filed: September 28, 2012
Date of Patent: August 22, 2017
Assignee: Intel Corporation
Inventors: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin
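The two steps named in this abstract, building a control mask and compressing indexes, can be sketched in Python. The condition and function names are hypothetical, and 1 and 0 stand in for the first and second mask values.

```python
def control_mask(values, condition):
    """1 where the loop body should run for that iteration, 0 where bypassed."""
    return [1 if condition(v) else 0 for v in values]


def compress_indexes(indexes, mask):
    """Pack the indexes of active iterations contiguously."""
    return [i for i, m in zip(indexes, mask) if m]


values = [3, -1, 4, -5]
mask = control_mask(values, lambda v: v > 0)
active = compress_indexes(range(len(values)), mask)
```

After compression, a later vector operation can process a dense group of active iterations instead of carrying idle lanes.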
-
Patent number: 9552208
Abstract: A system, method, and computer program product are provided for remapping registers based on a change in execution mode. A sequence of instructions is received for execution by a processor and a change in an execution mode from a first execution mode to a second execution mode within the sequence of instructions is identified, where a first register mapping is associated with the first execution mode and a second register mapping is associated with the second execution mode. Data stored in a set of registers within a processor is reorganized based on the first register mapping and the second register mapping in response to the change in the execution mode.
Type: Grant
Filed: December 20, 2013
Date of Patent: January 24, 2017
Assignee: NVIDIA Corporation
Inventors: Ben Hertzberg, Guillermo Juan Rozas, Alexander Christian Klaiber, Nickolas Andrew Fortino
-
Patent number: 9513917
Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.
Type: Grant
Filed: January 31, 2014
Date of Patent: December 6, 2016
Assignee: Intel Corporation
Inventors: Robert C. Valentine, Jesus Corbal San Adrian, Roger Espasa Sans, Robert D. Cavin, Bret L. Toll, Santiago Galan Duran, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Edward Thomas Grochowski, Jonathan Cannon Hall, Dennis R. Bradford, Elmoustapha Ould-Ahmed-Vall, James C. Abel, Mark Charney, Seth Abraham, Suleyman Sair, Andrew Thomas Forsyth, Lisa Wu, Charles Yount
-
Patent number: 9330057
Abstract: A reconfigurable processor includes a plurality of mini-cores and an external network to which the mini-cores are connected. Each of the mini-cores includes a first function unit including a first group of operation elements, a second function unit including a second group of operation elements that is different from the first group of operation elements, and an internal network to which the first function unit and the second function unit are connected.
Type: Grant
Filed: December 11, 2012
Date of Patent: May 3, 2016
Assignee: Samsung Electronics Co., Ltd.
Inventors: Dong-Kwan Suh, Suk-Jin Kim, Hyeong-Seok Yu, Ki-Seok Kwon, Jae-Un Park
-
Patent number: 9124935
Abstract: A television signal receiver receives and stores television signals encoded at a variable data rate. Time information is generated based on the time of receipt of the signals that defines the duration of the television signals when output in decompressed form at a substantially constant data rate. The received signals are then written to a file on a hard disk (13) in received order together with the time information. The time information of signals stored in the file is monitored and old signals are deleted from the file such that the file stores signals corresponding to a predetermined period of time.
Type: Grant
Filed: November 13, 2002
Date of Patent: September 1, 2015
Assignee: BRITISH SKY BROADCASTING LTD.
Inventors: Xavier Willame, Nigel Bodkin, Nicholas James, Ellen Fiona Collins, Benjamin Johnathan Freeman, Brian Francis Sullivan
-
Patent number: 9081564
Abstract: A data processing apparatus having processing circuitry, a scalar register bank and a vector register bank, including decoding circuitry arranged to decode a sequence of instructions to generate control signals for the processing circuitry. The decoding circuitry is responsive to a decode modifier instruction within the sequence of instructions to alter decoding of a subsequent scalar instruction in the sequence by mapping at least one scalar operand specified by the subsequent scalar instruction to at least one vector operand in the vector register bank, and, in dependence on the scalar operation specified by the subsequent scalar instruction, determining a vector operation to be performed on at least a subset of the operand elements within the at least one vector operand. Such an approach enables a wide variety of vector operations to be specified without the need to individually define separate vector instructions for those vector operations.
Type: Grant
Filed: April 4, 2012
Date of Patent: July 14, 2015
Assignee: ARM Limited
Inventor: Alastair David Reid
-
Patent number: 9081561Abstract: The present invention relates to a method for improving execution performance of multiply-add instructions during compiling, comprising the following steps of: compiling a source code by a compiler to acquire internal representation; optimizing; generating a machine code on the basis of a target processor, and allocating a physical register to a pseudo-register in the machine code; and improving results of register allocation to multiply-accumulate instructions. The method for improving execution performance of multiply-add instructions during compiling provided by the present invention has the following advantages: the compiler is allowed to realize procedure optimization by acquiring the optimal MAC (multiply-accumulate) instruction use gain.Type: GrantFiled: April 21, 2014Date of Patent: July 14, 2015Assignee: SHENZHEN ZHONGWEIDIAN TECHNOLOGY LIMITEDInventor: Fred Chow
-
Patent number: 9064005Abstract: A computer-implemented system and method is disclosed for retrieving documents using context-dependent probabilistic modeling of words and documents. The present invention uses multiple overlapping vectors to represent each document. Each vector is centered on each of the words in the document and includes the local environment. The vectors are used to build probability models that are used for predictions of related documents and related keywords. The results of the statistical analysis are used for retrieving an indexed document, for extracting features from a document, or for finding a word within a document. The statistical evaluation is also used to evaluate the probability of relation between the key words appearing in the document and building a vocabulary of key words that are generally found together. The results of the analysis are stored in a repository. Searches of the data repository produce a list of related documents and a list of related terms.Type: GrantFiled: July 12, 2007Date of Patent: June 23, 2015Assignee: Nuance Communications, Inc.Inventor: Jan Magnus Stensmo
-
Publication number: 20150143073Abstract: A data processing system is described in which a plurality of data processing units 52_1 . . . 52_N cooperate with one another in order to process incoming data packets or an incoming data stream. Tasks are managed using a task list which is accessible and updateable by each data processing unit.Type: ApplicationFiled: January 20, 2015Publication date: May 21, 2015Applicant: BLUWIRELESS TECHNOLOGY LIMITEDInventors: Paul Winser, RAY MCCONNELL
-
Patent number: 9038073Abstract: Efficient data processing apparatus and methods include hardware components which are pre-programmed by software. Each hardware component triggers the other to complete its tasks. After the final pre-programmed hardware task is complete, the hardware component issues a software interrupt.Type: GrantFiled: August 13, 2009Date of Patent: May 19, 2015Assignee: QUALCOMM IncorporatedInventors: Mathias Kohlenz, Irfan Anwar Khan, Sathyanarayan Madhusudan, Shailesh Maheshwari, Srividhya Krishnamoorthy, Sandeep Urgaonkar, Thomas Klingenbrunn, Tim Tynghuei Liou, Idreas Mir
-
Publication number: 20150089187Abstract: A hazard check instruction has operands that specify addresses of vector elements to be read by first and second vector memory operations. The hazard check instruction outputs a dependency vector identifying, for each element position of the first vector corresponding to the first vector memory operation, which element position of the second vector that the element of the first vector depends on (if any). In an embodiment, at least one of the vector memory operations has addresses specified using a scalar address in the operands (and a vector attribute associated with the vector). In an embodiment, the operands may include predicates for one or both of the vector memory operations, indicating which vector elements are active. The dependency vector may be qualified by the predicates, indicating dependencies only for active elements.Type: ApplicationFiled: September 24, 2013Publication date: March 26, 2015Applicant: APPLE INC.Inventor: Jeffry E. Gonion
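The dependency vector described in this abstract can be sketched in Python. This is a software model under stated assumptions, not the instruction's actual semantics: the function name `hazard_check`, the 1-based encoding of positions with 0 meaning "no dependency", and the rule that only earlier element positions of the second vector are considered are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Sequence


def hazard_check(
    first_addrs: Sequence[int],
    second_addrs: Sequence[int],
    first_pred: Optional[Sequence[bool]] = None,
    second_pred: Optional[Sequence[bool]] = None,
) -> List[int]:
    """Return, per element of the first vector, the (1-based) position of the
    nearest earlier element of the second vector with the same address,
    or 0 if there is none. Inactive (predicated-off) elements are ignored."""
    n = len(first_addrs)
    if len(second_addrs) != n:
        raise ValueError("address vectors must have the same length")
    # Assumption: a missing predicate means every element is active.
    first_active = [True] * n if first_pred is None else list(first_pred)
    second_active = [True] * n if second_pred is None else list(second_pred)

    deps = [0] * n
    for i in range(n):
        if not first_active[i]:
            continue  # no dependency is reported for inactive elements
        # Scan earlier positions from nearest to farthest.
        for j in range(i - 1, -1, -1):
            if second_active[j] and second_addrs[j] == first_addrs[i]:
                deps[i] = j + 1
                break
    return deps


reads = [100, 104, 108, 104]
writes = [104, 200, 104, 300]
deps_all = hazard_check(reads, writes)
deps_masked = hazard_check(reads, writes, second_pred=[True, True, False, True])
```

In this model, masking off position 3 of the second vector makes the last element of the first vector fall back to the earlier match at position 1, which mirrors the abstract's statement that the dependency vector may be qualified by the predicates.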
-
Publication number: 20150089188Abstract: In an embodiment, a processor may implement a vector hazard check instruction to detect dependencies between vector memory operations based on the addresses of the vectors accessed by the vector memory operations. The addresses may be specified via a base address and a vector of indexes for each vector. In an embodiment, one of the base addresses may be an implied (or assumed) zero address, reducing the number of operands of the hazard check instruction.Type: ApplicationFiled: September 24, 2013Publication date: March 26, 2015Applicant: Apple Inc.Inventors: Jeffry E. Gonion, Alexander C. Klaiber
-
Publication number: 20150067298Abstract: A hardware circuit component configured to support vector operations in a scalar data path. The hardware circuit component configured to operate in a vector mode configuration and in a scalar mode configuration. The hardware circuit component configured to split the scalar mode configuration into a left half and a right half of the vector mode configuration. The hardware circuit component configured to perform one or more bit shifts over one or more stages of interconnected multiplexers in the vector mode configuration. The hardware circuit component configured to include duplicated coarse shift multiplexers at bit positions that receive data from both the left half and the right half of the vector mode configuration, resulting in one or more coarse shift multiplexers sharing the bit position.Type: ApplicationFiled: September 3, 2013Publication date: March 5, 2015Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Maarten J. Boersma, Markus Kaltenbach, Christophe J. Layer, Silvia M. Mueller
-
Publication number: 20150067299Abstract: A hardware circuit component configured to support vector operations in a scalar data path. The hardware circuit component configured to operate in a vector mode configuration and in a scalar mode configuration. The hardware circuit component configured to split the scalar mode configuration into a left half and a right half of the vector mode configuration. The hardware circuit component configured to perform one or more bit shifts over one or more stages of interconnected multiplexers in the vector mode configuration. The hardware circuit component configured to include duplicated coarse shift multiplexers at bit positions that receive data from both the left half and the right half of the vector mode configuration, resulting in one or more coarse shift multiplexers sharing the bit position.Type: ApplicationFiled: January 9, 2014Publication date: March 5, 2015Inventors: Maarten J. Boersma, Markus Kaltenbach, Christophe J. Layer, Silvia M. Mueller
-
Publication number: 20150032990Abstract: Systems and methods for implementing a scalable very-large-scale integration (VLSI) architecture to perform compressive sensing (CS) hardware reconstruction for data signals in accordance with embodiments of the invention are disclosed. The VLSI architecture is optimized for CS signal reconstruction by implementing a reformulation of the orthogonal matching pursuit (OMP) process and utilizing architecture resource sharing techniques. Typically, the VLSI architecture is a CS reconstruction engine that includes a vector and scalar computation cores where the cores can be time-multiplexed (via dynamic configuration) to perform each task associated with OMP. The vector core includes configurable processing elements (PEs) connected in parallel. Further, the cores can be linked by data-path memories, where complex data flow of OMP can be customized utilizing local memory controllers synchronized by a top-level finite-state machine.Type: ApplicationFiled: July 29, 2014Publication date: January 29, 2015Inventors: Dejan Markovic, Fengbo Ren
-
Publication number: 20150019835Abstract: A predication method for vector processors that minimizes the use of embedded predicate fields in most instructions by using separate condition code extensions. Dedicated predicate registers provide fine grain predication of vector instructions where each bit of a predicate register controls 8 bits of the vector data.Type: ApplicationFiled: July 9, 2014Publication date: January 15, 2015Inventors: Timothy Anderson, Duc Quang Bui, Joseph Zbiciak
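The byte-granular predication described in this abstract can be modeled with a short Python sketch. The function name `predicated_write`, the mapping of predicate bit i to byte lane i, and the choice to leave masked-off lanes unchanged in the destination are assumptions for illustration; the abstract states only that each predicate bit controls 8 bits of vector data.

```python
def predicated_write(dest: bytes, src: bytes, predicate: int) -> bytes:
    """Copy src into dest one byte lane at a time, but only in lanes whose
    predicate bit is set. Bit i of `predicate` is assumed to control lane i."""
    if len(dest) != len(src):
        raise ValueError("dest and src must have the same number of byte lanes")
    out = bytearray(dest)
    for lane in range(len(src)):
        if (predicate >> lane) & 1:
            out[lane] = src[lane]  # lane enabled: take the new value
        # lane disabled: keep the old destination byte (assumed merging behavior)
    return bytes(out)


old = bytes([0, 0, 0, 0])
new = bytes([1, 2, 3, 4])
merged = predicated_write(old, new, 0b0101)  # lanes 0 and 2 enabled
```

With one bit per byte, a 64-bit predicate value would cover a 512-bit vector in this model; wider elements can be predicated by setting all the bits that span the element.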
-
Publication number: 20150019836Abstract: The number of registers required is reduced by overlapping scalar and vector registers. This also allows increased compiler flexibility when mixing scalar and vector instructions. Local register read ports are minimized by restricting read access. Dedicated predicate registers reduce requirements for general registers, and allow reduction of critical timing paths by allowing the predicate registers to be placed next to the predicate unit.Type: ApplicationFiled: July 9, 2014Publication date: January 15, 2015Inventors: Timothy David Anderson, Duc Quang Bui, Mel Alan Phipps, Todd T. Hahn, Joseph Zbiciak
-
Publication number: 20150012723Abstract: A mini-core and a processor using such a mini-core are provided in which functional units of the mini-core are divided into a scalar domain processor and a vector domain processor. The processor includes at least one such mini-core, and all or a portion of functional units from among the functional units of the mini-core operate based on an operation mode.Type: ApplicationFiled: July 7, 2014Publication date: January 8, 2015Applicant: SAMSUNG ELECTRONICS CO., LTD.Inventors: Young Hwan PARK, Keshava PRASAD, Ho YANG, Yeon Bok LEE
-
Publication number: 20140359251Abstract: A signal processing device including: one or more vector processors configured to perform vector processing to a signal using a parameter, one or more scalar processors configured to perform scalar processing for generating the parameter, a first circuit coupled to the one or more vector processors and the one or more scalar processors and configured to transfer the parameter from the one or more scalar processors to the one or more vector processors, and a second circuit coupled to the one or more vector processors and another circuit that inputs the signal to the second circuit, and configured to transfer the signal among the one or more vector processors and the other circuit.Type: ApplicationFiled: May 30, 2014Publication date: December 4, 2014Applicant: FUJITSU LIMITEDInventor: Noboru KOBAYASHI
-
Publication number: 20140359250Abstract: Methods and systems are provided for inferring types in a computer program. In one example, a method comprises: identifying a type of at least one expression of the computer program; and annotating the at least one expression in the computer program when the type of the at least one expression is at least one of a varying type and a uniform type.Type: ApplicationFiled: May 28, 2013Publication date: December 4, 2014Applicant: ADVANCED MICRO DEVICES, INC.Inventor: Benedict R. Gaster
-
Publication number: 20140317376Abstract: A digital processor, such as a vector processor or a scalar processor, is provided having an instruction set with a complex angle function. A complex angle is evaluated for an input value, x, by obtaining one or more complex angle software instructions having the input value, x, as an input; in response to at least one of the complex angle software instructions, performing the following steps: invoking at least one complex angle functional unit that implements the one or more complex angle software instructions to apply the complex angle function to the input value, x; and generating an output corresponding to the complex angle of the input value, x, using one or more multipliers of a Multiply Accumulate (MAC) unit of the digital processor, wherein the complex angle software instruction is part of an instruction set of the digital signal processor. Multiplication operations optionally employ one or more multipliers of the MAC unit of the digital processor.Type: ApplicationFiled: April 17, 2014Publication date: October 23, 2014Applicant: LSI CorporationInventor: Kameran Azadet
-
Publication number: 20140297992Abstract: An apparatus and method for generating vector code are provided. The apparatus and method generate vector code from scalar-type kernel code without requiring the user to change the code type or modify the data layout, thereby enhancing the user's convenience and retaining the portability of OpenCL.Type: ApplicationFiled: March 28, 2014Publication date: October 2, 2014Applicants: Seoul National University R&DB Foundation, Samsung Electronics Co., Ltd.Inventors: Jin-Seok LEE, Seong-Gun KIM, Dong-Hoon YOO, Seok-Joong HWANG, Jeongho NAH, Jaejin LEE, Jun LEE