Application Specific Patents (Class 712/17)
-
Patent number: 11126621
Abstract: A method for increasing sargability of encrypted records to allow for searching of a first column of a first data table for encrypted values containing a search string without having to decrypt all encrypted values involves, for each respective data record in the first data table, accessing an encrypted data value from the first column, decrypting the accessed encrypted data value, generating from the decrypted data value a respective plurality of substrings of various lengths, encrypting each substring of the respective plurality of substrings as an encrypted substring token, and storing each encrypted substring token in association with a reference value for lookup of a corresponding record in the first data table. Subsequently, the first column of the first data table can be searched for encrypted values containing a first search string by encrypting the first search string and searching for encrypted substring tokens matching the encrypted first search string.
Type: Grant
Filed: January 6, 2020
Date of Patent: September 21, 2021
Assignee: ALLSCRIPTS SOFTWARE, LLC
Inventors: Igor Chmil, Mark Gregory Plunkett, Stanislav Makarskyy
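The substring-token idea in the abstract of patent 11126621 can be sketched roughly as follows. This is a minimal illustration, not the patented method: it starts from plaintext values rather than decrypting stored ones, it uses a keyed HMAC as a stand-in for deterministic encryption, and the names `KEY`, `token`, `substrings`, `build_index` and `search` are all hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict

KEY = b"demo-key"  # hypothetical key, for illustration only


def token(text):
    """Deterministic keyed token; an assumed stand-in for the encryption step."""
    return hmac.new(KEY, text.encode("utf-8"), hashlib.sha256).hexdigest()


def substrings(value, min_len=3):
    """Yield every distinct substring of at least min_len characters."""
    seen = set()
    for start in range(len(value)):
        for end in range(start + min_len, len(value) + 1):
            piece = value[start:end]
            if piece not in seen:
                seen.add(piece)
                yield piece


def build_index(records):
    """Map each substring token to the ids of the records that contain it."""
    index = defaultdict(set)
    for record_id, plaintext in records.items():
        for piece in substrings(plaintext):
            index[token(piece)].add(record_id)
    return index


def search(index, term):
    """Tokenize the search string and look it up without touching the records."""
    return set(index.get(token(term), set()))


records = {1: "Johnson", 2: "Johansson", 3: "Smith"}
index = build_index(records)
matches = search(index, "son")
```

Because equal substrings always produce equal tokens, a lookup is a single exact match on the token; the cost is an index that grows roughly quadratically with the length of each value, which is why the sketch imposes a minimum substring length.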
-
Patent number: 11080094
Abstract: Implementations of the present specification provide a method, an apparatus, and an electronic device for improving parallel performance of a CPU. The method includes: attempting to acquire data requests that are of a same type and that are allocated to the CPU core; determining a number of requests that are specified by the acquired one or more data requests; and in response to determining that the number of requests is greater than or equal to a maximum degree of parallelism: executing executable codes corresponding to the maximum degree of parallelism, wherein the maximum degree of parallelism is a maximum number of parallel threads executable by the CPU, and wherein the executable codes comprise code programs that are compiled and linked based on the maximum degree of parallelism at a time that is prior to a time of the executing.
Type: Grant
Filed: July 31, 2020
Date of Patent: August 3, 2021
Assignee: Advanced New Technologies Co., Ltd.
Inventors: Ling Ma, Wei Zhou, Changhua He
-
Patent number: 10873630
Abstract: Systems, methods, and articles of manufacture comprising processor-readable storage media are provided for implementing server architectures having dedicated systems for processing infrastructure-related workloads. For example, a computing system includes a server node. The server node includes a first processor, a second processor, and a shared memory system. The first processor is configured to execute data computing functions of an application. The second processor is configured to execute input/output (I/O) functions for the application in parallel with the data computing functions of the application executed by the first processor. The shared memory system is configured to enable exchange of messages and data between the first and second processors.
Type: Grant
Filed: September 10, 2018
Date of Patent: December 22, 2020
Assignee: EMC IP Holding Company LLC
Inventors: Dragan Savic, Michael Robillard, Adrian Michaud
-
Patent number: 10803009
Abstract: A processor includes a scalar processor core and a vector coprocessor core coupled to the scalar processor core. The scalar processor core is configured to retrieve an instruction stream from program storage, and pass vector instructions in the instruction stream to the vector coprocessor core. The vector coprocessor core includes a register file, a plurality of execution units, and a table lookup unit. The register file includes a plurality of registers. The execution units are arranged in parallel to process a plurality of data values. The execution units are coupled to the register file. The table lookup unit is coupled to the register file in parallel with the execution units. The table lookup unit is configured to retrieve table values from one or more lookup tables stored in memory by executing table lookup vector instructions in a table lookup loop.
Type: Grant
Filed: July 13, 2012
Date of Patent: October 13, 2020
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Ching-Yu Hung, Shinri Inamori, Jagadeesh Sankaran, Peter Chang
-
Patent number: 10783004
Abstract: Implementations of the present specification provide a method, an apparatus, and an electronic device for improving parallel performance of a CPU. The method includes: attempting to acquire data requests that are of a same type and that are allocated to the CPU core; determining a number of requests that are specified by the acquired one or more data requests; and in response to determining that the number of requests is greater than or equal to a maximum degree of parallelism: executing executable codes corresponding to the maximum degree of parallelism, wherein the maximum degree of parallelism is a maximum number of parallel threads executable by the CPU, and wherein the executable codes comprise code programs that are compiled and linked based on the maximum degree of parallelism at a time that is prior to a time of the executing.
Type: Grant
Filed: February 19, 2020
Date of Patent: September 22, 2020
Assignee: Alibaba Group Holding Limited
Inventors: Ling Ma, Wei Zhou, Changhua He
-
Patent number: 10764324
Abstract: A routing system for use in an IoT apparatus is proposed to include a router device. A control module of the router device determines whether to execute a routing process relating to an input message based on environment information, status information and a conflict management mechanism that relate to the router device. In the routing process, the control module executes channel operations when the input message includes a channel management instruction, and executes, when the input message relates to authentication or an application program, a relevant verification procedure or the application program.
Type: Grant
Filed: December 20, 2017
Date of Patent: September 1, 2020
Inventors: Kung-Wei Chang, Yi-Fen Chou
-
Executing a composite VLIW instruction having a scalar atom that indicates an iteration of execution
Patent number: 10572263
Abstract: A processor core includes a storage device which stores a composite very large instruction word (VLIW) instruction, an instruction unit which obtains the composite VLIW instruction from the storage device and decodes the composite VLIW instruction to determine an operation to perform, and a composite VLIW instruction execution unit which executes the decoded composite VLIW instruction to perform the operation.
Type: Grant
Filed: March 31, 2016
Date of Patent: February 25, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Bruce M. Fleischer, Thomas Winters Fox, Arpith C. Jacob, Hans Mikael Jacobson, Ravi Nair, Kevin John Patrick O'Brien, Daniel Arthur Prener
-
Patent number: 10528640
Abstract: Provided is a device configured to perform a convolution operation. The device includes bi-directional First In First Out memory including bi-directional latches configured to transfer data in a first direction or a second direction depending on a clock signal and connected to each other and performs a convolution operation of an input value and a filter. The device stores first input values corresponding to a window equivalent to a size of the filter from an input value matrix in the bi-directional First In First Out memory in response to a first convolution operation and stores second input values corresponding to a location of the window which is moved in the first direction or the second direction by a predetermined amount from locations of the first input values in the bi-directional First In First Out memory in response to a second convolution operation subsequent to the first convolution operation.
Type: Grant
Filed: May 23, 2018
Date of Patent: January 7, 2020
Assignee: Korea University Research and Business Foundation
Inventors: Jongsun Park, Woong Choi
-
Patent number: 10521449
Abstract: One or more computing nodes located in a first region may maintain a first collection of data items. A second set of one or more computing nodes in a second region may maintain a collection of data items that is at least a partial replica of the first collection. Notifications of changes to the first collection may be transmitted, via a broadcast communications channel, to a replication module, which may be included in a client application. The replication module may transmit requests to update the second collection of data based on changes to the first collection. Conflicts may be resolved through a last-write wins policy.
Type: Grant
Filed: December 17, 2014
Date of Patent: December 31, 2019
Assignee: Amazon Technologies, Inc.
Inventors: Benjamin Aldouby Schwartz, Khawaja Salman Shams, Usman Ahmed Shami, David Craig Yanacek, Khai Quang Tran
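The last-write-wins policy mentioned in the abstract of patent 10521449 can be illustrated with a small sketch. It assumes each change notification carries a write timestamp; the `Change` record and the `apply_change` helper are hypothetical names, and the abstract does not say how writes are ordered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    key: str
    value: str
    timestamp: float  # assumed write time attached by the source region


def apply_change(replica, change):
    """Apply a change to the replica only if it is newer than what is stored."""
    current = replica.get(change.key)
    if current is None or change.timestamp > current.timestamp:
        replica[change.key] = change
        return True
    return False  # a stale notification loses the conflict


replica = {}
first = apply_change(replica, Change("item", "x", 2.0))
stale = apply_change(replica, Change("item", "old", 1.0))
newer = apply_change(replica, Change("item", "new", 3.0))
```

With this rule, notifications may arrive out of order over the broadcast channel and the replica still converges on the most recently written value for each key.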
-
Patent number: 10417149
Abstract: In an embodiment, a processor includes a plurality of cores to independently execute instructions, at least one graphics engine to independently execute graphics instructions, and a power controller including a duty cycle logic to set a duty cycle having a cycle time formed of an active time window in which at least some of the plurality of cores are to be active and an idle time window in which the plurality of cores are to be in a low power state. The duty cycle logic may adjust a duration of at least one of an active time window and an inactive time window based on interrupt information to accommodate an impending interrupt within the active time window. Other embodiments are described and claimed.
Type: Grant
Filed: June 6, 2014
Date of Patent: September 17, 2019
Assignee: Intel Corporation
Inventors: Ruchika Singh, Paul S. Diefenbaugh
-
Patent number: 10387415
Abstract: Disclosed aspects relate to data arrangement management in a distributed data cluster environment of a shared pool of configurable computing resources. In the distributed data cluster environment, a set of data is monitored for a data redistribution candidate trigger. The data redistribution candidate trigger is detected with respect to the set of data. Based on the data redistribution candidate trigger, the set of data is analyzed with respect to a candidate data redistribution action. Using the candidate data redistribution action, a new data arrangement associated with the set of data is determined. Accordingly, the new data arrangement is established.
Type: Grant
Filed: June 28, 2016
Date of Patent: August 20, 2019
Assignee: International Business Machines Corporation
Inventors: Naresh K. Chainani, James H. Cho
-
Patent number: 10261807
Abstract: Embodiments of the present invention provide a novel solution to generate multiple linked device code portions within a final executable file. Embodiments of the present invention are operable to extract device code from their respective host object filesets and then link them together to form multiple linked device code portions. Also, using the identification process described by embodiments of the present invention, device code embedded within host objects may also be uniquely identified and linked in accordance with the protocols of conventional programming languages. Furthermore, these multiple linked device code portions may be then converted into distinct executable forms of code that may be encapsulated within a single executable file.
Type: Grant
Filed: March 25, 2013
Date of Patent: April 16, 2019
Assignee: NVIDIA Corporation
Inventors: Jaydeep Marathe, Michael Murphy, Sean Y. Lee
-
Patent number: 10241993
Abstract: A method to detect reusable groups of drawing commands in a sequence of drawing commands. Drawing commands are identified by checksums. Recurring and co-occurring drawing commands are combined into groups of drawing commands. Under certain conditions such a group can be replaced by a new drawing command, making the group reusable.
Type: Grant
Filed: September 27, 2017
Date of Patent: March 26, 2019
Assignee: SOFHA GMBH
Inventors: Paul Jones, Christoph Oeters
-
Patent number: 10133939
Abstract: Even if a problem has occurred with respect to a multimedia micro-computer that generates a composite image including guiding lines, while a gearshift of a vehicle is in reverse, a reset process is not performed for the multimedia micro-computer. The reset process is performed after the gearshift is determined to have moved from reverse.
Type: Grant
Filed: October 13, 2016
Date of Patent: November 20, 2018
Assignee: Fujitsu Ten Limited
Inventors: Kenji Tada, Yuji Maruyama, Nobuyuki Batou
-
Patent number: 9792117
Abstract: A method and apparatus for efficiently processing data in various formats in a single instruction multiple data ("SIMD") architecture is presented. Specifically, a method to unpack fixed-width bit values in a bit stream to a fixed-width byte stream in a SIMD architecture is presented. A method to unpack variable-length byte packed values in a byte stream in a SIMD architecture is presented. A method to decompress a run length encoded compressed bit-vector in a SIMD architecture is presented. A method to return the offset of each bit set to one in a bit-vector in a SIMD architecture is presented. A method to fetch bits from a bit-vector at specified offsets relative to a base in a SIMD architecture is presented. A method to compare values stored in two SIMD registers is presented.
Type: Grant
Filed: September 10, 2013
Date of Patent: October 17, 2017
Assignee: Oracle International Corporation
Inventors: Amit Ganesh, Shasank K. Chavan, Vineet Marwah, Jesse Kamp, Anindya C. Patthak, Michael J. Gleeson, Allison L. Holloway, Roger Macnicol
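Two of the operations listed in the abstract of patent 9792117 can be shown as plain scalar reference code: unpacking fixed-width bit values, and returning the offset of every set bit in a bit-vector. This sketch is not SIMD and makes no claim about the patented techniques; the function names and the most-significant-bit-first layout are assumptions chosen for illustration.

```python
def unpack_fixed_width(data, width, count):
    """Unpack `count` values of `width` bits each, most significant bit first."""
    bits = int.from_bytes(data, "big")
    total_bits = len(data) * 8
    mask = (1 << width) - 1
    values = []
    for i in range(count):
        # Shift the i-th field down to the low bits, then mask it out.
        shift = total_bits - (i + 1) * width
        values.append((bits >> shift) & mask)
    return values


def set_bit_offsets(data):
    """Return the offset of every bit set to one, counting from the first bit."""
    offsets = []
    for byte_index, byte in enumerate(data):
        for bit in range(8):
            if byte & (0x80 >> bit):
                offsets.append(byte_index * 8 + bit)
    return offsets


unpacked = unpack_fixed_width(bytes([0b10110001, 0b11000000]), width=3, count=5)
offsets = set_bit_offsets(bytes([0b10010000, 0b00000001]))
```

A SIMD version would process many fields per instruction; a scalar version like this one is mainly useful as a correctness reference when testing a vectorized implementation.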
-
Patent number: 9553590
Abstract: A programmable integrated circuit device includes a plurality of clusters of programmable logic resources. Programmable device interconnect resources allow user-defined interconnection between the clusters of programmable logic resources. A plurality of specialized processing blocks have dedicated arithmetic operators and programmable internal interconnect resources, and have inputs and outputs programmably connectable to the programmable device interconnect resources. A plurality of dedicated memory modules have inputs and outputs programmably connectable to the programmable device interconnect resources. Programmably connectable direct interconnect between at least one respective individual one of the specialized processing blocks and at least one respective individual one of the dedicated memory modules allows the formation of a processor element from a specialized processing block and a memory module.
Type: Grant
Filed: October 29, 2012
Date of Patent: January 24, 2017
Assignee: Altera Corporation
Inventors: Valavan Manohararajah, David Lewis
-
Patent number: 9411632
Abstract: The disclosure is directed to clustering a stream of data points. An aspect receives the stream of data points, determines a plurality of cluster centroids, divides the plurality of cluster centroids among a plurality of threads and/or processors, assigns a portion of the stream of data points to each of the plurality of threads and/or processors, and combines a plurality of clusters generated by the plurality of threads and/or processors to generate a global universe of clusters. An aspect assigns a portion of the stream of data points to each of a plurality of threads and/or processors, wherein each of the plurality of threads and/or processors determines one or more cluster centroids and generates one or more clusters around the one or more cluster centroids, and combines the one or more clusters from each of the plurality of threads and/or processors to generate a global universe of clusters.
Type: Grant
Filed: May 30, 2013
Date of Patent: August 9, 2016
Assignee: QUALCOMM INCORPORATED
Inventors: Isaac David Guedalia, Sarah Glickfield
-
Patent number: 9244727
Abstract: Disclosed is a method for implementing task-process-table based hardware control, which includes dividing a task that has to be implemented by a hardware circuit into multiple sub-processes, and determining the depth of the task process table according to the number of the sub-processes; according to the control information of the hardware unit corresponding to each sub-process and the number (SPAN) of clock cycles occupied by hardware processing for the sub-process, determining the bit width of the task process table and generating the task process table; starting the hardware unit corresponding to each sub-process in an order of the sub-processes, under the control of the control information in the task process table, and completing the processing of each sub-process. A device for implementing hardware control is also disclosed. The disclosure enables precise control of the hardware control flow and is versatile.
Type: Grant
Filed: May 5, 2011
Date of Patent: January 26, 2016
Assignee: ZTE Corporation
Inventor: Yingxian Sun
-
Patent number: 9135003
Abstract: A reconfigurable processor for efficiently performing a vector operation, and a method of controlling the reconfigurable processor are provided. The reconfigurable processor designates at least one of a plurality of processing elements as a vector lane based on vector lane configuration information, and allocates a vector operation to the designated vector lane.
Type: Grant
Filed: January 10, 2011
Date of Patent: September 15, 2015
Assignee: Samsung Electronics Co., Ltd.
Inventors: Dong-Kwan Suh, Hyeong-Seok Yu, Suk-Jin Kim
-
Patent number: 8996846
Abstract: A system, method, and computer program product are provided for efficiently performing a scan operation. In use, an array of elements is traversed by utilizing a parallel processor architecture. Such parallel processor architecture includes a plurality of processors each capable of physically executing a predetermined number of threads in parallel. For efficiency purposes, the predetermined number of threads of at least one of the processors may be executed to perform a scan operation involving a number of the elements that is a function (e.g. multiple, etc.) of the predetermined number of threads.
Type: Grant
Filed: September 27, 2007
Date of Patent: March 31, 2015
Assignee: NVIDIA Corporation
Inventors: Samuli M. Laine, Timo O. Aila, Mark J. Harris
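For readers unfamiliar with the scan (prefix sum) operation named in the abstract of patent 8996846, the sketch below simulates a step-doubling inclusive scan of the kind commonly used on parallel hardware. It is a generic textbook formulation run sequentially in Python, not the patented method, and `inclusive_scan` is a hypothetical name.

```python
def inclusive_scan(values):
    """Inclusive prefix sum using log2(n) doubling steps."""
    result = list(values)
    offset = 1
    while offset < len(result):
        # Snapshot the previous step so every position reads old values,
        # as it would if each position were handled by its own thread.
        previous = list(result)
        for i in range(offset, len(result)):
            result[i] = previous[i] + previous[i - offset]
        offset *= 2
    return result


scanned = inclusive_scan([3, 1, 7, 0, 4, 1, 6, 3])
```

Each pass of the inner loop has no dependencies between positions, which is what lets a group of threads execute it together; the abstract's point about sizing the scan as a function of the thread count concerns how such passes are mapped onto the hardware.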
-
Patent number: 8966225
Abstract: A management unit causes a plurality of processing units to execute a calculation process. A determining unit determines whether a communication time for a communication process of exchanging a calculation result obtained from the calculation process is longer than a calculation time for the calculation process, the communication process being executed between a first computational node including the processor and a second computational node being a different computational node from the first computational node. A control unit limits the number of processing units when the determining unit has determined that the communication time is longer than the calculation time.
Type: Grant
Filed: January 5, 2012
Date of Patent: February 24, 2015
Assignee: Fujitsu Limited
Inventor: Yusuke Oishi
-
Patent number: 8918553
Abstract: A mechanism for programming a direct memory access engine operating as a multithreaded processor is provided. A plurality of programs is received from a host processor in a local memory associated with the direct memory access engine. A request is received in the direct memory access engine from the host processor indicating that the plurality of programs located in the local memory is to be executed. The direct memory access engine executes two or more of the plurality of programs without intervention by a host processor. As each of the two or more of the plurality of programs completes execution, the direct memory access engine sends a completion notification to the host processor that indicates that the program has completed execution.
Type: Grant
Filed: June 5, 2012
Date of Patent: December 23, 2014
Assignee: International Business Machines Corporation
Inventors: Brian K. Flachs, Harm P. Hofstee, Charles R. Johns, Matthew E. King, John S. Liberty, Brad W. Michael
-
Patent number: 8856493
Abstract: A method of rotating data in a plurality of processing elements comprises a plurality of shifting operations and a plurality of storing operations, with the shifting and storing operations coordinated to enable a three shears operation to be performed on the data. The plurality of storing operations is responsive to the processing element's positions.
Type: Grant
Filed: February 14, 2012
Date of Patent: October 7, 2014
Assignee: Micron Technology, Inc.
Inventor: Mark Beaumont
-
Publication number: 20140244972
Abstract: An apparatus for physical properties computation comprising an array processor. The array processor comprises a plurality of processing elements, said processing elements arranged in a grid. A processing unit (PU) is coupled to the array processor. A local memory is coupled to the PU. The PU broadcasts data to rows of said processing elements in said grid, and performs physical computations in an order of complexity of O((√N) log N).
Type: Application
Filed: November 4, 2013
Publication date: August 28, 2014
Applicant: AiSeek Ltd.
Inventors: Roy ARMONI, Ramon AXELROD
-
Patent number: 8688958
Abstract: A processor has a plurality of PEs (processing elements) that operate in parallel based on operation commands and an information collection unit that collects the data of the plurality of PEs, wherein each of the plurality of PEs holds data and a condition flag, supplies the data and the condition flag to the information collection unit upon receiving an operation command, and upon receiving an update request for updating the condition flag, updates the condition flag in accordance with the update request that was received; and the information collection unit, upon receiving the data and the condition flags, selects one PE based on a predetermined order of priority from among the PEs for which the received condition flags are active and both supplies the data of the selected PE as collection result data and supplies an update request for updating the condition flag of the PE that was selected.
Type: Grant
Filed: January 14, 2010
Date of Patent: April 1, 2014
Assignee: NEC Corporation
Inventor: Shohei Nomoto
-
Patent number: 8688956
Abstract: The execution engine is a new organization for a digital data processing apparatus for highly parallel execution of structured fine-grain parallel computations. The execution engine includes a memory for storing data and a domain flow program, a controller for requesting the domain flow program from the memory, and further for translating the program into programming information, a processor fabric for processing the domain flow programming information and a crossbar for sending tokens and the programming information to the processor fabric.
Type: Grant
Filed: May 18, 2009
Date of Patent: April 1, 2014
Assignee: Stillwater Supercomputing, Inc.
Inventor: Erwinus Theodorus Leonardus Omtzigt
-
Patent number: 8638805
Abstract: Described embodiments provide for restructuring a scheduling hierarchy of a network processor having a plurality of processing modules and a shared memory. The scheduling hierarchy schedules packets for transmission. The network processor generates tasks corresponding to each received packet associated with a data flow. A traffic manager receives tasks provided by one of the processing modules and determines a queue of the scheduling hierarchy corresponding to the task. The queue has a parent scheduler at each of one or more next levels of the scheduling hierarchy up to a root scheduler, forming a branch of the hierarchy. The traffic manager determines if the queue and one or more of the parent schedulers of the branch should be restructured. If so, the traffic manager drops subsequently received tasks for the branch, drains all tasks of the branch, and removes the corresponding nodes of the branch from the scheduling hierarchy.
Type: Grant
Filed: September 30, 2011
Date of Patent: January 28, 2014
Assignee: LSI Corporation
Inventors: Balakrishnan Sundararaman, Shashank Nemawarkar, David Sonnier, Shailendra Aulakh, Allen Vestal
-
Patent number: 8620940
Abstract: A method for processing data for pattern matching includes: receiving a first sequence of data values; and generating a second sequence of data values based on the first sequence and one or more patterns and history of data values in the first sequence, wherein the second sequence has fewer data values than the first sequence and all subsequences in the first sequence that match at least one of the one or more patterns are represented in the second sequence.
Type: Grant
Filed: December 23, 2010
Date of Patent: December 31, 2013
Assignee: Tilera Corporation
Inventors: Mathew Hostetter, Kenneth M. Steele, Vijay Aggarwal
-
Patent number: 8549259
Abstract: Systems, methods and articles of manufacture are disclosed for performing a vector collective operation on a parallel computing system that includes multiple compute nodes and a network connecting the compute nodes that includes an ALU. A collective operation may be performed to determine displacements for the vector collective operation. Descriptors for the vector collective operation may be generated based on the displacements. The vector collective operation may then be performed using the descriptors.
Type: Grant
Filed: September 15, 2010
Date of Patent: October 1, 2013
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
-
Patent number: 8533432
Abstract: Methods, apparatuses and storage devices associated with cache and/or socket sensitive breadth-first iterative traversal of a graph by parallel threads are described. A vertices visited array (VIS) may be employed to track graph vertices visited. VIS may be partitioned into VIS sub-arrays, taking into consideration cache sizes of LLC, to reduce likelihood of evictions. Potential boundary vertices arrays (PBV) may be employed to store potential boundary vertices for a next iteration, for vertices being visited in a current iteration. The number of PBV generated for each thread may take into consideration a number of sockets, over which the processor cores employed are distributed. The threads may be load balanced; further data locality awareness to reduce inter-socket communication may be considered, and/or lock-and-atomic free update operations may be employed.
Type: Grant
Filed: September 27, 2012
Date of Patent: September 10, 2013
Assignee: Intel Corporation
Inventors: Nadathur Rajagopalan Satish, Changkyu Kim, Jatin Chhugani, Jason D. Sewall
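The breadth-first traversal underlying the abstract of patent 8533432 can be sketched in a single thread as follows. The `visited` list loosely plays the role of the VIS array and `next_frontier` loosely plays the role of a PBV array; the cache-aware partitioning, per-socket arrays and lock-free updates described in the abstract are not modeled, and `bfs_levels` is a hypothetical name.

```python
def bfs_levels(adjacency, source):
    """Return the BFS depth of every vertex, or -1 if it is unreachable."""
    visited = [False] * len(adjacency)  # loosely corresponds to VIS
    depth = [-1] * len(adjacency)
    visited[source] = True
    depth[source] = 0
    frontier = [source]
    level = 0
    while frontier:
        level += 1
        next_frontier = []  # loosely corresponds to a PBV array
        for vertex in frontier:
            for neighbor in adjacency[vertex]:
                if not visited[neighbor]:
                    visited[neighbor] = True
                    depth[neighbor] = level
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return depth


# Adjacency lists for a small directed graph; vertex 5 is unreachable from 0.
graph = [[1, 2], [3], [3], [4], [], [0]]
depths = bfs_levels(graph, 0)
```

In a parallel version, the frontier would be divided among threads on each iteration, which is where contention on the visited array and the placement of the per-thread frontier arrays start to matter.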
-
Patent number: 8532288
Abstract: A cryptographic engine for modulo N multiplication, which is structured as a plurality of almost identical, serially connected Processing Elements, is controlled so as to accept input in blocks that are smaller than the maximum capability of the engine in terms of bits multiplied at one time. The serially connected hardware is thus partitioned on the fly to process a variety of cryptographic key sizes while still maintaining all of the hardware in an active processing state.
Type: Grant
Filed: December 1, 2006
Date of Patent: September 10, 2013
Assignee: International Business Machines Corporation
Inventors: Camil Fayad, John K. Li, Siegfried K. H. Sutter, Phil C. Yeh
-
Patent number: 8504659
Abstract: The present invention provides a method and apparatus for configuration of adaptive integrated circuitry, to provide one or more operating modes or other functionality in a communication device, such as a cellular telephone, a GSM telephone, another type of mobile telephone or mobile station, or any other type of media communication device, including video, voice or radio, or other forms of multimedia. The adaptive integrated circuitry is configured and reconfigured for multiple tasks, such as channel acquisition, voice transmission, or multimedia and other data processing. In the preferred embodiment, the configuration and reconfiguration occurs to adaptively optimize the performance of the particular activity over time, such as to increase the speed of channel acquisition, increase throughput rates, increase perceived voice and media quality, and decrease the rate of dropped communication sessions.
Type: Grant
Filed: May 7, 2008
Date of Patent: August 6, 2013
Assignee: Altera Corporation
Inventors: Paul L. Master, Bohumir Uvacek
-
Patent number: 8504662
Abstract: The present invention provides a method and apparatus for configuration of adaptive integrated circuitry, to provide one or more operating modes or other functionality in a communication device, such as a cellular telephone, a GSM telephone, another type of mobile telephone or mobile station, or any other type of media communication device, including video, voice or radio, or other forms of multimedia. The adaptive integrated circuitry is configured and reconfigured for multiple tasks, such as channel acquisition, voice transmission, or multimedia and other data processing. In the preferred embodiment, the configuration and reconfiguration occurs to adaptively optimize the performance of the particular activity over time, such as to increase the speed of channel acquisition, increase throughput rates, increase perceived voice and media quality, and decrease the rate of dropped communication sessions.
Type: Grant
Filed: March 8, 2010
Date of Patent: August 6, 2013
Assignee: Altera Corporation
Inventors: Paul L. Master, Bohumir Uvacek
-
Patent number: 8504661
Abstract: The present invention provides a method and apparatus for configuration of adaptive integrated circuitry, to provide one or more operating modes or other functionality in a communication device, such as a cellular telephone, a GSM telephone, another type of mobile telephone or mobile station, or any other type of media communication device, including video, voice or radio, or other forms of multimedia. The adaptive integrated circuitry is configured and reconfigured for multiple tasks, such as channel acquisition, voice transmission, or multimedia and other data processing. In the preferred embodiment, the configuration and reconfiguration occurs to adaptively optimize the performance of the particular activity over time, such as to increase the speed of channel acquisition, increase throughput rates, increase perceived voice and media quality, and decrease the rate of dropped communication sessions.
Type: Grant
Filed: March 8, 2010
Date of Patent: August 6, 2013
Assignee: Altera Corporation
Inventors: Paul L. Master, Bohumir Uvacek
-
Patent number: 8484444
Abstract: A multi-node video signal processor (VSPN) is described that tightly couples multiple multi-cycle state machines (hardware assist units) to each processor and each memory in each node of an N node scalable array processor. VSPN memory hardware assist instructions are used to initiate multi-cycle state machine functions, to pass parameters to the multi-cycle state machines, to fetch operands from a node's memory, and to control the transfer of results from the multi-cycle state machines.
Type: Grant
Filed: March 1, 2011
Date of Patent: July 9, 2013
Assignee: Altera Corporation
Inventors: Gerald George Pechanek, Mihailo Stojancic
-
Patent number: 8443169
Abstract: A Wings array system for communicating between nodes using store and load instructions is described. Couplings between nodes are made according to a 1 to N adjacency of connections in each dimension of a G×H matrix of nodes, where G≥N and H≥N and N is a positive odd integer. Also, a 3D Wings neural network processor is described as a 3D G×H×K network of neurons, each neuron with an N×N×N array of synaptic weight values stored in coupled memory nodes, where G≥N, H≥N, K≥N, and N is determined from a 1 to N adjacency of connections used in the G×H×K network. Further, a hexagonal processor array is organized according to an INFORM coordinate system having axes at 60 degree spacing. Nodes communicate on row paths parallel to an FM dimension of communication, column paths parallel to an IO dimension of communication, and diagonal paths parallel to an NR dimension of communication.
Type: Grant
Filed: February 28, 2011
Date of Patent: May 14, 2013
Inventor: Gerald George Pechanek
-
Patent number: 8402251Abstract: A semiconductor device includes a first circuit that executes a first calculation, a second circuit that includes a first storage unit therein and executes a second calculation, a controller that outputs a first address for specifying a first execution circuit for the first calculation and a second execution circuit for the second calculation, to the first circuit and the second circuit, and controls input of data into the first circuit, and a bus that transfers a result of the first calculation executed by the first circuit to the second circuit, wherein the result of the first calculation can be conditionally used as an address for specifying the second execution circuit.Type: GrantFiled: August 19, 2009Date of Patent: March 19, 2013Assignee: Kabushiki Kaisha ToshibaInventors: Takashi Yoshikawa, Shigehiro Asano
-
Patent number: 8375395Abstract: A computing architecture comprises a plurality of processing elements to perform data processing calculations, a plurality of memory elements to store the data processing results, and a reconfigurable interconnect network to couple the processing elements to the memory elements. The reconfigurable interconnect network includes a switching element, a control element, a plurality of processor interface units, a plurality of memory interface units, and a plurality of application control units. In various embodiments, the processing elements and the interconnect network may be implemented in a field-programmable gate array.Type: GrantFiled: January 3, 2008Date of Patent: February 12, 2013Assignee: L3 Communications Integrated Systems, L.P.Inventors: Deepak Prasanna, Matthew Pascal DeLaquil
-
Patent number: 8368423Abstract: Systems and methods for partial reconfiguration of reconfigurable application specific integrated circuit (ASIC) devices that may employ an interconnection template to allow partial reconfiguration (PR) blocks of an ASIC device to be selectively and dynamically interconnected and/or disconnected in standardized fashion from communication with a packet router within the same ASIC device.Type: GrantFiled: December 23, 2009Date of Patent: February 5, 2013Assignee: L-3 Communications Integrated Systems, L.P.Inventors: Jerry Yancey, Aya N. Bennett, Timothy M. Adams, Mathew A. Sanford
-
Patent number: 8356161Abstract: The present invention concerns a new category of integrated circuitry and a new methodology for adaptive or reconfigurable computing. The preferred IC embodiment includes a plurality of heterogeneous computational elements coupled to an interconnection network. The plurality of heterogeneous computational elements include corresponding computational elements having fixed and differing architectures, such as fixed architectures for different functions such as memory, addition, multiplication, complex multiplication, subtraction, configuration, reconfiguration, control, input, output, and field programmability. In response to configuration information, the interconnection network is operative in real-time to configure and reconfigure the plurality of heterogeneous computational elements for a plurality of different functional modes, including linear algorithmic operations, non-linear algorithmic operations, finite state machine operations, memory operations, and bit-level manipulations.Type: GrantFiled: October 15, 2008Date of Patent: January 15, 2013Assignee: QST Holdings LLCInventors: Paul L. Master, Eugene Hogenauer, Walter James Scheuermann
-
Patent number: 8327115Abstract: A matrix of execution blocks forms a set of rows and columns. The rows support parallel execution of instructions and the columns support execution of dependent instructions. The matrix of execution blocks processes a single block of instructions specifying parallel and dependent instructions.Type: GrantFiled: April 12, 2007Date of Patent: December 4, 2012Assignee: Soft Machines, Inc.Inventor: Mohammad A. Abdallah
-
Publication number: 20120210097Abstract: A management unit causes a plurality of processing units to execute a calculation process. A determining unit determines whether a communication time for a communication process of exchanging a calculation result obtained from the calculation process is longer than a calculation time for the calculation process, the communication process being executed between a first computational node including the processor and a second computational node being a different computational node from the first computational node. A control unit limits the number of processing units when the determining unit has determined that the communication time is longer than the calculation time.Type: ApplicationFiled: January 5, 2012Publication date: August 16, 2012Applicant: Fujitsu LimitedInventor: Yusuke OISHI
-
Patent number: 8225074Abstract: In accordance with exemplary implementations, application computation operations and communications between operations on a host processing platform may be adapted to conform to the memory capacity of a parallel accelerator. Computation operations may be split and scheduled such that the computation operations fit within the memory capacity of the accelerator. Further, the operations may be automatically adapted without any modification to the code of an application. In addition, data transfers between a host processing platform and the parallel accelerator may be minimized in accordance with exemplary aspects of the present principles to improve processing performance.Type: GrantFiled: March 6, 2009Date of Patent: July 17, 2012Assignee: NEC Laboratories America, Inc.Inventors: Srimat T. Chakradhar, Anand Raghunathan, Narayanan Sundaram
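As a rough illustration of splitting a computation so that each piece fits within an accelerator's memory, the sketch below cuts a range of work items into index ranges sized to a byte budget. The function name `split_to_fit` and its parameters are hypothetical and not taken from the patent; the patent's actual splitting and scheduling method is not described in the abstract.

```python
def split_to_fit(num_items, bytes_per_item, capacity_bytes):
    """Return (start, end) index ranges whose data fits in capacity_bytes.

    Hypothetical helper: assumes every item needs the same number of bytes
    and that chunks can be processed independently, one after another.
    """
    items_per_chunk = capacity_bytes // bytes_per_item
    if items_per_chunk < 1:
        raise ValueError("a single item does not fit in accelerator memory")
    return [
        (start, min(start + items_per_chunk, num_items))
        for start in range(0, num_items, items_per_chunk)
    ]


if __name__ == "__main__":
    # Ten 4-byte items with a 16-byte budget: at most four items per chunk.
    print(split_to_fit(10, 4, 16))
```

Each range can then be transferred, processed, and copied back in turn, so that no single step exceeds the assumed memory budget.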
-
Patent number: 8205210Abstract: A method, apparatus and system for adaptably distributing video server processes among processing elements within a video server such that video server operation may be adapted in a manner facilitating rigorous timing constraints.Type: GrantFiled: August 12, 2008Date of Patent: June 19, 2012Assignee: Comcast IP Holdings I, LLCInventors: Geoffrey Alan Cleary, Joseph I. Brown
-
Publication number: 20120144155Abstract: A method of rotating data in a plurality of processing elements comprises a plurality of shifting operations and a plurality of storing operations, with the shifting and storing operations coordinated to enable a three shears operation to be performed on the data. The plurality of storing operations is responsive to the processing element's positions.Type: ApplicationFiled: February 14, 2012Publication date: June 7, 2012Inventor: Mark Beaumont
-
Patent number: 8156364Abstract: A method (which can be computer implemented) for processing a plurality of adjacent rows of data units, using a plurality of parallel processors, given (i) a predetermined processing order, and (ii) a specified inter-row dependency structure, includes the steps of determining starting times for each individual one of the processors, and maintaining synchronization across the processors, while ensuring that the dependency structure is not violated. Not all the starting times are the same, and a sum of absolute differences between (i) starting times of any given processor, and (ii) that one of the processors having an earliest starting time, is minimized.Type: GrantFiled: June 12, 2007Date of Patent: April 10, 2012Assignee: International Business Machines CorporationInventors: Krishna Ratakonda, Deepak S. Turaga
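A minimal sketch of staggered starting times under simplifying assumptions that are not stated in the abstract: one processor per row, each data unit takes one time step, units are processed left to right, and a unit in column c needs the previous row's units up to column c + lookahead to be finished. The names `staggered_start_times`, `respects_dependencies`, and `lookahead` are illustrative only.

```python
def staggered_start_times(num_rows, lookahead):
    """Start row r exactly (lookahead + 1) steps after row r - 1.

    Under the assumptions above this is the smallest delay per row that
    keeps the dependency satisfied, so the offsets from the earliest
    processor are as small as they can be.
    """
    step = lookahead + 1
    return [row * step for row in range(num_rows)]


def respects_dependencies(starts, row_length, lookahead):
    """Check that no unit begins before the units it depends on have finished."""
    for row in range(1, len(starts)):
        for col in range(row_length):
            needed = min(col + lookahead, row_length - 1)
            # Unit `needed` of the previous row finishes at the end of its step.
            finished_prev = starts[row - 1] + needed + 1
            begins_here = starts[row] + col
            if begins_here < finished_prev:
                return False
    return True


if __name__ == "__main__":
    starts = staggered_start_times(4, 1)
    print(starts, respects_dependencies(starts, 8, 1))
```

With these assumptions the processors stay in lockstep after their staggered starts, so the dependency check only has to hold at the start offsets.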
-
Patent number: 8140826Abstract: Methods, apparatus, and computer program products are disclosed for executing a gather operation on a parallel computer according to embodiments of the present invention. Embodiments include configuring, by the logical root, a result buffer on the logical root, the result buffer having positions, each position corresponding to a ranked node in the operational group and for storing contribution data gathered from that ranked node.Type: GrantFiled: May 29, 2007Date of Patent: March 20, 2012Assignee: International Business Machines CorporationInventors: Charles J. Archer, Joseph D. Ratterman
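The result buffer described above can be pictured with a small single-process sketch: one buffer position per rank, filled with that rank's contribution. The function `gather_to_root` is a hypothetical name, and the sketch assumes ranks are numbered 0 to n-1; it does not model the network transfers a parallel computer would perform.

```python
def gather_to_root(contributions):
    """Place each rank's contribution at the buffer position matching its rank.

    contributions: dict mapping rank (0..n-1, an assumption) to its data.
    """
    result_buffer = [None] * len(contributions)
    for rank, data in contributions.items():
        result_buffer[rank] = data
    return result_buffer


if __name__ == "__main__":
    # Contributions may arrive in any order; the buffer is ordered by rank.
    print(gather_to_root({2: "c", 0: "a", 1: "b"}))
```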
-
Patent number: 8135940Abstract: A method of rotating data in a plurality of processing elements comprises a plurality of shifting operations and a plurality of storing operations, with the shifting and storing operations coordinated to enable a three shears operation to be performed on the data. The plurality of storing operations is responsive to the processing element's positions.Type: GrantFiled: March 15, 2011Date of Patent: March 13, 2012Assignee: Micron Technologies, Inc.Inventor: Mark Beaumont
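For context, a three shears rotation replaces one rotation by an angle theta with a horizontal shear, a vertical shear, and a second horizontal shear. The sketch below applies the standard decomposition to a single point; it illustrates only the underlying arithmetic, not the patent's shifting and storing operations across processing elements, and the function name is an assumption.

```python
import math


def three_shear_rotate(x, y, theta):
    """Rotate (x, y) counter-clockwise by theta using three shears.

    Uses the standard identity: shear x by -tan(theta / 2), shear y by
    sin(theta), then shear x by -tan(theta / 2) again. The angle theta
    must not be an odd multiple of pi, where tan(theta / 2) is undefined.
    """
    t = -math.tan(theta / 2.0)
    s = math.sin(theta)
    x = x + t * y  # first horizontal shear
    y = y + s * x  # vertical shear, using the already sheared x
    x = x + t * y  # second horizontal shear
    return x, y


if __name__ == "__main__":
    print(three_shear_rotate(1.0, 0.0, math.pi / 2))
```

On a grid, each shear moves a whole row or column by an amount that depends only on its position, which is why a rotation can be expressed as position-dependent shifts.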
-
Publication number: 20120047349Abstract: A data transfer system includes: a plurality of processors; and a plurality of data transfer units that execute a data transfer from one processor to another processor via a plurality of input ports and a plurality of output ports. The data transfer unit includes: an arbitration unit that executes arbitration of conflicting data sent to a same next destination; and a strength information notification unit that sends strength information indicating a number of conflicts of the arbitrated conflicting data to the next destination. The arbitration unit decides a selection ratio, which is a ratio of selecting each of the input ports and receiving the conflicting data from the selected input port, according to a ratio between the input ports in relation to magnitude of the number of conflicts indicated by the strength information received from each of the input ports.Type: ApplicationFiled: August 19, 2011Publication date: February 23, 2012Applicant: NEC CORPORATIONInventor: Yasushi KANOH
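One way to picture a selection ratio that follows the reported conflict counts is a weighted round-robin over the input ports. The sketch below uses the smooth weighted round-robin technique as a stand-in; the abstract does not say which arbitration algorithm the application uses, and the function name, port names, and positive integer strengths are assumptions.

```python
def selection_schedule(strengths):
    """Return one cycle of port selections proportional to each port's strength.

    strengths: dict mapping input port name to a positive integer conflict
    count (assumed). Smooth weighted round-robin: every step, each port's
    running score grows by its strength, the highest score is selected, and
    the selected port's score is reduced by the total strength.
    """
    total = sum(strengths.values())
    current = {port: 0 for port in strengths}
    schedule = []
    for _ in range(total):
        for port, weight in strengths.items():
            current[port] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        schedule.append(chosen)
    return schedule


if __name__ == "__main__":
    print(selection_schedule({"in0": 3, "in1": 1}))
```

Over one full cycle each port is selected as many times as its strength, so a port that has already absorbed more upstream conflicts receives a proportionally larger share.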
-
Patent number: 8103853Abstract: A chip having an intelligent fabric may include a soft application processor, a reconfigurable hardware intelligent processor, a partitioned memory storage, and an interface to an external reconfigurable communication processor. The reconfigurable hardware intelligent processor may be configured to implement a distributed reconfigurable processor, and to provide cognitive control for at least one of allocation, reallocation, and performance monitoring.Type: GrantFiled: March 5, 2008Date of Patent: January 24, 2012Assignee: The Boeing CompanyInventors: Tirumale K. Ramesh, John L. Meier