Patents Examined by Hyun Nam
-
Patent number: 11593278
Abstract: Some embodiments provide a method of providing distributed storage services to a host computer from a network interface card (NIC) of the host computer. At the NIC, the method accesses a set of one or more external storages operating outside of the host computer through a shared port of the NIC that is not only used to access the set of external storages but also for forwarding packets not related to an external storage. In some embodiments, the method accesses the external storage set by using a network fabric storage driver that employs a network fabric storage protocol to access the external storage set. The method presents the external storage as a local storage of the host computer to a set of programs executing on the host computer. In some embodiments, the method presents the local storage by using a storage emulation layer on the NIC to create a local storage construct that presents the set of external storages as a local storage of the host computer.
Type: Grant
Filed: January 9, 2021
Date of Patent: February 28, 2023
Assignee: VMWARE, INC.
Inventors: Jinpyo Kim, Claudio Fleiner, Marc Fleischmann, Anjaneya P. Gondi, Yongqi Hu
-
Patent number: 11580052
Abstract: The present disclosure relates to a communication method by I2C bus between an emitting device and a receiving device, in which: a rising edge of a clock signal of the I2C bus, directly following a start condition of an I2C communication, is recorded; and when an interruption is generated within the receiving device, the receiving device verifies whether the rising edge was recorded.
Type: Grant
Filed: August 26, 2020
Date of Patent: February 14, 2023
Assignee: STMicroelectronics (Grand Ouest) SAS
Inventor: Yves Magnaud
-
Patent number: 11580371
Abstract: A method, apparatus, and system are discussed to efficiently process and execute Artificial Intelligence operations. An integrated circuit has a tailored architecture to process and execute Artificial Intelligence operations, including computations for a neural network having weights with a sparse value. The integrated circuit contains at least a scheduler, one or more arithmetic logic units, and one or more random access memories configured to cooperate with each other to process and execute these computations for the neural network having weights with the sparse value.
Type: Grant
Filed: March 12, 2020
Date of Patent: February 14, 2023
Assignee: Roviero, Inc.
Inventor: Deepak Mital
-
Patent number: 11573795
Abstract: In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.
Type: Grant
Filed: August 2, 2021
Date of Patent: February 7, 2023
Assignee: NVIDIA Corporation
Inventors: Ahmad Itani, Yen-Te Shih, Jagadeesh Sankaran, Ravi P Singh, Ching-Yu Hung
-
Patent number: 11573797
Abstract: Disclosed are various embodiments for computing 2-body statistics on graphics processing units (GPUs). Various types of two-body statistics (2-BS) are regarded as essential components of data analysis in many scientific and computing domains. However, the quadratic complexity of these computations hinders timely processing of data. Accordingly, various embodiments of the present disclosure involve parallel algorithms for 2-BS computation on Graphics Processing Units (GPUs). Although the typical 2-BS problems can be summarized into a straightforward parallel computing pattern, traditional wisdom from (general) parallel computing often falls short in delivering the best possible performance. Therefore, various embodiments of the present disclosure involve techniques to decompose 2-BS problems and methods for effective use of computing resources on GPUs. We also develop analytical models that guide users towards the appropriate parameters of a GPU program.
Type: Grant
Filed: September 14, 2021
Date of Patent: February 7, 2023
Assignee: UNIVERSITY OF SOUTH FLORIDA
Inventors: Yicheng Tu, Napath Pitaksirianan
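To illustrate the quadratic cost mentioned in the abstract for patent 11573797, here is a minimal serial sketch of one common two-body statistic, a pairwise distance histogram. It is not the patented GPU method. The function name `distance_histogram`, the bucket parameters, and the sample points are assumptions made for illustration only.

```python
import math
from itertools import combinations

def distance_histogram(points, bucket_width, num_buckets):
    """Count point pairs per distance bucket (hypothetical helper).

    Every unordered pair is visited once, so the work grows as
    n * (n - 1) / 2, which is the quadratic cost the abstract refers to.
    """
    hist = [0] * num_buckets
    for p, q in combinations(points, 2):
        d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
        # Distances past the last bucket are clamped into it.
        bucket = min(int(d // bucket_width), num_buckets - 1)
        hist[bucket] += 1
    return hist

# Illustrative data only: three 2-D points give three pairs.
points = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]
hist = distance_histogram(points, bucket_width=2.0, num_buckets=3)
print(hist)  # [1, 0, 2]
```

Each pair is independent of the others, which is why this kind of computation maps naturally onto parallel hardware; how to decompose and tune it on a GPU is what the patent itself addresses.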
-
Patent number: 11567767
Abstract: A system for processing gather and scatter instructions can implement a front-end subsystem, a back-end subsystem, or both. The front-end subsystem includes a prediction unit configured to determine a predicted quantity of coalesced memory access operations required by an instruction. A decode unit converts the instruction into a plurality of access operations based on the predicted quantity, and transmits the plurality of access operations and an indication of the predicted quantity to an issue queue. The back-end subsystem includes a load-store unit that receives a plurality of access operations corresponding to an instruction, determines a subset of the plurality of access operations that can be coalesced, and forms a coalesced memory access operation from the subset. A queue stores multiple memory addresses for a given load-store entry to provide for execution of coalesced memory accesses.
Type: Grant
Filed: July 30, 2020
Date of Patent: January 31, 2023
Assignees: MARVELL ASIA PTE, LTD., CRAY INC.
Inventors: Harold Wade Cain, III, Rabin Andrew Sugumar, Nagesh Bangalore Lakshminarayana, Daniel Jonathan Ernst, Sanyam Mehta
-
Patent number: 11568269
Abstract: Disclosed are a scheduling method and a related apparatus. A computing apparatus in a server can be chosen to implement a computation request, thereby improving the running efficiency of the server.
Type: Grant
Filed: August 2, 2018
Date of Patent: January 31, 2023
Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
Inventors: Zidong Du, Luyang Jin
-
Patent number: 11550626
Abstract: An information processing apparatus includes a memory and a processor coupled to the memory and configured to generate one or more job groups by grouping multiple jobs of execution targets in descending order of priority, and perform a control for scheduling execution timings regarding the multiple jobs such that scheduling of respective jobs included in a specific job group including a job having a higher priority is implemented by priority over scheduling of respective jobs included in other job groups. The processor performs the control for scheduling the execution timings of the respective jobs included in the specific job group such that an execution completion time of all the jobs included in the specific job group satisfies a predetermined condition.
Type: Grant
Filed: October 19, 2020
Date of Patent: January 10, 2023
Assignee: FUJITSU LIMITED
Inventors: Ryuichi Sekizawa, Shigeto Suzuki
-
Patent number: 11550694
Abstract: A packet backpressure detection method and apparatus are provided. The method includes: a device having a Peripheral Component Interconnect Express (PCIe) port storing a plurality of packets for transmission in a packet queue and storing a packet that is to be transmitted next in a first buffer, where the queue comprises a plurality of packets that are to be transmitted via the PCIe port; and the queue is stored in a second buffer; recording a storage duration of each packet stored in the first buffer, and accumulating the storage duration of each packet stored in the first buffer; removing the packet from the first buffer after the packet is transmitted via the PCIe port; and generating an indication of packet pressure at the PCIe port based on the accumulated storage duration.
Type: Grant
Filed: October 15, 2019
Date of Patent: January 10, 2023
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Bin Zhang, Ligang Chen, Jiahuai Chen, Lixia Xu
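The steps in the abstract for patent 11550694 can be sketched as a small software model. This is only an illustration under stated assumptions, not the patented hardware: the class name `PortMonitor`, the explicit `now` timestamps, and the fixed `threshold` used to turn the accumulated duration into an indication are all hypothetical.

```python
from collections import deque

class PortMonitor:
    """Toy model of the two-buffer scheme (all names are hypothetical)."""

    def __init__(self, threshold):
        self.queue = deque()        # "second buffer": packets awaiting transmission
        self.head = None            # "first buffer": the packet to be sent next
        self.head_since = None      # time the head packet entered the first buffer
        self.accumulated = 0.0      # summed storage durations in the first buffer
        self.threshold = threshold  # assumed cut-off for raising the indication

    def enqueue(self, packet, now):
        self.queue.append(packet)
        self._refill(now)

    def _refill(self, now):
        # Move the next queued packet into the first buffer if it is empty.
        if self.head is None and self.queue:
            self.head = self.queue.popleft()
            self.head_since = now

    def transmitted(self, now):
        """Record that the head packet was sent at time `now`."""
        if self.head is None:
            raise RuntimeError("no packet in the first buffer")
        self.accumulated += now - self.head_since
        sent, self.head = self.head, None  # remove the packet after transmission
        self._refill(now)
        return sent

    def backpressure(self):
        return self.accumulated >= self.threshold

mon = PortMonitor(threshold=5.0)
mon.enqueue("a", now=0.0)
mon.enqueue("b", now=1.0)
mon.transmitted(now=2.0)   # "a" waited 2.0 in the first buffer
mon.transmitted(now=6.0)   # "b" waited 4.0 in the first buffer
print(mon.accumulated, mon.backpressure())  # 6.0 True
```

Only time spent in the first buffer is counted, so a long queue alone does not raise the indication; the head packet has to be held back at the port.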
-
Patent number: 11550619
Abstract: According to one embodiment, an information processing device includes a processor, a controller, and a memory. The memory stores a vector address related to an interrupt request executed on condition that the processor is in a sleep state. The controller receives the interrupt request and detects that the processor transitions to the sleep state, detects fetch of the vector address of the interrupt request after the sleep state of the processor is detected, and inputs the vector address that is related to the interrupt request and stored in the memory into the processor in a case where the fetch of the vector address of the interrupt request is detected.
Type: Grant
Filed: September 2, 2021
Date of Patent: January 10, 2023
Assignees: KABUSHIKI KAISHA TOSHIBA, TOSHIBA ELECTRONIC DEVICES & STORAGE CORPORATION
Inventors: Mikio Hashimoto, Masami Aizawa, Satoru Suzuki, Tsuneki Sasaki
-
Patent number: 11551066
Abstract: A DNN hardware accelerator and an operation method of the DNN hardware accelerator are provided. The DNN hardware accelerator includes: a network distributor for receiving an input data and distributing respective bandwidth of a plurality of data types of a target data amount based on a plurality of bandwidth ratios of the target data amount; and a processing element array coupled to the network distributor, for communicating data of the data types of the target data amount between the network distributor based on the distributed bandwidth of the data types.
Type: Grant
Filed: January 15, 2019
Date of Patent: January 10, 2023
Assignee: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTE
Inventors: Yao-Hua Chen, Chun-Chen Chen, Chih-Tsun Huang, Jing-Jia Liou, Chun-Hung Lai, Juin-Ming Lu
-
Patent number: 11544061
Abstract: Methods and systems for solving a linear system include setting resistances in an array of settable electrical resistances in accordance with values of an input matrix. A series of input vectors is applied to the array as voltages to generate a series of respective output vectors. Each input vector in the series of vectors is updated based on comparison of the respective output vectors to a target vector. A solution of a linear system is determined that includes the input matrix based on the updated input vectors.
Type: Grant
Filed: December 22, 2020
Date of Patent: January 3, 2023
Assignees: INTERNATIONAL BUSINESS MACHINES CORPORATION, RAMOT AT TEL AVIV UNIVERSITY LTD.
Inventors: Malte Johannes Rasch, Oguzhan Murat Onen, Tayfun Gokmen, Chai Wah Wu, Mark S. Squillante, Tomasz J. Nowicki, Wilfried Haensch, Lior Horesh, Vasileios Kalantzis, Haim Avron
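The loop described in the abstract for patent 11544061 (apply an input vector, compare the output with a target, update the input) has the same shape as a classical iterative solver. The sketch below uses a plain Richardson iteration in software as a stand-in. The abstract does not state the update rule, so the rule, the `step` size, the iteration count, and the function names are all assumptions; `matvec` merely plays the role of the resistive array.

```python
def matvec(A, x):
    """Software stand-in for the array: returns the output vector A @ x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def solve(A, b, step=0.2, iters=200):
    """Richardson iteration x <- x + step * (b - A x) (assumed update rule).

    Converges only when every eigenvalue of (I - step * A) has magnitude
    below 1, which holds for the small example matrix used here.
    """
    x = [0.0] * len(b)
    for _ in range(iters):
        y = matvec(A, x)  # "output vector" for the current input
        # Compare the output with the target and nudge the input accordingly.
        x = [xi + step * (bi - yi) for xi, bi, yi in zip(x, b, y)]
    return x

# Illustrative 2x2 system; the exact solution is (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = solve(A, b)
print(x)
```

In an analog setting the matrix-vector product would be obtained from the array in a single step, which is the part this sketch replaces with ordinary arithmetic.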
-
Patent number: 11537538
Abstract: In one embodiment, a cache coherent system includes one or more agents (e.g., coherent agents) that may cache data used by the system. The system may include a point of coherency in a memory controller in the system, and thus the agents may transmit read requests to the memory controller to coherently read data. The point of coherency may determine if the data is cached in another agent, and may transmit a copy back request to the other agent if the other agent has modified the data. The system may include an interconnect between the agents and the memory controller. At a point on the interconnect at which traffic from the agents converges, a copy back response may be converted to a fill for the requesting agent.
Type: Grant
Filed: April 27, 2021
Date of Patent: December 27, 2022
Assignee: Apple Inc.
Inventors: Harshavardhan Kaushikkar, Christopher D. Shuler, Srinivasa Rangan Sridharan, Yu Zhang, Kaushik Kannan, Deniz Balkan
-
Patent number: 11531637
Abstract: A computer comprising a plurality of interconnected processing nodes arranged in a toroid configuration in which multiple layers of interconnected nodes are arranged along an axis; each layer comprising a plurality of processing nodes connected in a ring in a non-axial plane by at least an intralayer respective set of links between each pair of neighbouring processing nodes, the links in each set adapted to operate simultaneously; wherein each of the processing nodes in each layer is connected to a respective corresponding node in each adjacent layer by an interlayer link to form respective rings along the axis; the computer programmed to provide a plurality of embedded one-dimensional logical paths and to transmit data around each of the embedded one-dimensional paths in such a manner that the plurality of embedded one-dimensional logical paths operate simultaneously, each logical path using all processing nodes of the computer in a sequence.
Type: Grant
Filed: March 24, 2021
Date of Patent: December 20, 2022
Assignee: GRAPHCORE LIMITED
Inventor: Simon Knowles
-
Patent number: 11526358
Abstract: Techniques are disclosed for interposing on nondeterministic events during multicore virtual machine (VM) execution to capture information that allows for deterministically recreating the nondeterministic events during execution replay of the VM. A method may include reading, by a virtual processor running within a multicore VM instance, an instruction to execute, and, responsive to a determination that the instruction is a nondeterministic instruction, interposing on the nondeterministic instruction execution so as to allow deterministic execution of the nondeterministic instruction during replay execution of the multicore VM instance. Interposing on the nondeterministic instruction execution may include recording a partial barrier event and/or a full barrier event. The nondeterministic instruction may be a read memory access instruction or a write memory access instruction.
Type: Grant
Filed: September 15, 2020
Date of Patent: December 13, 2022
Assignee: Raytheon Company
Inventors: Gregory Price, William Wysocki, Matthew A. Taylor
-
Patent number: 11507380
Abstract: A processing system includes a processor with a branch predictor including one or more branch target buffer tables. The processor also includes a branch prediction pipeline including a throttle unit and an uncertainty accumulator. The processor assigns an uncertainty value for each of a plurality of branch predictions generated by the branch predictor and adds the uncertainty value for each of the plurality of branch predictions to an accumulated uncertainty counter associated with the uncertainty accumulator. The throttle unit of the branch prediction pipeline throttles operations of the branch prediction pipeline based on the accumulated uncertainty counter.
Type: Grant
Filed: August 29, 2018
Date of Patent: November 22, 2022
Assignee: Advanced Micro Devices, Inc.
Inventor: Thomas Clouqueur
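The accumulate-then-throttle behaviour described in the abstract for patent 11507380 can be modelled with a simple counter. This is a toy software model only. The confidence labels, the uncertainty values assigned to them, the threshold comparison, and the `reset` method are assumptions that the abstract does not specify.

```python
class UncertaintyThrottle:
    """Toy model of an uncertainty accumulator (all values are hypothetical)."""

    # Assumed mapping from prediction confidence to an uncertainty value.
    UNCERTAINTY = {"high": 0, "medium": 1, "low": 3}

    def __init__(self, threshold):
        self.threshold = threshold
        self.accumulated = 0  # the "accumulated uncertainty counter"

    def record_prediction(self, confidence):
        # Assign an uncertainty value to the prediction and add it to the counter.
        self.accumulated += self.UNCERTAINTY[confidence]

    def should_throttle(self):
        # Throttle decision based on the accumulated counter.
        return self.accumulated >= self.threshold

    def reset(self):
        # Assumption: the counter is cleared once outstanding branches resolve.
        self.accumulated = 0

throttle = UncertaintyThrottle(threshold=4)
for confidence in ("high", "medium"):
    throttle.record_prediction(confidence)
print(throttle.should_throttle())  # False: counter is 1
throttle.record_prediction("low")
print(throttle.should_throttle())  # True: counter is 4
```

The idea the model captures is that several moderately uncertain predictions in a row can justify slowing the pipeline just as a single very uncertain one can.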
-
Patent number: 11507043
Abstract: The present disclosure provides a method and a system for automatically configuring an I/O port. The method applied to a central processor includes: receiving request information from a controlled device, the request information carrying a type of a signal required by the controlled device, and sending, according to the type of the signal, a configuration instruction to a control device, and instructing the control device to configure the I/O port according to the configuration instruction. The controlled device is connected to the central processing unit, or the controlled device is connected to the central processor by means of the control device.
Type: Grant
Filed: June 13, 2019
Date of Patent: November 22, 2022
Assignee: GREE ELECTRIC APPLIANCES, INC. OF ZHUHAI
Inventors: Wenhui Zhang, Wenhao Wu, Peng Ren
-
Patent number: 11507376
Abstract: Disclosed embodiments relate to instructions for fast element unpacking. In one example, a processor includes fetch circuitry to fetch an instruction whose format includes fields to specify an opcode and locations of an Array-of-Structures (AOS) source matrix and one or more Structure of Arrays (SOA) destination matrices, wherein: the specified opcode calls for unpacking elements of the specified AOS source matrix into the specified Structure of Arrays (SOA) destination matrices, the AOS source matrix is to contain N structures each containing K elements of different types, with same-typed elements in consecutive structures separated by a stride, the SOA destination matrices together contain K segregated groups, each containing N same-typed elements, decode circuitry to decode the fetched instruction, and execution circuitry, responsive to the decoded instruction, to unpack each element of the specified AOS matrix into one of the K element types of the one or more SOA matrices.
Type: Grant
Filed: January 19, 2021
Date of Patent: November 22, 2022
Assignee: Intel Corporation
Inventors: Bret Toll, Alexander F. Heinecke, Christopher J. Hughes, Ronen Zohar, Michael Espig, Dan Baum, Raanan Sade, Robert Valentine, Mark J. Charney, Elmoustapha Ould-Ahmed-Vall
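The data movement described in the abstract for patent 11507376 (N structures of K elements unpacked into K groups of N same-typed elements) can be shown in plain software. The sketch below only illustrates the AOS-to-SOA layout change, not the instruction itself. The function name `unpack_aos`, the flat-list representation, and the assumption that the stride equals K are illustrative choices.

```python
def unpack_aos(aos, k):
    """Split a flat array of structures into k arrays (hypothetical helper).

    `aos` holds N structures back to back, each with k fields, so
    same-typed fields sit k positions apart (stride assumed equal to k).
    """
    if len(aos) % k != 0:
        raise ValueError("length must be a multiple of the structure size")
    # Field i of every structure lives at positions i, i + k, i + 2k, ...
    return [aos[i::k] for i in range(k)]

# Two structures (N = 2) with three differently typed fields each (K = 3).
aos = [1, 2.0, "a", 3, 4.0, "b"]
soa = unpack_aos(aos, k=3)
print(soa)  # [[1, 3], [2.0, 4.0], ['a', 'b']]
```

After unpacking, each group holds elements of a single type stored contiguously, which is the layout vector units generally prefer.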
-
Patent number: 11507378
Abstract: In one example, an integrated circuit comprises: a memory configured to store a first mapping between a first opcode and first control information and a second mapping between the first opcode and second control information; a processing engine configured to perform processing operations based on the control information; and a controller configured to: at a first time, provide the first opcode to the memory to, based on the first mapping stored in the memory, fetch the first control information for the processing engine, to enable the processing engine to perform a first processing operation based on the first control information; and at a second time, provide the first opcode to the memory to, based on the second mapping stored in the memory, fetch the second control information for the processing engine, to enable the processing engine to perform a second processing operation based on the second control information.
Type: Grant
Filed: March 1, 2021
Date of Patent: November 22, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Ron Diamant, Sundeep Amirineni, Mohammad El-Shabani, Sagar Sonar, Kenneth Wayne Patton
-
Patent number: 11507414
Abstract: A circuit for fast interrupt handling is disclosed. An apparatus includes a processor circuit having an execution pipeline and a table configured to store a plurality of pointers that correspond to interrupt routines stored in a memory circuit. The apparatus further includes an interrupt redirect circuit configured to receive a plurality of interrupt requests. The interrupt redirect circuit may select a first interrupt request among a plurality of interrupt requests of a first type. The interrupt redirect circuit retrieves a pointer from the table using information associated with the request. Using the pointer, the execution pipeline retrieves a first program instruction from the memory circuit to execute a particular interrupt routine.
Type: Grant
Filed: February 10, 2021
Date of Patent: November 22, 2022
Assignee: Cadence Design Systems, Inc.
Inventors: Robert T. Golla, Thomas Martin Wicki, Jama Ismail Barreh