Patents by Inventor Tushar P. Ringe
Tushar P. Ringe has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12079132Abstract: Data transfer between caching domains of a data processing system is achieved by a local coherency node (LCN) of a first caching domain receiving a read request for data associated with a second caching domain, from a requesting node of the first caching domain. The LCN requests the data from the second caching domain via a transfer agent. In response to receiving a cache line containing the data from the second caching domain, the transfer agent sends the cache line to the requesting node, bypassing the LCN and, optionally, sends a read-receipt indicating the state of the cache line to the LCN. The LCN updates a coherency state for the cache line in response to receiving the read-receipt from the transfer agent and a completion acknowledgement from the requesting node. Optionally, the transfer agent may send the cache line via the LCN when congestion is detected in a response channel of the data processing system.Type: GrantFiled: January 26, 2023Date of Patent: September 3, 2024Assignee: Arm LimitedInventors: Jamshed Jalal, Ashok Kumar Tummala, Wenxuan Zhang, Daniel Thomas Pinero, Tushar P Ringe
-
Publication number: 20240273025Abstract: A super home node of a first chip of a multi-chip data processing system manages coherence for both local and remote cache lines accessed by local caching agents and local cache lines accessed by caching agents of one or more second chips. Both local and remote cache lines are stored in a shared cache, and requests are stored in shared point-of-coherency queue. An entry in a snoop filter table of the super home node includes a presence vector that indicates the presence of a remote cache line at specific caching agents of the first chip or the presence of a local cache line at specific caching agents of the first chip and any caching agent of the second chip. All caching agents of the second chip are represented as a single caching agent in the presence vector.Type: ApplicationFiled: February 14, 2023Publication date: August 15, 2024Applicant: Arm LimitedInventors: Wenxuan Zhang, Jamshed Jalal, Mark David Werkheiser, Sakshi Verma, Ritukar Khanna, Devi Sravanthi Yalamarthy, Gurunath Ramagiri, Mukesh Patel, Tushar P Ringe
-
Publication number: 20240273026Abstract: A data processing apparatus includes one or more cache configuration data stores, a coherence manager, and a shared cache. The coherence manager is configured to track and maintain coherency of cache lines accessed by local caching agents and one or more remote caching agents. The cache lines include local cache lines accessed from a local memory region and remote cache lines accessed from a remote memory region. The shared cache is configured to store local cache lines in a first partition and to store remote cache lines in a second partition. The sizes of the first and second partitions are determined based on values in the one or more cache configuration data stores and may or not overlap. The cache configuration data stores may be programmable by a user or dynamically programmed in response to local memory and remote memory access patterns.Type: ApplicationFiled: February 14, 2023Publication date: August 15, 2024Applicant: Arm LimitedInventors: Devi Sravanthi Yalamarthy, Jamshed Jalal, Mark David Werkheiser, Wenxuan Zhang, Ritukar Khanna, Rajani Pai, Gurunath Ramagiri, Mukesh Patel, Tushar P Ringe
-
Publication number: 20240256460Abstract: Efficient data transfer between caching domains of a data processing system is achieved by a local coherency node (LCN) of a first caching domain receiving a read request for data associated with a second caching domain, from a requesting node of the first caching domain. The LCN requests the data from the second caching domain via a transfer agent. In response to receiving a cache line containing the data from the second caching domain, the transfer agent sends the cache line to the requesting node, bypassing the LCN and, optionally, sends a read-receipt indicating the state of the cache line to the LCN. The LCN updates a coherency state for the cache line in response to receiving the read-receipt from the transfer agent and a completion acknowledgement from the requesting node. Optionally, the transfer agent may send the cache line via the LCN when congestion is detected in a response channel of the data processing system.Type: ApplicationFiled: January 26, 2023Publication date: August 1, 2024Applicant: Arm LimitedInventors: Jamshed Jalal, Ashok Kumar Tummala, Wenxuan Zhang, Daniel Thomas Pinero, Tushar P Ringe
-
Patent number: 11934334Abstract: The present disclosure advantageously provides a method and system for transferring data over a chip-to-chip interconnect (CCI). At a request node of a coherent interconnect (CHI) of a first chip, receiving at least one peripheral component interface express (PCIe) transaction from a PCIe master device, the PCIe transaction including a stream identifier; selecting a CCI port of the CHI of the first chip based on the stream identifier of the PCIe transaction; and sending the PCIe transaction to the selected CCI port.Type: GrantFiled: April 29, 2021Date of Patent: March 19, 2024Assignee: Arm LimitedInventors: Tushar P Ringe, Mark David Werkheiser, Jamshed Jalal, Sai Kumar Marri, Ashok Kumar Tummala, Rishabh Jain
-
Patent number: 11899607Abstract: An apparatus comprises an interconnect providing communication paths between agents coupled to the interconnect. A coordination agent is provided which performs an operation requiring sending a request to each of a plurality of target agents, and receiving a response from each of the target agents, the operation being unable to complete until the response has been received from each of the target agents. Storage circuitry is provided which is accessible to the coordination agent and configured to store, for each agent that the coordination agent may communicate with via the interconnect, a latency indication for communication between that agent and the coordination agent. The coordination agent is configured, prior to performing the operation, to determine a sending order in which to send the request to each of the target agents, the sending order being determined in dependence on the latency indication for each of the target agents.Type: GrantFiled: June 1, 2021Date of Patent: February 13, 2024Assignee: Arm LimitedInventors: Timothy Hayes, Alejandro Rico Carro, Tushar P. Ringe, Kishore Kumar Jagadeesha
-
Patent number: 11599467Abstract: The present disclosure advantageously provides a system cache and a method for storing coherent data and non-coherent data in a system cache. A transaction is received from a source in a system, the transaction including at least a memory address, the source having a location in a coherent domain or a non-coherent domain of the system, the coherent domain including shareable data and the non-coherent domain including non-shareable data. Whether the memory address is stored in a cache line is determined, and, when the memory address is not determined to be stored in a cache line, a cache line is allocated to the transaction including setting a state bit of the allocated cache line based on the source location to indicate whether shareable or non-shareable data is stored in the allocated cache line, and the transaction is processed.Type: GrantFiled: May 27, 2021Date of Patent: March 7, 2023Assignee: Arm LimitedInventors: Jamshed Jalal, Bruce James Mathewson, Tushar P Ringe, Sean James Salisbury, Antony John Harris
-
Patent number: 11593025Abstract: A request node is provided comprising request circuitry to issue write requests to write data to storage circuitry. The write requests are issued to the storage circuitry via a coherency node. Status receiving circuitry receives a write status regarding write operations at the storage circuitry from the coherency node and throttle circuitry throttles a rate at which the write requests are issued to the storage circuitry in dependence on the write status. A coherency node is also provided, comprising access circuitry to receive a write request from a request node to write data to storage circuitry and to access the storage circuitry to write the data to the storage circuitry. Receive circuitry receives, from the storage circuitry, an incoming write status regarding write operations at the storage circuitry and transmit circuitry transmits an outgoing write status to the request node based on the incoming write status.Type: GrantFiled: January 15, 2020Date of Patent: February 28, 2023Assignee: Arm LimitedInventors: Gurunath Ramagiri, Jamshed Jalal, Mark David Werkheiser, Tushar P Ringe, Klas Magnus Bruce, Ritukar Khanna
-
Patent number: 11567870Abstract: An apparatus comprises snoop filter storage circuitry to store snoop filter entries corresponding to addresses and comprising sharer information. Control circuitry selects which sharers, among a plurality of sharers capable of holding cached data, should be issued with snoop requests corresponding to a target address, based on the sharer information of the snoop filter entry corresponding to the target address. The control circuitry is capable of setting a given snoop filter entry corresponding to a given address to an imprecise encoding in which the sharer information provides an imprecise description of which sharers hold cached data corresponding to the given address, and the given snoop filter entry comprises at least one sharer count value indicative of a number of sharers holding cached data corresponding to the given address.Type: GrantFiled: March 29, 2021Date of Patent: January 31, 2023Assignee: Arm LimitedInventors: Joshua Randall, Jamshed Jalal, Tushar P. Ringe, Jesse Garrett Beu
-
Patent number: 11550720Abstract: Entries in a cluster-to-caching agent map table of a data processing network identify one or more caching agents in a caching agent cluster. A snoop filter cache stores coherency information that includes coherency status information and a presence vector, where a bit position in the presence vector is associated with a caching agent cluster in the cluster-to-caching agent map table. In response to a data request, a presence vector in the snoop filter cache is accessed to identify a caching agent cluster and the map table is accessed to identify target caching agents for snoop messages. In order to reduce message traffic, snoop message are sent only to the identified targets.Type: GrantFiled: November 24, 2020Date of Patent: January 10, 2023Assignee: Arm LimitedInventors: Gurunath Ramagiri, Jamshed Jalal, Mark David Werkheiser, Tushar P Ringe, Mukesh Patel, Sakshi Verma
-
Patent number: 11531620Abstract: A data processing network includes request nodes with local memories accessible as a distributed virtual memory (DVM) and coupled by an interconnect fabric. Multiple DVM domains are assigned, each containing a DVM node for handling DVM operation requests from request nodes in the domain. On receipt of a request, a DVM node sends a snoop message to other request nodes in its domain and sends a snoop message to one or more peer DVM nodes in other DVM domains. The DVM node receives snoop responses from the request nodes and from the one or more peer DVM nodes, and send a completion message to the first request node. Each peer DVM node sends snoop messages to the request nodes in its domain, collects snoop responses, and sends a single response to the originating DVM node. In this way, DVM operations are performed in parallel.Type: GrantFiled: March 25, 2021Date of Patent: December 20, 2022Assignee: Arm LimitedInventors: Kishore Kumar Jagadeesha, Jamshed Jalal, Tushar P Ringe, Mark David Werkheiser, Premkishore Shivakumar, Lauren Elise Guckert
-
Publication number: 20220382679Abstract: The present disclosure advantageously provides a system cache and a method for storing coherent data and non-coherent data in a system cache. A transaction is received from a source in a system, the transaction including at least a memory address, the source having a location in a coherent domain or a non-coherent domain of the system, the coherent domain including shareable data and the non-coherent domain including non-shareable data. Whether the memory address is stored in a cache line is determined, and, when the memory address is not determined to be stored in a cache line, a cache line is allocated to the transaction including setting a state bit of the allocated cache line based on the source location to indicate whether shareable or non-shareable data is stored in the allocated cache line, and the transaction is processed.Type: ApplicationFiled: May 27, 2021Publication date: December 1, 2022Applicant: Arm LimitedInventors: Jamshed Jalal, Bruce James Mathewson, Tushar P Ringe, Sean James Salisbury, Antony John Harris
-
Publication number: 20220350771Abstract: The present disclosure advantageously provides a method and system for transferring data over a chip-to-chip interconnect (CCI). At a request node of a coherent interconnect (CHI) of a first chip, receiving at least one peripheral component interface express (PCIe) transaction from a PCIe master device, the PCIe transaction including a stream identifier; selecting a CCI port of the CHI of the first chip based on the stream identifier of the PCIe transaction; and sending the PCIe transaction to the selected CCI port.Type: ApplicationFiled: April 29, 2021Publication date: November 3, 2022Applicant: Arm LimitedInventors: Tushar P Ringe, Mark David Werkheiser, Jamshed Jalal, Sai Kumar Marri, Ashok Kumar Tummala, Rishabh Jain
-
Patent number: 11483260Abstract: An improved protocol for data transfer between a request node and a home node of a data processing network that includes a number of devices coupled via an interconnect fabric is provided that minimizes the number of response messages transported through the interconnect fabric. When congestion is detected in the interconnect fabric, a home node sends a combined response to a write request from a request node. The response is delayed until a data buffer is available at the home node and home node has completed an associated coherence action. When the request node receives a combined response, the data to be written and the acknowledgment are coalesced in the data message.Type: GrantFiled: May 2, 2019Date of Patent: October 25, 2022Assignee: Arm LimitedInventors: Jamshed Jalal, Tushar P Ringe, Phanindra Kumar Mannava, Dimitrios Kaseridis
-
Publication number: 20220308997Abstract: A data processing network includes request nodes with local memories accessible as a distributed virtual memory (DVM) and coupled by an interconnect fabric. Multiple DVM domains are assigned, each containing a DVM node for handling DVM operation requests from request nodes in the domain. On receipt of a request, a DVM node sends a snoop message to other request nodes in its domain and sends a snoop message to one or more peer DVM nodes in other DVM domains. The DVM node receives snoop responses from the request nodes and from the one or more peer DVM nodes, and send a completion message to the first request node. Each peer DVM node sends snoop messages to the request nodes in its domain, collects snoop responses, and sends a single response to the originating DVM node. In this way, DVM operations are performed in parallel.Type: ApplicationFiled: March 25, 2021Publication date: September 29, 2022Applicant: Arm LimitedInventors: Kishore Kumar Jagadeesha, Jamshed Jalal, Tushar P Ringe, Mark David Werkheiser, Premkishore Shivakumar, Lauren Elise Guckert
-
Patent number: 11431649Abstract: The present disclosure advantageously provides a method and system for allocating shared resources for an interconnect. A request is received at a home node from a request node over an interconnect, where the request represents a beginning of a transaction with a resource in communication with the home node, and the request has a traffic class defined by a user-configurable mapping based on one or more transaction attributes. The traffic class of the request is determined. A resource capability for the traffic class is determined based on user configurable traffic class-based resource capability data. Whether a home node transaction table has an available entry for the request is determined based on the resource capability for the traffic class.Type: GrantFiled: March 26, 2021Date of Patent: August 30, 2022Assignee: Arm LimitedInventors: Mukesh Patel, Jamshed Jalal, Gurunath Ramagiri, Tushar P Ringe, Mark David Werkheiser
-
Publication number: 20220164288Abstract: Entries in a cluster-to-caching agent map table of a data processing network identify one or more caching agents in a caching agent cluster. A snoop filter cache stores coherency information that includes coherency status information and a presence vector, where a bit position in the presence vector is associated with a caching agent cluster in the cluster-to-caching agent map table. In response to a data request, a presence vector in the snoop filter cache is accessed to identify a caching agent cluster and the map table is accessed to identify target caching agents for snoop messages. In order to reduce message traffic, snoop message are sent only to the identified targets.Type: ApplicationFiled: November 24, 2020Publication date: May 26, 2022Applicant: Arm LimitedInventors: Gurunath Ramagiri, Jamshed Jalal, Mark David Werkheiser, Tushar P. Ringe, Mukesh Patel, Sakshi Verma
-
Patent number: 11256646Abstract: An apparatus and method are provided for handling ordered transactions. The apparatus has a plurality of completer elements to process transactions, a requester element to issue a sequence of ordered transactions, and an interconnect providing, for each completer element, a communication channel between that completer element and the requester element for transfer of signals between that completer element and the requester element in either direction. A given completer element that is processing a given transaction in the sequence is arranged to issue a response signal to the requester element over its associated communication channel that comprises an ordered channel indication to identify whether the associated communication channel has an ordered channel property. The ordered channel property guarantees that processing of transactions issued by the requester element over the associated communication channel in a given order will be completed by the given completer element in the same given order.Type: GrantFiled: November 15, 2019Date of Patent: February 22, 2022Assignee: Arm LimitedInventors: Tushar P Ringe, Jamshed Jalal, Gurunath Ramagiri, Ashok Kumar Tummala, Mark David Werkheiser
-
Patent number: 11181957Abstract: An improved apparatus and method for the protection of reset in systems with stringent safety goals that employ primary and shadow logic blocks with a lock-step checker to achieve functional safety, including those systems having very large fanout of primary and shadow reset signal trees. The disclosed apparatus and method support assertion of reset that is asynchronous to the system clock and deassertion of reset that is synchronous to the system clock. Shadow logic blocks have reset deasserted a fixed number of clock cycles after their respective primary logic blocks, thereby avoiding the requirement to synchronize the primary and shadow reset signal trees at each of their end points to ensure lock-step operation between the primary and shadow logic blocks.Type: GrantFiled: November 24, 2020Date of Patent: November 23, 2021Assignee: Arm LimitedInventors: Ramamoorthy Guru Prasadh, Tushar P Ringe, Kishore Kumar Jagadeesha, David Joseph Hawkins, Saira Samar Malik
-
Patent number: 11146495Abstract: The present disclosure advantageously provides a system and method for protocol layer tunneling for a data processing system. A system includes an interconnect, a request node coupled to the interconnect, and a home node coupled to the interconnect. The request node includes a request node processor, and the home node includes a home node processor. The request node processor is configured to send, to the home node, a sequence of dynamic requests, receive a sequence of retry requests associated with the sequence of dynamic requests, and send a sequence of static requests associated with the sequence of dynamic requests in response to receiving credit grants from the home node. The home node processor is configured to send the sequence of retry requests in response to receiving the sequence of dynamic requests, determine the credit grants, and send the credit grants.Type: GrantFiled: August 23, 2019Date of Patent: October 12, 2021Assignee: Arm LimitedInventors: Tushar P. Ringe, Jamshed Jalal, Kishore Kumar Jagadeesha