Patents by Inventor Rajat Goel

Rajat Goel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20150199272
    Abstract: Systems, processors, and methods for efficiently handling concurrent store and load operations within a processor. A processor comprises a load-store unit (LSU) with a banked level-one (L1) data cache. When a store operation is ready to write data to the L1 data cache, the store operation will skip the write to any banks that have a conflict with a concurrent load operation. A partial write of the store operation will be performed to those banks of the L1 data cache that do not have a conflict with a concurrent load operation. For every attempt to write the store operation, a corresponding store mask will be updated to indicate which portions of the store operation were successfully written to the L1 data cache.
    Type: Application
    Filed: January 13, 2014
    Publication date: July 16, 2015
    Applicant: Apple Inc.
    Inventors: Rajat Goel, Mridul Agarwal
  • Patent number: 9081826
    Abstract: Techniques for a system capable of performing low-latency database query processing are disclosed herein. The system includes a gateway server and a plurality of worker nodes. The gateway server is configured to divide a database query, for a database containing data stored in a distributed storage cluster having a plurality of data nodes, into a plurality of partial queries and construct a query result based on a plurality of intermediate results. Each worker node of the plurality of worker nodes is configured to process a respective partial query of the plurality of partial queries by scanning data related to the respective partial query that is stored on at least one data node of the distributed storage cluster and generate an intermediate result of the plurality of intermediate results that is stored in a memory of that worker node.
    Type: Grant
    Filed: January 7, 2013
    Date of Patent: July 14, 2015
    Assignee: FACEBOOK, INC.
    Inventors: Raghotham Murthy, Rajat Goel
  • Publication number: 20150161033
    Abstract: A circular queue implementing a scheme for prioritized reads is disclosed. In one embodiment, a circular queue (or buffer) includes a number of storage locations each configured to store a data value. A multiplexer tree is coupled between the storage locations and a read port. A priority circuit is configured to generate and provide selection signals to each multiplexer of the multiplexer tree, based on a priority scheme. Based on the states of the selection signals, one of the storage locations is coupled to the read port via the multiplexers of the multiplexer tree.
    Type: Application
    Filed: February 18, 2015
    Publication date: June 11, 2015
    Inventors: Rajat Goel, Hari S. Kannan, Khurram Z. Malik
  • Patent number: 9009369
    Abstract: A circular queue implementing a scheme for prioritized reads is disclosed. In one embodiment, a circular queue (or buffer) includes a number of storage locations each configured to store a data value. A multiplexer tree is coupled between the storage locations and a read port. A priority circuit is configured to generate and provide selection signals to each multiplexer of the multiplexer tree, based on a priority scheme. Based on the states of the selection signals, one of the storage locations is coupled to the read port via the multiplexers of the multiplexer tree.
    Type: Grant
    Filed: October 27, 2011
    Date of Patent: April 14, 2015
    Assignee: Apple Inc.
    Inventors: Rajat Goel, Hari S. Kannan, Khurram Z. Malik
  • Patent number: 8914580
    Abstract: In some embodiments, a cache may include a tag array and a data array, as well as circuitry that detects whether accesses to the cache are sequential (e.g., occupying the same cache line). For example, a cache may include a tag array and a data array that stores data, such as multiple bundles of instructions per cache line. During operation, it may be determined that successive cache requests are sequential and do not cross a cache line boundary. Responsively, various cache operations may be inhibited to conserve power. For example, access to the tag array and/or data array, or portions thereof, may be inhibited.
    Type: Grant
    Filed: August 23, 2010
    Date of Patent: December 16, 2014
    Assignee: Apple Inc.
    Inventors: Rajat Goel, Ian D. Kountanis
  • Publication number: 20140215190
    Abstract: Techniques are disclosed relating to completion of load and store instructions in a weakly-ordered memory model. In one embodiment, a processor includes a load queue and a store queue and is configured to associate queue information with a load instruction in an instruction stream. In this embodiment, the queue information indicates a location of the load instruction in the load queue and one or more locations in the store queue that are associated with one or more store instructions that are older than the load instruction. The processor may determine, using the queue information, that the load instruction does not conflict with a store instruction in the store queue that is older than the load instruction. The processor may remove the load instruction from the load queue while the store instruction remains in the store queue. The queue information may include a wrap value for the load queue.
    Type: Application
    Filed: January 25, 2013
    Publication date: July 31, 2014
    Applicant: APPLE INC.
    Inventors: John H. Mylius, Rajat Goel, Pradeep Kanapathipillai, Hari S. Kannan
  • Publication number: 20140215191
    Abstract: Techniques are disclosed relating to ordering of load instructions in a weakly-ordered memory model. In one embodiment, a processor includes a cache with multiple cache lines and a store queue configured to maintain status information associated with a store instruction that targets a location in one of the cache lines. In this embodiment, the processor is configured to set an indicator in the status information in response to migration of the targeted cache line. The indicator may be usable to sequence performance of load instructions that are younger than the store instruction. For example, the processor may be configured to wait, based on the indicator, to perform a younger load instruction that targets the same location as the store instruction until the store instruction is removed from the store queue. This may prevent forwarding of the value of the store instruction to the younger load and preserve load-load ordering.
    Type: Application
    Filed: January 25, 2013
    Publication date: July 31, 2014
    Applicant: APPLE INC.
    Inventors: Pradeep Kanapathipillai, Hari Kannan, Po-Yung Chang, Ming-Ta Hsu, Rajat Goel
  • Publication number: 20140195558
    Abstract: Techniques for a system capable of performing low-latency database query processing are disclosed herein. The system includes a gateway server and a plurality of worker nodes. The gateway server is configured to divide a database query, for a database containing data stored in a distributed storage cluster having a plurality of data nodes, into a plurality of partial queries and construct a query result based on a plurality of intermediate results. Each worker node of the plurality of worker nodes is configured to process a respective partial query of the plurality of partial queries by scanning data related to the respective partial query that is stored on at least one data node of the distributed storage cluster and generate an intermediate result of the plurality of intermediate results that is stored in a memory of that worker node.
    Type: Application
    Filed: January 7, 2013
    Publication date: July 10, 2014
    Inventors: Raghotham Murthy, Rajat Goel
  • Publication number: 20130297918
    Abstract: An apparatus and method for calculating flag bits is disclosed. The flag bits may be used in a processor utilizing branch predication. More particularly, the apparatus and method may be used to calculate a predicate that can be used by a branch unit to evaluate whether a branch is to be taken. In one embodiment, the apparatus is coupled to receive a condition code associated with an instruction, and flag bits generated responsive to execution of the instruction. The condition code is indicative of a condition to be checked resulting from execution of the instruction. The apparatus may then provide an indication of whether the condition is true.
    Type: Application
    Filed: May 2, 2012
    Publication date: November 7, 2013
    Inventors: Rajat Goel, Sandeep Gupta, Yamini Modukuru
  • Patent number: 8478074
    Abstract: Various embodiments are disclosed relating to providing multiple and native representations of an image. According to an example embodiment, multiple realizations of an image may be generated and provided, rather than only a single realization, for example. Also, in another embodiment, the generation and output of multiple realizations may use one or more native objects to natively perform the transforms or image processing to provide the images or realizations.
    Type: Grant
    Filed: July 7, 2006
    Date of Patent: July 2, 2013
    Assignee: Microsoft Corporation
    Inventors: Rajat Goel, Margaret L. Goodwin, Radu C. Margarint, Robert A. Wlodarczyk, Thomas W. Olsen, Wei-Chung Jones Wang
  • Publication number: 20130138924
    Abstract: An apparatus and method for avoiding bubbles and maintaining a maximum instruction throughput rate when cracking microcode instructions. A lookahead pointer scans the newest entries of a dispatch queue for microcode instructions. A detected microcode instruction is conveyed to a microcode engine to be cracked into a sequence of micro-ops. Then, the sequence of micro-ops is placed in a queue, and when the original microcode instruction entry in the dispatch queue is selected for dispatch, the sequence of micro-ops is dispatched to the next stage of the processor pipeline.
    Type: Application
    Filed: November 30, 2011
    Publication date: May 30, 2013
    Inventors: Ramesh B. Gunna, Peter J. Bannon, Rajat Goel
  • Publication number: 20130107655
    Abstract: A circular queue implementing a scheme for prioritized reads is disclosed. In one embodiment, a circular queue (or buffer) includes a number of storage locations each configured to store a data value. A multiplexer tree is coupled between the storage locations and a read port. A priority circuit is configured to generate and provide selection signals to each multiplexer of the multiplexer tree, based on a priority scheme. Based on the states of the selection signals, one of the storage locations is coupled to the read port via the multiplexers of the multiplexer tree.
    Type: Application
    Filed: October 27, 2011
    Publication date: May 2, 2013
    Inventors: Rajat Goel, Hari S. Kannan, Khurram Z. Malik
  • Patent number: 8171258
    Abstract: In an embodiment, an address generation unit (AGU) is configured to generate a pseudo sum from an index portion of two or more operands. The pseudo sum may equal the index if the carry-in of the actual sum to the least significant bit of the index is a selected value (e.g. zero). The AGU may also include circuitry coupled to receive the operands and to generate the actual carry-in to the least significant bit of the index. The AGU may transmit the pseudo sum and the carry-in to a decode block for a memory array. The decode block may decode the pseudo sum into one or more one-hot vectors. The one-hot vectors may be input to muxes, and the one-hot vectors rotated by one position may be the other input. The actual carry-in may be the selection control of the mux.
    Type: Grant
    Filed: July 21, 2009
    Date of Patent: May 1, 2012
    Assignee: Apple Inc.
    Inventors: Rajat Goel, Chen-Ju Hsieh
  • Publication number: 20120047329
    Abstract: In some embodiments, a cache may include a tag array and a data array, as well as circuitry that detects whether accesses to the cache are sequential (e.g., occupying the same cache line). For example, a cache may include a tag array and a data array that stores data, such as multiple bundles of instructions per cache line. During operation, it may be determined that successive cache requests are sequential and do not cross a cache line boundary. Responsively, various cache operations may be inhibited to conserve power. For example, access to the tag array and/or data array, or portions thereof, may be inhibited.
    Type: Application
    Filed: August 23, 2010
    Publication date: February 23, 2012
    Inventors: Rajat Goel, Ian D. Kountanis
  • Patent number: 8108551
    Abstract: A computer-implemented method for monitoring physical paths within a computer network may include: 1) identifying a first logical path within a computer network, 2) identifying a physical path that corresponds to the first logical path, 3) probing the physical path to determine whether the first logical path is active, 4) identifying a second logical path within the computer network, 5) determining that the physical path also corresponds to the second logical path, and then 6) using the results of the probe of the physical path to determine whether the second logical path is active without probing the physical path a second time. Additional computer-implemented methods for monitoring physical paths within multi-host computer networks are also disclosed.
    Type: Grant
    Filed: September 15, 2009
    Date of Patent: January 31, 2012
    Assignee: Symantec Corporation
    Inventors: Rajat Goel, Meena Patel
  • Patent number: 8063664
    Abstract: An integrated circuit includes multiple power domains. Supply current switch circuits (SCSCs) are distributed across each power domain. When a signal is present on a control node within a SCSC, the SCSC couples a local supply bus of the power domain to a global supply bus. An enable signal path extends through the SCSCs so that an enable signal can be propagated down a chain of SCSCs from control node to control node, thereby turning the SCSCs on one by one. When the domain is to be powered up, a control circuit asserts an enable signal that propagates down a first chain of SCSCs. After a programmable amount of time, the control circuit asserts a second enable signal that propagates down a second chain. By spreading the turning on of SCSCs over time, large currents that would otherwise be associated with coupling the local and global buses together are avoided.
    Type: Grant
    Filed: December 18, 2009
    Date of Patent: November 22, 2011
    Assignee: QUALCOMM Incorporated
    Inventors: Lew G Chua-Eoan, Matthew L Severson, Sorin A Dobre, Tsvetomir P Petrov, Rajat Goel
  • Publication number: 20110022824
    Abstract: In an embodiment, an address generation unit (AGU) is configured to generate a pseudo sum from an index portion of two or more operands. The pseudo sum may equal the index if the carry-in of the actual sum to the least significant bit of the index is a selected value (e.g. zero). The AGU may also include circuitry coupled to receive the operands and to generate the actual carry-in to the least significant bit of the index. The AGU may transmit the pseudo sum and the carry-in to a decode block for a memory array. The decode block may decode the pseudo sum into one or more one-hot vectors. The one-hot vectors may be input to muxes, and the one-hot vectors rotated by one position may be the other input. The actual carry-in may be the selection control of the mux.
    Type: Application
    Filed: July 21, 2009
    Publication date: January 27, 2011
    Inventors: Rajat Goel, Chen-Ju Hsieh
  • Publication number: 20100097101
    Abstract: An integrated circuit includes multiple power domains. Supply current switch circuits (SCSCs) are distributed across each power domain. When a signal is present on a control node within a SCSC, the SCSC couples a local supply bus of the power domain to a global supply bus. An enable signal path extends through the SCSCs so that an enable signal can be propagated down a chain of SCSCs from control node to control node, thereby turning the SCSCs on one by one. When the domain is to be powered up, a control circuit asserts an enable signal that propagates down a first chain of SCSCs. After a programmable amount of time, the control circuit asserts a second enable signal that propagates down a second chain. By spreading the turning on of SCSCs over time, large currents that would otherwise be associated with coupling the local and global buses together are avoided.
    Type: Application
    Filed: December 18, 2009
    Publication date: April 22, 2010
    Applicant: QUALCOMM INCORPORATED
    Inventors: Lew G. Chua-Eoan, Matthew Levi Severson, Sorin Adrian Dobre, Tsvetomir P. Petrov, Rajat Goel
  • Patent number: 7659746
    Abstract: An integrated circuit includes multiple power domains. Supply current switch circuits (SCSCs) are distributed across each power domain. When a signal is present on a control node within a SCSC, the SCSC couples a local supply bus of the power domain to a global supply bus. An enable signal path extends through the SCSCs so that an enable signal can be propagated down a chain of SCSCs from control node to control node, thereby turning the SCSCs on one by one. When the domain is to be powered up, a control circuit asserts an enable signal that propagates down a first chain of SCSCs. After a programmable amount of time, the control circuit asserts a second enable signal that propagates down a second chain. By spreading the turning on of SCSCs over time, large currents that would otherwise be associated with coupling the local and global buses together are avoided.
    Type: Grant
    Filed: September 16, 2005
    Date of Patent: February 9, 2010
    Assignee: QUALCOMM, Incorporated
    Inventors: Lew G. Chua-Eoan, Matthew Levi Severson, Sorin Adrian Dobre, Tsvetomir P. Petrov, Rajat Goel
  • Patent number: 7626595
    Abstract: In aspects, a class hierarchy is defined that provides definitions of methods for operating on at least bitmaps and vector graphics. A software developer may instantiate an object according to a class definition of the class hierarchy and assign it to any variable (e.g., a control's property) having a type of an ancestor class of the class. The object may be associated with an image internally represented as bitmap, vector graphics, or some other representation. The control does not need to be aware of how the image is represented. Rather, to draw an image associated with the object, a draw method associated with the object may be called.
    Type: Grant
    Filed: August 1, 2005
    Date of Patent: December 1, 2009
    Assignee: Microsoft Corporation
    Inventors: Greg D. Schechter, Adam M. Smith, Leonardo E. Blanco, Sriram Subramanian, Rajat Goel
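
The class-hierarchy idea in the abstract of patent 7626595 can be illustrated with a minimal Python sketch. This is not the patented implementation or any Microsoft API. The names ImageSource, BitmapImage, VectorImage, Control, and render are hypothetical and chosen only for illustration. The sketch shows a control property typed as an ancestor class, so the control can draw an image by calling its draw method without knowing whether the image is a bitmap or vector graphics.

```python
from abc import ABC, abstractmethod


class ImageSource(ABC):
    """Hypothetical ancestor class: callers rely only on draw()."""

    @abstractmethod
    def draw(self) -> str:
        """Return a text description of what would be rendered."""


class BitmapImage(ImageSource):
    """Hypothetical image represented internally as a pixel grid."""

    def __init__(self, width: int, height: int) -> None:
        self.width = width
        self.height = height

    def draw(self) -> str:
        return f"bitmap {self.width}x{self.height}"


class VectorImage(ImageSource):
    """Hypothetical image represented internally as drawing commands."""

    def __init__(self, commands: list) -> None:
        self.commands = commands

    def draw(self) -> str:
        return "vector " + " ".join(self.commands)


class Control:
    """Hypothetical control whose image property has the ancestor type."""

    def __init__(self, image: ImageSource) -> None:
        self.image = image

    def render(self) -> str:
        # The control never inspects the representation; it only calls draw().
        return self.image.draw()
```

Either subclass can be assigned to the same property, for example Control(BitmapImage(4, 3)) or Control(VectorImage(["M0,0", "L1,1"])), and render() dispatches to the matching draw method.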