Patents by Inventor Raheel Khan

Raheel Khan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230359694
    Abstract: A system and method for implementing variable-precision matrix multiplications using a low-precision digit matrix multiplier is disclosed. The system enables multiplication of matrices of different dimensions by splitting the large matrix into fixed-size matrix blocks. These block matrices are further decomposed into fixed-precision digit submatrices that are then individually multiplied, scaled, and accumulated to allow for variable-precision matrix multiplication. The system uses a systolic array of block matrix multipliers, which are each an array of dot product units, to efficiently implement larger matrix multiplications without substantially increasing either latency or wiring congestion. The system further uses only unsigned digit matrix multipliers but accounts for signed matrix multiplication by using row and column sums of the input matrices to adjust for the signed to unsigned conversion.
    Type: Application
    Filed: May 5, 2022
    Publication date: November 9, 2023
    Applicant: Xcelerium Inc
    Inventors: Hamza Khan, Asma Khan, Raheel Khan
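The digit decomposition and the signed-to-unsigned correction described in the abstract of publication 20230359694 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the 8-bit operand width, the 4-bit digit width, and all function names are hypothetical, and the block splitting and systolic array are left out.

```python
def matmul(a, b):
    """Plain integer matrix product on lists of lists (reference and digit multiplier)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split_digits(m, digit_bits, num_digits):
    """Split a matrix of unsigned integers into digit matrices, least significant first."""
    mask = (1 << digit_bits) - 1
    return [
        [[(v >> (digit_bits * p)) & mask for v in row] for row in m]
        for p in range(num_digits)
    ]


def signed_matmul_via_unsigned_digits(a, b, bits=8, digit_bits=4):
    """Multiply signed matrices using only unsigned low-precision digit products.

    Assumption: entries are two's-complement values that fit in `bits` bits.
    """
    offset = 1 << (bits - 1)                 # shifts signed values into [0, 2**bits)
    num_digits = -(-bits // digit_bits)      # ceiling division
    inner = len(b)                           # shared (inner) dimension
    rows, cols = len(a), len(b[0])

    a_u = [[v + offset for v in row] for row in a]
    b_u = [[v + offset for v in row] for row in b]

    # Multiply every pair of digit matrices, scale by the digit weights, accumulate.
    acc = [[0] * cols for _ in range(rows)]
    for p, a_p in enumerate(split_digits(a_u, digit_bits, num_digits)):
        for q, b_q in enumerate(split_digits(b_u, digit_bits, num_digits)):
            partial = matmul(a_p, b_q)
            shift = digit_bits * (p + q)
            for i in range(rows):
                for j in range(cols):
                    acc[i][j] += partial[i][j] << shift

    # Undo the offset: (A' - c)(B' - c) = A'B' - c*rowsum(A') - c*colsum(B') + K*c^2.
    row_sums = [sum(row) for row in a_u]
    col_sums = [sum(col) for col in zip(*b_u)]
    return [
        [
            acc[i][j] - offset * row_sums[i] - offset * col_sums[j] + inner * offset * offset
            for j in range(cols)
        ]
        for i in range(rows)
    ]
```

Comparing the result with `matmul(a, b)` on signed inputs is a quick way to check that the row-sum and column-sum correction cancels the offset exactly.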
  • Patent number: 10423567
    Abstract: Transmission of data over a serial link based on a unidirectional clock signal is provided. A unidirectional clock signal is generated based on a first clock of a master device. The unidirectional clock signal is sent to a slave device that is connected to the serial link. The master device transmits data to the slave device over the serial link based on the first clock. The slave device receives the unidirectional clock signal from a master device. The slave device transmits data over the serial link to the master device based on the unidirectional clock signal.
    Type: Grant
    Filed: February 1, 2017
    Date of Patent: September 24, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Raheel Khan, Scott Cheng, Pascal Philippe, Joaquin Romera
  • Patent number: 10159053
    Abstract: Systems, methods, and apparatus for synchronizing timing in devices coupled to a data communication link are disclosed. In one example, a first device programs a future system time value in a second device. The first device launches a low-latency trigger signal that causes the future system time value to be loaded into a timer of the second device when a timer of the first device matches the future system time value. The second device measures phase difference between the trigger signal and edges of a clock signal used for timing in the second device. The phase difference is measured using an oversampling clock that provides a desired measurement reliability. The measured phase difference permits the first device to accurately determine system time as applied to the second device. The trigger signal can be provided on existing pins used by first and second devices in accordance with communication protocols and specifications.
    Type: Grant
    Filed: August 30, 2016
    Date of Patent: December 18, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Joaquin Romera, Graig Zethner, Raheel Khan
  • Patent number: 10061581
    Abstract: Systems and methods for performing on-the-fly format conversion on data vectors during load/store operations are described herein. In one embodiment, a method for loading a data vector from a memory into a vector unit comprises reading a plurality of samples from the memory, wherein the plurality of samples are packed in the memory. The method also comprises unpacking the samples to obtain a plurality of unpacked samples, performing format conversion on the unpacked samples in parallel, and sending at least a portion of the format-converted samples to the vector unit.
    Type: Grant
    Filed: January 31, 2014
    Date of Patent: August 28, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Raheel Khan, Jun Ho Bahn, Vijay Bantval
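The load path in the abstract of patent 10061581 (read packed samples, unpack, convert format, send to the vector unit) might be modeled in software as follows. The abstract does not name any sample formats, so the 12-bit packed layout, the sign-extension step, and the function names here are all assumptions chosen for illustration.

```python
def unpack_12bit(packed: bytes):
    """Unpack pairs of 12-bit samples stored in 3 bytes each (hypothetical layout).

    Byte 0 holds the low 8 bits of sample 0, byte 1 holds the high 4 bits of
    sample 0 (low nibble) and the low 4 bits of sample 1 (high nibble), and
    byte 2 holds the high 8 bits of sample 1.
    """
    samples = []
    for i in range(0, len(packed) - 2, 3):
        b0, b1, b2 = packed[i:i + 3]
        samples.append(b0 | ((b1 & 0x0F) << 8))
        samples.append((b1 >> 4) | (b2 << 4))
    return samples


def sign_extend(value: int, bits: int = 12) -> int:
    """Interpret an unsigned `bits`-wide field as a two's-complement integer."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def load_vector(packed: bytes):
    """Unpack, then format-convert every sample (hardware would do this in parallel)."""
    return [sign_extend(s) for s in unpack_12bit(packed)]
```

For instance, the three bytes `0x01 0xF0 0xFF` hold the 12-bit fields `0x001` and `0xFFF`, which this model converts to the signed values 1 and -1.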
  • Patent number: 9996439
    Abstract: Various aspects describe an on-chip, hardware error-generator component. In some cases, the hardware error-generator component connects to a data path between two components contained within a same chip. Upon receiving an error simulation input, the hardware error-generator component modifies data being transmitted on the data path by inserting a data pattern that simulates an error condition. Alternately or additionally, the hardware error-generator randomly alters one or more of the transmitted data bits.
    Type: Grant
    Filed: March 24, 2016
    Date of Patent: June 12, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Scott Wang-Yip Cheng, Raheel Khan, Kanwal Banga
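The two behaviors named in the abstract of patent 9996439, inserting an error-simulating data pattern and randomly altering transmitted bits, can be sketched as plain functions on a byte string. This is a software stand-in for an on-chip component; the function names, the offset parameter, and the use of a seeded random generator are assumptions.

```python
import random


def inject_pattern(data: bytes, pattern: bytes, offset: int = 0) -> bytes:
    """Overwrite part of the in-flight data with a pattern that simulates an error."""
    out = bytearray(data)
    pattern = pattern[: max(0, len(out) - offset)]  # keep the payload length unchanged
    out[offset:offset + len(pattern)] = pattern
    return bytes(out)


def flip_random_bits(data: bytes, count: int, rng: random.Random) -> bytes:
    """Flip `count` distinct, randomly chosen bits of the in-flight data."""
    out = bytearray(data)
    for pos in rng.sample(range(len(out) * 8), count):
        out[pos // 8] ^= 1 << (pos % 8)
    return bytes(out)
```

Passing a seeded `random.Random` instance keeps the injected errors reproducible, which is convenient when a test needs to replay the same fault.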
  • Patent number: 9979432
    Abstract: A serial transceiver that includes programmable distributed data processing is provided. The serial transceiver can include an ingress channel that receives serial ingress data and an egress channel that transmits serial egress data. The serial transceiver can also include first and second layers that are one and another of a transport layer, a link layer, or a physical layer (PHY). The first and second layers can include elements that process the ingress data and the egress data. The serial transceiver can also include a programmable controller, a first interconnect that connects the programmable controller to the first layer, and a second interconnect that connects the programmable controller to the second layer. The programmable controller can send first data via the first interconnect to the first layer, and the first data can be processed by one of the first layer elements.
    Type: Grant
    Filed: January 31, 2017
    Date of Patent: May 22, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Raheel Khan, Scott Cheng, Pascal Philippe, Graig Zethner, Vaidyanathan Seetharaman, Kanwal Preet S. Banga, Srinivas Badam
  • Patent number: 9977676
    Abstract: Vector processing engines (VPEs) employing reordering circuitry in data flow paths between execution units and vector data memory to provide in-flight reordering of output vector data stored to vector data memory are disclosed. Related vector processor systems and methods are also disclosed. Reordering circuitry is provided in data flow paths between execution units and vector data memory in the VPE. The reordering circuitry is configured to reorder output vector data sample sets from execution units as a result of performing vector processing operations in-flight while the output vector data sample sets are being provided over the data flow paths from the execution units to the vector data memory to be stored. In this manner, the output vector data sample sets are stored in the reordered format in the vector data memory without requiring additional post-processing steps, which may delay subsequent vector processing operations to be performed in the execution units.
    Type: Grant
    Filed: November 15, 2013
    Date of Patent: May 22, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Raheel Khan, Fahad Ali Mujahid
  • Patent number: 9880845
    Abstract: Vector processing engines (VPEs) employing format conversion circuitry in data flow paths between vector data memory and execution units to provide in-flight format-converting of input vector data to execution units for vector processing operations are disclosed. Related vector processor systems and methods are also disclosed. Format conversion circuitry is provided in data flow paths between vector data memory and execution units in the VPE. The format conversion circuitry is configured to convert input vector data sample sets fetched from vector data memory in-flight while the input vector data sample sets are being provided over the data flow paths to the execution units to be processed. In this manner, format conversion of the input vector data sample sets does not require pre-processing, storage, and re-fetching from vector data memory, thereby reducing power consumption and not limiting efficiency of the data flow paths by format conversion pre-processing delays.
    Type: Grant
    Filed: November 15, 2013
    Date of Patent: January 30, 2018
    Assignee: QUALCOMM Incorporated
    Inventor: Raheel Khan
  • Patent number: 9792118
    Abstract: Vector processing engines (VPEs) employing a tapped-delay line(s) for providing precision filter vector processing operations with reduced sample re-fetching and power consumption are disclosed. Related vector processor systems and methods are also disclosed. The VPEs are configured to provide filter vector processing operations. To minimize re-fetching of input vector data samples from memory to reduce power consumption, a tapped-delay line(s) is included in the data flow paths between a vector data file and execution units in the VPE. The tapped-delay line(s) is configured to receive and provide input vector data sample sets to execution units for performing filter vector processing operations. The tapped-delay line(s) is also configured to shift the input vector data sample set for filter delay taps and provide the shifted input vector data sample set to execution units, so the shifted input vector data sample set does not have to be re-fetched during filter vector processing operations.
    Type: Grant
    Filed: November 15, 2013
    Date of Patent: October 17, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Raheel Khan, Fahad Ali Mujahid, Afshin Shiravi
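The idea in the abstract of patent 9792118, shifting samples through a tapped-delay line so they are not re-fetched for each filter tap, can be shown with a scalar FIR filter. This is a minimal sketch of the general technique rather than the vector hardware; the function name and the zero-initialized delay line are assumptions.

```python
from collections import deque


def fir_with_tapped_delay_line(samples, coeffs):
    """Compute y[n] = sum_k coeffs[k] * x[n - k], fetching each sample only once."""
    # The delay line holds the most recent len(coeffs) samples, newest first.
    line = deque([0] * len(coeffs), maxlen=len(coeffs))
    outputs = []
    for sample in samples:       # single pass over the input: no re-fetching
        line.appendleft(sample)  # shifts older samples one tap along, drops the oldest
        outputs.append(sum(c * x for c, x in zip(coeffs, line)))
    return outputs
```

Each input sample is read from `samples` once and then reused from the delay line for as many outputs as there are taps, which is the re-fetch saving the abstract describes.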
  • Publication number: 20170230210
    Abstract: Methods, systems, and devices for wireless communication are described. A user equipment (UE) may tune an auxiliary receiver within a first radio to a transmission frequency of a co-located second radio. The auxiliary receiver may downconvert a signal from the second radio so that the UE may generate an interference estimate and perform interference cancellation. In some cases, the auxiliary receiver may also be used to perform transmission corrections for transmissions of the first radio. For example, the auxiliary receiver may be used to enable gain control or digital predistortion. The auxiliary receiver may be selectively tuned to the transmission frequency of the first radio or the second radio based on whether the auxiliary receiver is being used to perform interference cancellation or transmission correction.
    Type: Application
    Filed: February 7, 2017
    Publication date: August 10, 2017
    Inventors: Madihally Narasimha, Gurkanwal Sahota, Raheel Khan
  • Publication number: 20170222685
    Abstract: A serial transceiver that includes programmable distributed data processing is provided. The serial transceiver can include an ingress channel that receives serial ingress data and an egress channel that transmits serial egress data. The serial transceiver can also include first and second layers that are one and another of a transport layer, a link layer, or a physical layer (PHY). The first and second layers can include elements that process the ingress data and the egress data. The serial transceiver can also include a programmable controller, a first interconnect that connects the programmable controller to the first layer, and a second interconnect that connects the programmable controller to the second layer. The programmable controller can send first data via the first interconnect to the first layer, and the first data can be processed by one of the first layer elements.
    Type: Application
    Filed: January 31, 2017
    Publication date: August 3, 2017
    Inventors: Raheel Khan, Scott Cheng, Pascal Philippe, Graig Zethner, Vaidyanathan Seetharaman, Kanwal Preet S. Banga, Srinivas Badam
  • Publication number: 20170222684
    Abstract: Transmission of data over a serial link based on a unidirectional clock signal is provided. A unidirectional clock signal is generated based on a first clock of a master device. The unidirectional clock signal is sent to a slave device that is connected to the serial link. The master device transmits data to the slave device over the serial link based on the first clock. The slave device receives the unidirectional clock signal from a master device. The slave device transmits data over the serial link to the master device based on the unidirectional clock signal.
    Type: Application
    Filed: January 31, 2017
    Publication date: August 3, 2017
    Inventors: Raheel Khan, Scott Cheng, Pascal Philippe, Joaquin Romera
  • Publication number: 20170223646
    Abstract: Systems, methods, and apparatus for synchronizing timing in devices coupled to a data communication link are disclosed. In one example, a first device programs a future system time value in a second device. The first device launches a low-latency trigger signal that causes the future system time value to be loaded into a timer of the second device when a timer of the first device matches the future system time value. The second device measures phase difference between the trigger signal and edges of a clock signal used for timing in the second device. The phase difference is measured using an oversampling clock that provides a desired measurement reliability. The measured phase difference permits the first device to accurately determine system time as applied to the second device. The trigger signal can be provided on existing pins used by first and second devices in accordance with communication protocols and specifications.
    Type: Application
    Filed: August 30, 2016
    Publication date: August 3, 2017
    Inventors: Joaquin Romera, Graig Zethner, Raheel Khan
  • Publication number: 20170220517
    Abstract: Transmission of data over a serial link based on a unidirectional clock signal is provided. A unidirectional clock signal is generated based on a first clock of a master device. The unidirectional clock signal is sent to a slave device that is connected to the serial link. The master device transmits data to the slave device over the serial link based on the first clock. The slave device receives the unidirectional clock signal from a master device. The slave device transmits data over the serial link to the master device based on the unidirectional clock signal.
    Type: Application
    Filed: February 1, 2017
    Publication date: August 3, 2017
    Inventors: Raheel Khan, Scott Cheng, Pascal Philippe, Joaquin Romera
  • Publication number: 20170222686
    Abstract: Serial communication using a packetization protocol engineered for efficient transmission is provided. Data link layer (DLL) control packets can be generated for transmission of control messages. Each DLL control message packet can have a DLL control packet length, and the DLL control packet length can be a fixed length. Physical layer (PHY) control packets can be generated. Each PHY control packet can include one of the DLL control packets and a control token. The length of each PHY control packet can be the sum of the DLL control packet length and a control token length of the control token. The PHY control packets can be encapsulated in frames. Each of the frames can include a synchronization symbol having a symbol length. The length of each frame can be the sum of the symbol length and an encapsulation length, which can be twice the length of the PHY control packet.
    Type: Application
    Filed: January 31, 2017
    Publication date: August 3, 2017
    Inventors: Raheel Khan, Scott Cheng, Pascal Philippe, Joaquin Romera
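The length relationships stated in the abstract of publication 20170222686 reduce to simple arithmetic, sketched below. The abstract gives no concrete sizes, so the function names and any numeric lengths passed in are hypothetical, and the factor of two reflects only the case the abstract says the encapsulation length "can be".

```python
def phy_control_packet_length(dll_packet_len: int, control_token_len: int) -> int:
    """PHY control packet = one fixed-length DLL control packet plus a control token."""
    return dll_packet_len + control_token_len


def frame_length(symbol_len: int, dll_packet_len: int, control_token_len: int,
                 packets_per_frame: int = 2) -> int:
    """Frame = synchronization symbol plus the encapsulated PHY control packets.

    Assumption: the encapsulation length is `packets_per_frame` times the PHY
    control packet length, with two as the case mentioned in the abstract.
    """
    encapsulation_len = packets_per_frame * phy_control_packet_length(
        dll_packet_len, control_token_len
    )
    return symbol_len + encapsulation_len
```

With made-up lengths of 24 for the DLL control packet, 8 for the control token, and 2 for the synchronization symbol, the PHY control packet comes to 32 and the frame to 2 + 2 × 32 = 66, in whatever unit the lengths are measured.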
  • Patent number: 9684509
    Abstract: Vector processing engines (VPEs) employing merging circuitry in data flow paths between execution units and vector data memory to provide in-flight merging of output vector data stored to vector data memory are disclosed. Related vector processing instructions, systems, and methods are also disclosed. Merging circuitry is provided in data flow paths between execution units and vector data memory in the VPE. The merging circuitry is configured to merge an output vector data sample set from execution units as a result of performing vector processing operations in-flight while the output vector data sample set is being provided over the output data flow paths from the execution units to the vector data memory to be stored. The merged output vector data sample set is stored in a merged form in the vector data memory without requiring additional post-processing steps, which may delay subsequent vector processing operations to be performed in execution units.
    Type: Grant
    Filed: November 15, 2013
    Date of Patent: June 20, 2017
    Assignee: QUALCOMM Incorporated
    Inventor: Raheel Khan
  • Patent number: 9619227
    Abstract: Vector processing engines (VPEs) employing a tapped-delay line(s) for providing precision correlation/covariance vector processing operations with reduced sample re-fetching and/or power consumption are disclosed. The VPEs disclosed herein are configured to provide correlation/covariance vector processing operations, such as code division multiple access (CDMA) correlation/covariance vector processing operations as a non-limiting example. A tapped-delay line(s) is included in the data flow paths between memory and execution units in the VPE. The tapped-delay line(s) is configured to receive and provide an input vector data sample set to execution units for performing correlation/covariance vector processing operations.
    Type: Grant
    Filed: November 15, 2013
    Date of Patent: April 11, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Raheel Khan, Fahad Ali Mujahid, Afshin Shiravi
  • Publication number: 20170085475
    Abstract: Various aspects of this disclosure describe a bi-directional, dual interconnect bus configured in a ring to route data to processors implementing modem functions. A plurality of nodes may be coupled to form a ring bus comprising at least two interconnect rings. A plurality of processors may be assigned to the plurality of nodes. A first processor among the plurality of processors may be configured to process a first data type, and a second processor among the plurality of processors may be configured to process a second data type. Data on the ring bus may be separated into the first data type and the second data type, and separated data of the first data type may be routed on one interconnect ring to the first processor and separated data of the second data type may be routed on another interconnect ring to the second processor.
    Type: Application
    Filed: March 24, 2016
    Publication date: March 23, 2017
    Inventors: Scott Wang-Yip Cheng, Raheel Khan, Vijay Bantval, Jun Ho Bahn
  • Publication number: 20170083441
    Abstract: Apparatuses and techniques are disclosed herein that enable region-based cache management. In some aspects, a configuration for a region of cache memory is determined based on characteristics of information to be written to the cache memory. Based on the determined configuration, an address range of the cache memory is allocated to define the region within the cache memory. A cache policy is then applied to the allocated address range to control caching of the information written to the region of cache memory. By so doing, regions of cache memory and respective caching policies applied thereto can be optimized for a variety of information types or usages.
    Type: Application
    Filed: March 24, 2016
    Publication date: March 23, 2017
    Inventors: Scott Wang-Yip Cheng, Raheel Khan, Warren Lew
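The three steps in the abstract of publication 20170083441 (determine a configuration from the information's characteristics, allocate an address range as a region, apply a cache policy to that range) could be modeled with a small region table. Everything concrete here is an assumption made for illustration: the class and function names, the policy strings, and the single "streaming" characteristic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start: int   # first address in the region
    end: int     # one past the last address
    policy: str  # hypothetical policy label


def choose_policy(streaming: bool) -> str:
    """Pick a policy from a characteristic of the data (hypothetical rule)."""
    return "no-allocate" if streaming else "write-back"


class RegionTable:
    """Maps address ranges of a cache to the policy applied within each range."""

    def __init__(self, default_policy: str = "write-back"):
        self.default_policy = default_policy
        self.regions = []

    def allocate(self, start: int, size: int, policy: str) -> Region:
        """Define a region; regions are not allowed to overlap in this model."""
        new = Region(start, start + size, policy)
        if any(new.start < r.end and r.start < new.end for r in self.regions):
            raise ValueError("region overlaps an existing region")
        self.regions.append(new)
        return new

    def policy_for(self, address: int) -> str:
        """Return the policy governing an address, or the default outside any region."""
        for r in self.regions:
            if r.start <= address < r.end:
                return r.policy
        return self.default_policy
```

A caller would pick a policy with `choose_policy`, pass it to `allocate` together with the address range, and then consult `policy_for` on each access.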
  • Publication number: 20170083422
    Abstract: Various aspects describe an on-chip, hardware error-generator component. In some cases, the hardware error-generator component connects to a data path between two components contained within a same chip. Upon receiving an error simulation input, the hardware error-generator component modifies data being transmitted on the data path by inserting a data pattern that simulates an error condition. Alternately or additionally, the hardware error-generator randomly alters one or more of the transmitted data bits.
    Type: Application
    Filed: March 24, 2016
    Publication date: March 23, 2017
    Inventors: Scott Wang-Yip Cheng, Raheel Khan, Kanwal Banga