Patents by Inventor Kun Tian

Kun Tian has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11099880
    Abstract: A processing device comprises an address translation circuit to intercept a work request from an I/O device. The work request comprises a first ASID to map to a work queue. A second ASID of a host is allocated for the first ASID based on the work queue. The second ASID is allocated to at least one of: an ASID register for a dedicated work queue (DWQ) or an ASID translation table for a shared work queue (SWQ). Responsive to receiving a work submission from the SVM client to the I/O device, the first ASID of the application container is translated to the second ASID of the host machine for submission to the I/O device using at least one of: the ASID register for the DWQ or the ASID translation table for the SWQ based on the work queue associated with the I/O device.
    Type: Grant
    Filed: February 22, 2017
    Date of Patent: August 24, 2021
    Assignee: Intel Corporation
    Inventors: Sanjay Kumar, Rajesh M. Sankaran, Gilbert Neiger, Philip R. Lantz, Jason W. Brandt, Vedvyas Shanbhogue, Utkarsh Y. Kakaiya, Kun Tian
  • Patent number: 11069021
    Abstract: A display engine comprises a surface splitter to generate frame buffer coordinates to split frame buffer data into a plurality of regions, each corresponding to a frame buffer coordinate, a pipeline, including a plurality of pipes, to receive the frame buffer coordinates, wherein two or more of the plurality of pipes operate in parallel to process frame buffer data corresponding to a region of the frame buffer identified by the frame buffer coordinates, a first of a plurality of transcoders to merge the frame buffer data from each of the two or more pipes into an output signal whenever the display engine is operating in a multi-pipe collaboration mode and a multiplexer (Mux) and multi-stream arbiter to control an order of transmission of the frame buffer data from each of the two or more pipes to the first transcoder based on a fetch order received from the surface splitter.
    Type: Grant
    Filed: July 2, 2016
    Date of Patent: July 20, 2021
    Assignee: Intel Corporation
    Inventors: Dingyu Pei, Kun Tian
  • Patent number: 11055147
    Abstract: Techniques for scalable virtualization of an Input/Output (I/O) device are described. An electronic device composes a virtual device comprising one or more assignable interface (AI) instances of a plurality of AI instances of a hosting function exposed by the I/O device. The electronic device emulates device resources of the I/O device via the virtual device. The electronic device intercepts a request from the guest pertaining to the virtual device, and determines whether the request from the guest is a fast-path operation to be passed directly to one of the one or more AI instances of the I/O device or a slow-path operation that is to be at least partially serviced via software executed by the electronic device. For a slow-path operation, the electronic device services the request at least partially via the software executed by the electronic device.
    Type: Grant
    Filed: March 12, 2019
    Date of Patent: July 6, 2021
    Assignee: Intel Corporation
    Inventors: Utkarsh Y. Kakaiya, Rajesh Sankaran, Sanjay Kumar, Kun Tian, Philip Lantz
  • Publication number: 20210194828
    Abstract: Methods and apparatus for smart switch centered next generation cloud infrastructure architectures. Smart server switches are implemented in place of Top of Rack (ToR) switches and other switches in cloud infrastructure that include programmable switch chips (e.g., P4 switch chips) that are programmed via data plane runtime code executing on the switch chips to implement data plane operations in hardware in the switches. Meanwhile, control plane operations are implemented in the server switches via software executing on one or more CPUs or are implemented via servers that are coupled to the server switches. The data plane runtime code is used to forward data traffic and storage traffic in hardware via the programmable switch chips in a manner that offloads forwarding to hardware in virtualized cloud environments.
    Type: Application
    Filed: December 7, 2020
    Publication date: June 24, 2021
    Inventors: Shaopeng He, Jingjing Wu, Haitao Kang, Yadong Li, Kun Tian
  • Publication number: 20210173790
    Abstract: Embodiments of apparatuses, methods, and systems for unified address translation for virtualization of input/output devices are described. In an embodiment, an apparatus includes first circuitry to use at least an identifier of a device to locate a context entry and second circuitry to use at least a process address space identifier (PASID) to locate a PASID-entry. The context entry is to include at least one of a page-table pointer to a page-table translation structure and a PASID. The PASID-entry is to include at least one of a first-level page-table pointer to a first-level translation structure and a second-level page-table pointer to a second-level translation structure. The PASID is to be supplied by the device. At least one of the apparatus, the context entry, and the PASID entry is to include one or more control fields to indicate whether the first-level page-table pointer or the second-level page-table pointer is to be used.
    Type: Application
    Filed: December 29, 2017
    Publication date: June 10, 2021
    Applicant: Intel Corporation
    Inventors: Utkarsh Y. Kakaiya, Sanjay Kumar, Rajesh M. Sankaran, Philip R. Lantz, Ashok Raj, Kun Tian
  • Publication number: 20210158471
    Abstract: Embodiments described herein provide techniques to enable a compute unit to continue processing operations when all dispatched threads are blocked. One embodiment provides for a method comprising executing multiple concurrent threads on a processing resource of a graphics processor, during execution, detecting that each of the multiple concurrent threads of the processing resource is blocked from execution, selecting a victim thread from the multiple concurrent threads, and suspending the victim thread. The thread state is stored to a thread scratch space in memory along with a blocking event associated with the victim thread.
    Type: Application
    Filed: November 16, 2020
    Publication date: May 27, 2021
    Applicant: Intel Corporation
    Inventors: Murali Ramadoss, Balaji Vembu, Eric C. Samson, Kun Tian, David J. Cowperthwaite, Altug Koker, Zhi Wang, Joydeep Ray, Subramaniam M. Maiyuran, Abhishek R. Appu
  • Patent number: 10996968
    Abstract: Methods, software, and apparatus for application-transparent, highly available GPU computing with VM checkpointing. Guest accesses to certain GPU resources, such as MMIO resources, are trapped to keep a copy of guest context per semantics, and/or to emulate the guest access of the resources prior to submission to the GPU, while other commands relating to certain graphics memory address regions are trapped before being passed through to the GPU. The trapped commands are scanned before submission to predict: a) potential to-be-dirtied graphics memory pages, and b) the execution time of intercepted commands, so the next checkpointing can be aligned to a predicted execution time. The GPU internal states are drained by flushing internal context/TLB/cache at the completion of submitted commands, and then a snapshot of the vGPU state is taken, based on tracked GPU state, GPU context (through GPU-specific commands), detected dirty graphics memory pages, and predicted to-be-dirtied graphics memory pages.
    Type: Grant
    Filed: November 24, 2014
    Date of Patent: May 4, 2021
    Assignee: Intel Corporation
    Inventors: Yaozu Dong, Kun Tian
  • Patent number: 10991867
    Abstract: A thermoelectric material, having a formula Tb_xM1_{y-x}M2_zO_w where M1 is one of Ca, Mg, Sr, Ba and Ra, M2 is at least one of Co, Fe, Ni, and Mn, x ranges from 0.01 to 5; y is 1, 2, 3, or 5; z is 1, 2, 3, or 4; and w is 1, 2, 3, 4, 5, 7, 8, 9, or 14. The thermoelectric material is chemically stable within 5% for one year and is also non-toxic. The thermoelectric material can also be incorporated into a thermoelectric system which can be used to generate electricity from waste heat sources or to cool an adjacent region.
    Type: Grant
    Filed: August 7, 2017
    Date of Patent: April 27, 2021
    Assignee: University of Utah Research Foundation
    Inventors: Ashutosh Tiwari, Shrikant Saini, Kun Tian, Haritha Sree Yaddanapudi, Yinong Yin
  • Patent number: 10983821
    Abstract: An apparatus and method are described for implementing a hybrid layer of address mapping for an IOMMU implementation.
    Type: Grant
    Filed: September 26, 2016
    Date of Patent: April 20, 2021
    Assignee: Intel Corporation
    Inventors: Xiao Zheng, Yao Zu Dong, Kun Tian
  • Patent number: 10970129
    Abstract: Technologies for scheduling workload submissions for a graphics processing unit (GPU) in a virtualization environment include a GPU scheduler embodied in a computing device. The virtualization environment includes a number of different virtual machines that are configured with a native graphics driver. The GPU scheduler receives GPU commands from the different virtual machines, dynamically selects a scheduling policy, and schedules the GPU commands for processing by the GPU.
    Type: Grant
    Filed: October 10, 2018
    Date of Patent: April 6, 2021
    Assignee: Intel Corporation
    Inventors: Kun Tian, Zhiyuan Lv, Yao Zu Dong
  • Publication number: 20210097493
    Abstract: The disclosed embodiments provide a system for predicting response rates. During operation, the system determines features representing historical applications to opportunities and historical responses to the historical applications by a poster of the opportunities, wherein the historical responses include interactions between the poster and candidates submitting the historical applications and notifications transmitted to the candidates of actions related to the historical applications by the poster. Next, the system applies one or more operations to the features to generate a predicted response rate of the poster to an application for an opportunity by a member of the online system. The system then compares the predicted response rate to a threshold to determine a recommendation related to the application by the member. Finally, the system outputs, to the member in a user interface of the online system, the recommendation in association with the opportunity.
    Type: Application
    Filed: September 30, 2019
    Publication date: April 1, 2021
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Fenglin Li, Qing Duan, Aditya S. Aiyer, Kun Tian
  • Publication number: 20210064525
    Abstract: A processor includes a hardware input/output (I/O) memory management unit (IOMMU) and a core, which executes an instruction to intercept a payload from a virtual machine (VM). The payload contains a guest bus device function (BDF) identifier, a guest address space identifier (ASID), and a guest address range. The core accesses, within a virtual machine control structure stored in memory, pointers to a first set of translation tables and a second set of translation tables. The core traverses the first set of translation tables to translate the guest BDF identifier to a host BDF identifier and traverses the second set of translation tables to translate the guest ASID to a host ASID. The core stores the host BDF identifier and the host ASID in the payload and submits, to the hardware IOMMU, an administrative command containing the payload to perform invalidation of the guest address range.
    Type: Application
    Filed: January 2, 2018
    Publication date: March 4, 2021
    Applicant: Intel Corporation
    Inventors: Kun Tian, Rajesh Sankaran, Sanjay Kumar, Ashok Raj
  • Publication number: 20210056051
    Abstract: An apparatus and method are described for implementing memory management in a graphics processing system. For example, one embodiment of an apparatus comprises: a first plurality of graphics processing resources to execute graphics commands and process graphics data; a first memory management unit (MMU) to communicatively couple the first plurality of graphics processing resources to a system-level MMU to access a system memory; a second plurality of graphics processing resources to execute graphics commands and process graphics data; a second MMU to communicatively couple the second plurality of graphics processing resources to the first MMU; wherein the first MMU is configured as a master MMU having a direct connection to the system-level MMU and the second MMU comprises a slave MMU configured to send memory transactions to the first MMU, the first MMU either servicing a memory transaction or sending the memory transaction to the system-level MMU on behalf of the second MMU.
    Type: Application
    Filed: September 1, 2020
    Publication date: February 25, 2021
    Inventors: Niranjan L. Cooray, Abhishek R. Appu, Altug Koker, Joydeep Ray, Balaji Vembu, Pattabhiraman K, David Puffer, David J. Cowperthwaite, Rajesh M. Sankaran, Satyeshwar Singh, Sameer KP, Ankur N. Shah, Kun Tian
  • Patent number: 10929157
    Abstract: Examples may include determining a policy for primary and secondary virtual machines based on output-packet-similarities. The output-packet-similarities may be based on a comparison of time intervals via which content matched for packets outputted from the primary and secondary virtual machines. A mode may then be selected based, at least in part, on the determined policy.
    Type: Grant
    Filed: February 26, 2020
    Date of Patent: February 23, 2021
    Assignee: Intel Corporation
    Inventors: Kun Tian, Yao Zu Dong
  • Publication number: 20210035348
    Abstract: A virtual reality apparatus and method are described for tile-based rendering. For example, one embodiment of an apparatus comprises: a set of on-chip geometry buffers including a first buffer to store geometry data, and a set of pointer buffers to store pointers to the geometry data; a tile-based immediate mode rendering (TBIMR) module to perform tile-based immediate mode rendering using geometry data and pointers stored within the set of on-chip geometry buffers; spill circuitry to determine when the on-chip geometry buffers are over-subscribed and responsively spill additional geometry data and/or pointers to an off-chip memory; and a prefetcher to start prefetching the geometry data from the off-chip memory as space becomes available within the on-chip geometry buffers, the TBIMR module to perform tile-based immediate mode rendering using the geometry data prefetched from the off-chip memory.
    Type: Application
    Filed: October 16, 2020
    Publication date: February 4, 2021
    Inventors: Prasoonkumar Surti, Tomas G. Akenine-Moller, David J. Cowperthwaite, Kun Tian, Peter L. Doyle, Brent E. Insko, Adam T. Lake
  • Publication number: 20210004334
    Abstract: Embodiments of this disclosure provide a mechanism to extend a workload instruction to include both untranslated and translated address space identifiers (ASIDs). In one embodiment, a processing device comprising a translation manager is provided. The translation manager receives a workload instruction from a guest application. The workload instruction comprises an untranslated ASID and a workload for an input/output (I/O) device. The untranslated ASID is translated to a translated ASID. The translated ASID is inserted into a payload of the workload instruction. Thereupon, the payload is provided to a work queue of the I/O device to execute the workload based in part on at least one of: the translated ASID or the untranslated ASID.
    Type: Application
    Filed: March 28, 2018
    Publication date: January 7, 2021
    Inventors: Kun Tian, Xiao Zheng, Ashok Raj, Sanjay Kumar, Rajesh Sankaran
  • Patent number: 10853118
    Abstract: An apparatus and method are described for pattern driven page table updates. For example, one embodiment of an apparatus comprises a graphics processing unit (GPU) to process graphics commands and responsively render a plurality of image frames; a hypervisor to virtualize the GPU to share the GPU among a plurality of virtual machines (VMs); a first guest page table managed within a first VM, the first guest page table comprising a plurality of page table entries; a first shadow page table managed by the hypervisor and comprising page table entries corresponding to the page table entries of the first guest page table; and a command parser to analyze a current working set of commands submitted from the first VM to the GPU, the command parser to responsively update the first shadow page table responsive to determining a set of page table entries predicted to be used based on the analysis of the working set of commands.
    Type: Grant
    Filed: December 21, 2015
    Date of Patent: December 1, 2020
    Assignee: Intel Corporation
    Inventors: Kun Tian, Yao Zu Dong
  • Patent number: 10839476
    Abstract: Embodiments described herein provide techniques to enable a compute unit to continue processing operations when all dispatched threads are blocked. One embodiment provides for a graphics processor comprising a compute unit to execute multiple concurrent threads and a memory coupled with and on a same package as the compute unit. The memory can store thread state for a suspended thread and the compute unit can detect that multiple concurrent threads of the compute unit are blocked from execution. Upon detection, the compute unit can select a victim thread from the multiple concurrent threads, suspend the victim thread, store thread state of the victim thread to the memory, and select an additional thread to be executed. The compute unit can then replace the victim thread with an additional thread to be executed. The additional thread to be executed can be based on a blocking event for the additional thread.
    Type: Grant
    Filed: August 20, 2019
    Date of Patent: November 17, 2020
    Assignee: Intel Corporation
    Inventors: Murali Ramadoss, Balaji Vembu, Eric C. Samson, Kun Tian, David J. Cowperthwaite, Altug Koker, Zhi Wang, Joydeep Ray, Subramaniam M. Maiyuran, Abhishek R. Appu
  • Patent number: 10831625
    Abstract: An apparatus and method for performing debug and rollback operations using snapshots. For example, one embodiment of an apparatus comprises: a graphics processing unit (GPU) to perform graphics processing operations by executing graphics commands; a command parser to parse graphics commands submitted to the GPU and generate a list of graphics memory pages which will be affected by the graphics commands; an I/O state tracker to track I/O accesses from a graphics driver to determine a list of registers affected by the I/O accesses; snapshot circuitry and/or logic to perform a memory snapshot and I/O snapshot based on the list of graphics memory pages and the list of registers, respectively; and rollback circuitry and/or logic to perform a rollback operation using the memory snapshot and I/O snapshot in response to detecting a GPU error condition.
    Type: Grant
    Filed: April 1, 2016
    Date of Patent: November 10, 2020
    Assignee: Intel Corporation
    Inventors: Yao Zu Dong, Kun Tian
  • Publication number: 20200294183
    Abstract: Systems and methods for container access to graphics processing unit (GPU) resources are disclosed herein. In some embodiments, a computing system may include a physical GPU and kernel-mode driver circuitry, to communicatively couple with the physical GPU to create a plurality of emulated GPUs and a corresponding plurality of device nodes. Each device node may be associated with a single corresponding user-side container to enable communication between the user-side container and the corresponding emulated GPU. Other embodiments may be disclosed and/or claimed.
    Type: Application
    Filed: February 14, 2020
    Publication date: September 17, 2020
    Inventors: Kun Tian, Yao Zu Dong, Zhiyuan Lv