Patents by Inventor Guansong Zhang

Guansong Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9218186
    Abstract: A computer-implemented method for creating a threaded package of computer executable instructions from software compiler generated code includes allocating, through a computer processor, the computer executable instructions into a plurality of stacks, differentiating between different types of computer executable instructions for each computer executable instruction allocated to each stack of the plurality of stacks, creating switch points for each stack of the plurality of stacks based upon the differentiating, and inserting the switch points within each stack of the plurality of stacks.
    Type: Grant
    Filed: September 1, 2011
    Date of Patent: December 22, 2015
    Assignee: International Business Machines Corporation
    Inventors: Raul E. Silvera, Guansong Zhang, Yue Zhao
  • Patent number: 9112625
    Abstract: A method and apparatus for emulating a stream clock signal in asynchronous data transmission. The inventive subject matter proposes a system consisting of a transmitter module, a receiver module, and a link or network in between. A scheme to generate the emulated stream clock across a wide frequency range is also proposed, with the properties of controllable deviation from the original stream frequency to meet jitter requirements and fast frequency convergence (a minimal number of converging steps). The scheme includes an optional first step to derive a frequency estimate of the stream clock and a second step of continuously adjusting the emulated clock frequency to keep its average frequency equal to that of the original stream clock.
    Type: Grant
    Filed: June 21, 2010
    Date of Patent: August 18, 2015
    Inventors: Guansong Zhang, Tsung-Yi Yang, Cathy Zhang
  • Publication number: 20150110125
    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, payload data originated by a user process running on a host processor of the computer system is fetched by an interface of the computer system by performing direct virtual memory addressing of a user memory space of a system memory of the computer system on behalf of a network processor of the computer system. The direct virtual memory addressing maps a physical address of the payload data to a virtual address. The payload data is segmented by the network processor across one or more packets.
    Type: Application
    Filed: December 12, 2014
    Publication date: April 23, 2015
    Applicant: Fortinet, Inc.
    Inventors: Xu Zhou, David Chen, Lin Huang, Guansong Zhang
  • Patent number: 8964785
    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, a user process of a host processor requests a network driver to store payload data within a system memory. The network driver stores (i) payload buffers each containing therein at least a subset of the payload data and (ii) buffer descriptors each containing therein information indicative of a starting address of a corresponding payload buffer within a user memory space. A network processor transmits onto a network the payload data within multiple transport layer protocol packets by (i) causing a network interface to retrieve the payload data from the payload buffers by performing direct virtual memory addressing of the user memory space using the buffer descriptors and information contained within a translation data structure stored within the system memory; and (ii) segmenting the payload data across the transport layer protocol packets.
    Type: Grant
    Filed: March 29, 2013
    Date of Patent: February 24, 2015
    Assignee: Fortinet, Inc.
    Inventors: Xu Zhou, David Chen, Lin Huang, Guansong Zhang
  • Patent number: 8527962
    Abstract: A method for promotion of a child procedure in a software application for a heterogeneous architecture, wherein the heterogeneous architecture comprises a first architecture type and a second architecture type, comprises inserting a parameter representing a parallel frame pointer to a parent procedure of the child procedure into the child procedure; and modifying a reference in the child procedure to a stack variable of the parent procedure to include an indirect access to the parent procedure via the parallel frame pointer.
    Type: Grant
    Filed: March 10, 2009
    Date of Patent: September 3, 2013
    Assignee: International Business Machines Corporation
    Inventors: Raul Silvera, Ettore Tiotto, Guansong Zhang
  • Patent number: 8411702
    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, a method is provided for performing transport layer protocol segmentation offloading. Multiple buffer descriptors are stored in a system memory of a network device. The buffer descriptors contain information indicative of a starting address of a payload buffer stored in a user memory space of the system memory. The payload buffers contain payload data originated by a user process running on a host processor of the network device. The payload data is retrieved from the payload buffers on behalf of a network processor of the network device without copying the payload data from the user memory space to a kernel memory space of the system memory by performing direct virtual memory addressing of the user memory space. Finally, the payload data is segmented across one or more transport layer protocol packets.
    Type: Grant
    Filed: April 28, 2011
    Date of Patent: April 2, 2013
    Assignee: Fortinet, Inc.
    Inventors: Xu Zhou, David Chen, Lin Huang, Guansong Zhang
  • Publication number: 20130061000
    Abstract: A computer-implemented method for creating a threaded package of computer executable instructions from software compiler generated code includes allocating, through a computer processor, the computer executable instructions into a plurality of stacks, differentiating between different types of computer executable instructions for each computer executable instruction allocated to each stack of the plurality of stacks, creating switch points for each stack of the plurality of stacks based upon the differentiating, and inserting the switch points within each stack of the plurality of stacks.
    Type: Application
    Filed: September 1, 2011
    Publication date: March 7, 2013
    Applicant: International Business Machines Corporation
    Inventors: Raul E. Silvera, Guansong Zhang, Yue Zhao
  • Patent number: 8375375
    Abstract: A method and system for auto-parallelization of zero-trip loops that substitutes a nested basic linear induction variable by exploiting a parallelizing compiler is provided. A max{0,N} expression is used for the loop iteration count when no information is known about the value of N, for a typical loop iterating from 1 to N in which N is loop invariant. For nested basic induction variables, an induction variable substitution process is applied to the nested loops, starting from the innermost loop and proceeding to the outermost one. The max operator is then removed through a copy propagation pass of the IBM compiler. In doing so, the loop dependency on the induction variable is eliminated, giving a parallelizing compiler an opportunity to parallelize the outermost loop.
    Type: Grant
    Filed: January 21, 2009
    Date of Patent: February 12, 2013
    Assignee: International Business Machines Corporation
    Inventors: Zhixing Ren, Raul Esteban Silvera, Guansong Zhang
  • Patent number: 8341615
    Abstract: Embodiments of the present invention address deficiencies of the art in respect to loop parallelization for a target architecture implementing a shared memory model and provide a novel and non-obvious method, system and computer program product for SIMD code generation for parallel loops using versioning and scheduling. In an embodiment of the invention, within a code compilation data processing system a parallel SIMD loop code generation method can include identifying a loop in a representation of source code as a parallel loop candidate, either through a user directive or through auto-parallelization.
    Type: Grant
    Filed: July 11, 2008
    Date of Patent: December 25, 2012
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Raul E. Silvera, Amy K. Wang, Guansong Zhang
  • Patent number: 8104030
    Abstract: A computer implemented method, computer usable program code, and a system for parallelizing a loop. A parameter that will be used to limit parallelization of the loop is identified. The parameter specifies a minimum number of loop iterations that a thread should execute. The parameter can be adjusted based on a parallel performance factor. A parallel performance factor is a factor that influences the performance of parallel code. A number of threads from a plurality of threads is selected for processing iterations of the loop based on the parameter. The number of threads is selected prior to execution of the first iteration of the loop.
    Type: Grant
    Filed: December 21, 2005
    Date of Patent: January 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Raul Esteban Silvera, Priya Unnikrishnan, Guansong Zhang
  • Publication number: 20110311011
    Abstract: A method and apparatus for emulating a stream clock signal in asynchronous data transmission. The inventive subject matter proposes a system consisting of a transmitter module, a receiver module, and a link or network in between. A scheme to generate the emulated stream clock across a wide frequency range is also proposed, with the properties of controllable deviation from the original stream frequency to meet jitter requirements and fast frequency convergence (a minimal number of converging steps). The scheme includes an optional first step to derive a frequency estimate of the stream clock and a second step of continuously adjusting the emulated clock frequency to keep its average frequency equal to that of the original stream clock.
    Type: Application
    Filed: June 21, 2010
    Publication date: December 22, 2011
    Inventors: Guansong Zhang, Tsung-Yi Yang, Cathy Zhang
  • Publication number: 20110200057
    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, a method is provided for performing transport layer protocol segmentation offloading. Multiple buffer descriptors are stored in a system memory of a network device. The buffer descriptors contain information indicative of a starting address of a payload buffer stored in a user memory space of the system memory. The payload buffers contain payload data originated by a user process running on a host processor of the network device. The payload data is retrieved from the payload buffers on behalf of a network processor of the network device without copying the payload data from the user memory space to a kernel memory space of the system memory by performing direct virtual memory addressing of the user memory space. Finally, the payload data is segmented across one or more transport layer protocol packets.
    Type: Application
    Filed: April 28, 2011
    Publication date: August 18, 2011
    Applicant: Fortinet, Inc.
    Inventors: Xu Zhou, David Chen, Lin Huang, Guansong Zhang
  • Patent number: 7944946
    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, a method is provided for performing segmentation offloading, such as TCP segmentation offloading (TSO). An interface performs direct virtual memory addressing of a user memory space of a system memory on behalf of a network processor to fetch payload data originated by a user process running on a host processor. Then, the network processor segments the payload data across one or more packets.
    Type: Grant
    Filed: October 21, 2008
    Date of Patent: May 17, 2011
    Assignee: Fortinet, Inc.
    Inventors: Xu Zhou, David Chen, Lin Huang, Guansong Zhang
  • Patent number: 7877739
    Abstract: A computer-implemented method for determining whether an array within a loop can be privatized for that loop is presented. The method calculates the array sections that require first or last privatization and copies only those sections, reducing the privatization overhead of the known solutions.
    Type: Grant
    Filed: October 9, 2006
    Date of Patent: January 25, 2011
    Assignee: International Business Machines Corporation
    Inventors: Roch G. Archambault, Erik P. Charlebois, Guansong Zhang
  • Publication number: 20100235811
    Abstract: A method for promotion of a child procedure in a software application for a heterogeneous architecture, wherein the heterogeneous architecture comprises a first architecture type and a second architecture type, comprises inserting a parameter representing a parallel frame pointer to a parent procedure of the child procedure into the child procedure; and modifying a reference in the child procedure to a stack variable of the parent procedure to include an indirect access to the parent procedure via the parallel frame pointer.
    Type: Application
    Filed: March 10, 2009
    Publication date: September 16, 2010
    Applicant: International Business Machines Corporation
    Inventors: Raul Silvera, Ettore Tiotto, Guansong Zhang
  • Patent number: 7689977
    Abstract: The present disclosure is directed to a method for providing an OpenMP reduction implementation. The method may comprise creating an aggregate of at least one reduction variable in a parallel region or a work-sharing construct; defining a pointer variable, the pointer variable pointing to a dynamic array of the aggregate; creating an initialization routine, an outlined routine and a reduction accumulation routine; replacing the parallel region or the work-sharing construct with a runtime routine, the runtime routine taking a plurality of arguments including an address of the initialization routine, an address of the outlined routine, an address of the reduction accumulation routine, an address of the pointer variable, and a size of the aggregate; and executing the runtime routine when the at least one reduction variable is in the parallel region or the work-sharing construct.
    Type: Grant
    Filed: April 15, 2009
    Date of Patent: March 30, 2010
    Assignee: International Business Machines Corporation
    Inventors: Guansong Zhang, Shimin Cui, Ettore Tiotto
  • Publication number: 20100011339
    Abstract: Embodiments of the present invention address deficiencies of the art in respect to loop parallelization for a target architecture implementing a shared memory model and provide a novel and non-obvious method, system and computer program product for SIMD code generation for parallel loops using versioning and scheduling. In an embodiment of the invention, within a code compilation data processing system a parallel SIMD loop code generation method can include identifying a loop in a representation of source code as a parallel loop candidate, either through a user directive or through auto-parallelization.
    Type: Application
    Filed: July 11, 2008
    Publication date: January 14, 2010
    Applicant: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Raul E. Silvera, Amy K. Wang, Guansong Zhang
  • Publication number: 20090307363
    Abstract: Methods and systems are provided for network protocol reassembly acceleration. According to one embodiment, an incoming packet is received at a network interface. Payload data from the packet is written by a memory interface to a physical page within a system memory on behalf of the network interface based on a sequence number associated with the incoming packet and by obtaining a physical address from a virtual memory map corresponding to an incoming session with which the packet is associated. After the physical page is full, the physical page is made accessible to a user process being executed by a processor associated with the system memory by remapping the physical page through a paging table used by the user process.
    Type: Application
    Filed: October 22, 2008
    Publication date: December 10, 2009
    Inventors: Xu Zhou, David Chen, Lin Huang, Guansong Zhang
  • Publication number: 20090304029
    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, a method is provided for performing segmentation offloading, such as TCP segmentation offloading (TSO). An interface performs direct virtual memory addressing of a user memory space of a system memory on behalf of a network processor to fetch payload data originated by a user process running on a host processor. Then, the network processor segments the payload data across one or more packets.
    Type: Application
    Filed: October 21, 2008
    Publication date: December 10, 2009
    Inventors: Xu Zhou, David Chen, Lin Huang, Guansong Zhang
  • Patent number: 7581222
    Abstract: The present invention provides an approach for barrier synchronization. The barrier has a first array of elements with each element of the first array having an associated process, and a second array of elements with each element of the second array having an associated process. Prior to use, the values or states of the elements in each array may be initialized. As each process finishes its phase and arrives at the barrier, it may update the value or state of its associated element in the first array. Each process may then proceed to spin at its associated element in the second array, waiting for that element to switch. When the values or states of the elements of the first array reach a predetermined value or state, an instruction is sent to all of the elements in the second array to switch their values or states, allowing all processes to leave.
    Type: Grant
    Filed: November 20, 2003
    Date of Patent: August 25, 2009
    Assignee: International Business Machines Corporation
    Inventors: Robert James Blainey, Guansong Zhang
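
As a rough illustration of the two-array barrier described in the abstract for patent 7581222, the following Python sketch may help. It is not the patented implementation. The class name `TwoArrayBarrier`, the per-thread phase counter, the choice of thread 0 as the one that watches the first array, and the `run_demo` helper are all assumptions made for this example; the abstract does not say which party detects that every process has arrived.

```python
import threading
import time


class TwoArrayBarrier:
    """Illustrative barrier with an arrival array and a release array."""

    def __init__(self, n):
        self.n = n
        self.arrived = [0] * n  # first array: one element per thread
        self.release = [0] * n  # second array: one element per thread
        self.phase = [0] * n    # hypothetical per-thread phase counter for reuse

    def wait(self, tid):
        self.phase[tid] += 1
        phase = self.phase[tid]
        # Announce arrival by updating this thread's element of the first array.
        self.arrived[tid] = phase
        if tid == 0:
            # Assumption: thread 0 watches the first array for completion.
            while any(a != phase for a in self.arrived):
                time.sleep(0)  # yield so the other threads can make progress
            # Everyone has arrived: switch every element of the second array.
            for i in range(self.n):
                self.release[i] = phase
        else:
            # Spin only on this thread's own element of the second array.
            while self.release[tid] != phase:
                time.sleep(0)


def run_demo(n_threads=4, rounds=3):
    """Run a few barrier-separated rounds and return the order of events."""
    barrier = TwoArrayBarrier(n_threads)
    log = []

    def worker(tid):
        for r in range(rounds):
            log.append((r, tid))  # work belonging to round r
            barrier.wait(tid)     # nobody starts round r + 1 early

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Calling `run_demo()` returns a log in which every entry for one round precedes every entry for the next, which is the property a barrier is meant to guarantee. Each waiting thread polls only its own element of the second array, as the abstract describes. A lower-level implementation would likely need atomic operations in place of plain list writes.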