Patents by Inventor Ahmed Gheith
Ahmed Gheith has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20150161153
Abstract: A request to access a logical location in a file stored in a content addressable storage (CAS) system can be processed by retrieving first tree data from a first node in a first hash tree that represents a first version of the file. Based on the first tree data, a second node is selected from which a CAS signature is compared to a reserved CAS signature to determine the proper file version. In response to a match, a third node is accessed in a second hash tree that represents a second version of the file. Tree data is retrieved from the third node.
Type: Application
Filed: December 6, 2013
Publication date: June 11, 2015
Applicant: International Business Machines Corporation
Inventors: Ahmed Gheith, Eric Van Hensbergen, James Xenidis
-
Publication number: 20150161154
Abstract: A request to access a logical location in a file stored in a content addressable storage (CAS) system can be handled by retrieving first tree data from a first node in a hash tree that represents the file, the first tree data including a first hash tree depth, a first CAS signature, a block size and a file size. Based on the tree data, a second node is selected from a higher level in the hash tree. Second tree data from the second node of the hash tree that represents the file is retrieved, including a second CAS signature. The second CAS signature is determined to match a reserved CAS signature, and in response, an indication that the requested logical location is unallocated within the file is provided.
Type: Application
Filed: December 6, 2013
Publication date: June 11, 2015
Applicant: International Business Machines Corporation
Inventors: Ahmed Gheith, Eric Van Hensbergen, James Xenidis
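The lookup described in this abstract can be illustrated with a toy sketch. This is not the patented implementation: the in-memory `store`, the constants `BLOCK_SIZE`, `FANOUT` and `RESERVED`, and the functions `put` and `lookup` are all hypothetical names chosen for illustration, and SHA-256 is only an assumed signature function.

```python
import hashlib

BLOCK_SIZE = 4        # bytes per leaf block (tiny, for illustration only)
FANOUT = 2            # children per interior node (assumed)
RESERVED = "0" * 64   # hypothetical reserved signature meaning "unallocated"

store = {}            # toy content-addressable store: signature -> content


def put(content):
    """Store content under its SHA-256 signature and return that signature."""
    sig = hashlib.sha256(repr(content).encode()).hexdigest()
    store[sig] = content
    return sig


def lookup(root_sig, depth, offset):
    """Return the block covering `offset`, or None if that range is unallocated."""
    sig = root_sig
    span = BLOCK_SIZE * FANOUT ** depth   # bytes covered by the current node
    for _ in range(depth):
        if sig == RESERVED:               # a whole subtree is missing
            return None
        span //= FANOUT                   # bytes covered by each child
        sig = store[sig][offset // span]  # interior nodes hold child signatures
        offset %= span
    return None if sig == RESERVED else store[sig]
```

A sparse file is then a tree in which some child slots hold `RESERVED` instead of a real signature, so a lookup can report "unallocated" without fetching any block.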
-
Publication number: 20150127767
Abstract: A method, system, and computer program product for resolving cache lookup of large pages with variable granularity are provided in the illustrative embodiments. A number of unused bits in an available number of bits is identified. The available number of bits is configured to address a page of data in memory, wherein the page exceeds a threshold size and comprises a set of parts. The unused bits are mapped to the set of parts such that a value of the unused bits corresponds to existence of a subset of the set of parts in a memory. A virtual address is translated to a physical address of a requested part in the set of parts. A determination is made, using the unused bits, whether the requested part exists in the memory.
Type: Application
Filed: November 4, 2013
Publication date: May 7, 2015
Applicant: International Business Machines Corporation
Inventors: Ahmed Gheith, Eric Van Hensbergen, James Xenidis
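One way to picture the mapping of unused bits to page parts is as a presence bitmap. The sketch below assumes one bit per part and four parts per page; `PARTS_PER_PAGE`, `mark_present`, `is_present` and `part_index` are hypothetical names, and the abstract does not say the bits are used exactly this way.

```python
PARTS_PER_PAGE = 4   # hypothetical: a large page split into four parts


def mark_present(entry_bits, part):
    """Set the bit recording that `part` of the page is resident in memory."""
    return entry_bits | (1 << part)


def is_present(entry_bits, part):
    """Check the bit for `part` without touching the page itself."""
    return bool(entry_bits & (1 << part))


def part_index(vaddr, page_size, parts=PARTS_PER_PAGE):
    """Map a virtual address to the index of the part that contains it."""
    return (vaddr % page_size) // (page_size // parts)
```

With a 1 MiB page, an address 5 bytes into the fourth quarter maps to part 3, and a single bit test on the entry tells whether that part is resident.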
-
Publication number: 20150121029
Abstract: A memory buffer with a set of one or more structures is created by a process of a first software program. The first memory buffer comprises a predetermined amount of memory. It is determined that a structure of the set of one or more structures has been or will be consumed by a second software program that supports the first software program. The consumption of the structure of the set of one or more structures indicates that memory associated with the structure of the set of one or more structures is being reclaimed. In response to the determination that the structure of the set of one or more structures has been or will be consumed, data is written from a first location to a second location. The first location is in memory allocated to the first software program and the second location is indicated for data storage.
Type: Application
Filed: October 24, 2013
Publication date: April 30, 2015
Applicant: International Business Machines Corporation
Inventors: Andrew J. Declercq, Ahmed Gheith, Andrew R. Malota
-
Patent number: 9009716
Abstract: Creating a thread of execution in a computer processor, including copying, as indicated by a hardware processor opcode having been specified by a user-level process, data from a first set of registers to a second set of registers, wherein the first set of registers is associated with a parent hardware thread, wherein the second set of registers is associated with a child hardware thread, wherein the child hardware thread is in a wait state, and changing, as indicated by the hardware processor opcode, the child hardware thread from the wait state to an ephemeral run state.
Type: Grant
Filed: April 27, 2012
Date of Patent: April 14, 2015
Assignee: International Business Machines Corporation
Inventors: Patrick J. Bohrer, Ahmed Gheith, James L. Peterson
-
Publication number: 20150082324
Abstract: A mechanism is provided for handling interrupt actions for inter-thread communication. In association with a first processor thread, a thread action data structure is provided that comprises a non-blocking synchronization data structure and an internal list data structure of pending interrupts having no form of synchronization. A post of an interrupt action is received from a second processor thread to the thread action data structure associated with the first processor thread, where the interrupt action is added to the non-blocking synchronization data structure of the thread action data structure. The interrupt action is moved from the non-blocking synchronization data structure to the internal list data structure of pending interrupts for handling by the first processor thread. The internal list data structure of pending interrupts is processed to thereby handle interrupt actions moved to the internal list data structure.
Type: Application
Filed: September 18, 2013
Publication date: March 19, 2015
Applicant: International Business Machines Corporation
Inventors: Andrew J. Declercq, Ahmed Gheith, Aditya Kumar
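The two-structure arrangement in this abstract (a shared non-blocking inbox plus a private, unsynchronized pending list) might look roughly like the sketch below. It is an illustration under assumptions: `ThreadActions` and its methods are hypothetical names, and `collections.deque` merely stands in for the non-blocking structure because its `append` and `popleft` are thread-safe in CPython.

```python
from collections import deque


class ThreadActions:
    """Per-thread action record: a shared inbox plus a private pending list."""

    def __init__(self):
        self.inbox = deque()   # shared; stands in for the non-blocking structure
        self.pending = []      # touched only by the owning thread, so no locking

    def post(self, action):
        """Called by other threads to post an action to the owner."""
        self.inbox.append(action)

    def drain(self):
        """Owner moves posted actions from the inbox to its private list."""
        while True:
            try:
                self.pending.append(self.inbox.popleft())
            except IndexError:   # inbox is empty
                break

    def handle_pending(self):
        """Owner drains the inbox, then runs every pending action in order."""
        self.drain()
        results = [action() for action in self.pending]
        self.pending.clear()
        return results
```

The design point is that posting threads only ever touch the inbox, so the owner can walk its pending list without any synchronization.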
-
Patent number: 8972703
Abstract: A method and processor architecture for achieving a high level of concurrency and latency hiding in an “infinite-thread processor architecture” with a limited number of hardware threads is disclosed. A preferred embodiment defines “fork” and “join” instructions for spawning new context-switched threads. Context switching is used to hide the latency of both memory-access operations (i.e., loads and stores) and arithmetic/logical operations. When an operation executing in a thread incurs a latency having the potential to delay the instruction pipeline, the latency is hidden by performing a context switch to a different thread. When the result of the operation becomes available, a context switch back to that thread is performed to allow the thread to continue.
Type: Grant
Filed: July 12, 2011
Date of Patent: March 3, 2015
Assignee: International Business Machines Corporation
Inventors: Matteo Frigo, Ahmed Gheith, Volker Strumpen
-
Patent number: 8954707
Abstract: A mechanism is provided for automatic use of large pages. An operating system loader performs aggressive contiguous allocation followed by demand paging of small pages into a best-effort contiguous and naturally aligned physical address range sized for a large page. The operating system detects when the large page is fully populated and switches the mapping to use large pages. If the operating system runs low on memory, the operating system can free portions and degrade gracefully.
Type: Grant
Filed: August 3, 2012
Date of Patent: February 10, 2015
Assignee: International Business Machines Corporation
Inventors: Ahmed Gheith, Eric Van Hensbergen, James Xenidis
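The promotion and graceful-degradation behavior described here can be modeled as simple bookkeeping. The sketch below is only a model, assuming four small pages per large page; `SMALL_PER_LARGE`, `LargePageRegion`, `fault_in` and `reclaim` are hypothetical names, and no real page tables are involved.

```python
SMALL_PER_LARGE = 4   # hypothetical number of small pages per large page


class LargePageRegion:
    """Tracks which small pages of one large-page-sized range are populated."""

    def __init__(self):
        self.populated = set()
        self.uses_large_mapping = False

    def fault_in(self, index):
        """Demand-page one small page; promote once the whole range is present."""
        self.populated.add(index)
        if len(self.populated) == SMALL_PER_LARGE:
            self.uses_large_mapping = True

    def reclaim(self, index):
        """Free one small page under memory pressure; fall back to small mappings."""
        self.populated.discard(index)
        self.uses_large_mapping = False
```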
-
Patent number: 8893153
Abstract: A first set of one or more hardware threads for receiving messages sent from hardware threads is registered. After receiving indications of a message location value and a number, the message location value is incremented and sent to a different hardware thread of the first set of one or more hardware threads until the message location value has been incremented the number of times or a criterion for interrupting the incrementing and sending is satisfied. An actual number of times the message location value was incremented is indicated to a hardware thread that sent the indications of the message location value and the number.
Type: Grant
Filed: October 11, 2013
Date of Patent: November 18, 2014
Assignee: International Business Machines Corporation
Inventors: Patrick J. Bohrer, Ahmed Gheith, James L. Peterson
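The increment-and-send loop in this abstract can be sketched in software as follows. This is an illustration only: `distribute`, `receivers` and `should_stop` are hypothetical names, Python lists stand in for the registered hardware threads, and round-robin order is an assumed way of picking "a different hardware thread" each time.

```python
def distribute(receivers, location, count, should_stop=lambda loc: False):
    """Increment `location` up to `count` times, sending each new value to the
    next registered receiver in turn; return how many increments happened."""
    sent = 0
    while sent < count and not should_stop(location):
        location += 1
        receivers[sent % len(receivers)].append(location)
        sent += 1
    return sent   # reported back to the sender
```

The return value plays the role of the "actual number of times" in the abstract, which can be smaller than `count` when the interrupting criterion fires early.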
-
Patent number: 8838939
Abstract: Mechanisms are provided for debugging application code using a content addressable memory. The mechanisms receive an instruction in a hardware unit of a processor of the data processing system, the instruction having a target memory address that the instruction is attempting to access. A content addressable memory (CAM) associated with the hardware unit is searched for an entry in the CAM corresponding to the target memory address. In response to an entry in the CAM corresponding to the target memory address being found, a determination is made as to whether information in the entry identifies the instruction as an instruction of interest. In response to the entry identifying the instruction as an instruction of interest, an exception is generated and sent to one of an exception handler or a debugger application. In this way, debugging of multithreaded applications may be performed in an efficient manner.
Type: Grant
Filed: April 4, 2012
Date of Patent: September 16, 2014
Assignee: International Business Machines Corporation
Inventors: Elmootazbellah N. Elnozahy, Ahmed Gheith
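As a rough software analogy for the CAM check described above, a dictionary keyed by address can play the role of the CAM. Everything named here (`DebugTrap`, `check_access`, the set-of-operations entry format) is a hypothetical illustration, not the hardware mechanism the patent claims.

```python
class DebugTrap(Exception):
    """Raised when an access matches an entry that marks it as of interest."""


def check_access(cam, op, address):
    """Look up `address`; trap only if its entry lists `op` as of interest."""
    entry = cam.get(address)           # dict lookup stands in for the CAM search
    if entry is not None and op in entry:
        raise DebugTrap(f"{op} at {address:#x}")
```

In this model, an access to an address with no entry, or an operation the entry does not list, proceeds without any exception.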
-
Patent number: 8799625
Abstract: A method for fast remote communication and computation between processors is provided in the illustrative embodiments. A direct core to core communication unit (DCC) is configured to operate with a first processor, the first processor being a remote processor. A memory associated with the DCC receives a set of bytes, the set of bytes being sent from a second processor. An operation specified in the set of bytes is executed at the remote processor such that the operation is invoked without causing a software thread to execute.
Type: Grant
Filed: March 7, 2012
Date of Patent: August 5, 2014
Assignee: International Business Machines Corporation
Inventors: John Bruce Carter, Elmootazbellah Nabil Elnozahy, Ahmed Gheith, Eric Van Hansbergen, Karthick Rajamani, William Evan Speight, Lixin Zhang
-
Patent number: 8762126
Abstract: Analyzing simulated operation of a computer including loading user-defined dynamically linked analysis libraries that each include specifications of events to be traced for analysis, including: executing, in separate hardware threads, one trace buffer handler for each analysis library, and associating, with each trace buffer handler, one or more analysis functions; translating static binary instructions for the simulated computer into binary instructions for the executing computer, including: inserting, into the translation, implementing code for each specification of an event to be traced and inserting, into the translation for each static instruction, a memory address of a separate static instruction buffer; executing the translation, including executing the implementing code and generating, in a trace buffer, one or more trace records for each specified event; and processing the trace buffer, including calling analysis functions and associating by the analysis functions through the separate static instruct
Type: Grant
Filed: January 5, 2011
Date of Patent: June 24, 2014
Assignee: International Business Machines Corporation
Inventors: Patrick J. Bohrer, Ahmed Gheith, James L. Peterson
-
Publication number: 20140163947
Abstract: A simulation technique that handles accesses to a frame of instruction memory by inserting a command object between a frame proxy and a memory frame provides improved throughput in simulation environments. The instruction frame, if present, processes the access to the frame. If an instruction frame is not present for the accessed frame, the memory frame handles the request directly. The instruction frame caches fetched and decoded instructions and may be inserted at the first access to a corresponding instruction memory frame. The instruction frame can track write accesses to instruction memory so that changes to the instruction memory can be reflected in the state of the instruction frame. Additional check frames may be chained between the interface and the memory frame to handle breakpoints, instruction memory watches or other access checks on the instruction memory frame.
Type: Application
Filed: November 26, 2013
Publication date: June 12, 2014
Applicant: International Business Machines Corporation
Inventors: Tracy Bashore, Ahmed Gheith, Aditya Kumar, Ronald L. Rockhold
-
Publication number: 20140163946
Abstract: A simulation technique that handles accesses to a frame of instruction memory by inserting a command object between a frame proxy and a memory frame provides improved throughput in simulation environments. The instruction frame, if present, processes the access to the frame. If an instruction frame is not present for the accessed frame, the memory frame handles the request directly. The instruction frame caches fetched and decoded instructions and may be inserted at the first access to a corresponding instruction memory frame. The instruction frame can track write accesses to instruction memory so that changes to the instruction memory can be reflected in the state of the instruction frame. Additional check frames may be chained between the interface and the memory frame to handle breakpoints, instruction memory watches or other access checks on the instruction memory frame.
Type: Application
Filed: December 7, 2012
Publication date: June 12, 2014
Applicant: International Business Machines Corporation
Inventors: Tracy Bashore, Ahmed Gheith, Aditya Kumar, Ronald L. Rockhold
-
Publication number: 20140163945
Abstract: A simulation technique that handles accesses to a frame of memory via a proxy object provides improved throughput in simulation environments. The proxy object, if present, processes the access at a head of a linked list of frames. If a check frame is not inserted in the list, the memory frame handles the request directly, but if a check frame is inserted, then the check operation is performed. The check frame can be a synchronization frame that blocks access to a memory frame while the check frame is present, or the check frame may be a breakpoint, watch or exception frame that calls a suitable handling routine. Additional check frames may be chained between the interface and the memory subsystem to handle synchronization, breakpoints, memory watches or other accesses to or information gathering associated with the memory frame.
Type: Application
Filed: December 7, 2012
Publication date: June 12, 2014
Applicant: International Business Machines Corporation
Inventors: Tracy Bashore, Ahmed Gheith, Aditya Kumar, Andrew R. Malota, Ronald L. Rockhold
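The proxy-plus-chained-frames idea can be sketched with a few small classes. This is a loose illustration, not the simulator's design: `MemoryFrame`, `WatchFrame`, `FrameProxy` and `insert_check` are hypothetical names, and the watch frame here only logs accesses before forwarding them.

```python
class MemoryFrame:
    """Tail of the chain: the frame that actually holds the bytes."""

    def __init__(self, size):
        self.data = bytearray(size)

    def read(self, offset):
        return self.data[offset]

    def write(self, offset, value):
        self.data[offset] = value


class WatchFrame:
    """A check frame: records each access, then forwards it down the chain."""

    def __init__(self, next_frame, log):
        self.next_frame = next_frame
        self.log = log

    def read(self, offset):
        self.log.append(("read", offset))
        return self.next_frame.read(offset)

    def write(self, offset, value):
        self.log.append(("write", offset))
        self.next_frame.write(offset, value)


class FrameProxy:
    """Callers hold the proxy; it always forwards to the head of the chain."""

    def __init__(self, frame):
        self.head = frame

    def insert_check(self, make_frame):
        """Push a new check frame in front of the current head."""
        self.head = make_frame(self.head)

    def read(self, offset):
        return self.head.read(offset)

    def write(self, offset, value):
        self.head.write(offset, value)
```

When no check frame is inserted, the proxy forwards straight to the memory frame, so the common path pays for one indirection and nothing more.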
-
Patent number: 8688432
Abstract: A method, apparatus, and full-system simulator for speeding memory management unit simulation with direct address mapping on a host system, the host system supporting a full-system simulator, on which a guest system is simulated, the method comprising the following steps: setting a border in the logical space assigned for the full-system simulator by the host system, thereby dividing the logical space into a safe region and a simulator occupying region; shifting the full-system simulator itself from the occupied original host logical space to the simulator occupying region; and reserving the safe region for use with at least part of the guest system.
Type: Grant
Filed: October 28, 2008
Date of Patent: April 1, 2014
Assignee: International Business Machines Corporation
Inventors: Ahmed Gheith, Hua Yong Wang, Kun Wang, Yu Zhang
-
Publication number: 20140075159
Abstract: A method and processor architecture for achieving a high level of concurrency and latency hiding in an “infinite-thread processor architecture” with a limited number of hardware threads is disclosed. A preferred embodiment defines “fork” and “join” instructions for spawning new context-switched threads. Context switching is used to hide the latency of both memory-access operations (i.e., loads and stores) and arithmetic/logical operations. When an operation executing in a thread incurs a latency having the potential to delay the instruction pipeline, the latency is hidden by performing a context switch to a different thread. When the result of the operation becomes available, a context switch back to that thread is performed to allow the thread to continue.
Type: Application
Filed: July 12, 2011
Publication date: March 13, 2014
Applicant: International Business Machines Corporation
Inventors: Matteo Frigo, Ahmed Gheith, Volker Strumpen
-
Patent number: 8650554
Abstract: A mechanism is provided for improving single-thread performance for a multi-threaded, in-order processor core. In a first phase, a compiler analyzes application code to identify instructions that can be executed in parallel with focus on instruction-level parallelism and removing any register interference between the threads. The compiler inserts as appropriate synchronization instructions supported by the apparatus to ensure that the resulting execution of the threads is equivalent to the execution of the application code in a single thread. In a second phase, an operating system schedules the threads produced in the first phase on the hardware threads of a single processor core such that they execute simultaneously. In a third phase, the microprocessor core executes the threads specified by the second phase such that there is one hardware thread executing an application thread.
Type: Grant
Filed: April 27, 2010
Date of Patent: February 11, 2014
Assignee: International Business Machines Corporation
Inventors: Elmootazbellah N. Elnozahy, Ahmed Gheith
-
Publication number: 20140040577
Abstract: A mechanism is provided for automatic use of large pages. An operating system loader performs aggressive contiguous allocation followed by demand paging of small pages into a best-effort contiguous and naturally aligned physical address range sized for a large page. The operating system detects when the large page is fully populated and switches the mapping to use large pages. If the operating system runs low on memory, the operating system can free portions and degrade gracefully.
Type: Application
Filed: August 3, 2012
Publication date: February 6, 2014
Applicant: International Business Machines Corporation
Inventors: Ahmed Gheith, Eric Van Hensbergen, James Xenidis
-
Publication number: 20140040901
Abstract: A first set of one or more hardware threads for receiving messages sent from hardware threads is registered. After receiving indications of a message location value and a number, the message location value is incremented and sent to a different hardware thread of the first set of one or more hardware threads until the message location value has been incremented the number of times or a criterion for interrupting the incrementing and sending is satisfied. An actual number of times the message location value was incremented is indicated to a hardware thread that sent the indications of the message location value and the number.
Type: Application
Filed: October 11, 2013
Publication date: February 6, 2014
Applicant: International Business Machines Corporation
Inventors: Patrick J. Bohrer, Ahmed Gheith, James L. Peterson