Patents by Inventor Thomas Friedrich Wenisch
Thomas Friedrich Wenisch has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12073222
Abstract: A data processing apparatus includes obtain circuitry that obtains a stream of instructions. The stream of instructions includes a barrier creation instruction and a barrier inhibition instruction. Track circuitry orders sending each instruction in the stream of instructions to processing circuitry based on one or more dependencies. The track circuitry is responsive to the barrier creation instruction to cause the one or more dependencies to include one or more barrier dependencies in which pre-barrier instructions, occurring before the barrier creation instruction in the stream, are sent before post-barrier instructions, occurring after the barrier creation instruction in the stream, are sent. The track circuitry is also responsive to the barrier inhibition instruction to relax the barrier dependencies to permit post-inhibition instructions, occurring after the barrier inhibition instruction in the stream, to be sent before the pre-barrier instructions.
Type: Grant
Filed: November 26, 2019
Date of Patent: August 27, 2024
Assignees: Arm Limited, The Regents of the University of Michigan
Inventors: Vaibhav Gogte, Wei Wang, Stephan Diestelhorst, Peter M Chen, Satish Narayanasamy, Thomas Friedrich Wenisch
-
Publication number: 20220004390
Abstract: A data processing apparatus includes obtain circuitry that obtains a stream of instructions. The stream of instructions includes a barrier creation instruction and a barrier inhibition instruction. Track circuitry orders sending each instruction in the stream of instructions to processing circuitry based on one or more dependencies. The track circuitry is responsive to the barrier creation instruction to cause the one or more dependencies to include one or more barrier dependencies in which pre-barrier instructions, occurring before the barrier creation instruction in the stream, are sent before post-barrier instructions, occurring after the barrier creation instruction in the stream, are sent. The track circuitry is also responsive to the barrier inhibition instruction to relax the barrier dependencies to permit post-inhibition instructions, occurring after the barrier inhibition instruction in the stream, to be sent before the pre-barrier instructions.
Type: Application
Filed: November 26, 2019
Publication date: January 6, 2022
Inventors: Vaibhav GOGTE, Wei WANG, Stephan DIESTELHORST, Peter M CHEN, Satish NARAYANASAMY, Thomas Friedrich WENISCH
-
Patent number: 11055576
Abstract: System, methods, and other embodiments described herein relate to improving querying of a visual dataset of images through implementing system-aware cascades. In one embodiment, a method includes enumerating a set of cascade classifiers that are each separately comprised of transformation modules and machine learning modules arranged in multiple pairs. Classifiers of the set of cascade classifiers are configured to extract content from an image according to a query. The method includes selecting a query classifier from the set of cascade classifiers based, at least in part, on system costs that characterize computational resources consumed by the classifiers of the set of cascade classifiers. The computational resources include at least data handling costs. The method includes identifying content within the image using the query classifier.
Type: Grant
Filed: April 20, 2018
Date of Patent: July 6, 2021
Assignees: Toyota Research Institute, Inc., The Regents of the University of Michigan
Inventors: Michael Robert Anderson, Thomas Friedrich Wenisch, German Ros Sanchez
-
Patent number: 10956166
Abstract: A data processing apparatus includes obtain circuitry that obtains a stream of instructions. The stream of instructions includes a barrier creation instruction and a barrier inhibition instruction. Track circuitry orders sending each instruction in the stream of instructions to processing circuitry based on one or more dependencies. The track circuitry is responsive to the barrier creation instruction to cause the one or more dependencies to include one or more barrier dependencies in which pre-barrier instructions, occurring before the barrier creation instruction in the stream, are sent before post-barrier instructions, occurring after the barrier creation instruction in the stream, are sent. The track circuitry is also responsive to the barrier inhibition instruction to relax the barrier dependencies to permit post-inhibition instructions, occurring after the barrier inhibition instruction in the stream, to be sent before the pre-barrier instructions.
Type: Grant
Filed: March 8, 2019
Date of Patent: March 23, 2021
Assignees: Arm Limited, The Regents of the University of Michigan
Inventors: Vaibhav Gogte, Wei Wang, Stephan Diestelhorst, Peter M Chen, Satish Narayanasamy, Thomas Friedrich Wenisch
-
Publication number: 20200285479
Abstract: A data processing apparatus includes obtain circuitry that obtains a stream of instructions. The stream of instructions includes a barrier creation instruction and a barrier inhibition instruction. Track circuitry orders sending each instruction in the stream of instructions to processing circuitry based on one or more dependencies. The track circuitry is responsive to the barrier creation instruction to cause the one or more dependencies to include one or more barrier dependencies in which pre-barrier instructions, occurring before the barrier creation instruction in the stream, are sent before post-barrier instructions, occurring after the barrier creation instruction in the stream, are sent. The track circuitry is also responsive to the barrier inhibition instruction to relax the barrier dependencies to permit post-inhibition instructions, occurring after the barrier inhibition instruction in the stream, to be sent before the pre-barrier instructions.
Type: Application
Filed: March 8, 2019
Publication date: September 10, 2020
Inventors: Vaibhav GOGTE, Wei WANG, Stephan DIESTELHORST, Peter M. CHEN, Satish NARAYANASAMY, Thomas Friedrich WENISCH
-
Patent number: 10705849
Abstract: A data processing system includes a multithreaded processor to execute a plurality of selected program threads in parallel. A mode-selectable processor is coupled to the multithreaded processor and executes in either a first mode or a second mode. In the first mode program instructions from a single thread are executed. In the second mode, which is selected when the single program thread is inactive, program instructions forming a plurality of borrowed threads are executed. These borrowed threads are taken from a queue of candidate program threads which is managed by the multithreaded processor.
Type: Grant
Filed: February 5, 2018
Date of Patent: July 7, 2020
Assignee: The Regents of the University of Michigan
Inventors: Seyedamirhossein Mirhosseininiri, Thomas Friedrich Wenisch
-
Patent number: 10649780
Abstract: A data processing apparatus and method are provided for executing a stream of instructions out-of-order with respect to original program order. At least some of the instructions in the stream identify one or more architectural registers from a set of architectural registers. The apparatus comprises a plurality of out-of-order components configured to manage execution of a first subset of instructions out-of-order, the plurality of out-of-order components being configured to remove false dependencies between instructions in the first subset. The plurality of out-of-order components include a first issue queue into which the instructions in the first subset are buffered prior to execution. A second issue queue is used to buffer a second subset of instructions prior to execution, the second subset of instructions being constrained to execute in order.
Type: Grant
Filed: March 18, 2015
Date of Patent: May 12, 2020
Assignee: The Regents of the University of Michigan
Inventors: Faissal Mohamad Sleiman, Thomas Friedrich Wenisch
-
Patent number: 10496642
Abstract: A hardware accelerator 2 for performing queries into, for example, indexed text log files is formed of a plurality of hardware execution units (text engines) 4, each executing a partial query program upon the same full set of input data. These partial query programs may switch between different query algorithms on up to a per-character basis. The sequence of data when loaded into a buffer memory 16 for querying may be searched for delimiters as the data is loaded. The hardware execution units may support a number match program instruction which serves to identify a numeric variable, and to determine a value of that numeric variable located at a variable position within a sequence of characters being queried.
Type: Grant
Filed: September 23, 2015
Date of Patent: December 3, 2019
Assignee: The Regents of the University of Michigan
Inventors: Prateek Tandon, Thomas Friedrich Wenisch, Michael John Cafarella
-
Publication number: 20190243654
Abstract: A data processing system 2 includes a multithreaded processor 4 to execute a plurality of selected program threads in parallel. A mode-selectable processor 6 is coupled to the multithreaded processor 4 and executes in either a first mode or a second mode. In the first mode program instructions from a single thread are executed. In the second mode, which is selected when the single program thread is inactive, program instructions from a plurality of borrowed threads are executed. These borrowed threads are taken from a queue of candidate program threads which is managed by the multithreaded processor.
Type: Application
Filed: February 5, 2018
Publication date: August 8, 2019
Inventors: Seyedamirhossein MIRHOSSEININIRI, Thomas Friedrich WENISCH
-
Publication number: 20190130223
Abstract: System, methods, and other embodiments described herein relate to improving querying of a visual dataset of images through implementing system-aware cascades. In one embodiment, a method includes enumerating a set of cascade classifiers that are each separately comprised of transformation modules and machine learning modules arranged in multiple pairs. Classifiers of the set of cascade classifiers are configured to extract content from an image according to a query. The method includes selecting a query classifier from the set of cascade classifiers based, at least in part, on system costs that characterize computational resources consumed by the classifiers of the set of cascade classifiers. The computational resources include at least data handling costs. The method includes identifying content within the image using the query classifier.
Type: Application
Filed: April 20, 2018
Publication date: May 2, 2019
Inventors: Michael Robert Anderson, Thomas Friedrich Wenisch, German Ros Sanchez
-
Patent number: 10042776
Abstract: An apparatus for processing data includes signature generation circuitry 30, 32 for generating a signature value indicative of the current state of the apparatus in dependence upon a sequence of immediately preceding return addresses generated during execution of a stream of program instructions to reach that state of the apparatus. Prefetch circuitry 10 performs one or more prefetch operations in dependence upon the signature value that is generated. The signature value may be generated by a hashing operation (such as an XOR) performed upon return addresses stored within a return address stack 28.
Type: Grant
Filed: November 20, 2012
Date of Patent: August 7, 2018
Assignees: ARM Limited, The Regents of the University of Michigan
Inventors: Ali Saidi, Thomas Friedrich Wenisch, Aasheesh Kolli
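Illustrative sketch (not part of the patent record): the signature idea in the abstract above can be modelled in software by XOR-ing the return addresses currently held on a return address stack and using the result as a key into a table of prefetch candidates. The names `signature` and `PrefetchTable`, the depth of 4 entries, and the 16-bit signature width are assumptions made for this example only; the actual circuitry may differ.

```python
from collections import defaultdict


def signature(return_stack, depth=4, bits=16):
    """XOR the most recent `depth` return addresses into a `bits`-wide value.

    `depth` and `bits` are hypothetical parameters chosen for illustration.
    """
    sig = 0
    for addr in return_stack[-depth:]:
        sig ^= addr
    return sig & ((1 << bits) - 1)


class PrefetchTable:
    """Hypothetical table mapping a signature to addresses worth prefetching."""

    def __init__(self):
        self._table = defaultdict(list)

    def record(self, sig, miss_addr):
        # Remember that `miss_addr` was needed while the stack had signature `sig`.
        if miss_addr not in self._table[sig]:
            self._table[sig].append(miss_addr)

    def candidates(self, sig):
        # Addresses to prefetch the next time the same call context recurs.
        return list(self._table.get(sig, []))


# Usage: the same call context yields the same signature, and so the same candidates.
stack = [0x4000, 0x4A10, 0x5B20]
table = PrefetchTable()
table.record(signature(stack), 0x9000)
```

Because XOR is order-insensitive, two call chains with the same return addresses in a different order would share a signature in this sketch; a real design might fold in position information to avoid that.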
-
Patent number: 9946492
Abstract: A data processing system 2 including non-volatile memory 22 manages the ordering of writes to the non-volatile memory and persist barrier instructions using a persist buffer storing persist buffer data. A write controller responds to the persist buffer data to prevent writing to the non-volatile memory for instructions following a given persist barrier instruction within a sequence of program instructions before the writes to the non-volatile memory which precede that given persist barrier instruction have at least been acknowledged as received by the memory system containing the non-volatile memory. In the case of a multi-core system, cache snooping mechanisms are used to pass persistency dependence data between cores such that strong persist atomicity may be tracked and managed between the cores.
Type: Grant
Filed: October 30, 2015
Date of Patent: April 17, 2018
Assignees: ARM Limited, The Regents of the University of Michigan
Inventors: Stephan Diestelhorst, Aasheesh Kolli, Ali Ghassan Saidi, Peter Chen, Thomas Friedrich Wenisch
-
Patent number: 9691374
Abstract: A data processing system is provided for performing processing operations upon an ordered stream of input data values to form an ordered stream of output data values. A select circuit (18) includes select interval generation circuitry (34) which determines a number (interval number) of input data values between each data value to be selected for output from among the ordered stream of input data values. This interval number varies with position within the ordered stream of input data values. The select circuit (18) can thus perform selection of input data values in accordance with an interval number which may be varied, for example, in accordance with a linear piecewise approximation of a desired curve or, in other embodiments, in a piecewise quadratic variation approximating a desired curve. The processing techniques may be used, for example, in beam forming applications, such as 3D beam forming of ultrasonic images.
Type: Grant
Filed: November 28, 2012
Date of Patent: June 27, 2017
Assignee: The Regents of the University of Michigan
Inventors: Richard Sampson, Thomas Friedrich Wenisch
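Illustrative sketch (not part of the patent record): the position-dependent selection described in the abstract above can be approximated in software by skipping a varying number of input values between each selected value, with that number following a piecewise-linear function of stream position. The functions `interval_at` and `select`, and the `(start, base, slope)` segment encoding, are assumptions made for this example; it models only the selection step, not beam forming itself.

```python
def interval_at(position, segments):
    """Return how many input values to skip after selecting the value at `position`.

    `segments` is a list of (start, base, slope) tuples sorted by `start`
    (a hypothetical encoding of a piecewise-linear curve).
    """
    start, base, slope = segments[0]
    for seg in segments:
        if seg[0] <= position:
            start, base, slope = seg
        else:
            break
    # Truncate to a whole number of skipped values, never negative.
    return max(0, int(base + slope * (position - start)))


def select(samples, segments):
    """Pick values from `samples`, skipping `interval_at(pos)` values between picks."""
    out = []
    pos = 0
    while pos < len(samples):
        out.append(samples[pos])
        pos += interval_at(pos, segments) + 1
    return out


# Usage: select every value up to position 5, then widen the gap linearly.
picked = select(list(range(20)), [(0, 0, 0.0), (5, 1, 0.5)])
```

A piecewise-quadratic variant would only change the expression inside `interval_at`; the selection loop stays the same.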
-
Publication number: 20170123723
Abstract: A data processing system 2 including non-volatile memory 22 manages the ordering of writes to the non-volatile memory and persist barrier instructions using a persist buffer storing persist buffer data. A write controller responds to the persist buffer data to prevent writing to the non-volatile memory for instructions following a given persist barrier instruction within a sequence of program instructions before the writes to the non-volatile memory which precede that given persist barrier instruction have at least been acknowledged as received by the memory system containing the non-volatile memory. In the case of a multi-core system, cache snooping mechanisms are used to pass persistency dependence data between cores such that strong persist atomicity may be tracked and managed between the cores.
Type: Application
Filed: October 30, 2015
Publication date: May 4, 2017
Inventors: Stephan DIESTELHORST, Aasheesh KOLLI, Ali Ghassan SAIDI, Peter CHEN, Thomas Friedrich WENISCH
-
Publication number: 20170109172
Abstract: A data processing apparatus and method are provided for executing a stream of instructions out-of-order with respect to original program order. At least some of the instructions in the stream identify one or more architectural registers from a set of architectural registers. The apparatus comprises a plurality of out-of-order components configured to manage execution of a first subset of instructions out-of-order, the plurality of out-of-order components being configured to remove false dependencies between instructions in the first subset. The plurality of out-of-order components include a first issue queue into which the instructions in the first subset are buffered prior to execution. A second issue queue is used to buffer a second subset of instructions prior to execution, the second subset of instructions being constrained to execute in order.
Type: Application
Filed: March 18, 2015
Publication date: April 20, 2017
Inventors: Faissal Mohamad SLEIMAN, Thomas Friedrich WENISCH
-
Publication number: 20160098411
Abstract: A hardware accelerator 2 for performing queries into, for example, indexed text log files is formed of a plurality of hardware execution units (text engines) 4, each executing a partial query program upon the same full set of input data. These partial query programs may switch between different query algorithms on up to a per-character basis. The sequence of data when loaded into a buffer memory 16 for querying may be searched for delimiters as the data is loaded. The hardware execution units may support a number match program instruction which serves to identify a numeric variable, and to determine a value of that numeric variable located at a variable position within a sequence of characters being queried.
Type: Application
Filed: October 3, 2014
Publication date: April 7, 2016
Inventors: Prateek TANDON, Thomas Friedrich WENISCH, Michael John CAFARELLA
-
Publication number: 20160098450
Abstract: A hardware accelerator 2 for performing queries into, for example, indexed text log files is formed of a plurality of hardware execution units (text engines) 4, each executing a partial query program upon the same full set of input data. These partial query programs may switch between different query algorithms on up to a per-character basis. The sequence of data when loaded into a buffer memory 16 for querying may be searched for delimiters as the data is loaded. The hardware execution units may support a number match program instruction which serves to identify a numeric variable, and to determine a value of that numeric variable located at a variable position within a sequence of characters being queried.
Type: Application
Filed: September 23, 2015
Publication date: April 7, 2016
Inventors: Prateek TANDON, Thomas Friedrich WENISCH, Michael John CAFARELLA
-
Publication number: 20150302844
Abstract: A data processing system is provided for performing processing operations upon an ordered stream of input data values to form an ordered stream of output data values. A select circuit (18) includes select interval generation circuitry (34) which determines a number (interval number) of input data values between each data value to be selected for output from among the ordered stream of input data values. This interval number varies with position within the ordered stream of input data values. The select circuit (18) can thus perform selection of input data values in accordance with an interval number which may be varied, for example, in accordance with a linear piecewise approximation of a desired curve or, in other embodiments, in a piecewise quadratic variation approximating a desired curve. The processing techniques may be used, for example, in beam forming applications, such as 3D beam forming of ultrasonic images.
Type: Application
Filed: November 28, 2012
Publication date: October 22, 2015
Applicant: THE REGENTS OF THE UNIVERSITY OF MICHIGAN
Inventors: Richard SAMPSON, Thomas Friedrich WENISCH
-
Publication number: 20150277925
Abstract: A data processing apparatus and method are provided for executing a stream of instructions out-of-order with respect to original program order. At least some of the instructions in the stream identify one or more architectural registers from a set of architectural registers. The apparatus comprises a plurality of out-of-order components configured to manage execution of a first subset of instructions out-of-order, the plurality of out-of-order components being configured to remove false dependencies between instructions in the first subset. The plurality of out-of-order components include a first issue queue into which the instructions in the first subset are buffered prior to execution. A second issue queue is used to buffer a second subset of instructions prior to execution, the second subset of instructions being constrained to execute in order.
Type: Application
Filed: April 1, 2014
Publication date: October 1, 2015
Applicant: THE REGENTS OF THE UNIVERSITY OF MICHIGAN
Inventors: Faissal Mohamad SLEIMAN, Thomas Friedrich WENISCH
-
Patent number: 8825955
Abstract: A data processing apparatus has a cache with a data array and a tag array. The tag array stores address tag portions associated with the data values in the data array. The cache performs a tag lookup, comparing a tag portion of a received address with a set of tag entries in the tag array. The data array includes a partial tag store storing a partial tag value in association with each data entry. In parallel with the tag lookup, a partial tag value of the received address is compared with partial tag values stored in association with a set of data entries in said data array. A data value is read out if a match condition occurs. Exclusivity circuitry ensures that at most one partial tag value of said partial tag values stored in association with said set of data entries can generate said match condition.
Type: Grant
Filed: March 19, 2012
Date of Patent: September 2, 2014
Assignee: The Regents of the University of Michigan
Inventors: Faissal Mohamad Sleiman, Ronald George Dreslinski, Jr., Thomas Friedrich Wenisch
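Illustrative sketch (not part of the patent record): one way to picture the partial-tag scheme in the abstract above is a single cache set in which each entry carries a full tag and a few low-order partial tag bits. In this software model, insertion overwrites any entry that already holds the same partial tag, so a partial comparison can match at most one entry, and the full tag comparison then confirms or rejects that match. The class `PartialTagSet`, the 4-way size, the 4 partial bits, and the round-robin replacement are all assumptions made for this example; the patent's exclusivity circuitry may enforce the property differently.

```python
class PartialTagSet:
    """Hypothetical model of one cache set with partial-tag exclusivity."""

    def __init__(self, ways=4, partial_bits=4):
        self.ways = [None] * ways          # each entry is (full_tag, data) or None
        self.mask = (1 << partial_bits) - 1
        self._victim = 0                   # round-robin replacement pointer

    def _partial(self, tag):
        return tag & self.mask

    def insert(self, tag, data):
        # Exclusivity: reuse the way that already holds this partial tag, if any.
        for i, entry in enumerate(self.ways):
            if entry is not None and self._partial(entry[0]) == self._partial(tag):
                self.ways[i] = (tag, data)
                return
        # Otherwise fill an empty way, or evict round-robin.
        for i, entry in enumerate(self.ways):
            if entry is None:
                self.ways[i] = (tag, data)
                return
        self.ways[self._victim] = (tag, data)
        self._victim = (self._victim + 1) % len(self.ways)

    def lookup(self, tag):
        # The partial comparison selects at most one entry; the full tag confirms it.
        matches = [e for e in self.ways
                   if e is not None and self._partial(e[0]) == self._partial(tag)]
        assert len(matches) <= 1
        if matches and matches[0][0] == tag:
            return matches[0][1]
        return None


# Usage: 0x12 and 0x22 share partial tag 0x2, so the second insert displaces the first.
cache_set = PartialTagSet()
cache_set.insert(0x12, "a")
cache_set.insert(0x22, "b")
```

In hardware the partial and full comparisons would run in parallel; the sequential code above only models the outcome.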