Patents by Inventor Saeed Maleki
Saeed Maleki has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240403598
Abstract: Embodiments of the present disclosure include techniques for designing and generating a parallelization plan for a neural network so that workloads in the neural network may be split amongst multiple devices. Operators and tensors in the neural network are transformed into a set of functionally equivalent operators and tensors. These functionally equivalent operators and tensors are then scheduled to separate devices for execution.
Type: Application
Filed: June 1, 2023
Publication date: December 5, 2024
Inventors: Youshan MIAO, Fan YANG, Quanlu ZHANG, Saeed MALEKI, Xu CAO, Yi ZHU, Mao YANG, Lidong ZHOU, Zhiqi LIN
-
Publication number: 20240311153
Abstract: A method for scheduling a coordinated transfer of data among a plurality of processor nodes on a network comprises operating a multi-commodity flow model subject to a plurality of predetermined constraints. The model is configured to (a) receive as input a set of demands defining, for each of the plurality of processor nodes, an amount of data to be transferred to that processor node, (b) assign a plurality of paths linking the plurality of processor nodes, and (c) emit a schedule for transfer of the data along the plurality of paths so as to minimize a predetermined cost function, wherein the schedule comprises at least one store-and-forward operation and at least one copy operation.
Type: Application
Filed: June 8, 2023
Publication date: September 19, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Behnaz ARZANI, Siva Kesava Reddy KAKARLA, Miguel OOM TEMUDO DE CASTRO, Srikanth KANDULA, Saeed MALEKI, Luke Jonathon MARSHALL
-
Patent number: 11295231
Abstract: Systems, methods, and computer-readable media are disclosed for parallel stochastic gradient descent using linear and non-linear activation functions. One method includes: receiving a set of input examples; receiving a global model; and learning a new global model based on the global model and the set of input examples by iteratively performing the following steps: computing a plurality of local models having a plurality of model parameters based on the global model and at least a portion of the set of input examples; computing, for each local model, a corresponding model combiner based on the global model and at least a portion of the set of input examples; and combining the plurality of local models into the new global model based on the current global model and the plurality of corresponding model combiners.
Type: Grant
Filed: May 22, 2017
Date of Patent: April 5, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Saeed Maleki, Madanlal S. Musuvathi, Todd D. Mytkowicz
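The abstract above describes computing local models together with "model combiners" and then merging them into a new global model. The following is a minimal sketch of that idea for the simplest case only, assuming a linear model trained with squared loss, where each SGD step is affine in the weights. It is an illustration under those assumptions, not the method claimed in the patent, and every name in it (`sgd_pass`, `combiner`, `combine`, the sample shards) is hypothetical.

```python
# Sketch only: least-squares SGD, where each update is affine in the weights.
# All names and data here are hypothetical and are not taken from the patent.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sgd_pass(w, examples, lr):
    """Run sequential SGD on the loss 0.5 * (x.w - y)^2 and return the new weights."""
    w = list(w)
    for x, y in examples:
        err = dot(x, w) - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def combiner(examples, lr, dim):
    """Return M = product of (I - lr * x x^T), with later examples multiplied on the left."""
    m = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for x, _ in examples:
        # Left-multiply by (I - lr * x x^T): new_m = m - lr * x (x^T m).
        xtm = [sum(x[k] * m[k][j] for k in range(dim)) for j in range(dim)]
        m = [[m[i][j] - lr * x[i] * xtm[j] for j in range(dim)] for i in range(dim)]
    return m

def combine(global_w, local_prev, local_next, m_next):
    """Adjust local_next as if its shard had started from local_prev instead of global_w."""
    delta = [a - b for a, b in zip(local_prev, global_w)]
    return [ln + dot(row, delta) for ln, row in zip(local_next, m_next)]

# Two shards processed independently from the same global model.
g = [0.0, 0.0]
shard1 = [([1.0, 2.0], 3.0), ([0.5, -1.0], 1.0)]
shard2 = [([2.0, 0.0], -1.0), ([1.0, 1.0], 2.0)]
lr = 0.1

l1 = sgd_pass(g, shard1, lr)   # local model 1
l2 = sgd_pass(g, shard2, lr)   # local model 2
m2 = combiner(shard2, lr, 2)   # combiner for local model 2

combined = combine(g, l1, l2, m2)
sequential = sgd_pass(g, shard1 + shard2, lr)
```

In this linear setting `combined` agrees with `sequential` up to floating-point rounding, because starting the second shard from `g + delta` instead of `g` shifts its result by exactly `M * delta`. Non-linear activation functions, which the abstract also mentions, are outside the scope of this sketch.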
-
Patent number: 11177935
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media directed to optimizing the generation, evaluation, and selection of tensor circuit specifications for a tensor circuit to perform homomorphic encryption operations on encrypted data. A computing device having an improved compiler and runtime configuration can obtain a tensor circuit and associated schema. The computing device can map the obtained tensor circuit to an equivalent tensor circuit, adapted to perform fully homomorphic encryption (FHE) operations, and instantiated based on the obtained associated scheme. The computing device can then monitor a flow of data through the equivalent FHE-adapted tensor circuit utilizing various tensor circuit specifications determined therefor.
Type: Grant
Filed: October 31, 2018
Date of Patent: November 16, 2021
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Madanlal S. Musuvathi, Kim Laine, Kristin E. Lauter, Hao Chen, Olli Ilari Saarikivi, Saeed Maleki, Roshan Dathathri, Todd D. Mytkowicz
-
Patent number: 11062226
Abstract: Described herein is a system that transmits and combines local models, that individually comprise a set of local parameters computed via stochastic gradient descent (SGD), into a global model that comprises a set of global model parameters. The local models are computed in parallel at different geographic locations along with symbolic representations. The symbolic representations can be used to combine the local models. The global model can determine a likelihood, given a new data instance of a feature set, that a user performs a computer interaction with the content element. For instance, the system can use the model to provide search results in response to a search query submitted by a user. Or, the system can use the model to make a recommendation or suggestion to a user in response to a request for content (e.g., display a targeted advertisement, suggest a news story, etc.).
Type: Grant
Filed: June 15, 2017
Date of Patent: July 13, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Madanlal S. Musuvathi, Todd D. Mytkowicz, Saeed Maleki, Yufei Ding
-
Patent number: 10922627
Abstract: Described herein is a system that transmits and combines local models, that individually comprise a set of local parameters computed via stochastic gradient descent (SGD), into a global model that comprises a set of global model parameters. The local models are computed in parallel at different geographic locations along with symbolic representations. Network transmission of the local models and the symbolic representations, rather than transmission of the large training data subsets processed to compute the local models and symbolic representations, conserves resources and decreases latency. The global model can then be used as a model to determine a likelihood of a course of action being successful for an organization. For example, the course of action can be a purchase of a security or a business operation strategy. In another example, the course of action can be a type of medical treatment for a patient.
Type: Grant
Filed: June 15, 2017
Date of Patent: February 16, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Madanlal S. Musuvathi, Todd D. Mytkowicz, Saeed Maleki, Yufei Ding
-
Patent number: 10805317
Abstract: Described herein is a system that transmits and combines local models, that individually include a set of local parameters computed via stochastic gradient descent (SGD), into a global model that includes a set of global model parameters. The local models are computed in parallel at different geographic locations (e.g., different instances of computing infrastructure) along with symbolic representations. Network transmission of the local models and the symbolic representations, rather than transmission of the large training data subsets processed to compute the local models and symbolic representations, conserves resources and decreases latency. The global model can then be used as a model to determine a likelihood that at least a portion of current and/or recently received data traffic is illegitimate data traffic that is associated with a cyber attack. In some instances, the system can implement a remedial action to mitigate the effects of the cyber attack on computing infrastructure.
Type: Grant
Filed: June 15, 2017
Date of Patent: October 13, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventors: Madanlal S. Musuvathi, Todd D. Mytkowicz, Saeed Maleki, Yufei Ding
-
Publication number: 20200076570
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media directed to optimizing the generation, evaluation, and selection of tensor circuit specifications for a tensor circuit to perform homomorphic encryption operations on encrypted data. A computing device having an improved compiler and runtime configuration can obtain a tensor circuit and associated schema. The computing device can map the obtained tensor circuit to an equivalent tensor circuit, adapted to perform fully homomorphic encryption (FHE) operations, and instantiated based on the obtained associated scheme. The computing device can then monitor a flow of data through the equivalent FHE-adapted tensor circuit utilizing various tensor circuit specifications determined therefor.
Type: Application
Filed: October 31, 2018
Publication date: March 5, 2020
Inventors: Madanlal S. MUSUVATHI, Kim LAINE, Kristin E. LAUTER, Hao CHEN, Olli Ilari SAARIKIVI, Saeed MALEKI, Roshan DATHATHRI, Todd D. MYTKOWICZ
-
Patent number: 10503580
Abstract: Described herein is a system that transmits and combines local models, that individually comprise a set of local parameters computed via stochastic gradient descent (SGD), into a global model that comprises a set of global model parameters. The local models are computed in parallel at different geographic locations along with symbolic representations. Network transmission of the local models and the symbolic representations, rather than transmission of the large training data subsets processed to compute the local models and symbolic representations, conserves resources and decreases latency. The global model can then be used as a model to determine a likelihood of a monitored resource or a user of the monitored resource experiencing a problem with respect to performance or completion of one or more operations. The system can also implement an action to assist in resolving or avoiding the problem.
Type: Grant
Filed: June 15, 2017
Date of Patent: December 10, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Madanlal S. Musuvathi, Todd D. Mytkowicz, Saeed Maleki, Yufei Ding
-
Publication number: 20180365580
Abstract: Described herein is a system that transmits and combines local models, that individually comprise a set of local parameters computed via stochastic gradient descent (SGD), into a global model that comprises a set of global model parameters. The local models are computed in parallel at different geographic locations along with symbolic representations. The symbolic representations can be used to combine the local models. The global model can determine a likelihood, given a new data instance of a feature set, that a user performs a computer interaction with the content element. For instance, the system can use the model to provide search results in response to a search query submitted by a user. Or, the system can use the model to make a recommendation or suggestion to a user in response to a request for content (e.g., display a targeted advertisement, suggest a news story, etc.).
Type: Application
Filed: June 15, 2017
Publication date: December 20, 2018
Inventors: Madanlal S. MUSUVATHI, Todd D. MYTKOWICZ, Saeed MALEKI, Yufei DING
-
Publication number: 20180365582
Abstract: Described herein is a system that transmits and combines local models, that individually comprise a set of local parameters computed via stochastic gradient descent (SGD), into a global model that comprises a set of global model parameters. The local models are computed in parallel at different geographic locations along with symbolic representations. Network transmission of the local models and the symbolic representations, rather than transmission of the large training data subsets processed to compute the local models and symbolic representations, conserves resources and decreases latency. The global model can then be used as a model to determine a likelihood of a course of action being successful for an organization. For example, the course of action can be a purchase of a security or a business operation strategy. In another example, the course of action can be a type of medical treatment for a patient.
Type: Application
Filed: June 15, 2017
Publication date: December 20, 2018
Inventors: Madanlal S. MUSUVATHI, Todd D. MYTKOWICZ, Saeed MALEKI, Yufei DING
-
Publication number: 20180365093
Abstract: Described herein is a system that transmits and combines local models, that individually comprise a set of local parameters computed via stochastic gradient descent (SGD), into a global model that comprises a set of global model parameters. The local models are computed in parallel at different geographic locations along with symbolic representations. Network transmission of the local models and the symbolic representations, rather than transmission of the large training data subsets processed to compute the local models and symbolic representations, conserves resources and decreases latency. The global model can then be used as a model to determine a likelihood of a monitored resource or a user of the monitored resource experiencing a problem with respect to performance or completion of one or more operations. The system can also implement an action to assist in resolving or avoiding the problem.
Type: Application
Filed: June 15, 2017
Publication date: December 20, 2018
Inventors: Madanlal S. MUSUVATHI, Todd D. MYTKOWICZ, Saeed MALEKI, Yufei DING
-
Publication number: 20180367550
Abstract: Described herein is a system that transmits and combines local models, that individually comprise a set of local parameters computed via stochastic gradient descent (SGD), into a global model that comprises a set of global model parameters. The local models are computed in parallel at different geographic locations (e.g., different instances of computing infrastructure) along with symbolic representations. Network transmission of the local models and the symbolic representations, rather than transmission of the large training data subsets processed to compute the local models and symbolic representations, conserves resources and decreases latency. The global model can then be used as a model to determine a likelihood that at least a portion of current and/or recently received data traffic is illegitimate data traffic that is associated with a cyber attack. In some instances, the system can implement a remedial action to mitigate the effects of the cyber attack on computing infrastructure.
Type: Application
Filed: June 15, 2017
Publication date: December 20, 2018
Inventors: Madanlal S. MUSUVATHI, Todd D. MYTKOWICZ, Saeed MALEKI, Yufei DING
-
Publication number: 20180330271
Abstract: Systems, methods, and computer-readable media are disclosed for parallel stochastic gradient descent using linear and non-linear activation functions. One method includes: receiving a set of input examples; receiving a global model; and learning a new global model based on the global model and the set of input examples by iteratively performing the following steps: computing a plurality of local models having a plurality of model parameters based on the global model and at least a portion of the set of input examples; computing, for each local model, a corresponding model combiner based on the global model and at least a portion of the set of input examples; and combining the plurality of local models into the new global model based on the current global model and the plurality of corresponding model combiners.
Type: Application
Filed: May 22, 2017
Publication date: November 15, 2018
Applicant: Microsoft Technology Licensing, LLC
Inventors: Saeed MALEKI, Madanlal S. MUSUVATHI, Todd D. MYTKOWICZ
-
Patent number: 9195436
Abstract: The techniques and/or systems described herein implement parallel processing of a dynamic programming problem across stages and/or clusters by breaking dependencies between stages and/or clusters. For instance, the techniques and/or systems may identify dependencies between sub-problems of the dynamic programming problem and group the sub-problems into stages. The techniques and/or systems may also group the stages into clusters (e.g., at least two clusters to be parallel processed). Then, the techniques and/or systems generate one or more solutions to use instead of actual solutions so that the dynamic programming problem can be parallel processed across stages and/or clusters.
Type: Grant
Filed: April 21, 2014
Date of Patent: November 24, 2015
Assignee: Microsoft Technology Licensing, LLC
Inventors: Todd D. Mytkowicz, Madanlal Musuvathi, Saeed Maleki
-
Publication number: 20150106783
Abstract: The techniques and/or systems described herein implement parallel processing of a dynamic programming problem across stages and/or clusters by breaking dependencies between stages and/or clusters. For instance, the techniques and/or systems may identify dependencies between sub-problems of the dynamic programming problem and group the sub-problems into stages. The techniques and/or systems may also group the stages into clusters (e.g., at least two clusters to be parallel processed). Then, the techniques and/or systems generate one or more solutions to use instead of actual solutions so that the dynamic programming problem can be parallel processed across stages and/or clusters.
Type: Application
Filed: April 21, 2014
Publication date: April 16, 2015
Applicant: Microsoft Corporation
Inventors: Todd D. Mytkowicz, Madanlal Musuvathi, Saeed Maleki