Patents by Inventor Shengquan Yan

Shengquan Yan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

STREAM PROCESSING IN SEARCH DATA PIPELINES

Publication number: 20200293536

Abstract: Architecture that decomposes one or more monolithic data concepts into atomic concepts and related atomic concept dependencies, and provides streaming data processing that processes individual or separate (atomic) data concepts and defined atomic dependencies. The architecture can comprise data-driven data processing that enables the plug-in of new data concepts with minimal effort. Efficient processing of the data concepts is enabled by streaming only required data concepts and corresponding dependencies and enablement of the seamless configuration of data processing between stream processing systems and batch processing systems as a result of data concept decomposition. Incremental and non-incremental metric processing enables realtime access and monitoring of operational parameters and queries.

Type: Application

Filed: March 12, 2020

Publication date: September 17, 2020

Applicant: Microsoft Technology Licensing, LLC

Inventors: Wei Lu, Michael Kinoti, Shengquan Yan, Peng Yu, Xian Zhang, Guixi Zou, Yin He, Xavier Drudis Rius, Miriam Rosenberg, Zijian Zheng
Stream processing in search data pipelines

Patent number: 10628423

Abstract: Architecture that decomposes one or more monolithic data concepts into atomic concepts and related atomic concept dependencies, and provides streaming data processing that processes individual or separate (atomic) data concepts and defined atomic dependencies. The architecture can comprise data-driven data processing that enables the plug-in of new data concepts with minimal effort. Efficient processing of the data concepts is enabled by streaming only required data concepts and corresponding dependencies and enablement of the seamless configuration of data processing between stream processing systems and batch processing systems as a result of data concept decomposition. Incremental and non-incremental metric processing enables realtime access and monitoring of operational parameters and queries.

Type: Grant

Filed: February 2, 2015

Date of Patent: April 21, 2020

Assignee: Microsoft Technology Licensing, LLC

Inventors: Wei Lu, Michael Kinoti, Shengquan Yan, Peng Yu, Xian Zhang, Guixi Zou, Yin He, Xavier Drudis Rius, Miriam Rosenberg, Zijian Zheng
STREAM PROCESSING IN SEARCH DATA PIPELINES

Publication number: 20160224632

Abstract: Architecture that decomposes one or more monolithic data concepts into atomic concepts and related atomic concept dependencies, and provides streaming data processing that processes individual or separate (atomic) data concepts and defined atomic dependencies. The architecture can comprise data-driven data processing that enables the plug-in of new data concepts with minimal effort. Efficient processing of the data concepts is enabled by streaming only required data concepts and corresponding dependencies and enablement of the seamless configuration of data processing between stream processing systems and batch processing systems as a result of data concept decomposition. Incremental and non-incremental metric processing enables realtime access and monitoring of operational parameters and queries.

Type: Application

Filed: February 2, 2015

Publication date: August 4, 2016

Applicant: Microsoft Corporation

Inventors: Wei Lu, Michael Kinoti, Shengquan Yan, Peng Yu, Xian Zhang, Guixi Zou, Yin He, Xavier Drudis Rius, Miriam Rosenberg, Zijian Zheng
Monad based cloud computing

Patent number: 8806451

Abstract: Systems and methods are provided for using monads to facilitate complex computation tasks in a cloud computing environment. In particular, monads can be employed to facilitate creation and execution of data mining jobs for large data sets. Monads can allow for improved error handling for complex computation tasks. Monads can also facilitate identification of opportunities for improving the efficiency of complex computations.

Type: Grant

Filed: June 16, 2011

Date of Patent: August 12, 2014

Assignee: Microsoft Corporation

Inventors: Zijian Zheng, Shengquan Yan, Peng Yu
PROCESSING USER LOG SESSIONS IN A DISTRIBUTED SYSTEM

Publication number: 20140172813

Abstract: Systems, methods, and computer media for efficiently processing user log data are provided. The log data is progressively processed in variable sized windows based on a specified time period. The log data may be anonymized to protect user privacy. A log server processes the windowed log data in phases. The first phase includes fast data like page view log data. Subsequent phases include slow data like session data which may build on the page view data processed in the first phase. The log server identifies metrics based on the log data processed at each phase. Based on the identified metrics, the log server may identify interests across a community of users or for specific users.

Type: Application

Filed: December 14, 2012

Publication date: June 19, 2014

Applicant: MICROSOFT CORPORATION

Inventors: Shengquan Yan, Bai Xiao, Yunqiao Zhang, Peng Yu, Yin He, Kevin Philip White, Brian Jude Frasca, Zijian Zheng, Ravi Chandru Shahani
Machine-based learning for automatically categorizing data on per-user basis

Patent number: 8682819

Abstract: Architecture that employs machine-based learning to automatically categorize data on a per-user basis. Auto-tagging reduces the burden on infoworkers by creating a machine learning model to learn from user tagging behavior or preferences. Once this information is obtained, a trained model for this specific user is used to assign tags to incoming data, such as emails. The architecture finds particular applicability to compliance and message retention policies that otherwise would mandate extra work for the infoworker. The architecture learns the tagging behavior of a user and uses this learned behavior to automatically tag data based on the user's prior tagging habits. A regression algorithm is employed to process the training data according to an n-dimensional framework for prediction and application of the tag(s) to the incoming messages.

Type: Grant

Filed: June 19, 2008

Date of Patent: March 25, 2014

Assignee: Microsoft Corporation

Inventors: Ashish Consul, Harvey Rook, Rajasi Saha, Shengquan Yan
MONAD BASED CLOUD COMPUTING

Publication number: 20120324455

Abstract: Systems and methods are provided for using monads to facilitate complex computation tasks in a cloud computing environment. In particular, monads can be employed to facilitate creation and execution of data mining jobs for large data sets. Monads can allow for improved error handling for complex computation tasks. Monads can also facilitate identification of opportunities for improving the efficiency of complex computations.

Type: Application

Filed: June 16, 2011

Publication date: December 20, 2012

Applicant: MICROSOFT CORPORATION

Inventors: ZIJIAN ZHENG, SHENGQUAN YAN, PENG YU
OPTIMIZATION OF NON-DETERMINISTIC COMPUTATIONAL PATHS

Publication number: 20120284315

Abstract: Methods, computer systems and computer readable media for optimizing non-deterministic computational paths are provided. In embodiments, requests are received to generate reports derived from a plurality of series of data files whose metadata attributes form certain mathematical structures that can be used to choose the optimal path in the non-deterministic dependency model. Storage for each of the series of data files is optimized. Available data files needed for the report are processed and missing data files are identified. Based on the mathematical structure of the plurality of series of data files, an optimal transition with the missing data files available is determined. An entry into the transition is triggered and the missing data files are processed. The report is generated and the optimized storage is retained for future requests.

Type: Application

Filed: May 4, 2011

Publication date: November 8, 2012

Applicant: MICROSOFT CORPORATION

Inventors: ZHENGHAO WANG, SHENGQUAN YAN, AN YAN, JEFFREY ERIC LARSSON, ZIJIAN ZHENG
USER ANALYSIS THROUGH USER LOG FEATURE EXTRACTION

Publication number: 20120278354

Abstract: Systems, methods, and computer media for efficiently processing user log data are provided. A received user log data analysis request specifies: target user log features that identify users in a target user group, analysis user log features that identify data associated with the users in the target user group, and an analysis to perform on the identified data associated with the users in the target user group. Occurrences of specified features are extracted from user logs and stored. Users associated with an occurrence of each of the extracted and stored target user log features are identified as users in the target user group. Occurrences of the analysis user log features that are associated with a user in the target user group are extracted and reformatted for the analysis specified in the analysis request.

Type: Application

Filed: April 29, 2011

Publication date: November 1, 2012

Applicant: MICROSOFT CORPORATION

Inventors: Shengquan Yan, Zhenghao Wang, Xiao Huang, Yu Chen, An Yan, Jeffrey Eric Larsson, Michael Kiogora Kinoti, Peng Yu, Zijian Zheng
Boosting to determine indicative features from a training set

Patent number: 8200601

Abstract: Determining indicative features may be provided. First, a first set of features may be determined using a document frequency process. Then a second set of features may be determined using a boosting process. Using the boosting process may comprise using an approximation for a one-dimensional optimization. The approximation may include an upper bound. Next, the first set of features and the second set of features may be combined into a combined set of features. The combined set of features may comprise a union of the first set of features and the second set of features. At least one document may then be classified based on the combined set of features.

Type: Grant

Filed: May 26, 2009

Date of Patent: June 12, 2012

Assignee: Microsoft Corporation

Inventors: John Platt, Harvey Rook, Shengquan Yan, Rajasi Saha
Boosting to Determine Indicative Features from a Training Set

Publication number: 20100306147

Abstract: Determining indicative features may be provided. First, a first set of features may be determined using a document frequency process. Then a second set of features may be determined using a boosting process. Using the boosting process may comprise using an approximation for a one-dimensional optimization. The approximation may include an upper bound. Next, the first set of features and the second set of features may be combined into a combined set of features. The combined set of features may comprise a union of the first set of features and the second set of features. At least one document may then be classified based on the combined set of features.

Type: Application

Filed: May 26, 2009

Publication date: December 2, 2010

Applicant: Microsoft Corporation

Inventors: John Platt, Harvey Rook, Shengquan Yan, Rajasi Saha
Searching An Email System Dumpster

Publication number: 20100146056

Abstract: A method is presented for searching for email messages on a server computer. A request is received on the server computer to search for one or more email messages in one or more mailboxes on the server computer. Each of the one or more mailboxes includes a dumpster folder. The request includes search criteria including a parameter indicating whether the dumpster folder associated with a mailbox should be searched. The dumpster folder stores one or more email messages that have been deleted from a deleted items folder in the mailbox. One or more mailboxes that satisfy the search criteria in the request are identified. If the parameter indicates that the dumpster folder should be searched, the dumpster folder of each of the identified mailboxes that satisfy the search criteria is queried and any email messages in each dumpster folder that satisfy the search criteria are identified.

Type: Application

Filed: December 4, 2008

Publication date: June 10, 2010

Applicant: MICROSOFT CORPORATION

Inventors: Ashish Consul, Suryanarayana M. Gorti, Shengquan Yan, Andrei Marinescu, Julian Alexander Zbogar-Smith
MACHINE-BASED LEARNING FOR AUTOMATICALLY CATEGORIZING DATA ON PER-USER BASIS

Publication number: 20090319456

Abstract: Architecture that employs machine-based learning to automatically categorize data on a per-user basis. Auto-tagging reduces the burden on infoworkers by creating a machine learning model to learn from user tagging behavior or preferences. Once this information is obtained, a trained model for this specific user is used to assign tags to incoming data, such as emails. The architecture finds particular applicability to compliance and message retention policies that otherwise would mandate extra work for the infoworker. The architecture learns the tagging behavior of a user and uses this learned behavior to automatically tag data based on the user's prior tagging habits. A regression algorithm is employed to process the training data according to an n-dimensional framework for prediction and application of the tag(s) to the incoming messages.

Type: Application

Filed: June 19, 2008

Publication date: December 24, 2009

Applicant: MICROSOFT CORPORATION

Inventors: Ashish Consul, Harvey Rook, Rajasi Saha, Shengquan Yan