Patents by Inventor Alexander ALPEROVICH
Alexander ALPEROVICH has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11625558
Abstract: Data events of an event stream are processed in accordance with temporally valid machine learning models. A streaming node may receive data events via an event stream. Each data event may be associated with a timestamp. The streaming node may also utilize punctuation events that specify the temporal validity of available machine learning models. The streaming node performs a temporal join operation for each data event based on its timestamp and the temporal validity. If the data event's timestamp is less than or equal to the punctuation event's timestamp, the data event is provided to the temporally valid machine learning model for processing thereby. If the data event's timestamp is greater than the punctuation event's timestamp, the data event is held until a subsequent punctuation event specifying a later timestamp is received.
Type: Grant
Filed: December 13, 2019
Date of Patent: April 11, 2023
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Alexander Alperovich, Kanstantsyn Zoryn, Krishna G. Mamidipaka
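The hold-or-process rule in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `TemporalJoinNode`, its methods, and the use of a single callable as a stand-in for a machine learning model are all hypothetical, chosen only to show how a data event is compared against the latest punctuation timestamp.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class TemporalJoinNode:
    """Hypothetical streaming node: holds events until a punctuation covers them."""
    model: Callable[[float], float]       # stand-in for a temporally valid model
    valid_until: float = float("-inf")    # timestamp of the latest punctuation event
    held: List[Tuple[float, float]] = field(default_factory=list)

    def on_data(self, timestamp: float, payload: float) -> List[Tuple[float, float]]:
        # Timestamp <= punctuation timestamp: the model is valid, process now.
        if timestamp <= self.valid_until:
            return [(timestamp, self.model(payload))]
        # Otherwise hold the event until a later punctuation arrives.
        self.held.append((timestamp, payload))
        return []

    def on_punctuation(self, timestamp: float) -> List[Tuple[float, float]]:
        # A punctuation extends validity; release every held event it now covers.
        self.valid_until = max(self.valid_until, timestamp)
        ready = [e for e in self.held if e[0] <= self.valid_until]
        self.held = [e for e in self.held if e[0] > self.valid_until]
        return [(ts, self.model(p)) for ts, p in ready]
```

With `node = TemporalJoinNode(model=lambda x: x * 2)`, an event at timestamp 9 stays held after a punctuation at 8 and is released only by a punctuation at 9 or later.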
-
Patent number: 11496153
Abstract: Described herein is a system and method for coded streaming data to facilitate recovery from failed or slow processor(s). A batch of processing stream data can be partitioned into a plurality of data chunks. Parity chunk(s) are determined for the plurality of data chunks. The plurality of data chunks and the parity chunk(s) can be provided to processors for processing. Processed data of at least some (e.g., one or more) of the plurality of data chunks, and, processed data of parity chunk(s) are received. When it is determined that processed data for a pre-defined quantity of data chunks has not been received by a pre-defined period of time, the processed data for particular data chunk(s) of particular processor(s) from which processed data has not been received are determined based, at least in part, upon the received processed parity chunk(s) and the received processed data chunk(s).
Type: Grant
Filed: April 5, 2021
Date of Patent: November 8, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Todd Robert Porter, Xin Tian, Alexander Alperovich
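A small sketch can show how a processed parity chunk might stand in for a missing result. It rests on assumptions the abstract does not state: a single additive parity chunk, equal-length chunks, and a linear processing step (here a plain sum), so that processing the parity chunk equals the sum of the processed data chunks. The function names are hypothetical.

```python
from typing import List, Optional


def make_parity(chunks: List[List[int]]) -> List[int]:
    """Element-wise sum of equal-length data chunks (one assumed parity scheme)."""
    return [sum(column) for column in zip(*chunks)]


def process(chunk: List[int]) -> int:
    """Hypothetical linear processing step: aggregate a chunk by summing it."""
    return sum(chunk)


def recover_missing(results: List[Optional[int]], parity_result: int) -> List[int]:
    """Fill in the single missing processed result from the processed parity chunk."""
    missing = [i for i, r in enumerate(results) if r is None]
    if len(missing) != 1:
        raise ValueError("one additive parity chunk can recover exactly one result")
    received_total = sum(r for r in results if r is not None)
    recovered = list(results)
    # Linearity: process(parity) == sum of process(chunk) over all data chunks.
    recovered[missing[0]] = parity_result - received_total
    return recovered
```

Recovering more than one missing result would need additional parity chunks and a stronger code, which this sketch does not attempt.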
-
Patent number: 11226966
Abstract: Described herein is a system and method of journaling of a streaming anchor resource. An input node can store a value of a property associated with the streaming data in a persistent indexed data structure. The input node can generate an anchor that describes a particular point in time in a data stream. The anchor can include an index into the persistent indexed data structure of the stored value of the property associated with the streaming data. The generated anchor and streaming data can be provided to the downstream node. During recovery of a downstream node, the input node can utilize a received anchor to retrieve a value of a property associated with the streaming data from the persistent indexed data structure, and, provide a batch of data based upon the received anchor and the retrieved property value.
Type: Grant
Filed: October 2, 2019
Date of Patent: January 18, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Alexander Alperovich, Boris Shulman, Ke Liu
-
Publication number: 20210306001
Abstract: Described herein is a system and method for coded streaming data to facilitate recovery from failed or slow processor(s). A batch of processing stream data can be partitioned into a plurality of data chunks. Parity chunk(s) are determined for the plurality of data chunks. The plurality of data chunks and the parity chunk(s) can be provided to processors for processing. Processed data of at least some (e.g., one or more) of the plurality of data chunks, and, processed data of parity chunk(s) are received. When it is determined that processed data for a pre-defined quantity of data chunks has not been received by a pre-defined period of time, the processed data for particular data chunk(s) of particular processor(s) from which processed data has not been received are determined based, at least in part, upon the received processed parity chunk(s) and the received processed data chunk(s).
Type: Application
Filed: April 5, 2021
Publication date: September 30, 2021
Applicant: Microsoft Technology Licensing, LLC
Inventors: Todd Robert PORTER, Xin TIAN, Alexander ALPEROVICH
-
Patent number: 11113197
Abstract: A method for joining an event stream with reference data includes loading a plurality of reference data snapshots from a reference data source into a cache. Punctuation events are supplied that indicate temporal validity for the plurality of reference data snapshots in the cache. A logical barrier is provided that restricts a flow of data events in the event stream to a cache lookup operation based on the punctuation events. The cache lookup operation is performed with respect to the data events in the event stream that are permitted to cross the logical barrier.
Type: Grant
Filed: April 8, 2019
Date of Patent: September 7, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Boris Shulman, Shoupei Li, Alexander Alperovich, Xindi Zhang, Kanstantsyn Zoryn
-
Patent number: 11095522
Abstract: Described herein is a system and method for dynamically scaling a stream processing system (e.g., “exactly once” data stream processing system). Various parameter(s) (e.g., user-configurable capacity, real-time load metrics, and/or performance counters) can be used to dynamically scale in and/or scale out the “exactly once” stream processing system without system restart. Delay introduced by this scaling operation can be minimized by utilizing a combination of mutable process topology (which can dynamically assign certain parts of the system to a new host machine) and controllable streaming processor movement with checkpoints and the streaming protocol controlled recovery which still enforces the “exactly once” delivery metric.
Type: Grant
Filed: August 21, 2019
Date of Patent: August 17, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Xindi Zhang, Boris Shulman, Alexander Alperovich, Patrick Chung
-
Patent number: 11044291
Abstract: Described herein is a system and method for startup and/or recovery for stream processing. During a startup phase: start anchor request(s), each identifying a particular time, are accumulated until request(s) are pending from downstream nodes. A minimum time of the accumulated start anchor request(s) is determined. If the processing system is an input node, an anchor associated with the determined minimum time is generated. Otherwise, a start anchor request is provided to an upstream node identifying the determined minimum time. Once the anchor associated with the determined minimum time is received (or generated), the anchor is provided in response to a polled start anchor request for the determined minimum time from a downstream node. Asynchronous requests for batches of data bounded by two specific anchors are performed in accordance with information stored in an ordered collection of anchors during a recovery phase.
Type: Grant
Filed: September 28, 2018
Date of Patent: June 22, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Alexander Alperovich, Boris Shulman, Zhong Chen, Lev Novik, Kanstantsyn Zoryn
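The startup phase described here can be sketched roughly as follows. The sketch assumes that "pending from downstream nodes" means a request has arrived from every downstream node; the class `StartupNode`, its method name, and the dictionary it returns are hypothetical and only illustrate the minimum-time decision.

```python
from typing import Dict, Optional, Set


class StartupNode:
    """Hypothetical node that accumulates start anchor requests during startup."""

    def __init__(self, downstream: Set[str], is_input: bool) -> None:
        self.downstream = set(downstream)
        self.is_input = is_input
        self.pending: Dict[str, float] = {}   # downstream node id -> requested time

    def on_start_anchor_request(self, node_id: str, time: float) -> Optional[dict]:
        self.pending[node_id] = time
        # Assumption: wait until every downstream node has a request pending.
        if set(self.pending) != self.downstream:
            return None
        min_time = min(self.pending.values())
        if self.is_input:
            # An input node generates the anchor for the minimum time itself.
            return {"action": "generate_anchor", "time": min_time}
        # Any other node forwards a start anchor request to its upstream node.
        return {"action": "request_upstream", "time": min_time}
```

Taking the minimum means the earliest time requested by any downstream node determines where the anchor is placed.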
-
Publication number: 20210182619
Abstract: Data events of an event stream are processed in accordance with temporally valid machine learning models. A streaming node may receive data events via an event stream. Each data event may be associated with a timestamp. The streaming node may also utilize punctuation events that specify the temporal validity of available machine learning models. The streaming node performs a temporal join operation for each data event based on its timestamp and the temporal validity. If the data event's timestamp is less than or equal to the punctuation event's timestamp, the data event is provided to the temporally valid machine learning model for processing thereby. If the data event's timestamp is greater than the punctuation event's timestamp, the data event is held until a subsequent punctuation event specifying a later timestamp is received.
Type: Application
Filed: December 13, 2019
Publication date: June 17, 2021
Inventors: Alexander Alperovich, Kanstantsyn Zoryn, Krishna G. Mamidipaka
-
Patent number: 11010171
Abstract: Methods, systems, apparatuses, and computer program products are provided for processing a stream of data. A maximum temporal divergence is established for data flushed to a data store from a plurality of upstream partitions. Each of a plurality of data flushers, each corresponding to an upstream partition, may obtain an item of data from a data producer. Each data flusher may determine whether flushing the data to the data store would exceed the maximum temporal divergence. Based at least on determining that flushing the data to the data store would not exceed the maximum temporal divergence, the data may be flushed to the data store for ingestion by a downstream partition and a data structure (e.g., a ledger) may be updated to indicate a time associated with the most recent item of data flushed to the data store.
Type: Grant
Filed: May 30, 2019
Date of Patent: May 18, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Alexander Alperovich, Zhong Chen, Boris Shulman
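One way to picture the divergence check is the sketch below. The abstract does not define how temporal divergence is measured, so the sketch assumes it is the gap between the item's time and the least recent flush time among the other partitions. The `Ledger` class, the `try_flush` method, and the list used as a data store are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


class Ledger:
    """Hypothetical ledger of the most recent flushed time per upstream partition."""

    def __init__(self, partitions: Iterable[str], max_divergence: float) -> None:
        self.last_flushed: Dict[str, float] = {p: 0.0 for p in partitions}
        self.max_divergence = max_divergence

    def try_flush(self, partition: str, item_time: float,
                  store: List[Tuple[str, float]]) -> bool:
        # Assumed divergence measure: distance ahead of the slowest other partition.
        others = [t for p, t in self.last_flushed.items() if p != partition]
        slowest = min(others) if others else item_time
        if item_time - slowest > self.max_divergence:
            return False                      # flushing would exceed the bound
        store.append((partition, item_time))  # flush for downstream ingestion
        self.last_flushed[partition] = item_time
        return True
```

A flusher whose item is rejected would retry after the slower partitions have advanced, which keeps the partitions within the configured bound of each other.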
-
Patent number: 10998919
Abstract: Described herein is a system and method for coded streaming data to facilitate recovery from failed or slow processor(s). A batch of processing stream data can be partitioned into a plurality of data chunks. Parity chunk(s) are determined for the plurality of data chunks. The plurality of data chunks and the parity chunk(s) can be provided to processors for processing. Processed data of at least some (e.g., one or more) of the plurality of data chunks, and, processed data of parity chunk(s) are received. When it is determined that processed data for a pre-defined quantity of data chunks has not been received by a pre-defined period of time, the processed data for particular data chunk(s) of particular processor(s) from which processed data has not been received are determined based, at least in part, upon the received processed parity chunk(s) and the received processed data chunk(s).
Type: Grant
Filed: October 2, 2019
Date of Patent: May 4, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Todd Robert Porter, Xin Tian, Alexander Alperovich
-
Publication number: 20210105024
Abstract: Described herein is a system and method for coded streaming data to facilitate recovery from failed or slow processor(s). A batch of processing stream data can be partitioned into a plurality of data chunks. Parity chunk(s) are determined for the plurality of data chunks. The plurality of data chunks and the parity chunk(s) can be provided to processors for processing. Processed data of at least some (e.g., one or more) of the plurality of data chunks, and, processed data of parity chunk(s) are received. When it is determined that processed data for a pre-defined quantity of data chunks has not been received by a pre-defined period of time, the processed data for particular data chunk(s) of particular processor(s) from which processed data has not been received are determined based, at least in part, upon the received processed parity chunk(s) and the received processed data chunk(s).
Type: Application
Filed: October 2, 2019
Publication date: April 8, 2021
Applicant: Microsoft Technology Licensing, LLC
Inventors: Todd Robert PORTER, Xin TIAN, Alexander ALPEROVICH
-
Publication number: 20210103590
Abstract: Described herein is a system and method of journaling of a streaming anchor resource. An input node can store a value of a property associated with the streaming data in a persistent indexed data structure. The input node can generate an anchor that describes a particular point in time in a data stream. The anchor can include an index into the persistent indexed data structure of the stored value of the property associated with the streaming data. The generated anchor and streaming data can be provided to the downstream node. During recovery of a downstream node, the input node can utilize a received anchor to retrieve a value of a property associated with the streaming data from the persistent indexed data structure, and, provide a batch of data based upon the received anchor and the retrieved property value.
Type: Application
Filed: October 2, 2019
Publication date: April 8, 2021
Applicant: Microsoft Technology Licensing, LLC
Inventors: Alexander ALPEROVICH, Boris SHULMAN, Ke LIU
-
Publication number: 20210058298
Abstract: Described herein is a system and method for dynamically scaling a stream processing system (e.g., “exactly once” data stream processing system). Various parameter(s) (e.g., user-configurable capacity, real-time load metrics, and/or performance counters) can be used to dynamically scale in and/or scale out the “exactly once” stream processing system without system restart. Delay introduced by this scaling operation can be minimized by utilizing a combination of mutable process topology (which can dynamically assign certain parts of the system to a new host machine) and controllable streaming processor movement with checkpoints and the streaming protocol controlled recovery which still enforces the “exactly once” delivery metric.
Type: Application
Filed: August 21, 2019
Publication date: February 25, 2021
Applicant: Microsoft Technology Licensing, LLC
Inventors: Xindi ZHANG, Boris SHULMAN, Alexander ALPEROVICH, Patrick CHUNG
-
Patent number: 10868741
Abstract: A method for facilitating anchor shortening across streaming nodes in an event stream processing system may include receiving a full anchor at an upstream marshaller. The full anchor may be associated with a data batch that corresponds to one or more event streams. The full anchor may include an indication of an input point for the one or more event streams. The full anchor may be received from an upstream compute processor. The method may also include mapping the full anchor to an index anchor and passing the index anchor to a downstream marshaller.
Type: Grant
Filed: May 11, 2018
Date of Patent: December 15, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventors: Alexander Alperovich, Boris Shulman, Lev Novik
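The mapping step can be sketched as a lookup table kept by the upstream marshaller. This is only an illustration: the class and method names are hypothetical, the integer form of the index anchor is an assumption, and the `resolve` step is not described in the abstract and is included solely to show why the mapping would be retained.

```python
from typing import Dict, List


class UpstreamMarshaller:
    """Hypothetical marshaller that replaces full anchors with short index anchors."""

    def __init__(self) -> None:
        self._full_by_index: List[Dict[str, int]] = []

    def shorten(self, full_anchor: Dict[str, int]) -> int:
        # Store the full anchor and hand out its position as the index anchor.
        self._full_by_index.append(dict(full_anchor))
        return len(self._full_by_index) - 1

    def resolve(self, index_anchor: int) -> Dict[str, int]:
        # Assumed reverse lookup, e.g. when a downstream node refers back to it.
        return self._full_by_index[index_anchor]
```

Here a full anchor is modelled as a map from event stream name to input point, so its size grows with the number of streams while the index anchor stays a single integer.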
-
Publication number: 20200379805
Abstract: Methods, systems, and computer program products are described herein for automated cloud-edge workload distribution and bidirectional migration with lossless, once-only data stream processing. A cloud service may provide workload and bidirectional migration management between cloud and edge to provide once-only processing of data streams before and after migration. Migrated logic nodes may begin processing data streams where processing stopped at source logic nodes before migration without data loss or repetition, for example, by migrating and using anchors in pull-based stream processing. Query logic implementing customer queries of data streams may be distributed to edge and/or cloud devices based on placement criteria. Query logic may be migrated from source to target edge and/or cloud devices based on migration criteria.
Type: Application
Filed: May 30, 2019
Publication date: December 3, 2020
Inventors: Todd R. Porter, Alexander Alperovich, Krishna Gyana Mamidipaka
-
Publication number: 20200379774
Abstract: Methods, systems, apparatuses, and computer program products are provided for processing a stream of data. A maximum temporal divergence is established for data flushed to a data store from a plurality of upstream partitions. Each of a plurality of data flushers, each corresponding to an upstream partition, may obtain an item of data from a data producer. Each data flusher may determine whether flushing the data to the data store would exceed the maximum temporal divergence. Based at least on determining that flushing the data to the data store would not exceed the maximum temporal divergence, the data may be flushed to the data store for ingestion by a downstream partition and a data structure (e.g., a ledger) may be updated to indicate a time associated with the most recent item of data flushed to the data store.
Type: Application
Filed: May 30, 2019
Publication date: December 3, 2020
Inventors: Alexander Alperovich, Zhong Chen, Boris Shulman
-
Publication number: 20200320005
Abstract: A method for joining an event stream with reference data includes loading a plurality of reference data snapshots from a reference data source into a cache. Punctuation events are supplied that indicate temporal validity for the plurality of reference data snapshots in the cache. A logical barrier is provided that restricts a flow of data events in the event stream to a cache lookup operation based on the punctuation events. The cache lookup operation is performed with respect to the data events in the event stream that are permitted to cross the logical barrier.
Type: Application
Filed: April 8, 2019
Publication date: October 8, 2020
Inventors: Boris SHULMAN, Shoupei LI, Alexander ALPEROVICH, Xindi ZHANG, Kanstantsyn ZORYN
-
Patent number: 10733191
Abstract: Described herein is a system and method for a static streaming job startup sequence. During compilation of a streaming job, a graph of computing nodes of the streaming job is traversed to determine a minimum start time of computing node(s) downstream of each input computing node of the streaming job. Also, during compilation, a start time is assigned to each input computing node in accordance with the determined minimum start time. During execution of the streaming job, responsive to receipt of the trigger anchor by a particular input node, processing of the particular input computing node using the determined minimum start time is commenced. The input computing node further generates an anchor. Input data is received, and a batch of data is provided in accordance with the received input data and generated anchor.
Type: Grant
Filed: September 28, 2018
Date of Patent: August 4, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventors: Alexander Alperovich, Boris Shulman, Todd Robert Porter, Patrick Chung
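The compile-time traversal can be sketched as a graph walk from each input node. The representation is an assumption: the job graph is given as an adjacency dictionary, each downstream node may carry a required start time, and the function name `assign_input_start_times` is hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def assign_input_start_times(
    edges: Dict[str, List[str]],
    start_times: Dict[str, float],
    inputs: Iterable[str],
) -> Dict[str, Optional[float]]:
    """For each input node, find the minimum start time among its downstream nodes."""
    assigned: Dict[str, Optional[float]] = {}
    for source in inputs:
        seen = set()
        stack = list(edges.get(source, []))
        times: List[float] = []
        while stack:                          # iterative depth-first traversal
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            if node in start_times:
                times.append(start_times[node])
            stack.extend(edges.get(node, []))
        # None marks an input node with no downstream start time to honour.
        assigned[source] = min(times) if times else None
    return assigned
```

Because the result depends only on the compiled graph, it can be computed once before the job runs, which matches the static startup sequence the abstract describes.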
-
Publication number: 20200106816
Abstract: Described herein is a system and method for startup and/or recovery for stream processing. During a startup phase: start anchor request(s), each identifying a particular time, are accumulated until request(s) are pending from downstream nodes. A minimum time of the accumulated start anchor request(s) is determined. If the processing system is an input node, an anchor associated with the determined minimum time is generated. Otherwise, a start anchor request is provided to an upstream node identifying the determined minimum time. Once the anchor associated with the determined minimum time is received (or generated), the anchor is provided in response to a polled start anchor request for the determined minimum time from a downstream node. Asynchronous requests for batches of data bounded by two specific anchors are performed in accordance with information stored in an ordered collection of anchors during a recovery phase.
Type: Application
Filed: September 28, 2018
Publication date: April 2, 2020
Applicant: Microsoft Technology Licensing, LLC
Inventors: Alexander ALPEROVICH, Boris SHULMAN, Zhong CHEN, Lev NOVIK, Kanstantsyn ZORYN
-
Publication number: 20200104399
Abstract: Described herein is a system and method for a static streaming job startup sequence. During compilation of a streaming job, a graph of computing nodes of the streaming job is traversed to determine a minimum start time of computing node(s) downstream of each input computing node of the streaming job. Also, during compilation, a start time is assigned to each input computing node in accordance with the determined minimum start time. During execution of the streaming job, responsive to receipt of the trigger anchor by a particular input node, processing of the particular input computing node using the determined minimum start time is commenced. The input computing node further generates an anchor. Input data is received, and a batch of data is provided in accordance with the received input data and generated anchor.
Type: Application
Filed: September 28, 2018
Publication date: April 2, 2020
Applicant: Microsoft Technology Licensing, LLC
Inventors: Alexander ALPEROVICH, Boris SHULMAN, Todd Robert PORTER, Patrick CHUNG