Avitan Gefen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
Abstract: Embodiments for dynamically allocating journal space for Do streams across multiple applications. A shared Do stream process has a dynamic block allocation component that provides a certain amount of buffering of a data flush for an application, using space that would normally be allocated for, but unused by, other applications, thus preventing the need for one or more of the applications to move to fast-forward mode when possible. Certain machine learning techniques are used in order to predict the required Do stream for each application according to past experience with the application, and this prediction is used to intelligently allocate Do Streams between the different applications.
Abstract: One example method includes monitoring performance of an element of a runtime environment, where the monitoring includes collecting performance information concerning the element, analyzing the collected information, detecting, based on the analysis of the collected information, an anomaly in the performance of the element and, in response to detection of the anomaly, automatically marking a snapshot of the runtime environment element, and the marking of the snapshot overrides a retention policy applicable to the snapshot.
Abstract: One example method includes collecting telemetry data for each of a group of virtual machines (VM), and each of the VMs is associated with a user, collecting usage data for each of the VMs, creating a user profile definition for each user, and the user profile definition is created based on the telemetry data and usage data of the VMs associated with that user, creating, for each user, a user profile that is based on the user profile definition for that user, clustering the users based on similarity of their respective user profiles, and generating a recommended VM hardware configuration for a VM of one of the users.
October 16, 2019
April 22, 2021
Amihai Savir, Avitan Gefen, Roi Gamliel
Abstract: Techniques are provided for generating workspace recommendations based on prior user ratings of selected workspaces, as well as other similar selections. One method comprises obtaining user workspace ratings provided by a user; calculating a first workspace recommendation score for workspaces that the user previously rated based on the obtained user workspace ratings; calculating a second workspace recommendation score for additional workspaces that are: (i) similar to workspaces previously rated by the user based on a predefined workspace similarity metric, and/or (ii) selected by similar users, based on a predefined user similarity metric; and recommending workspaces for the user based on the first workspace recommendation score and the second workspace recommendation score.
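The two-score scheme in the abstract above can be illustrated with a minimal sketch. It covers only case (i), workspaces similar to ones the user already rated, and it assumes Jaccard similarity over workspace tags as the "predefined workspace similarity metric". The function names, the tag data, and the 0.5 discount weight are hypothetical and are not taken from the patent.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag sets (an assumed similarity metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user_ratings, workspace_tags, top_n=3, weight=0.5):
    """Rank workspaces by a combined recommendation score.

    First score: the user's own rating, for workspaces already rated.
    Second score: a similarity-weighted average of the user's ratings,
    discounted by `weight`, for workspaces not yet rated.
    """
    scores = {}
    for ws, tags in workspace_tags.items():
        if ws in user_ratings:
            scores[ws] = user_ratings[ws]              # first score
            continue
        sims = {r: jaccard(tags, workspace_tags[r]) for r in user_ratings}
        total = sum(sims.values())
        if total > 0:
            predicted = sum(sims[r] * user_ratings[r] for r in sims) / total
            scores[ws] = weight * predicted            # second score
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda ws: (-scores[ws], ws))[:top_n]


# Hypothetical example data.
workspace_tags = {
    "quiet_room": {"quiet", "window"},
    "open_desk": {"noisy", "shared"},
    "library": {"quiet", "shared"},
    "phone_booth": {"quiet", "window", "private"},
}
user_ratings = {"quiet_room": 5, "open_desk": 1}
ranking = recommend(user_ratings, workspace_tags)
```

Here the unrated "phone_booth" ranks above "library" because it shares more tags with the highly rated "quiet_room". A fuller version would add the second branch of the abstract, scoring workspaces selected by similar users.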
Abstract: Embodiments include facilitating DNA storage of digital data including a plurality of data assets in a network by building a causal graph of the network and the relationship of the data assets; computing a value of each data asset; computing, using the causal graph and data values, a radius of recovery for each data asset; classifying each data asset as appropriate for DNA storage by assigning a numerical ranking of each data asset; defining manual constraints and a DNA storage configuration; and generating a ranked list of recommended data assets for storing in the DNA storage using the classification, manual constraints and DNA storage configuration.
Abstract: A system recommends the refactoring of microservices. The system determines a segments similarity score based on comparing first code segments, associated with a first microservice in an application, against second code segments, associated with a second microservice in the application. The system determines whether the segments similarity score satisfies a segments similarity threshold. The system determines microservices similarity scores based on comparing a size of similar code segments in the first code segments and the second code segments against sizes of the first microservice and the second microservice, if the segments similarity score satisfies the segments similarity threshold. The system determines whether any microservices similarity score satisfies a microservices similarity threshold. The system outputs a recommendation to merge the first microservice with the second microservice, if any microservices similarity score satisfies the microservices similarity threshold.
April 17, 2019
Date of Patent: March 2, 2021
EMC IP HOLDING COMPANY LLC
Roi Gamliel, Amihai Savir, Avitan Gefen
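The microservice refactoring abstract above describes a threshold test on code-segment similarity followed by a threshold test on how much of each microservice the similar code covers. A simplified sketch of that idea follows. It assumes `difflib.SequenceMatcher` as the segment comparison and character counts as the "size" measure; both choices, along with all names and threshold values, are hypothetical.

```python
from difflib import SequenceMatcher


def similar_size(first, second, seg_threshold):
    """Total size of segments in `first` that closely match a segment in `second`."""
    total = 0
    for seg in first:
        best = max(
            (SequenceMatcher(None, seg, other).ratio() for other in second),
            default=0.0,
        )
        if best >= seg_threshold:      # segment similarity threshold
            total += len(seg)
    return total


def recommend_merge(first, second, seg_threshold=0.8, ms_threshold=0.5):
    """Return True if either microservice is mostly made of shared code."""
    size_a = sum(len(s) for s in first)
    size_b = sum(len(s) for s in second)
    if not size_a or not size_b:
        return False
    # One microservices similarity score per service: shared size / own size.
    score_a = similar_size(first, second, seg_threshold) / size_a
    score_b = similar_size(second, first, seg_threshold) / size_b
    return max(score_a, score_b) >= ms_threshold


# Hypothetical code segments for two microservices.
orders = [
    "def total(items):\n    return sum(i.price for i in items)",
    "def log(msg):\n    print(msg)",
]
billing = [
    "def total(items):\n    return sum(i.price for i in items)",
    "def tax(x):\n    return x * 0.2",
]
should_merge = recommend_merge(orders, billing)
```

The duplicated `total` function accounts for more than half of the `orders` service, so this sketch recommends a merge for the example data.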
Abstract: A system, method, and computer-readable medium are disclosed for improved searching of contact information that includes receiving a request for a member's contact information. An initial list of candidates is returned. Based on organizational charts, distances between candidates and a point of reference are calculated, and a calculation is made as to the number of messages that are exchanged between candidates and the point of reference. Scores are determined based on the calculations. The scores are aggregated, and a refined list of candidates' contact information based on the aggregated scores is returned.
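As an illustration of the contact-search abstract above, the sketch below ranks candidates by combining an org-chart distance score with a message-count score. The scoring formulas (inverse distance, message count normalized by the maximum) and the simple sum used for aggregation are assumptions, as are all names and data.

```python
from collections import deque


def org_distance(manager_of, a, b):
    """Number of reporting-line hops between a and b (BFS over the org chart)."""
    neighbors = {}
    for person, boss in manager_of.items():
        neighbors.setdefault(person, set()).add(boss)
        neighbors.setdefault(boss, set()).add(person)
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # no reporting path between the two people


def rank_candidates(candidates, reference, manager_of, message_counts):
    """Order candidates by aggregated closeness and message-traffic scores."""
    max_msgs = max((message_counts.get(c, 0) for c in candidates), default=0) or 1
    scores = {}
    for c in candidates:
        dist = org_distance(manager_of, reference, c)
        closeness = 0.0 if dist is None else 1.0 / (1 + dist)
        traffic = message_counts.get(c, 0) / max_msgs
        scores[c] = closeness + traffic        # aggregated score
    return sorted(candidates, key=lambda c: (-scores[c], c))


# Hypothetical org chart: employee -> manager.
manager_of = {
    "ana": "cto", "bob": "cto", "john_s": "ana",
    "john_k": "sales_vp", "sales_vp": "ceo", "cto": "ceo",
}
message_counts = {"john_s": 12, "john_k": 1}   # messages exchanged with "bob"
refined = rank_candidates(["john_k", "john_s"], "bob", manager_of, message_counts)
```

With "bob" as the point of reference, "john_s" is both organizationally closer and a more frequent correspondent, so he is listed first.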
Abstract: Predicting large data flushes by collecting usage data for system assets, analyzing the data using machine learning on each asset and the whole system to determine usage trends, predicting a next large data flush using a time-series model, and determining if a size of the predicted next flush size is too large relative to journal storage space in order to advance fast forward mode. Further, protecting history information by pausing distribution of data from journal volumes to replica volumes, taking storage-level snapshots of the replica and the journal volumes, storing a snapshot timestamp for each of the storage-level snapshots in a snapshot database prior to advancing the fast forward mode or un-pausing distribution.
Abstract: Techniques are provided for data-driven reduction of log message data. An exemplary method comprises: obtaining log files and user-specified configuration parameters, wherein the log files each comprise one or more log messages; generating an event count matrix indicating a number of times each of a plurality of unique messages appeared in a given log file of the log files; generating a correlation graph by inserting similar messages with a mutual undirected edge, wherein similar messages are identified based on a predefined similarity measure; extracting redundant messages from the correlation graph by selecting log messages for inclusion in an uninformative log message filter from sub-graphs of the correlation graph in which any two nodes are connected together, except those log messages satisfying a predefined message frequency criteria; and identifying one or more redundant messages using the uninformative log message filter.
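The log-reduction pipeline in the abstract above (event count matrix, correlation graph, fully connected sub-graphs, filter) can be sketched roughly as follows. This sketch assumes Pearson correlation of per-file counts as the similarity measure, treats only connected components that are complete as the fully connected sub-graphs, and keeps the most frequent message of each group as the exception to the filter. All three are illustrative assumptions, not the patented method.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation; 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx and vy else 0.0


def redundant_messages(log_files, threshold=0.95):
    messages = sorted({m for f in log_files for m in f})
    # Event count matrix: one row per unique message, one column per log file.
    counts = {m: [f.count(m) for f in log_files] for m in messages}
    # Correlation graph: undirected edge between similar messages.
    edges = {m: set() for m in messages}
    for a, b in combinations(messages, 2):
        if pearson(counts[a], counts[b]) >= threshold:
            edges[a].add(b)
            edges[b].add(a)
    redundant, seen = set(), set()
    for m in messages:
        if m in seen:
            continue
        component, stack = set(), [m]          # collect m's connected component
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(edges[node] - component)
        seen |= component
        # Keep only components in which every pair of nodes is connected.
        is_clique = all(edges[n] >= component - {n} for n in component)
        if is_clique and len(component) > 1:
            keep = max(sorted(component), key=lambda n: sum(counts[n]))
            redundant |= component - {keep}    # the rest go into the filter
    return redundant


# Hypothetical log files, each a list of message templates.
log_files = [
    ["start", "ready"],
    ["start", "ready", "error", "error", "error"],
    ["start", "ready", "start", "ready", "error"],
]
filter_set = redundant_messages(log_files)
```

In the example data "start" and "ready" always appear together, so one of them carries no extra information and lands in the filter, while "error" is kept.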
Abstract: Systems and methods for providing data protection operations including cyber-threat protection operations. A sentiment analysis may be performed using language analysis to identify or determine a general or specific sentiment with or without intent to do harm. A score of the sentiment is then determined to assess risk. The data backup policy can be updated based on the assessed risk.
Abstract: Systems and methods for detecting cost anomalies in a data protection system. Data is collected for assets of a data protection system operating in a cloud. The data often relates to cost and may constitute time series. The time series are then analyzed by performing a fitting competition using multiple models. The best fitting model is selected and the residuals are analyzed to find outliers and produce a normal zone for the signal. The outliers can identify cost anomalies that may reflect the health of the data protection system.
June 28, 2019
December 31, 2020
Roi Gamliel, Amihai Savir, Avitan Gefen
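The cost-anomaly abstract above describes a fitting competition followed by residual analysis. A minimal sketch of that flow is shown below, assuming just two candidate models (constant and linear trend), an AIC-like score to pick the winner, and a median/MAD rule to define the normal zone. The models, the scoring rule, the `k` multiplier, and the sample data are all hypothetical choices made for illustration.

```python
from math import log
from statistics import mean, median


def fit_constant(y):
    level = mean(y)
    return [level] * len(y), 1                 # fitted values, parameter count


def fit_linear(y):
    n = len(y)
    x_bar, y_bar = (n - 1) / 2, mean(y)
    sxx = sum((x - x_bar) ** 2 for x in range(n))
    slope = sum((x - x_bar) * (v - y_bar) for x, v in enumerate(y)) / sxx
    return [y_bar + slope * (x - x_bar) for x in range(n)], 2


def detect_cost_anomalies(y, k=3.0):
    """Return (winning model name, outlier indices, normal zone per point)."""
    n = len(y)
    best = None
    for name, fit in (("constant", fit_constant), ("linear", fit_linear)):
        fitted, params = fit(y)
        sse = sum((v - f) ** 2 for v, f in zip(y, fitted))
        score = n * log(max(sse, 1e-12) / n) + 2 * params   # AIC-like penalty
        if best is None or score < best[0]:
            best = (score, name, fitted)
    _, name, fitted = best
    residuals = [v - f for v, f in zip(y, fitted)]
    center = median(residuals)
    mad = median([abs(r - center) for r in residuals])
    half_width = k * 1.4826 * mad              # robust estimate of k sigma
    zone = [(f + center - half_width, f + center + half_width) for f in fitted]
    outliers = [i for i, r in enumerate(residuals) if abs(r - center) > half_width]
    return name, outliers, zone


# Hypothetical daily cost series: steady growth with one spike on day 12.
costs = [100 + 2 * t for t in range(20)]
costs[12] += 50
model, outliers, zone = detect_cost_anomalies(costs)
```

The median and MAD are used instead of the mean and standard deviation so that the spike itself does not widen the normal zone enough to hide it.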
Abstract: A method and system for processing user feedback on client devices. Specifically, the method and system disclosed herein entail aggregating and sampling a feature set pertinent to the classification and/or prediction of user dissatisfaction with their respective client devices. Following the derivation of user discontent scores based on anomaly detection and machine learning methodologies, one or more actions may be performed to address and/or alleviate the observed user discontent.
Abstract: Techniques are provided for facial recognition using a high probability group database. One method comprises maintaining (i) a first database of facial images of individuals, and (ii) a second database of facial images comprising a subset of the individuals from the first database based on a probability of individuals appearing in sequences of image frames at a given time; applying a face detection algorithm to sequences of image frames to identify one or more faces in the sequences of images; and applying a facial recognition to at least one sequence of image frames using at least the second database to identify one or more individuals in the at least one sequence of image frames. The second database is comprised of facial images of: (i) individuals from multiple angles; (ii) individuals that appeared in prior image frames; and/or (iii) individuals that appeared in an image frame generated by a plurality of cameras.
Abstract: A method and system for performance-driven load shifting. Specifically, the method and system disclosed herein entail transferring user program workloads, for processing, between local computing resources available on a client device and cloud computing resources available on an offload domain based on the assessed performance score of the client device at any given point in time. Seamless load shifting is further guaranteed due substantively to the employment of a mobile network facilitating communications between the client device and the offload domain.
Abstract: Techniques are provided for recommending changes to a hardware configuration based on a user satisfaction rating. One method comprises obtaining usage data indicating user activity for each user on a computing device; generating a user profile for each user; clustering the users into user clusters based on the user profiles; determining, for a given user cluster, a satisfaction score for each user in the given user cluster based on the obtained usage data for each user on the computing device; providing suggested hardware upgrades for the computing device of a given user in the given user cluster, wherein the given user is selected based on a lower corresponding satisfaction score relative to the satisfaction scores of other users in the given cluster, and wherein the one or more suggested hardware upgrades are based on hardware configurations of one or more of the other users in the given cluster having a higher corresponding satisfaction score.
Abstract: Predicting large data flushes by collecting usage data for system assets, analyzing the data using machine learning on each asset and the whole system to determine usage trends, predicting a next large data flush using a time-series model, and determining if a size of the predicted next flush size is too large relative to journal storage space in order to advance fast forward mode. Further, protecting history information by pausing distribution of data from journal volumes to replica volumes, taking storage-level snapshots of the replica and the journal volumes, storing a snapshot timestamp for each of the storage-level snapshots in a snapshot database prior to advancing the fast forward mode or un-pausing distribution.
Abstract: Embodiments for predicting large data flushes in a data replication system by collecting usage data for assets in the system; analyzing the data using machine learning processes on the basis of each asset and the system as a whole to determine usage trends with respect to the data flush operations; predicting a next large data flush using a time-series model; obtaining a capacity of a journal storage space used for write operations to a storage device in the system; and determining if a size of the predicted next flush size is too large relative to this capacity, and if so, invoking a fast forward mode to not retain I/O history information for undo operations during a replication in order to save resources in the system.
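The final decision step in the flush-prediction abstracts above (forecast the next flush, compare it with journal capacity, and switch to fast forward mode if it will not fit) can be sketched as follows. Simple exponential smoothing stands in for the unspecified time-series model, and the `alpha` and `headroom` parameters are hypothetical.

```python
def predict_next_flush(flush_sizes, alpha=0.5):
    """One-step-ahead forecast using simple exponential smoothing."""
    if not flush_sizes:
        raise ValueError("need at least one observed flush size")
    level = flush_sizes[0]
    for size in flush_sizes[1:]:
        level = alpha * size + (1 - alpha) * level
    return level


def should_fast_forward(flush_sizes, journal_capacity, headroom=0.9, alpha=0.5):
    """True if the predicted flush exceeds the usable share of the journal."""
    return predict_next_flush(flush_sizes, alpha) > headroom * journal_capacity


# Hypothetical flush sizes in GB, oldest first.
history = [10, 20, 40, 80]
forecast = predict_next_flush(history)             # 53.75
small_journal = should_fast_forward(history, 50)   # True: 53.75 > 45
large_journal = should_fast_forward(history, 100)  # False: 53.75 <= 90
```

A production system would presumably forecast per asset and for the system as a whole, as the abstracts describe; this sketch shows only the capacity comparison for a single series.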
Abstract: Techniques are provided for extracting anomaly related rules from organizational data. One method comprises obtaining anomaly analysis data integrated from multiple data sources of an organization, wherein the multiple data sources comprise at least one set of labeled anomaly data related to anomalous transactions; extracting features from the integrated anomaly analysis data that correlate with an indication of an anomaly; training multiple machine learning models using the extracted features, where the machine learning models are trained using different combinations of the extracted features; evaluating a performance of the trained machine learning models; and extracting rules from the trained machine learning models based on the performance, wherein the extracted rules are used to classify transactions as anomalous. The trained machine learning models comprise a decision tree comprising paths to an anomaly classification. The extracted rules are optionally in a human-readable format.
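To illustrate the last step of the abstract above, extracting human-readable rules from a decision tree's paths to an anomaly classification, here is a small sketch. The tree is hand-built as nested dictionaries rather than trained, and the feature names, thresholds, and dictionary layout are hypothetical.

```python
def extract_rules(node, conditions=()):
    """Yield one readable rule per root-to-leaf path that ends in 'anomaly'."""
    if "label" in node:                                  # leaf node
        if node["label"] == "anomaly":
            yield " AND ".join(conditions) or "TRUE"
        return
    feature, threshold = node["feature"], node["threshold"]
    yield from extract_rules(
        node["left"], conditions + (f"{feature} <= {threshold}",)
    )
    yield from extract_rules(
        node["right"], conditions + (f"{feature} > {threshold}",)
    )


# Hypothetical tree: large transactions in the early hours are anomalous.
tree = {
    "feature": "amount", "threshold": 1000,
    "left": {"label": "normal"},
    "right": {
        "feature": "hour", "threshold": 6,
        "left": {"label": "anomaly"},
        "right": {"label": "normal"},
    },
}
rules = list(extract_rules(tree))   # ["amount > 1000 AND hour <= 6"]
```

Each rule is the conjunction of the split conditions along one path, which is what makes the extracted rules readable without access to the underlying model.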