Patents by Inventor Biruk Gebremariam

Biruk Gebremariam has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11281689
    Abstract: A computing system creates interaction features from variable values in a transformed dataset that includes a variable value computed for each variable of transformed variables computed from a prior execution of a transformation flow applied to an input dataset. An interaction transformation flow definition indicates a subset of the transformed variables, a synthesis definition, and interaction transformation operations to apply to the transformed variables. The synthesis definition describes how the subset of the transformed variables are combined to compute a value input to the interaction transformation operations. A plurality of variable combinations of the subset is defined. A computation is defined for each combination and interaction transformation operation. An operation data value is computed for each computation from the transformed dataset. An observation vector is read from the transformed dataset and a current interaction variable value is synthesized for each combination.
    Type: Grant
    Filed: October 19, 2021
    Date of Patent: March 22, 2022
    Assignee: SAS Institute Inc.
    Inventors: Biruk Gebremariam, Taiping He
  • Patent number: 10824694
    Abstract: A computing system defines transformed variable values for training a machine learning model. A data description is determined for each variable of a plurality of variables from observation vectors. A number of rare-levels is determined for any variable of the plurality of variables that has a nominal variable type. Bins that describe a cumulative distribution function are defined for each variable based on the data description determined for each variable and based on the number of rare-levels determined for any variable of the plurality of variables identified as the nominal variable type. A transformed value is determined for each variable and for each observation vector of the observation vectors using the bins defined for a respective variable of the plurality of variables. Each determined transformed value is written to a transformed dataset with a respective observation vector of the observation vectors.
    Type: Grant
    Filed: April 7, 2020
    Date of Patent: November 3, 2020
    Assignee: SAS Institute Inc.
    Inventors: Biruk Gebremariam, Mark Traccarella
  • Patent number: 10628409
    Abstract: A computing system transforms variable values in a dataset using a transformation flow definition applied in parallel. The transformation flow definition indicates flow variables and transformation phases to apply to the flow variables. A computation is defined for each variable and for each transformation phase. A phase internal parameter value is computed for each defined computation from observation vectors read from the dataset. A current variable, a first variable value, a first transformation phase, the phase internal parameter value, and a current transformation phase are selected based on an observation vector read from the dataset. A result value is computed by executing the transformation function with the phase internal parameter value and the first variable value. The computed result value is output to a transformed input dataset. The process is repeated for each variable, transformation phase, and observation vector.
    Type: Grant
    Filed: July 12, 2018
    Date of Patent: April 21, 2020
    Assignee: SAS Institute Inc.
    Inventors: Biruk Gebremariam, Xiangxiang Meng
  • Patent number: 10311044
    Abstract: A system provides analysis of distributed data and grouping of variables in support of analytics. Policy parameter values that define thresholds are received. A first computation of a cardinality value and of a number of observations having a non-missing value is requested for each variable of a plurality of variables included in the distributed data by each worker computing device. A number of observation vectors having the non-missing value and the cardinality value are computed by each worker computing device for each variable in response to the first computation request. Each respective worker computing device computes the number of observation vectors having the non-missing value and the cardinality value from a subset of the input dataset distributed to the respective worker computing device by reading each observation vector from the subset once. Each variable is assigned a category based on a comparison between computed values and the policy parameter values.
    Type: Grant
    Filed: October 16, 2018
    Date of Patent: June 4, 2019
    Assignee: SAS Institute Inc.
    Inventor: Biruk Gebremariam
  • Publication number: 20190050446
    Abstract: A system provides analysis of distributed data and grouping of variables in support of analytics. Policy parameter values that define thresholds are received. A first computation of a cardinality value and of a number of observations having a non-missing value is requested for each variable of a plurality of variables included in the distributed data by each worker computing device. A number of observation vectors having the non-missing value and the cardinality value are computed by each worker computing device for each variable in response to the first computation request. Each respective worker computing device computes the number of observation vectors having the non-missing value and the cardinality value from a subset of the input dataset distributed to the respective worker computing device by reading each observation vector from the subset once. Each variable is assigned a category based on a comparison between computed values and the policy parameter values.
    Type: Application
    Filed: October 16, 2018
    Publication date: February 14, 2019
    Inventor: Biruk Gebremariam
  • Publication number: 20190012344
    Abstract: A computing system transforms variable values in a dataset using a transformation flow definition applied in parallel. The transformation flow definition indicates flow variables and transformation phases to apply to the flow variables. A computation is defined for each variable and for each transformation phase. A phase internal parameter value is computed for each defined computation from observation vectors read from the dataset. A current variable, a first variable value, a first transformation phase, the phase internal parameter value, and a current transformation phase are selected based on an observation vector read from the dataset. A result value is computed by executing the transformation function with the phase internal parameter value and the first variable value. The computed result value is output to a transformed input dataset. The process is repeated for each variable, transformation phase, and observation vector.
    Type: Application
    Filed: July 12, 2018
    Publication date: January 10, 2019
    Inventors: Biruk Gebremariam, Xiangxiang Meng
  • Publication number: 20180300650
    Abstract: A computing system provides analysis of data and grouping of variables in support of analytics. From a plurality of observation vectors read from a dataset, a number of observations having a non-missing value and a cardinality value are computed for each variable of the variables. For each variable of the variables, the cardinality ratio value is compared to a first policy parameter value, and the respective variable is identified as a nominal variable type or as an interval variable type based on the comparison. For each variable of the variables identified as the nominal variable type, the cardinality value of the respective variable is compared to a second policy parameter value, and the respective variable is identified as the high-cardinality nominal variable type or as a non-high-cardinality nominal variable type based on the comparison with the cardinality value. The identified variable type is output for each variable of the variables.
    Type: Application
    Filed: January 22, 2018
    Publication date: October 18, 2018
    Inventor: Biruk Gebremariam
  • Publication number: 20180300338
    Abstract: A computing system transforms values in a dataset using high-cardinality transformation flow definitions in parallel. A transformation flow definition includes flow variables and a transformation method to apply to the flow variables. Per-level values are computed for each variable of each observation vector read from the dataset based on the transformation method. (a) an observation vector is read from the dataset to define a first variable value for each variable. (b) a variable of the flow variables is selected as the current variable. (c) a current value is defined equal to the first variable value for the current variable. (d) a per-level value associated with the current value is selected. (e) the per-level value is output to a transformed input dataset. (f) (b) to (e) are repeated with each remaining variable of the flow variables as the current variable. (n) (h) to (m) are repeated with each remaining observation vector.
    Type: Application
    Filed: January 22, 2018
    Publication date: October 18, 2018
    Inventor: Biruk Gebremariam
  • Patent number: 10025813
    Abstract: A computing system transforms variable values in a dataset using a transformation flow definition applied in parallel. The transformation flow definition indicates flow variables and transformation phases to apply to the flow variables. A computation is defined for each variable and for each transformation phase. A phase internal parameter value is computed for each defined computation from observation vectors read from the dataset. A current variable, a first variable value, a first transformation phase, the phase internal parameter value, and a current transformation phase are selected based on an observation vector read from the dataset. A result value is computed by executing the transformation function with the phase internal parameter value and the first variable value. The computed result value is output to a transformed input dataset. The process is repeated for each variable, transformation phase, and observation vector.
    Type: Grant
    Filed: January 22, 2018
    Date of Patent: July 17, 2018
    Assignee: SAS Institute Inc.
    Inventors: Biruk Gebremariam, Xiangxiang Meng
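
Several of the abstracts above (patent 10311044 and publication 20180300650 in particular) describe counting non-missing observations and distinct values per variable in a single read of the data, then assigning each variable a type by comparing those counts to policy thresholds. The following is only a minimal Python sketch of that general idea, not the patented implementation. The function name, the threshold values, the direction of each comparison, and the "all-missing" label are assumptions made for illustration; the abstracts do not specify them.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical policy parameter values; the abstracts give no actual numbers.
MAX_NOMINAL_CARDINALITY_RATIO = 0.5  # stands in for the "first policy parameter value"
HIGH_CARDINALITY_THRESHOLD = 3       # stands in for the "second policy parameter value"


def classify_variables(
    rows: Iterable[Dict[str, Any]], names: List[str]
) -> Dict[str, str]:
    """Assign a variable type to each name after a single read of the rows."""
    non_missing = {name: 0 for name in names}
    levels = {name: set() for name in names}

    # Each observation vector is read exactly once.
    for row in rows:
        for name in names:
            value = row.get(name)
            if value is None:  # None is treated as a missing value here
                continue
            non_missing[name] += 1
            levels[name].add(value)

    result = {}
    for name in names:
        count = non_missing[name]
        cardinality = len(levels[name])
        if count == 0:
            result[name] = "all-missing"  # assumed label, not from the abstracts
        elif cardinality / count > MAX_NOMINAL_CARDINALITY_RATIO:
            # Assumed direction: many distinct values per observation
            # suggests an interval variable.
            result[name] = "interval"
        elif cardinality > HIGH_CARDINALITY_THRESHOLD:
            result[name] = "high-cardinality nominal"
        else:
            result[name] = "nominal"
    return result


# Toy data, invented for this example.
incomes = [10.0, 12.5, 9.1, 30.2, 18.8, 22.0, 15.3, 11.7]
states = ["NC", "NC", "VA", "NC", "VA", None, "NC", "VA"]
zips = ["27513", "27513", "23220", "23220", "10001", "10001", "94105", "94105"]
rows = [
    {"income": i, "state": s, "zip": z} for i, s, z in zip(incomes, states, zips)
]
print(classify_variables(rows, ["income", "state", "zip"]))
```

In a distributed setting such as the one the abstracts describe, each worker would presumably compute its own counts and level sets over its subset of the data, with the results merged before the thresholds are applied; that merge step is omitted here.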