Patents Assigned to SAS Institute
-
Patent number: 9928320
Abstract: Techniques for estimated compound probability distribution are described herein. Embodiments may include receiving, at a master node of a distributed system, a compound model specification comprising frequency models, severity models, and one or more adjustment functions, wherein at least one model of the frequency models and the severity models depend on one or more regressor and distributing the compound model specification to worker nodes of the distributed system, each of the worker nodes to at least generate a portion of samples for use in predicting compound distribution model estimates. Embodiments may also include predicting the compound distribution model estimates based on the sample portions of aggregate values and adjusted aggregate values.
Type: Grant
Filed: April 12, 2017
Date of Patent: March 27, 2018
Assignee: SAS Institute Inc.
Inventors: Mahesh V. Joshi, Richard Potter, Jan Chvosta, Mark Roland Little
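The sampling idea in this abstract can be illustrated with a small in-process simulation. This is only a sketch under stated assumptions, not the patented method: the Poisson frequency model, the lognormal severity model, the per-loss cap used as the adjustment function, and all function names are hypothetical choices, the "worker nodes" are simulated by a loop with one seed per worker, and regressor effects are left out.

```python
import math
import random

def poisson(rng, lam):
    """Draw a Poisson count with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def worker_samples(seed, n, lam, mu, sigma, adjust):
    """One worker's portion: n aggregate losses, raw and adjusted."""
    rng = random.Random(seed)
    raw, adjusted = [], []
    for _ in range(n):
        # Frequency model gives the number of losses, severity model their sizes.
        losses = [rng.lognormvariate(mu, sigma) for _ in range(poisson(rng, lam))]
        raw.append(sum(losses))
        adjusted.append(sum(adjust(x) for x in losses))
    return raw, adjusted

def simulate(workers, per_worker, lam, mu, sigma, adjust):
    """Stand-in for the master node: gather every worker's portion."""
    raw, adjusted = [], []
    for worker_id in range(workers):
        r, a = worker_samples(worker_id, per_worker, lam, mu, sigma, adjust)
        raw.extend(r)
        adjusted.extend(a)
    return raw, adjusted

def summarize(samples):
    """Estimate the mean and the 95th percentile of the compound distribution."""
    ordered = sorted(samples)
    return {"mean": sum(ordered) / len(ordered),
            "q95": ordered[int(0.95 * (len(ordered) - 1))]}

# Hypothetical adjustment function: each loss is capped at a limit of 2.0.
raw, adjusted = simulate(workers=4, per_worker=1000, lam=3.0, mu=0.0,
                         sigma=0.5, adjust=lambda loss: min(loss, 2.0))
raw_summary, adjusted_summary = summarize(raw), summarize(adjusted)
```

Seeding each worker separately keeps the portions independent and reproducible, which is one simple way to let the estimates be computed from the pooled samples.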
-
Patent number: 9928052
Abstract: Various embodiments are generally directed to an apparatus, method and other techniques for receiving a request to generate a bootable image in a cloud-based computing environment, creating a block storage volume in the cloud-based computing environment in response to receiving the request, the block storage volume having one or more partitions. Further, an apparatus, method and so forth may include installing software comprising one or more files in a file system on the block storage volume in the cloud-based computing environment, creating a snapshot of the file system including the software in the cloud-based computing environment, and creating a bootable image from the snapshot of the file system in the cloud-based computing environment.
Type: Grant
Filed: November 2, 2016
Date of Patent: March 27, 2018
Assignee: SAS Institute Inc.
Inventor: Mihai Ibanescu
-
Publication number: 20180075051
Abstract: An apparatus includes a processor component caused to: retrieve metadata of organization of data within a data set, and map data of organization of data blocks within a data file; receive indications of which node devices are available to perform a processing task with a data set portion; and in response to the data set including partitioned data, compare the quantities of available node devices and of the node devices last involved in storing the data set. In response to a match, for each map data map entry: retrieve a hashed identifier for a data sub-block, and a size for each of the data sub-blocks within the corresponding data block; divide the hashed identifier by the quantity of available node devices; compare the modulo value to a designation assigned to each of the available node devices; and provide a pointer to the available node device assigned the matching designation.
Type: Application
Filed: November 6, 2017
Publication date: March 15, 2018
Applicant: SAS Institute Inc.
Inventors: Brian Payton Bowman, Steven E. Krueger, Richard Todd Knight, Chih-Wei Ho
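The modulo step described in this abstract can be sketched in a few lines. This is an illustration only: the function name, the node names, and the assumption that designations are the integers 0 through N-1 are hypothetical and do not come from the application.

```python
def assign_sub_blocks(hashed_ids, node_designations):
    """Map each hashed sub-block identifier to the node whose designation
    equals the identifier modulo the number of available nodes."""
    count = len(node_designations)
    if count == 0:
        raise ValueError("no node devices available")
    # Assumption: designations are exactly the integers 0..count-1.
    by_designation = {d: name for name, d in node_designations.items()}
    return {h: by_designation[h % count] for h in hashed_ids}

nodes = {"node-a": 0, "node-b": 1, "node-c": 2}
assignment = assign_sub_blocks([17, 42, 99, 256], nodes)
# 17 % 3 == 2, 42 % 3 == 0, 99 % 3 == 0, 256 % 3 == 1
```

Because the result depends only on the hash and the node count, the same sub-block lands on the same designation whenever the number of available nodes matches the number used when the data set was stored.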
-
Publication number: 20180060468
Abstract: An apparatus may include a processor caused to: receive indications of selection of experiment designs to compare; receive indications of selection of a set of terms to include in the comparison; for each experiment design, generate a corresponding term correlation graph of a set of term correlation graphs, wherein: the correlation graph comprises horizontal and vertical axes along both of which the set of terms are arranged, at each intersection within the graph, a degree of correlation between terms is indicated with a visual indicator selected from a set of visual indicators, the set of visual indicators is assigned an order that corresponds to a range of degree of correlation, and the range is divided into a set of contiguous sub-ranges, and each visual indicator corresponds to one of the sub-ranges; and present at least two correlation graphs of the set of correlation graphs at adjacent locations on a display.
Type: Application
Filed: August 30, 2017
Publication date: March 1, 2018
Applicant: SAS Institute Inc.
Inventors: Joseph Albert Morgan, Bradley Allen Jones, Ryan Adam Lekivetz
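The mapping from a degree of correlation to an ordered set of visual indicators might look like the following sketch. It assumes, hypothetically, that the degree of correlation is the absolute value of a correlation coefficient, that the range 0 to 1 is split into equal-width sub-ranges, and that the indicators are text characters; the application does not specify any of these.

```python
def indicator_for(correlation, indicators):
    """Return the indicator whose contiguous sub-range contains |correlation|."""
    if not indicators:
        raise ValueError("at least one visual indicator is required")
    degree = min(abs(correlation), 1.0)
    # Equal-width sub-ranges; the top edge (1.0) belongs to the last one.
    index = min(int(degree * len(indicators)), len(indicators) - 1)
    return indicators[index]

def correlation_grid(terms, corr, indicators):
    """Build the grid: the same terms run along both axes."""
    return [[indicator_for(corr[a][b], indicators) for b in terms] for a in terms]

# Hypothetical indicators, ordered from weakest to strongest correlation.
shades = [" ", ".", "o", "#"]
terms = ["A", "B", "A*B"]
corr = {
    "A":   {"A": 1.0, "B": 0.1, "A*B": -0.6},
    "B":   {"A": 0.1, "B": 1.0, "A*B": 0.3},
    "A*B": {"A": -0.6, "B": 0.3, "A*B": 1.0},
}
grid = correlation_grid(terms, corr, shades)
```

Rendering one such grid per experiment design and placing them side by side gives the adjacent comparison the abstract describes.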
-
Publication number: 20180060470
Abstract: Techniques for estimated compound probability distribution are described herein. Embodiments may include receiving a compound model specification comprising a frequency model and a severity model, the compound model specification including a model error comprising a frequency model error and a severity model error, and determining a number of frequency models and severity models to generate based on the received number of models to generate. Embodiments include generating a plurality of frequency models through perturbation of the frequency model according to the frequency model error, and generating a plurality of severity models through perturbation of the severity model according to the severity model error.
Type: Application
Filed: November 7, 2017
Publication date: March 1, 2018
Applicant: SAS Institute Inc.
Inventors: Mahesh V. Joshi, Richard Potter, Jan Chvosta, Mark Roland Little
-
Publication number: 20180060759
Abstract: Computer-based models can be developed, deployed, and managed in an automated manner. For example, a model building tool can be selected based on the model building tool being compatible with one or more parameters. A first machine-learning model can be generated using the model building tool and trained using a training dataset. The first machine-learning model can then be used to perform a task. Thereafter, a new model-building tool can be selected based on the new model-building tool being compatible with the one or more parameters. A second machine-learning model can be generated using the new model-building tool and trained using the training dataset. The accuracy of the first machine-learning model can be compared to the accuracy of the second machine-learning model. Based on the second machine-learning model being more accurate, the second machine-learning model can be used to perform the particular task rather than the first machine-learning model.
Type: Application
Filed: August 30, 2017
Publication date: March 1, 2018
Applicant: SAS Institute Inc.
Inventors: Chengwen Robert Chu, Wenjie Bao, Glenn Joseph Clingroth
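The replace-if-more-accurate loop in this abstract can be sketched without any machine-learning library. Everything named here is hypothetical: the two toy "model building tools", the `supports` sets standing in for parameter compatibility, and the use of a hold-out set for the accuracy comparison (the abstract does not say which data the comparison uses).

```python
def majority_tool(train):
    """Toy tool: always predict the most common training label."""
    labels = [y for _, y in train]
    guess = max(set(labels), key=labels.count)
    return lambda x: guess

def threshold_tool(train):
    """Toy tool: split at the midpoint between the two class means."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > cut else 0

# Hypothetical registry: each tool lists the task parameters it supports.
TOOLS = [
    {"name": "majority", "supports": {"binary"}, "build": majority_tool},
    {"name": "threshold", "supports": {"binary", "numeric"}, "build": threshold_tool},
]

def compatible_tools(required):
    """Select tools that are compatible with every required parameter."""
    return [t for t in TOOLS if required <= t["supports"]]

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

def pick_champion(current, challenger, data):
    """Keep the current model unless the challenger is strictly more accurate."""
    return challenger if accuracy(challenger, data) > accuracy(current, data) else current

train = [(1, 0), (2, 0), (3, 0), (7, 1), (8, 1)]
holdout = [(2, 0), (6, 1), (9, 1), (4, 0)]
first_tool, second_tool = compatible_tools({"binary"})
first_model = first_tool["build"](train)
second_model = second_tool["build"](train)
champion = pick_champion(first_model, second_model, holdout)
```

Requiring a strict improvement avoids swapping models when the two are tied, which is a design choice of this sketch rather than something the application states.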
-
Publication number: 20180060469
Abstract: An apparatus may include a processor caused to: receive indications of selection of an experiment design for regression analysis, of a type of distribution for a simulation of random data in the regression analysis, and of selection of a number of iterations of the simulation of random data; generate executable instructions in a pre-selected programming language to be executable by the processor to perform the regression analysis with the selected number of iterations of simulation of random data and with the selected type of distribution; generate a human readable form of a portion of the first executable instructions that includes the coefficients and terms in mathematical notation, and that specifies the selected number of iterations and the selected type of distribution for the simulation of random data; and present, on a display communicatively coupled to the processor, the human readable form of the portion of the first executable instructions.
Type: Application
Filed: August 30, 2017
Publication date: March 1, 2018
Applicant: SAS Institute Inc.
Inventors: Joseph Albert Morgan, Bradley Allen Jones, Ryan Adam Lekivetz
-
Publication number: 20180060466
Abstract: An apparatus may include a processor caused to: receive indications of first and second experiment designs to be compared; for each factor of the model of the first experiment design, identify a matching factor of the model of the second experiment design based on factor type, wherein the factor type is selected from the group consisting of a categorical factor and a continuous factor; for each categorical factor of the model of the first experiment design, identify a matching factor of the model of the second experiment design additionally based on quantity of levels of each factor; for each term of the model of the first experiment design, identify a matching term of the model of the second experiment design based on an order of each term; and present, on a display, the identified matches between the terms and between the responses of the first and second experiment designs.
Type: Application
Filed: August 30, 2017
Publication date: March 1, 2018
Applicant: SAS Institute Inc.
Inventors: Joseph Albert Morgan, Bradley Allen Jones, Ryan Adam Lekivetz
-
Patent number: 9900378
Abstract: An apparatus includes a processor and storage to store instructions that cause the processor to perform operations including: receive an indication of completion of a first task with a first partition such that the first node device is available to assign to perform another task; delay assignment of performance of a second task on a second partition to the first node device for up to a predetermined period of time, in spite of readiness of the second task to be performed on the second partition and availability of the first node device; determine whether an indication of completion of the first task with the second partition such that the second node device is available to assign to perform another task is received within the predetermined period of time; and assign performance of the second task on the second partition to the second node device based on the determination.
Type: Grant
Filed: February 1, 2017
Date of Patent: February 20, 2018
Assignee: SAS Institute Inc.
Inventors: Chaowang Zhang, Henry Gabriel Victor Bequet, Juan Du
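The delayed-assignment decision can be expressed as a small pure function. This is a sketch with hypothetical names and time units; in particular, falling back to the first node once the waiting period expires is an assumption of this sketch, since the abstract only states that the second node is assigned based on the determination.

```python
def choose_node(first_free_at, second_free_at, max_wait):
    """Decide which node runs the second task on the second partition.

    first_free_at:  time the first node reported finishing the first task.
    second_free_at: time the second node (which already holds the second
                    partition) reported finishing, or None if it never did.
    max_wait:       the predetermined period to hold the assignment back.
    Returns (node, start_time).
    """
    deadline = first_free_at + max_wait
    if second_free_at is not None and second_free_at <= deadline:
        # The second node became available in time, so the task can run
        # where the partition already resides.
        return ("second", max(first_free_at, second_free_at))
    # Assumption: once the wait expires, the idle first node takes the task.
    return ("first", deadline)

in_time = choose_node(first_free_at=10.0, second_free_at=12.0, max_wait=5.0)
too_late = choose_node(first_free_at=10.0, second_free_at=20.0, max_wait=5.0)
```

The trade-off is a bounded idle period on the first node in exchange for a chance to avoid moving the second partition between nodes.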
-
Publication number: 20180039897
Abstract: Data sets for a three-stage predictor can be automatically determined. For example, multiple time series can be filtered to identify a subset of time series that have time durations that exceed a preset time duration. Whether a time series of the subset of time series includes a time period with inactivity can be determined. Whether the time series exhibits a repetitive characteristic can be determined based on whether the time series has a pattern that repeats over a predetermined time period. Whether the time series includes a magnitude spike with a value above a preset magnitude can be determined. If the time series (i) lacks the time period with inactivity, (ii) exhibits the repetitive characteristic, and (iii) has the magnitude spike with the value above the preset magnitude threshold, the time series can be included in a data set for use with the three-stage predictor.
Type: Application
Filed: October 19, 2017
Publication date: February 8, 2018
Applicant: SAS Institute Inc.
Inventors: Kalyan Joshi, Nitzi Roehl, Yung-Hsin (Alex) Chien
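The selection criteria in this abstract amount to a chain of filters, sketched below. The concrete tests are assumptions made for illustration: inactivity is taken to mean a run of consecutive zeros, and repetition is checked by comparing each value with the value one period later. All names and thresholds are hypothetical.

```python
def has_inactivity(series, run_length):
    """Assumption: inactivity means run_length consecutive zero values."""
    run = 0
    for value in series:
        run = run + 1 if value == 0 else 0
        if run >= run_length:
            return True
    return False

def is_repetitive(series, period, tolerance):
    """Assumption: a pattern repeats if values one period apart stay close."""
    pairs = list(zip(series, series[period:]))
    if not pairs:
        return False
    scale = max(abs(v) for v in series) or 1.0
    mean_gap = sum(abs(a - b) for a, b in pairs) / len(pairs)
    return mean_gap <= tolerance * scale

def select_for_predictor(all_series, min_length, period, spike,
                         inactive_run=3, tolerance=0.2):
    """Return the names of series that pass every criterion."""
    return [
        name for name, s in all_series.items()
        if len(s) > min_length
        and not has_inactivity(s, inactive_run)
        and is_repetitive(s, period, tolerance)
        and max(s) > spike
    ]

candidates = {
    "weekly": [5, 6, 5, 7, 20, 6, 5] * 4,   # long, active, periodic, spiky
    "idle":   [5, 0, 0, 0, 20, 6, 5] * 4,   # contains an inactive stretch
    "short":  [5, 6, 20],                   # too short
    "flat":   [5, 6, 5, 7, 6, 6, 5] * 4,    # no spike above the threshold
}
selected = select_for_predictor(candidates, min_length=14, period=7, spike=15)
```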
-
Publication number: 20180011865
Abstract: An apparatus including a processor caused to: receive sizes and data block encryption data for multiple encrypted data blocks from multiple node devices, wherein data block encryption data is separately generated and used by each node device to encrypt a portion of a data set to generate one of the multiple encrypted data blocks; for each encrypted data block, generate a corresponding map entry within map data to include size and data block encryption data; and in response to receiving size and data block encryption data for all encrypted data blocks, encrypt a portion of the map data to generate an encrypted map base, wherein the portion of map data includes at least a subset of the multiple map entries, and transmit the encrypted map base to one or more storage devices to be stored within a data file along with the multiple encrypted data blocks.
Type: Application
Filed: September 1, 2017
Publication date: January 11, 2018
Applicant: SAS Institute Inc.
Inventors: Brian Payton Bowman, Mark Kuebler Gass, III
-
Publication number: 20180011866
Abstract: An apparatus may include a processor component caused to: generate map entries in map data descriptive of encrypted data blocks within a data file; use first map block encryption data to encrypt a first map extension of the map data; transmit the encrypted first map extension for storage within the data file; store the first map block encryption data within the second map extension; use second map block encryption data to encrypt a second map extension of the map data after storage of the first map block encryption data therein; transmit encrypted second map extension for storage within the data file; store the second map block encryption data within the map base; use third map block encryption data to encrypt a map base of the map data after storage of the second map block encryption data therein; and transmit the encrypted map base for storage within the data file.
Type: Application
Filed: September 1, 2017
Publication date: January 11, 2018
Applicant: SAS Institute Inc.
Inventors: Brian Payton Bowman, Mark Kuebler Gass, III
-
Publication number: 20180011867
Abstract: An apparatus includes a processor component of a first node device caused to receive data block encryption data and an indication of size of an encrypted data block distributed to the first node device for decryption, and in response to the data set being of encrypted data: receive an indication of the quantity of sub-blocks within the encrypted data block, and a hashed identifier for each data sub-block; use the data block encryption data to decrypt the encrypted data block to regenerate data set portions from the data sub-blocks; analyze the hashed identifier of each data sub-block to determine whether all data set portions are distributed to the first node device for processing; and in response to a determination that at least one data set portion is to be distributed to a second node device for processing, transmit the at least one data set portion to the second node device.
Type: Application
Filed: September 1, 2017
Publication date: January 11, 2018
Applicant: SAS Institute Inc.
Inventors: Brian Payton Bowman, Mark Kuebler Gass, III
-
Publication number: 20180011882
Abstract: Streaming data, such as streaming records transmitted from entities, can be managed. For example, a new record associated with an entity can be received. There can be an existing record for the entity within a group of records. The group of records can form a block. A new block for the new record can be generated. A datastore can be updated to indicate that the new block has the most current record for the entity. Entries in the datastore can be filtered to identify a subgroup of blocks that has the most current record for each entity of multiple entities. A combined group of blocks can be generated by joining the new block with the subgroup of blocks. The combined group of blocks can be processed as a batch of data by a processing engine.
Type: Application
Filed: November 7, 2016
Publication date: January 11, 2018
Applicant: SAS Institute Inc.
Inventor: Katherine Fullington Taylor
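One way to picture the block bookkeeping in this abstract is the in-memory sketch below. The `BlockStore` class, its dictionaries, and the record layout are hypothetical stand-ins for the datastore and blocks; the application does not describe its data structures at this level.

```python
class BlockStore:
    """Tracks immutable blocks of records and which block is current per entity."""

    def __init__(self):
        self.blocks = {}   # block id -> list of records
        self.latest = {}   # entity -> id of the block with its newest record
        self._next_id = 0

    def add_block(self, records):
        """Store new records as a new block and mark it current for its entities."""
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = list(records)
        for record in records:
            self.latest[record["entity"]] = block_id
        return block_id

    def current_batch(self, new_block_id):
        """Join the new block with the blocks holding each entity's newest record."""
        ids = sorted(set(self.latest.values()) | {new_block_id})
        return [self.blocks[i] for i in ids]

def latest_records(batch):
    """What a processing engine might do: later blocks override earlier ones."""
    newest = {}
    for block in batch:
        for record in block:
            newest[record["entity"]] = record["value"]
    return newest

store = BlockStore()
store.add_block([{"entity": "A", "value": 1}, {"entity": "B", "value": 1}])
new_id = store.add_block([{"entity": "A", "value": 2}])
batch = store.current_batch(new_id)
state = latest_records(batch)
```

Existing blocks are never rewritten in this sketch; a block simply drops out of the batch once every entity in it has a newer record elsewhere.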
-
Patent number: 9860229
Abstract: A first computing device connected to an internal network de-anonymizes data. A record including a surrogate key is received from a second computing device connected to an external network to the internal network. Each identity data record includes a second surrogate key, an entity identifier field value, a record type field value, and a de-identified field value. The second surrogate key uniquely identifies the respective record. The surrogate key is compared to the second surrogate key to identify a matching record. The matching entity identifier field value is selected and compared to the entity identifier field value of the plurality of records to identify a master record for the surrogate key. The record type field value includes an indicator indicating whether the record is the master record. The de-identified field value included in the identified master record is selected. The received record is supplemented with the selected de-identified field value.
Type: Grant
Filed: April 13, 2017
Date of Patent: January 2, 2018
Assignee: SAS Institute Inc.
Inventors: Brian Oneal Miles, Keith Adams
-
Publication number: 20170371856
Abstract: Various embodiments are generally directed to systems for summarizing data visualizations (i.e., images of data visualizations), such as a graph image, for instance. Some embodiments are particularly directed to a personalized graph summarizer that analyzes a data visualization, or image, to detect pre-defined patterns within the data visualization, and produces a textual summary of the data visualization based on the pre-defined patterns detected within the data visualization. In various embodiments, the personalized graph summarizer may include features to adapt to the preferences of a user for generating an automated, personalized computer-generated narrative. For instance, additional pre-defined patterns may be created for detection and/or the textual summary may be tailored based on user preferences. In some such instances, one or more of the user preferences may be automatically determined by the personalized graph summarizer without requiring the user to explicitly indicate them.
Type: Application
Filed: June 22, 2017
Publication date: December 28, 2017
Applicant: SAS Institute Inc.
Inventors: Ethem F. Can, Richard W. Crowell, James Tetterton, Jared Peterson, Saratendu Sethi
-
Patent number: 9852013
Abstract: An apparatus includes a processor and a storage storing instructions causing the processor to: maintain a federated area; receive a request to perform a job flow with a data set from a remote device; retrieve a job flow definition specifying the tasks of the job flow from the federated area; determine whether there is an instance log in the federated area generated by a previous performance of the job flow with the data set; in response to there being such an instance log, retrieve the version specified in the instance log of each task routine for each task from the federated area; in response to there being no such instance log, retrieve the most recent version of each task routine; perform the job flow with the retrieved versions of the task routines and the data set to generate a result report; and provide the result report to the remote device.
Type: Grant
Filed: June 5, 2017
Date of Patent: December 26, 2017
Assignee: SAS Institute Inc.
Inventors: Henry Gabriel Victor Bequet, Kais Arfaoui, Ronald Earl Stogner
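The version-selection rule in this abstract (reuse the versions recorded in an instance log if one exists, otherwise take the most recent) can be sketched as follows. The dictionaries standing in for the federated area, the integer version numbers, and the toy task routines are all hypothetical.

```python
def resolve_versions(flow_tasks, routines, instance_log=None):
    """Pick one version of each task routine.

    routines:     task name -> {version number: callable}
    instance_log: task name -> version used by a previous performance, or None
    """
    chosen = {}
    for task in flow_tasks:
        if instance_log is not None:
            chosen[task] = instance_log[task]      # repeat the earlier run
        else:
            chosen[task] = max(routines[task])     # most recent version
    return chosen

def perform(flow_tasks, routines, data, instance_log=None):
    """Run the tasks in order; return the result and the versions used."""
    chosen = resolve_versions(flow_tasks, routines, instance_log)
    result = data
    for task in flow_tasks:
        result = routines[task][chosen[task]](result)
    return result, chosen

# Hypothetical federated-area contents: two versions of "scale", one of "total".
routines = {
    "scale": {1: lambda rows: [10 * r for r in rows],
              2: lambda rows: [100 * r for r in rows]},
    "total": {1: sum},
}
flow = ["scale", "total"]

fresh_result, fresh_log = perform(flow, routines, [1, 2])
replayed_result, _ = perform(flow, routines, [1, 2],
                             instance_log={"scale": 1, "total": 1})
```

Returning the chosen versions alongside the result gives a natural value to store as the instance log for the next request with the same data set.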
-
Patent number: 9830558
Abstract: A computing device determines an SVDD to identify an outlier in a dataset. First and second sets of observation vectors of a predefined sample size are randomly selected from a training dataset. First and second optimal values are computed using the first and second observation vectors to define a first set of support vectors and a second set of support vectors. A third optimal value is computed using the first set of support vectors updated to include the second set of support vectors to define a third set of support vectors. Whether or not a stop condition is satisfied is determined by comparing a computed value to a stop criterion. When the stop condition is not satisfied, the first set of support vectors is defined as the third set of support vectors, and operations are repeated until the stop condition is satisfied. The third set of support vectors is output.
Type: Grant
Filed: June 17, 2016
Date of Patent: November 28, 2017
Assignee: SAS Institute Inc.
Inventors: Arin Chaudhuri, Deovrat Vijay Kakde, Maria Jahja, Wei Xiao, Seung Hyun Kong, Hansi Jiang, Sergiy Peredriy
-
Patent number: 9817882
Abstract: An apparatus includes a processor and a storage storing instructions causing the processor to receive representation metadata indicating features of representation data to be generated from a plurality of representation portions, receive a command to generate at least one row of the representation data, determine a subset of data blocks of a data blob required to generate the at least one row, and a subset of node devices that store the subset of data blocks, for each node device of the subset of node devices, derive a node block map identifying at least one data item of a data block for generating a representation portion, transmit the node block maps to the subset of node devices; and transmit a command to the subset of node devices to each generate at least one row of one of the plurality of representation portions.
Type: Grant
Filed: March 30, 2017
Date of Patent: November 14, 2017
Assignee: SAS Institute Inc.
Inventors: Stacey Michelle Christian, Michael Stephen Whitcher, Donald Kent McAlister, Phillip Elliot Hanna
-
Patent number: 9811575
Abstract: An apparatus includes a processor and storage storing instructions causing the processor to store, at a node device of a grid of node devices, a data block of a data blob, receive data blob metadata indicative of an organization of data items within the data blob, receive a command to generate, from the data block, at least one row of a representation portion of a plurality of representation portions from which a 2D representation of the data blob is to be generated, use the data blob metadata and a node block map indicative of which data items of the data block are required to generate the representation portion to derive one or more transforms to be performed with the data block to generate the at least one row of the representation portion, and perform the one or more transforms with the data block to generate the at least one row.
Type: Grant
Filed: March 30, 2017
Date of Patent: November 7, 2017
Assignee: SAS Institute Inc.
Inventors: Stacey Michelle Christian, Michael Stephen Whitcher, Donald Kent McAlister, Phillip Elliot Hanna