Patents by Inventor Mike PIPPIN

Mike PIPPIN has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11947927
    Abstract: Disclosed are embodiments for sorting rows of a dataset after a JOIN operation. In one embodiment, a method is disclosed comprising performing a JOIN operation on an annotation dataset, the performing of the JOIN operation generating an unordered dataset; grouping a plurality of rows in the unordered dataset into a plurality of buckets, the grouping performed based on a root dataset associated with the annotation dataset; sorting each bucket, the sorting comprising sorting each bucket independently; and combining each sorted bucket into a sorted dataset.
    Type: Grant
    Filed: December 26, 2019
    Date of Patent: April 2, 2024
    Assignee: YAHOO ASSETS LLC
    Inventors: George Aleksandrovich, Allie K. Watfa, Robin Sahner, Mike Pippin
  • Patent number: 11809396
    Abstract: Disclosed are embodiments for generating metadata files for composite datasets. In one embodiment, a method is disclosed comprising generating a tree representing a plurality of datasets; parsing the tree into an algebraic representation of the tree; identifying a plurality of terms in the algebraic representation, each term in the terms comprising at least two factors, each of the two factors associated with a dataset in the plurality of datasets; generating a metadata object of the plurality of terms; serializing the metadata object to generate serialized terms; and storing the serialized terms in a metadata file associated with the plurality of datasets.
    Type: Grant
    Filed: November 21, 2022
    Date of Patent: November 7, 2023
    Assignee: YAHOO ASSETS LLC
    Inventors: George Aleksandrovich, Allie K. Watfa, Robin Sahner, Mike Pippin
  • Publication number: 20230259506
    Abstract: Disclosed embodiments are methods, apparatuses, and computer-readable media for annotating distributed data without redundant data copying. In one embodiment, a method is disclosed comprising reading a raw dataset, the raw dataset comprising a first set of columns and a first set of rows; generating an annotation dataset, the annotation dataset comprising a second set of columns and a second set of rows; assigning row identifiers to each row in the second set of rows, the row identifiers aligning the second set of rows with the first set of rows based on the underlying storage of the raw dataset and annotation dataset; and writing the annotation dataset to a distributed storage medium.
    Type: Application
    Filed: April 21, 2023
    Publication date: August 17, 2023
    Inventors: George ALEKSANDROVICH, Allie K. WATFA, Robin SAHNER, Mike PIPPIN
  • Publication number: 20230252021
    Abstract: Disclosed embodiments are methods, apparatuses, and computer-readable media for annotating distributed data without redundant data copying. In one embodiment, a method is disclosed comprising reading a raw dataset, the raw dataset comprising a first set of columns and a first set of rows; generating an annotation dataset, the annotation dataset comprising a second set of columns and a second set of rows; assigning row identifiers to each row in the second set of rows, the row identifiers aligning the second set of rows with the first set of rows based on the underlying storage of the raw dataset and annotation dataset; and writing the annotation dataset to a distributed storage medium.
    Type: Application
    Filed: April 21, 2023
    Publication date: August 10, 2023
    Inventors: George ALEKSANDROVICH, Allie K. WATFA, Robin SAHNER, Mike PIPPIN
  • Patent number: 11663162
    Abstract: Disclosed are embodiments for replacing database table join keys with index keys. In one embodiment, a method is disclosed comprising: receiving, by a processor, annotation data, the annotation data comprising a set of rows; retrieving, by the processor, a root dataset, the root dataset stored in one or more files; generating, by the processor, a row identifier for each row in the set of rows, the row identifier storing a plurality of fields enabling alignment of a respective row in the annotation data to a corresponding row in the root dataset; generating, by the processor, an annotation dataset, the annotation dataset comprising the set of rows and corresponding row identifiers; and writing, by the processor, the annotation dataset to at least one file, the at least one file separate from the one or more files.
    Type: Grant
    Filed: August 29, 2022
    Date of Patent: May 30, 2023
    Assignee: YAHOO ASSETS LLC
    Inventors: George Aleksandrovich, Allie K. Watfa, Robin Sahner, Mike Pippin
  • Patent number: 11650977
    Abstract: Disclosed embodiments are methods, apparatuses, and computer-readable media for annotating distributed data without redundant data copying. In one embodiment, a method is disclosed comprising reading a raw dataset, the raw dataset comprising a first set of columns and a first set of rows; generating an annotation dataset, the annotation dataset comprising a second set of columns and a second set of rows; assigning row identifiers to each row in the second set of rows, the row identifiers aligning the second set of rows with the first set of rows based on the underlying storage of the raw dataset and annotation dataset; and writing the annotation dataset to a distributed storage medium.
    Type: Grant
    Filed: December 26, 2019
    Date of Patent: May 16, 2023
    Assignee: YAHOO ASSETS LLC
    Inventors: George Aleksandrovich, Allie K. Watfa, Robin Sahner, Mike Pippin
  • Publication number: 20230086741
    Abstract: Disclosed are embodiments for generating metadata files for composite datasets. In one embodiment, a method is disclosed comprising generating a tree representing a plurality of datasets; parsing the tree into an algebraic representation of the tree; identifying a plurality of terms in the algebraic representation, each term in the terms comprising at least two factors, each of the two factors associated with a dataset in the plurality of datasets; generating a metadata object of the plurality of terms; serializing the metadata object to generate serialized terms; and storing the serialized terms in a metadata file associated with the plurality of datasets.
    Type: Application
    Filed: November 21, 2022
    Publication date: March 23, 2023
    Inventors: George ALEKSANDROVICH, Allie K. WATFA, Robin SAHNER, Mike PIPPIN
  • Publication number: 20220414057
    Abstract: Disclosed are embodiments for replacing database table join keys with index keys. In one embodiment, a method is disclosed comprising: receiving, by a processor, annotation data, the annotation data comprising a set of rows; retrieving, by the processor, a root dataset, the root dataset stored in one or more files; generating, by the processor, a row identifier for each row in the set of rows, the row identifier storing a plurality of fields enabling alignment of a respective row in the annotation data to a corresponding row in the root dataset; generating, by the processor, an annotation dataset, the annotation dataset comprising the set of rows and corresponding row identifiers; and writing, by the processor, the annotation dataset to at least one file, the at least one file separate from the one or more files.
    Type: Application
    Filed: August 29, 2022
    Publication date: December 29, 2022
    Inventors: George ALEKSANDROVICH, Allie K. WATFA, Robin SAHNER, Mike PIPPIN
  • Patent number: 11507554
    Abstract: Disclosed are embodiments for generating metadata files for composite datasets. In one embodiment, a method is disclosed comprising generating a tree representing a plurality of datasets; parsing the tree into an algebraic representation of the tree; identifying a plurality of terms in the algebraic representation, each term in the terms comprising at least two factors, each of the two factors associated with a dataset in the plurality of datasets; generating a metadata object of the plurality of terms; serializing the metadata object to generate serialized terms; and storing the serialized terms in a metadata file associated with the plurality of datasets.
    Type: Grant
    Filed: December 26, 2019
    Date of Patent: November 22, 2022
    Assignee: YAHOO ASSETS LLC
    Inventors: George Aleksandrovich, Allie K. Watfa, Robin Sahner, Mike Pippin
  • Patent number: 11429561
    Abstract: Disclosed are embodiments for replacing database table join keys with index keys. In one embodiment, a method is disclosed comprising: receiving, by a processor, annotation data, the annotation data comprising a set of rows; retrieving, by the processor, a root dataset, the root dataset stored in one or more files; generating, by the processor, a row identifier for each row in the set of rows, the row identifier storing a plurality of fields enabling alignment of a respective row in the annotation data to a corresponding row in the root dataset; generating, by the processor, an annotation dataset, the annotation dataset comprising the set of rows and corresponding row identifiers; and writing, by the processor, the annotation dataset to at least one file, the at least one file separate from the one or more files.
    Type: Grant
    Filed: December 26, 2019
    Date of Patent: August 30, 2022
    Assignee: YAHOO ASSETS LLC
    Inventors: George Aleksandrovich, Allie K. Watfa, Robin Sahner, Mike Pippin
  • Publication number: 20210200715
    Abstract: Disclosed are embodiments for replacing database table join keys with index keys. In one embodiment, a method is disclosed comprising: receiving, by a processor, annotation data, the annotation data comprising a set of rows; retrieving, by the processor, a root dataset, the root dataset stored in one or more files; generating, by the processor, a row identifier for each row in the set of rows, the row identifier storing a plurality of fields enabling alignment of a respective row in the annotation data to a corresponding row in the root dataset; generating, by the processor, an annotation dataset, the annotation dataset comprising the set of rows and corresponding row identifiers; and writing, by the processor, the annotation dataset to at least one file, the at least one file separate from the one or more files.
    Type: Application
    Filed: December 26, 2019
    Publication date: July 1, 2021
    Inventors: George ALEKSANDROVICH, Allie K. WATFA, Robin SAHNER, Mike PIPPIN
  • Publication number: 20210200512
    Abstract: Disclosed are embodiments for sorting rows of a dataset after a JOIN operation. In one embodiment, a method is disclosed comprising performing a JOIN operation on an annotation dataset, the performing of the JOIN operation generating an unordered dataset; grouping a plurality of rows in the unordered dataset into a plurality of buckets, the grouping performed based on a root dataset associated with the annotation dataset; sorting each bucket, the sorting comprising sorting each bucket independently; and combining each sorted bucket into a sorted dataset.
    Type: Application
    Filed: December 26, 2019
    Publication date: July 1, 2021
    Inventors: George ALEKSANDROVICH, Allie K. WATFA, Robin SAHNER, Mike PIPPIN
  • Publication number: 20210200717
    Abstract: Disclosed are embodiments for generating a dataset metadata file based on partial metadata files. In one embodiment, a method is disclosed comprising receiving data to write to disk, the data comprising a subset of a dataset; writing a first portion of the data to disk; detecting a split boundary after writing the first portion; recording metadata describing the split boundary; continuing to write a remaining portion of the data to disk; and after completing the writing of the data to disk: generating a partial metadata file for the data, the partial metadata file including the split boundary, and transmitting the partial metadata to a partial metadata collector.
    Type: Application
    Filed: December 26, 2019
    Publication date: July 1, 2021
    Inventors: George ALEKSANDROVICH, Allie K. WATFA, Robin SAHNER, Mike PIPPIN
  • Publication number: 20210200731
    Abstract: Disclosed are embodiments for horizontally skimming composite datasets.
    Type: Application
    Filed: December 26, 2019
    Publication date: July 1, 2021
    Inventors: George ALEKSANDROVICH, Allie K. WATFA, Robin SAHNER, Mike PIPPIN
  • Publication number: 20210200747
    Abstract: Disclosed embodiments are methods, apparatuses, and computer-readable media for annotating distributed data without redundant data copying. In one embodiment, a method is disclosed comprising reading a raw dataset, the raw dataset comprising a first set of columns and a first set of rows; generating an annotation dataset, the annotation dataset comprising a second set of columns and a second set of rows; assigning row identifiers to each row in the second set of rows, the row identifiers aligning the second set of rows with the first set of rows based on the underlying storage of the raw dataset and annotation dataset; and writing the annotation dataset to a distributed storage medium.
    Type: Application
    Filed: December 26, 2019
    Publication date: July 1, 2021
    Inventors: George ALEKSANDROVICH, Allie K. WATFA, Robin SAHNER, Mike PIPPIN
  • Publication number: 20210200732
    Abstract: Disclosed are embodiments for generating metadata files for composite datasets. In one embodiment, a method is disclosed comprising generating a tree representing a plurality of datasets; parsing the tree into an algebraic representation of the tree; identifying a plurality of terms in the algebraic representation, each term in the terms comprising at least two factors, each of the two factors associated with a dataset in the plurality of datasets; generating a metadata object of the plurality of terms; serializing the metadata object to generate serialized terms; and storing the serialized terms in a metadata file associated with the plurality of datasets.
    Type: Application
    Filed: December 26, 2019
    Publication date: July 1, 2021
    Inventors: George ALEKSANDROVICH, Allie K. WATFA, Robin SAHNER, Mike PIPPIN
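
The post-JOIN sorting abstract above (patent 11947927 and publication 20210200512) describes grouping unordered rows into buckets based on the root dataset, sorting each bucket independently, and combining the sorted buckets. The following is only a minimal sketch of that general idea, not the patented implementation. The row layout `(file_id, row_offset, payload)` and the function name `bucket_sort_joined_rows` are assumptions made for illustration; the abstract does not specify either.

```python
from collections import defaultdict


def bucket_sort_joined_rows(rows):
    """Sort rows by bucketing on a root-dataset key, then sorting each bucket.

    Hypothetical layout: each row is (file_id, row_offset, payload), where
    file_id stands in for the root-dataset file the row aligns with and
    row_offset is its position within that file.
    """
    # Group rows into buckets keyed by the (assumed) root-dataset file id.
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[0]].append(row)

    # Sort each bucket independently by row offset, then combine the
    # buckets in file-id order to produce the fully sorted dataset.
    sorted_rows = []
    for file_id in sorted(buckets):
        sorted_rows.extend(sorted(buckets[file_id], key=lambda r: r[1]))
    return sorted_rows


# Rows as they might appear, unordered, after a JOIN.
unordered = [(1, 0, "c"), (0, 1, "b"), (1, 1, "d"), (0, 0, "a")]
result = bucket_sort_joined_rows(unordered)
```

Because each bucket is sorted on its own, the per-bucket sorts in a sketch like this could run in parallel; only the final concatenation depends on bucket order.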