Patents by Inventor Justo L. Perez

Justo L. Perez has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

Indexing multiple types of data to facilitate rapid re-indexing of one or more types of data

Patent number: 11829324

Abstract: A method and indexing system indexes the content of a body of documents into a content index, and the metadata of the documents into a metadata index which is a parallel index to the content index. The metadata is copied into a data store that is easily accessible by the indexing system and is stored in native form. The indexing system can dynamically re-index the metadata from the native metadata in the data store to produce a new metadata index which is used to replace the original metadata index. Search queries received by a search engine associated with the indexing system are applied to both the content and metadata index and the results are merged for return.

Type: Grant

Filed: May 6, 2019

Date of Patent: November 28, 2023

Assignee: International Business Machines Corporation

Inventors: David O. Been, Michael Busch, Osamu Furusawa, Frederick S. Grennan, Fumihiko Terui, Justo L. Perez
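
The parallel-index idea in the abstract of patent 11829324 can be illustrated with a minimal sketch. It is not the patented implementation: the class `ParallelIndex`, its method names, the whitespace tokenizer, and the choice to merge results by intersection are all assumptions made for illustration.

```python
from collections import defaultdict


class ParallelIndex:
    """Toy content index plus a parallel, rebuildable metadata index."""

    def __init__(self):
        self.content_index = defaultdict(set)   # term -> doc ids
        self.metadata_index = defaultdict(set)  # (field, value) -> doc ids
        self.native_metadata = {}               # doc id -> metadata as supplied

    def add(self, doc_id, text, metadata):
        for term in text.lower().split():
            self.content_index[term].add(doc_id)
        # Keep the metadata in native form so it can be re-indexed later.
        self.native_metadata[doc_id] = dict(metadata)
        self._index_metadata(self.metadata_index, doc_id, metadata, str)

    @staticmethod
    def _index_metadata(index, doc_id, metadata, normalize):
        for field, value in metadata.items():
            index[(field, normalize(value))].add(doc_id)

    def reindex_metadata(self, normalize):
        """Build a new metadata index from the native store, then swap it in."""
        new_index = defaultdict(set)
        for doc_id, metadata in self.native_metadata.items():
            self._index_metadata(new_index, doc_id, metadata, normalize)
        self.metadata_index = new_index

    def search(self, term, field, value):
        """Apply the query to both indexes and merge (assumed: intersect)."""
        by_content = self.content_index.get(term.lower(), set())
        by_metadata = self.metadata_index.get((field, value), set())
        return sorted(by_content & by_metadata)


index = ParallelIndex()
index.add(1, "quarterly revenue report", {"author": "Ana"})
index.add(2, "annual revenue summary", {"author": "ANA"})
before = index.search("revenue", "author", "ana")
index.reindex_metadata(lambda v: str(v).lower())
after = index.search("revenue", "author", "ana")
```

In this sketch the content index is never touched by `reindex_metadata`; only the metadata index is rebuilt from the stored native values and swapped in, which is the property the abstract emphasizes.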

Managing large scale association sets using optimized bit map representations

Patent number: 11372831

Abstract: Processing a database query for sets of data includes assigning a unique identifier from an integer space to each entity within data and creating one or more sets of entities each pertaining to a corresponding entity within the data. A representation is then generated on disk for each set of entities, wherein each representation encompasses and is suited for a range of the unique identifiers of entities within a corresponding set and indicates a presence of an entity within that corresponding set. Finally, a query is processed based on the representation for each set of entities to retrieve data satisfying the query, wherein the representation provides a constant time for association and dissociation operations that are append-only operations with deferred merge and automatic filtering of deleted and duplicate entities at query time.

Type: Grant

Filed: July 29, 2019

Date of Patent: June 28, 2022

Assignee: International Business Machines Corporation

Inventors: Rajesh M. Desai, Magesh Jayapandian, Iun V. Leong, Justo L. Perez, Roger C. Raphael, Gabriel Valencia
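
A rough sketch of the append-only association and dissociation operations with deferred merge, as described in the abstract of patent 11372831. The abstract describes on-disk representations; this example keeps everything in memory and uses a Python integer as the bitmap, and the class `AssociationSet` and its methods are hypothetical names.

```python
class AssociationSet:
    """Bitmap of member ids plus an append-only log of pending operations."""

    def __init__(self):
        self.bitmap = 0   # bit i set -> entity id i is a member
        self.log = []     # pending (entity_id, is_member) records

    def associate(self, entity_id):
        self.log.append((entity_id, True))    # O(1) append, bitmap untouched

    def dissociate(self, entity_id):
        self.log.append((entity_id, False))   # O(1) append, bitmap untouched

    def _apply_log(self, bitmap):
        for entity_id, is_member in self.log:
            if is_member:
                bitmap |= 1 << entity_id
            else:
                bitmap &= ~(1 << entity_id)
        return bitmap

    def members(self):
        """Query-time view: later log entries win and duplicates collapse."""
        bitmap = self._apply_log(self.bitmap)
        return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

    def merge(self):
        """Deferred merge: fold the log into the bitmap and clear it."""
        self.bitmap = self._apply_log(self.bitmap)
        self.log.clear()


tags = AssociationSet()
for entity_id in (3, 7, 3, 9):
    tags.associate(entity_id)
tags.dissociate(7)
pending_view = tags.members()   # computed before any merge has run
tags.merge()
merged_view = tags.members()
```

Writes only append to the log, so their cost does not depend on the size of the set; the duplicate entry for id 3 and the deleted id 7 are filtered when `members` is called, before or after the merge.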

Handling queries in document systems using segment differential based document text-index modelling

Patent number: 11157477

Abstract: A method, computer system, and computer program product for segment differential-based document text-index modeling are provided. The embodiment may include receiving, by a processor, a document with a valid document ID and version ID tuple. The embodiment may also include determining the received document is a new version of a previously stored document and consequently multiplexing versions of the document into a single indexed document. The embodiment may further include segmenting the received document and building a token vector. The embodiment may also include calculating a difference between the received new version of the document and the previously stored document using information obtained from the segmentation. The embodiment may further include in response to the calculated difference being below a pre-configured threshold value, discarding the received new version.

Type: Grant

Filed: November 28, 2018

Date of Patent: October 26, 2021

Assignee: International Business Machines Corporation

Inventors: Roger C. Raphael, Rajesh M. Desai, Fumihiko Terui, Justo L. Perez, Thomas Hampp
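
The threshold test in the abstract of patent 11157477 can be sketched as follows. This is a simplified illustration under stated assumptions: segments are taken to be non-empty lines, the difference is the fraction of segments found in only one version, and the token vector step is omitted. `segment`, `difference`, and `VersionedStore` are hypothetical names.

```python
def segment(text):
    """Split a document into segments; here one segment per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def difference(old_text, new_text):
    """Fraction of segments that appear in only one of the two versions."""
    old_segments = set(segment(old_text))
    new_segments = set(segment(new_text))
    union = old_segments | new_segments
    if not union:
        return 0.0
    return len(old_segments ^ new_segments) / len(union)


class VersionedStore:
    def __init__(self, threshold):
        self.threshold = threshold
        self.versions = {}   # doc_id -> list of (version_id, text)

    def receive(self, doc_id, version_id, text):
        """Return True if the version was kept, False if it was discarded."""
        history = self.versions.setdefault(doc_id, [])
        if history:
            _, latest_text = history[-1]
            if difference(latest_text, text) < self.threshold:
                return False   # change below threshold: discard new version
        history.append((version_id, text))
        return True


store = VersionedStore(threshold=0.25)
kept = [
    store.receive("d1", 1, "alpha\nbeta\ngamma\ndelta"),
    store.receive("d1", 2, "alpha\nbeta\ngamma\ndelta\n"),   # whitespace only
    store.receive("d1", 3, "alpha\nbeta\nGAMMA\nDELTA"),     # two segments changed
]
```

Version 2 differs only by a trailing newline, so its difference is 0 and it is discarded; version 3 changes two of four segments and is kept under the same document ID.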

Handling queries in document systems using segment differential based document text-index modelling

Publication number: 20200167329

Abstract: A method, computer system, and computer program product for segment differential-based document text-index modeling are provided. The embodiment may include receiving, by a processor, a document with a valid document ID and version ID tuple. The embodiment may also include determining the received document is a new version of a previously stored document and consequently multiplexing versions of the document into a single indexed document. The embodiment may further include segmenting the received document and building a token vector. The embodiment may also include calculating a difference between the received new version of the document and the previously stored document using information obtained from the segmentation. The embodiment may further include in response to the calculated difference being below a pre-configured threshold value, discarding the received new version.

Type: Application

Filed: November 28, 2018

Publication date: May 28, 2020

Inventors: Roger C. Raphael, Rajesh M. Desai, Fumihiko Terui, Justo L. Perez, Thomas Hampp

Managing large scale association sets using optimized bit map representations

Publication number: 20190354514

Abstract: Processing a database query for sets of data includes assigning a unique identifier from an integer space to each entity within data and creating one or more sets of entities each pertaining to a corresponding entity within the data. A representation is then generated on disk for each set of entities, wherein each representation encompasses and is suited for a range of the unique identifiers of entities within a corresponding set and indicates a presence of an entity within that corresponding set. Finally, a query is processed based on the representation for each set of entities to retrieve data satisfying the query, wherein the representation provides a constant time for association and dissociation operations that are append-only operations with deferred merge and automatic filtering of deleted and duplicate entities at query time.

Type: Application

Filed: July 29, 2019

Publication date: November 21, 2019

Inventors: Rajesh M. Desai, Magesh Jayapandian, Iun V. Leong, Justo L. Perez, Roger C. Raphael, Gabriel Valencia

Managing large scale association sets using optimized bit map representations

Patent number: 10452631

Abstract: Processing a database query for sets of data includes assigning a unique identifier from an integer space to each entity within data and creating one or more sets of entities each pertaining to a corresponding entity within the data. A representation is then generated on disk for each set of entities, wherein each representation encompasses and is suited for a range of the unique identifiers of entities within a corresponding set and indicates a presence of an entity within that corresponding set. Finally, a query is processed based on the representation for each set of entities to retrieve data satisfying the query, wherein the representation provides a constant time for association and dissociation operations that are append-only operations with deferred merge and automatic filtering of deleted and duplicate entities at query time.

Type: Grant

Filed: March 15, 2017

Date of Patent: October 22, 2019

Assignee: International Business Machines Corporation

Inventors: Rajesh M. Desai, Magesh Jayapandian, Iun V. Leong, Justo L. Perez, Roger C. Raphael, Gabriel Valencia

Providing near real-time and effective litigation management for multiple remote content systems using asynchronous bi-directional replication pipelines

Publication number: 20190304041

Abstract: Embodiments generally relate to providing litigation management for multiple remote content systems using asynchronous bi-directional replication pipelines. In some embodiments, a method includes retrieving, at one or more inbound replicators of one or more respective bi-directional pipelines, metadata associated with documents stored in one or more content repositories. The method further includes resolving, at a governance control hub, conflicts associated with legal holds on one or more of the documents based on the metadata. The method further includes sending conflict resolution results from one or more outbound applicators of the bi-directional pipelines to the content repositories, where the content repositories enforce legal holds on the documents.

Type: Application

Filed: June 18, 2019

Publication date: October 3, 2019

Inventors: Roger C. Raphael, Ronald L. Rathgeber, Rajesh M. Desai, Gabriel Valencia, Justo L. Perez, William Russell Belknap, Sudhakar Basireddy
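
The inbound, hub, and outbound stages named in the abstract of publication 20190304041 can be laid out in a small sketch. The abstract does not state the conflict-resolution rule, so the rule used here (a document carries the union of all holds reported for it) is purely an assumption, as are the function names and the sample data. The sketch also runs synchronously, whereas the abstract describes asynchronous pipelines.

```python
from collections import deque


def inbound_replicate(repositories):
    """Inbound side: collect (repo, doc_id, holds) metadata records."""
    queue = deque()
    for repo_name, documents in repositories.items():
        for doc_id, holds in documents.items():
            queue.append((repo_name, doc_id, frozenset(holds)))
    return queue


def resolve_holds(inbound_queue):
    """Hub: assumed rule, a document carries the union of all reported holds."""
    resolved = {}
    while inbound_queue:
        _, doc_id, holds = inbound_queue.popleft()
        resolved[doc_id] = resolved.get(doc_id, frozenset()) | holds
    return resolved


def outbound_apply(repositories, resolved):
    """Outbound side: push the resolved holds back to every repository copy."""
    for documents in repositories.values():
        for doc_id in documents:
            documents[doc_id] = set(resolved[doc_id])


# Hypothetical repositories: doc_id -> set of legal-hold identifiers.
repositories = {
    "repo_a": {"contract-1": {"case-17"}, "memo-2": set()},
    "repo_b": {"contract-1": {"case-42"}},
}
outbound_apply(repositories, resolve_holds(inbound_replicate(repositories)))
```

After the round trip, both copies of `contract-1` carry both holds, and each repository remains responsible for enforcing them locally.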

Indexing multiple types of data to facilitate rapid re-indexing of one or more types of data

Patent number: 10394754

Abstract: A method and indexing system indexes the content of a body of documents into a content index, and the metadata of the documents into a metadata index which is a parallel index to the content index. The metadata is copied into a data store that is easily accessible by the indexing system and is stored in native form. The indexing system can dynamically re-index the metadata from the native metadata in the data store to produce a new metadata index which is used to replace the original metadata index. Search queries received by a search engine associated with the indexing system are applied to both the content and metadata index and the results are merged for return.

Type: Grant

Filed: March 8, 2010

Date of Patent: August 27, 2019

Assignee: International Business Machines Corporation

Inventors: David O. Been, Michael Busch, Osamu Furusawa, Frederick S. Grennan, Fumihiko Terui, Justo L. Perez

Indexing multiple types of data to facilitate rapid re-indexing of one or more types of data

Publication number: 20190258603

Abstract: A method and indexing system indexes the content of a body of documents into a content index, and the metadata of the documents into a metadata index which is a parallel index to the content index. The metadata is copied into a data store that is easily accessible by the indexing system and is stored in native form. The indexing system can dynamically re-index the metadata from the native metadata in the data store to produce a new metadata index which is used to replace the original metadata index. Search queries received by a search engine associated with the indexing system are applied to both the content and metadata index and the results are merged for return.

Type: Application

Filed: May 6, 2019

Publication date: August 22, 2019

Inventors: David O. Been, Michael Busch, Osamu Furusawa, Frederick S. Grennan, Fumihiko Terui, Justo L. Perez

Managing large scale association sets using optimized bit map representations

Publication number: 20180268009

Abstract: Processing a database query for sets of data includes assigning a unique identifier from an integer space to each entity within data and creating one or more sets of entities each pertaining to a corresponding entity within the data. A representation is then generated on disk for each set of entities, wherein each representation encompasses and is suited for a range of the unique identifiers of entities within a corresponding set and indicates a presence of an entity within that corresponding set. Finally, a query is processed based on the representation for each set of entities to retrieve data satisfying the query, wherein the representation provides a constant time for association and dissociation operations that are append-only operations with deferred merge and automatic filtering of deleted and duplicate entities at query time.

Type: Application

Filed: March 15, 2017

Publication date: September 20, 2018

Inventors: Rajesh M. Desai, Magesh Jayapandian, Iun V. Leong, Justo L. Perez, Roger C. Raphael, Gabriel Valencia

Extending a content repository using an auxiliary data store

Patent number: 9613041

Abstract: According to one embodiment of the present invention, a system extends a content repository by creating an auxiliary data store outside of the content repository and storing auxiliary data in the auxiliary data store, wherein the auxiliary data is associated with a collection of documents in the content repository. The system stores version information for the auxiliary data store and records of operations against the auxiliary data store in a log in the repository. In response to receiving a request for an operation against the auxiliary data store, the system determines that the auxiliary data store and repository are consistent based on the version information and applies the operation against the auxiliary data store. Embodiments of the present invention further include a method and computer program product for extending a content repository data model in substantially the same manners described above.

Type: Grant

Filed: October 3, 2013

Date of Patent: April 4, 2017

Assignee: International Business Machines Corporation

Inventors: Rajesh M. Desai, Magesh Jayapandian, Aidon P. Jennery, Justo L. Perez
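
The version check described in the abstract of patent 9613041 can be sketched in a few lines. The classes `Repository` and `AuxiliaryStore` and the functions below are hypothetical, and the `recover` step (replaying the repository log into a stale store) is an assumption about how the log might be used; the abstract only says that the log and version information are stored in the repository.

```python
class Repository:
    """Stands in for the content repository; holds the version and the log."""

    def __init__(self):
        self.aux_version = 0
        self.log = []   # records of operations against the auxiliary store


class AuxiliaryStore:
    """Data kept outside the repository, tagged with its own version."""

    def __init__(self):
        self.version = 0
        self.data = {}   # doc_id -> auxiliary value


def apply_operation(repo, aux, doc_id, value):
    """Apply a write only if the store and repository versions agree."""
    if aux.version != repo.aux_version:
        raise RuntimeError("auxiliary store is out of sync with the repository")
    repo.log.append((repo.aux_version + 1, doc_id, value))
    repo.aux_version += 1
    aux.data[doc_id] = value
    aux.version = repo.aux_version


def recover(repo, aux):
    """Assumed recovery: replay logged operations the store has not seen."""
    for version, doc_id, value in repo.log:
        if version > aux.version:
            aux.data[doc_id] = value
            aux.version = version


repo, aux = Repository(), AuxiliaryStore()
apply_operation(repo, aux, "doc-1", "reviewed")

stale = AuxiliaryStore()   # e.g. an auxiliary store restored from an old copy
recover(repo, stale)
apply_operation(repo, stale, "doc-2", "flagged")
```

Without the `recover` call, the last `apply_operation` would raise, because the stale store's version (0) would not match the version recorded in the repository (1).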

Extending a content repository using an auxiliary data store

Patent number: 9606998

Abstract: According to one embodiment of the present invention, a system extends a content repository by creating an auxiliary data store outside of the content repository and storing auxiliary data in the auxiliary data store, wherein the auxiliary data is associated with a collection of documents in the content repository. The system stores version information for the auxiliary data store and records of operations against the auxiliary data store in a log in the repository. In response to receiving a request for an operation against the auxiliary data store, the system determines that the auxiliary data store and repository are consistent based on the version information and applies the operation against the auxiliary data store. Embodiments of the present invention further include a method and computer program product for extending a content repository data model in substantially the same manners described above.

Type: Grant

Filed: June 6, 2014

Date of Patent: March 28, 2017

Assignee: International Business Machines Corporation

Inventors: Rajesh M. Desai, Magesh Jayapandian, Aidon P. Jennery, Justo L. Perez

Strategies for result set processing and presentation in search applications

Patent number: 9594813

Abstract: In searching electronic documents, prior to executing a query, a reviewer indicates whether a result set of the query will be dynamic or static. The query is then executed on the electronic documents to obtain an original result set, which is provided to the reviewer through a user interface. Upon determining that one or more changes to one or more of the electronic documents have occurred, and if the result set is static, then the original result set continues to be provided to the reviewer without re-executing the query. If the result set is dynamic, then the query is re-executed on the electronic documents to obtain an updated result set, and the updated result set is provided to the reviewer through the user interface. The original result set may be associated with a search session and/or may be a random sample of the electronic documents for an overview query.

Type: Grant

Filed: November 29, 2014

Date of Patent: March 14, 2017

Assignee: International Business Machines Corporation

Inventors: Rajesh M. Desai, Magesh Jayapandian, Aidon P. Jennery, Justo L. Perez
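
The static versus dynamic choice in the abstract of patent 9594813 can be shown with a short sketch. `ResultSetSession` is a hypothetical name, and as a simplification the dynamic mode re-executes the query on every call instead of first detecting that documents have changed.

```python
class ResultSetSession:
    """Runs a query once, then serves a static or dynamic result set."""

    def __init__(self, documents, predicate, dynamic):
        self.documents = documents   # shared, mutable doc_id -> text mapping
        self.predicate = predicate
        self.dynamic = dynamic       # chosen by the reviewer before execution
        self.original = self._execute()

    def _execute(self):
        return sorted(
            doc_id for doc_id, text in self.documents.items()
            if self.predicate(text)
        )

    def results(self):
        # Dynamic: re-execute against the current documents.
        # Static: keep returning the original result set.
        return self._execute() if self.dynamic else self.original


def has_merger(text):
    return "merger" in text


documents = {1: "merger draft", 2: "lunch menu"}
static_session = ResultSetSession(documents, has_merger, dynamic=False)
dynamic_session = ResultSetSession(documents, has_merger, dynamic=True)
documents[3] = "merger term sheet"   # a change made after the original query
```

After the new document is added, `static_session.results()` still returns the original result set, while `dynamic_session.results()` reflects the change.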

Strategies for result set processing and presentation in search applications

Patent number: 9589035

Abstract: In searching electronic documents, prior to executing a query, a reviewer indicates whether a result set of the query will be dynamic or static. The query is then executed on the electronic documents to obtain an original result set, which is provided to the reviewer through a user interface. Upon determining that one or more changes to one or more of the electronic documents have occurred, and if the result set is static, then the original result set continues to be provided to the reviewer without re-executing the query. If the result set is dynamic, then the query is re-executed on the electronic documents to obtain an updated result set, and the updated result set is provided to the reviewer through the user interface. The original result set may be associated with a search session and/or may be a random sample of the electronic documents for an overview query.

Type: Grant

Filed: March 3, 2014

Date of Patent: March 7, 2017

Assignee: International Business Machines Corporation

Inventors: Rajesh M. Desai, Magesh Jayapandian, Aidon P. Jennery, Justo L. Perez

Strategies for result set processing and presentation in search applications

Publication number: 20150248463

Abstract: In searching electronic documents, prior to executing a query, a reviewer indicates whether a result set of the query will be dynamic or static. The query is then executed on the electronic documents to obtain an original result set, which is provided to the reviewer through a user interface. Upon determining that one or more changes to one or more of the electronic documents have occurred, and if the result set is static, then the original result set continues to be provided to the reviewer without re-executing the query. If the result set is dynamic, then the query is re-executed on the electronic documents to obtain an updated result set, and the updated result set is provided to the reviewer through the user interface. The original result set may be associated with a search session and/or may be a random sample of the electronic documents for an overview query.

Type: Application

Filed: November 29, 2014

Publication date: September 3, 2015

Inventors: Rajesh M. Desai, Magesh Jayapandian, Aidon P. Jennery, Justo L. Perez

Strategies for result set processing and presentation in search applications

Publication number: 20150248464

Abstract: In searching electronic documents, prior to executing a query, a reviewer indicates whether a result set of the query will be dynamic or static. The query is then executed on the electronic documents to obtain an original result set, which is provided to the reviewer through a user interface. Upon determining that one or more changes to one or more of the electronic documents have occurred, and if the result set is static, then the original result set continues to be provided to the reviewer without re-executing the query. If the result set is dynamic, then the query is re-executed on the electronic documents to obtain an updated result set, and the updated result set is provided to the reviewer through the user interface. The original result set may be associated with a search session and/or may be a random sample of the electronic documents for an overview query.

Type: Application

Filed: March 3, 2014

Publication date: September 3, 2015

Applicant: International Business Machines Corporation

Inventors: Rajesh M. Desai, Magesh Jayapandian, Aidon P. Jennery, Justo L. Perez

Extending a content repository using an auxiliary data store

Publication number: 20150100550

Abstract: According to one embodiment of the present invention, a system extends a content repository by creating an auxiliary data store outside of the content repository and storing auxiliary data in the auxiliary data store, wherein the auxiliary data is associated with a collection of documents in the content repository. The system stores version information for the auxiliary data store and records of operations against the auxiliary data store in a log in the repository. In response to receiving a request for an operation against the auxiliary data store, the system determines that the auxiliary data store and repository are consistent based on the version information and applies the operation against the auxiliary data store. Embodiments of the present invention further include a method and computer program product for extending a content repository data model in substantially the same manners described above.

Type: Application

Filed: June 6, 2014

Publication date: April 9, 2015

Inventors: Rajesh M. Desai, Magesh Jayapandian, Aidon P. Jennery, Justo L. Perez

Extending a content repository using an auxiliary data store

Publication number: 20150100549

Abstract: According to one embodiment of the present invention, a system extends a content repository by creating an auxiliary data store outside of the content repository and storing auxiliary data in the auxiliary data store, wherein the auxiliary data is associated with a collection of documents in the content repository. The system stores version information for the auxiliary data store and records of operations against the auxiliary data store in a log in the repository. In response to receiving a request for an operation against the auxiliary data store, the system determines that the auxiliary data store and repository are consistent based on the version information and applies the operation against the auxiliary data store. Embodiments of the present invention further include a method and computer program product for extending a content repository data model in substantially the same manners described above.

Type: Application

Filed: October 3, 2013

Publication date: April 9, 2015

Applicant: International Business Machines Corporation

Inventors: Rajesh M. Desai, Magesh Jayapandian, Aidon P. Jennery, Justo L. Perez

Indexing multiple types of data to facilitate rapid re-indexing of one or more types of data

Publication number: 20110219008

Abstract: A method and indexing system indexes the content of a body of documents into a content index, and the metadata of the documents into a metadata index which is a parallel index to the content index. The metadata is copied into a data store that is easily accessible by the indexing system and is stored in native form. The indexing system can dynamically re-index the metadata from the native metadata in the data store to produce a new metadata index which is used to replace the original metadata index. Search queries received by a search engine associated with the indexing system are applied to both the content and metadata index and the results are merged for return.

Type: Application

Filed: March 8, 2010

Publication date: September 8, 2011

Applicant: International Business Machines Corporation

Inventors: David O. Been, Michael Busch, Osamu Furusawa, Frederick S. Grennan, Fumihiko Terui, Justo L. Perez