Vectors, Bitmaps Or Matrices (EPO) Patents (Class 707/E17.051)

Accelerated query operators for high-speed, in-memory online analytical processing queries and operations

Patent number: 8892586

Abstract: An additional data structure can be initialized for a column of compressed data to include a prefix storing, for each block of values in the column, a total number of bits set in previous blocks in the bit vector. A block number can be determined for a target block of the plurality of blocks, for example by checking whether or not a specified row number is located in the prefix. If the specified row number is located in the prefix, the prefix value of the prefix is returned, the most frequently occurring value is returned if a corresponding bit in the bit vector in the specified row number is not located in the prefix, or a position of the specified row in an index vector for the column is returned.

Type: Grant

Filed: December 23, 2011

Date of Patent: November 18, 2014

Assignee: SAP AG

Inventors: Christian Lemke, Tobias Mindnich, Christoph Weyerhaeuser, Franz Faerber, Kai-Uwe Sattler
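
One way to picture the lookup this abstract describes is a sparse-encoded column: the most frequent value is stored once, a bit vector marks the rows that hold some other value, and a per-block prefix count maps such a row to its position in the index vector. The sketch below is only an illustration under those assumptions, not the patented method; `BLOCK_SIZE`, `SparseColumn`, and `get` are hypothetical names.

```python
BLOCK_SIZE = 4  # hypothetical block size chosen for readability


class SparseColumn:
    """Column stored as: most frequent value + bit vector + index vector of exceptions."""

    def __init__(self, values):
        # The most frequent value is stored once and left out of the index vector.
        self.most_frequent = max(set(values), key=values.count)
        self.bits = [0 if v == self.most_frequent else 1 for v in values]
        self.index_vector = [v for v in values if v != self.most_frequent]
        # prefix[b] = total number of bits set in all blocks before block b.
        self.prefix = []
        running = 0
        for start in range(0, len(self.bits), BLOCK_SIZE):
            self.prefix.append(running)
            running += sum(self.bits[start:start + BLOCK_SIZE])

    def get(self, row):
        # Bit not set: the row holds the most frequent value.
        if not self.bits[row]:
            return self.most_frequent
        block = row // BLOCK_SIZE
        # Position = set bits in earlier blocks + set bits earlier in this block.
        position = self.prefix[block] + sum(self.bits[block * BLOCK_SIZE:row])
        return self.index_vector[position]
```

With the prefix in place, a single-row read inspects at most one block of the bit vector instead of counting set bits from the start of the column.
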
Columnwise Storage of Point Data

Publication number: 20130325874

Abstract: A database query of point data among two or more axes of a database is received. The database stores point data in distinct integer vectors with a shared dictionary. Thereafter, the dictionary is scanned to determine boundaries for each axis specified by the query. In response, results characterizing data responsive to the query within the determined boundaries for each axis are returned. Related apparatus, systems, techniques and articles are also described.

Type: Application

Filed: June 4, 2012

Publication date: December 5, 2013

Applicant: SAP AG

Inventors: Christoph Weyerhaeuser, Tobias Mindnich, Daniel Baeumges, Gerrit Simon Kazmaier
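
As a rough illustration of the idea in this abstract, the sketch below stores each axis of a point set as its own integer vector of value IDs that refer to one shared, sorted dictionary. A window query first searches the dictionary to turn the coordinate bounds into ID boundaries per axis, then filters the integer vectors. The function names and the use of binary search are assumptions made for the example, not details taken from the application.

```python
from bisect import bisect_left, bisect_right


def encode_points(points):
    """Build one shared sorted dictionary and one integer vector per axis."""
    dictionary = sorted({coord for point in points for coord in point})
    ids = {value: i for i, value in enumerate(dictionary)}
    x_vec = [ids[x] for x, _ in points]
    y_vec = [ids[y] for _, y in points]
    return dictionary, x_vec, y_vec


def window_query(dictionary, x_vec, y_vec, x_range, y_range):
    """Return points with x in x_range and y in y_range (both inclusive)."""
    # Translate coordinate bounds into value-ID boundaries for each axis.
    x_lo, x_hi = bisect_left(dictionary, x_range[0]), bisect_right(dictionary, x_range[1])
    y_lo, y_hi = bisect_left(dictionary, y_range[0]), bisect_right(dictionary, y_range[1])
    # Compare small integers only; decode through the dictionary for the output.
    return [
        (dictionary[x], dictionary[y])
        for x, y in zip(x_vec, y_vec)
        if x_lo <= x < x_hi and y_lo <= y < y_hi
    ]
```
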
INTRA-BLOCK PARTITIONING FOR DATABASE MANAGEMENT

Publication number: 20130325900

Abstract: A method for storing database information includes storing a table having data values in a column major order. The data values are stored in a list of blocks. The method also includes assigning a tuple sequence number (TSN) to each data value in each column of the table according to a sequence order in the table. The data values that correspond to each other across a plurality of columns of the table have equivalent TSNs. The method also includes assigning each data value to a partition based on a representation of the data value. The method also includes assigning a tuple map value to each data value. The tuple map value identifies the partition in which each data value is located.

Type: Application

Filed: May 31, 2012

Publication date: December 5, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ronald J. Barber, Min-Soo Kim, Sam S. Lightstone, Guy M. Lohman, Lin Qiao, Vijayshankar Raman, Eugene J. Shekita, Richard S. Sidle
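
A minimal sketch of the tuple-map idea for a single column follows. It assumes two partitions chosen by whether a value fits a small representation (the `small_limit` threshold is hypothetical) and uses the list position as the TSN. The application does not specify these details; they are chosen only to make the example concrete.

```python
def partition_column(values, small_limit=256):
    """Split one column into partitions and record a tuple map entry per TSN."""
    partitions = {0: [], 1: []}
    tuple_map = []
    for value in values:  # the position in `values` plays the role of the TSN
        part = 0 if value < small_limit else 1
        partitions[part].append(value)
        tuple_map.append(part)
    return partitions, tuple_map


def fetch(partitions, tuple_map, tsn):
    """Recover the value for a TSN using only the tuple map and the partitions."""
    part = tuple_map[tsn]
    # Offset inside the partition = earlier TSNs that went to the same partition.
    offset = tuple_map[:tsn].count(part)
    return partitions[part][offset]
```

Because every column of the table shares the same TSN sequence, the same TSN can be passed to `fetch` for each column to reassemble a row.
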
ACCELERATED QUERY OPERATORS FOR HIGH-SPEED, IN-MEMORY ONLINE ANALYTICAL PROCESSING QUERIES AND OPERATIONS

Publication number: 20130166566

Abstract: An additional data structure can be initialized for a column of compressed data to include a prefix storing, for each block of values in the column, a total number of bits set in previous blocks in the bit vector. A block number can be determined for a target block of the plurality of blocks, for example by checking whether or not a specified row number is located in the prefix. If the specified row number is located in the prefix, the prefix value of the prefix is returned, the most frequently occurring value is returned if a corresponding bit in the bit vector in the specified row number is not located in the prefix, or a position of the specified row in an index vector for the column is returned.

Type: Application

Filed: December 23, 2011

Publication date: June 27, 2013

Inventors: Christian Lemke, Tobias Mindnich, Christoph Weyerhaeuser, Franz Faerber, Kai-Uwe Sattler
SYSTEMS AND METHODS FOR QUERYING COLUMN ORIENTED DATABASES

Publication number: 20120323867

Abstract: Systems and methods for accessing data stored in a data array, mapping the data using a bitmap index, and processing data queries by determining positions of query attributes in the bitmap index and locating values corresponding to the positions in the data array are described herein.

Type: Application

Filed: August 27, 2012

Publication date: December 20, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bin He, Hui-I Hsiao
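
The query path in this abstract can be sketched with one bitmap per distinct attribute value: a query finds the set bit positions for the requested attribute and reads the matching entries from a data array. The sketch uses Python integers as bitmaps; all function names are hypothetical and the layout is an assumption, not the design claimed in the application.

```python
def build_bitmap_index(column):
    """Map each distinct value to an integer whose set bits are its row positions."""
    index = {}
    for position, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << position)
    return index


def positions(bitmap):
    """List the positions of the set bits, lowest first."""
    result = []
    position = 0
    while bitmap:
        if bitmap & 1:
            result.append(position)
        bitmap >>= 1
        position += 1
    return result


def select(data_array, bitmap):
    """Read the data array entries at the positions marked in the bitmap."""
    return [data_array[p] for p in positions(bitmap)]
```

Predicates combine with bitwise operators before any data is read, for example `index["NY"] | index["LA"]` for a disjunction.
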
SYSTEMS AND METHODS FOR QUERYING COLUMN ORIENTED DATABASES

Publication number: 20120303633

Abstract: Systems and methods for accessing data stored in a data array, mapping the data using a bitmap index, and processing data queries by determining positions of query attributes in the bitmap index and locating values corresponding to the positions in the data array are described herein.

Type: Application

Filed: May 26, 2011

Publication date: November 29, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bin He, Hui-I Hsiao
BIT STRING SEARCH APPARATUS, SEARCH METHOD, AND PROGRAM

Publication number: 20120239664

Abstract: To increase space efficiency of a coupled node tree, a branch node does not have an area that holds an array element number of an array element wherein is stored the primary node of the node pair that is the link target, and the root node is disposed in an array element with a node location number 1, and the primary node is disposed in an array element whose node location number is twice the node location number of the branch node. The node location number of an array element wherein is disposed a link target node is obtained by adding the bit value in the search key at the discrimination bit position for the link source branch node to twice the node location number of the branch node.

Type: Application

Filed: May 30, 2012

Publication date: September 20, 2012

Applicant: S. Grants Co., Ltd.

Inventors: Toshio Shinjo, Mitsuhiro Kokubun
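
The location arithmetic in this abstract can be illustrated directly: with the root at node location number 1, the link target of a branch node at location n is 2n plus the search-key bit at that node's discrimination bit position, so no child pointer is stored. The node representation below (tuples in a dict) is a hypothetical stand-in for the array described in the application.

```python
def search(nodes, key_bits):
    """Follow the implicit links from the root (location 1) down to a leaf.

    `nodes` maps a node location number to either
    ("branch", discrimination_bit_position) or ("leaf", stored_key_bits).
    """
    location = 1  # the root node sits at node location number 1
    while nodes[location][0] == "branch":
        bit_position = nodes[location][1]
        # Link target = twice the branch location plus the key bit at that position.
        location = 2 * location + int(key_bits[bit_position])
    return nodes[location][1]


# Tiny hand-built tree holding the keys "00", "01" and "10".
nodes = {
    1: ("branch", 0),
    2: ("branch", 1),   # primary node of the root's node pair (2 * 1)
    3: ("leaf", "10"),  # 2 * 1 + 1
    4: ("leaf", "00"),  # 2 * 2
    5: ("leaf", "01"),  # 2 * 2 + 1
}
```

As in other radix-style trees, the caller still compares the returned key with the search key, since a key that is absent from the tree also ends at some leaf.
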
System and method for structuring data in a storage device

Patent number: 8255424

Abstract: A system, storage medium, and method for structuring data are provided. The system connects to a storage device that stores original data. The method obtains the original data from the storage device, and stores the original data in the form of character strings into a buffer memory according to end of file-line (EOF) tags. The method further constructs data arrays to store the character strings, and arranges each of the data arrays into a data matrix. In addition, the method classifies each of the data arrays in the data matrix according to properties of the character strings, arranges the classified data arrays into a data file, and stores the data file into the buffer memory.

Type: Grant

Filed: September 14, 2009

Date of Patent: August 28, 2012

Assignee: Hon Hai Precision Industry Co., Ltd.

Inventors: Shen-Chun Li, Yung-Chieh Chen, Shou-Kuo Hsu
System, method and apparatus for information retrieval and data representation

Patent number: 8244711

Abstract: A system, method, and apparatus for information retrieval are provided. Embodiments of the present invention may generate data structures that may be used to process user queries. According to embodiments of the present invention, a processor component configured to perform the operations of an indexing module and a storage module, the indexing module configured to generate a term list and a term-file matrix from information stored on the storage module, the indexing module further configured to generate an adjacency matrix from the one or more files, wherein the adjacency matrix represents a relationship of the one or more terms in each of the one or more files; and the indexing module further configured to generate a probability matrix using the adjacency matrix and a one-step or two-step random walk.

Type: Grant

Filed: September 28, 2009

Date of Patent: August 14, 2012

Inventor: Chin Lung Fong
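
To make the matrices named in this abstract concrete, the sketch below derives a term-by-term adjacency matrix from a term-file matrix by counting shared files, then turns it into one-step and two-step random-walk probabilities. Treating adjacency as co-occurrence and normalizing each row are assumptions made for the example; the patent may define these relationships differently.

```python
def cooccurrence(term_file):
    """term_file[t][f] is 1 if term t appears in file f; returns term adjacency counts."""
    n = len(term_file)
    return [
        [0 if i == j else sum(a * b for a, b in zip(term_file[i], term_file[j]))
         for j in range(n)]
        for i in range(n)
    ]


def one_step(adjacency):
    """Row-normalize the adjacency matrix into one-step transition probabilities."""
    probabilities = []
    for row in adjacency:
        total = sum(row)
        probabilities.append([value / total if total else 0.0 for value in row])
    return probabilities


def two_step(adjacency):
    """Two-step walk: multiply the one-step probability matrix by itself."""
    p = one_step(adjacency)
    n = len(p)
    return [[sum(p[i][k] * p[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
```
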
ENCODED MATRIX INDEX

Publication number: 20120179712

Abstract: A method, apparatus, and program product are provided for creating an Encoded Matrix Index for a column in a database table. An element of the column for all rows in the database table is compared to a corresponding reference value in a reference data structure, and in response to at least one value for the element of the column not matching the reference value, indicating a variation in a variation data structure and creating a value data structure. Queries executed using the Encoded Matrix Index include terms associated with a sub-column defined in a column of a database table. The variation data structure is accessed to determine whether any variation exists between rows belonging to a sub-column of the database table. If no variation exists, a value is accessed from the reference data structure; otherwise, a value for each row of the sub-column is accessed from a value data structure.

Type: Application

Filed: March 16, 2012

Publication date: July 12, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Robert J. Bestgen, Thomas J. Eggebraaten, Jeffrey W. Tenner
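
The three structures named in this abstract can be sketched for a column of equal-length strings, treating each character position as one element. The reference structure holds one value per position, the variation structure flags positions where any row differs, and the value structure keeps per-row values only for flagged positions. Taking the reference from the first row and the function names used here are assumptions for illustration only.

```python
def build_index(rows):
    """rows: equal-length strings; each character position is one element."""
    reference = list(rows[0])  # assumption: reference values come from the first row
    variation = [False] * len(reference)
    values = {}
    for pos, ref_char in enumerate(reference):
        column_chars = [row[pos] for row in rows]
        if any(ch != ref_char for ch in column_chars):
            variation[pos] = True       # at least one row differs from the reference
            values[pos] = column_chars  # keep per-row values only where needed
    return reference, variation, values


def read_sub_column(index, start, stop, row):
    """Read positions [start, stop) of one row through the index."""
    reference, variation, values = index
    return "".join(
        values[pos][row] if variation[pos] else reference[pos]
        for pos in range(start, stop)
    )
```

A query on a sub-column with no flagged positions is answered from the reference structure alone, without touching any per-row data.
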
Period to date functions for time intelligence functionality

Patent number: 8122038

Abstract: A system for extending a Time Intelligence language to provide support for period-to-date functions and for generating member sets in response to data queries is provided. The system may apply member aggregation functions and queries across a plurality of heterogeneous data sources. Each data source is aligned to a reference dimension and is said to organize data according to at least one level of granularity. In some embodiments, a member aggregation function specifies a period (e.g., year, quarter, month) and retrieves data from a data source starting with the current specified period and ending with the most recently completed period equal to the granularity of the data source. The system may allow a user to further customize a member aggregation function by specifying a granularity, a period offset, or a granularity end offset. Additionally, the system may generate a caption to display in association with the retrieved data.

Type: Grant

Filed: June 15, 2009

Date of Patent: February 21, 2012

Assignee: Microsoft Corporation

Inventors: Steve Handy, Catalin Tomai, Chen-I Lim
EFFICIENT AND SCALABLE DATA EVOLUTION WITH COLUMN ORIENTED DATABASES

Publication number: 20110270844

Abstract: A method, system and program product for data evolution on column oriented databases is disclosed. For an input evolution operation, reusable and non-reusable attributes are identified. For attributes in a target schema that cannot be reused from the source schema, data and bitmap indexes of those attributes are generated from source data and bitmap indexes. A decompose operation is disclosed for decomposing a table into two tables. A merge operation is disclosed in which only one input table can be reused for mergence. A second merge operation is disclosed in which both input tables cannot be reused for mergence.

Type: Application

Filed: May 3, 2010

Publication date: November 3, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bin He, Hui-I Hsiao
SEMANTIC OBJECT CHARACTERIZATION AND SEARCH

Publication number: 20110213784

Abstract: Semantic object characterization and its use in indexing and searching a database directory is presented. In general, a first binary hash code is generated to represent a first representation or view of a semantic object which when compared to a characterized version of a second representation or view of the same semantic object in the form of a second binary hash code, the first and second binary hash codes exhibit a degree of similarity indicative of the objects being the same object. In one implementation the semantic objects correspond to peoples' names and the first and second representations or views correspond to two different languages. Thus, a user can search a database of information in one language with a search query in another language.

Type: Application

Filed: March 1, 2010

Publication date: September 1, 2011

Applicant: Microsoft Corporation

Inventors: Uppinakuduru Raghavendra Udupa, Shaishav Kumar
EFFICIENT ACCESS OF BITMAP ARRAY WITH HUGE USAGE VARIANCE ALONG LINEAR FASHION, USING POINTERS

Publication number: 20100169322

Abstract: A system and method for locating an unallocated bit in a bitmap array includes traversing the bitmap array using a plurality of pointers to locate a unit. The unit includes a plurality of entities and at least one of the plurality of entities is unallocated. The method further includes traversing the at least one of the plurality of unallocated entities in the unit to obtain an unallocated entity. The unit is associated with at least one pointer, and the at least one pointer is associated with a plurality of threshold values and a fill count, the fill count being less than a maximum fill count of the bitmap array.

Type: Application

Filed: December 26, 2008

Publication date: July 1, 2010

Applicant: SUN MICROSYSTEMS, INC.

Inventor: Parthasarathy Selvaraj
SYSTEM AND METHOD FOR IMPROVED GRID PROCESSING

Publication number: 20100114909

Abstract: A method and system for improved processing of volumetric data. The method includes encoding the volumetric data into a plurality of blocks, wherein each block is associated with: a block topology denoting a relative location of the block within the volumetric data and a set of elements, and each element is associated with: an element topology denoting a relative location of the element within the associated block and a data value. The method includes encoding each block into a value table and an element bit-mask, wherein the value table stores element values, and the element bit-mask indicates non-zero element values. The method includes randomly accessing an element value, further comprising: determining a selected block containing the element value from the element coordinate, computing a value table offset from the element coordinate, and accessing the element value in the value table with the value table offset.

Type: Application

Filed: November 5, 2008

Publication date: May 6, 2010

Inventor: Ken Museth
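
A small sketch of the block encoding in this abstract: each block keeps a bit-mask of its non-zero elements and a compact value table, and random access derives the block and the value-table offset from the element coordinate. Computing the offset as the number of set mask bits below the element's bit is an assumption made for this example, as are the block edge length and the function names.

```python
BLOCK = 2  # hypothetical block edge length, so each block holds 2 * 2 * 2 elements


def _locate(coord):
    """Split a coordinate into (block key, bit position inside the block)."""
    x, y, z = coord
    block_key = (x // BLOCK, y // BLOCK, z // BLOCK)
    lx, ly, lz = x % BLOCK, y % BLOCK, z % BLOCK
    return block_key, lx + BLOCK * (ly + BLOCK * lz)


def encode(dense):
    """Encode {(x, y, z): value} into {block key: (bit_mask, value_table)}."""
    blocks = {}
    # Visit elements in (block, bit) order so each value table follows bit order.
    for coord, value in sorted(dense.items(), key=lambda item: _locate(item[0])):
        if value == 0:
            continue
        block_key, bit = _locate(coord)
        mask, table = blocks.get(block_key, (0, []))
        blocks[block_key] = (mask | (1 << bit), table + [value])
    return blocks


def lookup(blocks, coord):
    """Random access: zero unless the element's mask bit is set."""
    block_key, bit = _locate(coord)
    if block_key not in blocks:
        return 0
    mask, table = blocks[block_key]
    if not (mask >> bit) & 1:
        return 0
    # Offset into the value table = number of set bits below this element's bit.
    offset = bin(mask & ((1 << bit) - 1)).count("1")
    return table[offset]
```
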
BITMAP DEVICE IDENTIFICATION IN A WIRELESS COMMUNICATION SYSTEM

Publication number: 20090327239

Abstract: A method and apparatus for identifying devices using a bitmap are disclosed. In one embodiment, the method comprises: accessing a memory to obtain one or more bitmaps that map each bit location in the bitmap to an index value, where one index value is assigned to each remote wireless media device of a wireless network in a wireless communication system; and examining those bitmaps and determining at least one characteristic of a remote wireless media device and a device identifier that identifies the remote wireless media device in the wireless network based on the bit position in those bitmaps.

Type: Application

Filed: June 30, 2008

Publication date: December 31, 2009

Inventors: In Sung Cho, Kumar Mahesh, Prakash Kamath, Jeffrey Gilbert, Rob Frizzell
Method for Estimating the Number of Distinct Values in a Partitioned Dataset

Publication number: 20090192980

Abstract: The task of estimating the number of distinct values (DVs) in a large dataset arises in a wide variety of settings in computer science and elsewhere. The present invention provides synopses for DV estimation in the setting of a partitioned dataset, as well as corresponding DV estimators that exploit these synopses. Whenever an output compound data partition is created via a multiset operation on a pair of (possibly compound) input partitions, the synopsis for the output partition can be obtained by combining the synopses of the input partitions. If the input partitions are compound partitions, it is not necessary to access the synopses for all the base partitions that were used to construct the input partitions. Superior (in certain cases near-optimal) accuracy in DV estimates is maintained, especially when the synopsis size is small. The synopses can be created in parallel, and can also handle deletions of individual partition elements.

Type: Application

Filed: January 30, 2008

Publication date: July 30, 2009

Applicant: International Business Machines Corporation

Inventors: Kevin Scott Beyer, Rainer Gemulla, Peter Jay Haas, Berthold Reinwald, John Sismanis
TEXT CATEGORIZATION WITH KNOWLEDGE TRANSFER FROM HETEROGENEOUS DATASETS

Publication number: 20090171956

Abstract: The present invention provides a method for incorporating features from heterogeneous auxiliary datasets into input text data for use in classification. A plurality of heterogeneous auxiliary datasets, such as labeled datasets and unlabeled datasets, are accessed after receiving input text data. A plurality of features are extracted from each of the plurality of heterogeneous auxiliary datasets. The plurality of features are combined with the input text data to generate a set of features which may potentially be used to classify the input text data. Classification features are then extracted from the set of features and used to classify the input text data. In one embodiment, the classification features are extracted by calculating a mutual information value associated with each feature in the set of features and identifying features having a mutual information value exceeding a threshold value.

Type: Application

Filed: October 10, 2008

Publication date: July 2, 2009

Inventors: Rakesh Gupta, Lev Ratinov
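
The selection step mentioned for one embodiment, keeping features whose mutual information exceeds a threshold, can be sketched as follows. The sketch computes mutual information between a discrete feature and the class label from empirical counts. The function names, the base-2 logarithm, and the toy data in the usage note are assumptions, not details from the application.

```python
import math
from collections import Counter


def mutual_information(feature_values, labels):
    """Empirical mutual information (in bits) between a discrete feature and the labels."""
    n = len(labels)
    joint = Counter(zip(feature_values, labels))
    feature_counts = Counter(feature_values)
    label_counts = Counter(labels)
    mi = 0.0
    for (f, l), count in joint.items():
        # p(f, l) * log2( p(f, l) / (p(f) * p(l)) ), written with raw counts.
        mi += (count / n) * math.log2(count * n / (feature_counts[f] * label_counts[l]))
    return mi


def select_features(features, labels, threshold):
    """Keep the names of features whose mutual information exceeds the threshold."""
    return [
        name for name, values in features.items()
        if mutual_information(values, labels) > threshold
    ]
```

For example, with labels `[1, 1, 0, 0]`, a feature `[1, 1, 0, 0]` carries one full bit of information about the label and is kept at a threshold of 0.5, while a feature `[1, 0, 1, 0]` carries none and is dropped.
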
METHOD USING SLP PACKING WITH STATEMENTS HAVING BOTH ISOMORPHIC AND NON-ISOMORPHIC EXPRESSIONS

Publication number: 20090171919

Abstract: A computer implemented method is provided for using SLP in processing a plurality of statements, wherein the statements are associated with an array having a number of array positions, and each statement includes one or more expressions. The method includes the step of gathering expressions for each of the statements into a structure comprising a single merge stream, the merge streams being furnished with a location for each expression, wherein the location for a given expression is associated with one of the array positions. The method further comprises selectively identifying a plurality of expressions, and applying SLP packing operations to the identified expressions, in order to merge respective identified expressions into one or more isomorphic sub-streams. The method further comprises selectively combining the expressions of the isomorphic sub-streams, and other expressions of the single merge stream, into a number of input vectors that are substantially equal in length to one another.

Type: Application

Filed: December 26, 2007

Publication date: July 2, 2009

Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu
System And Method For Developing Strategic Scenario

Publication number: 20090089106

Abstract: Provided are a system and method for developing a strategic scenario.

Type: Application

Filed: October 27, 2005

Publication date: April 2, 2009

Inventors: Jeong-Seok Park, Kyung-Seok Ryu, Jee-Hyung Lee
System and Method of Self-Learning Conceptual Mapping to Organize and Interpret Data

Publication number: 20090049067

Abstract: In a computer implemented method of researching textual data sources, textual data is reduced to a plurality of distinctive words based on frequency of usage within the textual data. The distinctive words are converted into first numeric representations of vectors containing random numbers. A first self-organizing map is formed from the first numeric representations and organized by similarities between the vectors. A second self-organizing map is formed from second numeric representations generated from the organization of the first self-organizing map. The second numeric representations are vectors derived from the first self-organizing map. The vectors are used to train the second self-organizing map. The vectors derived from the first self-organizing map are organized into clusters of similarities between the vectors on the second self-organizing map. Dialectic arguments are formed from the second self-organizing map to interpret the textual data.

Type: Application

Filed: October 27, 2008

Publication date: February 19, 2009

Applicant: KINETX, INC.

Inventor: Jonathan Murray
INFORMATION RETRIEVAL APPARATUS AND INFORMATION RETRIEVAL METHOD

Publication number: 20080114754

Abstract: Retrieval information that is a retrieval target, acquired from an information source, is arranged in a vector space. Data on usage information and content specifying the manner of usage, used by the user in the past, is acquired by a usage-information acquisition unit. A user's preference extraction unit extracts a preference of a user from the data, and notifies a number-of-effective-elements reduction unit of the extracted data. The number-of-effective-elements reduction unit evaluates each element of a vector of retrieval information by using the preference of the user, and reduces the number of effective elements by removing the elements smaller than a certain criterion.

Type: Application

Filed: August 29, 2007

Publication date: May 15, 2008

Applicant: Fujitsu Limited

Inventors: Akira Karasudani, Takahiro Matsuda
Data Structure And Management System For A Superset Of Relational Databases

Publication number: 20080027980

Abstract: A computer readable medium for validating data is disclosed. The computer readable medium provides a computer readable medium with instructions executable by a computer to cause the computer to perform the following functions: receive a subjective representation of an address; re-format the subjective representation of the address according to a set of standardization rules; locate one or more candidate representations of the address from source data by recognizing that a preferred token is present among any of the one or more candidate representations of the address; select a preferred representation of the address from among the one or more candidate representations of the address based on the presence of the preferred token; and communicate the preferred representation of the address to an interface.

Type: Application

Filed: August 13, 2007

Publication date: January 31, 2008

Applicant: United Parcel Service of America, Inc.

Inventors: Timothy Owens, Bruce Harrison