Vectors, Bitmaps Or Matrices (epo) Patents (Class 707/E17.051)
-
Patent number: 8892586
Abstract: An additional data structure can be initialized for a column of compressed data to include a prefix storing, for each block of values in the column, a total number of bits set in previous blocks in the bit vector. A block number can be determined for a target block of the plurality of blocks, for example by checking whether or not a specified row number is located in the prefix. If the specified row number is located in the prefix, the prefix value of the prefix is returned, the most frequently occurring value is returned if a corresponding bit in the bit vector in the specified row number is not located in the prefix, or a position of the specified row in an index vector for the column is returned.
Type: Grant
Filed: December 23, 2011
Date of Patent: November 18, 2014
Assignee: SAP AG
Inventors: Christian Lemke, Tobias Mindnich, Christoph Weyerhaeuser, Franz Faerber, Kai-Uwe Sattler
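The per-block prefix counts mentioned in this abstract can be illustrated with a small sketch. This is only one possible reading, not the patented method: the block size, the function names (`encode`, `build_block_prefix`, `lookup`), and the use of plain Python lists for the bit vector and index vector are all assumptions made for the example.

```python
from collections import Counter

BLOCK_SIZE = 4  # assumed block size, chosen only for this example


def encode(column):
    """Split a column into (most frequent value, bit vector, index vector)."""
    most_frequent = Counter(column).most_common(1)[0][0]
    # A set bit marks a row whose value is NOT the most frequent one.
    bits = [0 if value == most_frequent else 1 for value in column]
    index_vector = [value for value in column if value != most_frequent]
    return most_frequent, bits, index_vector


def build_block_prefix(bits, block_size=BLOCK_SIZE):
    """For each block, store the number of bits set in all previous blocks."""
    prefix, running = [], 0
    for start in range(0, len(bits), block_size):
        prefix.append(running)
        running += sum(bits[start:start + block_size])
    return prefix


def lookup(row, most_frequent, bits, index_vector, prefix, block_size=BLOCK_SIZE):
    """Return the value stored at a row without decompressing the column."""
    if not bits[row]:
        return most_frequent
    block = row // block_size
    # Bits set before this row = prefix count + bits set earlier in the same block.
    position = prefix[block] + sum(bits[block * block_size:row])
    return index_vector[position]
```

With the prefix in place, a lookup only counts bits inside a single block rather than scanning the bit vector from the start.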
-
Publication number: 20130325874
Abstract: A database query of point data among two or more axes of a database is received. The database stores point data in distinct integer vectors with a shared dictionary. Thereafter, the dictionary is scanned to determine boundaries for each axis specified by the query. In response, results characterizing data responsive to the query within the determined boundaries for each axis are returned. Related apparatus, systems, techniques and articles are also described.
Type: Application
Filed: June 4, 2012
Publication date: December 5, 2013
Applicant: SAP AG
Inventors: Christoph Weyerhaeuser, Tobias Mindnich, Daniel Baeumges, Gerrit Simon Kazmaier
-
Publication number: 20130325900
Abstract: A method for storing database information includes storing a table having data values in a column major order. The data values are stored in a list of blocks. The method also includes assigning a tuple sequence number (TSN) to each data value in each column of the table according to a sequence order in the table. The data values that correspond to each other across a plurality of columns of the table have equivalent TSNs. The method also includes assigning each data value to a partition based on a representation of the data value. The method also includes assigning a tuple map value to each data value. The tuple map value identifies the partition in which each data value is located.
Type: Application
Filed: May 31, 2012
Publication date: December 5, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Ronald J. Barber, Min-Soo Kim, Sam S. Lightstone, Guy M. Lohman, Lin Qiao, Vijayshankar Raman, Eugene J. Shekita, Richard S. Sidle
-
Publication number: 20130166566
Abstract: An additional data structure can be initialized for a column of compressed data to include a prefix storing, for each block of values in the column, a total number of bits set in previous blocks in the bit vector. A block number can be determined for a target block of the plurality of blocks, for example by checking whether or not a specified row number is located in the prefix. If the specified row number is located in the prefix, the prefix value of the prefix is returned, the most frequently occurring value is returned if a corresponding bit in the bit vector in the specified row number is not located in the prefix, or a position of the specified row in an index vector for the column is returned.
Type: Application
Filed: December 23, 2011
Publication date: June 27, 2013
Inventors: Christian Lemke, Tobias Mindnich, Christoph Weyerhaeuser, Franz Faerber, Kai-Uwe Sattler
-
Publication number: 20120323867
Abstract: Systems and methods for accessing data stored in a data array, mapping the data using a bitmap index, and processing data queries by determining positions of query attributes in the bitmap index and locating values corresponding to the positions in the data array are described herein.
Type: Application
Filed: August 27, 2012
Publication date: December 20, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Bin He, Hui-I Hsiao
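The general idea of answering a query through a bitmap index and then reading values from a data array can be sketched as below. This is a generic illustration rather than the method claimed in the application; the function names and the choice of Python integers as bitmaps are assumptions.

```python
def build_bitmap_index(attribute_column):
    """Map each distinct attribute value to an integer used as a bitmap.

    Bit i of a bitmap is set when row i holds that attribute value.
    """
    index = {}
    for position, value in enumerate(attribute_column):
        index[value] = index.get(value, 0) | (1 << position)
    return index


def set_positions(bitmap):
    """Return the positions of all set bits, lowest first."""
    positions, position = [], 0
    while bitmap:
        if bitmap & 1:
            positions.append(position)
        bitmap >>= 1
        position += 1
    return positions


def query(index, attribute_value, data_array):
    """Find matching positions in the index, then read them from the data array."""
    bitmap = index.get(attribute_value, 0)
    return [data_array[p] for p in set_positions(bitmap)]
```

For example, an index over a hypothetical `color` column can be used to fetch the matching entries of a separate `price` array without scanning the `color` column again.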
-
Publication number: 20120303633
Abstract: Systems and methods for accessing data stored in a data array, mapping the data using a bitmap index, and processing data queries by determining positions of query attributes in the bitmap index and locating values corresponding to the positions in the data array are described herein.
Type: Application
Filed: May 26, 2011
Publication date: November 29, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Bin He, Hui-I Hsiao
-
Publication number: 20120239664
Abstract: To increase space efficiency of a coupled node tree, a branch node does not have an area that holds an array element number of an array element wherein is stored the primary node of the node pair that is the link target, and the root node is disposed in an array element with a node location number 1, and the primary node is disposed in an array element whose node location number is twice the node location number of the branch node. The node location number of an array element wherein is disposed a link target node is obtained by adding the bit value in the search key at the discrimination bit position for the link source branch node to twice the node location number of the branch node.
Type: Application
Filed: May 30, 2012
Publication date: September 20, 2012
Applicant: S. Grants Co., Ltd.
Inventors: Toshio Shinjo, Mitsuhiro Kokubun
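The pointer-free addressing rule in this abstract (root at location 1, link target at twice the branch location plus the key bit) can be shown with a short sketch. The `Node` class, the dictionary standing in for the array, and the bit-string keys are assumptions made for the example; the patent's actual node layout is not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    disc_bit: Optional[int] = None  # branch node: bit position to test
    key: Optional[str] = None       # leaf node: stored key as a bit string


def search(tree: Dict[int, Node], search_key: str) -> str:
    """Walk from the root (location 1) to a leaf using only arithmetic."""
    location = 1
    node = tree[location]
    while node.key is None:
        bit = int(search_key[node.disc_bit])
        # Link target = twice the branch location + bit at the discrimination position.
        location = 2 * location + bit
        node = tree[location]
    return node.key


# Hypothetical tree holding the keys 000, 010 and 011.
example_tree = {
    1: Node(disc_bit=1),
    2: Node(key="000"),
    3: Node(disc_bit=2),
    6: Node(key="010"),
    7: Node(key="011"),
}
```

Because a child location is computed from the parent location, branch nodes do not need to store the array element number of their node pair.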
-
Patent number: 8255424
Abstract: A system, storage medium, and method for structuring data are provided. The system connects to a storage device that stores original data. The method obtains the original data from the storage device, and stores the original data in the form of character strings into a buffer memory according to end of file-line (EOF) tags. The method further constructs data arrays to store the character strings, and arranges each of the data arrays into a data matrix. In addition, the method classifies each of the data arrays in the data matrix according to properties of the character strings, arranges the classified data arrays into a data file, and stores the data file into the buffer memory.
Type: Grant
Filed: September 14, 2009
Date of Patent: August 28, 2012
Assignee: Hon Hai Precision Industry Co., Ltd.
Inventors: Shen-Chun Li, Yung-Chieh Chen, Shou-Kuo Hsu
-
Patent number: 8244711
Abstract: A system, method, and apparatus for information retrieval are provided. Embodiments of the present invention may generate data structures that may be used to process user queries. According to embodiments of the present invention, a processor component configured to perform the operations of an indexing module and a storage module, the indexing module configured to generate a term list and a term-file matrix from information stored on the storage module, the indexing module further configured to generate an adjacency matrix from the one or more files, wherein the adjacency matrix represents a relationship of the one or more terms in each of the one or more files; and the indexing module further configured to generate a probability matrix using the adjacency matrix and a one-step or two-step random walk.
Type: Grant
Filed: September 28, 2009
Date of Patent: August 14, 2012
Inventor: Chin Lung Fong
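One common way to derive a probability matrix from an adjacency matrix with a one-step or two-step random walk is to row-normalize the adjacency matrix and, for two steps, multiply the result by itself. The sketch below assumes that construction; the abstract does not say how the patent computes its probability matrix, and the function names are hypothetical.

```python
def row_normalize(adjacency):
    """Turn each row of counts into transition probabilities."""
    result = []
    for row in adjacency:
        total = sum(row)
        result.append([x / total if total else 0.0 for x in row])
    return result


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def random_walk_matrix(adjacency, steps=1):
    """Probability of reaching term j from term i in one or two steps."""
    if steps not in (1, 2):
        raise ValueError("only one-step and two-step walks are sketched here")
    one_step = row_normalize(adjacency)
    return one_step if steps == 1 else matmul(one_step, one_step)
```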
-
Publication number: 20120179712
Abstract: A method, apparatus, and program product are provided for creating an Encoded Matrix Index for a column in a database table. An element of the column for all rows in the database table is compared to a corresponding reference value in a reference data structure, and in response to at least one value for the element of the column not matching the reference value, indicating a variation in a variation data structure and creating a value data structure. Queries executed using the Encoded Matrix Index include terms associated with a sub-column defined in a column of a database table. The variation data structure is accessed to determine whether any variation exists between rows belonging to a sub-column of the database table. If no variation exists, a value is accessed from the reference data structure; otherwise, a value for each row of the sub-column is accessed from a value data structure.
Type: Application
Filed: March 16, 2012
Publication date: July 12, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Robert J. Bestgen, Thomas J. Eggebraaten, Jeffrey W. Tenner
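The interplay of the reference, variation, and value data structures can be sketched as follows. The example assumes that each character position of a fixed-width string column is one element and that the reference value comes from the first row; both are simplifications chosen for illustration, and the function names are hypothetical.

```python
def build_encoded_matrix_index(rows):
    """Build (reference, variation, values) for equal-length string rows."""
    reference = rows[0]  # assumed: first row supplies the reference values
    width = len(reference)
    # variation[i] is True when at least one row differs from the reference at i.
    variation = [any(row[i] != reference[i] for row in rows) for i in range(width)]
    # Per-row values are kept only for elements that actually vary.
    values = {i: [row[i] for row in rows] for i in range(width) if variation[i]}
    return reference, variation, values


def read_element(index, row_number, element):
    """Read one element, consulting per-row values only when a variation exists."""
    reference, variation, values = index
    if not variation[element]:
        return reference[element]
    return values[element][row_number]
```

Elements that never vary cost one reference value for the whole table, which is where the space saving in this sketch comes from.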
-
Patent number: 8122038
Abstract: A system for extending a Time Intelligence language to provide support for period-to-date functions and for generating member sets in response to data queries is provided. The system may apply member aggregation functions and queries across a plurality of heterogeneous data sources. Each data source is aligned to a reference dimension and is said to organize data according to at least one level of granularity. In some embodiments, a member aggregation function specifies a period (e.g., year, quarter, month) and retrieves data from a data source starting with the current specified period and ending with the most recently completed period equal to the granularity of the data source. The system may allow a user to further customize a member aggregation function by specifying a granularity, a period offset, or a granularity end offset. Additionally, the system may generate a caption to display in association with the retrieved data.
Type: Grant
Filed: June 15, 2009
Date of Patent: February 21, 2012
Assignee: Microsoft Corporation
Inventors: Steve Handy, Catalin Tomai, Chen-I Lim
-
Publication number: 20110270844
Abstract: A method, system and program product for data evolution on column oriented databases is disclosed. For an input evolution operation, reusable and non-reusable attributes are identified. For attributes in a target schema that cannot be reused from the source schema, data and bitmap indexes of those attributes are generated from source data and bitmap indexes. A decompose operation is disclosed for decomposing a table into two tables. A merge operation is disclosed in which only one input table can be reused for mergence. A second merge operation is disclosed in which both input tables cannot be reused for mergence.
Type: Application
Filed: May 3, 2010
Publication date: November 3, 2011
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: BIN HE, HUI-I HSIAO
-
Publication number: 20110213784
Abstract: Semantic object characterization and its use in indexing and searching a database directory is presented. In general, a first binary hash code is generated to represent a first representation or view of a semantic object which when compared to a characterized version of a second representation or view of the same semantic object in the form of a second binary hash code, the first and second binary hash codes exhibit a degree of similarity indicative of the objects being the same object. In one implementation the semantic objects correspond to peoples' names and the first and second representations or views correspond to two different languages. Thus, a user can search a database of information in one language with a search query in another language.
Type: Application
Filed: March 1, 2010
Publication date: September 1, 2011
Applicant: Microsoft Corporation
Inventors: Uppinakuduru Raghavendra Udupa, Shaishav Kumar
-
Publication number: 20100169322
Abstract: A system and method for locating an unallocated bit in a bitmap array includes traversing the bitmap array using a plurality of pointers to locate a unit. The unit includes a plurality of entities and at least one of the plurality of entities is unallocated. The method further includes traversing the at least one of the plurality of unallocated entities in the unit to obtain an unallocated entity. The unit is associated with at least one pointer, and the at least one pointer is associated with a plurality of threshold values and a fill count, the fill count being less than a maximum fill count of the bitmap array.
Type: Application
Filed: December 26, 2008
Publication date: July 1, 2010
Applicant: SUN MICROSYSTEMS, INC.
Inventor: Parthasarathy Selvaraj
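The two-level search described here (first find a unit that is not full, then find a free bit inside it) can be sketched in simplified form. The version below keeps only a per-unit fill count and omits the pointers and threshold values mentioned in the abstract; the function name and data layout are assumptions.

```python
def find_unallocated_bit(units, fill_counts):
    """Return (unit number, offset) of the first unallocated bit, or None.

    units:       list of equal-sized lists of 0/1 flags (1 = allocated)
    fill_counts: number of allocated bits in each unit
    """
    for unit_number, (unit, fill) in enumerate(zip(units, fill_counts)):
        if fill == len(unit):
            continue  # unit is full, skip it without scanning its bits
        for offset, bit in enumerate(unit):
            if bit == 0:
                return unit_number, offset
    return None
```

Keeping a fill count per unit lets the search skip full units in constant time instead of scanning every bit.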
-
Publication number: 20100114909
Abstract: A method and system for improved processing of volumetric data. The method includes encoding the volumetric data into a plurality of blocks, wherein each block is associated with: a block topology denoting a relative location of the block within the volumetric data and a set of elements, and each element is associated with: an element topology denoting a relative location of the element within the associated block and a data value. The method includes encoding each block into a value table and an element bit-mask, wherein the value table stores element values, and the element bit-mask indicates non-zero element values. The method includes randomly accessing an element value, further comprising: determining a selected block containing the element value from the element coordinate, computing a value table offset from the element coordinate, and accessing the element value in the value table with the value table offset.
Type: Application
Filed: November 5, 2008
Publication date: May 6, 2010
Inventor: Ken Museth
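The random-access steps listed in this abstract can be sketched in one dimension. The example assumes a flat list of values, a block size of 8, and a value-table offset computed by counting mask bits below the element; none of these details come from the application, and the function names are hypothetical.

```python
BLOCK_SIZE = 8  # assumed number of elements per block


def encode_block(values):
    """Encode one block as (element bit-mask, value table of non-zero values)."""
    mask, table = 0, []
    for i, value in enumerate(values):
        if value != 0:
            mask |= 1 << i
            table.append(value)
    return mask, table


def encode_volume(values):
    """Split the data into blocks and encode each one."""
    return [
        encode_block(values[start:start + BLOCK_SIZE])
        for start in range(0, len(values), BLOCK_SIZE)
    ]


def read_element(blocks, coordinate):
    """Randomly access one element value from the encoded blocks."""
    mask, table = blocks[coordinate // BLOCK_SIZE]  # select the block
    local = coordinate % BLOCK_SIZE                 # position inside the block
    if not (mask >> local) & 1:
        return 0
    # Value table offset = number of non-zero elements before this one.
    offset = bin(mask & ((1 << local) - 1)).count("1")
    return table[offset]
```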
-
Publication number: 20090327239
Abstract: A method and apparatus for identifying devices using a bitmap are disclosed. In one embodiment, the method comprises: accessing a memory to obtain one or more bitmaps that map each bit location in the bitmap to an index value, where one index value is assigned to each remote wireless media device of a wireless network in a wireless communication system; and examining those bitmaps and determining at least one characteristic of a remote wireless media device and a device identifier that identifies the remote wireless media device in the wireless network based on the bit position in those bitmaps.
Type: Application
Filed: June 30, 2008
Publication date: December 31, 2009
Inventors: In Sung Cho, Kumar Mahesh, Prakash Kamath, Jeffrey Gilbert, Rob Frizzell
-
Publication number: 20090192980
Abstract: The task of estimating the number of distinct values (DVs) in a large dataset arises in a wide variety of settings in computer science and elsewhere. The present invention provides synopses for DV estimation in the setting of a partitioned dataset, as well as corresponding DV estimators that exploit these synopses. Whenever an output compound data partition is created via a multiset operation on a pair of (possibly compound) input partitions, the synopsis for the output partition can be obtained by combining the synopses of the input partitions. If the input partitions are compound partitions, it is not necessary to access the synopses for all the base partitions that were used to construct the input partitions. Superior (in certain cases near-optimal) accuracy in DV estimates is maintained, especially when the synopsis size is small. The synopses can be created in parallel, and can also handle deletions of individual partition elements.
Type: Application
Filed: January 30, 2008
Publication date: July 30, 2009
Applicant: International Business Machines Corporation
Inventors: Kevin Scott Beyer, Rainer Gemulla, Peter Jay Haas, Berthold Reinwald, John Sismanis
-
Publication number: 20090171956
Abstract: The present invention provides a method for incorporating features from heterogeneous auxiliary datasets into input text data for use in classification, a plurality of heterogeneous auxiliary datasets, such as labeled datasets and unlabeled datasets, are accessed after receiving input text data. A plurality of features are extracted from each of the plurality of heterogeneous auxiliary datasets. The plurality of features are combined with the input text data to generate a set of features which may potentially be used to classify the input text data. Classification features are then extracted from the set of features and used to classify the input text data. In one embodiment, the classification features are extracted by calculating a mutual information value associated with each feature in the set of features and identifying features having a mutual information value exceeding a threshold value.
Type: Application
Filed: October 10, 2008
Publication date: July 2, 2009
Inventors: Rakesh Gupta, Lev Ratinov
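The thresholding step in the last sentence of this abstract can be illustrated with the standard mutual information formula for discrete variables. The sketch assumes discrete features paired with class labels and uses hypothetical function and feature names; how the application itself computes the value is not stated in the abstract.

```python
from collections import Counter
from math import log2


def mutual_information(feature_values, labels):
    """Mutual information (in bits) between a discrete feature and the labels."""
    n = len(labels)
    joint = Counter(zip(feature_values, labels))
    p_feature = Counter(feature_values)
    p_label = Counter(labels)
    return sum(
        (count / n) * log2((count / n) / ((p_feature[x] / n) * (p_label[y] / n)))
        for (x, y), count in joint.items()
    )


def select_features(features, labels, threshold):
    """Keep the names of features whose mutual information exceeds the threshold."""
    return [
        name
        for name, values in features.items()
        if mutual_information(values, labels) > threshold
    ]
```

A feature that perfectly predicts a balanced binary label scores 1 bit, while a feature independent of the label scores 0 and is dropped by any positive threshold.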
-
Publication number: 20090171919
Abstract: A computer implemented method is provided for using SLP in processing a plurality of statements, wherein the statements are associated with an array having a number of array positions, and each statement includes one or more expressions. The method includes the step of gathering expressions for each of the statements into a structure comprising a single merge stream, the merge streams being furnished with a location for each expression, wherein the location for a given expression is associated with one of the array positions. The method further comprises selectively identifying a plurality of expressions, and applying SLP packing operations to the identified expressions, in order to merge respective identified expressions into one or more isomorphic sub-streams. The method further comprises selectively combining the expressions of the isomorphic sub-streams, and other expressions of the single merge stream, into a number of input vectors that are substantially equal in length to one another.
Type: Application
Filed: December 26, 2007
Publication date: July 2, 2009
Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu
-
Publication number: 20090089106
Abstract: Provided are a system and method for developing a strategic scenario.
Type: Application
Filed: October 27, 2005
Publication date: April 2, 2009
Inventors: Jeong-Seok Park, Kyung-Seok Ryu, Jee-Hyung Lee
-
Publication number: 20090049067
Abstract: In a computer implemented method of researching textual data sources, textual data is reduced to a plurality of distinctive words based on frequency of usage within the textual data. The distinctive words are converted into first numeric representations of vectors containing random numbers. A first self-organizing map is formed from the first numeric representations and organized by similarities between the vectors. A second self-organizing map is formed from second numeric representations generated from the organization of the first self-organizing map. The second numeric representations are vectors derived from the first self-organizing map. The vectors are used to train the second self-organizing map. The vectors derived from the first self-organizing map are organized into clusters of similarities between the vectors on the second self-organizing map. Dialectic arguments are formed from the second self-organizing map to interpret the textual data.
Type: Application
Filed: October 27, 2008
Publication date: February 19, 2009
Applicant: KINETX, INC.
Inventor: Jonathan Murray
-
Publication number: 20080114754
Abstract: A retrieval information that is a retrieval target, acquired from an information source, is arranged on a vector space. Data of a usage information and a content specifying manners of usage, used by the user in the past is acquired by a usage-information acquisition unit. A user's preference extraction unit extracts a preference of a user from the data, and notifies a number-of-effective-elements reduction unit of the extracted data. The number-of-effective-elements reduction unit evaluates each element of a vector of retrieval information by using the preference of the user, and reduces the number of effective elements by removing the elements smaller than a certain criteria.
Type: Application
Filed: August 29, 2007
Publication date: May 15, 2008
Applicant: Fujitsu Limited
Inventors: Akira Karasudani, Takahiro Matsuda
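The reduction step at the end of this abstract can be sketched as a simple thresholding pass. Scoring each element as the product of its value and a per-element preference weight is an assumption made for illustration, as are the function name and the choice to zero out removed elements; the abstract does not specify the evaluation rule.

```python
def reduce_effective_elements(vector, preference, threshold):
    """Zero out elements whose preference-weighted score falls below the threshold."""
    return [
        value if value * weight >= threshold else 0.0
        for value, weight in zip(vector, preference)
    ]


def count_effective(vector):
    """Number of non-zero (effective) elements left in a vector."""
    return sum(1 for value in vector if value != 0)
```

Fewer effective elements means fewer terms to compare when retrieval vectors are matched against each other.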
-
Publication number: 20080027980
Abstract: A computer readable medium for validating data is disclosed. The computer readable medium provides a computer readable medium with instructions executable by a computer to cause the computer to perform the following functions: receive a subjective representation of an address; re-format the subjective representation of the address according to a set of standardization rules; locate one or more candidate representations of the address from source data by recognizing that a preferred token is present among any of the one or more candidate representations of the address; select a preferred representation of the address from among the one or more candidate representations of the address based on the presence of the preferred token; and communicate the preferred representation of the address to an interface.
Type: Application
Filed: August 13, 2007
Publication date: January 31, 2008
Applicant: United Parcel Service of America, Inc.
Inventors: Timothy Owens, Bruce Harrison