Vectors, Bitmaps Or Matrices (EPO) Patents (Class 707/E17.051)
  • Patent number: 8892586
    Abstract: An additional data structure can be initialized for a column of compressed data to include a prefix storing, for each block of values in the column, a total number of bits set in previous blocks in the bit vector. A block number can be determined for a target block of the plurality of blocks, for example by checking whether or not a specified row number is located in the prefix. If the specified row number is located in the prefix, the prefix value of the prefix is returned, the most frequently occurring value is returned if a corresponding bit in the bit vector in the specified row number is not located in the prefix, or a position of the specified row in an index vector for the column is returned.
    Type: Grant
    Filed: December 23, 2011
    Date of Patent: November 18, 2014
    Assignee: SAP AG
    Inventors: Christian Lemke, Tobias Mindnich, Christoph Weyerhaeuser, Franz Faerber, Kai-Uwe Sattler
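A minimal Python sketch of the kind of lookup this abstract describes. It assumes one reading of the text: the column's most frequent value is removed, a bit vector marks the rows that hold some other value, those other values sit in an index vector, and a per-block prefix stores the number of bits set in all earlier blocks. The names `build_sparse_column`, `lookup` and `BLOCK_SIZE` are hypothetical and do not come from the patent.

```python
from collections import Counter

BLOCK_SIZE = 4  # hypothetical block size, chosen small for illustration


def build_sparse_column(values, block_size=BLOCK_SIZE):
    """Drop the most frequent value and keep a bit vector plus block prefix."""
    most_frequent = Counter(values).most_common(1)[0][0]
    # bits[i] == 1 means row i holds a value other than the most frequent one
    bits = [0 if v == most_frequent else 1 for v in values]
    index_vector = [v for v in values if v != most_frequent]
    # prefix[b] = total number of bits set in all blocks before block b
    prefix, running = [], 0
    for start in range(0, len(bits), block_size):
        prefix.append(running)
        running += sum(bits[start:start + block_size])
    return most_frequent, bits, index_vector, prefix


def lookup(row, most_frequent, bits, index_vector, prefix,
           block_size=BLOCK_SIZE):
    """Return the value stored at `row` without decompressing the column."""
    if not bits[row]:
        return most_frequent
    block = row // block_size
    block_start = block * block_size
    # Bits set before this row = prefix of the block + bits inside the block
    position = prefix[block] + sum(bits[block_start:row])
    return index_vector[position]


column = ["a", "a", "b", "a", "c", "a", "a", "b", "a"]
compressed = build_sparse_column(column)
restored = [lookup(i, *compressed) for i in range(len(column))]
```

The prefix keeps each lookup local to a single block: only the bits inside the target block are counted, rather than every bit from the start of the vector.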
  • Publication number: 20130325874
    Abstract: A database query of point data among two or more axes of a database is received. The database stores point data in distinct integer vectors with a shared dictionary. Thereafter, the dictionary is scanned to determine boundaries for each axis specified by the query. In response, results characterizing data responsive to the query within the determined boundaries for each axis are returned. Related apparatus, systems, techniques and articles are also described.
    Type: Application
    Filed: June 4, 2012
    Publication date: December 5, 2013
    Applicant: SAP AG
    Inventors: Christoph Weyerhaeuser, Tobias Mindnich, Daniel Baeumges, Gerrit Simon Kazmaier
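The query flow in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented method: the shared dictionary is assumed to be sorted, the boundaries are found with a binary search rather than a literal scan, and the names `encode_points` and `range_query` are hypothetical.

```python
from bisect import bisect_left, bisect_right


def encode_points(points):
    """Store 2-D points as two integer vectors over one shared dictionary."""
    dictionary = sorted({coord for point in points for coord in point})
    value_id = {value: i for i, value in enumerate(dictionary)}
    xs = [value_id[x] for x, _ in points]
    ys = [value_id[y] for _, y in points]
    return dictionary, xs, ys


def range_query(dictionary, xs, ys, x_range, y_range):
    """Return the points whose coordinates fall inside both closed ranges."""
    # Translate each axis range into a half-open range of dictionary IDs
    x_lo = bisect_left(dictionary, x_range[0])
    x_hi = bisect_right(dictionary, x_range[1])
    y_lo = bisect_left(dictionary, y_range[0])
    y_hi = bisect_right(dictionary, y_range[1])
    # The vectors are then filtered by integer comparison only
    return [
        (dictionary[x], dictionary[y])
        for x, y in zip(xs, ys)
        if x_lo <= x < x_hi and y_lo <= y < y_hi
    ]


points = [(1.5, 2.0), (3.0, 8.5), (4.0, 4.0), (9.0, 1.0)]
dictionary, xs, ys = encode_points(points)
hits = range_query(dictionary, xs, ys, x_range=(1, 4), y_range=(2, 5))
```

Resolving the boundaries once against the dictionary means the per-point test compares small integers instead of the original coordinate values.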
  • Publication number: 20130325900
    Abstract: A method for storing database information includes storing a table having data values in a column major order. The data values are stored in a list of blocks. The method also includes assigning a tuple sequence number (TSN) to each data value in each column of the table according to a sequence order in the table. The data values that correspond to each other across a plurality of columns of the table have equivalent TSNs. The method also includes assigning each data value to a partition based on a representation of the data value. The method also includes assigning a tuple map value to each data value. The tuple map value identifies the partition in which each data value is located.
    Type: Application
    Filed: May 31, 2012
    Publication date: December 5, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ronald J. Barber, Min-Soo Kim, Sam S. Lightstone, Guy M. Lohman, Lin Qiao, Vijayshankar Raman, Eugene J. Shekita, Richard S. Sidle
  • Publication number: 20130166566
    Abstract: An additional data structure can be initialized for a column of compressed data to include a prefix storing, for each block of values in the column, a total number of bits set in previous blocks in the bit vector. A block number can be determined for a target block of the plurality of blocks, for example by checking whether or not a specified row number is located in the prefix. If the specified row number is located in the prefix, the prefix value of the prefix is returned, the most frequently occurring value is returned if a corresponding bit in the bit vector in the specified row number is not located in the prefix, or a position of the specified row in an index vector for the column is returned.
    Type: Application
    Filed: December 23, 2011
    Publication date: June 27, 2013
    Inventors: Christian Lemke, Tobias Mindnich, Christoph Weyerhaeuser, Franz Faerber, Kai-Uwe Sattler
  • Publication number: 20120323867
    Abstract: Systems and methods for accessing data stored in a data array, mapping the data using a bitmap index, and processing data queries by determining positions of query attributes in the bitmap index and locating values corresponding to the positions in the data array are described herein.
    Type: Application
    Filed: August 27, 2012
    Publication date: December 20, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bin He, Hui-I Hsiao
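A minimal Python sketch of a bitmap-index lookup in the spirit of this abstract. It assumes that each distinct attribute value owns one bitmap (held here as a Python integer) whose set bits are row positions, and that those positions are then used to read a separate data array. The function names are hypothetical.

```python
def build_bitmap_index(keys):
    """Map each distinct key to an integer whose set bits are row positions."""
    index = {}
    for position, key in enumerate(keys):
        index[key] = index.get(key, 0) | (1 << position)
    return index


def set_positions(bitmap):
    """Yield the positions of the set bits, lowest position first."""
    position = 0
    while bitmap:
        if bitmap & 1:
            yield position
        bitmap >>= 1
        position += 1


def select(index, data, key):
    """Return the entries of `data` at the positions where `key` occurs."""
    return [data[p] for p in set_positions(index.get(key, 0))]


colors = ["red", "blue", "red", "green"]
prices = [10, 20, 30, 40]
color_index = build_bitmap_index(colors)
red_prices = select(color_index, prices, "red")
```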
  • Publication number: 20120303633
    Abstract: Systems and methods for accessing data stored in a data array, mapping the data using a bitmap index, and processing data queries by determining positions of query attributes in the bitmap index and locating values corresponding to the positions in the data array are described herein.
    Type: Application
    Filed: May 26, 2011
    Publication date: November 29, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bin He, Hui-I Hsiao
  • Publication number: 20120239664
    Abstract: To increase space efficiency of a coupled node tree, a branch node does not have an area that holds an array element number of an array element wherein is stored the primary node of the node pair that is the link target, and the root node is disposed in an array element with a node location number 1, and the primary node is disposed in an array element whose node location number is twice the node location number of the branch node. The node location number of an array element wherein is disposed a link target node is obtained by adding the bit value in the search key at the discrimination bit position for the link source branch node to twice the node location number of the branch node.
    Type: Application
    Filed: May 30, 2012
    Publication date: September 20, 2012
    Applicant: S. Grants Co., Ltd.
    Inventors: Toshio Shinjo, Mitsuhiro Kokubun
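The addressing rule in this abstract (root at location 1, link target at twice the branch location plus the key bit at the discrimination position) can be sketched in Python. The tree layout, the tuple encoding of nodes and the convention that bit positions count from the left of a bit string are assumptions made for illustration.

```python
def link_target_location(branch_location, key_bits, discrimination_bit):
    """Location of the link target: 2 * branch location + key bit (0 or 1)."""
    return 2 * branch_location + int(key_bits[discrimination_bit])


def search(tree, key_bits):
    """Walk from the root (location 1) down to a leaf and return its key."""
    location = 1
    kind, payload = tree[location]
    while kind == "branch":
        # A branch stores only its discrimination bit position, no child link
        location = link_target_location(location, key_bits, payload)
        kind, payload = tree[location]
    return payload


# Hypothetical tree: node location number -> (kind, payload)
tree = {
    1: ("branch", 0),   # root, discriminates on bit 0
    2: ("leaf", "011"),
    3: ("branch", 2),   # discriminates on bit 2
    6: ("leaf", "100"),
    7: ("leaf", "101"),
}
found = search(tree, "101")
```

Because a child location is computed from its parent's location, the branch node needs no field for the array element number of its link target, which is the space saving the abstract points to.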
  • Patent number: 8255424
    Abstract: A system, storage medium, and method for structuring data are provided. The system connects to a storage device that stores original data. The method obtains the original data from the storage device, and stores the original data in the form of character strings into a buffer memory according to end of file-line (EOF) tags. The method further constructs data arrays to store the character strings, and arranges each of the data arrays into a data matrix. In addition, the method classifies each of the data arrays in the data matrix according to properties of the character strings, arranges the classified data arrays into a data file, and stores the data file into the buffer memory.
    Type: Grant
    Filed: September 14, 2009
    Date of Patent: August 28, 2012
    Assignee: Hon Hai Precision Industry Co., Ltd.
    Inventors: Shen-Chun Li, Yung-Chieh Chen, Shou-Kuo Hsu
  • Patent number: 8244711
    Abstract: A system, method, and apparatus for information retrieval are provided. Embodiments of the present invention may generate data structures that may be used to process user queries. According to embodiments of the present invention, a processor component is configured to perform the operations of an indexing module and a storage module. The indexing module is configured to generate a term list and a term-file matrix from information stored on the storage module, and to generate an adjacency matrix from the one or more files, wherein the adjacency matrix represents a relationship of the one or more terms in each of the one or more files. The indexing module is further configured to generate a probability matrix using the adjacency matrix and a one-step or two-step random walk.
    Type: Grant
    Filed: September 28, 2009
    Date of Patent: August 14, 2012
    Inventor: Chin Lung Fong
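One plausible way to derive a probability matrix from an adjacency matrix with a one-step or two-step random walk is sketched below in Python. Row normalization as the transition rule is an assumption, since the abstract does not say how the walk is defined, and the function names are hypothetical.

```python
def row_normalize(matrix):
    """Turn each row of counts into transition probabilities."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([x / total if total else 0.0 for x in row])
    return normalized


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b)))
         for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def random_walk_matrix(adjacency, steps=1):
    """Probability of reaching term j from term i in exactly `steps` steps."""
    transition = row_normalize(adjacency)
    result = transition
    for _ in range(steps - 1):
        result = matmul(result, transition)
    return result


# Hypothetical adjacency: term 0 co-occurs with terms 1 and 2
adjacency = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]
one_step = random_walk_matrix(adjacency, steps=1)
two_step = random_walk_matrix(adjacency, steps=2)
```

In this toy example the two-step matrix links terms 1 and 2, which never co-occur directly, through their shared neighbor, term 0.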
  • Publication number: 20120179712
    Abstract: A method, apparatus, and program product are provided for creating an Encoded Matrix Index for a column in a database table. An element of the column for all rows in the database table is compared to a corresponding reference value in a reference data structure, and in response to at least one value for the element of the column not matching the reference value, indicating a variation in a variation data structure and creating a value data structure. Queries executed using the Encoded Matrix Index include terms associated with a sub-column defined in a column of a database table. The variation data structure is accessed to determine whether any variation exists between rows belonging to a sub-column of the database table. If no variation exists, a value is accessed from the reference data structure; otherwise, a value for each row of the sub-column is accessed from a value data structure.
    Type: Application
    Filed: March 16, 2012
    Publication date: July 12, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Robert J. Bestgen, Thomas J. Eggebraaten, Jeffrey W. Tenner
  • Patent number: 8122038
    Abstract: A system for extending a Time Intelligence language to provide support for period-to-date functions and for generating member sets in response to data queries is provided. The system may apply member aggregation functions and queries across a plurality of heterogeneous data sources. Each data source is aligned to a reference dimension and is said to organize data according to at least one level of granularity. In some embodiments, a member aggregation function specifies a period (e.g., year, quarter, month) and retrieves data from a data source starting with the current specified period and ending with the most recently completed period equal to the granularity of the data source. The system may allow a user to further customize a member aggregation function by specifying a granularity, a period offset, or a granularity end offset. Additionally, the system may generate a caption to display in association with the retrieved data.
    Type: Grant
    Filed: June 15, 2009
    Date of Patent: February 21, 2012
    Assignee: Microsoft Corporation
    Inventors: Steve Handy, Catalin Tomai, Chen-I Lim
  • Publication number: 20110270844
    Abstract: A method, system and program product for data evolution on column oriented databases is disclosed. For an input evolution operation, reusable and non-reusable attributes are identified. For attributes in a target schema that cannot be reused from the source schema, data and bitmap indexes of those attributes are generated from source data and bitmap indexes. A decompose operation is disclosed for decomposing a table into two tables. A merge operation is disclosed in which only one input table can be reused for mergence. A second merge operation is disclosed in which both input tables cannot be reused for mergence.
    Type: Application
    Filed: May 3, 2010
    Publication date: November 3, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bin He, Hui-I Hsiao
  • Publication number: 20110213784
    Abstract: Semantic object characterization and its use in indexing and searching a database directory is presented. In general, a first binary hash code is generated to represent a first representation or view of a semantic object. When it is compared to a second binary hash code that characterizes a second representation or view of the same semantic object, the first and second binary hash codes exhibit a degree of similarity indicative of the objects being the same object. In one implementation the semantic objects correspond to people's names and the first and second representations or views correspond to two different languages. Thus, a user can search a database of information in one language with a search query in another language.
    Type: Application
    Filed: March 1, 2010
    Publication date: September 1, 2011
    Applicant: Microsoft Corporation
    Inventors: Uppinakuduru Raghavendra Udupa, Shaishav Kumar
  • Publication number: 20100169322
    Abstract: A system and method for locating an unallocated bit in a bitmap array includes traversing the bitmap array using a plurality of pointers to locate a unit. The unit includes a plurality of entities and at least one of the plurality of entities is unallocated. The method further includes traversing the at least one of the plurality of unallocated entities in the unit to obtain an unallocated entity. The unit is associated with at least one pointer, and the at least one pointer is associated with a plurality of threshold values and a fill count, the fill count being less than a maximum fill count of the bitmap array.
    Type: Application
    Filed: December 26, 2008
    Publication date: July 1, 2010
    Applicant: SUN MICROSYSTEMS, INC.
    Inventor: Parthasarathy Selvaraj
  • Publication number: 20100114909
    Abstract: A method and system for improved processing of volumetric data. The method includes encoding the volumetric data into a plurality of blocks, wherein each block is associated with: a block topology denoting a relative location of the block within the volumetric data and a set of elements, and each element is associated with: an element topology denoting a relative location of the element within the associated block and a data value. The method includes encoding each block into a value table and an element bit-mask, wherein the value table stores element values, and the element bit-mask indicates non-zero element values. The method includes randomly accessing an element value, further comprising: determining a selected block containing the element value from the element coordinate, computing a value table offset from the element coordinate, and accessing the element value in the value table with the value table offset.
    Type: Application
    Filed: November 5, 2008
    Publication date: May 6, 2010
    Inventor: Ken Museth
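The value table and element bit-mask encoding, together with random access through a computed offset, can be sketched in Python. Computing the offset as the number of set mask bits below the element's position is an assumption, as are the flat element index within a block and the function names.

```python
def encode_block(elements):
    """Split a block into a bit-mask of non-zero slots and a value table."""
    mask, value_table = 0, []
    for i, value in enumerate(elements):
        if value != 0:
            mask |= 1 << i
            value_table.append(value)
    return mask, value_table


def read_element(mask, value_table, i):
    """Random access to element i without expanding the block."""
    if not (mask >> i) & 1:
        return 0  # bit not set: the element is zero and is not stored
    # Offset = number of stored (non-zero) elements before position i
    offset = bin(mask & ((1 << i) - 1)).count("1")
    return value_table[offset]


block = [0, 7, 0, 0, 3, 9, 0, 1]
mask, value_table = encode_block(block)
decoded = [read_element(mask, value_table, i) for i in range(len(block))]
```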
  • Publication number: 20090327239
    Abstract: A method and apparatus for identifying devices using a bitmap are disclosed. In one embodiment, the method comprises: accessing a memory to obtain one or more bitmaps that map each bit location in the bitmap to an index value, where one index value is assigned to each remote wireless media device of a wireless network in a wireless communication system; and examining those bitmaps and determining at least one characteristic of a remote wireless media device and a device identifier that identifies the remote wireless media device in the wireless network based on the bit position in those bitmaps.
    Type: Application
    Filed: June 30, 2008
    Publication date: December 31, 2009
    Inventors: In Sung Cho, Kumar Mahesh, Prakash Kamath, Jeffrey Gilbert, Rob Frizzell
  • Publication number: 20090192980
    Abstract: The task of estimating the number of distinct values (DVs) in a large dataset arises in a wide variety of settings in computer science and elsewhere. The present invention provides synopses for DV estimation in the setting of a partitioned dataset, as well as corresponding DV estimators that exploit these synopses. Whenever an output compound data partition is created via a multiset operation on a pair of (possibly compound) input partitions, the synopsis for the output partition can be obtained by combining the synopses of the input partitions. If the input partitions are compound partitions, it is not necessary to access the synopses for all the base partitions that were used to construct the input partitions. Superior (in certain cases near-optimal) accuracy in DV estimates is maintained, especially when the synopsis size is small. The synopses can be created in parallel, and can also handle deletions of individual partition elements.
    Type: Application
    Filed: January 30, 2008
    Publication date: July 30, 2009
    Applicant: International Business Machines Corporation
    Inventors: Kevin Scott Beyer, Rainer Gemulla, Peter Jay Haas, Berthold Reinwald, John Sismanis
  • Publication number: 20090171956
    Abstract: The present invention provides a method for incorporating features from heterogeneous auxiliary datasets into input text data for use in classification. A plurality of heterogeneous auxiliary datasets, such as labeled datasets and unlabeled datasets, are accessed after receiving input text data. A plurality of features are extracted from each of the plurality of heterogeneous auxiliary datasets. The plurality of features are combined with the input text data to generate a set of features which may potentially be used to classify the input text data. Classification features are then extracted from the set of features and used to classify the input text data. In one embodiment, the classification features are extracted by calculating a mutual information value associated with each feature in the set of features and identifying features having a mutual information value exceeding a threshold value.
    Type: Application
    Filed: October 10, 2008
    Publication date: July 2, 2009
    Inventors: Rakesh Gupta, Lev Ratinov
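The mutual-information selection step mentioned in this abstract can be sketched in Python. The sketch assumes discrete feature values and class labels and uses the standard empirical mutual information in bits. The function names, the toy data and the threshold are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(feature, labels):
    """Empirical mutual information (in bits) between two discrete columns."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    return sum(
        (count / n)
        * log2((count / n)
               / ((feature_counts[f] / n) * (label_counts[l] / n)))
        for (f, l), count in joint.items()
    )


def select_features(columns, labels, threshold):
    """Keep the features whose mutual information exceeds the threshold."""
    return [
        name for name, column in columns.items()
        if mutual_information(column, labels) > threshold
    ]


labels = [1, 1, 0, 0]
columns = {
    "matches_label": [1, 1, 0, 0],  # fully determines the label
    "unrelated": [1, 0, 1, 0],      # independent of the label
}
selected = select_features(columns, labels, threshold=0.5)
```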
  • Publication number: 20090171919
    Abstract: A computer implemented method is provided for using SLP in processing a plurality of statements, wherein the statements are associated with an array having a number of array positions, and each statement includes one or more expressions. The method includes the step of gathering expressions for each of the statements into a structure comprising a single merge stream, the merge streams being furnished with a location for each expression, wherein the location for a given expression is associated with one of the array positions. The method further comprises selectively identifying a plurality of expressions, and applying SLP packing operations to the identified expressions, in order to merge respective identified expressions into one or more isomorphic sub-streams. The method further comprises selectively combining the expressions of the isomorphic sub-streams, and other expressions of the single merge stream, into a number of input vectors that are substantially equal in length to one another.
    Type: Application
    Filed: December 26, 2007
    Publication date: July 2, 2009
    Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu
  • Publication number: 20090089106
    Abstract: Provided are a system and method for developing a strategic scenario.
    Type: Application
    Filed: October 27, 2005
    Publication date: April 2, 2009
    Inventors: Jeong-Seok Park, Kyung-Seok Ryu, Jee-Hyung Lee
  • Publication number: 20090049067
    Abstract: In a computer implemented method of researching textual data sources, textual data is reduced to a plurality of distinctive words based on frequency of usage within the textual data. The distinctive words are converted into first numeric representations of vectors containing random numbers. A first self-organizing map is formed from the first numeric representations and organized by similarities between the vectors. A second self-organizing map is formed from second numeric representations generated from the organization of the first self-organizing map. The second numeric representations are vectors derived from the first self-organizing map. The vectors are used to train the second self-organizing map. The vectors derived from the first self-organizing map are organized into clusters of similarities between the vectors on the second self-organizing map. Dialectic arguments are formed from the second self-organizing map to interpret the textual data.
    Type: Application
    Filed: October 27, 2008
    Publication date: February 19, 2009
    Applicant: KINETX, INC.
    Inventor: Jonathan Murray
  • Publication number: 20080114754
    Abstract: Retrieval information that is a retrieval target, acquired from an information source, is arranged in a vector space. A usage-information acquisition unit acquires data on usage information and on content specifying the manners of usage employed by the user in the past. A user's preference extraction unit extracts a preference of the user from the data, and notifies a number-of-effective-elements reduction unit of the extracted data. The number-of-effective-elements reduction unit evaluates each element of a vector of retrieval information by using the preference of the user, and reduces the number of effective elements by removing the elements smaller than a certain criterion.
    Type: Application
    Filed: August 29, 2007
    Publication date: May 15, 2008
    Applicant: Fujitsu Limited
    Inventors: Akira Karasudani, Takahiro Matsuda
  • Publication number: 20080027980
    Abstract: A computer readable medium for validating data is disclosed. The computer readable medium carries instructions executable by a computer to cause the computer to perform the following functions: receive a subjective representation of an address; re-format the subjective representation of the address according to a set of standardization rules; locate one or more candidate representations of the address from source data by recognizing that a preferred token is present among any of the one or more candidate representations of the address; select a preferred representation of the address from among the one or more candidate representations of the address based on the presence of the preferred token; and communicate the preferred representation of the address to an interface.
    Type: Application
    Filed: August 13, 2007
    Publication date: January 31, 2008
    Applicant: United Parcel Service of America, Inc.
    Inventors: Timothy Owens, Bruce Harrison