Data Indexing; Abstracting; Data Reduction (EPO) Patents (Class 707/E17.002)
  • Publication number: 20110029492
    Abstract: System and method for implementing a reliable persistent random access compressed data stream is described. In one embodiment, the system comprises a computer-implemented journaled file system that includes a first file for storing a series of independently compressed blocks of a data stream; a second file for storing a series of indexes corresponding to the compressed blocks, wherein each one of the indexes comprises a byte offset into the first file of the corresponding compressed block; and a third file for storing a chunk of data from the data stream before it is compressed and written to the first file. The system further comprises a writer module for writing uncompressed data to the third file and writing indexes to the second file and a compressor module for compressing a chunk of data from the third file and writing it to the end of the first file.
    Type: Application
    Filed: July 29, 2009
    Publication date: February 3, 2011
    Applicant: Novell, Inc.
    Inventor: Shon Vella
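    The two-file layout in the abstract above (independently compressed blocks plus an index of byte offsets) can be sketched as follows. This is a minimal illustration, not the application's implementation: zlib, the 8-byte little-endian offset format, and the names append_block and read_block are all assumptions, and the third staging file and the journaling are left out.

```python
import io
import struct
import zlib


def append_block(data_file, index_file, chunk: bytes) -> None:
    """Compress one chunk on its own and record where it starts."""
    data_file.seek(0, io.SEEK_END)
    offset = data_file.tell()
    data_file.write(zlib.compress(chunk))
    index_file.seek(0, io.SEEK_END)
    index_file.write(struct.pack("<Q", offset))  # assumed 8-byte offset entry


def read_block(data_file, index_file, n: int) -> bytes:
    """Return uncompressed block n by seeking, without scanning the stream."""
    index_file.seek(n * 8)
    raw = index_file.read(16)  # offset of block n and, if present, block n+1
    start = struct.unpack_from("<Q", raw, 0)[0]
    if len(raw) == 16:
        end = struct.unpack_from("<Q", raw, 8)[0]
    else:
        # Last block: it runs to the end of the data file.
        data_file.seek(0, io.SEEK_END)
        end = data_file.tell()
    data_file.seek(start)
    return zlib.decompress(data_file.read(end - start))


# In-memory stand-ins for the first (data) and second (index) files.
data, index = io.BytesIO(), io.BytesIO()
for chunk in (b"alpha" * 100, b"beta" * 100, b"gamma" * 100):
    append_block(data, index, chunk)
second = read_block(data, index, 1)
```

    Because each block is compressed independently, reading block n touches only that block's bytes.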
  • Publication number: 20110029959
    Abstract: Techniques for discovering database connectivity leaks are presented. Each connection made by an application to a database is monitored. When the application is shut down, if information regarding a particular connection remains in memory, then that connection is reported as a potential database connectivity leak.
    Type: Application
    Filed: December 16, 2009
    Publication date: February 3, 2011
    Applicant: Teradata US, Inc.
    Inventor: Dennis Avery Tackett
  • Publication number: 20110029511
    Abstract: A method, system and apparatus for assigning keywords to a web page using keyword data from the web page itself, web pages having links pointing to the web page, and web pages pointed to by a link in the web page, wherein the keyword data from the multiple web pages is processed to provide a relevant set of keyword data for the web page.
    Type: Application
    Filed: July 30, 2009
    Publication date: February 3, 2011
    Inventors: MURALIDHARAN SAMPATH KODIALAM, Sarit Mukherjee, Limin Wang, Sunghwan Ihm
  • Publication number: 20110029507
    Abstract: A method for estimating the selectivity of a database base table predicate, the cardinality of a join, and the cardinality of an aggregation. The method includes receiving a database query, the query comprising one or more query predicates and referencing one or more database tables. One or more join indexes are identified, the join index(es) defined on respective database tables referenced by the database query. The join index(es) comprises one or more join index predicates, and includes one or more join columns in its select list. The row count selected by the query predicates is calculated at least partly using the row count or statistics of the one or more join indexes. The selectivity of the base table predicate is calculated at least partly from the calculated row count. The cardinality of the join is estimated at least partly from the row count and statistics of the identified join index(es).
    Type: Application
    Filed: July 28, 2009
    Publication date: February 3, 2011
    Inventors: Grace Au, Rama Krishna Korlapati, Haiyan Chen
  • Publication number: 20110029538
    Abstract: A system and method for a user of a mobile device or computer to easily and directly create content with correlated virtual and geospatial locations and associated context information. The content may include audio, video, and images, and be exposed via a unique URL based on content identifier. This content is indexed, including using a geospatial, “place-based” index, enabling users to easily share content with their matrix of social networks, friends, and communities, or to discover content created by others.
    Type: Application
    Filed: July 28, 2009
    Publication date: February 3, 2011
    Inventors: Daniel HARPLE, JR., Sam CRITCHLEY, Rich PIZZARRO, Gavin NICOL
  • Publication number: 20110022579
    Abstract: An Internet-based system involves a database and search capabilities for connecting patients with healthcare providers, e.g., physicians, hospitals, nursing homes, treatment facilities, etc., and further enables such providers to reach patients with whom they may not otherwise come into contact. A patient may access the healthcare provider information through a search conducted using a search engine, such as Google, Yahoo, etc. Alternatively, a patient may access the company Web site's predetermined Web page that provides search capabilities on its database. A patient may research a healthcare provider based on criteria specified by the patient. Information provided to the patient may be in the form of a report, profile, ratings, etc., including patient-provided information, physician-verified information, and information verified by an independent third party.
    Type: Application
    Filed: October 4, 2010
    Publication date: January 27, 2011
    Applicant: Health Grades, Inc.
    Inventors: David G. Hicks, Scott Montroy, John Neal
  • Publication number: 20110022599
    Abstract: A computer-based method and a system for indexing, querying, and ranking documents based on layout are provided. The method includes providing a plurality of documents to computer memory, extracting layout blocks from the provided documents, clustering the layout blocks into a plurality of layout block clusters, computing a representative block for each of the layout block clusters, generating a document index for each provided document based on the layout blocks of the document and the computed representative blocks, clustering the created document indexes into a plurality of document index clusters, and generating a representative cluster index for each of the document index clusters. The indexes generated, together with the representative blocks and document index clusters, can be stored and used for retrieval of documents responsive to a layout query.
    Type: Application
    Filed: September 9, 2009
    Publication date: January 27, 2011
    Applicant: Xerox Corporation
    Inventors: Boris Chidlovskii, Loïc M. Lecerf
  • Publication number: 20110022596
    Abstract: Generating a document index comprises: obtaining a document to be indexed; performing a monadic partition operation on the document to obtain a plurality of monadic partitions; and for each monadic partition in the plurality of monadic partitions: determining whether said each monadic partition is a filter character; in the event said each monadic partition is a filter character, forming a polynary partition by combining the monadic partition with at least one other monadic partition adjacent to the monadic partition, and indexing the polynary partition; and in the event that the monadic partition is not a filter character, indexing the monadic partition.
    Type: Application
    Filed: July 20, 2010
    Publication date: January 27, 2011
    Inventors: Lei Wei, Jiaxiang Shen
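    The indexing rule in the abstract above can be sketched as follows, assuming that a monadic partition is a single character and that a filter character is merged with the character that follows it (or the one before it at the end of the text). The abstract only requires "at least one other adjacent" partition, so that merge direction and the name build_index are hypothetical.

```python
from collections import defaultdict


def build_index(text: str, filter_chars: set[str]) -> dict[str, list[int]]:
    """Map each indexed partition to the positions where it starts."""
    index = defaultdict(list)
    for pos, ch in enumerate(text):  # monadic partition: one character
        if ch in filter_chars:
            # Filter character: index a polynary (here two-character) partition.
            if pos + 1 < len(text):
                index[text[pos:pos + 2]].append(pos)
            elif pos > 0:
                index[text[pos - 1:pos + 1]].append(pos - 1)
        else:
            index[ch].append(pos)
    return dict(index)


example_index = build_index("abca", {"a"})
```

    Here the filter character "a" never appears in the index on its own, only as part of "ab" and "ca".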
  • Publication number: 20110016097
    Abstract: A “fast approximation” of compression of current data involves using information obtained from an earlier compression of similar data. It overcomes the iterative process of discovering a unique set of optimal symbols. Representatively, a dictionary of symbols corresponding to original data from an earlier compressed file is extracted. Original bits are then obtained from the symbols. Sequences of the original bits are identified in the current data of a current file under consideration. A new bit stream for the current file is created from the original bits and according to the symbols they represent. Every occurrence of the symbols is counted in the new bit stream and a path-weighted Huffman tree is created from the counted occurrences. A coding from the Huffman tree ensues, along with an end-of-file marker. The latter is stored in a new compression file, including the dictionary earlier extracted from the earlier compressed file.
    Type: Application
    Filed: October 8, 2009
    Publication date: January 20, 2011
    Inventor: Craig N. Teerlink
  • Publication number: 20110016098
    Abstract: Methods and apparatus teach a digital spectrum of a file. The digital spectrum is used to map a file's position in a multi-dimensional space. This position relative to another file's position reveals distances between the files. Closest files can be grouped together. When contemplating voluminous numbers of files for digital spectrums, various methods include: concatenating all such files together to get a single key useful for creating a file's spectrum; or compressing files individually and combining their collective dictionaries into a single dictionary that defines the digital spectrum. Each provides advantage over the other. The latter consumes considerably less run time because each compression event can be distributed to a separate processor. Method two provides better spectrums because it is more “informationally” valid than is method one.
    Type: Application
    Filed: January 8, 2010
    Publication date: January 20, 2011
    Inventor: Craig N. Teerlink
  • Publication number: 20110016096
    Abstract: Methods and apparatus involve an original data stream arranged as a plurality of symbols. Of those symbols, all possible tuples are identified and the highest or most frequently occurring tuple is determined. A new symbol is created and substituted for each instance of the highest occurring tuple, which results in a new data stream. The new data stream is encoded and its size determined. Also, a size of a dictionary carrying all the original and new symbols is determined. The encoding size, the size of the dictionary and sizes of any other attendant overhead are compared to a size of the original data to see if compression has occurred, and by how much. Upon reaching pre-defined objectives, compression ceases. Decompression occurs oppositely. Other features include resolving ties between equally occurring tuples, path weighted Huffman coding, storing files, decoding structures, and computing arrangements and program products, to name a few.
    Type: Application
    Filed: September 28, 2009
    Publication date: January 20, 2011
    Inventor: Craig N. Teerlink
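    The substitution loop in the abstract above can be sketched as follows, restricted to tuples of two adjacent symbols. This sketch stops when no pair occurs at least twice, which is a simplification: the abstract's size comparison, tie-resolution rules, and Huffman coding are not modeled, and the function names are hypothetical.

```python
from collections import Counter


def most_frequent_pair(symbols):
    """Return (pair, count) for the most common adjacent pair, or (None, 0)."""
    counts = Counter(zip(symbols, symbols[1:]))
    if not counts:
        return None, 0
    return counts.most_common(1)[0]


def compress(symbols):
    """Repeatedly replace the most frequent pair with a new integer symbol."""
    symbols = list(symbols)  # original symbols are assumed to be characters
    dictionary = {}          # new symbol -> the pair it stands for
    while True:
        pair, count = most_frequent_pair(symbols)
        if count < 2:        # simplified stopping rule (an assumption)
            return symbols, dictionary
        new_symbol = len(dictionary)
        dictionary[new_symbol] = pair
        out, i = [], 0
        while i < len(symbols):
            if tuple(symbols[i:i + 2]) == pair:
                out.append(new_symbol)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        symbols = out


def expand(symbol, dictionary):
    """Undo the substitutions for one symbol."""
    if symbol not in dictionary:
        return [symbol]
    left, right = dictionary[symbol]
    return expand(left, dictionary) + expand(right, dictionary)


def decompress(symbols, dictionary):
    return [s for symbol in symbols for s in expand(symbol, dictionary)]


packed, table = compress("abababab")
```

    Decompression walks the dictionary in the opposite direction, expanding each new symbol back into the pair it replaced.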
  • Publication number: 20110016070
    Abstract: An average environmental condition for a specified target date and time is determined by indexing a database of time series data to retrieve the environmental condition for each day and time where an orbital position of the earth with respect to the sun is nearest to the orbital position of the earth on the target date and time. The average environmental condition is then determined from the retrieved environmental conditions.
    Type: Application
    Filed: July 14, 2009
    Publication date: January 20, 2011
    Inventor: Daniel N. Nikovski
  • Publication number: 20110010431
    Abstract: A system and method for reconciling media content available through a number of service providers. A request is received to identify media content. One or more characteristics of the media content are determined. A determination is made whether the media content is associated with an identifier in response to the one or more characteristics. An identifier is associated with the media content in response to determining there is not an existing identifier associated with the media content. The media content is cross referenced between the number of service providers utilizing the identifier and time zones. A database is updated to include the identifier and the one or more characteristics associated with the media content.
    Type: Application
    Filed: July 8, 2009
    Publication date: January 13, 2011
    Inventors: Kelsyn Rooks, David E. Emerson, Gary W. Lafreniere, Michael S. Goergen
  • Publication number: 20110004599
    Abstract: Searching of objects captured by a capture system can be improved by eliminating irrelevant objects from a query. In one embodiment, the present invention includes receiving such a query for objects captured by a capture system, the query including at least one search term. This search term is then hashed to a term bit position using a hash function. Then objects can be eliminated if, in a word index associated with the object, the term bit position is not set.
    Type: Application
    Filed: September 1, 2010
    Publication date: January 6, 2011
    Inventors: William Deninger, Erik de la Iglesia
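    The elimination step in the abstract above can be sketched as follows, treating each object's word index as a bit vector stored in a Python integer. The 1024-bit width, the use of CRC-32 as the hash function, and all function names are assumptions for illustration.

```python
import zlib

INDEX_BITS = 1024  # hypothetical width of each object's word index


def term_bit(term: str) -> int:
    """Hash a term to a bit position."""
    return zlib.crc32(term.lower().encode("utf-8")) % INDEX_BITS


def build_word_index(words) -> int:
    """Set one bit per word seen in the object."""
    bits = 0
    for word in words:
        bits |= 1 << term_bit(word)
    return bits


def may_contain(word_index: int, term: str) -> bool:
    """False means the object cannot contain the term and can be skipped."""
    return bool((word_index >> term_bit(term)) & 1)


def candidates(objects: dict[str, int], term: str) -> list[str]:
    """Keep only objects whose word index has the term's bit set."""
    return [name for name, idx in objects.items() if may_contain(idx, term)]


invoice_index = build_word_index(["invoice", "payment"])
remaining = candidates({"a": invoice_index, "b": 0}, "payment")
```

    A set bit does not prove the term is present, since two words can hash to the same position, so the surviving objects still need a full search. A clear bit does prove absence, which is what makes the elimination safe.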
  • Publication number: 20110004597
    Abstract: A method, system and platform hub for content aware routing of data, the platform hub comprising: a processor port for bidirectional data communication with a processing platform; at least one network port for bidirectional data communication with at least one corresponding network agent; and at least one content aware unit configured for performing content aware classification of data incoming into the platform hub, wherein the platform hub is configured for routing data to a suitable destination based on the content aware classification. The method for content aware routing of data comprises: receiving in a platform hub a data packet sent from a network towards a processing platform; performing a content aware classification in the platform hub; and routing the data packet based on the content aware classification.
    Type: Application
    Filed: March 16, 2010
    Publication date: January 6, 2011
    Inventors: Yehiel ENGEL, Avraham Ganor, Gal Gilat, Michael Chaim Schnarch
  • Publication number: 20110004598
    Abstract: A device for analyzing service response performance, a method and a program which can take action with high immediacy according to the communication state between a service provider's terminal and a service requester's terminal, and a recording medium containing the program are provided.
    Type: Application
    Filed: March 25, 2009
    Publication date: January 6, 2011
    Applicant: NEC CORPORATION
    Inventor: Shinji Kikuchi
  • Publication number: 20100332480
    Abstract: The invention relates to an online information retrieval system having a queue for storing load requests and a set of two or more load managers for retrieving data from the queue and indexing documents based on the request retrieved from the queue. Each load manager resides in a different geographical location. A set of candidate documents comprise a unique identifier and a version indicator, wherein the unique identifier for each candidate document is identical for a given document and the version indicator is associated with a determination of which document within the set of candidate documents shall ultimately be communicated to a user.
    Type: Application
    Filed: June 21, 2010
    Publication date: December 30, 2010
    Inventor: Jon Michael Verreaux
  • Publication number: 20100332454
    Abstract: Systems and methods are disclosed for performing data storage operations, including content-indexing, containerized deduplication, and policy-driven storage, within a cloud environment. The systems support a variety of clients and cloud storage sites that may connect to the system in a cloud environment that requires data transfer over wide area networks, such as the Internet, which may have appreciable latency and/or packet loss, using various network protocols, including HTTP and FTP. Methods are disclosed for content indexing data stored within a cloud environment to facilitate later searching, including collaborative searching. Methods are also disclosed for performing containerized deduplication to reduce the strain on a system namespace, effectuate cost savings, etc. Methods are disclosed for identifying suitable storage locations, including suitable cloud storage sites, for data files subject to a storage policy.
    Type: Application
    Filed: March 31, 2010
    Publication date: December 30, 2010
    Inventors: Anand Prahlad, Marcus S. Muller, Rajiv Kottomtharayil, Srinivas Kavuri, Parag Gokhale, Manoj Vijayan Retnemma
  • Publication number: 20100325092
    Abstract: A computing system can archive information from internetworked computers, such as Internet content, for later retrieval. A server system processes content providers, such as DNS registries and web sites, to extract and store content, including text, image, audio, and video content. For web sites, HTML source code is stored along with a browser-rendered display file. The content is perpetually archived to create a historical record of information for each content provider. An interface is used to retrieve the archived content in response to queries.
    Type: Application
    Filed: August 31, 2010
    Publication date: December 23, 2010
    Inventor: Rodney D. Johnson
  • Publication number: 20100318519
    Abstract: In embodiments of the disclosed technology, indexes, such as inverted indexes, are updated only as necessary to guarantee answer precision within predefined thresholds which are determined with little cost in comparison to the updates of the indexes themselves. With the present technology, a batch of daily updates can be processed in a matter of minutes, rather than a few hours for rebuilding an index, and a query may be answered with assurances that the results are accurate or within a threshold of accuracy.
    Type: Application
    Filed: June 10, 2009
    Publication date: December 16, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Marios Hadjieleftheriou, Nick Koudas, Divesh Srivastava
  • Publication number: 20100312761
    Abstract: A method of selecting graphic or video files having corresponding locators used to locate such graphic or video files using a computer. Identifiers are created by searching an area within a web page near a graphic or video file for searchable identification terms and searching an area within a web page near links to a graphic or video for searchable identification terms. The identifiers are stored in a database. User requests for graphic or video file content are received and the database of identifiers is searched to find graphic and video files corresponding to criteria of the user. Graphic or video file content is then provided to the user.
    Type: Application
    Filed: June 21, 2010
    Publication date: December 9, 2010
    Applicant: Gemstar Development Corporation
    Inventor: Henry C. Yuen
  • Publication number: 20100312755
    Abstract: A method for electronically compressing and decompressing digital data using a context grammar includes grammatically compressing first digital data by discovering multiply occurring sequences of non-further-factorizable terminal symbols in the first digital data and replacing the discovered multiply occurring sequences of non-further-factorizable terminal symbols with non-terminal symbols that can be further factorized. Digital data belonging to the non-terminal symbols is stored in a context grammar. Second digital data is compressed using the context grammar. The first digital data relates to a column of data stored in a database and the second digital data relates to entries in the column of data stored in the database.
    Type: Application
    Filed: July 24, 2007
    Publication date: December 9, 2010
    Inventors: Eric Hildebrandt, Martin Bokler
  • Publication number: 20100312769
    Abstract: Methods, systems and software are described for analyzing micro-blog messages to detect abnormal activity of interest. The system includes a clusterer for clustering micro-blog messages received over a first period of time, a classifier for scoring the clustered messages; a knowledge base, a rule generator for generating classification rules from the knowledge base; and a matcher for matching the scored messages to information requests. Methods for operating the system and its components are described.
    Type: Application
    Filed: June 9, 2010
    Publication date: December 9, 2010
    Inventors: Edward J. Bailey, Samuel L. Hendel, Jeffrey D. Kinsey, Richard J. Schiller
  • Publication number: 20100306201
    Abstract: Provided is a neighbor searching apparatus that can select an index suitable for each search target. The neighbor searching apparatus has: a storage part that stores a meta table containing index-dependent meta data associated with a data structure of each index; a database managing part that searches for an index associated with an instruction when receiving the instruction from a user and makes an indexing part perform processing associated with the instruction using the index-dependent meta data associated with the index; and the indexing part that performs the processing associated with the instruction using the index-dependent meta data based on the instruction from the database managing part.
    Type: Application
    Filed: March 3, 2010
    Publication date: December 2, 2010
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Yutaka Hirano, Mototaka Kanematsu, Toshihiro Kayama, Mayumi Ooto
  • Publication number: 20100306203
    Abstract: Disclosed herein, in certain embodiments, is a method of systematically presenting the contents of at least one document, comprising: (a) a user providing an electronic version of at least one document to a computer; (b) a user accepting or modifying noise words generated by a computer module; (c) generating a list of every non-noise word by means of a computer module wherein the list indicates every page on which a non-noise word appears; and (d) displaying the entire list of non-noise words. In some embodiments, the list of non-noise words further indicates the number of times a word occurs on a page. In some embodiments, the list of non-noise words further indicates each line on which a non-noise word appears.
    Type: Application
    Filed: June 2, 2010
    Publication date: December 2, 2010
    Applicant: INDEX LOGIC, LLC
    Inventors: Susan Jo Paulson Rozok, Peter Rozok
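    Steps (b) through (d) of the abstract above can be sketched as follows, including the per-page occurrence counts mentioned for some embodiments. The word-splitting regular expression, the lowercasing, and the name index_pages are assumptions.

```python
import re
from collections import Counter, defaultdict


def index_pages(pages: list[str], noise_words: set[str]) -> dict[str, dict[int, int]]:
    """Map each non-noise word to {page number: occurrences on that page}."""
    index = defaultdict(Counter)
    for page_no, text in enumerate(pages, start=1):
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in noise_words:
                index[word][page_no] += 1
    # Sort alphabetically so the list can be displayed in order.
    return {word: dict(counts) for word, counts in sorted(index.items())}


page_index = index_pages(
    ["The cat sat.", "A cat and a cat."],
    noise_words={"the", "a", "and"},
)
```

    In the described method the noise-word set is proposed by a computer module and then accepted or modified by the user, so here it is simply passed in as a parameter.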
  • Publication number: 20100306202
    Abstract: A file format converting method for converting a first format file in a first file system of a storage apparatus to a second format file in a second file system is provided. The method includes creating a temporary file and defining a data amount of the temporary file as a first predetermined value; defining a start position of the temporary file to be the same as that of the first format file; and defining the data amount of the temporary file to be the same as that of the first format file to generate the second format file.
    Type: Application
    Filed: May 27, 2010
    Publication date: December 2, 2010
    Applicant: MSTAR SEMICONDUCTOR, INC.
    Inventor: Tsung-Yueh Lee
  • Publication number: 20100306200
    Abstract: A computer-implemented system and method are described for image searching and image indexing that may be incorporated in a mobile device that is part of an object identification system. The system and method relate to a MISIS client and MISIS server that may be associated with a mobile pointing and identification system for the searching and indexing of objects in in situ images in geographic space taken from the perspective of a system user located near the surface of the Earth including horizontal, oblique, and airborne perspectives.
    Type: Application
    Filed: December 30, 2009
    Publication date: December 2, 2010
    Inventors: Christopher Edward FRANK, David CADUFF
  • Publication number: 20100305959
    Abstract: A system and tool set provide a media exchange environment which facilitates motion picture, still photo and audio usage by allowing the editing and playing of streaming media presentations from potentially several remote libraries. A combined multimedia presentation is then streamed to a user. The system also supports media rights management, and represents what permissions the user must obtain in connection with a contemplated multimedia presentation.
    Type: Application
    Filed: July 30, 2010
    Publication date: December 2, 2010
    Inventors: J. Mitchell Johnson, Yury A. Bukhshtab
  • Publication number: 20100299326
    Abstract: This disclosure details apparatuses, systems and methods for a forum ferreting system. Various implementations of the system may be configured to meet the needs of a variety of users. In one embodiment, the system may be configured to provide users with a variety of search options, including popularity and/or relevance. Users may also be provided with customized groupings or favorites via a customized interface. Such interfaces may allow users to apply search tools, save searches, apply filters, provide feedback, and/or the like. Additionally, in some embodiments, the forum ferreting system may facilitate contextual advertising.
    Type: Application
    Filed: October 24, 2008
    Publication date: November 25, 2010
    Inventor: Scott Germaise
  • Publication number: 20100299316
    Abstract: Methods and apparatus, including computer program products, for block compression of tables with repeated values. In general, value identifiers representing a compressed column of data may be sorted to render repeated values contiguous, and block dictionaries may be generated. A block dictionary may be generated for each block of value identifiers. Each block dictionary may include a list of block identifiers, where each block identifier is associated with a value identifier and there is a block identifier for each unique value in a block. Blocks may have standard sizes and block dictionaries may be reused for multiple blocks.
    Type: Application
    Filed: August 2, 2010
    Publication date: November 25, 2010
    Inventors: Franz Faerber, Guenter Radestock, Andrew Ross
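    The block-dictionary idea in the abstract above can be sketched as follows: value identifiers are sorted so that repeats are contiguous, then each fixed-size block gets its own small dictionary and stores only positions into it. The function names are hypothetical, and two things are left out: the row-order bookkeeping a real column store would need after sorting, and the reuse of one dictionary across several blocks.

```python
def compress_blocks(value_ids, block_size):
    """Return a list of (block dictionary, block identifiers) pairs."""
    ordered = sorted(value_ids)  # makes repeated values contiguous
    blocks = []
    for start in range(0, len(ordered), block_size):
        block = ordered[start:start + block_size]
        dictionary = sorted(set(block))  # one entry per unique value
        position = {value: i for i, value in enumerate(dictionary)}
        blocks.append((dictionary, [position[value] for value in block]))
    return blocks


def decompress_blocks(blocks):
    """Rebuild the sorted value identifiers from the block dictionaries."""
    return [dictionary[b] for dictionary, ids in blocks for b in ids]


column = [7, 3, 7, 7, 3, 9, 9, 3]
compressed = compress_blocks(column, block_size=4)
```

    With few distinct values per block, each block identifier needs far fewer bits than a full value identifier, which is where the saving would come from.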
  • Publication number: 20100299333
    Abstract: A method to improve the effectiveness of hash-based data structures includes configuration of a data structure and transformation of hash codes as produced by a hash function, to yield a more uniform distribution of data amongst the slots in a data structure. Transformation results in a non-uniform but predictable distribution of hash codes. Configuration exploits the predictable nature of the transformed hash codes to accomplish more uniform and therefore more efficient distribution of items stored in a hash-based data structure.
    Type: Application
    Filed: May 13, 2010
    Publication date: November 25, 2010
    Inventor: Roger Frederick OSMOND
  • Publication number: 20100293155
    Abstract: A method for filtering file clusters is presented. In the method, a plurality of advanced filter actions with different filter conditions and independent from each other is performed on an obtained main result file. Thereby, a history record of each advanced filter is kept, and the history record of each advanced filter and respective search results are presented on a target interface in a presentation mode of opening a new page or updating an index list.
    Type: Application
    Filed: May 4, 2010
    Publication date: November 18, 2010
    Applicant: ESOBI INC.
    Inventors: Hong Yang Tsai, Hung Hsiang Ku, Hsun Hsueh Cho
  • Publication number: 20100293156
    Abstract: Provided is a database processing system including: a computer for outputting data in response to a received query request; and a storage system including a storage device for storing the data, in which: the storage device stores a plurality of partial indices indicating a storage location of the data; the data stored in the storage device is grouped; and the computer is configured to: receive the query request for the data; acquire one of the plurality of partial indices; specify, based on the query request for the data and the acquired one of the plurality of partial indices, a location at which the requested data is stored; and send a request to acquire the data stored at the specified location to the storage system. Accordingly, in the database processing system, a time period necessary to input and output the data is shortened.
    Type: Application
    Filed: February 25, 2010
    Publication date: November 18, 2010
    Inventors: Michiko Tanaka, Kazutomo Ushijima, Akira Shimizu, Seisuke Tokuda, Shinji Fujiwara, Nobuo Kawamura
  • Publication number: 20100293153
    Abstract: A method for compressing a table based on finite automata (FA) includes analyzing transferring characteristics of all states in an original two-dimensional structure table and combining continual states with unified transferring characteristics in the original two-dimensional structure table. A method for matching a table based on FA, a device for compressing a table, and a device for matching a table are also provided.
    Type: Application
    Filed: July 29, 2010
    Publication date: November 18, 2010
    Inventors: Yuchao Zhao, Jijun Li
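    One reading of "combining continual states with unified transferring characteristics" is that consecutive rows of the state-transition table with identical transitions are stored once. The sketch below assumes that reading. The table layout (one row per state, one column per input symbol) and the function names are hypothetical.

```python
def compress_table(table):
    """Store each run of identical consecutive rows once.

    Returns (rows, row_of_state), where row_of_state[s] is the index in
    rows of the transition row used by state s.
    """
    rows, row_of_state = [], []
    for row in table:
        if not rows or rows[-1] != row:
            rows.append(row)
        row_of_state.append(len(rows) - 1)
    return rows, row_of_state


def next_state(rows, row_of_state, state, symbol):
    """Look up a transition in the compressed table."""
    return rows[row_of_state[state]][symbol]


# Five states, two input symbols; states 1 to 3 share the same transitions.
fa_table = [[1, 0], [2, 0], [2, 0], [2, 0], [3, 3]]
fa_rows, fa_row_of_state = compress_table(fa_table)
```

    Matching against the compressed table costs one extra lookup per transition and returns the same next state as the original two-dimensional table.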
  • Publication number: 20100290755
    Abstract: In the current invention, a method and apparatus for efficiently deleting large files is described. This is done by having a special inode for pointing to data blocks to be freed, and subsequently freeing the data blocks from the special inode in a controlled manner.
    Type: Application
    Filed: July 26, 2010
    Publication date: November 18, 2010
    Applicant: Broadcom Corporation
    Inventor: Yasantha Nirmal RAJAKARUNANAYAKE
  • Publication number: 20100293064
    Abstract: A data processing system is provided for determining specific digital content for displaying on a restaurant table top display unit corresponding to a specific time point of a consumption cycle. In particular, said data processing system is provided for recording inputs and their associated time stamps from a table top unit, reading a current time from a digital clock, performing calculations on said inputs, time stamps, and the current time, based on a predetermined algorithm, producing a calculated index; storing a plurality of digital contents tagged with tagging indexes; selecting specific digital content with its associated tagging index matching said calculated index for displaying on a signage unit.
    Type: Application
    Filed: May 13, 2010
    Publication date: November 18, 2010
    Inventors: SHAWN B. GENTRY, KEVIN MOWRY, VIREN BALAR, RAYMOND HOWARD, JACK BAUM
  • Publication number: 20100293167
    Abstract: Methods and systems for biological database indexing and query searching are described. In one embodiment, one or more words may be extracted from a biological sequence using a spacer. The spacer may be one or more characters within the biological sequence. The word and a position of the word within the biological sequence may be stored in a sequence index associated with the spacer. The sequence index may be capable of being used for an operation associated with the biological sequence.
    Type: Application
    Filed: June 18, 2008
    Publication date: November 18, 2010
    Inventors: Daniele Biasci, Guido Giudetti, Massimiliano Andreazzoli
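    The abstract above does not say how the spacer delimits words. The sketch below assumes the simplest reading, in which words are the non-empty stretches of sequence between occurrences of the spacer. That reading and the name index_by_spacer are assumptions.

```python
def index_by_spacer(sequence: str, spacer: str) -> dict[str, list[int]]:
    """Map each word between spacer occurrences to its start positions."""
    index = {}
    pos = 0
    for word in sequence.split(spacer):
        if word:  # skip empty stretches between adjacent spacers
            index.setdefault(word, []).append(pos)
        pos += len(word) + len(spacer)
    return index


sequence_index = index_by_spacer("ACGTTACGGAT", "T")
```

    One such index would be kept per spacer, so a query can be looked up in the index built with the spacer that suits it.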
  • Publication number: 20100293465
    Abstract: A system used for teleprompter service includes a server including a processor operationally coupled to a database, a memory, and a communications interface. The memory stores instructions that cause the server to receive video captured by a web camera and audio captured by a microphone. The video and audio are the recordings of a user's oral presentation. The server contemporaneously transmits, via the communication interface, data representative of text stored in the database, wherein the text is scrolled across a video monitor at a speed selected in real time by a user.
    Type: Application
    Filed: May 14, 2010
    Publication date: November 18, 2010
    Inventors: Paul E. Kleinschmidt, Joshua Hinman, Scott C. Helf, Sean M. Siler, Simon Chelebyan
  • Publication number: 20100287144
    Abstract: A scheme for accessing an index structure using a reference minimum bounding shape is disclosed. In one example embodiment, a reference minimum bounding shape that encloses two or more minimum bounding shapes may be identified from an index structure stored in memory. Each of the two or more minimum bounding shapes may correspond to a data object associated with a corresponding leaf node of the index structure. In one example embodiment, the index structure may be accessed using the reference minimum bounding shape. In one example embodiment, at least one minimum bounding shape of the two or more minimum bounding shapes may be represented in a relative representation calculated relative to the reference minimum bounding shape. Also disclosed are a method, a system and a non-transitory computer-readable storage medium for accomplishing the same scheme as described above.
    Type: Application
    Filed: July 30, 2010
    Publication date: November 11, 2010
    Applicant: SAP AG
    Inventors: Sang K. Cha, Ki-Hong Kim, Keun-Joo Kwon
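    Illustrative sketch (not taken from the application): a reference minimum bounding shape can be computed as the rectangle enclosing several child rectangles, and each child can then be stored as offsets from the reference's lower corner. Rectangles as `(x_min, y_min, x_max, y_max)` tuples and all function names are assumptions; the abstract does not say which relative encoding is used.

```python
def enclosing_mbr(boxes):
    """Smallest axis-aligned rectangle that encloses every box in `boxes`."""
    x_mins, y_mins, x_maxs, y_maxs = zip(*boxes)
    return (min(x_mins), min(y_mins), max(x_maxs), max(y_maxs))


def to_relative(box, ref):
    """Express `box` as offsets from the lower corner of the reference box."""
    rx, ry = ref[0], ref[1]
    return (box[0] - rx, box[1] - ry, box[2] - rx, box[3] - ry)


def to_absolute(rel, ref):
    """Inverse of `to_relative`."""
    rx, ry = ref[0], ref[1]
    return (rel[0] + rx, rel[1] + ry, rel[2] + rx, rel[3] + ry)


children = [(100, 200, 110, 210), (105, 205, 120, 230)]
reference = enclosing_mbr(children)
relative = [to_relative(b, reference) for b in children]
print(reference, relative)
```

    The relative coordinates are small numbers, which is what would let them be stored in fewer bits than the absolute ones.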
  • Publication number: 20100287165
    Abstract: Generating an index includes receiving a reference sequence and applying one or more key patterns to the reference sequence to obtain a plurality of keys in the index. Each of the one or more key patterns is derived based on a corresponding set of oligomer sequence relationships of a plurality of oligomer sequences that are expected to be generated from the reference, and the keys correspond to a plurality of candidate and/or validated locations in the reference sequence.
    Type: Application
    Filed: February 2, 2010
    Publication date: November 11, 2010
    Inventors: Aaron L. Halpern, Igor Nazarenko
  • Publication number: 20100281029
    Abstract: A method and a system for providing recommendations based on branding are disclosed. For example, a brand preference corresponding to a first brand and a first category may be identified based on user activity. A recommendation is provided to the user based on the brand preference. The recommendation may be provided based on a predetermined brand relationship comprising the first brand associated with the first category, a second brand associated with a second category, and a recommendation score between the first and second categories and brands. The recommendation may be provided by accessing a relationships database to determine at least one brand relationship of the brand relationships corresponding to the brand preference.
    Type: Application
    Filed: February 17, 2010
    Publication date: November 4, 2010
    Inventors: Nishith Parikh, Neelakantan Sundaresan
  • Publication number: 20100281033
    Abstract: Ranking systems are described. In an embodiment a large scale data center has petabytes of items and a query engine is provided to find the top k most frequently occurring items. In embodiments, samples are taken from the data center at least until a specified number of samplings is met, or until a stopping rule is met. In examples, the samples form a sample sketch which is used to find the top k most frequently occurring items without the need to examine every item in the data center. In other examples, the number of samplings or stopping rule is varied to provide ranks or frequencies. In other embodiments the ranking system operates on items having values to find separators which divide the items into bins such that the proportion of the items in each bin is different. For example, a data set may be apportioned to different types of processor.
    Type: Application
    Filed: May 1, 2009
    Publication date: November 4, 2010
    Applicant: Microsoft Corporation
    Inventors: Milan Vojnovic, Dinkar Vasudevan
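    Illustrative sketch (not taken from the application): the fixed-number-of-samplings variant can be approximated by drawing uniform samples with replacement, counting them in a sketch, and reading the top k from the counts. The function name `sampled_top_k`, the uniform sampling, and the fixed seed are assumptions; the stopping-rule variant is not shown.

```python
import random
from collections import Counter


def sampled_top_k(items, k, num_samples, seed=0):
    """Estimate the k most frequent items from a random sample.

    Only `num_samples` items are inspected, not the whole collection.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    sketch = Counter(rng.choice(items) for _ in range(num_samples))
    return sketch.most_common(k)


data = ["a"] * 900 + ["b"] * 100
print(sampled_top_k(data, k=2, num_samples=500))
```

    The counts returned are sample counts; scaling them by `len(items) / num_samples` would give a rough frequency estimate.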
  • Publication number: 20100281028
    Abstract: There are provided methods, computer program products, and systems for indexing a data stream. A method for indexing a data stream having attribute values includes the steps of parsing the data stream, and forming an index of tuples for a subset of attribute values of the data stream. The index is configured for retrieving the top-K tuples that optimize linearly weighted sums of at least some of the attribute values in the subset.
    Type: Application
    Filed: April 2, 2008
    Publication date: November 4, 2010
    Inventors: Gang Luo, Kun-Lung Wu, Philip Shi-lung Yu
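    Illustrative sketch (not taken from the application): the query that such an index is meant to answer can be stated as a brute-force scan, which scores every tuple by a linearly weighted sum of its attribute values and keeps the K best. This shows the query semantics only, not the index structure; `top_k_weighted` is a hypothetical name.

```python
import heapq


def top_k_weighted(tuples, weights, k):
    """Return the k tuples with the largest weighted sum of attribute values."""
    def score(t):
        return sum(w * v for w, v in zip(weights, t))

    return heapq.nlargest(k, tuples, key=score)


stream_tuples = [(1, 5), (4, 4), (6, 1), (2, 2)]
print(top_k_weighted(stream_tuples, weights=(0.5, 0.5), k=2))
```

    An index would aim to return the same answer while examining only a fraction of the tuples.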
  • Publication number: 20100281019
    Abstract: An information map creating apparatus and method including summing strengths of associations of information elements and creating a duplicate of an information element selected on the basis of the sum of the strengths, calculating strengths, including direct paths, of the associations among the information elements in a state in which some association whose strength is relatively low is excluded from the associations of one of the information elements of a duplicate origin and the information element of a duplicate target, summing the strengths of the associations of each of the information elements, and excluding, from the object to be displayed, an association whose strength is relatively low among the associations of one information element among the information elements, whose strength summed by the summing unit is higher than the others.
    Type: Application
    Filed: April 26, 2010
    Publication date: November 4, 2010
    Applicant: Fujitsu Limited
    Inventor: Isamu WATANABE
  • Publication number: 20100281340
    Abstract: Adaptive endurance coding including a method for storing data that includes receiving write data and a write address. A compression algorithm is applied to the write data to generate compressed data. An endurance code is applied to the compressed data to generate a codeword. The endurance code is selected and applied in response to the amount of space saved by applying the compression to the write data. The codeword is written to the write address.
    Type: Application
    Filed: April 30, 2009
    Publication date: November 4, 2010
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michele M. Franceschini, Ashish Jagmohan, John P. Karidis, Luis A. Lastras-Montano
  • Publication number: 20100281030
    Abstract: A document management & retrieval system is configured to: store, for each word in a set of words, appearance positions of the each word in a set of documents as a word index; store, for each tag in a set of tags attached to words, a set of words that appear to a right and left of the each tag, and also store, as a tag LR index, appearance positions of the each tag in a set of documents with a combination of the each tag and a word appearing to its right or a combination of the each tag and a word appearing to its left as a key; and, in a tag search where a query phrase contains words and a tag next to each other, refer to the index with a tag and the word to the right or left of the tag as a key, thereby reducing the size of a document list to be read without needing to have a tag name as a secondary key. A tag is updated by just updating two places in the tag LR index.
    Type: Application
    Filed: November 6, 2008
    Publication date: November 4, 2010
    Applicant: NEC Corporation
    Inventors: Yukitaka Kusumura, Toshiyuki Kamiya
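    Illustrative sketch (not taken from the application): a word index and a tag LR index might be built as below, with the tag LR index keyed by a tag combined with the word on its left or right. Treating tags as tokens wrapped in angle brackets, the `("L" | "R", tag, word)` key layout, and the function names are all assumptions.

```python
from collections import defaultdict


def is_tag(token):
    """Assumed convention: tags are tokens such as "<person>"."""
    return token.startswith("<") and token.endswith(">")


def build_indexes(docs):
    """Build a word index and a tag LR index from tokenized documents."""
    word_index = defaultdict(list)    # word -> [(doc_id, position)]
    tag_lr_index = defaultdict(list)  # (side, tag, word) -> [(doc_id, position)]
    for doc_id, tokens in docs.items():
        for pos, tok in enumerate(tokens):
            if not is_tag(tok):
                word_index[tok].append((doc_id, pos))
                continue
            if pos > 0 and not is_tag(tokens[pos - 1]):
                tag_lr_index[("L", tok, tokens[pos - 1])].append((doc_id, pos))
            if pos + 1 < len(tokens) and not is_tag(tokens[pos + 1]):
                tag_lr_index[("R", tok, tokens[pos + 1])].append((doc_id, pos))
    return dict(word_index), dict(tag_lr_index)


docs = {
    1: ["met", "<person>", "Alice", "</person>", "today"],
    2: ["<person>", "Bob", "</person>", "left"],
}
words, tag_lr = build_indexes(docs)
# A query for the tag "<person>" immediately followed by "Alice":
print(tag_lr.get(("R", "<person>", "Alice"), []))
```

    Each tag occurrence appears under at most two keys (its left and right neighbor), which matches the abstract's remark that a tag update touches two places.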
  • Publication number: 20100281004
    Abstract: A database server stores compressed units in data blocks of a database. A table (or data from a plurality of rows thereof) is first compressed into a “compression unit” using any of a wide variety of compression techniques. The compression unit is then stored in one or more data block rows across one or more data blocks. As a result, a single data block row may comprise compressed data for a plurality of table rows, as encoded within the compression unit. Storage of compression units in data blocks maintains compatibility with existing data block-based databases, thus allowing the use of compression units in preexisting databases without modification to the underlying format of the database. The compression units may, for example, co-exist with uncompressed tables. Various techniques allow a database server to optimize access to data in the compression unit, so that the compression is virtually transparent to the user.
    Type: Application
    Filed: April 28, 2010
    Publication date: November 4, 2010
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventors: Vikram Kapoor, Amit Ganesh, Jesse Kamp, Sachin Kulkarni, Vineet Marwah, Kam Shergill, Roger Macnicol, Manosiz Bhattacharyya
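    Illustrative sketch (not taken from the application): several table rows can be compressed into one unit, and the unit can then be split into pieces small enough to fit ordinary data-block rows. JSON serialization, zlib, the byte limit, and the function names are assumptions chosen for a short runnable example, not the database's actual format.

```python
import json
import zlib


def make_compression_unit(table_rows):
    """Compress many table rows into a single byte string."""
    return zlib.compress(json.dumps(table_rows).encode("utf-8"))


def store_in_block_rows(unit, max_row_bytes):
    """Split a compression unit into pieces that each fit one data-block row."""
    return [unit[i:i + max_row_bytes] for i in range(0, len(unit), max_row_bytes)]


def load_table_rows(block_rows):
    """Reassemble the pieces and recover the original table rows."""
    return json.loads(zlib.decompress(b"".join(block_rows)).decode("utf-8"))


rows = [[i, "customer-%d" % i] for i in range(100)]
block_rows = store_in_block_rows(make_compression_unit(rows), max_row_bytes=64)
print(len(block_rows), load_table_rows(block_rows) == rows)
```

    Each block row here holds part of the data for many table rows, which is the property the abstract highlights.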
  • Publication number: 20100281031
    Abstract: Methods and apparatus for judiciously storing information to enable fast information retrieval are disclosed. The information is organized in information fields, each field having a respective set of information elements. Cells of information elements are defined and corresponding information records are cached. The cells may be user defined or formed based on affinity coefficients of pairs of information elements. With a large number of fields, each having a large number of information elements, cells are generated recursively. Each cell is associated with at least one pre-defined query.
    Type: Application
    Filed: May 4, 2010
    Publication date: November 4, 2010
    Inventor: Christopher Brian Voice
  • Publication number: 20100274780
    Abstract: Technologies for forming logical indexes and utilizing such indexes so as to abstract many of the complexities resulting from referencing partitioned database tables. Included are technologies for making use of order-preserving properties of table metadata, for adding a partition equality predicate to an explicit predicate in merge-join processing on partitioned tables, and for selecting execution of a logical skip scan on a partitioned table when a query predicate does not reference a specific partition. Such technologies generally abstract from the query writer and processing systems explicit referencing of table partitions.
    Type: Application
    Filed: July 9, 2010
    Publication date: October 28, 2010
    Applicant: Microsoft Corporation
    Inventors: César Alejandro Galindo-Legaria, Craig S. Freedman, Milind M. Joshi
  • Publication number: 20100268702
    Abstract: Systems and methods for generating user-customized search results and building a semantics-enhanced search engine are disclosed. In one aspect, embodiments of the present disclosure include a method, which may be implemented on a system, of generating user-customized search results using user-defined semantic types. The method includes identifying a first set of URI patterns that are associated with a first set of semantic types defined by a first user, storing the first set of URI patterns in a database embodied in a computer-readable storage medium, and/or semantically categorizing a first set of search results for the first user, as having content related to one or more of the first set of semantic types defined by the first user. The first set of search results can be categorized using the first set of URI patterns.
    Type: Application
    Filed: April 14, 2010
    Publication date: October 21, 2010
    Applicant: Evri, Inc.
    Inventors: James M. Wissner, Nova Spivack
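    Illustrative sketch (not taken from the application): search results could be categorized by matching each result URL against the URI patterns a user has attached to their semantic types. Shell-style wildcard patterns, the example domains, and the function name `categorize` are assumptions; the abstract does not define the pattern syntax.

```python
from fnmatch import fnmatchcase


def categorize(urls, patterns_by_type):
    """Map each URL to the semantic types whose URI patterns match it."""
    categorized = {}
    for url in urls:
        categorized[url] = sorted(
            semantic_type
            for semantic_type, patterns in patterns_by_type.items()
            if any(fnmatchcase(url, pattern) for pattern in patterns)
        )
    return categorized


# Hypothetical user-defined semantic types and their URI patterns.
user_patterns = {
    "recipe": ["*example.com/recipes/*"],
    "movie": ["*example.org/film/*"],
}
results = [
    "https://www.example.com/recipes/pasta",
    "https://www.example.org/film/1234",
    "https://www.example.net/about",
]
print(categorize(results, user_patterns))
```

    A URL that matches no pattern keeps an empty list, so uncategorized results remain visible to the caller.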