Sequential Access, E.g., String Matching, Etc. (epo) Patents (Class 707/E17.039)

E Subclasses

On static storage (epo) (Class 707/E17.04)

Comparing simultaneously a plurality of search arguments with a simple file data, finite state machine (epo) (Class 707/E17.041)

Comparing simultaneously search arguments with more than one file data (epo) (Class 707/E17.042)

Comparing simultaneously search arguments with more than one file data (epo) (Class 707/E17.043)

COMPUTING A SYMBOLIC BOUND FOR A PROCEDURE

Publication number: 20110078665

Abstract: A system that facilitates computing a symbolic bound with respect to a procedure that is executable by a processor on a computing device is described herein. The system includes a transition system generator component that receives the procedure and computes a disjunctive transition system for a control location in the procedure. A compute bound component computes a bound for the transition system, wherein the bound is expressed in terms of inputs to the transition system. The system further includes a translator component that translates the bound computed by the compute bound component such that the bound is expressed in terms of inputs to the procedure.

Type: Application

Filed: September 29, 2009

Publication date: March 31, 2011

Applicant: Microsoft Corporation

Inventors: Sumit Gulwani, Florian Frantz Zuleger, Sudeep Dilip Juvekar
SYSTEM AND METHOD FOR TOPIC EXTRACTION AND OPINION MINING

Publication number: 20110078167

Abstract: Methods, apparatus, and systems to determine a niche market of items or services, the first phase of which identifies a gap between demand and supply for a set of items. Session logs may be evaluated to compare transactions involving a specific item to those of a larger group of items. The resultant information identifies areas of high demand, but with low availability. The niche market information may be provided as direct merchandising items for sellers. In one example, the method generates niche market item web pages in specific categories. Additional methods, apparatus, and systems are disclosed.

Type: Application

Filed: September 28, 2009

Publication date: March 31, 2011

Inventors: Neelakantan Sundaresan, Yongzheng Zhang, Catherine Baudin, Dan Shen, Shen Huang
HANDWRITTEN DOCUMENT CATEGORIZER AND METHOD OF TRAINING

Publication number: 20110078191

Abstract: A method and an apparatus for training a handwritten document categorizer are disclosed. For each category in a set into which handwritten documents are to be categorized, discriminative words are identified from the OCR output of a training set of typed documents labeled by category. A group of keywords is established including some of the discriminative words identified for each category. Samples of each of the keywords in the group are synthesized using a plurality of different type fonts. A keyword model is then generated for each keyword, parameters of the model being estimated, at least initially, based on features extracted from the synthesized samples. Keyword statistics for each of a set of scanned handwritten documents labeled by category are generated by applying the generated keyword models to word images extracted from the scanned handwritten documents. The categorizer is trained with the keyword statistics and respective handwritten document labels.

Type: Application

Filed: September 28, 2009

Publication date: March 31, 2011

Applicant: Xerox Corporation

Inventors: Francois RAGNET, Florent C. Perronnin, Thierry Lehoux
Bit strings search apparatus, search method, and program

Publication number: 20110066638

Abstract: Using a tree configuration wherein node groups of four or more nodes composed of combinations of branch nodes, leaf nodes or empty nodes are linked into a tree form, a bit string search by a search key string is enabled by repeatedly linking to one of the nodes of a node group to which a primary node belongs in response to the bit values of keys of the search key string at the discrimination bit position included in the branch node.

Type: Application

Filed: November 17, 2010

Publication date: March 17, 2011

Applicant: S. Grants Co., Ltd.

Inventors: Toshio Shinjo, Koutaro Shinjo, Mitsuhiro Kokubun
FAST SIGNATURE SCAN

Publication number: 20110066637

Abstract: Systems and methods for scanning signatures in a string field. In one implementation, the invention provides a method for signature scanning. The method includes processing one or more signatures into one or more formats that include one or more fingerprints and one or more follow-on search data structures for each fixed-size signature or signature substring such that the number of fingerprints for each fixed-size signature or signature substring is equal to a step size for a signature scanning operation and the particular fixed-size signature or signature substring is identifiable at any location within any string fields to be scanned, receiving a particular string field, identifying any signatures included in the particular string field including scanning for the fingerprints for each scan step size and searching for the follow-on search data structures at the locations where one or more fingerprints are found, and outputting any identified signatures.

Type: Application

Filed: October 12, 2010

Publication date: March 17, 2011

Inventor: QIANG WANG
METHOD OF DETECTING CHARACTER STRING PATTERN AT HIGH SPEED USING LAYERED SHIFT TABLES

Publication number: 20110066631

Abstract: A character string pattern matching method for detecting the presence of at least one of N (N is a natural number equal to or greater than 2) patterns in specific text shifts a detection location across text by a maximum shift length using single-byte character-based layered SHIFT tables, thereby increasing a pattern matching speed as compared with the prior art pattern matching algorithms.

Type: Application

Filed: September 22, 2008

Publication date: March 17, 2011

Applicant: SEOUL NATIONAL UNIVERSITY INDUSTRY FOUNDATION

Inventors: Yoon Ho Choi, Seung Woo Seo
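
The shift-table idea in the abstract above (publication 20110066631) can be illustrated with a much simpler, single-table multi-pattern search in the style of Horspool's algorithm. This is a minimal sketch, not the layered SHIFT tables of the application: the function names `build_shift_table` and `find_any` are hypothetical, and the per-window check is a plain `startswith` test.

```python
def build_shift_table(patterns):
    """Build one character-indexed shift table shared by all patterns.

    Assumption: only the first m characters of each pattern are used,
    where m is the length of the shortest pattern.
    """
    if not patterns or any(len(p) == 0 for p in patterns):
        raise ValueError("patterns must be non-empty strings")
    m = min(len(p) for p in patterns)
    shift = {}
    for p in patterns:
        # Positions 0..m-2: distance from that position to the window's end.
        for j in range(m - 1):
            c = p[j]
            shift[c] = min(shift.get(c, m), m - 1 - j)
    return m, shift


def find_any(text, patterns):
    """Return (position, pattern) pairs for every occurrence of any pattern."""
    m, shift = build_shift_table(patterns)
    hits = []
    pos = 0
    while pos + m <= len(text):
        for p in patterns:
            if text.startswith(p, pos):
                hits.append((pos, p))
        # Characters absent from the table allow the maximum shift m.
        pos += shift.get(text[pos + m - 1], m)
    return hits


print(find_any("the cat sat on the mat", ["cat", "mat", "sat"]))
```

Characters that appear in none of the patterns give the largest jump, which is where the speed-up over checking every position comes from.
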
SEARCHING AND ACCESSING DOCUMENTS ON PRIVATE NETWORKS FOR USE WITH CAPTURES FROM RENDERED DOCUMENTS

Publication number: 20110029504

Abstract: A facility for exposing an index of private documents is described. In a private network, the facility (1) identifies electronic versions of documents that are available inside the private network, including a distinguished document; (2) constructs an index covering the identified electronic versions of documents; and (3) exports the constructed index from the private network to an index publication server. At the index publication server, the facility (1) receives the exported index; (2) receives a query via a public network; and (3) uses an index, based upon the received index, to generate a query result for the received query that contains the distinguished document.

Type: Application

Filed: October 5, 2010

Publication date: February 3, 2011

Inventors: Martin T. King, Dale L. Grover, Clifford A. Kushler, James Q. Stafford-Fraser
METHOD AND SYSTEM FOR CHARACTERIZING WEB CONTENT

Publication number: 20110029505

Abstract: An exemplary embodiment of the present invention provides a method of processing Web activity data. The method includes obtaining a database of clickstream data comprising a user identifier corresponding with a user ID and a uniform resource locator (URL) corresponding with a Web page visited from the user ID. The method also includes generating a plurality of features based on the URL. Further, the method includes generating a data structure comprising the user ID and the feature. The method also includes generating segment information from the data structure based on the similarity of a URL visitation pattern across different user IDs, wherein each segment in the segment information comprises one or more user IDs and one or more features.

Type: Application

Filed: July 31, 2009

Publication date: February 3, 2011

Inventors: Martin B. Scholz, Shyam Sundar Rajaram, Rajan Lukose
Web-Used Pattern Insight Platform

Publication number: 20110029516

Abstract: A web site usage pattern insight platform may be provided. User behaviors associated with web page requests, including search queries, may be captured and analyzed to provide usage pattern insights. The pattern insights may be aggregated across a plurality of users and may be used to provide recommendations for improving a system that hosts the web pages.

Type: Application

Filed: July 30, 2009

Publication date: February 3, 2011

Applicant: Microsoft Corporation

Inventors: Qing Chang, Keiichiro Suzuki, Harini Sridharan, Prashant Kamani, Aleksandr Lyamtsev, Mingyang Zhao, Aditee Kumthekar, Ashutosh Galande, Charles Ainslie, Staya Priya Hotani, Reshma Mehta, Tho Van Nguyen, Yuan Gao, Li Yang, Jin Wu, Shuang Yang, Smridh Thapar
SYSTEM AND METHOD FOR INFLUENCING A POSITION ON A SEARCH RESULT LIST GENERATED BY A COMPUTER NETWORK SEARCH ENGINE

Publication number: 20110022623

Abstract: A system and method for enabling information providers using a computer network such as the Internet to influence a position for a search listing within a search result list generated by an Internet search engine. The system and method of the present invention provides a database having accounts for the network information providers. Each account contains at least one search listing having at least three components: a description, a search term comprising one or more keywords, and a bid amount. The network information provider may add, delete, or modify a search listing after logging into his or her account via an authentication process. The network information provider influences the position for a search listing through a continuous online competitive bidding process. The bidding process occurs when the network information provider enters a new bid amount, which is preferably a money amount, for a search listing.

Type: Application

Filed: July 23, 2010

Publication date: January 27, 2011

Applicant: Yahoo! Inc.

Inventors: Darren J. Davis, Matthew Derer, Johann Garcia, Larry Greco, Tod E. Kurt, Thomas Kwong, Jonathan C. Lee, Ka Luk Lee, Preston Pfarner, Steve Skovran
FINITE AUTOMATON GENERATION SYSTEM FOR STRING MATCHING FOR MULTI-BYTE PROCESSING

Publication number: 20110022617

Abstract: An NFA circuit adapted to regular expressions and used for multibyte processing enables independent check of in what position the inputted character string matches. A 1-byte NFA converting unit (21) stores one or more regular expressions inputted by an input device (1) in a regular expression storage unit (31), sequentially reads out the regular expressions, and converts them into 1-byte processed NFAs with no ε transition. A multibyte NFA converting unit (22) converts the generated 1-byte processed NFAs into NFAs such that it can be judged in what position the inputted character string to be processed in multibyte matches a pattern on the basis of the operating mode and the number of processing bytes inputted by the input device (1) and processed and stores the NFAs in an NFA storage unit (32). An HDL converting unit (23) generates a hardware description language (HDL) of the NFA circuit from the state transition information relating to the NFAs inputted from a multibyte NFA converting unit (22).

Type: Application

Filed: March 19, 2009

Publication date: January 27, 2011

Inventor: Norio Yamagaki
RECORDING REQUESTING APPARATUS, RECORDING APPARATUS, SYSTEM, RECORDING APPARATUS SELECTING METHOD AND COMPUTER PROGRAM

Publication number: 20110002664

Abstract: In a system having a program viewing apparatus and a plurality of recording apparatuses connected to a network, a program viewing apparatus 31 allowing easy selection of a recording apparatus includes: a tuner 101; a program information storage unit 121 for storing program list information output from tuner 101; an IP communication unit 102 for transmission/reception to/from the recording apparatus; and a processing unit 110 for performing a process of selecting a recording apparatus. When a user instructs recording of a program he/she is watching, processing unit 110 transmits the program information read from program information storage unit 121, together with a search request, to recording apparatuses including a recording apparatus 32. If a prescribed similarity relation is found between the received program information and recording history, the recording apparatus 32 notifies the program viewing apparatus 31 that it is eligible for recording the program.

Type: Application

Filed: February 20, 2009

Publication date: January 6, 2011

Inventors: Hideki Nishimura, Hiroyuki Nakaoka
DOCUMENT EDITING DEVICE AND DOCUMENT EDITING METHOD

Publication number: 20110004605

Abstract: A document editing device can edit a document using a markup language, and includes: an operation means; a display means that displays an editing screen for editing the document; a control means that searches a character string of a document displayed on the document editing screen, the character string being a character string to which a character decoration type identical to a search-target character decoration type specified by an operation of the operation means is set.

Type: Application

Filed: December 24, 2008

Publication date: January 6, 2011

Applicant: Kyocera Corporation

Inventor: Takenori Tomino
LOCATION SEARCH DEVICE, LOCATION SEARCH METHOD, AND COMPUTER-READABLE STORAGE MEDIUM STORING LOCATION SEARCH PROGRAM

Publication number: 20100325104

Abstract: A location search device in which, when one of the buttons in a character input portion is repeatedly pressed, a plurality of characters assigned in advance to that button are displayed in an input character display portion in a predetermined cyclic sequence. When one of the buttons is pressed, and then another of the buttons is pressed, the character displayed in the display portion immediately prior to pressing of the other button is set as an input set character. When a character is input through the character input portion following the input set character, a character string is input. A plurality of compound character strings are created by combining the input set character string with the plurality of characters assigned to the button that is pressed next, and search object character strings that partially match the respective compound character strings are acquired from a character string storage portion as input candidate character strings.

Type: Application

Filed: June 2, 2010

Publication date: December 23, 2010

Applicant: AISIN AW CO., LTD.

Inventor: Hiroshi KAWAUCHI
METHOD AND/OR SYSTEM FOR TAGGING TREES

Publication number: 20100318521

Abstract: Embodiments of methods and/or systems for tagging trees are disclosed.

Type: Application

Filed: July 2, 2010

Publication date: December 16, 2010

Applicant: Robert T. and Virginia T. Jenkins as Trustees of the Jenkins Family Trust Dated 2/8/2002

Inventor: Jack J. LeTourneau
DOCUMENT PROCESSING METHOD AND SYSTEM

Publication number: 20100306248

Abstract: A method and system for expanding a document set as a search data source in the field of business-related search. The present invention provides a method of expanding a seed document in a seed document set. The method includes identifying one or more entity words of the seed document; identifying one or more topic words related to the entity word in the seed document where the entity word is located; forming an entity word-topic word pair from each identified topic word and the entity word on the basis of which each topic word is identified; and obtaining one or more expanded documents through the web by using the entity word and topic word in each entity word-topic word pair as keywords at the same time. A system for executing the above method is also provided.

Type: Application

Filed: May 25, 2010

Publication date: December 2, 2010

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Sheng Hua Bao, Jie Cui, Hui Su, Zhong Su, Li Zhang
PATTERN MATCHER AND ITS MATCHING METHOD

Publication number: 20100306209

Abstract: A pattern matching method is disclosed. The method includes following steps. A character is searched in a skip table of a pattern such that a flag value and a skip value are returned. The sliding window is shifted according to the skip value when the flag value indicates the character is not a pattern end. The character plus at least one byte preceding the character is hashed when the flag value indicates the character is the pattern end such that a character hashing value is returned. A pattern end portion is hashed, wherein the size of the pattern end portion is equal to the size of the character plus the size of the byte such that a pattern hashing value is returned. The character hashing value is compared with the pattern hashing value. An exact matching process is performed when the character hashing value is equal to the pattern hashing value.

Type: Application

Filed: August 13, 2010

Publication date: December 2, 2010

Inventors: Tien-Fu Chen, Chieh-Jen Cheng
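
The steps listed in the abstract above (publication 20100306209) can be sketched for a single pattern as follows. This is an illustrative reading under stated assumptions, not the applicants' implementation: the skip table is a Horspool-style table whose entries carry an "is pattern end" flag, the hash is Python's built-in `hash`, and the names `build_skip_table` and `search` are hypothetical.

```python
def build_skip_table(pattern):
    """Map each character to (is_pattern_end, skip)."""
    m = len(pattern)
    table = {}
    # Characters before the last one: distance to the end of the pattern.
    for j, c in enumerate(pattern[:-1]):
        table[c] = (False, m - 1 - j)
    last = pattern[-1]
    # After examining a window that ends in the last character, skip to
    # that character's previous occurrence, or by m if there is none.
    skip_after_end = table[last][1] if last in table else m
    table[last] = (True, skip_after_end)
    return table


def search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    m = len(pattern)
    k = min(2, m)  # the character plus one preceding byte, when available
    table = build_skip_table(pattern)
    pattern_hash = hash(pattern[-k:])
    hits = []
    pos = m - 1  # index of the last character of the sliding window
    while pos < len(text):
        is_end, skip = table.get(text[pos], (False, m))
        if is_end and hash(text[pos - k + 1:pos + 1]) == pattern_hash:
            start = pos - m + 1
            # Exact matching only runs when the cheap hash test passes.
            if text[start:pos + 1] == pattern:
                hits.append(start)
        pos += skip
    return hits


print(search("abracadabra", "abra"))
```

The hash comparison acts as an inexpensive filter, so the full comparison is attempted only at positions where the window's tail already looks like the pattern's tail.
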
Information processing apparatus and information processing method

Publication number: 20100293158

Abstract: There is provided an information processing apparatus including a search processing section for causing a transmission/reception section to execute processing of transmitting a search request including a search condition to each of one or more information management devices, causing the transmission/reception section to execute processing of receiving, as a response to the search request and from each of the one or more information management devices via a network, content information corresponding to the search condition from among pieces of content information and management subject identification information for identifying the information management device which manages the content information, and correlating the management subject identification information with content identification information and content-related information that are included in the content information received by the transmission/reception section and causing a storage section to store the correlated management subject identificatio

Type: Application

Filed: April 27, 2010

Publication date: November 18, 2010

Applicant: Sony Corporation

Inventors: Nobuyoshi Tomita, Yasuaki Honda, Yasuo Endo
Range Definition Method and System

Publication number: 20100281050

Abstract: A range-conversion method and system includes receiving data records. Each data record includes one or more data fields and a field value associated with each data field. One or more data fields are identified as a range-based data field. A plurality of text-based range descriptors are defined, such that each text-based range descriptor is associated with a range of field values for one of the range-based data fields.

Type: Application

Filed: July 15, 2010

Publication date: November 4, 2010

Inventor: Roy Schoenberg
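
As a small illustration of the text-based range descriptors in the abstract above (publication 20100281050), the sketch below associates labels with ranges of field values for one range-based field. The field name, labels, and boundaries are invented for the example and do not come from the application.

```python
# Hypothetical descriptors: (label, inclusive low, inclusive high) per field.
RANGE_DESCRIPTORS = {
    "age": [
        ("child", 0, 12),
        ("teen", 13, 19),
        ("adult", 20, 64),
        ("senior", 65, 150),
    ],
}


def describe(field, value):
    """Return the text descriptor whose range contains value, or None."""
    for label, low, high in RANGE_DESCRIPTORS.get(field, []):
        if low <= value <= high:
            return label
    return None


def convert_record(record):
    """Replace range-based field values in a data record with descriptors."""
    return {
        field: describe(field, value) if field in RANGE_DESCRIPTORS else value
        for field, value in record.items()
    }


print(convert_record({"name": "Pat", "age": 15}))
```
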
METHOD AND SYSTEM FOR RECOMMENDATION OF CONTENT ITEMS

Publication number: 20100281025

Abstract: A method of generating recommendations for content items comprises providing a domain ontology where concepts are characterized by a term vector with terms and associated weights. Associated term sets, each of which comprises a set of terms that characterize a content item, are further provided. A concept set is generated for each associated term set by determining the concepts of the domain ontology that match the terms of the associated term set. In addition, a user profile for a user is provided where the user profile comprises at least some of the concepts of the ontology coupled with preference weights. Recommendations for content items are generated based on the plurality of associated concept sets and the user profile. The invention may allow improved and/or facilitated generation of recommendations from text based characterizing data.

Type: Application

Filed: May 4, 2009

Publication date: November 4, 2010

Applicant: MOTOROLA, INC.

Inventors: Dorothea Tsatsou, Paul C. Davis, Symeon Papadopoulos, Fotis Menemenis, Ben M. Bratu, George Kalfas, Ioannis Kompatsiaris
Advanced Warning

Publication number: 20100268696

Abstract: The present invention is directed to a system, method and server to assist account issuers in managing risk, fraud and unauthorized use. A system, method and server for use in pushing advanced warning alerts to issuers based on consumer data element level triggering events and fraud and unauthorized use reports is disclosed. The ability to push the alerts to issuers with a permissible purpose for receiving the information in the alerts provides a real-time, online and cost-effective way of providing issuers with valuable risk management tools.

Type: Application

Filed: July 20, 2009

Publication date: October 21, 2010

Inventors: Brad Nightengale, Sharon Rowberry
SEARCH AND SEARCH OPTIMIZATION USING A PATTERN OF A LOCATION IDENTIFIER

Publication number: 20100268700

Abstract: Systems and methods for search and search optimization using a pattern in a location identifier are disclosed. In one aspect, embodiments of the present disclosure include a method, which may be implemented on a system, of search and search optimization. The method includes detecting a set of location identifiers that have a pattern that matches a specified pattern and identifying a set of search results as having content related to the semantic type. The specified pattern can be stored in a computer-readable storage medium and corresponds to a semantic type. The set of search results can include objects associated with the set of location identifiers having the specified pattern.

Type: Application

Filed: April 14, 2010

Publication date: October 21, 2010

Applicant: Evri, Inc.

Inventors: James M. Wissner, Nova T. Spivack
Information Terminal Apparatus

Publication number: 20100268721

Abstract: An information terminal device having a higher degree of convenience than conventional ones is provided. The information terminal device according to the present invention is provided with a display 5, a key input device 6, a memory 7 and a control circuit 2.

Type: Application

Filed: December 10, 2008

Publication date: October 21, 2010

Applicant: KYOCERA CORPORATION

Inventor: Takashi Kitano
Database index key update method and program

Publication number: 20100262617

Abstract: A method of database update processing for updating efficiently database index keys, when new database index keys are supplied to replace index keys already in the database, generates a delta data between the new and old data comprising insert and delete keys by delete processing from a coupled node tree holding the index keys in the old data using index keys of new data as delete keys, and generates new data by delete and insert processing from and into a coupled node tree holding index keys in old data as index keys using the delete keys and insert keys of the delta data.

Type: Application

Filed: June 18, 2010

Publication date: October 14, 2010

Applicant: S. Grants Co., Ltd.

Inventors: Toshio Shinjo, Mitsuhiro Kokubun
IN-CONTEXT EXACT (ICE) MATCHING

Publication number: 20100262621

Abstract: Methods, systems and program products are disclosed for determining a matching level of a text lookup segment with a plurality of source texts in a translation memory in terms of context. In particular, embodiments of the present invention determine any exact matches for the lookup segment in the plurality of source texts, and determine, in the case that at least one exact match is determined, that a respective exact match is an in-context exact (ICE) match for the lookup segment in the case that a context of the lookup segment matches that of the respective exact match. The degree of context matching required can be predetermined, and results prioritized. The invention also includes methods, systems and program products for storing a translation pair of source text and target text in a translation memory including context, and the translation memory so formed. The invention ensures that content is translated the same as previously translated content and reduces translator intervention.

Type: Application

Filed: October 27, 2009

Publication date: October 14, 2010

Inventors: Russ Ross, Kevin Gillespie, Oliver Christ, Daniel Brockmann
INFORMATION SEARCH METHOD, APPARATUS, PROGRAM AND COMPUTER READABLE RECORDING MEDIUM

Publication number: 20100257159

Abstract: An information search apparatus is provided. The information search apparatus includes: a character string input unit configured to obtain a character string from a client; a character string information search unit configured to obtain information that includes the character string from an index DB; a similarity calculation unit configured to calculate degree of similarity between the character string and searched information; and an output unit configured to output the searched information in descending order of the degree of similarity.

Type: Application

Filed: September 10, 2008

Publication date: October 7, 2010

Applicant: Nippon Telegraph and Telephone Corporation

Inventors: Yukio Uematsu, Kengo Fujioka, Syunsuke Konagai, Ryoji Kataoka
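
The pipeline in the abstract above (publication 20100257159) has three stages: look up entries containing the input string, score each entry for similarity, and output in descending order of score. A minimal sketch follows. The abstract does not say how similarity is calculated, so the ratio from Python's standard `difflib.SequenceMatcher` is used purely as a stand-in, and a plain list stands in for the index DB.

```python
from difflib import SequenceMatcher


def search(query, index):
    """Return entries containing query, most similar to query first."""
    # Stage 1: retrieve every entry that includes the character string.
    candidates = [entry for entry in index if query in entry]
    # Stage 2: score each candidate (stand-in similarity measure).
    scored = [
        (SequenceMatcher(None, query, entry).ratio(), entry)
        for entry in candidates
    ]
    # Stage 3: output in descending order of similarity.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored]


index = ["string matching", "string", "substring search methods", "tree"]
print(search("string", index))
```
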
Methods and Apparatus for Identifying Conditional Functional Dependencies

Publication number: 20100250596

Abstract: Methods and apparatus are provided for discovering minimal conditional functional dependencies (CFDs). CFDs extend functional dependencies by supporting patterns of semantically related constants, and can be used as rules for cleaning relational data. A disclosed CFDMiner algorithm, based on techniques for mining closed itemsets, discovers constant minimal CFDs. A disclosed CTANE algorithm discovers general minimal CFDs based on the levelwise approach. A disclosed FastCFD algorithm discovers general minimal CFDs based on a depth-first search strategy, and an optimization technique via closed-itemset mining to reduce search space.

Type: Application

Filed: March 26, 2009

Publication date: September 30, 2010

Inventors: Wenfei Fan, Ming Xiong
GRAPH BASED RE-COMPOSITION OF DOCUMENT FRAGMENTS FOR NAME ENTITY RECOGNITION UNDER EXPLOITATION OF ENTERPRISE DATABASES

Publication number: 20100250598

Abstract: Methods and systems are described that involve recognizing complex entities from text documents with the help of structured data and Natural Language Processing (NLP) techniques. In one embodiment, the method includes receiving a document as input from a set of documents, wherein the document contains text or unstructured data. The method also includes identifying a plurality of text segments from the document via a set of tagging techniques. Further, the method includes matching the identified plurality of text segments against attributes of a set of predefined entities. Lastly, a best matching predefined entity is selected for each text segment from the plurality of text segments. In one embodiment, the system includes a set of documents, each document containing text or unstructured data. The system also includes a database storage unit that stores a set of predefined entities, wherein each entity contains a set of attributes.

Type: Application

Filed: March 30, 2009

Publication date: September 30, 2010

Inventors: Falk Brauer, Wojciech Barczynski, Hong-Hai Do, Alexander Loser, Marcus Schramm
Recognition of addresses from the body of arbitrary text

Publication number: 20100250562

Abstract: A method of analyzing words in an arbitrary text document comprises identifying a candidate name of an inhabited area in an arbitrary text, searching and isolating strings to the left and the right of the candidate name, comparing these strings to a map database comprising addresses containing the candidate name, and thereby determining a complete address from the strings matching the map database and the candidate name. A method for searching for a service or product on the World Wide Web comprises providing a global database of web pages indexed by words and locations. The global database is searched using keywords describing the service or product and using a search location. The search process returns a list of web pages matching the keywords and the search location.

Type: Application

Filed: March 24, 2009

Publication date: September 30, 2010

Inventor: Ivica Siladic
METHOD AND APPARATUS FOR INTEGRATION OF COMMUNITY-PROVIDED PLACE DATA

Publication number: 20100250599

Abstract: An approach is provided for integrating place metadata provided by a community of metadata builders, including receiving registration data that indicates one or more values for a corresponding one or more attributes that describe a place. A place is associated with a geographic location. Providing an indication of match between the registration data and metadata for a predetermined place is also initiated. In some embodiments, a new entry for a set of predetermined places is generated based on validating the registration data and a negligible degree of match. In some embodiments, a unique identifier for the place is included in the indication of match for either a new place represented by the registration data or a matching predetermined place.

Type: Application

Filed: June 4, 2009

Publication date: September 30, 2010

Applicant: Nokia Corporation

Inventors: Andreas SCHMIDT, Alexander Grosse
METHOD AND SYSTEM FOR PREEMPTIVE SCANNING OF COMPUTER FILES

Publication number: 20100242109

Abstract: In embodiments of the present invention improved capabilities are described for reducing computer file access time associated with on-access scanning through predictive preemptive scanning, where the prediction may be enabled through the development and use of a file access performance cost mapping of a computing facility's file system. In a first step, file access information describing a pattern of each of a plurality of computer files that have been accessed in a computer file system may be collected. In a second step, the file access information may be processed to generate a file access performance cost statistic for each of the plurality of computer files, where the file access performance cost statistic may be a measure of the time aggregate effect on the computing facility's system performance associated with the access of the file. In a third step, the file access performance cost statistic may be maintained for each of the plurality of files accessed by the computing facility.

Type: Application

Filed: March 17, 2009

Publication date: September 23, 2010

Inventor: Graham J. Lee
IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND COMPUTER PROGRAM

Publication number: 20100232690

Abstract: An image processing apparatus includes a character recognition unit configured to perform character recognition on a plurality of character images in a document image to acquire a character code corresponding to each character image, and a generation unit configured to generate an electronic document, wherein the electronic document includes the document image, a plurality of character codes acquired by the character recognition unit, a plurality of glyphs, and data which indicates the glyphs to be used to render each of the character codes, wherein each of the plurality of glyphs is shared and used by different character codes based on the data when rendering characters that correspond to the plurality of character codes acquired by the recognition unit.

Type: Application

Filed: June 23, 2008

Publication date: September 16, 2010

Applicant: CANON KABUSHIKI KAISHA

Inventors: Tomotoshi Kanatsu, Makoto Enomoto, Taeko Yamazaki
Method and Systems for Efficient Delivery of Previously Stored Content

Publication number: 20100235374

Abstract: Systems and methods for reducing file sizes for files delivered over a network are disclosed. A method comprises receiving a first file comprising sequences of data; creating a hash table having entries corresponding to overlapping sequences of data; receiving a second file comprising sequences of data; comparing each of the sequences of data in the second file to the sequences of data in the hash table to determine sequences of data present in both the first and second files; and creating a third file comprising sequences of data from the second file and representations of locations and lengths of said sequences of data present in both the first and second files.

Type: Application

Filed: May 28, 2010

Publication date: September 16, 2010

Inventors: Henk Bots, Srikanth Devarajan, Saravana Annamalaisami
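
The scheme in the abstract above (publication 20100235374) can be illustrated with a minimal sketch. It is not the applicants' implementation: the 4-byte window `BLOCK`, the function names `build_table`, `encode` and `decode`, and the in-memory list standing in for the "third file" are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple, Union

BLOCK = 4  # hypothetical window length; the abstract does not fix one

# An op is either a literal run of bytes from the second file
# or an (offset, length) reference into the first file.
Op = Union[bytes, Tuple[int, int]]


def build_table(first: bytes, block: int = BLOCK) -> Dict[bytes, int]:
    """Map every overlapping block-sized sequence of `first` to its earliest offset."""
    table: Dict[bytes, int] = {}
    for i in range(len(first) - block + 1):
        table.setdefault(first[i:i + block], i)
    return table


def encode(first: bytes, second: bytes, block: int = BLOCK) -> List[Op]:
    """Describe `second` as literals plus references to sequences shared with `first`."""
    table = build_table(first, block)
    ops: List[Op] = []
    literal = bytearray()
    i = 0
    while i < len(second):
        off = table.get(second[i:i + block])
        if off is None:
            literal.append(second[i])
            i += 1
            continue
        # Extend the match for as long as both files agree.
        length = block
        while (off + length < len(first) and i + length < len(second)
               and first[off + length] == second[i + length]):
            length += 1
        if literal:
            ops.append(bytes(literal))
            literal.clear()
        ops.append((off, length))
        i += length
    if literal:
        ops.append(bytes(literal))
    return ops


def decode(first: bytes, ops: List[Op]) -> bytes:
    """Rebuild the second file from the first file and the op list."""
    out = bytearray()
    for op in ops:
        if isinstance(op, bytes):
            out += op
        else:
            off, length = op
            out += first[off:off + length]
    return bytes(out)
```

A receiver that already holds the first file only needs the op list, which is shorter than the second file whenever the two share long sequences.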
System and Method for Entropy-Based Near-Match Analysis

Publication number: 20100235392

Abstract: A system and method for an entropy-based near-match analysis identifies target files that are almost, but not exactly, identical to a reference file. A computing processor computes entropies of the reference and target files, and determines the likeness of the target files to the reference file based on the computed entropies. The computing processor determines a near match between the target file and the reference file if the likeness of the two files is within a user-defined tolerance level. According to one embodiment of the invention, the information entropy is a weighted value that takes into account the size of the file.

Type: Application

Filed: March 11, 2010

Publication date: September 16, 2010

Inventors: Shawn McCreight, Dominik Weber
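
As a rough sketch of the comparison described in the abstract above (publication 20100235392), the following computes a byte-level Shannon entropy and compares two files within a tolerance. The weighting used here (entropy multiplied by file length) and the relative-difference test are assumptions; the abstract says only that the entropy is weighted by file size and that the tolerance is user-defined.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def weighted_entropy(data: bytes) -> float:
    """Hypothetical size weighting: bits per byte times the number of bytes."""
    return shannon_entropy(data) * len(data)


def is_near_match(reference: bytes, target: bytes, tolerance: float = 0.05) -> bool:
    """True if the weighted entropies differ by at most `tolerance` (relative)."""
    ref = weighted_entropy(reference)
    tgt = weighted_entropy(target)
    if ref == 0.0:
        return tgt == 0.0
    return abs(ref - tgt) / ref <= tolerance
```

Entropy is a coarse fingerprint: unrelated files can share the same value, so a check like this is better suited to shortlisting candidates than to confirming a match.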
HOLISTIC DISAMBIGUATION FOR ENTITY NAME SPOTTING

Publication number: 20100223292

Abstract: A method resolves ambiguous spotted entity names in a data corpus by determining an activation level value for each of a plurality of nodes corresponding to a single ambiguous entity name. The activation levels for each of the nodes may be modified by inputting outside domain knowledge corresponding to the nodes to increase the activation value of the nodes, spotting entity names corresponding to the nodes to increase the activation value of the nodes, searching the data corpus to spot newly posted entity names to increase the activation value of the nodes, and searching the data corpus to reduce or deactivate the activation value of the nodes by eliminating false positives. The ambiguous entity name is assigned to the node determined to have the highest activation level and is then outputted to a user.

Type: Application

Filed: February 27, 2009

Publication date: September 2, 2010

Applicant: International Business Machines Corporation

Inventors: Varun Bhagwan, Tyrone W.A. Grandison, Daniel F. Gruhl, Jan H. Pieper
MEASURING CONTEXTUAL SIMILARITY

Publication number: 20100223280

Abstract: A contextual similarity measurement system computes a similarity model for a reference document using a prediction by partial match method. The system further computes a similarity measure between a compared document and the reference document using the similarity model for the reference document.

Type: Application

Filed: February 27, 2009

Publication date: September 2, 2010

Inventor: JAMES PAUL SCHNEIDER
INFRASTRUCTURE FOR SPILLING PAGES TO A PERSISTENT STORE

Publication number: 20100223305

Abstract: Techniques for managing memory usage in a processing system are provided. This may be achieved by receiving a data stream including multiple tuples and determining a query plan that was generated for a continuous query applied to the multiple tuples in the data stream. The query plan may include one or more operators. Before scheduling an operator in the query plan, it is determined when an eviction is to be performed based on a level of free memory of the processing system. An eviction candidate is determined and a page associated with the eviction candidate is evicted from the memory to a persistent storage.

Type: Application

Filed: March 2, 2009

Publication date: September 2, 2010

Applicant: Oracle International Corporation

Inventors: Hoyong Park, Namit Jain, Anand Srinivasan, Shailendra Mishra
Extrusion detection using taint analysis

Patent number: 7788235

Abstract: An extrusion detection system prevents the release of sensitive data from an enterprise. The system includes an administration module for broadcasting taint instructions, each of which includes a definition of sensitive data. The system also includes a plurality of extrusion detection nodes. Each node marks sensitive data as tainted responsive to the taint instructions, and marks data that depends on sensitive data as tainted. When the potential release of tainted data is detected, an action is executed responsive to the taint instructions.

Type: Grant

Filed: September 29, 2006

Date of Patent: August 31, 2010

Assignee: Symantec Corporation

Inventor: Matthew Yeo
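
The taint-propagation idea in the abstract above (patent 7788235) can be sketched as follows. The `Tainted` wrapper, the `Node` class, and the use of substring terms as the "definition of sensitive data" are hypothetical names and simplifications, not details taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Tainted:
    """A value together with its taint flag."""
    value: str
    tainted: bool = False


class Node:
    """One detection node, configured by a (hypothetical) taint instruction."""

    def __init__(self, sensitive_terms: Iterable[str],
                 on_release: Callable[[Tainted], None]) -> None:
        self.sensitive_terms = list(sensitive_terms)
        self.on_release = on_release

    def ingest(self, text: str) -> Tainted:
        # Mark data as tainted if it matches the definition of sensitive data.
        return Tainted(text, any(term in text for term in self.sensitive_terms))

    def derive(self, func: Callable[..., str], *inputs: Tainted) -> Tainted:
        # Data that depends on tainted data is itself tainted.
        result = func(*(item.value for item in inputs))
        return Tainted(result, any(item.tainted for item in inputs))

    def release(self, item: Tainted) -> bool:
        # Run the configured action and refuse release if the data is tainted.
        if item.tainted:
            self.on_release(item)
            return False
        return True
```

The point of tracking the flag separately from the content is that a derived value stays tainted even when it no longer contains any of the original sensitive terms.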
Method for automatically sensing a set of items

Publication number: 20100217770

Abstract: A method for automatically sensing a set of elements in a computer system, wherein each element in the set has an associated character body from a plurality of character bodies, and each character body comprises character strings which characterize a respective element, the performance of the method involving a search for at least one prescribed character string within the character bodies and use of the at least one character string to ascertain at least one property for at least one element, and association of this at least one ascertained property with at least one category, and this involving a user of the method being provided with a taxonomy which is inherent in the set of elements.

Type: Application

Filed: February 17, 2010

Publication date: August 26, 2010

Inventor: Peter Ernst
WORD SPOTTING FALSE ALARM PHRASES

Publication number: 20100217596

Abstract: In one aspect, a method for processing media includes accepting a query. One or more language patterns are identified that are similar to the query. A putative instance of the query is located in the media. The putative instance is associated with a corresponding location in the media. The media in a vicinity of the putative instance is compared to the identified language patterns and data characterizing the putative instance of the query is provided according to the comparing of the media to the language patterns, for example, as a score for the putative instance that is determined according to the comparing of the media to the language patterns.

Type: Application

Filed: February 24, 2009

Publication date: August 26, 2010

Applicant: Nexidia Inc.

Inventors: Robert W. Morris, Jon A. Arrowood, Mark A. Clements, Kenneth King Griggs, Peter S. Cardillo, Marsal Gavalda
APPARATUS FOR PROCESSING STRINGS SIMULTANEOUSLY

Publication number: 20100211591

Abstract: An exemplary string processing method for specific byte string processing with word-related instructions includes: loading a plurality of first predetermined strings; comparing a specific string with the loaded first predetermined strings simultaneously, thereby generating a plurality of comparison results corresponding to the specific string; and generating a string processing result according to the comparison results. A string processing apparatus uses the string processing method.

Type: Application

Filed: February 16, 2009

Publication date: August 19, 2010

Inventors: Chuan-Hua Chang, Chi-Chang Lai, Hong-Men Su
Characterizing User Information

Publication number: 20100211960

Abstract: Among other disclosed subject matter, a computer-implemented method for characterizing user information includes receiving a plurality of identifiers associated with respective users. The method includes identifying, using the plurality of identifiers, any information portions in an information collection relating to at least one of the users, the information collection reflecting network activities by the users. The method includes generating a record that includes the plurality of identifiers associated with the corresponding information portions. The method includes identifying at least one of the information portions as corresponding to a category established for user classification. The method includes identifying a subset of the plurality of identifiers as associated with the category.

Type: Application

Filed: February 17, 2009

Publication date: August 19, 2010

Applicant: GOOGLE INC.

Inventors: Sarah Sirajuddin, Xuefu Wang, Angshuman Guha, Oren E. Zamir, Aitan Weinberg
QUERY STRING MATCHING METHOD AND APPARATUS

Publication number: 20100205173

Abstract: In one implementation, a method is provided for increasing relevance of database search results. The method includes receiving a subject query string and determining a trained edit distance between the subject query string and a candidate string using trained cost factors derived from a training set of labeled query transformations. A trained cost factor includes a conditional probability for mutations in labeled non-relevant query transformations and a conditional probability for mutations in labeled relevant query transformations. The candidate string is evaluated for selection based on the trained edit distance. In some implementations, the cost factors may take into account the context of a mutation. As such, in some implementations multi-dimensional matrices are utilized which include the trained cost factors.

Type: Application

Filed: April 22, 2010

Publication date: August 12, 2010

Applicant: YAHOO! INC.

Inventor: John M. Carnahan
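
To illustrate the edit distance with per-mutation costs described in the abstract above (publication 20100205173), here is a minimal sketch of a weighted Levenshtein distance. How the costs are trained from labeled query transformations is not shown; the `costs` dictionary, its `(from, to)` key layout, and the `default` cost are assumptions for illustration.

```python
from typing import Dict, Tuple

# (from_char, to_char) -> cost; the empty string marks an insertion or deletion.
Costs = Dict[Tuple[str, str], float]


def weighted_edit_distance(source: str, target: str,
                           costs: Costs, default: float = 1.0) -> float:
    """Edit distance where each mutation has its own cost (unlisted ones cost `default`)."""

    def cost(a: str, b: str) -> float:
        return 0.0 if a == b else costs.get((a, b), default)

    m, n = len(source), len(target)
    dist = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = dist[i - 1][0] + cost(source[i - 1], "")
    for j in range(1, n + 1):
        dist[0][j] = dist[0][j - 1] + cost("", target[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dist[i][j] = min(
                dist[i - 1][j] + cost(source[i - 1], ""),                 # delete
                dist[i][j - 1] + cost("", target[j - 1]),                 # insert
                dist[i - 1][j - 1] + cost(source[i - 1], target[j - 1]),  # substitute
            )
    return dist[m][n]
```

With an empty cost table this reduces to the ordinary Levenshtein distance; a cheap entry such as `{("c", "k"): 0.25}` makes "cat" and "kat" closer than an untrained distance would. Context-dependent costs, which the abstract also mentions, would need a richer key than a character pair.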
SYSTEMS AND METHODS FOR IDENTIFYING UNWANTED OR HARMFUL ELECTRONIC TEXT

Publication number: 20100205123

Abstract: The present invention relates to systems and methods for identifying and removing unwanted or harmful electronic text (e.g., spam). In particular, the present invention provides systems and methods utilizing inexact string matching methods and machine learning and non-learning methods for identifying and removing unwanted or harmful electronic text.

Type: Application

Filed: August 8, 2007

Publication date: August 12, 2010

Applicant: TRUSTEES OF TUFTS COLLEGE

Inventors: D. Sculley, Gabriel Wachman, Carla E. Brodley
Method and Device for High Performance Regular Expression Pattern Matching

Publication number: 20100198850

Abstract: Disclosed herein is an improved architecture for regular expression pattern matching. Improvements to pattern matching deterministic finite automatons (DFAs) that are described by the inventors include a pipelining strategy that pushes state-dependent feedback to a final pipeline stage to thereby enhance parallelism and throughput, augmented state transitions that track whether a transition is indicative of a pattern match occurring thereby reducing the number of necessary states for the DFA, augmented state transitions that track whether a transition is indicative of a restart to the matching process, compression of the DFA's transition table, alphabet encoding for input symbols to equivalence class identifiers, the use of an indirection table to allow for optimized transition table memory, and enhanced scalability to facilitate the ability of the improved DFA to process multiple input symbols per cycle.

Type: Application

Filed: February 10, 2010

Publication date: August 5, 2010

Applicant: Exegy Incorporated

Inventors: Ron K. Cytron, David Edward Taylor, Benjamin Curry Brodie
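
Two of the ideas in the abstract above (publication 20100198850), alphabet encoding into equivalence classes and transitions that carry a match flag, can be shown in a small software sketch. The table below is hand-built for the single pattern `ab+c` and is an assumption for illustration; the application describes a hardware pipeline, and nothing here reflects its actual table layout or compiler.

```python
from typing import List, Tuple

# Equivalence class identifiers: every byte other than a, b, c behaves the same.
A, B, C, OTHER = 0, 1, 2, 3


def build_class_map() -> List[int]:
    """Alphabet encoding: map each of the 256 byte values to a class identifier."""
    class_of = [OTHER] * 256
    class_of[ord("a")] = A
    class_of[ord("b")] = B
    class_of[ord("c")] = C
    return class_of


# TRANSITIONS[state][class] -> (next_state, match_flag).
# State 0: nothing matched yet; 1: seen "a"; 2: seen "a" followed by one or more "b".
# The match flag sits on the transition, so no separate accepting state is needed.
TRANSITIONS: List[List[Tuple[int, bool]]] = [
    [(1, False), (0, False), (0, False), (0, False)],
    [(1, False), (2, False), (0, False), (0, False)],
    [(1, False), (2, False), (0, True), (0, False)],
]


def find_matches(data: bytes) -> List[int]:
    """Return the end offsets of every occurrence of ab+c in `data`."""
    class_of = build_class_map()
    state = 0
    ends: List[int] = []
    for offset, byte in enumerate(data):
        state, matched = TRANSITIONS[state][class_of[byte]]
        if matched:
            ends.append(offset)
    return ends
```

The table has 4 columns instead of 256 because the class map collapses all bytes that the pattern cannot distinguish, which is the memory saving the abstract attributes to alphabet encoding.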
Extracting Patterns from Sequential Data

Publication number: 20100191753

Abstract: Described is a technology in which sequential data, such as application program command sequences, are processed into patterns, such as for use in analyzing program usage. In one aspect, sequential data may be first transformed via state machines that remove repeated data, group similar data into sub-sequences, and/or remove noisy data. The transformed data is then segmented into units. A pattern extraction mechanism extracts patterns from the units into a pattern set, by calculating a stability score (e.g., a mutual information score) between succeeding units, selecting the pair of units having the most stability (e.g., the highest score), and adding corresponding information for that pair into the pattern set. Pattern extraction is iteratively repeated until a stopping criterion is met, e.g., the pattern set reaches a defined size, or when the stability score is smaller than a pre-set threshold.

Type: Application

Filed: January 26, 2009

Publication date: July 29, 2010

Applicant: Microsoft Corporation

Inventors: Jie Su, Min Chu, Wenli Zhu, Jian Wang
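
The iterative extraction loop in the abstract above (publication 20100191753) might look like the sketch below. Several choices are assumptions rather than details from the application: pointwise mutual information as the stability score, merging the selected pair into a single unit before the next round, and the hypothetical `min_count` filter that keeps one-off pairs from dominating the score.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Unit = Tuple[str, ...]  # a unit is one or more commands in sequence


def pair_scores(units: List[Unit], min_count: int) -> Dict[Tuple[Unit, Unit], float]:
    """Pointwise mutual information for each pair of succeeding units."""
    unit_counts = Counter(units)
    pair_counts = Counter(zip(units, units[1:]))
    n_units, n_pairs = len(units), len(units) - 1
    scores: Dict[Tuple[Unit, Unit], float] = {}
    for (a, b), count in pair_counts.items():
        if count < min_count:
            continue
        p_ab = count / n_pairs
        p_a = unit_counts[a] / n_units
        p_b = unit_counts[b] / n_units
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


def merge_pair(units: List[Unit], pair: Tuple[Unit, Unit]) -> List[Unit]:
    """Replace each non-overlapping occurrence of `pair` with one combined unit."""
    merged: List[Unit] = []
    i = 0
    while i < len(units):
        if i + 1 < len(units) and (units[i], units[i + 1]) == pair:
            merged.append(units[i] + units[i + 1])
            i += 2
        else:
            merged.append(units[i])
            i += 1
    return merged


def extract_patterns(commands: List[str], max_patterns: int = 10,
                     threshold: float = 1.0, min_count: int = 2) -> List[Unit]:
    """Repeat: score pairs, take the best, record it, merge it; stop on size or threshold."""
    units: List[Unit] = [(c,) for c in commands]
    patterns: List[Unit] = []
    while len(patterns) < max_patterns and len(units) > 1:
        scores = pair_scores(units, min_count)
        if not scores:
            break
        best = max(scores, key=scores.get)
        if scores[best] < threshold:
            break
        patterns.append(best[0] + best[1])
        units = merge_pair(units, best)
    return patterns
```

For the command log `type type copy paste type copy paste type type copy paste type`, the sketch extracts the single pattern `("copy", "paste")` and then stops because no remaining pair clears the threshold.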
METHODS FOR MATCHING METADATA FROM DISPARATE DATA SOURCES

Publication number: 20100185637

Abstract: Methods for matching candidates with a target utilizing extract, transform and load (ETL) metadata utilizing a computer, the candidates originating from a number of secondary data sources, are presented including: causing the computer to receive the target from a target data source; causing the computer to fetch the candidates from the number of secondary data sources; causing the computer to process match rules, the match rules configured for determining whether the candidates match with the target, where the ETL metadata provides data for the processing; if the number of match rules determines a potential candidate match, causing the computer to score the potential candidate match utilizing a weighting method, the weighting method corresponding with a degree of importance of the match, where the potential candidate match corresponds with one of the candidates; and causing the computer to display the potential candidate match.

Type: Application

Filed: January 14, 2009

Publication date: July 22, 2010

Applicant: International Business Machines Corporation

Inventors: Richard K. Morris, Neville T. Myatt
SCALABLE SEMI-STRUCTURED NAMED ENTITY DETECTION

Publication number: 20100185691

Abstract: Disclosed are methods and apparatus for performing named entity recognition. A set of candidates and corresponding contexts are obtained, each of the set of candidates being a potential seed example of an entity. The contexts of at least a portion of the set of candidates are compared with contexts of a set of seed examples of the entity such that a subset of the set of candidates is added to the set of the seed examples. A set of rules is created from the set of seed examples obtained in the comparing step. A final set of seed examples of the entity is generated by executing the set of rules against the set of candidates.

Type: Application

Filed: January 20, 2009

Publication date: July 22, 2010

Applicant: Yahoo! Inc.

Inventors: Utku Irmak, Reiner Kraft
Method and system for indexing and searching an iris image database

Patent number: 7761453

Abstract: A method and system for indexing and searching a database of iris images having a system to expedite a process of matching a subject to millions (more or less) of templates within a database is disclosed. The system may progressively match an iris image template to an iris template in a database by progressing from a top layer of the database to a lower layer of the database. Such matching or retrieval may use a subject code as a query or probe and then find a similarity measure for the features of codes or templates in the database. A multi-stage hierarchical clustering process may be used to compress codes and/or templates.

Type: Grant

Filed: March 2, 2007

Date of Patent: July 20, 2010

Assignee: Honeywell International Inc.

Inventor: Rida M. Hamza
INTEREST-GROUP DISCOVERY SYSTEM

Publication number: 20100174724

Abstract: A method of social networking is disclosed in which users may indicate areas of interest to them and/or search for other users with the same or similar interests. Users may specify one or more areas of interest to them and enter those interests in a database, by means of a list of “tags” or keywords, called a taglist. The values of the tags may be weighted. A user wishing to find other users with similar interests may input a search taglist which is compared with the other taglists stored in the database, and a list of the users with the closest matching taglists is returned. The method may also be used to characterize documents, projects, media files or other data objects, so that such items may be searched in similar fashion.

Type: Application

Filed: January 1, 2010

Publication date: July 8, 2010

Inventors: David Robert Wallace, Marilynn Klamkin, Kali Donovan
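
The weighted taglist comparison in the abstract above (publication 20100174724) can be sketched with a standard similarity measure. Cosine similarity is an assumption here, since the abstract does not say how closeness is computed, and the names `Taglist`, `cosine` and `closest_users` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Taglist = Dict[str, float]  # tag -> weight


def cosine(a: Taglist, b: Taglist) -> float:
    """Cosine similarity between two weighted taglists (0.0 if either is empty)."""
    dot = sum(weight * b[tag] for tag, weight in a.items() if tag in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_users(query: Taglist, database: Dict[str, Taglist],
                  top_n: int = 3) -> List[Tuple[str, float]]:
    """Return the `top_n` users whose stored taglists best match the search taglist."""
    scored = [(user, cosine(query, tags)) for user, tags in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]
```

The same functions apply unchanged to documents, projects or media files, as long as each item is stored with its own taglist.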
