Sequential Access, e.g., String Matching, etc. (EPO) Patents (Class 707/E17.039)
  • Publication number: 20120124047
    Abstract: Example methods, apparatus, and articles of manufacture to manage log entries are disclosed. A disclosed example method involves grouping first log entries into a first group based on a matching portion among the first log entries. The example method also involves identifying a non-matching portion of the first log entries and associating an identifier with the non-matching portion. A processor is operated to generate a text string template comprising the identifier and the at least one matching portion in a human-readable format. The identifier replaces the non-matching portion in the template.
    Type: Application
    Filed: November 17, 2010
    Publication date: May 17, 2012
    Inventor: Eric Hubbard
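    Illustration: the abstract above describes replacing the non-matching portion of grouped log entries with an identifier to form a human-readable template. The following is only a minimal sketch of that idea, not the patented method. It assumes whitespace tokenization, entries with equal token counts, and a hypothetical placeholder format `<VAR1>`, none of which come from the abstract.

```python
def build_template(entries, placeholder="<VAR{}>"):
    """Derive one text template from log entries that share a token count.

    Tokens that are identical in every entry are kept verbatim; each
    position where the entries differ is replaced by a numbered identifier.
    """
    token_rows = [entry.split() for entry in entries]
    if len({len(row) for row in token_rows}) != 1:
        raise ValueError("entries must be non-empty and share a token count")

    template = []
    var_index = 0
    for column in zip(*token_rows):
        if len(set(column)) == 1:
            # Matching portion: keep the literal token.
            template.append(column[0])
        else:
            # Non-matching portion: substitute an identifier.
            var_index += 1
            template.append(placeholder.format(var_index))
    return " ".join(template)


logs = [
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.7 closed",
]
template = build_template(logs)
```

    With these two sample entries, only the third token differs, so it is the only position replaced by an identifier.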
  • Publication number: 20120109972
    Abstract: A vectorization process is employed in which chemical identifier strings are converted into respective vectors. These vectors may then be searched to identify molecules that are identical or similar to each other. The dimensions of the vector space can be defined by sequences of symbols that make up the chemical identifier strings. The International Chemical Identifier (InChI) string defined by the International Union of Pure and Applied Chemistry (IUPAC) is particularly well suited for these methods.
    Type: Application
    Filed: December 21, 2011
    Publication date: May 3, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Stephen Kane Boyer, GREGORY BREYTA, TAPAS KANUNGO, JEFFREY THOMAS KREULEN, JAMES J. RHODES
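    Illustration: one way to read "dimensions of the vector space defined by sequences of symbols" is as character n-gram counts. The sketch below makes that assumption, and also assumes cosine similarity as the comparison measure and n = 3; the abstract specifies neither. The InChI strings for ethanol and methanol are used only as sample inputs.

```python
import math
from collections import Counter


def vectorize(identifier, n=3):
    """Map an identifier string to a sparse vector of character n-gram counts."""
    return Counter(identifier[i:i + n] for i in range(len(identifier) - n + 1))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[gram] for gram, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


ethanol = vectorize("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
methanol = vectorize("InChI=1S/CH4O/c1-2/h2H,1H3")
similarity = cosine(ethanol, methanol)
```

    Identical strings score 1.0 and strings with no n-grams in common score 0.0, so a similarity search reduces to ranking stored vectors by this score.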
  • Publication number: 20120095991
    Abstract: A data-driven information navigation system and method enable search and analysis of a set of objects or other materials by certain common attributes that characterize the materials, as well as by relationships among the materials. The invention includes several aspects of a data-driven information navigation system that employs this navigation mode. The navigation system of the present invention includes features of a knowledge base, a navigation model that defines and enables computation of a collection of navigation states, a process for computing navigation states that represent incremental refinements relative to a given navigation state, and methods of implementing the preceding features.
    Type: Application
    Filed: September 12, 2011
    Publication date: April 19, 2012
    Applicant: ENDECA TECHNOLOGIES, INC.
    Inventors: Adam J. Ferrari, Frederick C. Knabe, Vinay S. Mohta, Jason P. Myatt, Benjamin S. Scarlet, Daniel Tunkelang, John S. Walter, Joyce Wang, Michael Tucker
  • Publication number: 20120095750
    Abstract: Parsing technology is applied to observable collections. More specifically, a parser, such as combinator parser, can be employed to perform syntactic analysis over one or more observable collections. Further, multiple observable collections can be combined into a single collection and time can be captured by annotating collection items or generating time items.
    Type: Application
    Filed: October 14, 2010
    Publication date: April 19, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Henricus Johannes Maria Meijer, John Wesley Dyer, Daniel Johannes Pieter Leijen
  • Publication number: 20120089632
    Abstract: An information terminal apparatus is provided with a storage unit and a search unit. The storage unit stores information on a plurality of character strings and information on the pronunciation and shape of each of the characters included in the plurality of character strings. The search unit receives a first input representing part or all of one of the pronunciation and shape of a character, and retrieves possible character strings which have a first character matching the first input. The search unit then receives a second input representing part or all of the other of the pronunciation and shape of a character, and extracts possible character strings which further have a second character matching the second input, from the possible character strings retrieved in response to the first input.
    Type: Application
    Filed: July 29, 2011
    Publication date: April 12, 2012
    Applicant: FUJITSU LIMITED
    Inventors: Tong ZHOU, Tarq YAMADA
  • Publication number: 20120078883
    Abstract: Methods and systems for accessing documents in document collections using predictive word sequences are disclosed. A method for accessing documents using predictive word sequences includes creating a candidate list of word sequences where respective ones of the word sequences comprise one or more elements derived from the document corpus; expanding the candidate list by adding one or more new word sequences, where each new pattern is created by combining one or more elements derived from the document corpus with one of the word sequences currently in the candidate list; determining a predictive power with respect to the subject for respective ones of entries of the candidate list, where the entries include the word sequences and the new word sequences; pruning from the candidate list ones of said entries with the determined predictive power less than a predetermined threshold; and accessing documents from the document corpus based on the pruned candidate list.
    Type: Application
    Filed: September 28, 2010
    Publication date: March 29, 2012
    Applicant: The MITRE Corporation
    Inventor: Paul Christian MELBY
  • Publication number: 20120078943
    Abstract: A computer searching technique identifies high quantitative patterns in data. A spatial indexing technique, such as an R-tree, is used to represent the data. Then a pattern searching algorithm is used to identify anchor points that define the componentwise minimum patterns. High quantitative patterns are found responsive to the componentwise minimum patterns. The search strategy is demonstrated relevant to the problem of finding suitable locations for a retail business with reference to environments of prior successful retail businesses.
    Type: Application
    Filed: September 27, 2010
    Publication date: March 29, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jin Dong, Ta-Hsin Li, Hua Liang, Ming Xie, Wen Jun Yin, Bin Z. Zhang
  • Publication number: 20120059849
    Abstract: In one embodiment, a system and method are provided to browse and analyze files comprising text strings tagged with metadata. The system and method comprise various functions including browsing the metadata tags in the file, browsing the text strings, selecting subsets of the text strings by including or excluding strings tagged with specific metadata tags, selecting text strings by matching patterns of words and/or parts of speech in the text string, and matching selected text strings to a database to identify similar text strings. The system and method further provide functions to generate suggested text selection rules by analyzing a selected subset of a plurality of text strings.
    Type: Application
    Filed: September 8, 2010
    Publication date: March 8, 2012
    Applicant: DEMAND MEDIA, INC.
    Inventors: David M. Yehaskel, Henrik M. Kjallbring
  • Publication number: 20120059833
    Abstract: An apparatus for preventing library information leakage includes an input processing unit to register, in an input-item list, library information items included in input design data, an input-item-list storing unit to store the input-item list, and an output processing unit to include, in output design data, library information items that match the library information items registered in the input-item list.
    Type: Application
    Filed: August 25, 2011
    Publication date: March 8, 2012
    Applicant: FUJITSU LIMITED
    Inventor: Kazuhiro Matsuzaki
  • Publication number: 20120054201
    Abstract: Systems and methods are disclosed for tracking an object as it traverses a sequential chain. The relationships between the object, its movement through space and time, and the entities associated with the object at a discrete point in time are captured by a sequential chain. A unique identifier may be created that is continuously modified as the object traverses the sequential chain. The unique identifier may be used to capture relationship information between the object and its related entities and movements.
    Type: Application
    Filed: August 25, 2011
    Publication date: March 1, 2012
    Applicant: SCR Technologies, Inc.
    Inventor: Randal B. Fischer
  • Publication number: 20120047147
    Abstract: In one embodiment, a user of a social networking system requests to search for a place near the user's current location. The social networking system generates a list of places near the user's current location, selects a sub-set from the list of places based on visibility and activity of the user and the user's social contacts for each place in the list, and returns the sub-set to the user.
    Type: Application
    Filed: August 18, 2010
    Publication date: February 23, 2012
    Inventors: Joshua Redstone, Benjamin J. Gertzfield, Eyal M. Sharon, Srinivasa P. Narayanan, Daniel Jeng-Ping Hui
  • Publication number: 20120047129
    Abstract: In one embodiment, a user of a social networking system requests to check in at a place near the user's current location. The social networking system generates a list of places near the user's current location, ranks the places in the list of places near the user's current location by a distance between each place and the user's current location, as well as activity of the user and the user's social contacts for each place, and returns the ranked list to the user.
    Type: Application
    Filed: August 18, 2010
    Publication date: February 23, 2012
    Inventors: Joshua Redstone, Eyal M. Sharon, Srinivasa P. Narayanan
  • Publication number: 20120041958
    Abstract: Prefixes are registered on a first list as index elements for respective registration patterns. Each prefix is selected as the longest of different-length prefixes that are extractable from a registration pattern in accordance with an extraction rule. Suffixes, which are the remaining parts of the registration patterns excluding the respective prefixes, are registered on a second list. Using different-length prefixes that are extracted from a retrieval key in accordance with the extraction rule, a prefix retriever searches the first list to retrieve a registration pattern whose prefix matches any of the prefixes of the retrieval key. A suffix checker carries out a check on the suffix of the registration pattern retrieved by the prefix retriever, among the suffixes on the second list, as to whether the suffix of the registration pattern matches the suffix of the retrieval key.
    Type: Application
    Filed: October 19, 2011
    Publication date: February 16, 2012
    Applicant: NEC CORPORATION
    Inventor: Akihiro MOTOKI
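    Illustration: the two-list scheme in the abstract above can be sketched with a dictionary keyed by prefix (the first list) and a dictionary of suffixes keyed by pattern (the second list). The extraction rule used here, prefixes of fixed lengths 1, 2 and 4, is a hypothetical stand-in, as are the names `PatternIndex`, `register` and `lookup`. The abstract does not say what the rule is, and exact suffix equality is assumed as the suffix check.

```python
PREFIX_LENGTHS = (1, 2, 4)  # hypothetical extraction rule, shortest to longest


def extract_prefixes(text):
    """Return every prefix of `text` allowed by the extraction rule."""
    return [text[:n] for n in PREFIX_LENGTHS if len(text) >= n]


class PatternIndex:
    def __init__(self):
        self.first_list = {}   # prefix -> ids of patterns indexed under it
        self.second_list = {}  # pattern id -> suffix (pattern minus prefix)

    def register(self, pattern_id, pattern):
        prefixes = extract_prefixes(pattern)
        if not prefixes:
            raise ValueError("pattern is too short for the extraction rule")
        prefix = prefixes[-1]  # index under the longest extractable prefix
        self.first_list.setdefault(prefix, []).append(pattern_id)
        self.second_list[pattern_id] = pattern[len(prefix):]

    def lookup(self, key):
        """Return ids of patterns whose prefix and suffix both match `key`."""
        matches = []
        for prefix in extract_prefixes(key):           # prefix retrieval
            for pattern_id in self.first_list.get(prefix, []):
                if self.second_list[pattern_id] == key[len(prefix):]:
                    matches.append(pattern_id)         # suffix check passed
        return matches


index = PatternIndex()
index.register("p1", "abcdef")
index.register("p2", "ab")
index.register("p3", "abcdxy")
found = index.lookup("abcdef")
```

    Here "p1" and "p3" share the prefix "abcd", so the prefix lookup returns both and only the suffix check separates them.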
  • Publication number: 20120036149
    Abstract: A method, article of manufacture, and system for enabling context surrounding a search result to be displayed succinctly. The method includes searching a document set configured as a frequency ordered suffix tree to obtain a frequency ordered context tree; applying dynamic programming to the frequency ordered context tree to retrieve a set (C) of context strings (c) having n1 elements; defining an area covered by a character string (s) in the entire set of context strings C {c1, . . . , cn1} as the product of (1) the number (n2) of context strings (c) having s as a prefix and (2) the length of character string (s); and obtaining a set of character strings (S) that maximizes the sum of areas. In addition, dynamic programming can include a pruning process such that if an upper limit does not reach a maximum value, the search in progress is abandoned.
    Type: Application
    Filed: July 19, 2011
    Publication date: February 9, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Yuta Tsuboi, Yuya Unno
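    Illustration: the "area" in the abstract above is the number of context strings that have s as a prefix, multiplied by the length of s. The sketch below computes that quantity directly and finds the single best prefix by brute force. It does not reproduce the frequency ordered suffix tree, the dynamic programming, or the pruning described in the abstract, and the sample strings are invented.

```python
def area(s, contexts):
    """Area covered by s: (number of contexts with prefix s) * len(s)."""
    n2 = sum(1 for c in contexts if c.startswith(s))
    return n2 * len(s)


def best_single_prefix(contexts):
    """Brute-force search for the one prefix with the largest area."""
    candidates = {c[:k] for c in contexts for k in range(1, len(c) + 1)}
    # Sorting makes tie-breaking deterministic.
    return max(sorted(candidates), key=lambda s: area(s, contexts))


contexts = ["the cat sat", "the cat ran", "the dog ran"]
shared_area = area("the ", contexts)   # 3 strings * 4 characters
best = best_single_prefix(contexts)
```

    In this sample, "the " covers all three strings for an area of 12, while "the cat " covers only two but is long enough to reach 16, which is the largest area of any single prefix.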
  • Publication number: 20110307503
    Abstract: Apparatus, systems, and methods for analyzing data are described. The data can be analyzed using a hierarchical structure. One such hierarchical structure can comprise a plurality of layers, where each layer performs an analysis on input data and provides an output based on the analysis. The output from lower layers in the hierarchical structure can be provided as inputs to higher layers. In this manner, lower layers can perform a lower level of analysis (e.g., more basic/fundamental analysis), while a higher layer can perform a higher level of analysis (e.g., more complex analysis) using the outputs from one or more lower layers. In an example, the hierarchical structure performs pattern recognition.
    Type: Application
    Filed: November 10, 2010
    Publication date: December 15, 2011
    Inventor: Paul Dlugosch
  • Publication number: 20110302157
    Abstract: One or more techniques and systems are disclosed for generating comparative patterns for use in identifying comparators. A set of comparator pairs is extracted from a first comparative pattern in a pattern database that comprises one or more comparative patterns. Questions are retrieved from a question collection using respective comparator pairs to generate comparative questions. Potential comparative patterns are generated from a combination of the comparator pairs and comparative questions, and the potential comparative patterns are evaluated by determining their reliability, in order to generate second comparative patterns for the pattern database.
    Type: Application
    Filed: June 8, 2010
    Publication date: December 8, 2011
    Applicant: Microsoft Corporation
    Inventors: Shasha Li, Chin-Yew Lin, Youngin Song
  • Publication number: 20110302204
    Abstract: A method and apparatus for text information management are provided. The method includes: generating a text expression corresponding to text information; searching for the text expression corresponding to the text information when the text information is to be used again, and obtaining the text information according to the text expression. Through the method and apparatus, a user can search for received text information simply, conveniently and rapidly without searching for text information from chatting records, and therefore lots of time can be saved.
    Type: Application
    Filed: August 12, 2011
    Publication date: December 8, 2011
    Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Cheng Luo
  • Publication number: 20110295869
    Abstract: An apparatus and a method for searching one or more documents for several different strings is described. A finite state machine receives and processes one or more search strings with a tail-first search. A matching string machine forms states based on the characters in the search string with at least one state accepting a match. The states are annotated with a pattern that indicates what the state has matched and can match. Each position within the pattern is either a character that has been matched at that point or an indicator that it is unknown.
    Type: Application
    Filed: May 31, 2010
    Publication date: December 1, 2011
    Applicant: Red Hat, Inc.
    Inventor: James Paul Schneider
  • Patent number: 8069304
    Abstract: A network device determines the presence of the pre-specified string in a message based on a sequence matching rule. A sequence represents non-contiguous portions of the message. A combination of content addressable memory, programmable processing units, and the programmable control unit may determine the presence of the pre-specified string in the message by comparing the non-contiguous portions of the message. Such an approach may reduce the computational resources required for searching the pre-specified string in the message.
    Type: Grant
    Filed: September 6, 2007
    Date of Patent: November 29, 2011
    Assignee: Intel Corporation
    Inventors: Murukanandam Kamalam Panchalingam, Nithish Mahalingam
  • Publication number: 20110289270
    Abstract: According to one aspect of the present disclosure a method and technique for managing data transfer is disclosed. The method includes comparing, by a processor unit of a data processing system, data to be written to a memory subsystem to a stored data pattern and, responsive to determining that the data matches the stored data pattern, replacing the matching data with a pattern tag corresponding to the matching data pattern. The method also includes transmitting the pattern tag to the memory subsystem.
    Type: Application
    Filed: May 24, 2010
    Publication date: November 24, 2011
    Inventors: Robert H. Bell, JR., Louis Bennie Capps, JR., Daniel M. Dreps, Luis A. Lastras-Montano, Michael Jay Shapiro
  • Publication number: 20110264691
    Abstract: Disclosed herein is an information processing apparatus including: a sensor information acquisition section configured to acquire sensor information outputted from a sensor for detecting a user motion and sensor information outputted from a sensor for obtaining a user current location; an action pattern detection block configured to analyze sensor information indicative of a user motion to detect an action pattern corresponding to the acquired sensor information from a plurality of action patterns obtained by classifying user's actions that are executed in a comparatively short time; a keyword conversion block configured to convert, on the basis of the sensor information indicative of a current location, the information into at least one keyword associated with the current location; and a text extraction block configured to extract a text for user presentation from a plurality of texts on the basis of the detected action pattern and the generated at least one keyword.
    Type: Application
    Filed: April 19, 2011
    Publication date: October 27, 2011
    Inventors: Takahito MIGITA, Katsuyoshi Kanemoto, Hiroyuki Masuda, Naoto Tsuboi
  • Publication number: 20110258210
    Abstract: An apparatus includes a data processing system for matching a first input string with a first regular expression set. The data processing system includes a memory storing a computer program and a processor configured to execute the computer program. The computer program includes instructions for mapping at least two related symbols of the first regular expression set to a unique symbol, generating a second input string by replacing each instance of the at least two related symbols in the first input string with the unique symbol, and operating a first state machine on the first input string and a second state machine on the second input string to determine whether the first input string matches the first regular expression set.
    Type: Application
    Filed: April 20, 2010
    Publication date: October 20, 2011
    Applicant: International Business Machines Corporation
    Inventors: Virat Agarwal, Davide Pasetto, Fabrizio Petrini
  • Publication number: 20110252024
    Abstract: A system, method, and computer program product are provided for identifying objects as being at least potentially unwanted based on strings of symbols identified therein. In use, strings of symbols are identified in a plurality of sequential lines of an object. Further, the object is conditionally identified as being at least potentially unwanted, based on the strings of symbols.
    Type: Application
    Filed: June 21, 2011
    Publication date: October 13, 2011
    Inventor: Aravind Dharmastala
  • Publication number: 20110252046
    Abstract: Embodiments of the present invention include a method and apparatus for encoding the signature string X into a first part B and a second part R with reference to a dictionary comprising a plurality of codes. The first part B identifies which, if any, characters of the signature string X are wildcard characters. The second part R is formed by, for each character in the signature string X that is not a wildcard character, retrieving a code from the dictionary based on the character and its position within the signature string X, the dictionary holding a different code for each such character-position pairing, and combining the retrieved codes according to a predetermined logical operation (e.g. XOR) to form the second part R.
    Type: Application
    Filed: December 16, 2008
    Publication date: October 13, 2011
    Inventors: Geza Szabo, István Gódor, Szabolcs Malomsoky, Sándor Györi
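    Illustration: the two-part encoding in the abstract above can be sketched with a bitmask for B and an XOR accumulator for R. The alphabet, the maximum length of 16, the `?` wildcard symbol, and the seeded random code table are all assumptions made for this example. The abstract only requires that each character-position pairing has a different code and names XOR as one possible combining operation.

```python
import random
import string

ALPHABET = string.ascii_lowercase + string.digits  # assumed symbol set
MAX_LEN = 16                                       # assumed maximum length
WILDCARD = "?"                                     # assumed wildcard symbol


def build_dictionary(seed=0):
    """Assign a distinct code to every (character, position) pairing."""
    rng = random.Random(seed)
    # sample() draws without replacement, so all codes are different.
    codes = rng.sample(range(1, 2**31), len(ALPHABET) * MAX_LEN)
    pairs = [(ch, pos) for pos in range(MAX_LEN) for ch in ALPHABET]
    return dict(zip(pairs, codes))


def encode(signature, dictionary):
    """Encode a signature as (B, R).

    B has bit i set when position i holds a wildcard.
    R is the XOR of the codes of all non-wildcard (character, position) pairs.
    """
    b = 0
    r = 0
    for pos, ch in enumerate(signature):
        if ch == WILDCARD:
            b |= 1 << pos
        else:
            r ^= dictionary[(ch, pos)]
    return b, r


codes = build_dictionary()
b_part, r_part = encode("ab?d", codes)
```

    Because the code depends on position as well as character, "ab?d" and "ba?d" generally produce different R values even though they contain the same characters.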
  • Publication number: 20110246523
    Abstract: A method of examining a data stream to detect presence of a complex string belonging to a complex dictionary is provided. The method includes associating an array of state variables and an array of reference states with the complex dictionary; detecting a simple string in the data stream, the simple string being a constituent string in the complex string in the complex dictionary; updating a state variable associated with the complex string according to all relative positions of the simple string within the complex string; and determining that the complex string is present in the data stream when the state variable attains a corresponding reference state. A corresponding system is also provided.
    Type: Application
    Filed: June 7, 2011
    Publication date: October 6, 2011
    Inventor: Kevin Gerard Boyce
  • Publication number: 20110239001
    Abstract: A method of scanning secure data in a data store is performed in a manner that does not expose the scan data, the files being searched, or information about when matches occur between the scan data and the files. During the scan process, encrypted versions of searched files are compared to encrypted versions of match strings, and any resulting match data is encrypted before being written into a results file. In addition, to disguise when match entries are written, during the scan one or more encrypted dummy items are written into the results file.
    Type: Application
    Filed: March 25, 2010
    Publication date: September 29, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Robert John McCormack
  • Publication number: 20110231427
    Abstract: Upon receiving an input of an input character a1 by a user, a game apparatus stores the received input character a1 as an unfixed character, and displays the stored unfixed character a1. The game apparatus obtains option character strings a2 corresponding to the input character a1 from an option character string database, and receives an operation which the user performs for selecting an option character string a2 from the obtained option character strings a2. Upon receiving the operation for selecting an option character string a2, the game apparatus determines the selected option character string a2 to be a fixed character string, and stores the determined fixed character string, and then outputs the stored fixed character string. Here, only when receiving the operation for selecting an option character string, the game apparatus determines a fixed character string.
    Type: Application
    Filed: May 24, 2010
    Publication date: September 22, 2011
    Applicant: NINTENDO CO., LTD.
    Inventor: Keigo Nakano
  • Publication number: 20110225193
    Abstract: A method for retrieving data in a data source is provided. The method includes receiving a search term; identifying an active tag associated with the search term; correlating the active tag to dynamic data that is operative to adapt to a mining context in which data is stored; and retrieving the data using the dynamic data.
    Type: Application
    Filed: March 9, 2010
    Publication date: September 15, 2011
    Inventors: Cullen F. Jennings, Joseph Brian Burton, Thomas M. Wesselman, Shantanu Sarkar
  • Publication number: 20110208723
    Abstract: Systems and methods of the present invention provide word splitting and a reliability score for an entered character string. A list of keywords may be extracted from the character string entered into a user interface on a client. These keywords may be compared to potential matches in a dictionary database and a reliability score for word splits and keyword strings may be compiled and displayed to the user. The client may also display the reliability score using a plurality of logical groupings within a reliability score process.
    Type: Application
    Filed: September 30, 2010
    Publication date: August 25, 2011
    Applicant: THE GO DADDY GROUP, INC.
    Inventors: Paul Nicks, Gregory Glick, Doug Schmucker, Jeff Belina, Patrick Lutwitze
  • Publication number: 20110208758
    Abstract: In one embodiment, a title is selected for content to be published online. A method implemented in a data processing system includes receiving a plurality of text strings. A plurality of rules are applied to the text strings. If a condition specified in one of the rules exists in a given text string, one or more attributes are associated with that text string as metadata. One or more of the text strings are selected, using the metadata, as a potential title for the content. A final title is prepared based on the potential title, and the content is published online under the final title.
    Type: Application
    Filed: June 30, 2010
    Publication date: August 25, 2011
    Applicant: DEMAND MEDIA, INC.
    Inventors: David M. Yehaskel, Henrik M. Kjallbring
  • Publication number: 20110184965
    Abstract: Resources may be managed in a topology for audio/video streaming. The topology includes audio/video sources and sinks and intervening branch devices. Messages between these sources, sinks, and branch devices may be used for resource management.
    Type: Application
    Filed: May 28, 2010
    Publication date: July 28, 2011
    Inventor: Srikanth Kambhatla
  • Publication number: 20110179052
    Abstract: A pattern identification apparatus identifies a pattern that exists in input data using registration data including data of a reference pattern or a feature amount thereof and identification parameters defining processing details for comparing the input data with the registration data, the apparatus holding the registration data and the identification parameters in association with a label. The apparatus acquires registration data, identification parameters, and a label generated by an external apparatus, and registers the acquired registration data and identification parameters in association with the acquired label.
    Type: Application
    Filed: January 6, 2011
    Publication date: July 21, 2011
    Applicant: CANON KABUSHIKI KAISHA
    Inventor: Hiroshi Sato
  • Publication number: 20110161357
    Abstract: A recording medium stores an information processing program that causes a computer having a storage unit storing therein a file group in which character code strings are described, to execute generating combined identification information by dividing into two portions, at least one among identification information for a preceding character code and identification information for a succeeding character code and respectively combining the portions with the identification information that is not divided, the preceding and succeeding character codes constituting a character code string for two-consecutive grams in a file among the file group; storing to the storage unit, various consecutive-gram divided maps obtained by allocating to each type of combined identification information generated, a string of bits corresponding to the quantity of files in the file group; and updating in the consecutive-gram divided maps, a bit indicating whether the character code string for the two-consecutive grams is present in the
    Type: Application
    Filed: December 20, 2010
    Publication date: June 30, 2011
    Applicant: FUJITSU LIMITED
    Inventors: Masahiro KATAOKA, Keishiro Tanaka
  • Publication number: 20110153663
    Abstract: Systems and methods to provide a recommendation engine that uses implicit feedback observations are provided. A particular method includes receiving accessing data comprising a plurality of implicit feedback observations for a plurality of users. The plurality of users includes a first user that requested a recommendation. Each implicit feedback observation is associated with a particular user and a particular item of a plurality of items. The method includes determining a plurality of preference ratings and a plurality of confidence ratings for each user of the plurality of users for each item based on the plurality of implicit feedback observations. The method includes generating a recommendation list of one or more of the plurality of items for the first user based on the plurality of preference ratings and the plurality of confidence ratings.
    Type: Application
    Filed: December 21, 2009
    Publication date: June 23, 2011
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Yehuda Koren, Yifan Hu, Christopher T. Volinsky
  • Publication number: 20110153628
    Abstract: A method and system for managing a media advertising enterprise including process and workflow capabilities for enterprise data matching. An EDM (Enterprise Data Management) module can be configured to include a set of rules at an enterprise level to manage disparate and disconnected records associated with an entity. A number of unmatched and enterprise entities that matches with respect to an active entity can be returned based on a fuzzy logic. A matching process can then be performed to accurately match the active entity and the unmatched entities with respect to a parent enterprise entity. The unmatched entity can be put on hold if additional information is required for performing a right match after assigning the parent enterprise entity. A note can also be added in order to place the unmatched entity on hold. Such an optimization mechanism can interactively manage and report records at the enterprise level in a simple and efficient manner.
    Type: Application
    Filed: December 21, 2009
    Publication date: June 23, 2011
    Inventors: Kohinoor Basu, Angel Barnachea Chua, Matthew M. Ferry, Scott Arthur Roberts
  • Publication number: 20110137925
    Abstract: Systems and methods for determining delivery hierarchical information for an Ad unit. In an example method performed at a server, location information is received from a user computer system via a network. The location information is associated with an Ad unit presented on a webpage accessed by the user computer system. The server determines if any previously stored signature information matches with at least a portion of the received location information, then extracts domain information for any portions of the received location information that matches with any previously stored signature information. Then, at least partial delivery hierarchical information is generated based on the extracted domain information. The process repeats for other location information associated with the Ad unit that is received at the server. The generated delivery hierarchical information is aggregated. Then a report is generated based on the aggregated delivery hierarchical information.
    Type: Application
    Filed: December 8, 2010
    Publication date: June 9, 2011
    Applicant: MPIRE CORPORATION
    Inventors: James Baird, Nick Redmond, Gregory Harrison, Brian Gebala, John Kawamoto
  • Publication number: 20110137942
    Abstract: A query pattern handler may be configured to determine at least one query pattern to be matched against a stream of events, and may be configured to determine a plurality of run-time patterns representing active instances of the at least one query pattern which are currently available for matching, and which each include a plurality of states. An event scheduler may be configured to receive an event of the stream of events, the event associated with a current event set of the stream of events. A run-time pattern scheduler may be configured to determine a ranked set of the run-time patterns based on a priority metric which characterizes, for each run-time pattern, an advancement of each run-time pattern from a current state thereof when matched against the current event set. A pattern match evaluator may be configured to evaluate each run-time pattern of the ranked set, in turn, against the current event set.
    Type: Application
    Filed: January 29, 2010
    Publication date: June 9, 2011
    Applicant: SAP AG
    Inventors: Ying Yan, Jin Zhang, Ming-Chien Shan
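The ranking step in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a run-time pattern is an ordered list of expected event types and that the priority metric is the number of consecutive states the current event set would advance; the abstract does not define the metric, and `RunTimePattern`, `advancement`, and `rank` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RunTimePattern:
    """Hypothetical run-time pattern: an ordered list of expected event types."""
    name: str
    states: list
    current: int = 0  # index of the next state still to be matched

    def advancement(self, event_set):
        # Assumed priority metric: how many consecutive states, starting at
        # the current one, the given event set would satisfy.
        steps = 0
        for expected in self.states[self.current:]:
            if expected not in event_set:
                break
            steps += 1
        return steps


def rank(patterns, event_set):
    # Highest advancement first; ties keep their original order because
    # Python's sort is stable, including with reverse=True.
    return sorted(patterns, key=lambda p: p.advancement(event_set), reverse=True)


patterns = [
    RunTimePattern("p1", ["A", "B", "C"]),
    RunTimePattern("p2", ["B", "C"]),
    RunTimePattern("p3", ["X"]),
]
ranked = rank(patterns, {"B", "C"})
```

A pattern match evaluator would then walk `ranked` in order, so the patterns that stand to progress furthest are evaluated first.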
  • Publication number: 20110137912
    Abstract: The invention provides a system and method for retrieving documents from a collection of documents that match a word search query. A word index is generated for each document in which each entry is an enriched-term string built from the stemmed form of the word to be searched and a separator character followed by the original form of the word to be searched. During a retrieving operation, a search query is processed depending on the original form or the stemmed form of a word to be searched. Cross-document tables are addressed to find documents that match the enriched-term string of the word to be searched.
    Type: Application
    Filed: October 5, 2010
    Publication date: June 9, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Roberto Ragusa, Ciro Ragusa, Roberto Guarda
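The enriched-term index described in the abstract above might look roughly like the following sketch. The separator character, the toy suffix-stripping stemmer, and the names `naive_stem`, `build_index`, and `search` are assumptions made for illustration; the abstract does not specify any of them, and a single dictionary stands in for the cross-document tables.

```python
from collections import defaultdict

SEP = "\x1f"  # assumed separator; any character that cannot occur in a word works


def naive_stem(word):
    # Toy stemmer used only for this sketch; a real system would use a
    # proper stemming algorithm.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    # Maps each enriched term "stem<SEP>original" to the ids of the
    # documents that contain it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[naive_stem(word) + SEP + word].add(doc_id)
    return index


def search(index, word, exact):
    word = word.lower()
    stem = naive_stem(word)
    if exact:
        # Original-form search: the whole enriched term must match.
        return set(index.get(stem + SEP + word, set()))
    # Stemmed search: every enriched term that starts with "stem<SEP>" matches.
    prefix = stem + SEP
    hits = set()
    for term, doc_ids in index.items():
        if term.startswith(prefix):
            hits |= doc_ids
    return hits


docs = {1: "walking dogs", 2: "walked home", 3: "walk"}
index = build_index(docs)
```

Storing the stem first means one index serves both query modes: an exact query looks up the full enriched term, while a stemmed query is a prefix lookup on the stem plus the separator. A sorted structure would make that prefix lookup efficient; the linear scan here only keeps the sketch short.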
  • Publication number: 20110137896
    Abstract: Provided is an information processing apparatus including an input portion, a metadata acquiring portion, a data forming portion, and a predictive converting portion. The input portion receives selection of content from a user. The metadata acquiring portion acquires metadata including a word indicative of information concerning the content whose selection was received by the input portion. The data forming portion extracts the word from the acquired metadata and forms predictive conversion data for each of the words. The predictive converting portion carries out predictive conversion of a word with respect to input data from the user using the formed predictive conversion data.
    Type: Application
    Filed: November 15, 2010
    Publication date: June 9, 2011
    Applicant: Sony Corporation
    Inventors: Shinya Masunaga, Tomoaki Takemura
  • Publication number: 20110119284
    Abstract: Provided are, among other things, systems, methods and techniques for generating a representative data string. In one representative implementation: (a) starting data positions are identified within input strings of data values; (b) a subsequence of output data values is determined based on the data values at data positions determined with reference to the starting data positions within the input strings; (c) an identification is made as to which of the input strings have segments that match the subsequence of output data values, based on a matching criterion; (d) steps (a)-(c) are repeated for a number of iterations; and (e) the subsequences of output data values are combined across the iterations to provide an output data string, with the determination in step (b) for a current iteration being based on the identification in step (c) for a previous iteration.
    Type: Application
    Filed: January 18, 2008
    Publication date: May 19, 2011
    Inventors: Krishnamurthy Viswanathan, Ram Swaminathan
  • Publication number: 20110119304
    Abstract: A method for detecting and locating an occurrence in a data stream of any complex string belonging to a predefined complex dictionary is disclosed. A complex string may comprise an arbitrary number of interleaving coherent strings and ambiguous strings. The method comprises a first process for transforming the complex dictionary into a simple structure to enable continuously conducting computationally efficient search, and a second process for examining received data in real time using the simple structure. The method may be implemented as an article of manufacture comprising at least one processor-readable medium and instructions carried on the at least one medium. The instructions cause a processor to match examined data to an object complex string belonging to the complex dictionary, where the matching process is based on equality to constituent coherent strings, and congruence to ambiguous strings, of the object complex string.
    Type: Application
    Filed: January 21, 2011
    Publication date: May 19, 2011
    Inventor: Kevin Gerard BOYCE
  • Patent number: 7945572
    Abstract: The present invention provides systems and methods for automatically mining massive intelligence databases to discover sequential patterns therein using a novel combination of forward and reverse temporal processing techniques as an enhancement to well known pattern discovery algorithms.
    Type: Grant
    Filed: March 21, 2008
    Date of Patent: May 17, 2011
    Assignee: The Johns Hopkins University
    Inventors: Brett D. Lapin, David W. Porter
  • Publication number: 20110093496
    Abstract: Previously configured state machines may accept an input string and, for each of the regular expression(s), check for a match between the input string accepted and the given regular expression using the configured nodes of the state machine corresponding to the given regular expression. Checking for a match using the configured nodes of the state machine may include (1) checking detection events from a simple string detector, (2) submitting queries to identified modules of a variable string detector, and (3) receiving detection events from the identified modules of the variable string detector.
    Type: Application
    Filed: October 18, 2010
    Publication date: April 21, 2011
    Inventors: Masanori BANDO, Nabi Sertac Artan, Hung-Hsiang Jonathan Chao
  • Publication number: 20110082885
    Abstract: A method is provided for matching several pairs out of a plurality of persons. The method includes receiving from the persons personal opinions on other persons; calculating from the personal opinions, based on at least a collaborative filtering algorithm, a first estimated opinion of a first person on a second person and a second estimated opinion of the second person on the first person; and matching the first person and the second person in accordance with the estimated opinions. The method may include connecting the first person and the second person for a predetermined time duration. The first and second estimated opinions may be, respectively, the estimated probability that the first person wants to be matched to the second person, and the estimated probability that the second person wants to be matched to the first person. The matching of the first person and the second person is done in accordance with the result of the multiplication of the probabilities.
    Type: Application
    Filed: October 7, 2009
    Publication date: April 7, 2011
    Inventor: Saar WILF
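The final step of the abstract above, matching in accordance with the product of the two estimated probabilities, can be sketched as follows. The collaborative filtering that would produce the estimates is not shown; the probabilities are taken as given input. The greedy highest-score-first pairing and the name `match_pairs` are assumptions of this sketch, not something the abstract states.

```python
from itertools import combinations


def match_pairs(prob):
    """Pair people by mutual score.

    prob[a][b] is the estimated probability that a wants to be matched
    with b (assumed to come from an upstream estimator, not shown here).
    """
    people = sorted(prob)
    # Mutual score for a pair is the product of the two one-sided estimates.
    scored = sorted(
        ((prob[a][b] * prob[b][a], a, b) for a, b in combinations(people, 2)),
        reverse=True,
    )
    matched, pairs = set(), []
    for score, a, b in scored:
        # Greedy choice (an assumption): take the best remaining pair whose
        # members are both still unmatched.
        if a not in matched and b not in matched:
            pairs.append((a, b, score))
            matched.update((a, b))
    return pairs


prob = {
    "ann": {"bob": 0.9, "cy": 0.2, "di": 0.5},
    "bob": {"ann": 0.8, "cy": 0.6, "di": 0.1},
    "cy": {"ann": 0.3, "bob": 0.7, "di": 0.9},
    "di": {"ann": 0.4, "bob": 0.2, "cy": 0.5},
}
pairs = match_pairs(prob)
```

Multiplying the two estimates rewards pairs where interest is mutual: a pair scoring 0.9 one way and 0.1 the other ranks below a pair scoring 0.5 both ways.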
  • Publication number: 20110078167
    Abstract: Methods, apparatus, and systems to determine a niche market of items or services, the first phase of which identifies a gap between demand and supply for a set of items. Session logs may be evaluated to compare transactions involving a specific item to those of a larger group of items. The resultant information identifies areas of high demand, but with low availability. The niche market information may be provided as direct merchandising items for sellers. In one example, the method generates niche market item web pages in specific categories. Additional methods, apparatus, and systems are disclosed.
    Type: Application
    Filed: September 28, 2009
    Publication date: March 31, 2011
    Inventors: Neelakantan Sundaresan, Yongzheng Zhang, Catherine Baudin, Dan Shen, Shen Huang
  • Publication number: 20110078179
    Abstract: An electronic dictionary including: a dictionary storage unit to store dictionary information in which headwords and explanation information are correlated, respectively; an input unit; a CPU to retrieve a headword corresponding to the search string inputted by the input unit from the dictionary storage unit, display the headword in a headword list, read, from the dictionary storage unit, explanation information of a headword selected from the list, and display the explanation information; and a keyword registration unit to register an inputted keyword correlated with the headword whose explanation information is displayed, wherein the CPU retrieves a headword correlated with a keyword corresponding to the search string from the keyword registration unit when the search string is inputted by the input unit in a state where the keyword correlated with the headword is registered in the keyword registration unit, and adds the headword to the headword list to display the headword.
    Type: Application
    Filed: September 28, 2010
    Publication date: March 31, 2011
    Applicant: CASIO COMPUTER CO., LTD.
    Inventor: Shunsuke Unno
  • Publication number: 20110078176
    Abstract: An apparatus includes a string-accepting section that accepts a string; a first retrieval section that retrieves a first characteristic to be used for image search from a database storing sensitivity words and nouns in association with characteristics using a combination of a sensitivity word and a noun extracted from the string; and a search section that searches for an image using the first characteristic.
    Type: Application
    Filed: September 23, 2010
    Publication date: March 31, 2011
    Applicant: SEIKO EPSON CORPORATION
    Inventor: Ikuo Hayaishi
  • Publication number: 20110078665
    Abstract: A system that facilitates computing a symbolic bound with respect to a procedure that is executable by a processor on a computing device is described herein. The system includes a transition system generator component that receives the procedure and computes a disjunctive transition system for a control location in the procedure. A compute bound component computes a bound for the transition system, wherein the bound is expressed in terms of inputs to the transition system. The system further includes a translator component that translates the bound computed by the compute bound component such that the bound is expressed in terms of inputs to the procedure.
    Type: Application
    Filed: September 29, 2009
    Publication date: March 31, 2011
    Applicant: Microsoft Corporation
    Inventors: Sumit Gulwani, Florian Frantz Zuleger, Sudeep Dilip Juvekar
  • Publication number: 20110078153
    Abstract: Prefixes are registered on a first list as index elements for respective registration patterns. Each prefix is selected as the longest of different-length prefixes that are extractable from a registration pattern in accordance with an extraction rule. Suffixes, which are the remaining parts of the registration patterns excluding the respective prefixes, are registered on a second list. Using different-length prefixes that are extracted from a retrieval key in accordance with the extraction rule, a prefix retriever searches the first list to retrieve a registration pattern whose prefix matches any of the prefixes of the retrieval key. A suffix checker carries out a check on the suffix of the registration pattern retrieved by the prefix retriever, among the suffixes on the second list, as to whether the suffix of the registration pattern matches the suffix of the retrieval key.
    Type: Application
    Filed: December 10, 2010
    Publication date: March 31, 2011
    Applicant: NEC CORPORATION
    Inventor: Akihiro MOTOKI
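The two-list scheme in the abstract above can be sketched as below. The extraction rule (prefixes of fixed lengths 2, 4, and 8), the exact-equality suffix check, and the names `PatternTable`, `register`, and `lookup` are hypothetical choices for this illustration; the abstract leaves the rule and the matching criterion open.

```python
PREFIX_LENGTHS = (2, 4, 8)  # assumed extraction rule: fixed prefix lengths


def extract_prefixes(s):
    # All different-length prefixes that can be extracted from s, shortest first.
    return [s[:n] for n in PREFIX_LENGTHS if len(s) >= n]


class PatternTable:
    def __init__(self):
        self.first = {}   # first list: prefix -> ids of entries on the second list
        self.second = []  # second list: entry id -> suffix

    def register(self, pattern):
        prefixes = extract_prefixes(pattern)
        if not prefixes:
            raise ValueError("pattern is shorter than the shortest prefix length")
        prefix = prefixes[-1]  # index the pattern under its longest prefix
        self.second.append(pattern[len(prefix):])
        self.first.setdefault(prefix, []).append(len(self.second) - 1)

    def lookup(self, key):
        # Prefix retrieval: try every prefix extractable from the key.
        for prefix in extract_prefixes(key):
            for entry in self.first.get(prefix, ()):
                # Suffix check: the remainder of the key must equal the
                # stored suffix (exact match is an assumption of this sketch).
                if self.second[entry] == key[len(prefix):]:
                    return prefix + self.second[entry]
        return None


table = PatternTable()
table.register("abcdef")  # stored as prefix "abcd", suffix "ef"
table.register("xyz")     # stored as prefix "xy", suffix "z"
```

Indexing each pattern under only its longest extractable prefix keeps the first list small, while the retrieval side compensates by probing with every prefix length of the key.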
  • Publication number: 20110078191
    Abstract: A method and an apparatus for training a handwritten document categorizer are disclosed. For each category in a set into which handwritten documents are to be categorized, discriminative words are identified from the OCR output of a training set of typed documents labeled by category. A group of keywords is established including some of the discriminative words identified for each category. Samples of each of the keywords in the group are synthesized using a plurality of different type fonts. A keyword model is then generated for each keyword, parameters of the model being estimated, at least initially, based on features extracted from the synthesized samples. Keyword statistics for each of a set of scanned handwritten documents labeled by category are generated by applying the generated keyword models to word images extracted from the scanned handwritten documents. The categorizer is trained with the keyword statistics and respective handwritten document labels.
    Type: Application
    Filed: September 28, 2009
    Publication date: March 31, 2011
    Applicant: Xerox Corporation
    Inventors: Francois RAGNET, Florent C. Perronnin, Thierry Lehoux