Abstract: A card-handling device may include a card intake, a card output, and a card imaging device positioned between the card intake and the card output. The card imaging device may be configured to identify a non-conforming card. The card-handling device may be configured to store the non-conforming card in a designated location and/or to reorient the non-conforming card with a card-flipping apparatus configured to reorient flipped cards identified as non-conforming cards.
Abstract: Systems, methods, and computer programming products for estimating the ingestion time of ingested files to be transformed into a searchable state for content mining by an on-premises computing environment or cloud environment, including multi-tenant cloud environments. Ingested files being indexed are analyzed for divisibility. Ingestion time varies based on the number of divisible elements (such as lines) of data within the ingested file and the amount of data per divisible element. A converter divides files into a plurality of elements treated as independent data and calculates the estimated ingestion time based on the number of divisions and the file size of each divisible element. The estimated ingestion time is stored in internal fields corresponding to each divisible element in the index for the search data. During content mining, an internal condition is added to received search queries, displaying only search results whose estimated ingestion time is earlier than the current time.
Type:
Grant
Filed:
November 15, 2021
Date of Patent:
November 5, 2024
Assignee:
International Business Machines Corporation
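A minimal sketch of the estimate the abstract describes: a file is split into divisible elements (lines, here), and the estimated ingestion time grows with both the element count and the data volume per element. The cost constants are hypothetical placeholders for illustration, not values from the patent.

```python
BASE_COST_S = 0.5        # fixed per-file overhead (assumed)
COST_PER_LINE_S = 0.01   # indexing cost per divisible element (assumed)
COST_PER_BYTE_S = 1e-6   # transformation cost per byte of element data (assumed)

def estimate_ingestion_seconds(content: str) -> float:
    # Divide the file into elements treated as independent data (here: lines),
    # then accumulate a per-element cost plus a per-byte cost for each element.
    per_element = sum(
        COST_PER_LINE_S + COST_PER_BYTE_S * len(line.encode("utf-8"))
        for line in content.splitlines()
    )
    return BASE_COST_S + per_element

doc = "id,name\n1,alpha\n2,beta\n"
print(round(estimate_ingestion_seconds(doc), 4))
```

The resulting value would be written to the element's internal field in the search index, so that a query-time condition can hide results whose estimated ingestion time has not yet passed.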
Abstract: Cameras having storage fixtures within their fields of view are programmed to capture images and process clips of the images to generate sets of features representing product spaces and actors depicted within such images, and to classify the clips as depicting or not depicting a shopping event. Where consecutive clips are determined to depict a shopping event, features of such clips are combined into a sequence and transferred, along with classifications of the clips and a start time and end time of the shopping event, to a multi-camera system. A shopping hypothesis is generated based on such sequences of features received from cameras, along with information regarding items detected within the hands of such actors, to determine a summary of shopping activity by an actor, and to update a record of items associated with the actor accordingly.
Type:
Grant
Filed:
June 29, 2022
Date of Patent:
October 29, 2024
Assignee:
Amazon Technologies, Inc.
Inventors:
Chris Broaddus, Jayakrishnan Kumar Eledath, Tian Lan, Hui Liang, Gerard Guy Medioni, Chuhang Zou
Abstract: A method of configuring network elements in a design network topology includes receiving an image of the design network topology; attempting to retrieve design data from the received image corresponding to the design network topology; when the design data is retrieved, querying a topologies database using the design data to find a previously determined network topology that substantially matches the design network topology; when the design data is not retrieved, predicting a network topology using an unsupervised machine learning algorithm; identifying configurations for network elements in the matching network topology or in the predicted network topology in a configurations database; determining design configurations for the network elements of the design network topology from the identified configurations; translating the design configurations of the network elements to a standard format; and pushing the translated design configurations to actual network elements and/or virtual network elements corresponding to the design network topology.
Abstract: A method of x-ray projection geometry calibration in x-ray cone beam computed tomography, including: at least one step (S1) of obtaining two-dimensional x-ray images or a sinogram of at least a part of an object, generated through relatively rotating around the object a detector and an x-ray source projecting x-rays towards the detector; further including: at least one step (S4) of detecting in the two-dimensional x-ray images or the sinogram at least one feature of the object by using a trained artificial intelligence algorithm; and at least one step of creating, based on the detection, calibration information which defines the geometry of the x-ray projection.
Abstract: A training data acquirer acquires training data including article image data, image-filter-related data indicating a combination of a plurality of image filters used for image processing of the article image data and a value of a parameter for each of the plurality of image filters, and optical character recognition (OCR) score data indicating a score of character recognition output through OCR when image processing is performed on the article image data using the image filters based on the image-filter-related data. A trained model generator generates a trained model indicating a relationship between the article image data, the image-filter-related data, and the OCR score data through machine learning using the training data.
Abstract: A computer device extracts an image feature of an image that includes one or more characters to be recognized. The image feature includes a plurality of image feature vectors. The device uses an attention mechanism to compute and output attention weight values corresponding to the target number of characters, based on the image feature vectors, through parallel computing. Each of the attention weight values corresponds to one or more respective characters and represents an importance of the plurality of image feature vectors for the respective characters. The device obtains at least one character according to the plurality of image feature vectors and the target number of attention weight values. Therefore, in a character recognition process, with recognition based on the foregoing attention mechanism, a character in any shape can be effectively recognized by using a simple procedure, thereby avoiding a cyclic operation process and greatly improving operation efficiency.
Type:
Grant
Filed:
September 15, 2021
Date of Patent:
September 17, 2024
Assignee:
TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
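A toy sketch of the parallel attention step the abstract contrasts with cyclic (recurrent) decoding: one attention-weight vector per output character position is computed in a single matrix operation over the image feature vectors. The query matrix is random here for illustration; in the described system it would be learned, and all sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
num_features, dim, num_chars = 6, 4, 3           # toy sizes (assumed)

features = rng.normal(size=(num_features, dim))  # image feature vectors
queries = rng.normal(size=(num_chars, dim))      # one query per character slot

# Score every feature vector against every character slot at once,
# then softmax each row into attention weights -- no per-character loop.
scores = queries @ features.T                    # (num_chars, num_features)
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

# Each row weights the feature vectors' importance for one character position.
glimpses = weights @ features                    # (num_chars, dim)
print(weights.shape, glimpses.shape)
```

A downstream classifier (not shown) would map each glimpse to a character, which is how the method avoids a cyclic operation process.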
Abstract: The present invention provides a deep learning object detection method that locates the distant region of an image in real time and concentrates on distant objects from a front dash cam perspective, addressing a common problem in advanced driver assistance system (ADAS) applications: that the detectable range of the system is not far enough.
Type:
Grant
Filed:
March 16, 2022
Date of Patent:
September 17, 2024
Assignee:
National Yang Ming Chiao Tung University
Abstract: In accordance with implementations of the present disclosure, there is provided a solution for portrait editing and synthesis. In this solution, a first image about a head of a user is obtained. A three-dimensional head model representing the head of the user is generated based on the first image. In response to receiving a command of changing a head feature of the user, the three-dimensional head model is transformed to reflect the changed head feature. A second image about the head of the user is generated based on the transformed three-dimensional head model, and reflects the changed head feature of the user. In this way, the solution can realize editing of features like a head pose and/or a facial expression based on a single portrait image without manual intervention and automatically synthesize a corresponding image.
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for computer aided diagnosis of a medical image. One of the methods includes processing the medical image through a machine learning (ML) model to provide a first feature representation of the medical image, wherein the ML model includes an input layer and a pooling layer, wherein the first feature representation is an output from the pooling layer, generating, by the ML model, a sequence of second feature representations of the medical image from the first feature representation of the medical image, wherein each second feature representation in the sequence of the second feature representations has a lower dimension than the first feature representation, and generating, by the ML model, an output as a last second feature representation in the sequence of the second feature representations.
Abstract: To easily and correctly determine the substantial identity of a plurality of images in each of which an object is placed, provided is an image judgement method including steps of obtaining first object data from a first image using an R-CNN, which is a first machine learning model, where the first object data indicates an attribute and a layout of an object in the first image; obtaining second object data from a second image using the R-CNN, where the second object data indicates an attribute and a layout of an object in the second image; and determining the substantial identity of the first image and the second image based on the first object data and the second object data using a CNN, which is a second machine learning model.
Abstract: A method of processing input data for a given layer of a neural network using a data processing system comprising compute resources for performing convolutional computations is described. The input data comprises a given set of input feature maps, IFMs, and a given set of filters. The method comprises generating a set of part-IFMs including pluralities of part-IFMs which correspond to respective IFMs, of the given set of IFMs. The method further includes grouping part-IFMs in the set of part-IFMs into a set of selections of part-IFMs. The method further includes convolving, by respective compute resources of the data processing system, the set of selections with the given set of filters to compute a set of part-output feature maps. A data processing system for processing input data for a given layer of a neural network is also described.
Type:
Grant
Filed:
December 23, 2020
Date of Patent:
August 13, 2024
Assignee:
Arm Limited
Inventors:
John Wakefield Brothers, III, Kartikeya Bhardwaj, Alexander Eugene Chalfin, Danny Daysang Loh
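Because convolution is linear in the input channels, splitting the input feature maps (IFMs) channel-wise into part-IFMs, convolving each selection with the matching filter slice on a separate compute resource, and summing the part-output feature maps reproduces the full output. The sketch below checks that equivalence on toy data; the shapes and the two-way channel split are illustrative assumptions, not the patent's scheme.

```python
import numpy as np

def conv2d_valid(ifms, filt):
    """Naive multi-channel valid cross-correlation -> one output feature map."""
    c, h, w = ifms.shape
    _, kh, kw = filt.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(ifms[:, i:i + kh, j:j + kw] * filt)
    return out

rng = np.random.default_rng(1)
ifms = rng.normal(size=(4, 6, 6))   # 4 input feature maps
filt = rng.normal(size=(4, 3, 3))   # one filter spanning all 4 channels

full = conv2d_valid(ifms, filt)

# Group channels into two selections of part-IFMs, convolve independently
# (as if on separate compute resources), then sum the part-OFMs.
part = conv2d_valid(ifms[:2], filt[:2]) + conv2d_valid(ifms[2:], filt[2:])

print(np.allclose(full, part))
```

In hardware, each selection's convolution can run on a different compute resource, which is the parallelism the abstract describes.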
Abstract: An object recognition method, an object recognition system, and a readable storage medium are provided. The object recognition method is to recognize an object from a first image, and includes: acquiring a first image; calculating depth information of the first image; performing superpixel segmentation on the first image, to obtain a superpixel image; generating three-dimensional image data of the first image according to the depth information and the image data of the superpixel image; and inputting the three-dimensional image data into a deep neural network for object recognition, to obtain a recognition result.
Abstract: A method of tracking an object includes obtaining a likelihood of a free behavior model of the object and a likelihood of a constant speed model of the object using the position, the speed, and the type of the object determined at a previous time point and the position, the speed, and the type of the object determined at the current time point, and correcting the type of the object at the current time point using the likelihood of the free behavior model, the likelihood of the constant speed model, and the measured type of the object.
Abstract: An image processing apparatus includes one or more memories storing instructions, and one or more processors executing the instructions to obtain document data before printing, to set an inspection area that is an inspection target for an image represented by the obtained document data, to update, in a case when the document data is modified, the set inspection area based on the modified document data, and to generate, based on the modified document data, a reference image for inspection of a printed material.
Abstract: An image recognition method and apparatus. The method comprises: obtaining original image data, convolutional neural network configuration parameters, and convolutional neural network operation parameters from a data transfer bus, the original image data comprising M pieces of pixel data, and M being a positive integer (101); and performing a convolutional neural network operation on the original image data by a convolutional neural network operation module according to the convolutional neural network configuration parameters and the convolutional neural network operation parameters (102), wherein the convolutional neural network operation module comprises a convolution operation unit, a batch processing operation unit, and an activation operation unit connected in sequence. The method improves the real-time performance of image recognition.
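A minimal sketch of the three operation units chained in sequence — convolution, batch processing (normalization), then activation. The kernel, the toy image, and the use of ReLU as the activation are illustrative assumptions.

```python
import numpy as np

def conv_unit(img, k):
    # Single-channel valid cross-correlation.
    h, w = img.shape
    kh, kw = k.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

def batch_unit(x, eps=1e-5):
    # Batch-processing step: normalize the feature map.
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def activation_unit(x):
    # Activation step (ReLU assumed).
    return np.maximum(x, 0.0)

pixels = np.arange(25, dtype=float).reshape(5, 5)  # toy "original image data"
kernel = np.array([[0.0, 1.0], [1.0, 0.0]])        # toy configuration parameter

result = activation_unit(batch_unit(conv_unit(pixels, kernel)))
print(result.shape)
```

The three functions mirror the units "connected in sequence" in the abstract; a hardware module would pipeline them rather than call them as Python functions.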
Abstract: A firewall system stores filtering criteria which include rules for blocking presentation of all or a portion of the media content based at least in part on an identity of an individual appearing in the media content. The firewall system receives the media content. The firewall system determines the identity of the individual appearing in the media content. Based at least in part on the identity of the individual appearing in the media content and the filtering criteria, the firewall system determines an action for securing the media content. The action may be allowing presentation of the media content, blocking presentation of the media content, or blocking presentation of a portion of the media content. The determined action is automatically implemented.
Abstract: The disclosure relates to a method for flagging at least an event of interest in an unlabeled time series of a parameter relative to a wellsite (including the well, the formation, or a wellsite equipment), wherein the time series of the parameter is a signal of the parameter as a function of time. The disclosure also relates to a method for evaluating a downhole operation such as a pressure test using a pressure time series. Such methods comprise collecting a time series, extracting at least an unlabeled subsequence of predetermined duration from the time series, and assigning a label for an event of interest, in particular one representative of the status of the downhole operation, to at least one of the unlabeled subsequences. A command may be sent to a wellsite operating system based on the assigned label.
Abstract: Described is a method that involves operating an unmanned aerial vehicle (UAV) to begin a flight, where the UAV relies on a navigation system to navigate to a destination. During the flight, the method involves operating a camera to capture images of the UAV's environment, and analyzing the images to detect features in the environment. The method also involves establishing a correlation between features detected in different images, and using location information from the navigation system to localize a feature detected in different images. Further, the method involves generating a flight log that includes the localized feature. Also, the method involves detecting a failure involving the navigation system, and responsively operating the camera to capture a post-failure image. The method also involves identifying one or more features in the post-failure image, and determining a location of the UAV based on a relationship between an identified feature and a localized feature.
Abstract: Methods and systems for automated construction of an anonymized facial recognition library are disclosed. A camera of a client device may capture a first plurality of images of faces of members of a panel of viewers of media content presented on a content presentation device collocated with the client device during viewing sessions. A first machine learning (ML) model may be applied to the first plurality to generate a second plurality of feature vectors, each associated with a different one of the images. One or more clusters of feature vectors of the second plurality may be computationally determined within a vector space of the feature vectors. A respective centroid feature vector may be determined for each respective cluster, and assigned a unique ID. A respective association between each cluster ID and a respective name ID may be determined based on panel-member information received at the client device.
Abstract: The present disclosure generally relates to methods and user interfaces for managing visual content at a computer system. In some embodiments, methods and user interfaces for managing visual content in media are described. In some embodiments, methods and user interfaces for managing visual indicators for visual content in media are described. In some embodiments, methods and user interfaces for inserting visual content in media are described. In some embodiments, methods and user interfaces for identifying visual content in media are described. In some embodiments, methods and user interfaces for translating visual content in media are described. In some embodiments, methods and user interfaces for managing user interface objects for visual content in media are described.
Type:
Grant
Filed:
March 22, 2023
Date of Patent:
June 4, 2024
Assignee:
Apple Inc.
Inventors:
Grant R. Paul, Kellie L Albert, Nathan De Vries, James N. Jones
Abstract: A messaging system performs neural network hair rendering for images provided by users of the messaging system. A method of neural network hair rendering includes processing a three-dimensional (3D) model of fake hair and a first real hair image depicting a first person to generate a fake hair structure, and encoding, using a fake hair encoder neural subnetwork, the fake hair structure to generate a coded fake hair structure. The method further includes processing, using a cross-domain structure embedding neural subnetwork, the coded fake hair structure to generate a fake and real hair structure, and encoding, using an appearance encoder neural subnetwork, a second real hair image depicting a second person having a second head to generate an appearance map. The method further includes processing, using a real appearance renderer neural subnetwork, the appearance map and the fake and real hair structure to generate a synthesized real image.
Abstract: Embodiments of the present disclosure relate to a method, an electronic device, and a computer program product for training a data classification model. The method includes generating a first training rule based on probabilities of classifying a plurality of sample data into corresponding classes by a data classification model. The method also includes generating a second training rule based on relevances of the plurality of sample data to the corresponding classes. In addition, the method also includes training the data classification model using the first training rule and the second training rule. With this method, a data classification model is trained, so that the data classification accuracy of the data classification model and the robustness to noise can be improved.
Abstract: A monitoring system is configured to monitor a property. The monitoring system includes a camera, a sensor, and a monitor control unit. The monitor control unit is configured to receive image data and sensor data. The monitor control unit is configured to determine that the image data includes a representation of a person. The monitor control unit is configured to determine an orientation of a representation of a head of the person. The monitor control unit is configured to determine that the representation of the head of the person likely includes a representation of a face of the person. The monitor control unit is configured to determine that the face of the person is likely concealed. The monitor control unit is configured to determine a malicious intent score that reflects a likelihood that the person has a malicious intent. The monitor control unit is configured to perform an action.
Type:
Grant
Filed:
March 18, 2021
Date of Patent:
May 7, 2024
Assignee:
Alarm.com Incorporated
Inventors:
Donald Madden, Achyut Boggaram, Gang Qian, Daniel Todd Kerzner
Abstract: A method and related system operations include obtaining a video stream with an image sensor of a camera device, detecting a plurality of target objects by executing a neural network model based on the video stream with a vision processor unit of the camera device. The method also includes generating a plurality of bounding boxes, determining a plurality of character sequences by, for each respective bounding box of the plurality of bounding boxes, performing a set of optical character recognition (OCR) operations to determine a respective character sequence of the plurality of character sequences. The method also includes updating a plurality of tracklets to indicate the plurality of bounding boxes and storing the plurality of tracklets in association with the plurality of character sequences in a memory of the camera device.
Type:
Grant
Filed:
February 13, 2023
Date of Patent:
May 7, 2024
Assignee:
Verkada Inc.
Inventors:
Mayank Gupta, Suraj Arun Vathsa, Song Cao, Yi Xu, Yuanyuan Chen, Yunchao Gong
Abstract: A method can include receiving (1) images of at least one subject and (2) at least one total mass value for the at least one subject. The method can further include executing a first machine learning model to identify joints of the at least one subject. The method can further include executing a second machine learning model to determine limbs of the at least one subject based on the joints and the images. The method can further include generating three-dimensional (3D) representations of a skeleton based on the joints and the limbs. The method can further include determining a torque value for each limb, based on at least one of a mass value and a linear acceleration value, or a torque inertia and an angular acceleration value. The method can further include generating a risk assessment report based on at least one torque value being above a predetermined threshold.
Type:
Grant
Filed:
April 6, 2022
Date of Patent:
April 16, 2024
Assignees:
UNIVERSITY OF IOWA RESEARCH FOUNDATION, INSEER, INC.
Inventors:
Alec Diaz-Arias, Mitchell Messmore, Dmitry Shin, John Rachid, Stephen Baek, Jean Robillard
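A sketch of the per-limb torque check: torque is estimated either from a mass value and a linear acceleration value acting at the limb's moment arm, or from a rotational inertia and an angular acceleration value, and a limb is flagged when its torque exceeds a threshold. All numbers, limb names, and the threshold are hypothetical.

```python
def torque_from_linear(mass_kg: float, accel_ms2: float, moment_arm_m: float) -> float:
    # tau = m * a * r (force from linear acceleration, applied at the moment arm)
    return mass_kg * accel_ms2 * moment_arm_m

def torque_from_angular(inertia_kgm2: float, angular_accel_rads2: float) -> float:
    # tau = I * alpha (rotational form)
    return inertia_kgm2 * angular_accel_rads2

THRESHOLD_NM = 40.0   # assumed risk threshold in newton-meters

limbs = {
    "forearm":   torque_from_linear(1.5, 9.0, 0.25),   # 3.375 N*m
    "upper_arm": torque_from_angular(0.6, 80.0),       # 48.0 N*m
}

# Flag each limb whose torque is above the threshold, as input to the report.
report = {name: torque > THRESHOLD_NM for name, torque in limbs.items()}
print(report)
```

A risk assessment report would then be generated for any limb flagged `True`.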
Abstract: A feature mapping computer system configured to (i) receive a localized image including a photo depicting a driving environment and location data associated with the photo, (ii) identify, using an image recognition module, a roadway feature depicted in the photo, (iii) generate, using a photogrammetry module, a point cloud based upon the photo and the location data, wherein the point cloud comprises a set of data points representing the driving environment in a three dimensional (“3D”) space, (iv) localize the point cloud by assigning a location to the point cloud based upon the location data, and (v) generate an enhanced base map that includes a roadway feature.
Type:
Grant
Filed:
November 23, 2022
Date of Patent:
April 9, 2024
Assignee:
STATE FARM MUTUAL AUTOMOBILE INSURANCE COMPANY
Inventors:
Jeremy Carnahan, Michael Stine McGraw, John Andrew Schirano
Abstract: At least a main area and a candidate area are provided on a display screen of a display device. Included are a display control unit that causes content displayed in the candidate area to be displayed in the main area when reproduction of the content displayed in the main area ends, and a detection unit that detects a person from a captured image of at least a place from which the display screen is visually recognizable, wherein the display control unit changes the content displayed in the candidate area depending on the person during reproduction of the content displayed in the main area.
Abstract: In one implementation, a method of defining a negative space in a three-dimensional scene model is performed at a device including a processor and non-transitory memory. The method includes obtaining a three-dimensional scene model of a physical environment including a plurality of points, wherein each of the plurality of points is associated with a set of coordinates in a three-dimensional space. The method includes defining a subspace in the three-dimensional space with less than a threshold number of the plurality of points. The method includes determining a semantic label for the subspace. The method includes generating a characterization vector of the subspace, wherein the characterization vector includes the spatial extent of the subspace and the semantic label.
Abstract: Embodiments of the present application relate to the technical field of computer information and provide a method and an apparatus for aligning elements in a document, an electronic device and a storage medium. The method includes: obtaining elements contained in a document; assigning the obtained elements into groups; obtaining an inter-group alignment manner between the groups, and obtaining an intra-group alignment manner for elements in each of the groups; and aligning all the groups based on the inter-group alignment manner, and aligning elements in each of the groups based on the intra-group alignment manner. By means of the solution for aligning elements in a document provided by the embodiments of the present application, the efficiency for aligning the elements in the document can be improved.
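A minimal sketch of the two-level alignment the abstract describes: elements are assigned to groups, each group's elements are aligned by an intra-group rule (left edges, here), and the groups are then aligned against each other by an inter-group rule (group top edges, here). The rules, coordinates, and grouping are illustrative assumptions.

```python
def align_left(elements):
    # Intra-group rule: snap every element to the group's leftmost edge.
    left = min(e["x"] for e in elements)
    for e in elements:
        e["x"] = left

def align_group_tops(groups):
    # Inter-group rule: shift each group so its top edge matches the first group's.
    ref_top = min(e["y"] for e in groups[0])
    for group in groups[1:]:
        shift = ref_top - min(e["y"] for e in group)
        for e in group:
            e["y"] += shift

groups = [
    [{"x": 10, "y": 5}, {"x": 13, "y": 7}],   # group 1
    [{"x": 52, "y": 6}, {"x": 50, "y": 4}],   # group 2
]
for g in groups:
    align_left(g)          # intra-group alignment manner
align_group_tops(groups)   # inter-group alignment manner
print(groups)
```

Other alignment manners (center, right, baseline) would slot into the same two-pass structure.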
Abstract: Exemplary embodiments relate to a method for selecting an image of interest to construct a retrieval database including receiving an image captured by an imaging device, detecting an object of interest in the received image, selecting an image of interest based on at least one of complexity of the image in which the object of interest is detected and image quality of the object of interest, and storing information related to the image of interest in the retrieval database, and an image control system performing the same.
Type:
Grant
Filed:
February 21, 2020
Date of Patent:
March 5, 2024
Assignee:
KOREA INSTITUTE OF SCIENCE AND TECHNOLOGY
Inventors:
Ig Jae Kim, Heeseung Choi, Haksub Kim, Seungho Chae, Yoonsik Yang
Abstract: Provided is a mouth shape synthesis device and a method using an artificial neural network. To this end, there may be provided: an original video encoder that encodes original video data, which is a target of the mouth shape synthesis, as a video including a face of a synthesis target, and outputs an original video embedding vector; an audio encoder that encodes audio data that is a basis for the mouth shape synthesis and outputs an audio embedding vector; and a synthesized video decoder that uses the original video embedding vector and the audio embedding vector as input data and outputs synthesized video data in which a mouth shape corresponding to the audio data is synthesized on the synthesis target face.
Type:
Grant
Filed:
September 8, 2021
Date of Patent:
March 5, 2024
Assignee:
LIONROCKET INC.
Inventors:
Seung Hwan Jeong, Hyung Jun Moon, Jun Hyung Park
Abstract: Methods, systems, and computer programs are presented for adding new features to a network service. A method includes receiving an image depicting an object of interest. A category set is determined for the object of interest and an image signature is generated for the image. Using the category set and the image signature, the method identifies a set of publications within a publication database and assigns a rank to each publication. The method causes presentation of the ranked list of publications at a computing device from which the image was received.
Abstract: A three-dimensional face model generation method is provided. The method includes: obtaining an inputted three-dimensional face mesh of a target object; aligning the three-dimensional face mesh with a first three-dimensional face model of a standard object according to face keypoints; performing fitting on the three-dimensional face mesh and a local area of the first three-dimensional face model, to obtain a second three-dimensional face model after local fitting; and performing fitting on the three-dimensional face mesh and a global area of the second three-dimensional face model, to obtain a three-dimensional face model of the target object after global fitting.
Type:
Grant
Filed:
October 22, 2021
Date of Patent:
February 13, 2024
Assignee:
TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Abstract: A question and answer (Q&A) system is enhanced to support natural language queries into any document format regardless of where the underlying documents are stored. The Q&A system may be implemented “as-a-service,” e.g., a network-accessible information retrieval platform. Preferably, the techniques herein enable a user to quickly and reliably locate a document, page, chart, or data point that he or she is looking for across many different datasets. This provides for a unified view of all of the user's (or, more generally, an enterprise's) information assets (such as Adobe® PDFs, Microsoft® Word documents, Microsoft Excel spreadsheets, Microsoft PowerPoint presentations, Google Docs, scanned materials, etc.), and to be able to deeply search all of these sources for the right document, page, sheet, chart, or even answer to a question.
Type:
Grant
Filed:
November 22, 2021
Date of Patent:
February 13, 2024
Assignee:
Searchable AI Corp
Inventors:
Aaron Sisto, Nick Martin, Brian Shin, Hung Nguyen
Abstract: The present disclosure provides a three-dimensional element layout visualization method and apparatus. The three-dimensional element layout visualization method includes: determining a bounding box of each three-dimensional element in a visualization interface; projecting the bounding box of each three-dimensional element in a target projection direction to obtain edge elements of the bounding box of each three-dimensional element in the visualization interface; and determining, according to positions of the edge elements in the visualization interface, at least one edge element set satisfying a first preset condition, and displaying a collinearity identification corresponding to the edge element set in the visualization interface.
Abstract: An inspection apparatus comprises an obtaining unit configured to obtain first captured images of printed pages, a first setting unit configured to set an inspection region within a selected captured image selected from the first captured images of the respective pages obtained by the obtaining unit, and a second setting unit configured to apply the setting performed on the selected captured image to the first captured image of a page decided based on a result of image analysis on the first captured images of the respective pages obtained by the obtaining unit.
Abstract: A system and method for utilizing a multi-sensor video surveillance system for traffic and compliance reporting. Utilizing closed-circuit televisions (CCTVs), cameras, and other sensors, the system can generate heat maps and monitor activity in the scene. Based on the heat maps and object detection, the system can monitor traffic patterns for vehicles and monitor people for personal protective equipment (PPE) compliance.
Type:
Grant
Filed:
March 1, 2022
Date of Patent:
January 23, 2024
Inventors:
James Allan Douglas Cameron, Matthew Aaron Rogers Carle, Jonathan Taylor Millar
Abstract: Techniques are described for automated microinjection of substances, such as genetic material, into single cells in tissue samples. An example system comprises a robotic manipulator apparatus configured to hold and position a micropipette. Furthermore, the system comprises a microscope camera positioned to observe an injection site. A computing device receives image data from the microscope camera, where the image data represents an image of a tissue sample. The computing device receives, via a user interface, an indication of a trajectory line traced by a user on the image of the tissue sample. In response, the computing device controls the robotic manipulator apparatus to move a tip of the micropipette along a path defined by the trajectory line. A pressure controller injects a gas into the micropipette to eject a substance out of the micropipette at one or more points along the path defined by the trajectory line.
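Turning a user-traced line into discrete injection points might look like the following. This is an assumed sketch for illustration only: the traced polyline is resampled into evenly spaced waypoints by arc length, with an injection notionally triggered at each waypoint.

```python
import math

def resample_path(points, step):
    """Walk a polyline of (x, y) points and emit waypoints every `step`
    units of arc length, starting from the first point."""
    out = [points[0]]
    carry = 0.0  # arc length traveled since the last emitted waypoint
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0:
            continue  # skip zero-length segments
        t = step - carry
        while t <= seg:
            f = t / seg
            out.append((x0 + f * (x1 - x0), y0 + f * (y1 - y0)))
            t += step
        carry = (carry + seg) % step
    return out

path = resample_path([(0, 0), (10, 0)], step=2.5)
```

Each waypoint would then be handed to the manipulator controller, with the pressure pulse fired on arrival.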
Type:
Grant
Filed:
September 6, 2018
Date of Patent:
January 9, 2024
Assignee:
Regents of the University of Minnesota
Inventors:
Suhasa Kodandaramaiah, Elena Taverna, Gabriella Shull, Wieland Huttner
Abstract: The present disclosure describes a method, an apparatus, and a non-transitory computer-readable medium for detecting sensitive text information such as privacy-related text information from a signal and modifying the signal by removing the detected sensitive text information therefrom. The apparatus receives the signal such as an image, a video clip, or an audio clip, and recognizes a text string therefrom. The apparatus then detects, from the text string, a substring based on a similarity between the substring and a regular expression, and modifies the signal by removing information related to the detected substring from the signal.
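The detect-and-remove flow can be sketched with ordinary regular expressions. The patterns below (a US-style SSN and an email address) are illustrative assumptions; the patent's similarity-based matching between substrings and regular expressions may be more elaborate than a direct `sub`.

```python
import re

# Hypothetical sensitive-text patterns, for illustration only.
PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # e.g. a US-style SSN
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # e.g. an email address
]

def redact(text, mask="[REDACTED]"):
    """Replace any substring matching a sensitive pattern with a mask."""
    for pat in PATTERNS:
        text = pat.sub(mask, text)
    return text

clean = redact("Contact jane@example.com, SSN 123-45-6789.")
```

For image or audio signals, the recognized text string would first be produced by OCR or speech recognition, and the matched regions then blurred or muted rather than string-replaced.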
Abstract: Video object and keypoint location detection techniques are presented. The system includes a detection system for generating locations of an object's keypoints along with probabilities associated with the locations, and a stability system for stabilizing keypoint locations of the detected objects. In some aspects, the generated probabilities are two-dimensional arrays corresponding to locations within input images, and the stability system fits the generated probabilities to a two-dimensional probability distribution function.
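One common way to reduce a 2D probability array to a stable keypoint location is a probability-weighted mean of the grid coordinates (a soft-argmax-style estimate). This is a generic sketch, not necessarily the distribution fit the patent claims.

```python
def soft_argmax(prob):
    """Estimate a keypoint location as the probability-weighted mean of
    grid coordinates: a simple fit for a unimodal probability map."""
    total = wx = wy = 0.0
    for r, row in enumerate(prob):
        for c, p in enumerate(row):
            total += p
            wy += p * r
            wx += p * c
    return (wx / total, wy / total)

heat = [[0.0, 0.1, 0.0],
        [0.1, 0.6, 0.1],
        [0.0, 0.1, 0.0]]
x, y = soft_argmax(heat)
```

Compared with taking the arg-max cell, the weighted mean varies smoothly between frames, which is one reason it helps stabilize keypoints in video.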
Abstract: In various examples, locations of directional landmarks, such as vertical landmarks, may be identified using 3D reconstruction. A set of observations of directional landmarks (e.g., images captured from a moving vehicle) may be reduced to 1D lookups by rectifying the observations to align directional landmarks along a particular direction of the observations. Object detection may be applied, and corresponding 1D lookups may be generated to represent the presence of a detected vertical landmark in an image.
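After rectification aligns the landmarks vertically, a detection's presence can be collapsed to a single horizontal coordinate. A minimal sketch of that 1D-lookup step, under the assumption that detections arrive as horizontal pixel spans:

```python
def to_1d_lookup(boxes, width):
    """boxes: list of (x_min, x_max) horizontal spans of detected vertical
    landmarks in a rectified image. Returns a width-length 0/1 presence
    array: the 1D lookup."""
    lookup = [0] * width
    for x_min, x_max in boxes:
        for x in range(max(0, x_min), min(width, x_max + 1)):
            lookup[x] = 1
    return lookup

lookup = to_1d_lookup([(2, 4), (8, 8)], width=10)
```

Matching such 1D lookups across observations is far cheaper than matching full 2D detections, which is the point of the reduction.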
Type:
Grant
Filed:
April 12, 2021
Date of Patent:
December 12, 2023
Assignee:
NVIDIA Corporation
Inventors:
Philippe Bouttefroy, David Nister, Ibrahim Eden
Abstract: A method and device for small sample defect classification and computing equipment are disclosed. The method comprises: separating a target to be tested into parts, and segmenting an original image of the target to be tested into at least two sub-images containing different parts according to the separated parts; establishing small sample classification models with respect to each sub-image and the original image respectively, and obtaining a classification result of each sub-image and a classification result of the original image by using corresponding classification models, wherein the classification result includes a defect category and a corresponding category probability; and determining and outputting a defect category of the target to be tested according to classification results of all the sub-images and the original image.
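The final determination step — combining the per-sub-image and whole-image results — could be as simple as keeping the defect category with the highest reported probability. This is an assumed combination rule for illustration; the patent may specify a different aggregation.

```python
def combine_results(results):
    """results: list of (category, probability) pairs, one per classifier
    (each sub-image model plus the whole-image model). Returns the
    (category, probability) whose best probability is highest overall."""
    best = {}
    for category, prob in results:
        best[category] = max(best.get(category, 0.0), prob)
    return max(best.items(), key=lambda kv: kv[1])

results = [("scratch", 0.62),   # sub-image 1 model
           ("dent", 0.40),      # sub-image 2 model
           ("scratch", 0.81),   # whole-image model
           ("ok", 0.30)]
category, prob = combine_results(results)
```

Weighted voting or probability averaging per category would be natural variants of the same idea.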
Abstract: A method and a system of hybrid data-and-model-driven hierarchical network reconfiguration are provided. Taking into account that the optimal operation structure of the power grid may change after new energy sources are connected to the power grid in a distributed manner, this method combines the mathematical model of network reconfiguration and clustering with the deep learning model, and uses data recorded by smart meters, to reconfigure the network in a hybrid data-and-model driving mode. This method proposes a network compression method and a hierarchical decompression method based on deep learning, and improves the efficiency of network reconfiguration through distributed calculation and a combination of on-line and off-line calculation.
Abstract: A human behavior recognition method, a device, and a storage medium are provided, which are related to the field of artificial intelligence, specifically to computer vision and deep learning technologies, and applicable to smart city scenarios. The method includes: obtaining attribute information of a target object and N pieces of candidate behavior-related information of a target human from a target image, wherein N is an integer greater than or equal to 1; determining target behavior-related information based on comparison results between the N pieces of candidate behavior-related information and the attribute information of the target object; and determining a behavior recognition result of the target human based on the target behavior-related information.
Abstract: A hardware accelerator for an object detection network and a method for detecting an object are provided. The present disclosure provides robust object detection that advantageously augments traditional deterministic bounding box predictions with spatial uncertainties for various computer vision applications, such as, for example, autonomous driving, robotic surgery, etc.
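Augmenting a deterministic box with spatial uncertainty typically means predicting a variance alongside each coordinate. A minimal sketch, with illustrative numbers, of expanding a mean box by k standard deviations per coordinate to obtain a conservative outer bound:

```python
import math

def box_interval(mean_box, var_box, k=2.0):
    """mean_box / var_box: (x1, y1, x2, y2) coordinate means and variances.
    Returns a conservative outer box: each edge pushed outward by
    k standard deviations."""
    sig = [math.sqrt(v) for v in var_box]
    x1, y1, x2, y2 = mean_box
    return (x1 - k * sig[0], y1 - k * sig[1],
            x2 + k * sig[2], y2 + k * sig[3])

outer = box_interval((10, 10, 50, 50), (4, 4, 1, 1))
```

A downstream planner (for example, in autonomous driving) can then treat the outer box as the region to avoid, rather than the mean prediction alone.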
Type:
Grant
Filed:
February 19, 2021
Date of Patent:
November 21, 2023
Assignee:
Arm Limited
Inventors:
Partha Prasun Maji, Tiago Manuel Lourenco Azevedo
Abstract: Disclosed are a multiple object detection method and apparatus. The multiple object detection apparatus includes a feature map extraction unit for extracting a plurality of multi-scale feature maps based on an input image, and a feature map fusion unit for generating a multi-scale fusion feature map including context information by fusing adjacent multi-scale feature maps among the plurality of multi-scale feature maps generated by the feature map extraction unit.
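Fusing adjacent scales usually means bringing the coarser map up to the finer map's resolution and combining element-wise. A hypothetical sketch using nearest-neighbour 2x upsampling and summation (real detectors would use learned upsampling and channel-wise tensors):

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D feature map (list of lists)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in (0, 1)]  # repeat each value twice
        out.append(wide)
        out.append(list(wide))                   # repeat each row twice
    return out

def fuse(fine, coarse):
    """Fuse a fine map with the 2x-upsampled coarse map by summation,
    injecting coarse-scale context into the fine-scale map."""
    up = upsample2x(coarse)
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(fine, up)]

fine = [[1, 1, 1, 1] for _ in range(4)]  # 4x4 fine-scale map
coarse = [[10, 20], [30, 40]]            # 2x2 adjacent coarser map
fused = fuse(fine, coarse)
```

Repeating this pairwise across the pyramid yields the multi-scale fusion feature map the abstract describes.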
Type:
Grant
Filed:
July 12, 2022
Date of Patent:
November 14, 2023
Assignee:
CHUNG ANG UNIVERSITY INDUSTRY ACADEMIC COOPERATION
Inventors:
Joon Ki Paik, Sang Woo Park, Dong Geun Kim, Dong Goo Kang
Abstract: Approximate modeling of next combined result for stopping text-field recognition in a video stream. In an embodiment, text-recognition results are generated from frames in a video stream and combined into an accumulated text-recognition result. A distance between the accumulated text-recognition result and a next accumulated text-recognition result is estimated based on an approximate model of the next accumulated text-recognition result, and a determination is made of whether or not to stop processing based on this estimated distance. After processing is stopped, the final accumulated text-recognition result may be output.
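The stopping idea can be illustrated with a toy combination scheme. This sketch is not the patented model: it combines equal-length per-frame OCR strings by per-position majority vote, and estimates how much one more frame could still change the accumulated result by counting positions where the leading character's margin is small.

```python
from collections import Counter

def accumulate(frames):
    """Per-position majority vote across equal-length recognized strings."""
    return "".join(Counter(chars).most_common(1)[0][0]
                   for chars in zip(*frames))

def estimated_change(frames):
    """Fraction of positions where the leader's margin over the runner-up
    is <= 1 vote, i.e. positions one more frame could still flip. Serves
    as a crude estimate of the distance to the next accumulated result."""
    fragile = 0
    for chars in zip(*frames):
        counts = Counter(chars).most_common(2)
        margin = counts[0][1] - (counts[1][1] if len(counts) > 1 else 0)
        if margin <= 1:
            fragile += 1
    return fragile / len(frames[0])

frames = ["A8CD", "ABCD", "ABCO", "ABCD", "ABCD"]
result = accumulate(frames)
change = estimated_change(frames)  # 0.0 here: every position is settled
```

When the estimated change falls below a threshold, processing stops and the accumulated result is emitted, avoiding needless work on further frames.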
Type:
Grant
Filed:
February 19, 2021
Date of Patent:
November 14, 2023
Assignee:
Smart Engines Service, LLC
Inventors:
Konstantin Bulatovich Bulatov, Vladimir Viktorovich Arlazarov
Abstract: An image processing apparatus includes a memory and processing circuitry. The memory sequentially stores a plurality of read images read by a reading device. The processing circuitry transfers pairs of read images from the memory to an external apparatus in ascending order of priority in bookbinding to leave pairs of read images in the memory in descending order of priority in bookbinding. The processing circuitry further instructs an image forming device to execute image formation with pairs of read images in descending order of priority in bookbinding.
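The priority ordering can be made concrete with saddle-stitch imposition, where page i is bound with page N+1-i. This sketch assumes, purely for illustration, that inner sheet pairs carry higher bookbinding priority, so the outermost (lowest-priority) pairs are offloaded first and the innermost pairs stay in memory.

```python
def saddle_pairs(num_pages):
    """Return page pairs (1-indexed) from the outermost sheet inward:
    page i is paired with page num_pages + 1 - i."""
    assert num_pages % 2 == 0, "saddle stitch needs an even page count"
    return [(i, num_pages + 1 - i) for i in range(1, num_pages // 2 + 1)]

def transfer_order(pairs, keep):
    """Pairs to transfer to the external apparatus, lowest priority first,
    so the `keep` highest-priority (innermost) pairs remain in memory."""
    return pairs[: len(pairs) - keep]

pairs = saddle_pairs(8)                  # [(1, 8), (2, 7), (3, 6), (4, 5)]
offload = transfer_order(pairs, keep=2)  # leaves (3, 6) and (4, 5) resident
```

The apparatus would then form images for the resident high-priority pairs first, retrieving offloaded pairs as memory frees up.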