Abstract: A voice interaction method and apparatus, a terminal, a server, and a readable storage medium are provided. The method includes the following steps: obtaining a user's control object and control intention according to the user's voice; determining whether the control object hits user-selectable notification information pre-stored in a server; and, in a voice interaction scenario, performing control corresponding to the control intention on the control object if the control object hits the user-selectable notification information, wherein the user-selectable notification information is notification information which is pre-stored in the server and satisfies a preset rule with respect to the user.
Abstract: The present disclosure provides a multi-target tracking method and apparatus, a device and a storage medium, wherein the method comprises: obtaining a to-be-processed current image, inputting the current image into a convolutional neural network model obtained by pre-training, and obtaining a target detection result; respectively extracting feature vectors of each detected target from a pre-selected convolutional layer; respectively calculating a similarity between the feature vectors of each target in the current image and the feature vectors of each target in previous images, completing association of the same target across different image frames according to the calculation results, and allocating a tracking serial number. The solution of the present disclosure meets real-time processing requirements.
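The association step in the abstract above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes cosine similarity between feature vectors, a greedy matching rule, and a fixed similarity threshold, none of which are specified in the abstract.

```python
import math

def cosine_similarity(a, b):
    # Similarity between two feature vectors; 0.0 for degenerate vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def associate(prev_tracks, detections, threshold=0.5):
    """prev_tracks: {serial_number: feature}; detections: feature vectors of
    targets detected in the current image. Each detection inherits the serial
    number of the most similar unmatched previous target, or a new number."""
    next_id = max(prev_tracks, default=0) + 1
    used = set()
    assigned = []
    for feat in detections:
        best_id, best_sim = None, threshold
        for tid, tfeat in prev_tracks.items():
            if tid in used:
                continue
            sim = cosine_similarity(feat, tfeat)
            if sim > best_sim:
                best_id, best_sim = tid, sim
        if best_id is None:
            # Unmatched detection: allocate a fresh tracking serial number.
            best_id = next_id
            next_id += 1
        used.add(best_id)
        assigned.append(best_id)
    return assigned
```

A production tracker would typically replace the greedy loop with Hungarian assignment, but the data flow (features in, serial numbers out) is the same.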
Abstract: Embodiments of the present disclosure disclose a method and apparatus for generating information. A specific embodiment of the method comprises: acquiring associated information of a road to be evaluated for driving difficulty; generating traffic environment information of the road based on the associated information; and inputting the traffic environment information of the road into a driving difficulty evaluation model to obtain a driving difficulty level of the road for an autonomous driving vehicle. Thus, based on the driving difficulty level of each road, the autonomous driving vehicle may select a road having a low driving difficulty level to drive on.
Abstract: The present disclosure discloses a method and apparatus for analyzing a medical image. A specific embodiment of the method includes: acquiring medical image data; generating multi-scale decision sample data based on the medical image data; and inputting the multi-scale decision sample data into a deep neural network model to obtain auxiliary diagnosis data of the medical image, the deep neural network model being trained according to a consistency principle between multi-scale training sample data and an output result of the deep neural network model. In this embodiment, multi-scale training samples are used to accelerate the training of the deep neural network model, thus accelerating the auxiliary diagnosis decision process, while the accuracy of the trained model is improved according to the consistency principle between data at different scales and the output results, thereby improving the accuracy of the auxiliary diagnosis decision.
Abstract: A method and apparatus for unlocking a terminal screen are provided. A specific embodiment of the method includes: determining that the current screen state of the terminal is a state awaiting unlocking; acquiring illumination information of the environment of the terminal over a predetermined period of time, the illumination information comprising an illumination intensity and a duration of the illumination intensity; judging whether the illumination information meets a predetermined condition; and switching the current screen state of the terminal to a successfully unlocked state in response to determining that the illumination information meets the predetermined condition. The embodiment achieves high-precision unlocking of a terminal screen without manual operations on the screen.
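The judging-and-switching steps above can be sketched as a simple state check. The thresholds and the "any sample qualifies" rule below are assumptions for illustration; the abstract only says the illumination intensity and its duration must meet a predetermined condition.

```python
def meets_condition(illumination, min_intensity=200.0, min_duration=1.5):
    """illumination: (intensity, duration) samples recorded during the
    predetermined period. True if any sample meets both thresholds."""
    return any(intensity >= min_intensity and duration >= min_duration
               for intensity, duration in illumination)

def unlock(state, illumination):
    # Only a screen awaiting unlocking can transition to the unlocked state.
    if state == "awaiting_unlock" and meets_condition(illumination):
        return "unlocked"
    return state
```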
Abstract: Embodiments of the present disclosure disclose a method and apparatus for identifying information. One embodiment of the method includes: collecting to-be-processed audio in real time; performing voice recognition on the to-be-processed audio; and performing data processing on the to-be-processed audio when a wake-up word is recognized in the audio, the wake-up word being used for instructing that data processing be performed on the to-be-processed audio. The embodiment can identify keywords in the to-be-processed audio obtained in real time and then perform data processing on that audio, which improves completeness in obtaining the to-be-processed audio and accuracy in processing it.
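A minimal sketch of the wake-word gating described above, assuming the audio stream has already been passed through voice recognition to yield text segments; the function name and the default wake word are illustrative, not from the disclosure.

```python
def process_stream(recognized_segments, wake_word="hello"):
    """recognized_segments: texts recognized from audio collected in real
    time. Data processing is triggered only once the wake-up word appears;
    the segments from that point on are returned for processing."""
    for i, text in enumerate(recognized_segments):
        if wake_word in text.lower():
            return recognized_segments[i:]
    return []  # wake word never recognized: nothing to process
```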
Abstract: Embodiments of the present disclosure disclose a method and apparatus for acquiring information. A specific embodiment of the method includes: acquiring a fundus image; introducing the fundus image into a pre-trained disease grading model to obtain disease grading information, the disease grading model being used for extracting characteristic information from a lesion image included in the fundus image and generating disease grading information based on the extracted characteristic information, the disease grading information including grade information of the disease, a lesion type, a lesion location, and the number of lesions included by the disease; and constructing output information using the disease grading information. This embodiment improves the accuracy of the grading information.
Abstract: The present disclosure provides a speech splicing and synthesis processing method and apparatus, a computer device and a readable medium. The method comprises: expanding a speech library according to a pre-trained speech synthesis model and an obtained synthesized text, the speech library before expansion comprising manually collected original language materials; and using the expanded speech library to perform speech splicing and synthesis processing. According to the technical solution of the present embodiment, the speech library is expanded so that it includes sufficient language materials. As such, when speech splicing is performed according to the expanded speech library, more speech segments are available for selection, thereby improving the coherence and naturalness of the synthesized speech and sufficiently satisfying the user's normal use.
Abstract: The present disclosure provides an obstacle detecting method and apparatus, a device and a storage medium, wherein the method comprises: obtaining a 3D point cloud collected by an unmanned vehicle during travel; projecting the 3D point cloud onto a two-dimensional grid to respectively obtain feature information of each grid cell; inputting the feature information of each grid cell into a pre-trained prediction model to respectively obtain obstacle prediction parameters of each grid cell; and performing grid clustering according to the obstacle prediction parameters of each grid cell to obtain obstacle detection results. The solution of the present disclosure improves the accuracy of the detection results.
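The projection step above can be sketched as binning points into grid cells. The particular per-cell features (point count and maximum height) and the cell size are assumptions for illustration; the abstract does not specify which feature information is extracted.

```python
def grid_features(points, cell_size=1.0):
    """points: iterable of (x, y, z) from the 3D point cloud.
    Returns {(ix, iy): (point_count, max_height)} per occupied grid cell."""
    cells = {}
    for x, y, z in points:
        # Project the 3D point onto the 2D grid by flooring its x/y cell index.
        key = (int(x // cell_size), int(y // cell_size))
        count, max_z = cells.get(key, (0, float("-inf")))
        cells[key] = (count + 1, max(max_z, z))
    return cells
```

Each cell's feature tuple would then be fed to the prediction model, and cells with similar obstacle prediction parameters clustered into obstacles.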
Abstract: A method and apparatus for recognizing a character area in an image are provided. A specific embodiment of the method comprises: acquiring color values and position information of pixel points in a to-be-recognized image; clustering the pixel points based on the color values of the pixel points, the color values of the pixel points in each pixel point category being identical or similar; determining, for each category of clustered pixel points, outlines of connected regions including the pixel points in that category, to obtain an outline set; and merging the outlines based on the color values and the position information of the outlines in the outline set, to obtain character areas in the image. The embodiment improves the accuracy of recognizing character lines when recognizing characters in an image.
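The clustering step above can be sketched as grouping pixels whose colors are identical or similar. The fixed per-channel tolerance and the first-seed-wins rule below are illustrative assumptions; the abstract does not name a clustering algorithm.

```python
def cluster_pixels(pixels, tolerance=30):
    """pixels: list of ((x, y), (r, g, b)).
    Returns a list of clusters, each a list of positions of pixel points
    whose color values are within `tolerance` of the cluster's seed color."""
    clusters = []  # each entry: (seed_color, [positions])
    for pos, color in pixels:
        for seed, members in clusters:
            if all(abs(c - s) <= tolerance for c, s in zip(color, seed)):
                members.append(pos)
                break
        else:
            # No similar cluster found: this pixel seeds a new category.
            clusters.append((color, [pos]))
    return [members for _, members in clusters]
```

Connected-region outlines would then be extracted per cluster and merged by color and position into character areas.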
Abstract: A map-based navigation method, apparatus and storage medium are provided. The method includes: acquiring a location status of a user terminal carrying a navigation client and a navigation operation of a user; and switching automatically between at least two types of navigation pages of the navigation client, based on the location status of the user terminal and the navigation operation of the user. This method achieves automatic switching between different types of navigation pages and satisfies different needs of the user in different situations.
Abstract: The present disclosure provides an interface intelligent interaction control method, apparatus and system, and a storage medium, wherein the method comprises: receiving user-input speech information and obtaining a speech recognition result; determining scenario elements associated with the speech recognition result; generating an entry corresponding to each scenario element and sending the speech recognition result and the entries to a cloud server; receiving, from the cloud server, the entry that best matches the speech recognition result, selected from the sent entries; and performing an interface operation corresponding to the best-matched entry. The solution of the present disclosure improves the flexibility and accuracy of speech control.
Abstract: A method and apparatus for acquiring information are provided. A specific embodiment of the method includes: acquiring a to-be-processed near-infrared image and scenario information corresponding to the to-be-processed near-infrared image, the scenario information being used to represent the scenario in which the to-be-processed near-infrared image was acquired; searching for a pre-trained near-infrared image recognition model corresponding to the scenario information, the near-infrared image recognition model being used to identify near-infrared feature information in the to-be-processed near-infrared image; and importing the to-be-processed near-infrared image into the near-infrared image recognition model to obtain the near-infrared feature information of the to-be-processed near-infrared image. This embodiment speeds up the acquisition of the near-infrared feature information and improves the accuracy and efficiency of acquiring it.
Abstract: A method for tracking a target profile in a video includes: determining position information of corner points of the target profile and parameter information of a first edge formed by adjacent corner points in a previous image frame adjacent to a current image frame; tracking the corner points of the target profile in the previous image frame to acquire position information of the corner points of the target profile in the current image frame and to determine parameter information of a second edge; in response to determining that a similarity between the first edge and the second edge corresponding to the first edge is less than a first preset threshold, predicting parameter information and generating candidate target profiles based on the predicted parameter information; and determining a final position of the target profile in the current image frame based on the candidate target profiles.
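The edge-similarity test above can be sketched as follows. The choice of edge length and orientation as the parameter information, and the tolerance values, are assumptions for illustration; the abstract does not define the edge parameters or the similarity measure.

```python
import math

def edge_params(p, q):
    """Parameter information of the edge formed by adjacent corner points
    p and q: (length, orientation angle in radians)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    return math.hypot(dx, dy), math.atan2(dy, dx)

def edges_similar(e1, e2, len_tol=0.2, ang_tol=0.2):
    """True if the second edge is similar enough to the corresponding first
    edge; otherwise candidate profiles must be generated from predicted
    parameter information."""
    len1, ang1 = e1
    len2, ang2 = e2
    return (abs(len1 - len2) <= len_tol * max(len1, 1e-9)
            and abs(ang1 - ang2) <= ang_tol)
```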
Abstract: The present disclosure provides a semantic analysis method and apparatus based on artificial intelligence. The method includes: matching to-be-processed input information with a preset semantic template, wherein the preset semantic template is generated according to semantic slot information and equipment information corresponding to an application scenario; when the input information is successfully matched with the preset semantic template, converting the input information into formative data according to the target semantic template successfully matched with the input information; and normalizing the formative data to generate a semantic analysis result corresponding to the input information.
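The template-matching and conversion steps above can be sketched with regular expressions standing in for semantic templates. The templates, intent names, and slot names below are hypothetical; the abstract does not specify the template representation.

```python
import re

# Each preset semantic template pairs a pattern (with named semantic slots)
# with the intent produced on a successful match.
TEMPLATES = [
    (re.compile(r"play (?P<song>.+)"), "music.play"),
    (re.compile(r"set volume to (?P<level>\d+)"), "volume.set"),
]

def analyze(text):
    for pattern, intent in TEMPLATES:
        m = pattern.fullmatch(text.strip().lower())
        if m:
            # Normalized semantic analysis result: intent plus filled slots.
            return {"intent": intent, "slots": m.groupdict()}
    return {"intent": "unknown", "slots": {}}
```

In the disclosed method the templates would also be selected by the equipment information of the application scenario, omitted here for brevity.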
Abstract: The present disclosure provides a method for switching control modes of a smart TV, a device and a computer readable medium. The method comprises: collecting a user's first triggering operation when the smart TV is in a first control mode; verifying, according to a preset switching operation corresponding to the first control mode, whether the first triggering operation is a switching operation for switching from the first control mode to a second control mode; and if so, switching the control mode of the smart TV from the first control mode to the second control mode. The technical solution of the present embodiment implements switching between different control modes of the TV so that the smart TV is in only one control mode at any time, avoiding conflicting control and effectively improving the control efficiency of the smart TV.
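The verification-and-switch step above amounts to a small state machine. The mode names and preset operations below are hypothetical examples; the abstract does not enumerate the control modes.

```python
# current mode -> (preset switching operation for that mode, target mode)
SWITCH_OPERATIONS = {
    "remote": ("long_press_ok", "voice"),
    "voice": ("say_switch", "remote"),
}

def handle_trigger(mode, operation):
    """Verify the collected triggering operation against the preset switching
    operation of the current mode; switch modes only on a match, so the TV is
    always in exactly one control mode."""
    preset, target = SWITCH_OPERATIONS[mode]
    return target if operation == preset else mode
```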
Abstract: Provided in the present invention are a method and apparatus for labeling training samples. In the embodiments of the present invention, two mutually independent classifiers, i.e., a first classifier and a second classifier, are used to perform collaborative prediction on M unlabeled first training samples to obtain labels for some of the first training samples, without the participation of operators; the operation is simple and the accuracy is high, thereby improving the efficiency and reliability of labeling training samples.
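The collaborative prediction above can be sketched as an agreement rule between two independent classifiers: a sample is labeled only when both classifiers predict the same label. The agreement criterion is an assumption for illustration; the abstract does not state how the two predictions are combined.

```python
def co_label(samples, classifier_a, classifier_b):
    """Returns (labeled, still_unlabeled). Samples on which the two
    independent classifiers agree are labeled with the agreed prediction,
    without operator participation; the rest remain unlabeled."""
    labeled, remaining = [], []
    for sample in samples:
        pa, pb = classifier_a(sample), classifier_b(sample)
        if pa == pb:
            labeled.append((sample, pa))
        else:
            remaining.append(sample)
    return labeled, remaining
```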
Abstract: An object tracking method and device are provided according to the disclosure. The object tracking method includes: acquiring an object image comprising an object; determining a current image with the object from images captured consecutively, according to the object image; calculating a first pose of the object in the current image using a first algorithm, and calculating a second pose of the object in the current image using a second algorithm, wherein the first algorithm has a faster calculating speed than the second algorithm; determining the pose of the object to be the first pose before the calculation of the second pose is completed; and, in a case that the calculation of the second pose is completed, calculating a third pose of the object according to the first pose and the second pose, and determining the pose of the object to be the third pose.
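The fast/slow scheme above can be sketched as follows. Averaging the two estimates is an assumed fusion rule for illustration; the abstract only says the third pose is calculated according to the first and second poses.

```python
def current_pose(first_pose, second_pose=None):
    """first_pose / second_pose: (x, y, theta). second_pose is None while the
    slower second algorithm is still running."""
    if second_pose is None:
        # Use the fast first result before the slow calculation completes.
        return first_pose
    # Fuse the two estimates into the third, final pose (assumed: average).
    return tuple((a + b) / 2 for a, b in zip(first_pose, second_pose))
```

The design gives a low-latency pose immediately while still benefiting from the slower, presumably more accurate second algorithm once it finishes.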
Abstract: An artificial intelligence-based cross-language speech transcription method and apparatus, a device and a readable medium are provided. The method includes: pre-processing to-be-transcribed speech data to obtain multiple acoustic features, the to-be-transcribed speech data being represented in a first language; and predicting a corresponding translation text for the speech data according to the multiple acoustic features and a pre-trained cross-language transcription model, wherein the translation text is represented in a second language different from the first language. According to the technical solution, it is unnecessary to perform speech recognition first and then machine translation upon cross-language speech transcription; instead, cross-language transcription is performed directly according to the pre-trained cross-language transcription model.