Patents by Inventor Saurabh Tahiliani
Saurabh Tahiliani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240127790
Abstract: A device may receive and convert audio data to text data in real-time, and may detect a network fluctuation that causes missing voice packets. The device may process partial text and context of the text data, with a model, to generate a new phrase, and may generate a response phoneme for the new phrase. The device may utilize a text embedding model to generate a text embedding for the response phoneme, and may process the audio data, with the model, to generate a target voice sequence. The device may utilize an audio embedding model to generate an audio embedding for the target voice sequence, and may combine the text embedding and the audio embedding to generate an embedding input vector. The device may process the embedding input vector, with an audio synthesis model, to generate a final voice response, and may provide the audio data and the final voice response.
Type: Application
Filed: October 12, 2022
Publication date: April 18, 2024
Applicant: Verizon Patent and Licensing Inc.
Inventors: Saurabh Tahiliani, Subham Biswas
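The abstract above describes combining a text embedding and an audio embedding into a single embedding input vector for an audio synthesis model. A minimal sketch of one common way to combine fixed-size embeddings (simple concatenation); the function name and dimensions are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def combine_embeddings(text_emb: np.ndarray, audio_emb: np.ndarray) -> np.ndarray:
    """Concatenate a text embedding and an audio embedding into one
    input vector for a downstream synthesis model."""
    return np.concatenate([text_emb, audio_emb])

# Illustrative fixed-size embeddings; the sizes are assumptions.
text_embedding = np.random.default_rng(0).normal(size=128)
audio_embedding = np.random.default_rng(1).normal(size=256)

input_vector = combine_embeddings(text_embedding, audio_embedding)
```

Concatenation preserves both embeddings unchanged, leaving the synthesis model to learn how to weight the two modalities; other systems instead sum or project the embeddings into a shared space.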
-
Publication number: 20240121487
Abstract: A video summary device may generate a textual summary of a transcription of a virtual event. The video summary device may generate a phonemic transcription of the textual summary and generate a text embedding based on the phonemic transcription. The video summary device may generate an audio embedding based on a target voice. The video summary device may generate an audio output of the phonemic transcription uttered by the target voice. The audio output may be generated based on the text embedding and the audio embedding. The video summary device may generate an image embedding based on video data of a target user. The image embedding may include information regarding images of facial movements of the target user. The video summary device may generate a video output of different facial movements of the target user uttering the phonemic transcription, based on the text embedding and the image embedding.
Type: Application
Filed: December 19, 2023
Publication date: April 11, 2024
Applicant: Verizon Patent and Licensing Inc.
Inventors: Subham Biswas, Saurabh Tahiliani
-
Publication number: 20240096075
Abstract: A method may include receiving a plurality of images to train a first neural network, masking a portion of each of the images and inputting the masked images to the first neural network. The method may also include generating, by the first neural network, probable pixel values for pixels located in the masked portion of each of the plurality of images, forwarding the images including the probable pixel values to a second neural network and determining, by the second neural network, whether each of the probable pixel values is contextually suitable. The method may further include identifying pixels in each of the plurality of images that are not contextually suitable.
Type: Application
Filed: September 21, 2022
Publication date: March 21, 2024
Inventors: Subham Biswas, Saurabh Tahiliani
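The masking step described above is a standard preprocessing operation for inpainting-style training. A minimal sketch of masking a fixed fraction of each image; the centered-square placement and the `fraction` parameter are illustrative assumptions, not details from the patent:

```python
import numpy as np

def mask_center(image: np.ndarray, fraction: float = 0.25) -> np.ndarray:
    """Zero out a centered square covering roughly `fraction` of the image
    area; the masked copy is what the first network would inpaint."""
    h, w = image.shape[:2]
    side = fraction ** 0.5                  # square root so area matches fraction
    mh, mw = int(h * side), int(w * side)
    top, left = (h - mh) // 2, (w - mw) // 2
    masked = image.copy()
    masked[top:top + mh, left:left + mw] = 0
    return masked

img = np.ones((64, 64, 3))
masked = mask_center(img)
```

Real pipelines often randomize the mask position and shape per image so the network cannot memorize a fixed hole location.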
-
Publication number: 20240095449
Abstract: In some implementations, a transcription system may generate a first transcript based on audio data of a conversation between a first user and a second user. The transcription system may determine, using a first machine learning model of the transcription system, that a portion of the first transcript is incorrect. The transcription system may generate, using a second machine learning model, additional data for transcribing the audio data based on determining that the portion of the first transcript is incorrect. The additional data is generated using a portion of the audio data corresponding to the portion of the first transcript. The transcription system may generate a second transcript based on the audio data and the additional data. The transcription system may provide the second transcript to one or more devices.
Type: Application
Filed: September 16, 2022
Publication date: March 21, 2024
Applicant: Verizon Patent and Licensing Inc.
Inventors: Prakash Ranganathan, Saurabh Tahiliani
-
Patent number: 11934614
Abstract: Disclosed are systems and methods for an anomaly detection framework that operates as an executable analysis tool that a device can run to determine whether the device contains an unresponsive touch screen (e.g., a defective or malfunctioning touch screen). The disclosed framework can analyze the capacitance capabilities of the touch screen, inclusive of the touch layers associated with the touch screen panel, and determine when a device's touch screen is unresponsive to user-provided input, which can be any type of touch or gesture provided on a touch screen.
Type: Grant
Filed: October 21, 2022
Date of Patent: March 19, 2024
Assignee: Verizon Patent and Licensing Inc.
Inventors: Prakash Ranganathan, Saurabh Tahiliani
-
Publication number: 20240054758
Abstract: Techniques for identifying and tracking objects in digital content are disclosed. In one embodiment, a method is disclosed comprising obtaining a frame of digital content, the frame comprising pixel data, detecting an object using the pixel data, determining a set of attributes for the detected object, the set of attributes comprising position, object segment and affine attributes, determining a similarity measurement for the detected object and a second object using the set of attributes corresponding to the detected object and the second object's set of attributes, and using the similarity measurement to make a similarity determination whether or not the detected object and the second object are the same object.
Type: Application
Filed: August 11, 2022
Publication date: February 15, 2024
Applicant: Verizon Patent and Licensing Inc.
Inventors: Prakash Ranganathan, Saurabh Tahiliani
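The similarity measurement between two objects' attribute sets can be illustrated with a simple vector comparison. A minimal sketch using cosine similarity over bounding-box attributes and a fixed threshold; the attribute layout, metric, and threshold value are assumptions for illustration only, not the patent's method:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two attribute vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def same_object(attrs_a, attrs_b, threshold=0.99):
    """Treat two detections as the same object when their attribute
    vectors are nearly parallel (similarity at or above the threshold)."""
    return cosine_similarity(attrs_a, attrs_b) >= threshold

# Hypothetical attribute vectors: x, y, width, height of a detection
# in the current frame vs. a candidate from an earlier frame.
detected = [10.0, 20.0, 50.0, 80.0]
candidate = [11.0, 21.0, 49.0, 79.0]
is_same = same_object(detected, candidate)
```

In practice trackers also weight attributes differently (e.g. position drift vs. shape change) before thresholding.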
-
Publication number: 20240037824
Abstract: Techniques for generating emotionally-aware digital content are disclosed. In one embodiment, a method is disclosed comprising obtaining audio input; obtaining a textual representation of the audio input; using the textual representation of the audio input to identify an emotion corresponding to the audio input; generating an emotionally-aware facial representation in accordance with the textual representation and the identified emotion; using the emotionally-aware facial representation to generate one or more images comprising at least one facial expression corresponding to the identified emotion; and providing digital content comprising the one or more images.
Type: Application
Filed: July 26, 2022
Publication date: February 1, 2024
Applicant: Verizon Patent and Licensing Inc.
Inventors: Subham Biswas, Saurabh Tahiliani
-
Patent number: 11889168
Abstract: A video summary device may generate a textual summary of a transcription of a virtual event. The video summary device may generate a phonemic transcription of the textual summary and generate a text embedding based on the phonemic transcription. The video summary device may generate an audio embedding based on a target voice. The video summary device may generate an audio output of the phonemic transcription uttered by the target voice. The audio output may be generated based on the text embedding and the audio embedding. The video summary device may generate an image embedding based on video data of a target user. The image embedding may include information regarding images of facial movements of the target user. The video summary device may generate a video output of different facial movements of the target user uttering the phonemic transcription, based on the text embedding and the image embedding.
Type: Grant
Filed: July 11, 2022
Date of Patent: January 30, 2024
Assignee: Verizon Patent and Licensing Inc.
Inventors: Subham Biswas, Saurabh Tahiliani
-
Publication number: 20240015371
Abstract: A video summary device may generate a textual summary of a transcription of a virtual event. The video summary device may generate a phonemic transcription of the textual summary and generate a text embedding based on the phonemic transcription. The video summary device may generate an audio embedding based on a target voice. The video summary device may generate an audio output of the phonemic transcription uttered by the target voice. The audio output may be generated based on the text embedding and the audio embedding. The video summary device may generate an image embedding based on video data of a target user. The image embedding may include information regarding images of facial movements of the target user. The video summary device may generate a video output of different facial movements of the target user uttering the phonemic transcription, based on the text embedding and the image embedding.
Type: Application
Filed: July 11, 2022
Publication date: January 11, 2024
Applicant: Verizon Patent and Licensing Inc.
Inventors: Subham Biswas, Saurabh Tahiliani
-
Publication number: 20230403559
Abstract: In an example, a text message sent by a first user equipment (UE) and addressed to a second UE is received. In response to receiving the text message, a set of information associated with the text message is determined based upon information determined by a first carrier of the first UE and/or the second UE. The text message is classified as spam or not spam based upon the set of information.
Type: Application
Filed: June 13, 2022
Publication date: December 14, 2023
Inventors: Prakash Ranganathan, Saurabh Tahiliani
-
Patent number: 11825353
Abstract: A system described herein may provide a technique for the assignment of Centralized Units ("CUs") to Distributed Units ("DUs") in a radio access network ("RAN") that includes a distributed or hierarchical arrangement of network infrastructure equipment. Different groups of DUs may be modeled based on usage or traffic patterns, and complementary groups of DUs may be identified based on measures of usage that may vary with time. For example, one model associated with one group of DUs may experience relatively heavy usage during morning hours and light usage during evening hours, and another model associated with a complementary group of DUs may experience relatively light usage during morning hours and heavy usage during evening hours.
Type: Grant
Filed: November 29, 2021
Date of Patent: November 21, 2023
Assignee: Verizon Patent and Licensing Inc.
Inventors: Seng Gan, Subham Biswas, Christopher A. Graffeo, Saurabh Tahiliani
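The "complementary groups" idea in the abstract above (one group peaks in the morning, the other in the evening) can be illustrated as anti-correlation between usage time series. A minimal sketch using the Pearson correlation coefficient with a negative threshold; the metric choice, sample data, and threshold are illustrative assumptions, not the patent's algorithm:

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (std_x * std_y)

def complementary(usage_a, usage_b, threshold=-0.5):
    """Two DU groups are complementary when their usage is strongly
    anti-correlated over the day (one peaks while the other is idle)."""
    return pearson(usage_a, usage_b) <= threshold

# Hypothetical hourly load samples for two DU groups.
morning_heavy = [90, 85, 80, 30, 20, 10]
evening_heavy = [10, 15, 20, 70, 80, 95]
result = complementary(morning_heavy, evening_heavy)
```

Pairing anti-correlated DU groups onto shared CUs lets the same capacity serve both peaks.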
-
Publication number: 20230334814
Abstract: A device may receive unprocessed images to be labeled, and may utilize a first neural network model to identify objects of interest in the unprocessed images and bounding boxes for the objects of interest. The device may annotate the objects of interest to generate annotated objects of interest, and may utilize a second neural network model to group the annotated objects of interest into clusters. The device may utilize a third neural network model to determine labels for the clusters, and may request manually-generated labels for clusters for which labels are not determined. The device may receive the manually-generated labels, and may label the unprocessed images with the labels and the manually-generated labels to generate labeled images. The device may generate a training dataset based on the labeled images, and may train a computer vision model with the training dataset to generate a trained computer vision model.
Type: Application
Filed: April 19, 2022
Publication date: October 19, 2023
Applicant: Verizon Patent and Licensing Inc.
Inventors: Prakash Ranganathan, Saurabh Tahiliani
-
Patent number: 11750742
Abstract: A device may receive audio data of a first call between a first user and a second user. The device may generate, based on the audio data, time series data associated with an audio signal of the first call and may process, using a first machine learning model, the time series data to generate first call insight information regarding one or more first insights associated with the first call. The device may process the audio data to generate image data associated with the audio signal and may process, using a second machine learning model, the image data to generate second call insight information regarding one or more second insights associated with the first call. The device may combine the first call insight information and the second call insight information to generate combined call insight information and cause an action to be performed based on the combined call insight information.
Type: Grant
Filed: September 8, 2022
Date of Patent: September 5, 2023
Assignee: Verizon Patent and Licensing Inc.
Inventors: Subham Biswas, Saurabh Tahiliani
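The "image data associated with the audio signal" mentioned above is typically a spectrogram: a 2-D picture of how the signal's frequency content evolves over time, which an image model can then analyze. A minimal sketch of building a magnitude spectrogram with a short-time FFT; the frame and hop sizes are illustrative assumptions, not parameters from the patent:

```python
import numpy as np

def spectrogram_image(signal: np.ndarray, frame: int = 64, hop: int = 32) -> np.ndarray:
    """Turn a 1-D audio signal into a 2-D magnitude spectrogram
    (frequency bins x time frames) suitable for an image model."""
    frames = [signal[i:i + frame] for i in range(0, len(signal) - frame + 1, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=1)).T

# Hypothetical call audio: a 50 Hz tone sampled over one second.
t = np.linspace(0, 1, 1024, endpoint=False)
audio = np.sin(2 * np.pi * 50 * t)
image = spectrogram_image(audio)
```

Production systems usually apply a window function and a mel-scale mapping before handing the image to the model.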
-
Publication number: 20230215128
Abstract: Systems and methods described herein utilize synthetic pixel generation using a custom neural network to generate synthetic versions of objects hidden by occlusions for effective detection and tracking. A computing device stores an object detector model and a synthetic image generator model; receives a video feed; detects objects of interest in a current frame of the video feed; identifies an occluded object in the current frame; retrieves a previous frame from the video feed; generates synthetic data based on the previous frame for the occluded object; and forwards a modified version of the current frame to an object tracking system, wherein the modified version of the current frame includes the synthetic data.
Type: Application
Filed: January 5, 2022
Publication date: July 6, 2023
Inventors: Prakash Ranganathan, Saurabh Tahiliani
-
Publication number: 20230171644
Abstract: A system described herein may provide a technique for the assignment of Centralized Units ("CUs") to Distributed Units ("DUs") in a radio access network ("RAN") that includes a distributed or hierarchical arrangement of network infrastructure equipment. Different groups of DUs may be modeled based on usage or traffic patterns, and complementary groups of DUs may be identified based on measures of usage that may vary with time. For example, one model associated with one group of DUs may experience relatively heavy usage during morning hours and light usage during evening hours, and another model associated with a complementary group of DUs may experience relatively light usage during morning hours and heavy usage during evening hours.
Type: Application
Filed: November 29, 2021
Publication date: June 1, 2023
Applicant: Verizon Patent and Licensing Inc.
Inventors: Seng Gan, Subham Biswas, Christopher A. Graffeo, Saurabh Tahiliani
-
Publication number: 20230169990
Abstract: Techniques for generating emotionally-aware audio, or voice, responses for a user interface of an application, such as an automated voice response application, are disclosed. In one embodiment, a method is disclosed comprising obtaining voice input from a user via an automated voice response user interface of an application; obtaining a textual representation of the voice input; using the textual representation of the voice input to obtain a source emotion of the user; determining a response emotion using the source emotion; generating a response textual representation indicating textual content of the response; generating a frequency spectrum representation of the response in accordance with the response textual representation and the response emotion; using the frequency spectrum representation of the response to generate a voice response reflective of the textual content of the response and the response emotion; and communicating the response to the user via the user interface.
Type: Application
Filed: December 1, 2021
Publication date: June 1, 2023
Applicant: Verizon Patent and Licensing Inc.
Inventors: Subham Biswas, Saurabh Tahiliani
-
Publication number: 20230058560
Abstract: A device may receive audio data of a first call between a first user and a second user. The device may generate, based on the audio data, time series data associated with an audio signal of the first call and may process, using a first machine learning model, the time series data to generate first call insight information regarding one or more first insights associated with the first call. The device may process the audio data to generate image data associated with the audio signal and may process, using a second machine learning model, the image data to generate second call insight information regarding one or more second insights associated with the first call. The device may combine the first call insight information and the second call insight information to generate combined call insight information and cause an action to be performed based on the combined call insight information.
Type: Application
Filed: September 8, 2022
Publication date: February 23, 2023
Applicant: Verizon Patent and Licensing Inc.
Inventors: Subham Biswas, Saurabh Tahiliani
-
Publication number: 20230050134
Abstract: Disclosed are embodiments for improving training data for machine learning (ML) models. In an embodiment, a method is disclosed where an augmentation engine receives a seed example, the seed example stored in a seed training data set; generates an encoded seed example of the seed example using an encoder; inputs the encoded seed example into a machine learning model and receives a candidate example generated by the machine learning model; determines that the candidate example is similar to the encoded seed example; and augments the seed training data set with the candidate example.
Type: Application
Filed: August 11, 2021
Publication date: February 16, 2023
Applicant: Verizon Patent and Licensing Inc.
Inventors: Subham Biswas, Saurabh Tahiliani
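The gating step described above (accept a generated candidate only if it stays close to the encoded seed) can be sketched with a similarity threshold. Everything here is an illustrative assumption, including the toy character-frequency encoder and the threshold value; it is not the patent's encoder or model:

```python
def cosine(a, b):
    """Cosine similarity with a zero-vector guard."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def augment_if_similar(seed_set, encoded_seed, candidate, encode, threshold=0.8):
    """Add a model-generated candidate to the training set only when its
    encoding stays close to the encoded seed example."""
    if cosine(encode(candidate), encoded_seed) >= threshold:
        seed_set.append(candidate)
    return seed_set

# Toy encoder: character-frequency vector over a tiny alphabet.
def encode(text):
    return [text.count(c) for c in "abcdehlort "]

seeds = ["hello there"]
encoded = encode(seeds[0])
augment_if_similar(seeds, encoded, "hello th ere", encode)
```

The filter keeps augmentation from drifting: candidates far from the seed distribution are discarded instead of polluting the training set.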
-
Publication number: 20230039235
Abstract: Techniques for generating conversational responses for a conversational user interface are disclosed. In one embodiment, a method is disclosed comprising obtaining user input from a user via a conversational user interface, using the user input to obtain a user emotion and a user intent, obtaining candidate probabilities for a fragment of a response to the user input using the obtained user emotion, the obtained user intent and the user input, generating the response to the user input using the candidate probabilities obtained for the fragment to select a candidate for the fragment of the response, and communicating the response to the user via the conversational user interface.
Type: Application
Filed: August 4, 2021
Publication date: February 9, 2023
Applicant: Verizon Patent and Licensing Inc.
Inventors: Subham Biswas, Bharatwaaj Shankar, Saurabh Tahiliani
-
Publication number: 20230027936
Abstract: One or more computing devices, systems, and/or methods are provided. In an example, a conversation path associated with a revised code segment of a conversational interaction entity is identified by a processor. The conversation path has a predetermined intent. A conversational phrase is generated by the processor for the conversation path. The conversational interaction entity is employed by the processor using the conversation path and the conversational phrase to generate a resultant intent. An issue report is generated by the processor for the conversational interaction entity responsive to the resultant intent not matching the predetermined intent.
Type: Application
Filed: July 23, 2021
Publication date: January 26, 2023
Inventors: Prakash Ranganathan, Saurabh Tahiliani
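The check described above is essentially a regression test: run a generated phrase through the conversational entity and flag a mismatch between the resultant and predetermined intents. A minimal sketch; the toy keyword-based entity, the dictionary fields, and the fixed sample phrase (standing in for the generated conversational phrase) are all illustrative assumptions, not the patent's mechanism:

```python
def check_conversation_path(entity, path, expected_intent):
    """Run one phrase through the conversational entity and return an
    issue report when the resultant intent differs from the expected one."""
    phrase = path["sample_phrase"]
    resultant = entity(phrase)
    if resultant != expected_intent:
        return {"path": path["name"], "expected": expected_intent,
                "got": resultant, "phrase": phrase}
    return None  # no issue to report

# Toy entity: resolves intent by keyword lookup.
def toy_entity(phrase):
    return "billing" if "bill" in phrase else "unknown"

issue = check_conversation_path(
    toy_entity,
    {"name": "billing-path", "sample_phrase": "question about my bill"},
    "billing",
)
```

Running such checks after every revision of the entity's code catches conversation paths whose intent resolution silently regressed.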