Patents by Inventor Saurabh Tahiliani
Saurabh Tahiliani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250053992
Abstract: The present teaching relates to conducting persona-adaptive communications with a customer at a geo-locale. Transcripts of current and historic communications involving the customer are used to characterize the persona of the customer. Transcripts of historic communications with customers at the geo-locale are used to characterize the persona of the geo-locale. The current persona of the customer, exhibited in the current communication, is combined with the customer's persona and the geo-locale's persona to compute a response input vector. A language model generates, based on the response input vector, a persona-adaptive response, which is then sent to the customer.
Type: Application
Filed: August 10, 2023
Publication date: February 13, 2025
Applicant: Verizon Patent and Licensing Inc.
Inventors: Durgesh Kumar, Saurabh Tahiliani
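The combination step in this abstract can be sketched in a few lines. This is a minimal illustration, not the patented method: the function name and the use of plain concatenation as the combination operator are assumptions, since the abstract does not specify how the three signals are merged.

```python
def build_response_input(current_vec, customer_persona_vec, geo_persona_vec):
    # Concatenation is one simple way to combine the three signals;
    # the abstract does not specify the combination operator.
    return current_vec + customer_persona_vec + geo_persona_vec

# Toy 4-dimensional vectors standing in for learned embeddings.
current = [0.1, 0.2, 0.3, 0.4]    # current communication
customer = [0.5, 0.1, 0.0, 0.2]   # customer persona
geo = [0.3, 0.3, 0.1, 0.1]        # geo-locale persona

vec = build_response_input(current, customer, geo)
print(len(vec))  # 12
```

The resulting vector would then condition a language model to produce the persona-adaptive response.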
-
Publication number: 20250021818
Abstract: The present teaching relates to compressing a model for an application to generate a compressed model. The model has multiple layers, each of which has multiple nodes. By operating the model on an application-dependent dataset, redundant nodes/layers in the model are identified via a loss-based assessment. The loss-based assessment uses aggregated output vectors computed from the output vectors produced by the nodes/layers of the model in response to the data samples of the application-dependent dataset. Removing the redundant nodes/layers yields the compressed model.
Type: Application
Filed: July 14, 2023
Publication date: January 16, 2025
Applicant: Verizon Patent and Licensing Inc.
Inventors: Subham Biswas, Saurabh Tahiliani
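The idea of a loss-based redundancy assessment can be sketched as follows. This is a hypothetical simplification: here a node counts as redundant if disabling it raises the loss on the application-dependent dataset by less than a tolerance, which is one plausible reading of the abstract rather than the claimed procedure.

```python
def redundant_nodes(loss_with_disabled, baseline_loss, nodes, tol=1e-3):
    """Return nodes whose removal raises the loss by less than `tol`.
    `loss_with_disabled(n)` evaluates the model with node `n` disabled."""
    return [n for n in nodes if loss_with_disabled(n) - baseline_loss < tol]

# Toy example: per-node loss deltas stand in for real evaluations.
deltas = {"n1": 0.0001, "n2": 0.25, "n3": 0.0005}
base = 1.0
prune = redundant_nodes(lambda n: base + deltas[n], base, list(deltas))
print(prune)  # ['n1', 'n3']
```

Nodes "n1" and "n3" barely affect the loss and would be removed to yield the compressed model.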
-
Patent number: 12197860
Abstract: One or more computing devices, systems, and/or methods are provided. In an example, a conversation path associated with a revised code segment of a conversational interaction entity is identified by a processor. The conversation path has a predetermined intent. A conversational phrase is generated by the processor for the conversation path. The conversational interaction entity is employed by the processor using the conversation path and the conversational phrase to generate a resultant intent. An issue report is generated by the processor for the conversational interaction entity responsive to the resultant intent not matching the predetermined intent.
Type: Grant
Filed: July 23, 2021
Date of Patent: January 14, 2025
Assignee: Verizon Patent and Licensing Inc.
Inventors: Prakash Ranganathan, Saurabh Tahiliani
-
Patent number: 12200322
Abstract: A video summary device may generate a textual summary of a transcription of a virtual event. The video summary device may generate a phonemic transcription of the textual summary and generate a text embedding based on the phonemic transcription. The video summary device may generate an audio embedding based on a target voice. The video summary device may generate an audio output of the phonemic transcription uttered by the target voice. The audio output may be generated based on the text embedding and the audio embedding. The video summary device may generate an image embedding based on video data of a target user. The image embedding may include information regarding images of facial movements of the target user. The video summary device may generate a video output of different facial movements of the target user uttering the phonemic transcription, based on the text embedding and the image embedding.
Type: Grant
Filed: December 19, 2023
Date of Patent: January 14, 2025
Assignee: Verizon Patent and Licensing Inc.
Inventors: Subham Biswas, Saurabh Tahiliani
-
Publication number: 20240419901
Abstract: A device may receive text data associated with a chatbot, a live chat, or an interactive voice response system, and may preprocess the text data with one or more preprocessing techniques to generate preprocessed data and key intents. The device may convert the preprocessed data and the key intents into embeddings, and may combine the embeddings into an input vector. The device may process the input vector, with a language model, to identify relationships between words and phrases of the text data, and may process the input vector and the relationships, with a summary generation model, to generate a summary of the text data. The device may perform one or more actions based on the summary of the text data.
Type: Application
Filed: June 15, 2023
Publication date: December 19, 2024
Applicant: Verizon Patent and Licensing Inc.
Inventors: Durgesh Kumar, Saurabh Tahiliani
-
Publication number: 20240420442
Abstract: A device may receive unprocessed images to be labeled, and may utilize a first neural network model to identify objects of interest in the unprocessed images and bounding boxes for the objects of interest. The device may annotate the objects of interest to generate annotated objects of interest, and may utilize a second neural network model to group the annotated objects of interest into clusters. The device may utilize a third neural network model to determine labels for the clusters, and may request manually-generated labels for clusters for which labels are not determined. The device may receive the manually-generated labels, and may label the unprocessed images with the labels and the manually-generated labels to generate labeled images. The device may generate a training dataset based on the labeled images, and may train a computer vision model with the training dataset to generate a trained computer vision model.
Type: Application
Filed: August 27, 2024
Publication date: December 19, 2024
Applicant: Verizon Patent and Licensing Inc.
Inventors: Prakash Ranganathan, Saurabh Tahiliani
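The fallback-to-manual step of this labeling pipeline can be sketched briefly. The function and data shapes below are illustrative assumptions: clusters whose label the third model cannot determine are routed to manual review, and the rest are auto-labeled.

```python
def assign_labels(clusters, auto_labeler):
    """Auto-label each cluster; collect cluster ids needing manual labels.
    `auto_labeler(members)` stands in for the third neural network and
    returns None when it cannot determine a label."""
    labels, needs_manual = {}, []
    for cid, members in clusters.items():
        label = auto_labeler(members)
        if label is None:
            needs_manual.append(cid)   # request a manually-generated label
        else:
            labels[cid] = label
    return labels, needs_manual

clusters = {0: ["car", "car"], 1: ["???", "???"]}
auto = lambda members: members[0] if members[0] != "???" else None
labels, manual = assign_labels(clusters, auto)
print(labels, manual)  # {0: 'car'} [1]
```

Labeled clusters and manually-resolved clusters together yield the training dataset for the downstream computer vision model.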
-
Publication number: 20240386887
Abstract: The present teaching relates to personalized IVR communications with a customer at a geo-locale. A first set of transcripts of the current and historic communications involving the customer and a second set of transcripts of historic communications associated with the geo-locale are analyzed to compute a personalized contextual vector, a geo-localized contextual vector, and a current text vector. The computed vectors are used by a language model to generate a personalized and geo-locale aware prompt, which is used to generate an IVR communication and is sent to the customer as a response.
Type: Application
Filed: May 18, 2023
Publication date: November 21, 2024
Applicant: Verizon Patent and Licensing Inc.
Inventors: Durgesh Kumar, Saurabh Tahiliani
-
Publication number: 20240321260
Abstract: A device may receive video data that includes a text transcript, audio sequences, and image frames, and may detect a network fluctuation. The device may process the text transcript to generate a new phrase, and may generate a response phoneme based on the new phrase. The device may generate a text embedding based on the response phoneme, and may process the audio sequences to generate a target voice sequence. The device may generate an audio embedding based on the target voice sequence, and may process the image frames to generate a target image sequence. The device may generate an image embedding based on the target image sequence, and may combine the embeddings to generate an embedding input vector. The device may generate a final voice response and a final video based on the embedding input vector, and may provide the video data, the final voice response, and the final video.
Type: Application
Filed: March 24, 2023
Publication date: September 26, 2024
Applicant: Verizon Patent and Licensing Inc.
Inventors: Subham Biswas, Saurabh Tahiliani
-
Publication number: 20240311983
Abstract: In an example, an image may be identified. Object detection may be performed on the image to identify a region including a distorted representation of an object. The region may be masked to generate a masked image including a masked region corresponding to the object. Using a machine learning model, the masked region may be replaced with an undistorted representation of the object to generate a modified image.
Type: Application
Filed: March 16, 2023
Publication date: September 19, 2024
Inventors: Subham Biswas, Saurabh Tahiliani
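The mask-then-inpaint flow can be sketched on a toy pixel grid. Everything here is an assumed simplification: the "model" just fills masked pixels with a constant, standing in for the learned generator that would produce an undistorted representation.

```python
MASK = -1  # sentinel for masked pixels

def mask_region(image, box):
    """Mask the detected region given as (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    out = [row[:] for row in image]
    for y in range(y0, y1):
        for x in range(x0, x1):
            out[y][x] = MASK
    return out

def inpaint(image, model):
    """Replace each masked pixel with the model's prediction."""
    return [[model(p) if p == MASK else p for p in row] for row in image]

img = [[1, 1, 1], [1, 9, 1], [1, 1, 1]]   # 9 = distorted pixel
masked = mask_region(img, (1, 1, 2, 2))
fixed = inpaint(masked, lambda _: 1)       # model predicts the context value
print(fixed)  # [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

In practice the inpainting model would condition on the surrounding unmasked pixels rather than emit a constant.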
-
Patent number: 12094181
Abstract: A device may receive unprocessed images to be labeled, and may utilize a first neural network model to identify objects of interest in the unprocessed images and bounding boxes for the objects of interest. The device may annotate the objects of interest to generate annotated objects of interest, and may utilize a second neural network model to group the annotated objects of interest into clusters. The device may utilize a third neural network model to determine labels for the clusters, and may request manually-generated labels for clusters for which labels are not determined. The device may receive the manually-generated labels, and may label the unprocessed images with the labels and the manually-generated labels to generate labeled images. The device may generate a training dataset based on the labeled images, and may train a computer vision model with the training dataset to generate a trained computer vision model.
Type: Grant
Filed: April 19, 2022
Date of Patent: September 17, 2024
Assignee: Verizon Patent and Licensing Inc.
Inventors: Prakash Ranganathan, Saurabh Tahiliani
-
Publication number: 20240211994
Abstract: A method may include receiving frames associated with a video stream, identifying a first object image included in at least some of the frames and masking a region, in the at least some of the frames, associated with the first object image. The method may also include receiving information identifying at least one attribute associated with a user and identifying, based on the received information, a second object image to replace the first object image. The method may further include replacing pixel values in the masked region with contextually suitable pixel values associated with the second object image and outputting the video stream with the second object image replacing the first object image in the at least some of the frames.
Type: Application
Filed: December 22, 2022
Publication date: June 27, 2024
Inventors: Subham Biswas, Saurabh Tahiliani
-
Publication number: 20240127790
Abstract: A device may receive and convert audio data to text data in real-time, and may detect a network fluctuation that causes missing voice packets. The device may process partial text and context of the text data, with a model, to generate a new phrase, and may generate a response phoneme for the new phrase. The device may utilize a text embedding model to generate a text embedding for the response phoneme, and may process the audio data, with the model, to generate a target voice sequence. The device may utilize an audio embedding model to generate an audio embedding for the target voice sequence, and may combine the text embedding and the audio embedding to generate an embedding input vector. The device may process the embedding input vector, with an audio synthesis model, to generate a final voice response, and may provide the audio data and the final voice response.
Type: Application
Filed: October 12, 2022
Publication date: April 18, 2024
Applicant: Verizon Patent and Licensing Inc.
Inventors: Saurabh Tahiliani, Subham Biswas
-
Publication number: 20240121487
Abstract: A video summary device may generate a textual summary of a transcription of a virtual event. The video summary device may generate a phonemic transcription of the textual summary and generate a text embedding based on the phonemic transcription. The video summary device may generate an audio embedding based on a target voice. The video summary device may generate an audio output of the phonemic transcription uttered by the target voice. The audio output may be generated based on the text embedding and the audio embedding. The video summary device may generate an image embedding based on video data of a target user. The image embedding may include information regarding images of facial movements of the target user. The video summary device may generate a video output of different facial movements of the target user uttering the phonemic transcription, based on the text embedding and the image embedding.
Type: Application
Filed: December 19, 2023
Publication date: April 11, 2024
Applicant: Verizon Patent and Licensing Inc.
Inventors: Subham Biswas, Saurabh Tahiliani
-
Publication number: 20240096075
Abstract: A method may include receiving a number of images to train a first neural network, masking a portion of each of the images and inputting the masked images to the first neural network. The method may also include generating, by the first neural network, probable pixel values for pixels located in the masked portion of each of the plurality of images, forwarding the images including the probable pixel values to a second neural network and determining, by the second neural network, whether each of the probable pixel values is contextually suitable. The method may further include identifying pixels in each of the plurality of images that are not contextually suitable.
Type: Application
Filed: September 21, 2022
Publication date: March 21, 2024
Inventors: Subham Biswas, Saurabh Tahiliani
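The two-network check described here can be sketched in miniature. This is a hypothetical simplification: the first network proposes values for masked pixels, and the second network flags proposals that are not contextually suitable; both networks are replaced by trivial callables.

```python
def propose(masked_row, predictor):
    """First network: fill in masked (None) pixels with predicted values."""
    return [predictor(i) if v is None else v for i, v in enumerate(masked_row)]

def unsuitable(pixels, judge):
    """Second network: return indices whose values it rejects."""
    return [i for i, v in enumerate(pixels) if not judge(i, v)]

row = [1, 1, None, 1]                      # None marks the masked pixel
filled = propose(row, lambda i: 7)         # first network's guess
bad = unsuitable(filled, lambda i, v: v in (0, 1))
print(filled, bad)  # [1, 1, 7, 1] [2]
```

The rejected index would feed back into training, pushing the first network toward contextually suitable proposals, in the spirit of an adversarial setup.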
-
Publication number: 20240095449
Abstract: In some implementations, a transcription system may generate a first transcript based on audio data of a conversation between a first user and a second user. The transcription system may determine, using a first machine learning model of the transcription system, that a portion of the first transcript is incorrect. The transcription system may generate, using a second machine learning model, additional data for transcribing the audio data based on determining that the portion of the first transcript is incorrect. The additional data is generated using a portion of the audio data corresponding to the portion of the first transcript. The transcription system may generate a second transcript based on the audio data and the additional data. The transcription system may provide the second transcript to one or more devices.
Type: Application
Filed: September 16, 2022
Publication date: March 21, 2024
Applicant: Verizon Patent and Licensing Inc.
Inventors: Prakash Ranganathan, Saurabh Tahiliani
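The detect-then-retranscribe flow can be sketched as follows. The function names and data shapes are assumptions: a detector model flags incorrect transcript spans, and a corrector model regenerates only those spans from the matching audio.

```python
def retranscribe(audio_spans, first_pass, detector, corrector):
    """Replace transcript spans the detector flags, using the
    corrector's output on the corresponding audio span."""
    bad = [i for i, seg in enumerate(first_pass) if detector(seg)]
    extra = {i: corrector(audio_spans[i]) for i in bad}  # second-model output
    return [extra.get(i, seg) for i, seg in enumerate(first_pass)]

audio = ["a0", "a1", "a2"]                 # stand-ins for audio segments
first = ["hello", "***", "world"]          # *** = low-confidence span
out = retranscribe(audio, first, lambda s: s == "***", lambda a: "there")
print(out)  # ['hello', 'there', 'world']
```

Only the flagged span is regenerated; the rest of the first transcript carries over into the second transcript.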
-
Patent number: 11934614
Abstract: Disclosed are systems and methods for an anomaly detection framework that operates as an executable analysis tool for devices to operate in order to determine whether the device contains an unresponsive touch screen (e.g., defective or malfunctioning touch screen). The disclosed framework can analyze the capacitance capabilities of the touch screen, inclusive of the touch layers associated with the touch screen panel, and determine when a device's touch screen is unresponsive to user provided input, which can be any type of touch or gesture provided on a touch screen.
Type: Grant
Filed: October 21, 2022
Date of Patent: March 19, 2024
Assignee: Verizon Patent and Licensing Inc.
Inventors: Prakash Ranganathan, Saurabh Tahiliani
-
Publication number: 20240054758
Abstract: Techniques for identifying and tracking objects in digital content are disclosed. In one embodiment, a method is disclosed comprising obtaining a frame of digital content, the frame comprising pixel data, detecting an object using the pixel data, determining a set of attributes for the detected object, the set of attributes comprising position, object segment and affine attributes, determining a similarity measurement for the detected object and a second object using the set of attributes corresponding to the detected object and the second object's set of attributes, and using the similarity measurement to make a similarity determination whether or not the detected object and the second object are a same object.
Type: Application
Filed: August 11, 2022
Publication date: February 15, 2024
Applicant: Verizon Patent and Licensing Inc.
Inventors: Prakash Ranganathan, Saurabh Tahiliani
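An attribute-based similarity measurement of this kind can be sketched with a weighted distance. The weights, the distance-to-similarity mapping, and the threshold below are all illustrative assumptions, not values from the patent; only the attribute categories (position, object segment, affine attributes) come from the abstract.

```python
import math

def similarity(a, b, weights=(0.5, 0.3, 0.2)):
    """Map weighted attribute distances into a 0..1 similarity score."""
    pos = math.dist(a["pos"], b["pos"])        # position distance
    seg = abs(a["segment"] - b["segment"])     # object-segment difference
    aff = abs(a["affine"] - b["affine"])       # affine-attribute difference
    return 1.0 / (1.0 + weights[0] * pos + weights[1] * seg + weights[2] * aff)

det = {"pos": (10.0, 12.0), "segment": 3, "affine": 0.9}
cand = {"pos": (10.5, 12.2), "segment": 3, "affine": 0.95}
same = similarity(det, cand) > 0.6   # threshold is an assumption
print(same)  # True
```

Objects scoring above the threshold would be treated as the same object across frames, which is the basis for tracking.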
-
Publication number: 20240037824
Abstract: Techniques for generating emotionally-aware digital content are disclosed. In one embodiment, a method is disclosed comprising obtaining audio input, obtaining a textual representation of the audio input; using the textual representation of the audio input to identify an emotion corresponding to the audio input; generating an emotionally-aware facial representation in accordance with the textual representation and the identified emotion; using the emotionally-aware facial representation to generate one or more images comprising at least one facial expression corresponding to the identified emotion; and providing digital content comprising the one or more images.
Type: Application
Filed: July 26, 2022
Publication date: February 1, 2024
Applicant: Verizon Patent and Licensing Inc.
Inventors: Subham Biswas, Saurabh Tahiliani
-
Patent number: 11889168
Abstract: A video summary device may generate a textual summary of a transcription of a virtual event. The video summary device may generate a phonemic transcription of the textual summary and generate a text embedding based on the phonemic transcription. The video summary device may generate an audio embedding based on a target voice. The video summary device may generate an audio output of the phonemic transcription uttered by the target voice. The audio output may be generated based on the text embedding and the audio embedding. The video summary device may generate an image embedding based on video data of a target user. The image embedding may include information regarding images of facial movements of the target user. The video summary device may generate a video output of different facial movements of the target user uttering the phonemic transcription, based on the text embedding and the image embedding.
Type: Grant
Filed: July 11, 2022
Date of Patent: January 30, 2024
Assignee: Verizon Patent and Licensing Inc.
Inventors: Subham Biswas, Saurabh Tahiliani
-
Publication number: 20240015371
Abstract: A video summary device may generate a textual summary of a transcription of a virtual event. The video summary device may generate a phonemic transcription of the textual summary and generate a text embedding based on the phonemic transcription. The video summary device may generate an audio embedding based on a target voice. The video summary device may generate an audio output of the phonemic transcription uttered by the target voice. The audio output may be generated based on the text embedding and the audio embedding. The video summary device may generate an image embedding based on video data of a target user. The image embedding may include information regarding images of facial movements of the target user. The video summary device may generate a video output of different facial movements of the target user uttering the phonemic transcription, based on the text embedding and the image embedding.
Type: Application
Filed: July 11, 2022
Publication date: January 11, 2024
Applicant: Verizon Patent and Licensing Inc.
Inventors: Subham Biswas, Saurabh Tahiliani