Patents by Inventor Renjie Tao
Renjie Tao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250140246
Abstract: One example method includes receiving, during a virtual conference from a client device, a request to generate a real-time summary of the virtual conference, a plurality of client devices participating in the virtual conference and exchanging audio and video streams; receiving, during the virtual conference, a plurality of utterances generated by automatic speech recognition (“ASR”) of the audio streams; generating a group of consecutive utterances from the plurality of utterances; determining a segment based on the group of consecutive utterances; generating, using a large language model (“LLM”), a segment summary based on the segment; and providing the segment summary to the client device.
Type: Application
Filed: October 31, 2023
Publication date: May 1, 2025
Applicant: Zoom Video Communications, Inc.
Inventors: Bilung Lee, Renjie Tao, Yun Zhang
-
Publication number: 20250140244
Abstract: One example method includes joining, by a client application executed by a client device, a virtual conference hosted by a virtual conference provider, a plurality of participants attending the virtual conference; receiving, by the client application, a question associated with the virtual conference; generating a query context based on a real-time transcript of the virtual conference; providing the query context and the question to a trained large language model (“LLM”); and receiving a response from the LLM based on the question and the query context.
Type: Application
Filed: October 26, 2023
Publication date: May 1, 2025
Applicant: Zoom Video Communications, Inc.
Inventors: Bilung Lee, Renjie Tao, Yun Zhang
-
Publication number: 20250140245
Abstract: One example method includes transmitting, by a client device, a request for a reduced large language model (“LLM”) to a remote server; receiving, by the client device from the remote server, and storing the reduced LLM, the reduced LLM based on a trained general LLM; receiving, by the client device, a request to generate content using the reduced LLM; providing the request to the reduced LLM; and receiving generated content from the reduced LLM based on the request.
Type: Application
Filed: October 30, 2023
Publication date: May 1, 2025
Applicant: Zoom Video Communications, Inc.
Inventors: Bilung Lee, Vijay Venkataswamy Parthasarathy, Renjie Tao, Zheng Yuan, Bing Zhao
-
Publication number: 20250104570
Abstract: Example methods and systems for automatic generation of interaction tools. A communication platform receives a request to generate an interaction tool associated with a virtual communication session and accesses virtual communication data associated with the virtual communication session. The communication platform identifies a set of keypoint data from the virtual communication data based on the request using a machine learning model. The communication platform generates a list of questions based on the set of keypoint data and the request using a first generative artificial intelligence (AI) model. The communication platform provides the interaction tool based on the list of questions.
Type: Application
Filed: September 26, 2023
Publication date: March 27, 2025
Inventors: Bilung LEE, Vijay Venkataswamy Parthasarathy, Renjie Tao, Matthieu Tardivel, Yun Zhang
-
Publication number: 20250103821
Abstract: Example methods and systems for facilitating queries about a virtual communication session are provided. A communication platform receives an initial query about the virtual communication session from a user. The communication platform accesses virtual communication data associated with a virtual communication session. The communication platform generates an initial response to the initial query based on the virtual communication data using a first pre-trained generative artificial intelligence (AI) model. The communication platform generates a first set of follow-up queries based on the initial response using a second pre-trained generative AI model. The communication platform receives a selection of a first follow-up query out of the first set of follow-up queries. The communication platform provides a first response to the first follow-up query using the first pre-trained generative AI model.
Type: Application
Filed: September 25, 2023
Publication date: March 27, 2025
Inventors: Wei Ji, Bilung Lee, Vijay Venkataswamy Parthasarathy, Renjie Tao
-
Publication number: 20250097385
Abstract: One method for multimodal meeting segmentation includes receiving a meeting recording and a transcript, the meeting recording comprising a first media stream; generating, using a trained machine learning (“ML”) model, a first set of embeddings corresponding to the first media stream and a second set of embeddings corresponding to the transcript; generating a first set of segments based on the first set of embeddings and a second set of segments based on the second set of embeddings; generating a final set of segments based on the first and second sets of segments; and associating the final set of segments with the meeting recording and storing the final set of segments.
Type: Application
Filed: July 19, 2024
Publication date: March 20, 2025
Applicant: Zoom Video Communications, Inc.
Inventors: Bilung Lee, Vijay Venkataswamy Parthasarathy, Renjie Tao, Chih-Kai Ting, Yun Zhang
-
Publication number: 20250061713
Abstract: Methods and systems provide for video-based and transcript-based segmentation of communication session content. The method may include obtaining a transcript associated with video content of a communication session and performing video-based segmentation on the video content to determine a category label from a list of category labels for each video frame of the video content. The video content may include topic segments of consecutive frames associated with a same category label. The method may further include performing transcript-based segmentation to divide one of the topic segments into a first topic segment associated with a first title and a second topic segment associated with a second title.
Type: Application
Filed: November 5, 2024
Publication date: February 20, 2025
Inventors: Andrew Miller-Smith, Renjie Tao, Ling Tsou
-
Patent number: 12198433
Abstract: Methods and systems provide for search results within segmented communication session content. In one embodiment, the system receives a transcript and video content of a communication session between participants, the transcript including timestamps for a number of utterances associated with speaking participants; processes the video content to extract textual content visible within the frames of the video content; segments frames of the video content into a number of contiguous topic segments; determines a title for each topic segment; assigns a category label for each topic segment; receives a request from a user to search for specified text within the video content; determines one or more titles or category labels for which a prediction of relatedness with the specified text is present; and presents content from at least one topic segment associated with the one or more titles or category labels for which a prediction of relatedness is present.
Type: Grant
Filed: January 31, 2023
Date of Patent: January 14, 2025
Assignee: Zoom Video Communications, Inc.
Inventors: Andrew Miller-Smith, Renjie Tao, Ling Tsou
-
Publication number: 20240414017
Abstract: Some examples involve an artificial intelligence (AI) system for handling a query about a conversation, such as a conversation between attendees of a videoconferencing meeting. As one example, the system can receive a query from a user about a conversation between attendees of a videoconferencing meeting. The system can determine a relevant portion of the conversation based on the query, determine an intent of the query by providing the query as input to an intent detection model, and select a machine-learning model from among a group of machine-learning models based on the intent of the query. The system can then provide the relevant portion of the conversation as input to the selected machine-learning model. The machine-learning model can generate an output based on the relevant portion of the conversation. The system can transmit the output to the user in a response to the query.
Type: Application
Filed: June 6, 2023
Publication date: December 12, 2024
Inventors: Bilung LEE, Vijay Venkataswamy Parthasarathy, Renjie Tao, Sasank Vemuri
-
Publication number: 20240330792
Abstract: Systems and methods for recommending communication channels and generating content for next-step communication are disclosed. A communication analytics platform accesses project metadata and communication data associated with a project. The communication analytics platform determines a recommendation of one or more communication channels for next-step communication for the project based on the project metadata and the communication data. The communication analytics platform generates content for the next-step communication using a generative artificial intelligence (AI) model based on the project metadata and the communication data. The communication analytics platform provides the recommendation of one or more communication channels and the generated content to a user associated with the project.
Type: Application
Filed: July 31, 2023
Publication date: October 3, 2024
Inventors: Bilung Lee, Vijay Venkataswamy Parthasarathy, Renjie Tao, Bing Zhao
-
Publication number: 20240037941
Abstract: Methods and systems provide for search results within segmented communication session content. In one embodiment, the system receives a transcript and video content of a communication session between participants, the transcript including timestamps for a number of utterances associated with speaking participants; processes the video content to extract textual content visible within the frames of the video content; segments frames of the video content into a number of contiguous topic segments; determines a title for each topic segment; assigns a category label for each topic segment; receives a request from a user to search for specified text within the video content; determines one or more titles or category labels for which a prediction of relatedness with the specified text is present; and presents content from at least one topic segment associated with the one or more titles or category labels for which a prediction of relatedness is present.
Type: Application
Filed: January 31, 2023
Publication date: February 1, 2024
Inventors: Andrew Miller-Smith, Renjie Tao, Ling Tsou
-
Publication number: 20230394861
Abstract: Methods and systems provide for providing extraction of textual content from video of a communication session. In one embodiment, the system receives video content of a communication session which includes a number of participants. The system then extracts frames from the video content, and classifies the frames of the video content. The system identifies one or more distinguishing frames containing text. For each distinguishing frame containing text, the system detects a title within the frame, crops a title area with the title within the frame, and extracts, via optical character recognition (“OCR”), the title from the cropped title area of the frame. The system extracts, via OCR, textual content from the distinguishing frames containing text, and then transmits the extracted textual content and extracted titles to one or more client devices.
Type: Application
Filed: June 4, 2022
Publication date: December 7, 2023
Inventors: Renjie Tao, Ling Tsou
-
Publication number: 20230394860
Abstract: Methods and systems provide for video-based search results within a communication session. In one embodiment, the system receives video content of a communication session with a number of participants; extracts, via optical character recognition (“OCR”), textual content from the frames of the video content, each piece of textual content including a timestamp representing a temporal location of the frame within the video content; receives, from a client device associated with a user, a request to search for specified text within the video content; in response to receiving the request, determines one or more matching pieces of textual content which match to the specified text; and presents, to the client device, the matching pieces of textual content.
Type: Application
Filed: June 4, 2022
Publication date: December 7, 2023
Inventors: Renjie Tao, Ling Tsou
-
Publication number: 20230394858
Abstract: Methods and systems provide for resolution-based extraction of textual content. In one embodiment, the system receives video content of a communication session with participants. The system then extracts high-resolution versions and low-resolution versions of frames from the video content, and classifies the low-resolution frames of the video content based on identifying text within the low-resolution frames. The system identifies one or more low-resolution distinguishing frames containing text. For each low-resolution distinguishing frame containing text, the system detects a title within the frame, crops a title area with the title within the frame, and extracts, via optical character recognition (“OCR”), the title from the cropped title area of the high-resolution version of the frame.
Type: Application
Filed: June 4, 2022
Publication date: December 7, 2023
Inventor: Renjie Tao
-
Publication number: 20230394851
Abstract: Methods and systems provide for providing video frame type classification in a communication session. In one embodiment, the system receives video content of a communication session with a number of participants; extracts frames from the video content; classifies the frames of the video content based on image analysis; and transmits, to one or more client devices, the classification of the frames of the video content.
Type: Application
Filed: June 4, 2022
Publication date: December 7, 2023
Inventors: Renjie Tao, Ling Tsou
-
Publication number: 20230394854
Abstract: Methods and systems provide for providing video-based chapter generation for a communication session. In one embodiment, the system receives a transcript and video content of a communication session between participants, the transcript including timestamps for a number of utterances associated with speaking participants; processes the video content to extract one or more pieces of textual content visible within the frames of the video content; segments frames of the video content into a number of contiguous topic segments; determines a title for each topic segment from one or more of: the transcript, and the extracted textual content; assigns a category label for each topic segment from a prespecified list of category labels; and transmits, to one or more client devices, the list of topic segments with determined title and assigned category label for each of the merged topic segments.
Type: Application
Filed: June 4, 2022
Publication date: December 7, 2023
Inventors: Ravi Teja Polavaram, Renjie Tao, Ling Tsou, Tong Wang, Yun Zhang
-
Publication number: 20230394827
Abstract: Methods and systems provide title detection for presented slides. In one embodiment, the system receives video content of a communication session with a number of participants; extracts frames from the video content; classifies the frames of the video content; identifies one or more distinguishing frames containing a presentation slide; for each distinguishing frame containing a presentation slide, detects a title within the frame; and transmits, to one or more client devices, the titles for each of the distinguishing frames comprising a presentation slide.
Type: Application
Filed: June 4, 2022
Publication date: December 7, 2023
Inventors: Renjie Tao, Ling Tsou
-
Patent number: 11580737
Abstract: Methods and systems provide for search results within segmented communication session content. In one embodiment, the system receives a transcript and video content of a communication session between participants, the transcript including timestamps for a number of utterances associated with speaking participants; processes the video content to extract textual content visible within the frames of the video content; segments frames of the video content into a number of contiguous topic segments; determines a title for each topic segment; assigns a category label for each topic segment; receives a request from a user to search for specified text within the video content; determines one or more titles or category labels for which a prediction of relatedness with the specified text is present; and presents content from at least one topic segment associated with the one or more titles or category labels for which a prediction of relatedness is present.
Type: Grant
Filed: July 31, 2022
Date of Patent: February 14, 2023
Assignee: Zoom Video Communications, Inc.
Inventors: Andrew Miller-Smith, Renjie Tao, Ling Tsou