Patents by Inventor Jian David Wang

Jian David Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240185449
    Abstract: A video conference call system is provided with a camera to generate an input frame image of a conference room, where the video conference call system detects a human head for each meeting participant captured in the input frame image by applying a machine learning human head detector model to the input frame image, generates a head bounding box which surrounds each detected human head and identifies a corresponding meeting participant, extracts a pixel width measure and pixel height measure from each head bounding box, and applies the extracted pixel width measure and pixel height measure to one or more reverse lookup tables to extract meeting room coordinates for each meeting participant identified by a corresponding head bounding box.
    Type: Application
    Filed: October 22, 2022
    Publication date: June 6, 2024
    Inventors: Rajen Bhatt, Jian David Wang
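
    The reverse-lookup idea in this abstract can be sketched in a few lines: a detected head's pixel width and height are quantized into buckets and looked up in a precomputed table that maps them to room coordinates. This is a minimal illustration, not the patented method; the table values, bucket step, and function names are all invented, and a real system would build the table from camera calibration.

    ```python
    def bucket(value, step=10):
        """Quantize a pixel measure into a lookup-table bucket."""
        return (value // step) * step

    # Invented calibration data: (width_bucket, height_bucket) -> (x, y) in meters.
    REVERSE_LOOKUP = {
        (40, 50): (1.0, 2.0),   # large head -> participant close to the camera
        (20, 30): (3.5, 4.0),   # small head -> participant far from the camera
    }

    def head_to_room_coords(bbox_width, bbox_height, table=REVERSE_LOOKUP):
        """Return room coordinates for a head bounding box, or None when
        the pixel measures fall outside the calibrated table."""
        key = (bucket(bbox_width), bucket(bbox_height))
        return table.get(key)

    print(head_to_room_coords(42, 55))  # lands in the (40, 50) bucket
    ```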
  • Patent number: 11985417
    Abstract: Described are multiple cameras in a conference room, each pointed in a different direction. A primary camera includes a microphone array to perform sound source localization (SSL). The SSL is used in combination with a video image to identify the speaker from among multiple individuals that appear in the video image. Pose information of the speaker is developed. Pose information of each individual identified in each other camera is developed. The speaker pose information is compared to the pose information of the individuals from the other cameras. The best match for each other camera is selected as the speaker in that camera. The speaker views of each camera are compared to determine the speaker view with the most frontal view of the speaker. That camera is selected to provide the video for provision to the far end.
    Type: Grant
    Filed: June 16, 2022
    Date of Patent: May 14, 2024
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Jian David Wang, Xiangdong Wang, Varun Ajay Kulkarni
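
    The matching step this abstract describes can be sketched as follows: the speaker's pose from the primary camera is compared against the pose of every individual seen by each other camera, the closest match per camera is taken as that camera's speaker, and the camera whose matched view is most frontal is selected. This is a hypothetical sketch; the pose vectors, yaw values, and camera names are invented, and the distance metric stands in for whatever feature comparison a real system uses.

    ```python
    import math

    def pose_distance(a, b):
        """Euclidean distance between two pose feature vectors."""
        return math.dist(a, b)

    def match_speaker(speaker_pose, candidates):
        """Pick the candidate whose pose best matches the speaker's pose."""
        return min(candidates, key=lambda c: pose_distance(speaker_pose, c["pose"]))

    def select_camera(speaker_pose, cameras):
        """For each camera, match the speaker among its individuals, then
        choose the camera with the most frontal matched view (smallest yaw)."""
        matches = {cam: match_speaker(speaker_pose, people)
                   for cam, people in cameras.items()}
        return min(matches, key=lambda cam: abs(matches[cam]["yaw"]))

    cameras = {
        "cam_left":  [{"pose": (0.9, 1.1), "yaw": 40.0},
                      {"pose": (3.0, 3.0), "yaw": 5.0}],
        "cam_right": [{"pose": (1.0, 1.0), "yaw": 10.0}],
    }
    print(select_camera((1.0, 1.05), cameras))
    ```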
  • Patent number: 11606510
    Abstract: Multiple cameras in a conference room, each pointed in a different direction and including a microphone array to perform sound source localization (SSL). The SSL is used in combination with the video image to identify the speaker from among multiple individuals that appear in the video image. Neural network or machine learning processing is performed on the identified speaker to determine the quality of the front or facial view of the speaker. The best view of the speaker's face from the various cameras is selected to be provided to the far end. If no view is satisfactory, a default view is selected and that is provided to the far end. The use of the SSL allows selection of the proper individual from a group of individuals in the conference room, so that only the speaker's head is analyzed for the best facial view and then framed for transmission.
    Type: Grant
    Filed: June 9, 2021
    Date of Patent: March 14, 2023
    Assignee: PLANTRONICS, INC.
    Inventors: Jian David Wang, John Paul Spearman, Varun Ajay Kulkarni, Yong Yan, Xiangdong Wang, Peter L. Chu, David A. Bryan
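
    The selection logic in this abstract reduces to: score each camera's view of the localized speaker for frontal quality (in practice via a neural network), take the best view if it clears a threshold, and otherwise fall back to a default view for the far end. A minimal sketch, with invented scores, names, and threshold:

    ```python
    DEFAULT_VIEW = "room_wide"

    def pick_view(scores, threshold=0.5):
        """scores: {camera_name: frontal-quality score in [0, 1]}.
        Return the camera with the best facial view, or the default
        view when no camera's view is satisfactory."""
        if not scores:
            return DEFAULT_VIEW
        best = max(scores, key=scores.get)
        return best if scores[best] >= threshold else DEFAULT_VIEW

    print(pick_view({"cam_a": 0.3, "cam_b": 0.8}))
    print(pick_view({"cam_a": 0.2, "cam_b": 0.1}))
    ```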
  • Publication number: 20230060798
    Abstract: The attention level of participants is measured and then the resulting value is provided on a display of the participants. The participants are presented in a gallery view layout. The frame of each participant is colored to indicate the attention level. The entire window is tinted in colors representing the attention level. The blurriness of the participant indicates attention level. The saturation of the participant indicates attention level. The window sizes vary based on attention level. Color bars are added to provide indications of percentages of attention level over differing time periods. Neural networks are used to find the faces of the participants and then develop facial keypoint values which are used to determine gaze direction, which in turn is used to develop an attention score. The attention score is then used to determine the settings of the layout.
    Type: Application
    Filed: July 22, 2022
    Publication date: March 2, 2023
    Inventors: Jian David Wang, Rajen Bhatt, Kui Zhang, Thomas Joseph Puorro, David A. Bryan
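
    The pipeline this abstract describes (facial keypoints → gaze direction → attention score → layout setting) can be sketched end to end. The linear gaze-to-score mapping and the color thresholds below are invented for illustration only:

    ```python
    def attention_score(gaze_yaw_deg, max_yaw=60.0):
        """Map gaze deviation from screen center (degrees) to a 0..1 score:
        looking straight at the screen -> 1.0, looking away -> 0.0."""
        deviation = min(abs(gaze_yaw_deg), max_yaw)
        return 1.0 - deviation / max_yaw

    def frame_color(score):
        """Color the participant's frame by attention level."""
        if score >= 0.7:
            return "green"
        if score >= 0.4:
            return "yellow"
        return "red"

    print(frame_color(attention_score(5.0)))   # nearly straight-on gaze
    print(frame_color(attention_score(55.0)))  # looking well away
    ```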
  • Publication number: 20220408029
    Abstract: Multiple cameras in a conference room, each pointed in a different direction. At least a primary camera includes a microphone array to perform sound source localization (SSL). The SSL is used in combination with a video image to identify the speaker from among multiple individuals that appear in the video image. Neural network or machine learning processing is performed on the primary camera video of the identified speaker to determine the facial pose of the speaker. The locations of the other cameras with respect to the primary camera have been determined. Using those locations and the facial pose, the camera with the best frontal view of the speaker is determined. That camera is set as the designated camera to provide video for transmission to the far end.
    Type: Application
    Filed: June 14, 2022
    Publication date: December 22, 2022
    Inventors: Jian David Wang, John Paul Spearman
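
    The geometric step in this abstract can be sketched directly: given the direction the speaker's face points and the known bearing of each camera from the speaker, the most frontal camera is the one with the smallest angle between the face direction and the direction toward the camera. The bearings and camera names below are invented:

    ```python
    def angular_diff(a_deg, b_deg):
        """Smallest absolute difference between two bearings, in degrees."""
        d = abs(a_deg - b_deg) % 360.0
        return min(d, 360.0 - d)

    def most_frontal_camera(face_bearing_deg, camera_bearings):
        """camera_bearings: {name: bearing of camera from the speaker, degrees}.
        Return the camera most nearly in front of the speaker's face."""
        return min(camera_bearings,
                   key=lambda cam: angular_diff(face_bearing_deg,
                                                camera_bearings[cam]))

    cams = {"primary": 180.0, "side_a": 95.0, "side_b": 10.0}
    print(most_frontal_camera(100.0, cams))
    ```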
  • Publication number: 20220408015
    Abstract: Described are multiple cameras in a conference room, each pointed in a different direction. A primary camera includes a microphone array to perform sound source localization (SSL). The SSL is used in combination with a video image to identify the speaker from among multiple individuals that appear in the video image. Pose information of the speaker is developed. Pose information of each individual identified in each other camera is developed. The speaker pose information is compared to the pose information of the individuals from the other cameras. The best match for each other camera is selected as the speaker in that camera. The speaker views of each camera are compared to determine the speaker view with the most frontal view of the speaker. That camera is selected to provide the video for provision to the far end.
    Type: Application
    Filed: June 16, 2022
    Publication date: December 22, 2022
    Inventors: Jian David Wang, Xiangdong Wang, Varun Ajay Kulkarni
  • Publication number: 20220400216
    Abstract: Multiple cameras in a conference room, each pointed in a different direction and including a microphone array to perform sound source localization (SSL). The SSL is used in combination with the video image to identify the speaker from among multiple individuals that appear in the video image. Neural network or machine learning processing is performed on the identified speaker to determine the quality of the front or facial view of the speaker. The best view of the speaker's face from the various cameras is selected to be provided to the far end. If no view is satisfactory, a default view is selected and that is provided to the far end. The use of the SSL allows selection of the proper individual from a group of individuals in the conference room, so that only the speaker's head is analyzed for the best facial view and then framed for transmission.
    Type: Application
    Filed: June 9, 2021
    Publication date: December 15, 2022
    Inventors: Jian David Wang, John Paul Spearman, Varun Ajay Kulkarni, Yong Yan, Xiangdong Wang, Peter L. Chu, David A. Bryan
  • Patent number: 11516433
    Abstract: Developing a region of interest (ROI) video frame that includes only ROIs of interest and not other elements, and providing the ROI video frames in a single video stream, simplifies the development of gallery view continuous presence displays. ROI position and size information metadata can be provided, or subpicture concepts of the particular codec can be used, to separate the ROIs in the ROI video frame. Metadata can provide perspective/distortion correction values, speaker status and any other information desired about the participant or other ROI, such as name. Only a single encoder and a single decoder are needed, simplifying both transmitting and receiving endpoints. Only a single video stream is needed, reducing bandwidth requirements. As each participant can be individually isolated, the participants can be provided in similar sizes and laid out as desired in a continuous presence display that is pleasing to view.
    Type: Grant
    Filed: August 27, 2021
    Date of Patent: November 29, 2022
    Assignee: PLANTRONICS, INC.
    Inventors: Yong Yan, Stephen C. Botzko, Jian David Wang
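
    The single-stream ROI idea can be sketched as a packing step: crop each region of interest out of the room frame, tile the crops into one composite frame, and carry per-ROI metadata (position in the composite, name, speaker status) alongside it. The field names and fixed tile layout below are invented; a real system would encode the composite once and signal the metadata in-band:

    ```python
    def pack_rois(rois, tile_w=320, tile_h=240):
        """rois: list of dicts with 'name' and 'is_speaker'.
        Return (composite_size, metadata), where metadata records each
        ROI's slot in the single composite frame."""
        metadata = []
        for i, roi in enumerate(rois):
            metadata.append({
                "name": roi["name"],
                "is_speaker": roi["is_speaker"],
                "x": i * tile_w,   # ROIs laid out left to right
                "y": 0,
                "w": tile_w,
                "h": tile_h,
            })
        composite_size = (len(rois) * tile_w, tile_h)
        return composite_size, metadata

    size, meta = pack_rois([{"name": "Ann", "is_speaker": True},
                            {"name": "Bo", "is_speaker": False}])
    print(size, meta[1]["x"])
    ```

    The receiver can then cut each participant back out of the one decoded frame using the metadata, which is what lets a gallery layout resize and rearrange them freely.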
  • Patent number: 11265583
    Abstract: Utilizing two LTR frames for improved error recovery. By using two LTR frames, much better performance is achieved in terms of error recovery as the likelihood of the decoder having one of the two LTR frames is very high. When the decoder determines a frame is lost, the decoder provides a fast update request (FUR). The FUR includes a listing of the LTR frames present at the decoder. With this indication of the LTR frames present at the decoder, the encoder utilizes one of the LTR frames, preferably the most recent, as a reference to send the next frame as a P frame. The P frame is sent with an indication of the LTR frame used as reference. The use of two LTR frames and the feedback of LTR frames present at the decoder allows the minimization of the use of I frames for error recovery.
    Type: Grant
    Filed: January 6, 2020
    Date of Patent: March 1, 2022
    Assignee: PLANTRONICS, INC.
    Inventors: Jian David Wang, John Paul Spearman, Stephen C. Botzko
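
    The recovery exchange this abstract describes can be sketched as two small functions: the decoder reports the long-term reference (LTR) frames it holds in its fast update request (FUR), and the encoder answers with a P frame referencing the most recent LTR they share, falling back to an I frame only when no LTR is common. The message shapes and frame ids below are invented simplifications:

    ```python
    def make_fur(decoder_ltrs):
        """Decoder side: on a detected frame loss, report which LTR
        frame ids are still available at the decoder."""
        return {"type": "FUR", "ltrs": sorted(decoder_ltrs)}

    def handle_fur(fur, encoder_ltrs):
        """Encoder side: reference the most recent LTR the decoder holds;
        fall back to an I frame only when there is no common LTR."""
        common = set(fur["ltrs"]) & set(encoder_ltrs)
        if common:
            ref = max(common)                 # prefer the most recent LTR
            return {"type": "P", "ref_ltr": ref}
        return {"type": "I"}                  # last resort: full refresh

    fur = make_fur({100, 160})
    print(handle_fur(fur, {160, 220}))
    print(handle_fur(make_fur({40}), {160, 220}))
    ```

    Keeping two LTRs rather than one is what makes the common-LTR case the overwhelmingly likely one, which is why the abstract emphasizes minimizing I frames.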
  • Publication number: 20210211742
    Abstract: Utilizing two LTR frames for improved error recovery. By using two LTR frames, much better performance is achieved in terms of error recovery as the likelihood of the decoder having one of the two LTR frames is very high. When the decoder determines a frame is lost, the decoder provides a fast update request (FUR). The FUR includes a listing of the LTR frames present at the decoder. With this indication of the LTR frames present at the decoder, the encoder utilizes one of the LTR frames, preferably the most recent, as a reference to send the next frame as a P frame. The P frame is sent with an indication of the LTR frame used as reference. The use of two LTR frames and the feedback of LTR frames present at the decoder allows the minimization of the use of I frames for error recovery.
    Type: Application
    Filed: January 6, 2020
    Publication date: July 8, 2021
    Inventors: Jian David Wang, John Paul Spearman, Stephen C. Botzko