Patents by Inventor Hsi-Jung Wu

Hsi-Jung Wu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11956295
    Abstract: Techniques for multi-view video streaming are described in the present disclosure, wherein viewport prediction may be employed at the client end based on analysis of pre-fetched media item data and ancillary information. A streaming method may first prefetch a portion of content of a multi-view media item. The method may next identify a salient region from the prefetched content and may then download additional content of the media item that corresponds to the identified salient region.
    Type: Grant
    Filed: March 20, 2020
    Date of Patent: April 9, 2024
    Assignee: APPLE INC.
    Inventors: Fanyi Duanmu, Alexandros Tourapis, Jun Xin, Hsi-Jung Wu, Xiaosong Zhou
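The abstract above outlines a prefetch-then-refine flow. The Python sketch below illustrates that flow under simplifying assumptions: saliency is approximated by luma variance, the tile grid and the request step are hypothetical stand-ins, and the claimed method is not reproduced.

```python
# Hypothetical sketch: prefetch -> saliency -> targeted download.
import numpy as np

def tile_saliency(tile: np.ndarray) -> float:
    # Stand-in saliency measure: luma variance of the prefetched tile.
    return float(tile.astype(np.float32).var())

def select_salient_tile(prefetched: np.ndarray, grid=(4, 4)):
    """Split a prefetched low-detail frame into a grid of tiles and
    return the (row, col) index of the most salient tile."""
    h, w = prefetched.shape[:2]
    th, tw = h // grid[0], w // grid[1]
    scores = {}
    for r in range(grid[0]):
        for c in range(grid[1]):
            tile = prefetched[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
            scores[(r, c)] = tile_saliency(tile)
    return max(scores, key=scores.get)

# Usage: prefetch a coarse frame, find the salient tile, then issue a
# request for the corresponding higher-detail content (the request path
# is a placeholder).
prefetched_frame = np.random.randint(0, 255, (360, 640), dtype=np.uint8)
row, col = select_salient_tile(prefetched_frame)
print(f"download additional content for tile ({row}, {col})")
```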
  • Patent number: 11924391
    Abstract: A system obtains a data set representing immersive video content for display at a display time, including first data representing the content according to a first level of detail, and second data representing the content according to a second higher level of detail. During one or more first times prior to the display time, the system causes at least a portion of the first data to be stored in a video buffer. During one or more second times prior to the display time, the system generates a prediction of a viewport for displaying the content to a user at the display time, identifies a portion of the second data corresponding to the prediction of the viewport, and causes the identified portion of the second data to be stored in the video buffer. At the display time, the system causes the content to be displayed to the user using the video buffer.
    Type: Grant
    Filed: December 16, 2022
    Date of Patent: March 5, 2024
    Assignee: Apple Inc.
    Inventors: Fanyi Duanmu, Jun Xin, Hsi-Jung Wu, Xiaosong Zhou
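A minimal sketch of the two-level-of-detail buffering described above, under stated assumptions: the tile identifiers, the linear-motion viewport predictor, and the buffer layout are illustrative, not the patented design.

```python
# Hypothetical two-tier buffer: coarse data everywhere, fine data only where
# the viewport is predicted to land.
from dataclasses import dataclass, field

@dataclass
class ViewportBuffer:
    low_detail: dict = field(default_factory=dict)    # tile_id -> coarse data
    high_detail: dict = field(default_factory=dict)   # tile_id -> fine data

    def prefetch_low(self, tiles):
        for t in tiles:
            self.low_detail[t] = f"coarse:{t}"

    def prefetch_high(self, predicted_tiles):
        for t in predicted_tiles:
            self.high_detail[t] = f"fine:{t}"

    def frame_for_display(self, visible_tiles):
        # Prefer the high-detail copy where the prediction was right,
        # fall back to the coarse copy elsewhere.
        return {t: self.high_detail.get(t, self.low_detail.get(t))
                for t in visible_tiles}

def predict_viewport(center, velocity, dt):
    # Simple linear extrapolation of the viewer's gaze center.
    return (center[0] + velocity[0] * dt, center[1] + velocity[1] * dt)

buf = ViewportBuffer()
buf.prefetch_low(range(16))                  # first times: coarse everywhere
cx, cy = predict_viewport((5.0, 5.0), (1.0, 0.0), dt=0.5)
buf.prefetch_high([int(cx) % 16])            # second times: fine near prediction
print(buf.frame_for_display([4, 5, 6]))      # display time
```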
  • Patent number: 11924440
    Abstract: The present disclosure describes techniques for coding and decoding video in which a plurality of coding hypotheses are developed for an input pixel block of frame content. Each coding hypothesis may include generation of prediction data for the input pixel block according to a respective prediction search. The input pixel block may be coded with reference to a prediction block formed from prediction data derived according to the plurality of hypotheses. Data of the coded pixel block may be transmitted to a channel along with data identifying the number of hypotheses used during coding. At a decoder, an inverse process may be performed, which may include generation of a counterpart prediction block from prediction data derived according to the hypotheses identified with the coded pixel block data, then decoding of the coded pixel block according to the prediction data.
    Type: Grant
    Filed: August 24, 2022
    Date of Patent: March 5, 2024
    Assignee: APPLE INC.
    Inventors: Alexandros Michael Tourapis, Yeping Su, David Singer, Hsi-Jung Wu
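The multi-hypothesis idea above can be illustrated with a short sketch. Assumptions: the hypotheses are combined by an equal-weight average, the residual is returned uncoded, and the toy predictors are arbitrary; none of this is taken from the patent claims.

```python
# Minimal sketch: combine several prediction hypotheses into one prediction
# block, code the residual, and run the inverse process at the decoder.
import numpy as np

def combine_hypotheses(prediction_blocks):
    """Form the final prediction block from several hypotheses (plain average)."""
    stacked = np.stack([p.astype(np.int32) for p in prediction_blocks])
    return np.round(stacked.mean(axis=0)).astype(np.int16)

def code_block(input_block, prediction_blocks):
    pred = combine_hypotheses(prediction_blocks)
    residual = input_block.astype(np.int16) - pred
    # A real encoder would transform/quantize/entropy-code the residual and
    # signal how many hypotheses were used; here we just return both.
    return residual, len(prediction_blocks)

def decode_block(residual, prediction_blocks):
    # Inverse process: rebuild the same prediction and add the residual.
    pred = combine_hypotheses(prediction_blocks)
    return np.clip(pred + residual, 0, 255).astype(np.uint8)

block = np.random.randint(0, 255, (8, 8), dtype=np.uint8)
hyps = [np.roll(block, 1, axis=1), np.roll(block, -1, axis=0)]  # toy predictors
res, n_hyps = code_block(block, hyps)
assert np.array_equal(decode_block(res, hyps), block)
```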
  • Publication number: 20240048776
    Abstract: Disclosed is a method that includes receiving an image frame having a plurality of coded blocks, determining a prediction unit (PU) from the plurality of coded blocks, determining one or more motion compensation units arranged in an array within the PU, and applying a filter to one or more boundaries of the one or more motion compensation units. Also disclosed is a method that includes receiving a reference frame that includes a reference block, determining a timing for deblocking a current block, performing motion compensation on the reference frame to obtain a predicted frame that includes a predicted block, performing reconstruction on the predicted frame to obtain a reconstructed frame that includes a reconstructed PU, and applying, at the timing for deblocking the current block, a deblocking filter based on one or more parameters to the reference block, the predicted block, or the reconstructed PU.
    Type: Application
    Filed: September 29, 2022
    Publication date: February 8, 2024
    Inventors: Yixin Du, Alexandros Tourapis, Alican Nalci, Guoxin Jin, Hilmi Enes Egilmez, Hsi-Jung Wu, Jun Xin, Yeqing Wu, Yunfei Zheng
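As a rough illustration of filtering the boundaries of motion compensation units inside a prediction unit, the sketch below applies a 2-tap smoothing across internal boundaries. The 4x4 unit size and filter taps are assumptions; a real deblocking filter is conditional on block parameters.

```python
# Illustrative boundary smoothing between motion compensation units in a PU.
import numpy as np

def deblock_mc_unit_boundaries(pu: np.ndarray, unit: int = 4) -> np.ndarray:
    out = pu.astype(np.float32).copy()
    h, w = pu.shape
    # Vertical boundaries between horizontally adjacent motion compensation units.
    for x in range(unit, w, unit):
        left, right = out[:, x - 1].copy(), out[:, x].copy()
        out[:, x - 1] = 0.75 * left + 0.25 * right
        out[:, x] = 0.25 * left + 0.75 * right
    # Horizontal boundaries between vertically adjacent units.
    for y in range(unit, h, unit):
        top, bottom = out[y - 1, :].copy(), out[y, :].copy()
        out[y - 1, :] = 0.75 * top + 0.25 * bottom
        out[y, :] = 0.25 * top + 0.75 * bottom
    return np.round(out).astype(pu.dtype)

pu = np.random.randint(0, 255, (16, 16), dtype=np.uint8)
filtered = deblock_mc_unit_boundaries(pu)
```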
  • Publication number: 20240040120
    Abstract: Video coders and decoders perform transform coding and decoding on blocks of video content according to an adaptively selected transform type. The transform types are organized into a hierarchy of transform sets, where each transform set includes a respective number of transforms and each higher-level transform set includes the transforms of each lower-level transform set within the hierarchy. The video coders and video decoders may exchange signaling that establishes a transform set context from which a transform set that was selected for coding given block(s) may be identified. The video coders and video decoders may also exchange signaling that establishes a transform decoding context from which the transform that was selected from the identified transform set for decoding the transform unit may be identified. The block(s) may be coded and decoded by the selected transform.
    Type: Application
    Filed: July 25, 2023
    Publication date: February 1, 2024
    Inventors: Hilmi Enes EGILMEZ, Yunfei ZHENG, Alican NALCI, Yeqing WU, Yixin DU, Guoxin JIN, Alexandros TOURAPIS, Jun XIN, Hsi-Jung WU
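The nesting property described above can be shown with a small sketch. The transform names and three-level hierarchy are assumptions chosen for illustration; the (set level, index) pairing is a simplified stand-in for the signaled contexts.

```python
# Sketch of a nested transform-set hierarchy: each higher level is a superset
# of the levels below it, and a (set_level, index) pair identifies the transform.
TRANSFORM_SETS = [
    ["DCT2"],                                   # level 0: smallest set
    ["DCT2", "DST7"],                           # level 1: adds DST7
    ["DCT2", "DST7", "ADST", "IDENTITY"],       # level 2: adds two more
]

def signal_transform(set_level: int, transform: str):
    """Encoder side: map the chosen transform to (set context, index)."""
    return set_level, TRANSFORM_SETS[set_level].index(transform)

def resolve_transform(set_level: int, index: int) -> str:
    """Decoder side: recover the transform from the signaled context."""
    return TRANSFORM_SETS[set_level][index]

level, idx = signal_transform(2, "ADST")
assert resolve_transform(level, idx) == "ADST"
# Nesting property: every transform in a lower-level set is in the higher sets.
assert set(TRANSFORM_SETS[0]) <= set(TRANSFORM_SETS[1]) <= set(TRANSFORM_SETS[2])
```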
  • Publication number: 20240040151
    Abstract: Techniques are described for express and implied signaling of transform mode selections in video coding. Information derived from coefficient samples in a given transform unit (TU) or prediction unit (PU) may constrain or modify signaling of certain syntax elements at the coding block (CB), TU, or PU levels. For instance, based on the spatial locations of decoded coefficients, the spatial patterns of coefficients, or the correlation with the coefficients in neighboring blocks, various syntax elements such as the transform type and related flags/indices or secondary transform modes/flags/indices, a residual coding mode, intra and inter prediction modes, and scanning order may be disabled or constrained. In another case, if the coefficient samples match a desired spatial pattern or have other desired properties, then a default transform type, a default secondary transform type, a default intra and inter prediction mode, or other block-level modes may be inferred at the decoder side.
    Type: Application
    Filed: May 4, 2023
    Publication date: February 1, 2024
    Inventors: Alican Nalci, Yunfei Zheng, Hilmi E. Egilmez, Yeqing WU, Yixin Du, Alexis Tourapis, Jun Xin, Hsi-Jung Wu
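Implied signaling of the kind described above can be sketched as a decode-side decision driven by coefficient positions. The "DC-only block implies a default transform" rule below is a simplified assumption standing in for the application's specific rules.

```python
# Hedged sketch: infer a transform type from decoded coefficient positions
# instead of reading an explicit syntax element.
import numpy as np

DEFAULT_TRANSFORM = "DCT2"

def parse_transform_type(coeffs: np.ndarray, read_syntax_element) -> str:
    nonzero = np.argwhere(coeffs != 0)
    if len(nonzero) == 0 or (len(nonzero) == 1 and tuple(nonzero[0]) == (0, 0)):
        # Coefficients match the constrained pattern: the explicit flag is
        # never coded, and the decoder infers the default transform.
        return DEFAULT_TRANSFORM
    # Otherwise the bitstream carries the transform-type syntax element.
    return read_syntax_element()

coeffs = np.zeros((4, 4), dtype=np.int32)
coeffs[0, 0] = 7                                   # DC-only block
assert parse_transform_type(coeffs, lambda: "DST7") == "DCT2"
coeffs[2, 1] = -3                                  # extra AC coefficient present
assert parse_transform_type(coeffs, lambda: "DST7") == "DST7"
```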
  • Publication number: 20240040124
    Abstract: A flexible coefficient coding (FCC) approach is presented. In the first aspect, spatial sub-regions are defined over a transform unit (TU) or a prediction unit (PU). These sub-regions organize the coefficient samples residing inside a TU or a PU into variable coefficient groups (VCGs). Each VCG corresponds to a sub-region inside a larger TU or PU. The shape of VCGs or the boundaries between different VCGs may be irregular, determined based on the relative distance of coefficient samples with respect to each other. Alternatively, the VCG regions may be defined according to scan ordering within a TU. Each VCG can encode 1) a different number of symbols for a given syntax element, or 2) a different number of syntax elements within the same TU or PU. Whether to code more symbols or more syntax elements may depend on the type of arithmetic coding engine used in a particular coding specification. For multi-symbol arithmetic coding (MS-AC), a VCG may encode a different number of symbols for a syntax element.
    Type: Application
    Filed: July 25, 2023
    Publication date: February 1, 2024
    Inventors: Alican NALCI, Yunfei ZHENG, Hilmi Enes EGILMEZ, Yeqing WU, Yixin DU, Alexandros TOURAPIS, Jun XIN, Hsi-Jung WU, Arash VOSOUGHI, Dzung T. HOANG
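The scan-order variant of the grouping described above can be sketched briefly. The zig-zag scan, the group boundaries, and the per-group symbol budgets are all assumptions chosen for the example, not the claimed partitioning.

```python
# Illustrative variable coefficient groups (VCGs): coefficients of a TU are
# partitioned by zig-zag scan position, and each group is allowed a different
# symbol alphabet size.
import numpy as np

def zigzag_positions(n: int):
    """Scan positions of an n x n block ordered by anti-diagonal."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda p: (p[0] + p[1], p[0]))

# (scan-range end, max symbols per coefficient) for each VCG.
VCGS = [(4, 16), (16, 8), (64, 4)]

def assign_vcgs(tu: np.ndarray):
    groups = {i: [] for i in range(len(VCGS))}
    for scan_idx, (r, c) in enumerate(zigzag_positions(tu.shape[0])):
        for g, (end, max_symbols) in enumerate(VCGS):
            if scan_idx < end:
                # Clamp the coded level to this group's symbol budget; a real
                # coder would escape-code the remainder.
                groups[g].append(min(abs(int(tu[r, c])), max_symbols - 1))
                break
    return groups

tu = np.random.randint(-20, 20, (8, 8))
print({g: len(v) for g, v in assign_vcgs(tu).items()})
```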
  • Patent number: 11882294
    Abstract: The present disclosure describes techniques for coding and decoding video in which a plurality of coding hypotheses are developed for an input pixel block of frame content. Each coding hypothesis may include generation of prediction data for the input pixel block according to a respective prediction search. The input pixel block may be coded with reference to a prediction block formed from prediction data derived according to the plurality of hypotheses. Data of the coded pixel block may be transmitted to a channel along with data identifying the number of hypotheses used during coding. At a decoder, an inverse process may be performed, which may include generation of a counterpart prediction block from prediction data derived according to the hypotheses identified with the coded pixel block data, then decoding of the coded pixel block according to the prediction data.
    Type: Grant
    Filed: August 24, 2022
    Date of Patent: January 23, 2024
    Assignee: APPLE INC.
    Inventors: Alexandros Michael Tourapis, Yeping Su, David Singer, Hsi-Jung Wu
  • Patent number: 11847823
    Abstract: Video object and keypoint location detection techniques are presented. The system includes a detection system for generating locations of an object's keypoints along with probabilities associated with the locations, and a stability system for stabilizing keypoint locations of the detected objects. In some aspects, the generated probabilities are two-dimensional arrays corresponding to locations within input images, and the stability system fits the generated probabilities to a two-dimensional probability distribution function.
    Type: Grant
    Filed: June 4, 2021
    Date of Patent: December 19, 2023
    Assignee: APPLE INC.
    Inventors: Xiaoxia Sun, Jiefu Zhai, Ke Zhang, Xiaosong Zhou, Hsi-Jung Wu
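The stabilization step described above can be sketched as fitting a detector's 2-D probability map to a distribution. Assumptions: the heatmap shape is arbitrary, and the fit is reduced to a weighted mean and covariance (a Gaussian's sufficient statistics) rather than the patent's specific method.

```python
# Sketch: stabilize a keypoint location by fitting the 2-D probability map.
import numpy as np

def fit_heatmap(prob: np.ndarray):
    """Return (mean_xy, covariance) of a normalized 2-D probability map."""
    p = prob / prob.sum()
    ys, xs = np.mgrid[0:prob.shape[0], 0:prob.shape[1]]
    mean = np.array([(xs * p).sum(), (ys * p).sum()])
    dx, dy = xs - mean[0], ys - mean[1]
    cov = np.array([[(dx * dx * p).sum(), (dx * dy * p).sum()],
                    [(dx * dy * p).sum(), (dy * dy * p).sum()]])
    return mean, cov

# Toy heatmap with a blurry peak; the fitted mean is the stabilized,
# sub-pixel keypoint location rather than the noisy argmax.
h = np.zeros((32, 32))
h[10:13, 20:23] = [[1, 2, 1], [2, 5, 2], [1, 2, 1]]
location, spread = fit_heatmap(h)
print("stabilized keypoint:", location)
```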
  • Publication number: 20230394081
    Abstract: A video classification, indexing, and retrieval system is disclosed that classifies and retrieves video along multiple indexing dimensions. A search system may field queries identifying desired parameters of video, search an indexed database for videos that match the query parameters, and create clips extracted from responsive videos that are provided in response. In this manner, different queries may cause different clips to be created from a single video, each clip tailored to the parameters of the query that is received.
    Type: Application
    Filed: June 1, 2023
    Publication date: December 7, 2023
    Inventors: Shujie LIU, Xiaosong ZHOU, Hsi-Jung WU, Jiefu ZHAI, Ke ZHANG, Ming CHEN
  • Publication number: 20230396819
    Abstract: A video delivery system generates and stores reduced bandwidth videos from source video. The system may include a track generator that executes functionality of application(s) to be used at sink devices, in which the track generator generates tracks from execution of the application(s) on source video and generates tracks having a reduced data size as compared to the source video. The track generator may execute a first instance of application functionality on the source video, which identifies region(s) of interest from the source video. The track generator further may downsample the source video according to downsampling parameters, and execute a second instance of application functionality on the downsampled video. The track generator may determine, from a comparison of outputs from the first and second instances of the application, whether the output from the second instance of application functionality is within an error tolerance of the output from the first instance of application functionality.
    Type: Application
    Filed: June 1, 2023
    Publication date: December 7, 2023
    Inventors: Ke ZHANG, Xiaoxia SUN, Shujie LIU, Xiaosong ZHOU, Jian LI, Xun SHI, Jiefu ZHAI, Albert E KEINATH, Hsi-Jung WU, Jingteng XUE, Xingyu ZHANG, Jun XIN
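The error-tolerance comparison described above can be sketched with a toy detector. Assumptions: the "application functionality" is a stand-in centroid-of-bright-pixels detector, the downsampling is simple decimation, and the tolerance value is arbitrary.

```python
# Hedged sketch: run the same analysis on the source video frame and on a
# downsampled copy, rescale the reduced result, and accept the reduced track
# only if it stays within an error tolerance of the full-resolution result.
import numpy as np

def detect_roi_center(frame: np.ndarray):
    ys, xs = np.nonzero(frame > frame.mean())
    return np.array([xs.mean(), ys.mean()])

def downsample(frame: np.ndarray, factor: int):
    return frame[::factor, ::factor]

def track_is_acceptable(source_frame, factor=4, tolerance_px=4.0):
    full = detect_roi_center(source_frame)                        # first instance
    reduced = detect_roi_center(downsample(source_frame, factor)) * factor
    return np.linalg.norm(full - reduced) <= tolerance_px         # compare outputs

frame = np.zeros((240, 320), dtype=np.float32)
frame[100:140, 200:260] = 1.0          # a bright region of interest
print(track_is_acceptable(frame))
```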
  • Patent number: 11818502
    Abstract: Embodiments of the present disclosure provide systems and methods for perspective shifting in a video conferencing session. In one exemplary method, a video stream may be generated. A foreground element may be identified in a frame of the video stream and distinguished from a background element of the frame. Data may be received representing a viewing condition at a terminal that will display the generated video stream. The frame of the video stream may be modified based on the received data to shift the foreground element relative to the background element. The modified video stream may be displayed at the displaying terminal.
    Type: Grant
    Filed: June 22, 2022
    Date of Patent: November 14, 2023
    Assignee: APPLE INC.
    Inventors: Jae Hoon Kim, Chris Y. Chung, Dazhong Zhang, Hang Yuan, Hsi-Jung Wu, Xiaosong Zhou, Jiefu Zhai
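A minimal sketch of shifting a foreground layer against a background, assuming a precomputed segmentation mask and an arbitrary pixels-per-degree gain tied to the reported viewing condition; hole filling behind the foreground is noted but not implemented.

```python
# Illustrative foreground/background perspective shift.
import numpy as np

def shift_layer(layer: np.ndarray, dx: int) -> np.ndarray:
    return np.roll(layer, dx, axis=1)

def composite(frame, mask, viewer_angle_deg, gain_px_per_deg=2.0):
    """mask == True marks foreground pixels."""
    dx = int(round(viewer_angle_deg * gain_px_per_deg))
    fg = shift_layer(np.where(mask, frame, 0), dx)
    fg_mask = shift_layer(mask, dx)
    bg = np.where(mask, 0, frame)          # background with a hole; a real
    return np.where(fg_mask, fg, bg)       # system would inpaint the hole

frame = np.random.randint(0, 255, (120, 160), dtype=np.uint8)
mask = np.zeros_like(frame, dtype=bool)
mask[40:80, 60:100] = True                 # detected foreground element
out = composite(frame, mask, viewer_angle_deg=5.0)
```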
  • Patent number: 11818394
    Abstract: Techniques are disclosed for coding video data predictively based on predictions made from spherical-domain projections of input pictures to be coded and reference pictures that are prediction candidates. Spherical projections of an input picture and the candidate reference pictures may be generated. Thereafter, a search may be conducted for a match between the spherical-domain representation of a pixel block to be coded and a spherical-domain representation of the reference picture. On a match, an offset may be determined from the spherical-domain representation of the pixel block to a matching portion of the reference picture in the spherical-domain representation. The spherical-domain offset may be transformed to a motion vector in a source-domain representation of the input picture, and the pixel block may be coded predictively with reference to a source-domain representation of the matching portion of the reference picture.
    Type: Grant
    Filed: March 19, 2021
    Date of Patent: November 14, 2023
    Assignee: APPLE INC.
    Inventors: Jae Hoon Kim, Xiaosong Zhou, Dazhong Zhang, Hang Yuan, Jiefu Zhai, Chris Y. Chung, Hsi-Jung Wu
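The spherical-to-source mapping step described above is sketched below. Assumptions: the source picture is treated as an equirectangular projection, the angular offset is taken as given (the search itself is omitted), and the mapping functions are illustrative.

```python
# Sketch: convert a spherical-domain offset into a source-domain motion vector.
import math

def to_sphere(x, y, width, height):
    lon = (x / width) * 2 * math.pi - math.pi
    lat = math.pi / 2 - (y / height) * math.pi
    return lon, lat

def to_source(lon, lat, width, height):
    x = (lon + math.pi) / (2 * math.pi) * width
    y = (math.pi / 2 - lat) / math.pi * height
    return x, y

def spherical_offset_to_mv(block_xy, angular_offset, width, height):
    lon, lat = to_sphere(*block_xy, width, height)
    lon2, lat2 = lon + angular_offset[0], lat + angular_offset[1]
    x2, y2 = to_source(lon2, lat2, width, height)
    return x2 - block_xy[0], y2 - block_xy[1]       # source-domain motion vector

mv = spherical_offset_to_mv((1200, 500), (0.01, -0.02), width=3840, height=1920)
print(mv)
```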
  • Publication number: 20230300341
    Abstract: Techniques are disclosed for generating virtual reference frames that may be used for prediction of input video frames. The virtual reference frames may be derived from already-coded reference frames and thereby incur reduced signaling overhead. Moreover, signaling of virtual reference frames may be avoided until an encoder selects the virtual reference frame as a prediction reference for a current frame. In this manner, the techniques proposed herein contribute to improved coding efficiencies.
    Type: Application
    Filed: January 20, 2023
    Publication date: September 21, 2023
    Inventors: Yeqing WU, Yunfei ZHENG, Alexandros TOURAPIS, Alican NALCI, Yixin DU, Hilmi Enes EGILMEZ, Albert E. KEINATH, Jun XIN, Hsi-Jung WU
  • Publication number: 20230269400
    Abstract: In communication applications, aggregate source image data at a transmitter exceeds the data that is needed to display a rendering of a viewport at a receiver. Improved streaming techniques are described that include estimating a location of a viewport at a future time. According to such techniques, the viewport may represent a portion of an image from a multi-directional video to be displayed at the future time, and tile(s) of the image may be identified in which the viewport is estimated to be located. In these techniques, the image data of tile(s) in which the viewport is estimated to be located may be requested at a first service tier, and the other tiles in which the viewport is not estimated to be located may be requested at a second service tier, lower than the first service tier.
    Type: Application
    Filed: March 9, 2023
    Publication date: August 24, 2023
    Inventors: Xiaohua YANG, Alexandros TOURAPIS, Dazhong ZHANG, Hang YUAN, Hsi-Jung WU, Jae Hoon KIM, Jiefu ZHAI, Ming CHEN, Xiaosong ZHOU
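The two-tier request logic above can be sketched directly. Assumptions: a 4x8 tile grid, a linear viewport predictor, and the "tier-1"/"tier-2" labels are all illustrative.

```python
# Sketch: request predicted-viewport tiles at a higher service tier,
# all remaining tiles at a lower tier.
def predict_viewport_center(center, velocity, dt):
    return (center[0] + velocity[0] * dt, center[1] + velocity[1] * dt)

def tiles_for_viewport(center, grid=(4, 8), span=1):
    row, col = int(center[0]) % grid[0], int(center[1]) % grid[1]
    return {((row + dr) % grid[0], (col + dc) % grid[1])
            for dr in range(-span, span + 1)
            for dc in range(-span, span + 1)}

def build_requests(center, velocity, dt, grid=(4, 8)):
    predicted = predict_viewport_center(center, velocity, dt)
    hot = tiles_for_viewport(predicted, grid)
    all_tiles = {(r, c) for r in range(grid[0]) for c in range(grid[1])}
    return ({t: "tier-1" for t in hot} |
            {t: "tier-2" for t in all_tiles - hot})

requests = build_requests(center=(1.0, 3.0), velocity=(0.0, 2.0), dt=1.0)
print(sum(1 for v in requests.values() if v == "tier-1"), "tiles at tier-1")
```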
  • Publication number: 20230262196
    Abstract: Some embodiments provide a method for initiating a video conference using a first mobile device. The method presents, during an audio call through a wireless communication network with a second device, a selectable user-interface (UI) item on the first mobile device for switching from the audio call to the video conference. The method receives a selection of the selectable UI item. The method initiates the video conference without terminating the audio call. The method terminates the audio call before allowing the first and second devices to present audio and video data exchanged through the video conference.
    Type: Application
    Filed: April 27, 2023
    Publication date: August 17, 2023
    Inventors: Elizabeth C. CRANFILL, Stephen O. LEMAY, Joe S. ABUAN, Hsi-Jung WU, Xiaosong ZHOU, Roberto GARCIA, JR.
  • Patent number: 11711527
    Abstract: A method of adaptive chroma downsampling is presented. The method comprises converting a source image to a converted image in an output color format, applying a plurality of downsample filters to the converted image, estimating a distortion for each filter, and choosing the filter that produces the minimum distortion. The distortion estimation includes applying an upsample filter, and a pixel is output based on the chosen filter. Methods for closed-loop conversions are also presented.
    Type: Grant
    Filed: August 17, 2021
    Date of Patent: July 25, 2023
    Assignee: APPLE INC.
    Inventors: Alexandros Tourapis, Yeping Su, David W. Singer, Hsi-Jung Wu
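The closed-loop filter choice above lends itself to a short sketch. Assumptions: only two candidate downsample filters, nearest-neighbour upsampling, and sum-of-squared-error distortion; the real filters are not reproduced.

```python
# Sketch: apply candidate downsample filters, upsample each result back,
# measure the reconstruction error, and keep the filter with minimum distortion.
import numpy as np

def down_drop(ch):       # keep every other sample
    return ch[:, ::2]

def down_average(ch):    # average horizontal pairs
    return (ch[:, ::2].astype(np.float32) + ch[:, 1::2]) / 2.0

def upsample(ch, width):
    return np.repeat(ch, 2, axis=1)[:, :width]

def choose_downsample_filter(chroma: np.ndarray):
    candidates = {"drop": down_drop, "average": down_average}
    best_name, best_err = None, float("inf")
    for name, f in candidates.items():
        reconstructed = upsample(f(chroma), chroma.shape[1])
        err = float(((chroma - reconstructed.astype(np.float32)) ** 2).sum())
        if err < best_err:
            best_name, best_err = name, err
    return best_name, best_err

chroma = np.random.randint(0, 255, (16, 32)).astype(np.float32)
print(choose_downsample_filter(chroma))
```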
  • Patent number: 11683525
    Abstract: A system comprises an encoder configured to compress attribute and/or spatial information for a point cloud and/or a decoder configured to decompress compressed attribute and/or spatial information for the point cloud. To compress the attribute and/or spatial information, the encoder is configured to convert a point cloud into an image based representation. Also, the decoder is configured to generate a decompressed point cloud based on an image based representation of a point cloud. In some embodiments, an encoder performs downscaling of an image frame prior to video encoding and a decoder performs upscaling of an image frame subsequent to video decoding.
    Type: Grant
    Filed: November 10, 2021
    Date of Patent: June 20, 2023
    Assignee: Apple Inc.
    Inventors: Khaled Mammou, Yeping Su, Jungsun Kim, Valery G. Valentin, David W. Singer, Fabrice A. Robinet, Hsi-Jung Wu, Alexandros Tourapis
  • Publication number: 20230188738
    Abstract: In an example method, a decoder obtains a data stream representing video content. The video content is partitioned into one or more logical units, and each of the logical units is partitioned into one or more respective logical sub-units. The decoder determines that the data stream includes first data indicating that a first logical unit has been encoded according to a flexible skip coding scheme. In response, the decoder determines a first set of decoding parameters based on the first data, and decodes each of the logical sub-units of the first logical unit according to the first set of decoding parameters.
    Type: Application
    Filed: December 6, 2022
    Publication date: June 15, 2023
    Inventors: Alican Nalci, Alexandros Tourapis, Hilmi Enes Egilmez, Hsi-Jung Wu, Jun Xin, Yeqing Wu, Yixin Du, Yunfei Zheng
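The decode-side flow above can be sketched as parameter inheritance. Assumptions: the parameter fields (transform, qp) and the dictionary-based "bitstream" are illustrative; the actual syntax of the flexible skip coding scheme is not reproduced.

```python
# Minimal sketch: when a logical unit carries the flexible-skip indication,
# one parameter set is derived from that data and reused for every logical
# sub-unit; otherwise each sub-unit parses its own parameters.
from dataclasses import dataclass

@dataclass
class DecodeParams:
    transform: str
    qp: int

def decode_logical_unit(unit: dict) -> list:
    decoded = []
    if unit.get("flexible_skip"):
        shared = DecodeParams(**unit["flexible_skip"])      # first data -> params
        for sub in unit["sub_units"]:
            decoded.append((sub["id"], shared))             # same params for all
    else:
        for sub in unit["sub_units"]:
            decoded.append((sub["id"], DecodeParams(**sub["params"])))
    return decoded

unit = {"flexible_skip": {"transform": "DCT2", "qp": 30},
        "sub_units": [{"id": 0}, {"id": 1}, {"id": 2}]}
print(decode_logical_unit(unit))
```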
  • Patent number: 11677934
    Abstract: In an example method, a system receives a plurality of frames of a video, and generates a data structure representing the video and representing a plurality of temporal layers. Generating the data structure includes: (i) determining a plurality of quality levels for presenting the video, where each of the quality levels corresponds to a different respective sampling period for sampling the frames of the video, (ii) assigning, based on the sampling periods, each of the frames to a respective one of the temporal layers of the data structure, and (iii) indicating, in the data structure, one or more relationships between (a) at least one of the frames assigned to at least one of the temporal layers of the data structure, and (b) at least another one of the frames assigned to at least another one of the temporal layers of the data structure. Further, the system outputs the data structure.
    Type: Grant
    Filed: September 24, 2021
    Date of Patent: June 13, 2023
    Assignee: Apple Inc.
    Inventors: Sudeng Hu, David L. Biderman, Christopher M. Garrido, Hsi-Jung Wu, Xiaosong Zhou, Dazhong Zhang, Jinbo Qiu, Karthick Santhanam, Hang Yuan, Joshua L. Hare, Luciano M. Verger, Kevin Arthur Robertson, Sasanka Vemuri
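The layer-assignment step described above can be illustrated with a short sketch. Assumptions: three quality levels with sampling periods 4, 2, and 1, and a frame is placed in the layer of the coarsest quality level that would sample it; this is one plausible mapping, not the claimed one.

```python
# Sketch: assign frames to temporal layers based on per-quality-level
# sampling periods.
SAMPLING_PERIODS = [4, 2, 1]   # quality level 0, 1, 2 -> sample every Nth frame

def assign_temporal_layers(num_frames: int) -> dict:
    layers = {i: [] for i in range(len(SAMPLING_PERIODS))}
    for frame_idx in range(num_frames):
        for layer, period in enumerate(SAMPLING_PERIODS):
            if frame_idx % period == 0:
                layers[layer].append(frame_idx)   # coarsest level that keeps it
                break
    return layers

layers = assign_temporal_layers(12)
print(layers)
# Playing quality level 1 uses layers 0 and 1, i.e. every 2nd frame.
print(sorted(layers[0] + layers[1]))
```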