Patent abstract:
The present invention relates to a method of decoding video data, including receiving a first block of video data; receiving a first syntax element indicating whether or not an encoding mode is to be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit; explicitly decoding the value of the first received syntax element; and applying the encoding mode to the first block of video data in accordance with the value of the first syntax element.
Publication number: BR112019018464A2
Application number: R112019018464
Filing date: 2018-03-08
Publication date: 2020-04-14
Inventors: Karczewicz Marta; Seregin Vadim; Zhao Xin
Applicant: Qualcomm Inc
Primary IPC classification:
Patent description:

INTRA FILTERING INDICATOR IN VIDEO CODING [0001] This Patent Application claims the benefit of US Provisional Patent Application No. 62/470,099, filed on March 10, 2017, and US Provisional Patent Application No. 62/475,739, filed on March 23, 2017, the contents of both of which are incorporated herein by reference.
TECHNICAL FIELD [0002] This invention relates to video encoding and decoding.
BACKGROUND [0003] Digital video capabilities can be incorporated into a wide variety of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), portable or desktop computers, tablets, e-book readers, digital cameras, digital recording devices, digital media players, video game devices, video game consoles, cellular or satellite radiotelephones, so-called smartphones, video teleconferencing devices, video streaming devices and the like. Digital video devices implement video encoding techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), the High Efficiency Video Coding (HEVC or H.265) standard and the extensions to these standards. Video devices can transmit, receive, encode, decode and/or store information
Petition 870190087428, of 9/5/2019, p. 20/133
of digital video more efficiently by implementing these video compression techniques.
[0004] Video coding techniques include spatial (intra-image) prediction and/or temporal (inter-image) prediction to reduce or remove the redundancy inherent in video sequences. For block-based video encoding, a video slice (for example, a video frame or part of a video frame) can be segmented into video blocks, which can also be referred to as treeblocks, coding units (CUs) and/or coding nodes. Images can be referred to as frames, and reference images can be referred to as reference frames.
[0005] Spatial or temporal prediction results in a predictive block for a block to be coded. Residual data represents the pixel differences between the original block to be encoded and the predictive block. For further compression, the residual data can be transformed from the pixel domain to a transform domain, resulting in residual transform coefficients, which can then be quantized. Entropy coding can be applied to achieve even further compression.
SUMMARY [0006] This invention relates to intraprediction, determination of prediction directions, determination of prediction modes, determination of encoding modes, determinations regarding the use of intra filtering in video coding (for example, video encoding and/or video decoding), and the explicit encoding and signaling of syntax elements.
[0007] In one or more of the examples discussed below, a video encoder and a video decoder can be configured to determine the explicit decoding of syntax elements indicating the use of an encoding mode based on a comparison of the number of non-zero transform coefficients associated with a block against a limit. If the number of non-zero transform coefficients in the block is greater than or equal to the limit, the video encoder and the video decoder explicitly encode the syntax element for the encoding mode. If the number of non-zero transform coefficients in the block is less than the limit, the video encoder and the video decoder do not explicitly encode the syntax element indicating the encoding mode. The techniques of this invention can be used with any coding modes, including intra reference sample smoothing filters and position-dependent prediction combination (PDPC) modes.
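The threshold comparison described above can be sketched as follows. This is a hedged, illustrative sketch (not the patent's normative text): the function and variable names are invented for illustration, and the inferred-value behavior when the flag is not coded (here, inferring 0) is an assumption.

```python
# Illustrative sketch: decide whether a coding-mode flag is explicitly coded
# for a block, by comparing the block's number of non-zero transform
# coefficients against a limit.

def count_nonzero_coeffs(coeff_block):
    """Count the non-zero transform coefficients in a 2-D block."""
    return sum(1 for row in coeff_block for c in row if c != 0)

def flag_is_explicitly_coded(coeff_block, limit):
    """True  -> the encoder signals / the decoder parses the mode flag.
    False -> the flag is not coded; its value is inferred (e.g., as 0)."""
    return count_nonzero_coeffs(coeff_block) >= limit

block = [[5, 0, -1, 0],
         [0, 2, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
print(flag_is_explicitly_coded(block, limit=3))  # True: 3 non-zero coeffs >= 3
print(flag_is_explicitly_coded(block, limit=4))  # False: 3 non-zero coeffs < 4
```

Because both encoder and decoder can count the non-zero coefficients of the block, the decision needs no extra signaling of its own.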
[0008] In one example of the invention, a method of decoding video data comprises receiving a first block of video data; receiving a first syntax element indicating whether an encoding mode will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit; explicitly decoding a value of the first received syntax element; and applying the encoding mode to the first block of video data in accordance with the value of the first syntax element.
[0009] In another example of the invention, a method of encoding video data comprises determining an encoding mode for encoding a first block of video data; explicitly encoding a first syntax element indicating whether the encoding mode will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit; and signaling the first syntax element in an encoded video bit stream.
[0010] In another example of the invention, an apparatus configured to decode video data comprises a memory configured to store the video data, and one or more processors in communication with the memory, the one or more processors configured to receive a first block of video data; receive a first syntax element indicating whether an encoding mode will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit; explicitly decode a value of the first received syntax element; and apply the encoding mode to the first block of video data in accordance with the value of the first syntax element.
[0011] In another example of the invention, an apparatus configured to encode video data comprises a memory configured to store the video data, and one or more processors in communication with the memory, the one or more processors configured to determine an encoding mode for encoding a first block of the video data; explicitly encode a first syntax element indicating whether the encoding mode will be used for the first block of the video data in the event that the first block of the video data is associated with a number of non-zero transform coefficients greater than or equal to a limit; and signal the first syntax element in an encoded video bit stream.
[0012] In another example of the invention, an apparatus configured to decode video data comprises means for receiving a first block of video data; means for receiving a first syntax element indicating whether an encoding mode will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit; means for explicitly decoding a value of the first received syntax element; and means for applying the encoding mode to the first block of video data in accordance with the value of the first syntax element.
[0013] In another example of the invention, an apparatus configured to encode video data comprises means for determining an encoding mode for encoding a first block of video data; means for explicitly encoding a first syntax element indicating whether the encoding mode will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit; and means for signaling the first syntax element in an encoded video bit stream.
[0014] In another example, this invention describes a computer-readable storage medium storing instructions that, when executed, cause one or more processors of a device configured to decode video data to receive a first block of video data; receive a first syntax element indicating whether an encoding mode will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit; explicitly decode a value of the first received syntax element; and apply the encoding mode to the first block of video data in accordance with the value of the first syntax element.
[0015] In another example, this invention describes a computer-readable storage medium storing instructions that, when executed, cause one or more processors of a device configured to encode video data to determine an encoding mode for encoding a first block of video data; explicitly encode a first syntax element indicating whether the encoding mode will be used for the first block of the video data in the event that the first block of the video data is associated with a number of non-zero transform coefficients greater than or equal to a limit; and signal the first syntax element in an encoded video bit stream.
[0016] The exemplary techniques described below for determining the explicit encoding of syntax elements for the encoding modes can be used in conjunction with one or more other techniques described in this invention, in any combination. For example, the techniques of this invention for determining the explicit encoding of syntax elements for the encoding modes can be used in conjunction with techniques for encoding syntax elements for transform indices; techniques for determining the explicit encoding of syntax elements for luminance (luma) and chrominance (chroma) blocks; techniques for determining the explicit encoding of syntax elements for blocks without transform skip; techniques for determining the explicit encoding of syntax elements for blocks with particular intraprediction modes; techniques for determining the explicit encoding of syntax elements based on block size; and techniques for context coding of syntax elements.
[0017] Details of one or more aspects of the invention are set out in the attached drawings and in the description below. Other features, objects and advantages of the techniques described in this invention will be apparent from the description, drawings and claims.
BRIEF DESCRIPTION OF THE DRAWINGS [0018] Figure 1 is a block diagram that illustrates an exemplary video encoding and decoding system configured to implement the
techniques described in this invention.
[0019] Figure 2A is a conceptual diagram illustrating an example of block segmentation using a quadtree plus binary tree (QTBT) structure.
[0020] Figure 2B is a conceptual diagram illustrating an exemplary tree structure corresponding to the segmentation of blocks using the QTBT structure of figure 2A.
[0021] Figure 3A illustrates a prediction of a 4x4 block using an unfiltered reference according to the techniques of this invention.
[0022] Figure 3B illustrates a prediction of a 4x4 block using a filtered reference according to the techniques of this invention.
[0023] Figure 4 is a block diagram illustrating an example of a video encoder configured to implement the techniques of the invention.
[0024] Figure 5 is a block diagram illustrating an example of a video decoder configured to implement the techniques of the invention.
[0025] Figure 6 is a flow chart illustrating an exemplary coding method of the invention.
[0026] Figure 7 is a flow chart illustrating an exemplary decoding method of the invention.
DETAILED DESCRIPTION [0027] This invention relates to intraprediction, determination of prediction directions, determination of prediction modes, determination of encoding modes, determinations regarding the use of intra filtering in video coding (for example, video encoding and/or video decoding), and the explicit encoding and signaling of syntax elements.
[0028] Video encoding standards include ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC) and ITU-T H.265 (also known as High Efficiency Video Coding (HEVC)), including extensions such as Scalable Video Coding (SVC), Multiview Video Coding (MVC) and Screen Content Coding (SCC). Other video encoding standards include future video encoding standards, such as the Joint Video Exploration Team (JVET) test model, which is the development activity beyond HEVC. Video encoding standards also include proprietary video codecs, such as Google VP8, VP9, VP10, and video codecs developed by other organizations, for example, the Alliance for Open Media.
[0029] In HEVC and the Joint Exploration Model (JEM), which is the test software under study by JVET, an intra reference can be smoothed, for example, a filter can be applied. In HEVC, mode-dependent intra smoothing (MDIS) is used, whereby a filter is applied to an intra reference (neighboring samples relative to a currently encoded block) before generating the intra prediction from the intra reference. The modes for which MDIS is enabled are obtained based on the proximity of the current intra mode to a horizontal or vertical direction. For example, the modes for which MDIS is enabled can be obtained based on the absolute difference between the current intra mode index and the horizontal and vertical mode indices. If the absolute difference exceeds a certain limit (for example, the limit may be dependent on the block size), MDIS filtering is not applied; otherwise, it is applied. In other words, for the intra modes that are far from the horizontal or vertical directions, the intra reference filter is applied. MDIS does not apply to non-angular modes, such as DC or planar mode.
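The MDIS decision described above can be sketched as follows. This is a hedged, illustrative sketch: the mode indices (planar = 0, DC = 1, horizontal = 10, vertical = 26) follow HEVC conventions, but the per-size threshold table here is illustrative rather than a bit-exact copy of the standard, and the "non-angular modes are never filtered" behavior follows this document's text.

```python
# Illustrative sketch of the MDIS filter decision: filter only angular modes
# whose index is far enough from both the horizontal and vertical mode indices.

HOR_IDX, VER_IDX = 10, 26  # HEVC horizontal and vertical angular mode indices

# Illustrative thresholds keyed by block size: larger blocks tolerate a
# smaller angular distance before the reference filter kicks in.
MDIS_THRESHOLD = {8: 7, 16: 1, 32: 0}

def mdis_filter_applied(intra_mode, block_size):
    """Return True if the intra reference smoothing filter is applied."""
    if intra_mode in (0, 1):  # planar / DC: non-angular, not filtered here
        return False
    dist = min(abs(intra_mode - HOR_IDX), abs(intra_mode - VER_IDX))
    return dist > MDIS_THRESHOLD.get(block_size, 7)

print(mdis_filter_applied(10, 16))  # False: exactly horizontal, not filtered
print(mdis_filter_applied(18, 16))  # True: diagonal mode, far from H and V
```

The threshold shrinking with block size matches the intuition in the text: for larger blocks, almost every angular mode gets the smoothing filter.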
[0030] In JEM, MDIS has been replaced by a smoothing filter (adaptive reference sample filtering (RSAF) or adaptive reference sample smoothing (ARSS)), which, in some examples, can be applied for all intra modes except the DC mode. A flag, which indicates whether or not the filter is applied to the current block, is signaled to the decoder. The signaling is done not as an explicit indicator, but rather hidden in the transform coefficients. That is, the value of the indicator that indicates whether the filter is applied to the current block can be determined by a video decoder based on certain values or characteristics of the transform coefficients. For example, if the transform coefficients meet a certain parity condition, the indicator is derived as 1; otherwise, the indicator is derived as 0.
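The parity-based hiding described above can be sketched as follows. This is a hedged, illustrative sketch: using the parity of the sum of absolute coefficient levels is an assumed stand-in for "a certain parity condition", not the exact JEM rule, and the function name is invented for illustration.

```python
# Illustrative sketch: derive a hidden filter flag from a parity condition on
# the transform coefficients instead of parsing an explicit bit. The encoder
# would adjust one coefficient, if needed, so the parity matches the flag.

def derive_hidden_flag(coeffs):
    """Derive the flag from coefficient parity: even sum -> 1, odd sum -> 0."""
    total = sum(abs(c) for c in coeffs)
    return 1 if total % 2 == 0 else 0

print(derive_hidden_flag([3, -1, 2]))  # 1: |3| + |-1| + |2| = 6, even
print(derive_hidden_flag([3, -1, 1]))  # 0: sum of levels is 5, odd
```

The design point is cost: the flag consumes no dedicated bits in the stream, at the price of occasionally perturbing one coefficient on the encoder side.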
[0031] Another tool used in JEM is the position-dependent intraprediction combination (PDPC) mode. PDPC is a coding mode that weights the intra reference and intra predictor samples, where the weights can be derived based on the block size (including the width and height) and on the intra mode.
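A PDPC-style combination can be sketched as follows. This is a hedged, illustrative sketch: the final predictor blends reference samples with the conventional intra predictor using weights that decay with distance from the block border. The specific weight formula and the `scale` parameter below are assumptions for illustration (resembling simplified position-dependent weighting), not the exact JEM PDPC derivation.

```python
# Illustrative sketch of position-dependent weighting: samples near the top or
# left border lean more on the reference samples; interior samples lean on the
# conventional predictor.

def pdpc_sample(pred, ref_top, ref_left, x, y, scale=2):
    """Blend pred[y][x] with the top/left reference samples (fixed-point)."""
    w_top = 32 >> min((y << 1) >> scale, 31)   # decays with row distance
    w_left = 32 >> min((x << 1) >> scale, 31)  # decays with column distance
    w_pred = 64 - w_top - w_left
    return (w_top * ref_top[x] + w_left * ref_left[y]
            + w_pred * pred[y][x] + 32) >> 6   # round and renormalize by 64

pred = [[100] * 4 for _ in range(4)]  # flat conventional predictor
ref_top = [140] * 4                   # reference row above the block
ref_left = [80] * 4                   # reference column left of the block
print(pdpc_sample(pred, ref_top, ref_left, 0, 0))  # 110: strong pull toward refs
print(pdpc_sample(pred, ref_top, ref_left, 3, 3))  # 105: weaker pull deep inside
```

Note how the corner sample moves furthest from the plain predictor value of 100, consistent with weights that depend on position within the block.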
[0032] Figure 1 is a block diagram illustrating an exemplary video encoding and decoding system 10 that can be configured to perform the techniques of this invention. As shown in figure 1, the system 10 includes the source device 12, which provides encoded video data to be later decoded by the destination device 14. In particular, the source device 12 provides the video data to the destination device 14 via the computer-readable medium 16. The source device 12 and the destination device 14 can comprise any of a wide variety of devices, including desktop computers, notebook computers (i.e., laptops), tablets, set-top boxes, telephone handsets such as so-called smart phones, televisions, cameras, display devices, digital media players, video game consoles, video streaming devices or the like. In some cases, the source device 12 and the destination device 14 may be equipped for wireless communication. Thus, the source device 12 and the destination device 14 can be wireless communication devices. The source device 12 is an exemplary video encoding device (i.e., a device for encoding video data). The destination device 14 is an exemplary video decoding device (i.e., a device for decoding video data).
[0033] In the example of figure 1, the source device 12 includes the video source 18, the storage medium 20 configured to store video data, the video encoder 22 and the output interface 24. The destination device 14 includes the input interface 26, the storage medium 28 configured to store encoded video data, the video decoder 30 and the display device 32. In other examples, the source device 12 and the destination device 14 include other components or arrangements. For example, the source device 12 can receive video data from an external video source, such as an external camera. Likewise, the destination device 14 can interact with an external display device, instead of including an integrated display device 32.
[0034] The illustrated system 10 of figure 1 is just one example. Techniques for processing and/or encoding (e.g., encoding and/or decoding) video data can be performed by any digital video encoding and decoding device. Although the techniques of this invention are generally performed by a video encoding device and/or a video decoding device, the techniques can also be performed by a combined video encoder/decoder, commonly referred to as a CODEC. The source device 12 and the destination device 14 are just examples of such encoding devices, where the source device 12 generates encoded video data for transmission to the destination device 14. In some examples, the source device 12 and the destination device 14 can operate in a substantially symmetrical manner, such that each of the source device 12 and the destination device 14 includes video encoding and decoding components. Thus, the system 10 can support unidirectional or bidirectional video transmission between the source device 12 and the destination device 14, for example, for video streaming, video playback, video broadcasting or video telephony.
[0035] The video source 18 of the source device 12 may include a video capture device, such as a video camera, a video archive containing previously captured video, and/or a video input interface for receiving video data from a video content provider. As another alternative, the video source 18 can generate computer graphics-based data as the source video, or a combination of live video, archived video and computer-generated video. The source device 12 may comprise one or more data storage media (for example, the storage medium 20) configured to store video data. The techniques described in this document may be applicable to video encoding in general, and can be applied to wireless and/or wired applications. In each case, the captured, pre-captured or computer-generated video can be encoded by the video encoder 22. The output interface 24 can output the encoded video information (for example, a bit stream of encoded video data) to the computer-readable medium 16.
[0036] The destination device 14 can receive the encoded video data to be decoded via the computer-readable medium 16. The computer-readable medium 16 can comprise any type of medium or device capable of moving the encoded video data from the source device 12 to the destination device 14. In one example, the computer-readable medium 16 may comprise a communication medium to enable the source device 12 to transmit encoded video data directly to the destination device 14 in real time. The encoded video data can be modulated according to a communication standard, such as a wireless communication protocol, and transmitted to the destination device 14. The communication medium can comprise any wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The communication medium may be part of a packet-based network, such as a local area network, a wide area network, or a global network such as the Internet. The communication medium may include routers, switches, base stations or any other equipment that may be useful to facilitate communication from the source device 12 to the destination device 14. The destination device 14 may comprise one or more data storage media configured to store encoded video data and decoded video data.
[0037] In some examples, encoded data can be output from the output interface 24 to a storage device. Likewise, encoded data can be accessed from the storage device via the input interface. The storage device can include any one of a variety of distributed or locally accessed data storage media, such as a hard disk drive, Blu-ray discs, DVDs, CD-ROMs, flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing encoded video data. In another example, the storage device may correspond to a file server or other intermediate storage device that can hold the encoded video generated by the source device 12. The destination device 14 can access stored video data from the storage device via streaming or download. The file server can be any type of server capable of storing encoded video data and transmitting that encoded video data to the destination device 14. Exemplary file servers include a network server (for example, for a website), an FTP server, network-attached storage (NAS) devices or a local disk drive. The destination device 14 can access the encoded video data via any standard data connection, including an Internet connection. This can include a wireless channel (for example, a Wi-Fi connection), a wired connection (for example, DSL, cable modem, etc.), or a combination of both that is suitable for accessing encoded video data stored on a file server. The transmission of encoded video data from the storage device can be a streaming transmission, a download transmission or a combination of both.
[0038] The techniques described in this document can be applied to video encoding in support of any of a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, Internet streaming video transmissions, such as dynamic adaptive streaming over HTTP (DASH), digital video that is encoded onto a data storage medium, decoding of digital video stored on a data storage medium, or other applications. In some examples, the system 10 can be configured to support unidirectional or bidirectional video transmission to support applications such as video streaming, video playback, video broadcasting and/or video telephony.
[0039] The computer-readable medium 16 may include transient media, such as a wireless broadcast or wired network transmission, or storage media (i.e., non-transitory storage media), such as a hard disk, flash drive, compact disc, digital video disc, Blu-ray disc or other computer-readable media. In some examples, a network server (not shown) can receive encoded video data from the source device 12 and provide the encoded video data to the destination device 14, for example, via network transmission. Likewise, a computing device of a media production facility, such as a disc stamping facility, can receive encoded video data from the source device 12 and produce a disc containing the encoded video data. Therefore, the computer-readable medium 16 can be understood to include one or more computer-readable media of various forms, in various examples.
[0040] The input interface 26 of the destination device 14 receives information from the computer-readable medium 16. The information from the computer-readable medium 16 can include syntax information defined by the video encoder 22, which is also used by the video decoder 30, including syntax elements that describe the characteristics and/or processing of blocks and other coded units, for example, groups of pictures (GOPs). The storage medium 28 can store encoded video data received by the input interface 26. The display device 32 displays the decoded video data to a user, and can comprise any of a variety of display devices, such as a cathode ray tube (CRT), a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display or another type of display device.
[0041] The video encoder 22 and the video decoder 30 can each be implemented as any of a variety of suitable encoder and/or decoder circuits, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, software, hardware, firmware or combinations thereof. When the techniques are partially implemented in software, a device can store instructions for the software in a suitable non-transitory computer-readable medium, and execute the instructions in hardware using one or more processors to perform the techniques of this invention. Each of the video encoder 22 and the video decoder 30 can be included in one or more encoders or decoders, either of which can be integrated as part of a combined CODEC in a respective device.
[0042] In some examples, the video encoder 22 and the video decoder 30 may operate according to a video encoding standard. Exemplary video encoding standards include, but are not limited to, ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) and Multiview Video Coding (MVC) extensions. In addition, a new video encoding standard, namely High Efficiency Video Coding (HEVC) or ITU-T H.265, including its range and screen content coding extensions, 3D video coding (3D-HEVC) and multiview (MV-HEVC) extensions and scalable extension (SHVC), was developed by the Joint Collaborative Team on Video Coding (JCT-VC) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG).
[0043] In other examples, the video encoder 22 and the video decoder 30 can be configured to operate according to other video encoding techniques and/or standards, including new video encoding techniques being explored by the Joint Video Exploration Team (JVET).
[0044] In HEVC and other video encoding specifications, a video sequence usually includes a series of images. Images can also be called frames. An image can include three sample arrays, denoted SL, Scb and Scr. SL is a two-dimensional array (i.e., a block) of luminance samples. Scb is a two-dimensional array of Cb chrominance samples. Scr is a two-dimensional array of Cr chrominance samples. Chrominance samples can also be referred to herein as chroma samples. In other cases, an image may be monochromatic and may include only an array of luma samples.
[0045] To generate an encoded representation of an image (for example, an encoded video bit stream), the video encoder 22 can generate a set of coding tree units (CTUs). Each of the CTUs can comprise a coding tree block of luma samples, two corresponding coding tree blocks of chroma samples, and syntax structures used to encode the samples of the coding tree blocks. In monochrome images or images having three separate color planes, a CTU can comprise a single coding tree block and syntax structures used to encode the samples of the coding tree block. A coding tree block can be an NxN block of samples. A CTU can also be referred to as a treeblock or a largest coding unit (LCU). In HEVC, CTUs can be, in general terms, analogous to the macroblocks of other standards, such as H.264/AVC. However, a CTU is not necessarily limited to a specific size and can include one or more coding units (CUs). A slice can include an integer number of CTUs ordered consecutively in a raster scan order.

[0046] To generate an encoded CTU, the video encoder 22 can recursively perform quadtree segmentation on the coding tree blocks of a CTU to divide the coding tree blocks into coding blocks, hence the name coding tree units. A coding block can be an NxN block of samples. A CU can comprise a coding block of luma samples and two corresponding coding blocks of chroma samples of an image that has a luma sample array, a Cb sample array and a Cr sample array, and syntax structures used to encode the samples of the coding blocks. In monochrome images or images with three separate color planes, a CU can comprise a single coding block and syntax structures used to encode the samples of the coding block.
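The recursive quadtree segmentation described above can be sketched as follows. This is a hedged, illustrative sketch: the split decision is a caller-supplied predicate invented for illustration, whereas a real encoder decides splits by rate-distortion cost and signals them in the bit stream.

```python
# Illustrative sketch: recursively split a CTU into CUs. Each block either
# stays whole or splits into four equal quadrants, down to a minimum size.

def quadtree_partition(x, y, size, min_size, should_split):
    """Return the list of (x, y, size) leaf coding blocks of the quadtree."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):        # visit the four quadrants
            for dx in (0, half):
                leaves += quadtree_partition(x + dx, y + dy, half,
                                             min_size, should_split)
        return leaves
    return [(x, y, size)]

# Example predicate: split the 64x64 CTU once, then split only the
# top-left 32x32 quadrant one level further.
split_rule = lambda x, y, size: size == 64 or (size == 32 and x == 0 and y == 0)
cus = quadtree_partition(0, 0, 64, 8, split_rule)
print(len(cus))  # 7: four 16x16 CUs in the top-left plus three 32x32 CUs
```

The leaves always tile the CTU exactly, which is why the decoder can reconstruct the same partition from the signaled split flags alone.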
[0047] The video encoder 22 can segment a coding block of a CU into one or more prediction blocks. A prediction block is a rectangular (that is, square or non-square) block of samples to which the same prediction is applied. A prediction unit (PU) of a CU may comprise a prediction block of luma samples, two corresponding prediction blocks of chroma samples, and syntax structures used to predict the prediction blocks. In monochrome images or images with three separate color planes, a PU can comprise a single prediction block and syntax structures used to predict the prediction block. The video encoder 22 can generate predictive blocks (e.g., luma, Cb and Cr predictive blocks) for the prediction blocks (e.g., luma, Cb and Cr prediction blocks) of each PU of the CU.
[0048] The video encoder 22 can use intraprediction or interprediction to generate the predictive blocks for a PU. If the video encoder 22 uses intraprediction to generate the predictive blocks of a PU, the video encoder 22 can generate the predictive blocks of the PU based on decoded samples of the image that includes the PU.
[0049] After the video encoder 22 generates predictive blocks (for example, luma, Cb and Cr predictive blocks) for one or more PUs of a CU, the video encoder 22 can generate one or more residual blocks for the CU. As an example, the video encoder 22 can generate a luma residual block for the CU. Each sample in the CU's luma residual block indicates a difference between a luma sample in one of the CU's predictive luma blocks and a corresponding sample in the CU's original luma coding block. In addition, the video encoder 22 can generate a Cb residual block for the CU. In one example of chroma prediction, each sample in the CU's Cb residual block can indicate a difference between a Cb sample in one of the CU's predictive Cb blocks and a corresponding sample in the CU's original Cb coding block. The video encoder 22 can also generate a Cr residual block for the CU. Each sample in the CU's Cr residual block can indicate a difference between a Cr sample in one of the CU's predictive Cr blocks and a corresponding sample in the CU's original Cr coding block. However, it should be understood that other techniques for chroma prediction can be used.
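The residual formation described above is a per-sample difference, and can be sketched directly. This is a hedged, illustrative sketch with invented names; the same operation applies to the luma, Cb and Cr planes independently.

```python
# Illustrative sketch: each residual sample is the difference between the
# original sample and the co-located predictive sample.

def residual_block(original, predictive):
    """Element-wise difference of two equally sized 2-D sample blocks."""
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, predictive)]

orig = [[52, 55], [61, 59]]   # original coding block samples
pred = [[50, 50], [60, 60]]   # predictive block samples
print(residual_block(orig, pred))  # [[2, 5], [1, -1]]
```

A decoder inverts this by adding the (reconstructed) residual back to the same predictive block, which is why only the residual needs to be transformed and transmitted.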
[0050] In addition, the video encoder 22 can use quadtree segmentation to decompose the residual blocks (for example, the luma, Cb and Cr residual blocks) of a CU into one or more transform blocks (for example, the luma, Cb and Cr transform blocks). A transform block is a rectangular (that is, square or non-square) block of samples to which the same transform is applied. A transform unit (TU) of a CU can comprise a transform block of luma samples, two corresponding transform blocks of chroma samples, and syntax structures used to transform the samples of the transform blocks. Thus, each TU of a CU can have a luma transform block, a Cb transform block and a Cr transform block. The TU's luma transform block can be a sub-block of the CU's luma residual block. The Cb transform block can be a sub-block of the CU's Cb residual block. The Cr transform block can be a sub-block of the CU's Cr residual block. In monochrome images or images with three separate color planes, a TU can comprise a single transform block and syntax structures used to transform the samples of the transform block.
[0051] Video encoder 22 can apply one or more transforms to a transform block of a TU to generate a coefficient block for the TU. For example, video encoder 22 can apply one or more transforms to a luma transform block of a TU to generate a luma coefficient block for the TU. A coefficient block can be a two-dimensional matrix of transform coefficients. A transform coefficient can be a scalar quantity. Video encoder 22 can apply one or more transforms to a Cb transform block of a TU to generate a Cb coefficient block for the TU. Video encoder 22 can apply one or more transforms to a Cr transform block of a TU to generate a Cr coefficient block for the TU.
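The transform step above maps a residual block to a coefficient block. The sketch below uses a floating-point orthonormal DCT-II; this is an illustrative assumption, since HEVC actually specifies fixed-point integer approximations of the DCT:

```python
import numpy as np

def dct2_matrix(n):
    """Orthonormal DCT-II basis matrix (floating-point sketch; real
    codecs such as HEVC use integer approximations of this transform)."""
    k = np.arange(n)
    m = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0] *= 1 / np.sqrt(2)
    return m * np.sqrt(2 / n)

def transform_block(residual):
    """Apply a separable 2-D DCT to a square transform block."""
    residual = np.asarray(residual, dtype=float)
    d = dct2_matrix(residual.shape[0])
    return d @ residual @ d.T

# A flat residual compacts all energy into the DC coefficient [0, 0]
print(np.round(transform_block([[1, 1], [1, 1]]), 6))
```

The energy compaction shown here is what makes the subsequent quantization and entropy coding effective.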
[0052] After generating a coefficient block (for example, a luma coefficient block, a Cb coefficient block or a Cr coefficient block), video encoder 22 can quantize the coefficient block. Quantization, in general, refers to a process in which transform coefficients are quantized to possibly reduce the amount of data used to represent the transform coefficients, providing further compression. After video encoder 22 quantizes a coefficient block, video encoder 22 can entropy encode syntax elements indicating the quantized transform coefficients. For example, video encoder 22 can perform Context-Adaptive Binary Arithmetic Coding (CABAC) on the syntax elements that indicate the quantized transform coefficients.
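Quantization divides each coefficient by a step size and rounds, discarding precision in exchange for compression. A simplified sketch (HEVC derives the step from a quantization parameter, QP, which this example replaces with an explicit `step` argument):

```python
import numpy as np

def quantize(coeffs, step):
    """Uniform scalar quantization with rounding (simplified sketch)."""
    return np.round(np.asarray(coeffs, dtype=float) / step).astype(int)

def dequantize(levels, step):
    """Decoder-side inverse: scale the quantized levels back up."""
    return np.asarray(levels) * step

levels = quantize([[40, 9], [4, 1]], step=8)
print(levels)               # small coefficients collapse to zero
print(dequantize(levels, step=8))
```

Note that dequantization does not recover the original values exactly; the rounding error is the lossy part of the codec.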
[0053] Video encoder 22 can output a bit stream that includes a sequence of bits that forms a representation of encoded images and associated data. Thus, the bit stream comprises an encoded representation of video data. The bit stream may comprise a sequence of Network Abstraction Layer (NAL) units. An NAL unit is a syntax structure containing an indication of the type of data in the NAL unit and bytes containing that data in the form of a raw byte sequence payload (RBSP) interspersed, as necessary, with emulation prevention bits. Each NAL unit can include an NAL unit header and encapsulates an RBSP. The NAL unit header can include a syntax element indicating an NAL unit type code. The NAL unit type code specified by the NAL unit header of an NAL unit indicates the type of the NAL unit. An RBSP can be a syntax structure containing an integer number of bytes that is encapsulated within an NAL unit. In some cases, an RBSP includes zero bits.
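The "emulation prevention" interspersing mentioned above inserts a 0x03 byte whenever the payload would otherwise mimic a start code. A minimal sketch of the H.264/HEVC-style rule (0x03 is inserted after two consecutive zero bytes when the next byte is 0x00–0x03):

```python
def add_emulation_prevention(rbsp: bytes) -> bytes:
    """Insert emulation_prevention_three_byte (0x03) while packing an
    RBSP into a NAL unit, so the payload cannot contain a start-code
    pattern (simplified sketch of the HEVC rule)."""
    out = bytearray()
    zeros = 0
    for b in rbsp:
        if zeros == 2 and b <= 0x03:
            out.append(0x03)   # break up the 00 00 0x pattern
            zeros = 0
        out.append(b)
        zeros = zeros + 1 if b == 0x00 else 0
    return bytes(out)

print(add_emulation_prevention(b"\x00\x00\x01\x07").hex())  # -> '0000030107'
```

The decoder performs the inverse, removing each 0x03 that follows two zero bytes before parsing the RBSP.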
[0054] Video decoder 30 can receive an encoded video bit stream generated by video encoder 22. Furthermore, video decoder 30 can analyze the bit stream to obtain syntax elements from the bit stream. Video decoder 30 can reconstruct the images of the video data based, at least in part, on the syntax elements obtained from the bit stream. The process to reconstruct the video data can, in general, be reciprocal to the process performed by video encoder 22. For example, video decoder 30 can use the motion vectors of PUs to determine the predictive blocks for the PUs of a current CU. In addition, video decoder 30 can inverse quantize the coefficient blocks of the TUs of the current CU. Video decoder 30 can perform inverse transforms on the coefficient blocks to reconstruct the transform blocks of the TUs of the current CU. Video decoder 30 can reconstruct the coding blocks of the current CU by adding the samples of the predictive blocks for the PUs of the current CU to corresponding samples of the transform blocks of the TUs of the current CU. With the reconstruction of the coding blocks for each CU of an image, video decoder 30 can reconstruct the image.
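The final reconstruction step above is an addition followed by a clip to the valid sample range. A minimal sketch (the bit-depth handling is an assumption; 8-bit samples are used for illustration):

```python
import numpy as np

def reconstruct(predictive, residual, bit_depth=8):
    """Decoder-side reconstruction: add the decoded residual samples to
    the predictive block and clip to [0, 2^bit_depth - 1]."""
    recon = (np.asarray(predictive, dtype=np.int32)
             + np.asarray(residual, dtype=np.int32))
    return np.clip(recon, 0, (1 << bit_depth) - 1)

print(reconstruct([[250, 10]], [[10, -20]]))  # clipped to [[255, 0]]
```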
[0055] In some example video codec frameworks, such as the HEVC quadtree segmentation framework, the segmentation of video data into blocks for the color components (for example, luma blocks and chroma blocks) is performed jointly. That is, in some examples, the luma blocks and chroma blocks are segmented in the same way, so that no more than one luma block corresponds to a chroma block at a specific position within an image.
[0056] A quadtree plus binary tree (QTBT) segmentation structure is being studied by the Joint Video Exploration Team (JVET). In J. An et al., Block partitioning structure for next generation video coding, International Telecommunication Union, COM16-C966, September 2015 (hereinafter VCEG proposal COM16-C966), QTBT segmentation techniques were described for a future video coding standard beyond HEVC. Simulations showed that the proposed QTBT structure can be more efficient than the quadtree structure used in HEVC.
[0057] In the QTBT structure described in the VCEG COM16-C966 proposal, a CTB is first segmented using quadtree segmentation techniques, where the quadtree division of a node can be iterated until the node reaches the minimum allowed quadtree leaf node size. The minimum allowed quadtree leaf node size can be indicated to video decoder 30 by the value of the MinQTSize syntax element. If the size of the quadtree leaf node is not greater than the maximum allowed binary tree root node size (for example, as indicated by a MaxBTSize syntax element), the quadtree leaf node can be further segmented using binary tree segmentation. The binary tree segmentation of a node can be iterated until the node reaches the minimum allowed binary tree leaf node size (for example, as indicated by a MinBTSize syntax element) or the maximum allowed binary tree depth (for example, as indicated by a MaxBTDepth syntax element). The VCEG COM16-C966 proposal uses the term CU to refer to leaf nodes of the binary tree. In the VCEG COM16-C966 proposal, CUs are used for prediction (for example, intraprediction, interprediction, etc.) and transform without any further segmentation. In general, according to QTBT techniques, there are two types of splitting for binary tree segmentation: horizontal symmetric segmentation and vertical symmetric segmentation. In each case, a block is segmented by dividing the block down the middle, horizontally or vertically. This differs from quadtree segmentation, which divides a block into four blocks.
[0058] In an example of the QTBT segmentation structure, the CTU size is defined as 128x128 (for example, a 128x128 luma block and two corresponding 64x64 chroma blocks), MinQTSize is defined as 16x16, MaxBTSize is defined as 64x64, MinBTSize (for both width and height) is set to 4, and MaxBTDepth is set to 4. Quadtree segmentation is applied to the CTU first to generate the quadtree leaf nodes. Quadtree leaf nodes can have a size from 16x16 (that is, the MinQTSize is 16x16) to 128x128 (that is, the CTU size). According to an example of QTBT segmentation, if the quadtree leaf node is 128x128, the quadtree leaf node cannot be further divided by the binary tree, since the quadtree leaf node size exceeds MaxBTSize (that is, 64x64). Otherwise, the quadtree leaf node is further segmented by the binary tree. Therefore, the quadtree leaf node is also the root node for the binary tree and has a binary tree depth of 0. The binary tree depth reaching MaxBTDepth (for example, 4) implies that there is no further segmentation. A binary tree node with a width equal to MinBTSize (for example, 4) implies that there is no further horizontal division. Likewise, a binary tree node with a height equal to MinBTSize implies no further vertical segmentation. The binary tree leaf nodes (CUs) are further processed (for example, by performing a prediction process and a transform process) without any further segmentation.
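The split constraints above can be summarized as a small decision function. This is a sketch under the example parameters (MinQTSize=16, MaxBTSize=64, MinBTSize=4, MaxBTDepth=4) and follows the text's convention that a width equal to MinBTSize rules out further horizontal division; real coders track additional state:

```python
def allowed_splits(width, height, bt_depth,
                   min_qt=16, max_bt=64, min_bt=4, max_bt_depth=4):
    """List the further splits allowed for a node under the example
    QTBT parameters from the text (hypothetical helper)."""
    splits = []
    # Quadtree split: only square nodes not yet in the binary-tree stage.
    if bt_depth == 0 and width == height and width > min_qt:
        splits.append('quad')
    # Binary-tree splits: node must not exceed MaxBTSize or MaxBTDepth.
    if width <= max_bt and height <= max_bt and bt_depth < max_bt_depth:
        if width >= 2 * min_bt:
            splits.append('bt_horizontal')  # width MinBTSize blocks this
        if height >= 2 * min_bt:
            splits.append('bt_vertical')    # height MinBTSize blocks this
    return splits

print(allowed_splits(128, 128, 0))  # exceeds MaxBTSize: ['quad'] only
print(allowed_splits(4, 8, 2))      # width at MinBTSize: ['bt_vertical']
```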
[0059] Figure 2A illustrates an example of a block (for example, a CTB) segmented using QTBT segmentation techniques. As shown in figure 2A, using QTBT segmentation techniques, each of the resulting blocks is segmented symmetrically through the center of each block. Figure 2B illustrates the tree structure corresponding to the block segmentation of figure 2A. The solid lines in figure 2B indicate quadtree segmentation and the dotted lines indicate binary tree segmentation. In an example, at each segmentation (that is, non-leaf) node of the binary tree, a syntax element (for example, an indicator) is signaled to indicate the type of segmentation performed (for example, horizontal or vertical), where 0 indicates horizontal segmentation and 1 indicates vertical segmentation. For quadtree segmentation, it is not necessary to indicate the type of segmentation, since quadtree segmentation always divides a block horizontally and vertically into 4 sub-blocks of equal size.
[0060] As shown in figure 2B, at node 70, block 50 is divided into the four blocks 51, 52, 53 and 54, shown in figure 2A, using quadtree segmentation. Block 54 is not further divided and is therefore a leaf node. At node 72, block 51 is further divided into two blocks using binary tree segmentation. As shown in figure 2B, node 72 is marked with a 1, indicating vertical segmentation. Thus, the segmentation at node 72 results in block 57 and the block including both blocks 55 and 56. Blocks 55 and 56 are created by a further vertical segmentation at node 74. At node 76, block 52 is further divided into two blocks 58 and 59 using binary tree segmentation. As shown in figure 2B, node 76 is marked with a 0, indicating horizontal segmentation.
[0061] At node 78, block 53 is divided into 4 blocks of equal size using quadtree segmentation. Blocks 63 and 66 are created from this quadtree segmentation and are not further divided. At node 80, the upper left block is first divided using vertical binary tree segmentation, resulting in block 60 and a right vertical block. The right vertical block is then divided using horizontal binary tree segmentation into blocks 61 and 62. The lower right block created from the quadtree segmentation at node 78 is further divided at node 84 using horizontal binary tree segmentation into blocks 64 and 65.
[0062] In an example of QTBT segmentation, chroma and luma segmentation can be performed independently for I slices, unlike, for example, HEVC, where quadtree segmentation is performed jointly for chroma and luma blocks. That is, in some examples under study, the luma blocks and the chroma blocks can be segmented separately, so that the luma blocks and the chroma blocks do not directly overlap. Thus, in some examples of QTBT segmentation, chroma blocks can be segmented so that at least one segmented chroma block is not spatially aligned with a single segmented luma block. That is, the luma samples that are juxtaposed with a given chroma block can be within two or more different luma segments.
[0063] The following sections describe techniques for determining parameters for a position-dependent intra prediction combination (PDPC) encoding mode for blocks of video data. When encoding video data using the PDPC encoding mode, video encoder 22 and/or video decoder 30 can use one or more parameterized equations that define how to combine predictions based on filtered and unfiltered reference values, and based on the position of the predicted pixel (or the color component value of a pixel). The present invention describes several sets of parameters, such that video encoder 22 can be configured to test the parameter sets (for example, using rate-distortion analysis) and signal to video decoder 30 the ideal parameters (for example, the parameters resulting in the best rate-distortion performance among the parameters that are tested). In other examples, video decoder 30 can be configured to determine the PDPC parameters from the characteristics of the video data (for example, the block size, the block height, the block width, etc.).
[0064] Figure 3A illustrates a prediction of a 4x4 block (p) using an unfiltered reference (r) according to the techniques of this invention. Figure 3B illustrates a prediction of a 4x4 block (q) using a filtered reference (s) according to the techniques of this invention. Although both figures 3A and 3B illustrate a 4x4 block of pixels and 17 (4x4+1) respective reference values, the techniques of the present invention can be applied to any block size and number of reference values.
[0065] Video encoder 22 and/or video decoder 30, when performing the PDPC encoding mode, can use a combination of filtered (q) and unfiltered (p) predictions, so that a predicted block for a current block to be encoded can be calculated using the pixel values of both the filtered (s) and unfiltered (r) reference arrays.
[0066] In an example of the PDPC techniques, given any two sets of pixel predictions p_r[x, y] and q_s[x, y], calculated using only the unfiltered and filtered references r and s, respectively, the combined predicted value of a pixel, represented by v[x, y], is defined by

v[x, y] = c[x, y]·p_r[x, y] + (1 − c[x, y])·q_s[x, y]    (1)

where c[x, y] is the set of combination parameters. The weight value c[x, y] can be a value between 0 and 1. The sum of the weights c[x, y] and (1 − c[x, y]) can be equal to one.
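Formula (1) is a per-pixel convex combination of the two predictions. A minimal sketch (function name is illustrative):

```python
import numpy as np

def combine_predictions(p_unfiltered, q_filtered, c):
    """Formula (1): v[x, y] = c[x, y]*p_r[x, y] + (1 - c[x, y])*q_s[x, y],
    with each weight c[x, y] in [0, 1]."""
    p = np.asarray(p_unfiltered, dtype=float)
    q = np.asarray(q_filtered, dtype=float)
    c = np.asarray(c, dtype=float)
    return c * p + (1.0 - c) * q

v = combine_predictions([[100, 100]], [[80, 80]], [[1.0, 0.25]])
print(v)  # [[100.  85.]]
```

With c = 1 the combined prediction falls back entirely to the unfiltered prediction; with c = 0 it uses only the filtered one.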
[0067] In some examples, it may not be practical to have a set of parameters as large as the number of pixels in the block. In these examples, c[x, y] can be defined by a much smaller set of parameters, plus an equation to calculate all the combination values from those parameters. In this example, the following formula can be used:

v[x, y] = (c1^(v)·r[x, −1] − c2^(v)·r[−1, −1]) / 2^⌊y/d_v⌋ + (c1^(h)·r[−1, y] − c2^(h)·r[−1, −1]) / 2^⌊x/d_h⌋ + g·((N − min(x, y))/N)·p_r^(HEVC)[x, y] + b[x, y]·q_s^(HEVC)[x, y]    (2)

where c1^(v), c2^(v), c1^(h), c2^(h), g, and d_v, d_h ∈ {1, 2} are prediction parameters, N is the block size, p_r^(HEVC)[x, y] and q_s^(HEVC)[x, y] are prediction values calculated using the HEVC standard, for the specific mode, using the unfiltered and filtered references, respectively, and

b[x, y] = 1 − c1^(v)/2^⌊y/d_v⌋ − c1^(h)/2^⌊x/d_h⌋ − g·(N − min(x, y))/N    (3)

is a normalization factor (that is, to make the total weights assigned to p_r^(HEVC)[x, y] and q_s^(HEVC)[x, y] add to 1), defined by the prediction parameters.
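For a single pixel, formula (2) combines weighted reference samples, the unfiltered prediction, and the filtered prediction. The sketch below follows the reconstruction of formulas (2)-(3) given here (the original equations are garbled in this copy, so the exact term grouping is an assumption); r_top, r_left and r_corner stand for r[x, −1], r[−1, y] and r[−1, −1]:

```python
def pdpc_pixel(x, y, r_top, r_left, r_corner, p_hevc, q_hevc,
               c1v, c2v, c1h, c2h, g, dv, dh, N):
    """One predicted pixel v[x, y] per the parameterized combination
    (sketch with hypothetical parameter values, not trained ones)."""
    wy = 1 << (y // dv)            # 2^floor(y/d_v)
    wx = 1 << (x // dh)            # 2^floor(x/d_h)
    wg = g * (N - min(x, y)) / N   # weight on the unfiltered prediction
    b = 1.0 - c1v / wy - c1h / wx - wg   # normalization, formula (3)
    return ((c1v * r_top - c2v * r_corner) / wy
            + (c1h * r_left - c2h * r_corner) / wx
            + wg * p_hevc + b * q_hevc)

# Degenerate check: with all reference weights zero, v equals q_s[x, y]
print(pdpc_pixel(0, 0, 0, 0, 0, 50.0, 70.0, 0, 0, 0, 0, 0.0, 1, 1, 4))  # 70.0
```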
[0068] Formula 2 can be generalized to any video coding standard in Formula 2A:

v[x, y] = (c1^(v)·r[x, −1] − c2^(v)·r[−1, −1]) / 2^⌊y/d_v⌋ + (c1^(h)·r[−1, y] − c2^(h)·r[−1, −1]) / 2^⌊x/d_h⌋ + g·((N − min(x, y))/N)·p_r^(STD)[x, y] + b[x, y]·q_s^(STD)[x, y]    (2A)

where c1^(v), c2^(v), c1^(h), c2^(h), g, and d_v, d_h ∈ {1, 2} are prediction parameters, N is the block size, p_r^(STD)[x, y] and q_s^(STD)[x, y] are prediction values calculated using a video coding standard (or video coding scheme or algorithm), for the specific mode, using the unfiltered and filtered references, respectively, and

b[x, y] = 1 − c1^(v)/2^⌊y/d_v⌋ − c1^(h)/2^⌊x/d_h⌋ − g·(N − min(x, y))/N    (3A)

is a normalization factor (that is, to make the total weights assigned to p_r^(STD)[x, y] and q_s^(STD)[x, y] add to 1), defined by the prediction parameters.
[0069] These prediction parameters can include weights to provide an ideal linear combination of the predicted terms according to the type of intraprediction mode used (for example, the DC, planar and 33 directional modes of HEVC). For example, HEVC contains 35 intraprediction modes. A lookup table can be constructed with values for each of the prediction parameters c1^(v), c2^(v), c1^(h), c2^(h), g, d_v and d_h for each of the intraprediction modes (that is, 35 values of c1^(v), c2^(v), c1^(h), c2^(h), g, d_v and d_h, one set per intraprediction mode). These values can be encoded in a bit stream with the video, or they can be constant values known by the encoder and decoder ahead of time that need not be transmitted in a file or bit stream. The values for c1^(v), c2^(v), c1^(h), c2^(h), g, d_v and d_h can be determined by an optimization training algorithm, finding the values for the prediction parameters that give the best compression for a set of training videos.
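A per-mode lookup table like the one described above can be represented as a simple mapping. The parameter values below are placeholders, not trained parameters:

```python
# Hypothetical LUT: each of the 35 HEVC intra modes maps to one tuple
# (c1v, c2v, c1h, c2h, g, dv, dh). Values here are illustrative only.
PDPC_PARAMS = {mode: (27, 10, 27, 10, 16, 1 if mode < 18 else 2, 1)
               for mode in range(35)}

def lookup_pdpc_params(intra_mode):
    """Fetch the prediction parameters for an intra mode from the LUT,
    as a decoder known-in-advance table would be consulted."""
    return PDPC_PARAMS[intra_mode]

print(lookup_pdpc_params(0))  # parameters for planar mode
```

Keeping the table constant on both sides avoids transmitting the parameters, at the cost of not adapting them to the content.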
[0070] In another example, there is a plurality of predefined prediction parameter sets for each intraprediction mode (for example, in a lookup table), and the selected set of prediction parameters (but not the parameters themselves) is transmitted to a decoder in an encoded file or bit stream. In another example, the values for c1^(v), c2^(v), c1^(h), c2^(h), g, d_v and d_h can be generated dynamically by a video encoder and transmitted to a decoder in an encoded file or bit stream.
[0071] In another example, instead of using HEVC prediction, a video encoding device that performs these techniques can use a modified version of HEVC, such as the one that uses 65 directional predictions instead of 33 directional predictions. In fact, any type of intraframe prediction can be used.
[0072] In another example, the formula can be chosen to facilitate calculations. For example, the following type of predictor can be used:

v[x, y] = (c1^(v)·r[x, −1] − c2^(v)·r[−1, −1]) / 2^⌊y/d_v⌋ + (c1^(h)·r[−1, y] − c2^(h)·r[−1, −1]) / 2^⌊x/d_h⌋ + b[x, y]·p̄_(a,r,s)^(HEVC)[x, y]    (4)

where

b[x, y] = 1 − c1^(v)/2^⌊y/d_v⌋ − c1^(h)/2^⌊x/d_h⌋    (5)

and

p̄_(a,r,s)^(HEVC)[x, y] = a·p_r^(HEVC)[x, y] + (1 − a)·q_s^(HEVC)[x, y]    (6)

[0073] This approach can exploit the linearity of the HEVC (or other) prediction. Defining h as the impulse response of a filter k from a predefined set, if

s = a·r + (1 − a)·(h * r)    (7)

where * represents convolution, then

p̄_(a,r,s)^(HEVC)[x, y] = p_s^(HEVC)[x, y]    (8)

that is, the linearly combined prediction can be calculated from the linearly combined reference.
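The linearity property of formula (7) can be illustrated directly: the combined reference is a blend of the raw reference and its filtered version. A sketch with an assumed 3-tap smoothing kernel:

```python
import numpy as np

def combined_reference(r, h, a):
    """Formula (7): s = a*r + (1 - a)*(h * r), where * is convolution
    and h is the impulse response of a smoothing filter (sketch with an
    illustrative kernel; boundary handling is simplified)."""
    r = np.asarray(r, dtype=float)
    filtered = np.convolve(r, h, mode='same')
    return a * r + (1.0 - a) * filtered

s = combined_reference([4, 8, 4], h=[0.25, 0.5, 0.25], a=0.5)
print(s)  # [4. 7. 4.]
```

Because prediction is linear in the reference, predicting from s gives the same result as blending the two separate predictions, which is the computational shortcut formula (8) expresses.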
[0074] Formulas 4, 6 and 8 can be generalized to any video coding standard in formulas 4A, 6A and 8A:

v[x, y] = (c1^(v)·r[x, −1] − c2^(v)·r[−1, −1]) / 2^⌊y/d_v⌋ + (c1^(h)·r[−1, y] − c2^(h)·r[−1, −1]) / 2^⌊x/d_h⌋ + b[x, y]·p̄_(a,r,s)^(STD)[x, y]    (4A)

where

b[x, y] = 1 − c1^(v)/2^⌊y/d_v⌋ − c1^(h)/2^⌊x/d_h⌋    (5A)

and

p̄_(a,r,s)^(STD)[x, y] = a·p_r^(STD)[x, y] + (1 − a)·q_s^(STD)[x, y]    (6A)

This approach can exploit the linearity of the prediction of the coding standard. Defining h as the impulse response of a filter k from a predefined set, if

s = a·r + (1 − a)·(h * r)    (7A)

where * represents convolution, then

p̄_(a,r,s)^(STD)[x, y] = p_s^(STD)[x, y]    (8A)

that is, the linearly combined prediction can be calculated from the linearly combined reference.
[0075] In one example, the prediction functions can use the reference vector (for example, r or s) only as input. In this example, the behavior of the reference vector does not change based on whether the reference has been filtered or not. If r and s are equal (for example, an unfiltered reference r happens to be the same as a filtered reference s), then the prediction functions applied to the filtered and unfiltered references are the same; for example, p_r[x, y] (also written as p(x, y, r)) is equal to q_s[x, y] (also written as p(x, y, s)). In addition, the pixel predictions p and q can be equivalent (for example, producing the same output given the same input). In one example, formulas (1)-(8) can be rewritten with the pixel prediction p[x, y] replacing the pixel prediction q[x, y].
[0076] In another example, the prediction (for example, the sets of functions) may change, depending on the information that a reference has been filtered. In this example, different sets of functions can be indicated (for example, p_r[x, y] and q_s[x, y]). In this case, even if r and s are equal, p_r[x, y] and q_s[x, y] may not be equal. In other words, the same input may create a different output, depending on whether the input has been filtered or not. In this example, p[x, y] may not be able to be replaced by q[x, y].
[0077] An advantage of the prediction equations shown is that, with the parameterized formulation, sets of ideal parameters can be determined (that is, those that optimize the accuracy of the prediction) for different types of video textures, using techniques such as training. This approach, in turn, can be extended in some examples by calculating several sets of prediction parameters for some common types of textures, and with a compression scheme where the encoder tests the predictors of each set and encodes as side information the set that provides the best compression.
[0078] In some examples of the techniques described above, when the PDPC encoding mode is activated, the PDPC parameters used to weight the intraprediction and to control the use of unfiltered or filtered samples in the PDPC mode are pre-calculated and stored in a lookup table (LUT). In one example, video decoder 30 determines the PDPC parameters according to the block size and the intraprediction direction. Previous techniques for the PDPC encoding mode assumed that the intrapredicted blocks are always square in size.
[0079] In HEVC and JEM examples, an intra reference can be smoothed. For example, a filter can be applied to an intra reference. In HEVC, mode-dependent intra smoothing (MDIS) is used, so that a filter is applied to an intra reference (neighboring samples relative to a currently encoded block) before generating the intraprediction from the intra reference. Video encoder 22 and video decoder 30 can derive the particular intraprediction modes for which MDIS is activated based on the proximity of the current intraprediction mode to a horizontal or vertical direction. The modes for which MDIS is enabled can be obtained based on the absolute difference between the intra mode index of the current mode and the horizontal and vertical mode indices. If the absolute difference exceeds a certain limit (for example, the limit may be dependent on the block size), the MDIS filter is not applied; otherwise, it is applied. In other words, in intra modes that are distant from the horizontal or vertical directions (for example, compared to a limit), the intra reference filter is applied. In some instances, MDIS does not apply to non-angular modes, such as DC or planar.
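The MDIS decision described above reduces to comparing the mode's distance from the horizontal and vertical mode indices against a threshold. A sketch using the HEVC mode numbering (planar = 0, DC = 1, horizontal = 10, vertical = 26); the threshold value is illustrative, since in practice it depends on the block size:

```python
def mdis_enabled(intra_mode, threshold, hor_idx=10, ver_idx=26):
    """Whether mode-dependent intra smoothing applies: the filter is
    used for angular modes far from both the horizontal and vertical
    directions. Planar (0) and DC (1) are excluded."""
    if intra_mode in (0, 1):      # non-angular modes: no MDIS
        return False
    dist = min(abs(intra_mode - hor_idx), abs(intra_mode - ver_idx))
    return dist > threshold

print(mdis_enabled(18, threshold=7))  # diagonal mode, far from H/V -> True
print(mdis_enabled(26, threshold=7))  # pure vertical -> False
```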
[0080] In JEM, MDIS has been replaced with a smoothing filter coding mode (for example, adaptive reference sample filtering (RSAF) or adaptive reference sample smoothing (ARSS)), which, in some examples, can be applied to all intraprediction modes except a DC mode. In general, these techniques can be referred to as intra reference sample smoothing filters. Video encoder 22 can be configured to generate and signal a syntax element (for example, an indicator) that indicates whether the intra reference sample smoothing filter is applied to the current block. In some instances, video encoder 22 may not be configured to explicitly encode the syntax element that indicates whether the filter is applied to the current block. In the context of the present invention, the explicit encoding of a syntax element refers to the coding of a syntax element value in an encoded video bit stream. That is, explicit encoding can refer to video encoder 22 generating a value for a syntax element and explicitly encoding the value in an encoded video bit stream. Likewise, explicit decoding can refer to video decoder 30 receiving a value of a syntax element in an encoded bit stream and explicitly decoding the value of the syntax element.
[0081] In some examples, video encoder 22 is not configured to signal and explicitly encode a syntax element (for example, an indicator) that indicates whether an intra reference sample smoothing filter is applied to the current block of video data. Instead, video encoder 22 is configured to hide the indicator value in the transform coefficients. That is, the value of the indicator that indicates whether the intra reference sample smoothing filter applies to a current block is not explicitly encoded, but instead can be determined by video decoder 30 (for example, decoded) based on certain values or characteristics of the transform coefficients associated with the current block. For example, if the transform coefficients meet a certain parity condition (for example, having an even or odd value), video decoder 30 derives the indicator as having a value of 1; otherwise, video decoder 30 derives the indicator value as 0, or vice versa.
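A concrete instance of the parity-based hiding described above would derive the flag from the parity of a sum over the block's coefficients. The exact parity condition is left open by the text, so the rule below is one plausible choice, not the patent's specified one:

```python
def derive_hidden_flag(coeffs):
    """Derive the intra smoothing flag from transform-coefficient
    parity instead of an explicit bit: odd magnitude sum -> 1,
    even -> 0 (illustrative parity rule)."""
    return sum(abs(c) for c in coeffs) & 1

print(derive_hidden_flag([3, -2, 0, 1]))  # |3|+|2|+|0|+|1| = 6 -> flag 0
```

On the encoder side, this scheme forces the encoder to adjust a coefficient whenever the natural parity disagrees with the flag it wants to convey, which is the complexity the explicit-signaling approach of this invention avoids.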
[0082] In the context of the invention, the term decoding can generally encompass both explicit and implicit decoding of a syntax element value. In explicit decoding, an encoded syntax element is present in the encoded video bit stream. Video decoder 30 explicitly decodes the encoded syntax element to determine the value of the syntax element. In implicit decoding, the syntax element is not sent in the encoded video bit stream. Instead, video decoder 30 derives a value for the syntax element from video coding statistics (for example, transform coefficient parity) based on some predetermined criteria.
[0083] Another tool used in JEM is the PDPC mode. As described above, the PDPC is a coding mode that weights the intra reference samples and the intra predictor, where the weights can be derived based on the block size (including width and height) and the intraprediction mode.
[0084] The following content describes example techniques of this invention for determining prediction directions, determining prediction modes, determining encoding modes, making determinations regarding the use of intra filtering in video coding (for example, video encoding and/or video decoding), and the explicit encoding and signaling of syntax elements. The techniques described here can be used in any combination and in conjunction with other techniques. In some examples, the coding techniques of this invention can be performed using syntax elements (for example, indicators), which can be explicitly encoded and signaled, hidden in transform coefficient information or another location, derived at both video encoder 22 and video decoder 30 without signaling, and the like.
[0085] The techniques of this invention are described with reference to intra reference sample smoothing filters and the PDPC mode (coding modes in general). The reference sample smoothing and the PDPC mode discussed below are used for illustration and description purposes. The techniques of this invention are not limited to these examples, and the described techniques can be applied to other video coding modes, techniques and tools.
[0086] Initially, techniques related to an intra reference sample smoothing filter syntax element (for example, an indicator) are discussed. This invention proposes that video encoder 22 generate and/or signal an intra reference sample smoothing filter indicator explicitly. That is, video encoder 22 can be configured to explicitly encode a syntax element that indicates whether a particular encoding mode (for example, an intra reference sample smoothing filter) will be used to encode a block of video data. For example, video encoder 22 can generate and signal an intra reference sample smoothing filter indicator in an encoded video bit stream. In this way, video encoder 22 can avoid any need to modify the transform coefficients to make sure that the parity condition holds (for example, that the parity condition of the transform coefficients correctly indicates the value of the indicator), as may be done when the intra smoothing indicator is not explicitly encoded. This technique can save complexity at video encoder 22. Video decoder 30 can be configured to receive the explicitly encoded syntax element (for example, the intra reference sample smoothing filter indicator) in the encoded video data bit stream, for example, instead of deriving the indicator value from the parity of the transform coefficients. Video decoder 30 can then explicitly decode the value of the intra reference sample smoothing filter indicator.
[0087] However, in some examples, the encoding of the intra reference sample smoothing filter syntax element can be a burden for some blocks (that is, it can undesirably increase the number of bits used to encode the block). For example, when the residual information related to the block is small, and few bits are used to encode the block, the bit used to signal the syntax element (for example, the intra reference sample smoothing filter indicator) may result in a higher than desired bit rate. To solve this potential problem, video encoder 22 can be configured to explicitly encode and signal the intra reference sample smoothing filter indicator only if a block of video data has a certain number of non-zero transform coefficients, or if the number of non-zero transform coefficients meets or exceeds a certain limit. For example, the limit can be 3, which means that if a block of video data has 3 or more non-zero transform coefficients, video encoder 22 signals (for example, explicitly encodes) the intra reference sample smoothing filter indicator. Otherwise, video encoder 22 does not explicitly encode the intra reference sample smoothing filter indicator. Other examples of limits include 0, 1, 2 or any number of non-zero transform coefficients.
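The encoder-side rule above is a simple count against a threshold. A sketch with the example limit of 3 (function name is illustrative):

```python
def should_signal_filter_flag(coeffs, limit=3):
    """Signal the intra reference sample smoothing flag only when the
    block has at least `limit` non-zero transform coefficients
    (limit = 3 in the text's example)."""
    nonzero = sum(1 for c in coeffs if c != 0)
    return nonzero >= limit

print(should_signal_filter_flag([5, 0, -1, 2]))  # 3 non-zero -> True
print(should_signal_filter_flag([5, 0, 0, 0]))   # 1 non-zero -> False
```

A decoder applying the same count knows, without any extra bits, whether to parse the flag or to infer its value.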
[0088] Thus, according to an example of the invention, video encoder 22 can be configured to determine an encoding mode (for example, the use of an intra reference sample smoothing filter) to encode a first block of video data. Based on whether or not the intra reference sample smoothing filter is used for the first block of video data, video encoder 22 can be configured to explicitly encode a first syntax element (for example, an intra reference sample smoothing filter indicator) indicating whether the encoding mode (for example, an intra reference sample smoothing filter) will be used for the first block of video data in the case where the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit. That is, if the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to the limit, video encoder 22 explicitly encodes the first syntax element. Video encoder 22 can signal the first syntax element in an encoded video bit stream.
[0089] For a second block of video data, video encoder 22 can be configured not to encode a value of the syntax element (for example, an intra reference sample smoothing filter indicator) indicating whether the encoding mode will be used for the second block of video data in the event that the second block of video data is associated with a number of non-zero transform coefficients below the limit. That is, the second block of video data is associated with a number of non-zero transform coefficients below the limit.
[0090] Reciprocally, video decoder 30 can be configured to receive the first block of video data, and receive a first syntax element (for example, an intra reference sample smoothing filter indicator) indicating whether the encoding mode (for example, the use of an intra reference sample smoothing filter) will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit. Video decoder 30 can also be configured to explicitly decode the value of the first received syntax element, and apply the encoding mode (for example, the use of an intra reference sample smoothing filter) to the first block of video data in accordance with a value of the first syntax element.
[0091] In the case where the video encoder 22 does not explicitly encode the syntax element (for example, for the second video data block discussed above), the video decoder 30 can be configured to receive the second block of video data, infer a value of a second syntax element indicating whether the coding mode (for example, the intra reference sample smoothing filter) will be used for the second block of video data in the event that the second block of video data is associated with a number of non-zero transform coefficients below the limit, and apply the encoding mode (for example, the use of an intra reference sample smoothing filter) in accordance with the inferred value of the second syntax element. As will be discussed in more detail below, the video decoder 30 can be configured to use one or more techniques to infer a syntax element value, including inferring the syntax element value from characteristics of the transform coefficients associated with the video data block, and / or inferring the value of the syntax element based on some predefined default value (for example, always apply the intra reference sample smoothing filter, never apply the intra reference sample smoothing filter, apply a default filter, etc.).
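The decoder-side counterpart of paragraphs [0090] and [0091] can be sketched as follows. In this hedged sketch, THRESHOLD, DEFAULT_FILTER_ON and the list-based bit reader are all assumptions introduced for illustration.

```python
# Illustrative sketch: below the limit the flag is never read from the
# bitstream, and a predefined default value is inferred instead.

THRESHOLD = 2
DEFAULT_FILTER_ON = False  # e.g. "never apply the filter" as the default rule

def decode_smoothing_flag(coeffs, bit_reader):
    nonzero = sum(1 for c in coeffs if c != 0)
    if nonzero >= THRESHOLD:
        return bit_reader.pop(0) == 1  # explicitly decoded from the bitstream
    return DEFAULT_FILTER_ON           # inferred; no bit is consumed

stream = [1]
explicit = decode_smoothing_flag([5, 0, -3, 0, 1], stream)  # reads the bit
inferred = decode_smoothing_flag([0, 0, 1, 0], stream)      # uses the default
```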
[0092] In the examples above, the encoding mode is the use of an intra reference sample smoothing filter. In other examples discussed below, the encoding mode indicated by the explicitly encoded syntax element can be the PDPC mode. However, the techniques of this invention can be used with other encoding modes.
[0093] In some examples, video encoder 22 can be configured to compare the number of non-zero transform coefficients associated with a video data block to the limit together for both luma and chroma components of the video data block when determining whether or not to explicitly encode a syntax element for an encoding mode. That is, video encoder 22 can consider the number of non-zero coefficients for the luma blocks and the chroma blocks together. The video decoder 30 can be configured to perform the same comparison as video encoder 22 when determining whether a syntax element for an encoding mode has been explicitly encoded or not and whether it will be received.
[0094] In other examples, video encoder 22 can be configured to compare only the non-zero transform coefficients for a luma block when determining whether or not to explicitly encode a syntax element for an encoding mode. In this example, video encoder 22 can be configured to generate syntax elements for encoding modes separately for luma blocks and chroma blocks. Thus, even in this example, video encoder 22 can consider only non-zero transform coefficients for chroma blocks when determining whether or not to explicitly encode a syntax element for an encoding mode for a chroma block. Again, the video decoder 30 can be configured to perform the same comparison as the video encoder 22 when determining whether a syntax element for an encoding mode has been explicitly encoded and whether it will be received for the luma and / or chroma encoding blocks.
[0095] In another example, the way in which video encoder 22 and video decoder 30 are configured to count non-zero transform coefficients to determine the explicit encoding of a syntax element may be dependent on the type of slice. For example, video encoder 22 and video decoder 30 can be configured to use one non-zero transform coefficient counting technique for slices I and to use a different non-zero transform coefficient counting technique for slices that are not slices I (for example, slices P or slices B).
[0096] In another example, video encoder 22 and video decoder 30 can be configured to count non-zero transform coefficients using a technique that depends on whether the luma and chroma components are encoded together or separately. For example, in some segmentation structures, the luma and chroma components have the same segmentation structure. In other segmentation structures (for example, examples of QTBT segmentation), the luma and chroma components can be segmented independently, such that their respective segmentation structures differ from one another. In this example, separate coding can mean that the luma and chroma blocks can have different representations of segmentation or tree structures. In this example, when separate and / or independent luma / chroma coding is enabled for slices I, video encoder 22 and video decoder 30 can be configured to count nonzero transform coefficients for the luma components. For slices other than I, when separate encoding is not enabled, video encoder 22 and video decoder 30 can be configured to count non-zero transform coefficients together for both chroma and luma transform coefficients, or only for the luma transform coefficients.
[0097] In another example, when video encoder 22 and video decoder 30 are configured to count non-zero coefficients for both chroma and luma components, the non-zero coefficient counting is performed per component. For example, video encoder 22 and video decoder 30 can include three non-zero coefficient counters: one counter for each color component (for example, Y, Cb and Cr). In another example, video encoder 22 and video decoder 30 can include two counters: one counter for the luma component and one counter for both chroma components. In this example, the limit can be defined per component, and the limit value can be different for different color components.
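The per-component counting option of paragraph [0097] can be sketched as follows. The dict layout, function names and the example limit values are illustrative assumptions only.

```python
# Sketch: one non-zero coefficient counter per color component (Y, Cb, Cr),
# each compared against its own, possibly different, limit.

def count_per_component(blocks):
    """blocks maps a component name to its list of transform coefficients."""
    return {comp: sum(1 for c in coeffs if c != 0)
            for comp, coeffs in blocks.items()}

limits = {"Y": 3, "Cb": 2, "Cr": 2}  # the limit may differ per component

def component_meets_limit(blocks, comp):
    return count_per_component(blocks)[comp] >= limits[comp]

blocks = {"Y": [4, 0, -1, 2], "Cb": [0, 1], "Cr": [0, 0]}
# Y has 3 non-zero coefficients, Cb has 1, Cr has 0
```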
[0098] In one example, the limit used to explicitly code and / or signal the smoothing filter indicator of the intra reference sample is the same limit used to explicitly code and signal the primary and / or secondary transform indicators or indices. In this example, there is a unification between different video encoding techniques (for example, between the signaling of the transform and the signaling of the smoothing filter indicator of the intra reference sample), and a non-zero coefficient count and limit can be used, which can simplify implementation.
[0099] In another example, video encoder 22 and / or video decoder 30 may determine to explicitly encode the intra reference sample smoothing filter indicator based on a limit of non-zero transform coefficients only for blocks without transform skip. That is, for transform skip blocks, the video encoder 22 and the video decoder 30 may not explicitly encode an intra reference sample smoothing filter indicator. For blocks without transform skip (that is, blocks to which a transform is applied), video encoder 22 and video decoder 30 can explicitly encode the intra reference sample smoothing filter indicator. Transform skip is a method in which the horizontal or vertical transform, or both transforms, are not applied to the residue of a block, that is, they are skipped. The transform can be any transform: primary or secondary, or both.
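The transform-skip restriction of paragraph [0099] can be sketched as a simple predicate. The Block record, its field names and the default limit are hypothetical, introduced only for illustration.

```python
# Minimal sketch: the flag is a candidate for explicit coding only when a
# transform is actually applied to the block.

from collections import namedtuple

Block = namedtuple("Block", ["nonzero_count", "transform_skip"])

def flag_is_signaled(block, limit=2):
    if block.transform_skip:
        return False  # no transform applied: the flag is never explicitly coded
    return block.nonzero_count >= limit
```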
[0100] In another example, video encoder 22 and / or video decoder 30 may determine to explicitly encode the intra reference sample smoothing filter indicator based on a limit of non-zero transform coefficients only for blocks encoded with a particular intraprediction mode. For example, video encoder 22 and / or video decoder 30 may determine to explicitly encode the intra reference sample smoothing filter indicator based on a limit of non-zero transform coefficients for blocks encoded with intraprediction modes different from a planar mode, a linear model (LM) prediction mode or a DC mode. For example, if the block of a given component (for example, the luma or chroma component) is encoded using planar mode, video encoder 22 and / or video decoder 30 would not consider the number of non-zero transform coefficients for that component when determining whether to explicitly encode the intra reference sample smoothing filter indicator. In this way, video encoder 22 is configured to explicitly encode the intra reference sample smoothing filter indicator based on an intraprediction mode used to encode the video data block. Likewise, video decoder 30 is configured to receive the intra reference sample smoothing filter indicator based on an intraprediction mode used to encode the video data block.
[0101] In another example, in addition to comparing the number of non-zero transform coefficients to a limit, video encoder 22 and video decoder 30 can apply a block size limit to determine whether or not to explicitly encode an intra reference sample smoothing filter indicator. For example, video encoder 22 can be configured to explicitly encode and signal an intra reference sample smoothing filter indicator for blocks with a size greater than or equal to a predetermined minimum size and less than a predetermined maximum block size, where the minimum and maximum block sizes can be configurable or fixed for both video encoder 22 and video decoder 30. Likewise, video decoder 30 can be configured to explicitly receive and decode an intra reference sample smoothing filter indicator for blocks with a size greater than or equal to the predetermined minimum size and less than the predetermined maximum block size.
[0102] Thus, in this example, video encoder 22 can be configured to explicitly encode the intra reference sample smoothing filter indicator in the case where the first block of video data is greater than or equal to a predetermined size. Likewise, the video decoder 30 can be configured to explicitly receive and decode the intra reference sample smoothing filter indicator in the case where the first block of video data is greater than or equal to a predetermined size.
[0103] The minimum block size limit can be set to be greater than or equal to 8x8, meaning that all blocks smaller than 8x8 (for example, 4x4, 4x8, 8x4 and the like) are restricted and an intra reference sample smoothing filter indicator is not signaled for these blocks. Likewise, the maximum block limit can be, for example, set to 32x32. In another example, the limit can be expressed as width * height. That is, 8x8 is converted to 64 and 32x32 is converted to 1024. To check whether the current block is restricted from explicitly encoding the intra reference sample smoothing filter indicator, video encoder 22 and video decoder 30 can check the width * height of the block against the limit.
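The width * height check of paragraphs [0101] to [0103] can be sketched as follows: 8x8 becomes 64 and 32x32 becomes 1024. Whether the maximum bound is inclusive is not fully specified in the text; this sketch assumes "greater than or equal to the minimum and less than the maximum", and the function name is an assumption.

```python
# Sketch of the block-size restriction expressed as width * height.

MIN_AREA = 8 * 8     # 64: all blocks smaller than 8x8 are restricted
MAX_AREA = 32 * 32   # 1024

def size_allows_flag(width, height):
    area = width * height
    return MIN_AREA <= area < MAX_AREA
```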
[0104] In any of the above examples where the intra reference sample smoothing filter indicator is not explicitly encoded and / or signaled, the video decoder 30 can be configured to apply some default smoothing filter to the video data block. For example, the video decoder 30 can apply an MDIS filter (which is mode dependent), the video decoder 30 can apply any other filter, or the video decoder 30 can apply no filtering.
[0105] In other examples of the invention, video encoder 22 may be configured to explicitly encode and signal an indicator (for example, an intra reference sample smoothing filter indicator) only for certain intraprediction modes. For example, video encoder 22 can be configured to explicitly encode and signal an intra reference sample smoothing filter indicator for intraprediction modes where MDIS can be enabled (for example, MDIS modes), for the MDIS modes plus a planar mode, or for any other subset of the available intraprediction modes.
[0106] In another example, video encoder 22 and video decoder 30 are configured to apply an intra reference sample smoothing filter to blocks of video data that are encoded with intraprediction modes that are far (for example, relative to a limit) from the horizontal or vertical directions. In addition, or optionally, video encoder 22 and video decoder 30 are configured to apply an intra reference sample smoothing filter to video data blocks encoded using a planar intraprediction mode or other angular intraprediction modes. The video encoder 22 and the video decoder 30 can be configured to obtain a subset of intraprediction modes used to determine the application of an intra reference sample smoothing filter. The video encoder 22 and the video decoder 30 can be configured to obtain the subset based on the directions of the intraprediction modes. In one example, video encoder 22 and video decoder 30 can be configured to obtain the subset of intraprediction modes based on how far or how close (for example, based on a limit) the indices of the intraprediction modes are in relation to the indices of the horizontal, vertical and / or diagonal intraprediction modes. Another separate subset of the intraprediction modes can be assigned to non-angular directions, such as planar and / or DC intra modes and the like.
[0107] In another example, video encoder 22 and video decoder 30 are configured to explicitly encode and signal an intra reference sample smoothing filter indicator for different color components of a video data block. For example, video encoder 22 and video decoder 30 are configured to explicitly encode and signal an indicator for the luma component. In addition, video encoder 22 and video decoder 30 are configured to explicitly encode and signal an indicator for the chroma Cb and chroma Cr components. The signaling of the indicator for one component may depend on the value of the indicator already signaled for another component. For example, video encoder 22 can be configured to explicitly encode and signal an intra reference sample smoothing filter indicator for the chroma and luma components. When signaling the indicator for chroma, the entropy encoding / parsing of that indicator by video encoder 22 and video decoder 30, respectively, may depend on the value of the indicator signaled for luma. The dependency can be reflected by, but is not limited to, the context value.
[0108] In another example, an intra reference sample smoothing filter indicator may not be signaled, but instead be obtained by the video decoder 30 according to the intraprediction mode index of the video data block being decoded. For example, blocks of video data encoded using an intraprediction mode with an even mode index use an intra reference sample smoothing filter (the indicator is enabled), and blocks of video data encoded using an intraprediction mode with an odd mode index do not have an intra reference sample smoothing filter applied (the indicator is disabled), or vice versa.
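The parity rule of paragraph [0108] can be sketched in a few lines: no flag is signaled, and the decoder derives the filter decision from the intraprediction mode index alone. The function name and the even_means_on switch (covering the "or vice versa" case) are assumptions.

```python
# Sketch: the filter decision follows the parity of the intra mode index.

def smoothing_enabled(intra_mode_index, even_means_on=True):
    is_even = intra_mode_index % 2 == 0
    return is_even if even_means_on else not is_even
```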
[0109] In some examples, video encoder 22 and video decoder 30 may apply intra smoothing to a first block with a particular intra mode, and not apply intra smoothing to a neighboring block with an intra mode that is similar to the intra mode of the first block; this can provide better variety for intraprediction. This is because neighboring intraprediction mode directions (for example, intraprediction mode directions that are close to each other, or close to each other in relation to a limit) may offer similar intrapredictors (since the direction is similar), but the smoothing indicator may still differentiate the predictors. In one example, the video encoder 22 and the video decoder 30 can be configured to perform intra smoothing for every other intraprediction mode. For example, intra smoothing can be performed for intraprediction modes with an even index and intra smoothing may not be performed for intraprediction modes with an odd index, or vice versa. In other examples, intra smoothing can be performed for every third intraprediction mode, for every other intraprediction mode, or for any subset of intraprediction modes.
[0110] Furthermore, in this case there is no need to have the intra reference sample smoothing filter indicator explicitly signaled, and bits can be saved. Non-angular intraprediction modes can be associated with a separate rule. For example, video encoder 22 and video decoder 30 can be configured to always apply intra reference sample smoothing to video data blocks encoded using a planar intraprediction mode. In another example, video encoder 22 and video decoder 30 can be configured to never apply intra reference sample smoothing to video data blocks encoded using a planar intraprediction mode. In yet other examples, video encoder 22 and video decoder 30 can be configured to explicitly encode an intra reference sample smoothing filter indicator to indicate whether intra reference sample smoothing will be applied to video data blocks encoded using a planar intraprediction mode.
[0111] In some examples, the context modeling (that is, the contexts used for entropy coding, such as CABAC) for the entropy coding of the intra reference sample smoothing filter indicator may be dependent on the intraprediction mode. For example, video encoder 22 and video decoder 30 can be configured to use one context to entropy encode the intra reference sample smoothing filter indicator for some intraprediction modes, and video encoder 22 and video decoder 30 can be configured to use other context(s) to entropy encode the intra reference sample smoothing filter indicator for other intraprediction modes. The context assignment can be based on a subset of intraprediction modes from the available intraprediction modes. That is, video encoder 22 and video decoder 30 can be configured to assign the contexts used to encode the intra reference sample smoothing filter indicator based on the subset of intraprediction modes to which the intraprediction mode of the current video data block belongs. For example, the subset of intraprediction modes can be non-angular modes, angular modes, modes to which MDIS is applied and / or the planar mode. The video encoder 22 and video decoder 30 can be configured to obtain the subset to which the current video data block belongs based on the proximity of the current intraprediction mode to specific modes (for example, based on a limit). For example, video encoder 22 and video decoder 30 can be configured to determine the proximity of the index of the current intraprediction mode to the index of a horizontal intraprediction mode, a vertical intraprediction mode, a diagonal intraprediction mode or another intraprediction mode. Another separate subset can be assigned to non-angular directions, such as planar and / or DC intra modes and the like.
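The mode-dependent context assignment of paragraph [0111] can be sketched as follows. This is a hedged approximation: the mode indices (HOR = 18, VER = 50, VVC-style numbering), the non-angular set and the distance limit are assumptions, not values taken from the text.

```python
# Sketch: pick a context index for the flag from the mode subset the current
# intra mode falls in, approximated by the distance of its index from the
# horizontal and vertical mode indices.

HOR_IDX, VER_IDX = 18, 50
NON_ANGULAR = {0, 1}  # e.g. planar and DC

def flag_context(intra_mode_index, limit=4):
    if intra_mode_index in NON_ANGULAR:
        return 0  # separate subset (and context) for non-angular modes
    near_hv = min(abs(intra_mode_index - HOR_IDX),
                  abs(intra_mode_index - VER_IDX)) <= limit
    return 1 if near_hv else 2
```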
[0112] Techniques for signaling the PDPC mode will now be discussed. In an example of the invention, the use of the PDPC mode can be restricted, and the video encoder 22 is configured to not signal a PDPC indicator for the restricted cases. The PDPC indicator, or more generally the PDPC syntax element, indicates whether the PDPC mode is used for a given block of video data. The restriction can be imposed in a similar way to the techniques for the intra reference sample smoothing filter indicator discussed above.
[0113] In one example, video encoder 22 can be configured to explicitly encode and signal the PDPC mode indicator if a video data block has a certain number of non-zero transform coefficients, or if the number of non-zero transform coefficients exceeds a certain limit. For example, the limit can be 3, which means that if a video data block has 3 or more non-zero transform coefficients, video encoder 22 signals (for example, explicitly encodes) the PDPC mode indicator. Otherwise, video encoder 22 does not explicitly encode the PDPC mode indicator. In some examples, the limit can be the same as that used for signaling the transform indices. Other examples of limits include 0, 1, 2 or any number of non-zero transform coefficients. In one example, the limit is equal to 2, which means that video encoder 22 signals the PDPC mode indicator if the video data block has more than 1 non-zero transform coefficient.
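The PDPC signaling rule of paragraph [0113], with the limit equal to 2, can be sketched as follows; the function and parameter names are illustrative assumptions.

```python
# Sketch: the PDPC mode indicator is signaled only when the block has at
# least `limit` non-zero transform coefficients (with limit = 2, that is
# "more than 1 non-zero coefficient").

def pdpc_flag_signaled(coeffs, limit=2):
    nonzero = sum(1 for c in coeffs if c != 0)
    return nonzero >= limit
```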
[0114] Thus, according to an example of the invention, video encoder 22 can be configured to determine an encoding mode (for example, the use of PDPC mode) to encode a first block of video data. Based on whether or not PDPC mode is used for the first block of video data, video encoder 22 can be configured to explicitly encode a first syntax element (for example, a PDPC mode indicator) indicating whether the encoding mode (for example, PDPC mode) will be used for the first video data block in the event that the first video data block is associated with a number of non-zero transform coefficients greater than or equal to a limit. That is, if the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to the limit, video encoder 22 explicitly encodes the first syntax element. Video encoder 22 can signal the first syntax element in an encoded video bit stream.
[0115] For a second block of video data, video encoder 22 can be configured to not encode a syntax element value (for example, a PDPC mode indicator) indicating whether the encoding mode will be used for the second video data block in the event that the second video data block is associated with a number of non-zero transform coefficients below the limit. That is, the second block of video data is associated with a number of non-zero transform coefficients below the limit.
[0116] Conversely, the video decoder 30 can be configured to receive the first block of video data, and receive a first syntax element (for example, a PDPC mode indicator) indicating whether the encoding mode (for example, the use of PDPC mode) will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit. The video decoder 30 can further be configured to explicitly decode the value of the first received syntax element, and apply the encoding mode (for example, the PDPC mode) to the first video data block in accordance with a value of the first syntax element.
[0117] In the case where video encoder 22 does not explicitly encode the syntax element (for example, for the second block of video data discussed above), the video decoder 30 can be configured to receive the second block of video data, infer a value of a second syntax element indicating whether the encoding mode (for example, the PDPC mode) will be used for the second block of video data in the event that the second block of video data is associated with a number of non-zero transform coefficients below the limit, and apply the encoding mode (for example, the PDPC mode) in accordance with the inferred value of the second syntax element.
[0118] In some examples, video encoder 22 can be configured to compare the number of non-zero transform coefficients associated with a video data block to the limit together for both luma and chroma components of the video data block when determining whether to explicitly encode a syntax element for an encoding mode (for example, PDPC mode). That is, video encoder 22 can consider the number of non-zero coefficients for the luma blocks and the chroma blocks together. The video decoder 30 can be configured to perform the same comparison as the video encoder 22 in determining whether a syntax element for an encoding mode has been explicitly encoded and whether it will be received.
[0119] In other examples, video encoder 22 can be configured to compare only the non-zero transform coefficients for a luma block when determining whether to explicitly encode a syntax element for an encoding mode (for example, PDPC mode). In this example, video encoder 22 can be configured to generate the syntax elements for encoding modes separately for luma blocks and chroma blocks. Thus, even in this example, video encoder 22 can consider only non-zero transform coefficients for chroma blocks when determining whether to explicitly encode a syntax element for an encoding mode (for example, PDPC mode) for a chroma block. Again, video decoder 30 can be configured to perform the same comparison as video encoder 22 in determining whether a syntax element for an encoding mode has been explicitly encoded and whether it will be received for the luma and / or chroma blocks.
[0120] In another example, the way in which video encoder 22 and video decoder 30 are configured to count non-zero transform coefficients to determine the explicit encoding of a syntax element may be dependent on the type of slice. For example, video encoder 22 and video decoder 30 can be configured to use one non-zero transform coefficient counting technique for slices I and a different non-zero transform coefficient counting technique for slices other than slices I (for example, slices P or slices B).
[0121] In another example, the video encoder 22 and the video decoder 30 can be configured to count non-zero transform coefficients using a technique that depends on whether the luma components
and chroma are coded together or separately. For example, in some segmentation structures, the luma and chroma components have the same segmentation structure. In other segmentation structures (for example, examples of QTBT segmentation), the luma and chroma components can be segmented independently, such that their respective segmentation structures differ from one another. In this example, separate coding can mean that the luma and chroma blocks can have different representations of segmentation or tree structures. In this example, when separate and / or independent luma / chroma coding is enabled for slices I, video encoder 22 and video decoder 30 can be configured to count non-zero transform coefficients for the luma components. For slices other than I, when separate encoding is not enabled, video encoder 22 and video decoder 30 can be configured to count non-zero transform coefficients together for both chroma and luma transform coefficients, or only for the luma transform coefficients.
[0122] In another example, when video encoder 22 and video decoder 30 are configured to count non-zero coefficients for both chroma and luma components, non-zero coefficients are counted per component. For example, video encoder 22 and video decoder 30 can include three non-zero coefficient counters; one counter for each color component (for example,
Y, Cb and Cr). In another example, video encoder 22 and video decoder 30 can include two counters; a counter for one luma component and a counter for both chroma components. In this example, the limit can be defined by component, and the limit value can be different for different color components.
[0123] In another example, the video encoder 22 and / or the video decoder 30 may determine to explicitly encode the PDPC mode indicator based on a limit of non-zero transform coefficients only for blocks without transform skip. That is, for transform skip blocks, the video encoder 22 and the video decoder 30 may not explicitly encode a PDPC mode indicator. For blocks without transform skip (that is, blocks to which a transform is applied), the video encoder 22 and the video decoder 30 can explicitly encode the PDPC mode indicator. Transform skip is a method in which the horizontal or vertical transform, or both transforms, are not applied to the residue of a block, that is, they are skipped. The transform can be any transform: primary or secondary, or both.
[0124] In another example, video encoder 22 and / or video decoder 30 may determine to explicitly encode the PDPC mode indicator based on a limit of non-zero transform coefficients only for blocks encoded with a particular intraprediction mode. For example, video encoder 22 and / or video decoder 30 can determine to explicitly encode the PDPC mode indicator based on a limit of non-zero transform coefficients for blocks encoded with intraprediction modes different from a planar mode, a linear model (LM) prediction mode or a DC mode. For example, if the block of a given component (for example, the luma or chroma component) is encoded using planar mode, video encoder 22 and / or video decoder 30 would not consider the number of non-zero transform coefficients of that component when determining to explicitly encode the PDPC mode indicator. In this way, video encoder 22 is configured to explicitly encode the PDPC mode indicator based on an intraprediction mode used to encode the video data block. Likewise, video decoder 30 is configured to receive the PDPC mode indicator based on an intraprediction mode used to encode the video data block.
[0125] In another example, in addition to comparing the number of non-zero transform coefficients to a limit, the video encoder 22 and the video decoder 30 can apply a block size limit to determine the explicit encoding of a PDPC mode indicator. For example, video encoder 22 can be configured to explicitly encode and flag a PDPC mode indicator for blocks greater than or equal to a predetermined minimum size and less than a predetermined maximum block size, where the minimum and maximum sizes are of the block can be configurable or fixed for both the video encoder 22 and the
video decoder 30. Likewise, the video decoder 30 can be configured to receive and decode a PDPC mode indicator for blocks larger than or equal to a predetermined minimum size and less than a predetermined maximum size of the block.
[0126] Thus, in this example, video encoder 22 can be configured to explicitly encode the PDPC mode indicator in the case where the first block of video data is greater than or equal to a predetermined size. Likewise, the video decoder 30 can be configured to explicitly receive and decode the PDPC mode indicator in the event that the first block of video data is greater than or equal to a predetermined size.
[0127] The minimum block size limit can be set to be greater than or equal to 8x8, meaning that all blocks smaller than 8x8 (for example, 4x4, 4x8, 8x4 and the like) are restricted and a PDPC mode indicator is not signaled for these blocks. Likewise, the maximum block limit can be, for example, set to 32x32. In another example, the limit can be expressed as width * height. That is, 8x8 is converted to 64 and 32x32 is converted to 1024. To check whether the current block is restricted from explicitly encoding the PDPC mode indicator, video encoder 22 and video decoder 30 can check the width * height of the block against the limit.
[0128] In any of the above examples where the PDPC mode indicator is not explicitly encoded and / or signaled, the video decoder 30 can be configured to obtain a default value for the PDPC mode indicator for some intraprediction mode(s). For example, for some smoothed intraprediction modes, for example, planar mode, the PDPC mode is always applied.
[0129] In another example, video encoder 22 and video decoder 30 are configured to explicitly encode and signal a PDPC mode indicator for different color components of a video data block. For example, video encoder 22 and video decoder 30 are configured to explicitly encode and signal an indicator for the luma component. In addition, video encoder 22 and video decoder 30 are configured to explicitly encode and signal an indicator for the chroma Cb and chroma Cr components. The signaling of the indicator for one component may depend on the value of the indicator already signaled for another component. For example, video encoder 22 can be configured to explicitly encode and signal a PDPC mode indicator for the chroma and luma components. When signaling the indicator for chroma, the entropy encoding / parsing of that indicator by the video encoder 22 and the video decoder 30, respectively, may depend on the value of the indicator signaled for luma. The dependency can be reflected by, but is not limited to, the context value.
[0130] In addition, or alternatively, the restriction of the PDPC mode can be performed based on the intraprediction mode. For example, video encoder 22 and video decoder 30 can be configured not to apply the PDPC mode, and video encoder 22 can be configured not to explicitly encode a PDPC mode indicator, for some intraprediction modes or for a subset of the intraprediction modes. Video encoder 22 and video decoder 30 can be configured to derive the subset to which the current video data block belongs based on the proximity of the current intraprediction mode to specific modes (for example, based on a limit). For example, video encoder 22 and video decoder 30 can be configured to determine the proximity of the index of the current intraprediction mode to the index of a horizontal intraprediction mode, a vertical intraprediction mode, a diagonal intraprediction mode or another intraprediction mode. Another separate subset can be attributed to non-angular directions, such as planar and/or intra DC modes and the like. In a specific example, the PDPC mode is not applied to planar mode.
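A sketch of the proximity-based subset test, under assumed mode indices loosely following HEVC numbering (10 = horizontal, 26 = vertical, 34 = diagonal; none of these values, nor the limit of 2, is specified by this paragraph):

```python
# Illustrative only: an intraprediction mode falls in the restricted subset
# (PDPC not applied, indicator not coded) when its index is within a limit
# of an anchor mode, or when it is the planar mode, per the example above.
PLANAR_IDX = 0
ANCHOR_MODES = (10, 26, 34)   # assumed horizontal, vertical, diagonal indices

def pdpc_restricted(mode: int, limit: int = 2) -> bool:
    if mode == PLANAR_IDX:
        return True
    return any(abs(mode - anchor) <= limit for anchor in ANCHOR_MODES)
```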
[0131] In another example, the PDPC mode can be combined with other video encoding tools or techniques, such as the secondary transform and/or the intra reference sample smoothing filters described above. This combination can be allowed for certain intra modes, and the PDPC mode indicator is signaled for cases where the PDPC mode is allowed. The selection of the intra modes can follow one of the examples described above.
[0132] In some examples, context modeling (that is, the contexts used for entropy coding, such as CABAC) for the entropy coding of the PDPC mode indicator may be dependent on the intraprediction mode and/or the block size. For example, video encoder 22 and video decoder 30 can be configured to use one context to entropy encode the PDPC mode indicator for some intraprediction modes, and video encoder 22 and video decoder 30 can be configured to use other context(s) to entropy encode the PDPC mode indicator for other intraprediction modes. The context assignment can be based on a subset of intraprediction modes from the available intraprediction modes. That is, video encoder 22 and video decoder 30 can be configured to assign the contexts used to encode the PDPC mode indicator based on the subset of intraprediction modes to which the intraprediction mode of the current video data block belongs. For example, the subset of intraprediction modes can be non-angular modes, angular modes, modes to which MDIS is applied and/or planar mode. Video encoder 22 and video decoder 30 can be configured to derive the subset to which the current video data block belongs based on the proximity of the current intraprediction mode to specific modes (for example, based on a limit). For example, video encoder 22 and video decoder 30 can be configured to determine the proximity of the index of the current intraprediction mode to the index of a horizontal intraprediction mode, a vertical intraprediction mode, a diagonal intraprediction mode or another intraprediction mode. Another separate subset can be attributed to non-angular directions, such as planar and/or intra DC modes and the like.
[0133] Figure 4 is a block diagram illustrating an exemplary video encoder 22 that can implement the techniques of this invention. Figure 4 is provided for purposes of explanation and should not be considered limiting of the techniques as broadly exemplified and described in this publication. However, the techniques of this invention may be applicable to various coding standards or methods.
[0134] In the example of figure 4, video encoder 22 includes a prediction processing unit 100, video data memory 101, a residual generation unit 102, a transform processing unit 104, a quantization unit 106, an inverse quantization unit 108, an inverse transform processing unit 110, a reconstruction unit 112, a filter unit 114, a decoded image buffer 116 and an entropy encoding unit 118. The prediction processing unit 100 includes an inter-prediction processing unit 120 and an intraprediction processing unit 126. The inter-prediction processing unit 120 may include a motion estimation unit and a motion compensation unit (not shown).
[0135] Video data memory 101 can be configured to store video data to be encoded by the components of video encoder 22. The video data stored in video data memory 101 can be obtained, for example, from video source 18. The decoded image buffer 116 may be a reference image memory that stores reference video data for use in encoding video data by video encoder 22, for example, in intra- or inter-prediction modes. Video data memory 101 and decoded image buffer 116 can be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM), including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM) or other types of memory devices. Video data memory 101 and decoded image buffer 116 can be provided by the same memory device or by separate memory devices. In several examples, video data memory 101 can be on-chip with other components of video encoder 22, or off-chip with respect to those components. Video data memory 101 can be the same as, or part of, storage medium 20 of figure 1.
[0136] Video encoder 22 receives video data. Video encoder 22 can encode each CTU in a slice of an image of the video data. Each of the CTUs can be associated with equally sized luma coding tree blocks (CTBs) and corresponding CTBs of the image. As part of encoding a CTU, the prediction processing unit 100 can perform segmentation to divide the CTBs of the CTU into progressively smaller blocks. In some examples, video encoder 22 can segment blocks using a QTBT structure. The smaller blocks can be coding blocks of CUs. For example, the prediction processing unit 100 can segment a CTB associated with a CTU according to a tree structure. According to one or more techniques of the invention, for each respective non-leaf node of the tree structure at each depth level of the tree structure, there are a plurality of segmentation patterns allowed for the respective non-leaf node, and the video block corresponding to the respective non-leaf node is segmented into video blocks corresponding to the child nodes of the respective non-leaf node according to one of the plurality of allowed segmentation patterns.
[0137] Video encoder 22 can encode the CUs of a CTU to generate encoded representations of the CUs (that is, encoded CUs). As part of encoding a CU, the prediction processing unit 100 can segment the coding blocks associated with the CU among one or more PUs of the CU. Thus, each PU can be associated with a luma prediction block and corresponding chroma prediction blocks. Video encoder 22 and video decoder 30 can support PUs of various sizes. As indicated above, the size of a CU can refer to the size of the luma coding block of the CU, and the size of a PU can refer to the size of a luma prediction block of the PU. Assuming that the size of a particular CU is 2Nx2N, video encoder 22 and video decoder 30 can support PU sizes of 2Nx2N or NxN for intraprediction, and symmetric PU sizes of 2Nx2N, 2NxN, Nx2N, NxN or similar for inter prediction. Video encoder 22 and video decoder 30 can also support asymmetric segmentation for PU sizes of 2NxnU, 2NxnD, nLx2N and nRx2N for inter prediction.
[0138] The inter-prediction processing unit 120 can generate predictive data for a PU by performing inter prediction on each PU of a CU. The predictive data for the PU can include predictive blocks for the PU and motion information for the PU. The inter-prediction processing unit 120 can perform different operations for a PU of a CU depending on whether the PU is in an I slice, a P slice or a B slice. In an I slice, all PUs are intrapredicted. Thus, if the PU is in an I slice, the inter-prediction processing unit 120 does not perform inter prediction on the PU. Thus, for blocks encoded in I mode, the predicted block is formed using spatial prediction from previously encoded neighboring blocks within the same frame. If a PU is in a P slice, the inter-prediction processing unit 120 can use unidirectional inter prediction to generate a predictive block for the PU. If a PU is in a B slice, the inter-prediction processing unit 120 can use unidirectional or bidirectional inter prediction to generate a predictive block for the PU.
[0139] The intraprediction processing unit 126 can generate predictive data for a PU by performing intraprediction on the PU. The predictive data for the PU can include predictive blocks for the PU and various syntax elements. The intraprediction processing unit 126 can perform intraprediction on PUs in I slices, P slices and B slices. The intraprediction processing unit 126 can be configured to determine one or more encoding modes to apply when predicting a video data block using intraprediction, including applying an intra reference sample smoothing filter and/or a PDPC mode. The intraprediction processing unit 126 and/or another component of video encoder 22 can be configured to perform the explicit encoding techniques described above for the intra reference sample smoothing filter and PDPC mode syntax encoding.
[0140] To perform intraprediction on a PU, the intraprediction processing unit 126 can use several intraprediction modes to generate multiple sets of predictive data for the PU. The intraprediction processing unit 126 can use samples from the sample blocks of neighboring PUs to generate a predictive block for a PU. The neighboring PUs can be above, above and to the right, above and to the left, or to the left of the PU, assuming a left-to-right, top-to-bottom coding order for the PUs, CUs and CTUs. The intraprediction processing unit 126 can use various numbers of intraprediction modes, for example, 33 directional intraprediction modes. In some instances, the number of intraprediction modes may depend on the size of the region associated with the PU.
[0141] The prediction processing unit 100 can select the predictive data for the PUs of a CU from among the predictive data generated by the inter-prediction processing unit 120 for the PUs or the predictive data generated by the intraprediction processing unit 126 for the PUs. In some instances, the prediction processing unit 100 selects the predictive data for the PUs of the CU based on rate/distortion metrics of the sets of predictive data. The predictive blocks of the selected predictive data may be referred to here as the selected predictive blocks.
[0142] The residual generation unit 102 can generate, based on the coding blocks (for example, the luma, Cb and Cr coding blocks) for a CU and the selected predictive blocks (for example, the luma, Cb and Cr predictive blocks) for the PUs of the CU, residual blocks (for example, luma, Cb and Cr residual blocks) for the CU. For example, the residual generation unit 102 can generate the residual blocks of the CU such that each sample in the residual blocks has a value equal to the difference between a sample in a coding block of the CU and a corresponding sample in a corresponding selected predictive block of a PU of the CU.
[0143] The transform processing unit 104 can perform quadtree segmentation to segment the residual blocks associated with a CU into transform blocks associated with the TUs of the CU. Thus, a TU can be associated with a luma transform block and two chroma transform blocks. The sizes and positions of the luma and chroma transform blocks of the TUs of the CU may or may not be based on the sizes and positions of the prediction blocks of the PUs of the CU. A quadtree structure known as a residual quadtree (RQT) can include nodes associated with each of the regions. The TUs of a CU can correspond to the leaf nodes of the RQT.
[0144] The transform processing unit 104 can generate blocks of transform coefficients for each TU of a CU by applying one or more transforms to the transform blocks of the TU. The transform processing unit 104 can apply several transforms to a transform block associated with a TU. For example, the transform processing unit 104 can apply a discrete cosine transform (DCT), a directional transform or a conceptually similar transform to a transform block. In some examples, the transform processing unit 104 does not apply transforms to a transform block. In these examples, the transform block can be treated as a transform coefficient block.
[0145] The quantization unit 106 can quantize the transform coefficients in a coefficient block. The quantization process can reduce the bit depth associated with some or all of the transform coefficients. For example, an n-bit transform coefficient can be rounded down to an m-bit transform coefficient during quantization, where n is greater than m. The quantization unit 106 can quantize a coefficient block associated with a TU of a CU based on a quantization parameter (QP) value associated with the CU. Video encoder 22 can adjust the degree of quantization applied to the coefficient blocks associated with a CU by adjusting the QP value associated with the CU. Quantization can introduce loss of information. Thus, the quantized transform coefficients may have lower precision than the original ones.
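The loss of precision described here can be illustrated with a toy quantizer; this is not the HEVC quantization formula, and the QP-to-step-size mapping below is an assumption for illustration only:

```python
# Toy quantizer: dividing by a QP-derived step size and rounding reduces
# precision, so dequantization does not always recover the original value.
def quantize(coeff: int, qp: int) -> int:
    step = 1 << (qp // 6)                      # step size doubles every 6 QP
    sign = 1 if coeff >= 0 else -1
    return sign * ((abs(coeff) + step // 2) // step)

def dequantize(level: int, qp: int) -> int:
    step = 1 << (qp // 6)
    return level * step
```

With qp = 12 (step size 4), a coefficient of 101 quantizes to level 25 and dequantizes back to 100, illustrating the information loss.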
[0146] The inverse quantization unit 108 and the inverse transform processing unit 110 can apply inverse quantization and inverse transforms to a coefficient block, respectively, to reconstruct a residual block from the coefficient block. The reconstruction unit 112 can add the reconstructed residual block to corresponding samples of one or more predictive blocks generated by the prediction processing unit 100 to produce a reconstructed transform block associated with a TU. By reconstructing the transform blocks for each TU of a CU in this way, video encoder 22 can reconstruct the coding blocks of the CU.
[0147] Filter unit 114 can perform one or more deblocking operations to reduce blocking artifacts in the coding blocks associated with a CU. The decoded image buffer 116 can store the reconstructed coding blocks after filter unit 114 performs the one or more deblocking operations on the reconstructed coding blocks. The inter-prediction processing unit 120 can use a reference image containing the reconstructed coding blocks to perform inter prediction on PUs of other images. In addition, the intraprediction processing unit 126 can use the reconstructed coding blocks in the decoded image buffer 116 to perform intraprediction on other PUs in the same image as the CU.
[0148] The entropy encoding unit 118 can receive data from other functional components of video encoder 22. For example, the entropy encoding unit 118 can receive coefficient blocks from the quantization unit 106 and can receive syntax elements from the prediction processing unit 100. The entropy encoding unit 118 can perform one or more entropy encoding operations on the data to generate entropy-encoded data. For example, the entropy encoding unit 118 can perform a CABAC operation, a context-adaptive variable-length coding (CAVLC) operation, a variable-to-variable (V2V) length coding operation, a syntax-based context-adaptive binary arithmetic coding (SBAC) operation, a probability interval partitioning entropy (PIPE) coding operation, an Exponential-Golomb encoding operation or another type of entropy encoding operation on the data. Video encoder 22 can output a bit stream that includes the entropy-encoded data generated by the entropy encoding unit 118. For example, the bit stream can include data that represents an RQT for a CU.
[0149] Figure 5 is a block diagram illustrating an exemplary video decoder 30 that is configured to implement the techniques of this invention. Figure 5 is provided for purposes of explanation and should not be considered limiting of the techniques as broadly exemplified and described in this publication. For purposes of explanation, this invention describes video decoder 30 in the context of HEVC encoding. However, the techniques of this invention may be applicable to other standards or coding methods, including techniques that allow non-square segmentation and/or independent luma and chroma segmentation.
[0150] In the example of figure 5, video decoder 30 includes an entropy decoding unit 150, a video data memory 151, a prediction processing unit 152, an inverse quantization unit 154, an inverse transform processing unit 156, a reconstruction unit 158, a filter unit 160 and a decoded image buffer 162. The prediction processing unit 152 includes a motion compensation unit 164 and an intraprediction processing unit 166. In other examples, video decoder 30 may include more, fewer or different functional components.
[0151] Video data memory 151 can store encoded video data, such as an encoded video bit stream, to be decoded by the components of video decoder 30. The video data stored in video data memory 151 can be obtained, for example, via computer-readable medium 16, for example, from a local video source, such as a camera, via wired or wireless network communication of video data, or by accessing physical data storage media. Video data memory 151 can form an encoded image buffer (CPB) that stores encoded video data from an encoded video bit stream. The decoded image buffer 162 may be a reference image memory that stores reference video data for use in decoding video data by video decoder 30, for example, in intra- or inter-prediction modes, or for output. Video data memory 151 and decoded image buffer 162 can be formed by any of a variety of memory devices, such as DRAM, including SDRAM, MRAM, RRAM or other types of memory devices. Video data memory 151 and decoded image buffer 162 can be provided by the same memory device or by separate memory devices. In several examples, video data memory 151 can be on-chip with other components of video decoder 30, or off-chip with respect to those components. Video data memory 151 can be the same as, or part of, storage medium 28 of figure 1.
[0152] Video data memory 151 receives and stores encoded video data (for example, NAL units) of a bit stream. The entropy decoding unit 150 can receive encoded video data (for example, NAL units) from video data memory 151 and can parse the NAL units for syntax elements. The entropy decoding unit 150 can entropy decode the entropy-encoded syntax elements in the NAL units. The prediction processing unit 152, the inverse quantization unit 154, the inverse transform processing unit 156, the reconstruction unit 158 and the filter unit 160 can generate decoded video data based on the syntax elements extracted from the bit stream. The entropy decoding unit 150 can perform a process generally reciprocal to that of the entropy encoding unit 118.
[0153] In accordance with some examples of this invention, the entropy decoding unit 150 can determine a tree structure as part of obtaining the syntax elements from the bit stream. The tree structure can specify how an initial video block, such as a CTB, is segmented into smaller video blocks, such as coding units. According to one or more techniques of this invention, for each respective non-leaf node of the tree structure at each depth level of the tree structure, there are a plurality of segmentation patterns allowed for the respective non-leaf node, and the video block corresponding to the respective non-leaf node is segmented into video blocks corresponding to the child nodes of the respective non-leaf node according to one of the plurality of allowed segmentation patterns.
[0154] In addition to obtaining the syntax elements from the bit stream, video decoder 30 can perform a reconstruction operation on a non-segmented CU. To perform the reconstruction operation on a CU, video decoder 30 can perform a reconstruction operation on each TU of the CU. By performing the reconstruction operation for each TU of the CU, video decoder 30 can reconstruct the residual blocks of the CU.
[0155] As part of performing a reconstruction operation on a TU of a CU, the inverse quantization unit 154 can inverse quantize, that is, de-quantize, the coefficient blocks associated with the TU. After the inverse quantization unit 154 de-quantizes a coefficient block, the inverse transform processing unit 156 can apply one or more inverse transforms to the coefficient block in order to generate a residual block associated with the TU. For example, the inverse transform processing unit 156 can apply an inverse DCT, an inverse integer transform, an inverse Karhunen-Loeve transform (KLT), an inverse rotational transform, an inverse directional transform or another inverse transform to the coefficient block.
[0156] If a PU is encoded using intraprediction, the intraprediction processing unit 166 can perform intraprediction to generate the predictive blocks of the PU. The intraprediction processing unit 166 can use an intraprediction mode to generate the predictive blocks of the PU based on samples of spatially neighboring blocks. The intraprediction processing unit 166 can determine the intraprediction mode for the PU based on one or more syntax elements obtained from the bit stream. The intraprediction processing unit 166 can be configured to determine one or more encoding modes to be applied when predicting a block of video data using intraprediction, including the application of an intra reference sample smoothing filter and/or a PDPC mode. The intraprediction processing unit 166 and/or another component of video decoder 30 can be configured to perform the explicit coding techniques described above for the intra reference sample smoothing filter and PDPC mode syntax encoding.
[0157] If a PU is encoded using inter prediction, the entropy decoding unit 150 can determine the motion information for the PU. The motion compensation unit 164 can determine, based on the motion information of the PU, one or more reference blocks. The motion compensation unit 164 can generate, based on the one or more reference blocks, predictive blocks (for example, luma, Cb and Cr predictive blocks) for the PU.
[0158] The reconstruction unit 158 can use the transform blocks (for example, luma, Cb and Cr transform blocks) for the TUs of the CU and the predictive blocks (for example, luma, Cb and Cr blocks) of the PUs of the CU, that is, either intraprediction data or inter-prediction data, as applicable, to reconstruct the coding blocks (for example, the luma, Cb and Cr coding blocks) for the CU. For example, the reconstruction unit 158 can add samples of the transform blocks (for example, the luma, Cb and Cr transform blocks) to corresponding samples of the predictive blocks (for example, the luma, Cb and Cr predictive blocks) to reconstruct the coding blocks (for example, the luma, Cb and Cr coding blocks) of the CU.
[0159] The filter unit 160 can perform a deblocking operation to reduce the blocking artifacts associated with the coding blocks of the CU. Video decoder 30 can store the coding blocks of the CU in the decoded image buffer 162. The decoded image buffer 162 can provide reference images for subsequent motion compensation, intraprediction and presentation on a display device, such as display device 32 of figure 1. For example, video decoder 30 can perform, based on the blocks in the decoded image buffer 162, intraprediction or inter prediction for PUs of other CUs.
[0160] Figure 6 is a flow chart illustrating an exemplary coding method of the invention. The techniques of figure 6 can be performed by one or more structural components of the video encoder 22.
[0161] In an example of the invention, video encoder 22 can be configured to determine an encoding mode for encoding a first block of video data (600). In an example of the invention, the encoding mode is at least one of an intra reference sample smoothing mode or a PDPC mode. Video encoder 22 can also be configured to explicitly encode a first syntax element indicating whether the encoding mode will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit (602). In an example of the invention, the limit is one of 1, 2 or 3 non-zero coefficients. Video encoder 22 can also signal the first syntax element in an encoded video bit stream (604).
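Steps 600-604 can be sketched as follows; the bit stream is modeled as a plain list of bits and CABAC entropy coding is omitted, so this is only an illustration of the signaling condition, not of any actual encoder implementation:

```python
# Encoder-side sketch: the mode flag is explicitly coded only when the block
# has at least `limit` non-zero transform coefficients (602, 604).
def encode_mode_flag(coeffs, use_mode, bitstream, limit=2):
    nonzero = sum(1 for c in coeffs if c != 0)
    if nonzero >= limit:
        bitstream.append(1 if use_mode else 0)  # explicitly signaled
        return True
    return False                                # flag omitted from the stream
```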
[0162] In another example of the invention, to explicitly encode the first syntax element, video encoder 22 can further be configured to explicitly encode the first syntax element based on an intraprediction mode used to encode the first block of video data.
[0163] In another example of the invention, to explicitly encode the first syntax element, video encoder 22 may further be configured to explicitly encode the first syntax element in the case where the first block of video data is greater than or equal to a predetermined size.
[0164] In another example of the invention, the number of non-zero transform coefficients includes the number of non-zero transform coefficients for both the luma and chroma components of the first block of video data. In another example of the invention, the first block of video data includes a single block of video data, and the number of non-zero transform coefficients includes the number of non-zero transform coefficients for the single block of video data. In another example of the invention, the first block of video data is not a transform skip block.
[0165] In another example of the invention, video encoder 22 is further configured to determine a context for encoding the first syntax element based on an intraprediction mode used to encode the first block of video data, and encode the first syntax element using the given context.
[0166] In another example of the invention, video encoder 22 is further configured to determine an encoding mode for encoding a second block of video data, and not to encode a value of a second syntax element indicating whether the encoding mode will be used for the second block of video data in the event that the second block of video data is associated with a number of non-zero transform coefficients below the limit.
[0167] Figure 7 is a flow chart illustrating an exemplary decoding method of the invention. The techniques of figure 7 can be performed by one or more structural components of video decoder 30.
[0168] In an example of the invention, video decoder 30 can be configured to receive a first block of video data (700). Video decoder 30 can also be configured to receive a first syntax element indicating whether an encoding mode will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit (702), and explicitly decode the value of the first received syntax element (704). In an example of the invention, the limit is one of 1, 2 or 3 non-zero coefficients. Video decoder 30 can apply the encoding mode to the first block of video data in accordance with a value of the first syntax element (706). In an example of the invention, the encoding mode is at least one of an intra reference sample smoothing mode or a PDPC mode.
[0169] In another example of the invention, to receive the first syntax element, video decoder 30 can be further configured to receive the first syntax element based on an intraprediction mode used to encode the first block of video data.
[0170] In another example of the invention, to receive the first syntax element, video decoder 30 can be further configured to receive the first syntax element in the event that the first block of video data is greater than or equal to a predetermined size.
[0171] In another example of the invention, the number of non-zero transform coefficients includes the number of non-zero transform coefficients for both the luma and chroma components of the first block of video data. In another example, the first block of video data includes a single block of video data, and the number of non-zero transform coefficients includes the number of non-zero transform coefficients for the single block of video data. In another example, the first block of video data is not a transform skip block.
[0172] In another example of the invention, video decoder 30 is further configured to determine a context for decoding the first syntax element based on an intraprediction mode used to encode the first block of video data, and decode the first syntax element using the determined context.
[0173] In another example of the invention, video decoder 30 can be configured to receive a second block of video data; infer a value of a second syntax element indicating whether the encoding mode will be used for the second block of video data in the event that the second block of video data is associated with a number of non-zero transform coefficients below the limit; and apply the encoding mode in accordance with the inferred value of the second syntax element.
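The decoder-side counterpart of steps 700-706 and this inference rule can be sketched as follows; inferring the flag as 0 when it is absent is an assumption for the sketch (the text only states that a value is inferred), and the bit stream is again a plain list of bits:

```python
# Decoder-side sketch: parse the flag only when the non-zero coefficient
# count meets the limit; otherwise infer its value without consuming bits.
def decode_mode_flag(coeffs, bitstream, pos, limit=2):
    """Return (flag_value, new_position_in_bitstream)."""
    nonzero = sum(1 for c in coeffs if c != 0)
    if nonzero >= limit:
        return bitstream[pos], pos + 1   # explicitly decoded (704)
    return 0, pos                        # inferred value; nothing read
```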
[0174] Certain aspects of this invention have been described with respect to extensions of the HEVC standard and the JEM model software under study by JVET for purposes of illustration. However, the techniques described in this document may be useful for other video coding processes, including other standard or proprietary video coding processes in development or not yet developed.
[0175] A video coder, as described in this document, can refer to a video encoder or a video decoder. Likewise, a video coding unit can refer to a video encoder or a video decoder. Likewise, video coding can refer to video encoding or video decoding, as applicable.
[0176] It must be recognized that, depending on the example, some acts or events of any of the techniques described here can be performed in a different sequence, and can be added, merged or left out altogether (for example, not all described acts or events are necessary for the practice of the techniques). In addition, in some instances, acts or events can be performed simultaneously, for example, through multi-threaded processing, interrupt processing or multiple processors, rather than sequentially.
[0177] In one or more examples, the functions described can be implemented in hardware, software, firmware or any combination thereof. If implemented in software, the functions can be stored on, or transmitted as, one or more instructions or code on a computer-readable medium and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which corresponds to a tangible medium, such as data storage media, or communication media, including any medium that facilitates the transfer of a computer program from one place to another, for example, according to a communication protocol. In this way, computer-readable media can generally correspond to (1) tangible computer-readable storage media that are non-transitory or (2) a communication medium such as a signal or carrier wave. Data storage media can be any available media that can be accessed by one or more computers or one or more processors to obtain instructions, code and/or data structures for implementing the techniques described in this publication. A computer program product may include a computer-readable medium.
[0178] By way of example, and not by way of limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transient media, but instead refer to tangible, non-transitory storage media. Disk and disc, as used here, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
[0179] Instructions may be executed by one or more processors, such as digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term processor, as used in this document, may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described in this document. In addition, in some examples, the features described here may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.
[0180] The techniques of this invention may be implemented in a wide variety of devices or apparatuses, including a wireless telephone handset, an integrated circuit (IC) or a set of ICs (for example, a chip set). Various components, modules or units are described in this publication to emphasize functional aspects of devices configured to perform the described techniques, but they do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperable hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.
[0181] Several examples have been described. These and other examples are within the scope of the following claims.
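By way of illustration only, the conditional signaling described above can be sketched in a few lines of Python: the intra-filtering flag (reference sample smoothing / PDPC) is coded explicitly only when the block carries at least a limit number of non-zero transform coefficients; otherwise the decoder infers its value and no bit is spent. The function names, the list-based bitstream model, and the default limit of 2 are assumptions of this sketch, not part of the patent.

```python
# Illustrative sketch of conditional flag signaling; names, the bitstream
# model (a plain list of bits), and limit=2 are assumptions of this sketch.

def count_nonzero(coeffs):
    """Number of non-zero transform coefficients in a block."""
    return sum(1 for c in coeffs if c != 0)

def encode_filter_flag(coeffs, use_mode, bitstream, limit=2):
    """Encoder side: signal the flag only when the decoder will parse it."""
    if count_nonzero(coeffs) >= limit:
        bitstream.append(1 if use_mode else 0)  # explicitly coded syntax element
    # below the limit: nothing is written; the decoder infers the value

def decode_filter_flag(coeffs, bitstream, limit=2):
    """Decoder side: explicitly decode the flag, or infer it when below the limit."""
    if count_nonzero(coeffs) >= limit:
        return bitstream.pop(0)  # explicitly decoded syntax element
    return 0                     # inferred: encoding mode not applied
```

For example, with a limit of 2, a block holding a single non-zero coefficient costs no bit and the decoder infers the mode off, which is the bit-saving behavior the conditional signaling is designed to achieve.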
Claims
1. Method of decoding video data, characterized by the fact that it comprises:
receiving a first block of video data;
receiving a first syntax element indicating whether an encoding mode will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit, wherein the encoding mode is at least one of an intra reference sample smoothing mode or a position-dependent intra prediction combination (PDPC) mode, wherein a filter is applied to one or more intra reference samples in the intra reference sample smoothing mode, and wherein the PDPC mode applies weights to one or more intra reference samples or an intra predictor;
explicitly decoding a value of the first received syntax element; and
applying the encoding mode to the first block of video data in accordance with the value of the first syntax element.
2. Method according to claim 1, characterized by the fact that receiving the first syntax element further comprises receiving the first syntax element based on an intra prediction mode used to encode the first block of video data.
3. Method according to claim 1, characterized by the fact that receiving the first syntax element further comprises receiving the first syntax element in the event that the first block of video data is greater than or equal to a predetermined size.
4. Method according to claim 1, characterized by the fact that the limit is one of 1, 2 or 3 non-zero coefficients.
5. Method according to claim 1, characterized by the fact that the number of non-zero transform coefficients includes the number of non-zero transform coefficients for both the luma and chroma components of the first block of video data.
6. Method according to claim 1, characterized by the fact that the first block of video data includes a single block of video data, and the number of non-zero transform coefficients includes the number of non-zero transform coefficients for the luma block of the video data.
7. Method according to claim 1, characterized by the fact that the first block of video data is not a transform skip block.
8. Method according to claim 1, characterized by the fact that it further comprises:
determining a context for decoding the first syntax element based on an intra prediction mode used to encode the first block of video data; and
decoding the first syntax element using the determined context.
9. Method according to claim 1, characterized by the fact that it further comprises:
receiving a second block of video data;
inferring a value of a second syntax element indicating whether the encoding mode will be used for the second block of video data in the event that the second block of video data is associated with a number of non-zero transform coefficients below the limit; and
applying the encoding mode in accordance with the inferred value of the second syntax element.
10. Method of encoding video data, characterized by the fact that it comprises:
determining an encoding mode for encoding a first block of video data, wherein the encoding mode is at least one of an intra reference sample smoothing mode or a position-dependent intra prediction combination (PDPC) mode, wherein a filter is applied to one or more intra reference samples in the intra reference sample smoothing mode, and wherein the PDPC mode applies weights to one or more intra reference samples and an intra predictor;
explicitly encoding a first syntax element indicating whether the encoding mode will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit; and
signaling the first syntax element in an encoded video bitstream.
11. Method according to claim 10, characterized by the fact that explicitly encoding the first syntax element further comprises explicitly encoding the first syntax element based on an intra prediction mode used to encode the first block of video data.
12. Method according to claim 10, characterized by the fact that explicitly encoding the first syntax element further comprises explicitly encoding the first syntax element in the event that the first block of video data is greater than or equal to a predetermined size.
13. Method according to claim 10, characterized by the fact that the limit is one of 1, 2 or 3 non-zero coefficients.
14. Method according to claim 10, characterized by the fact that the number of non-zero transform coefficients includes the number of non-zero transform coefficients for both the luma and chroma components of the first block of video data.
15. Method according to claim 10, characterized by the fact that the first block of video data includes a single block of video data, and the number of non-zero transform coefficients includes the number of non-zero transform coefficients for the luma block of the video data.
16. Method according to claim 10, characterized by the fact that the first block of video data is not a transform skip block.
17. Method according to claim 10, characterized by the fact that it further comprises:
determining a context for encoding the first syntax element based on an intra prediction mode used to encode the first block of video data; and
encoding the first syntax element using the determined context.
18. Method according to claim 10, characterized by the fact that it further comprises:
determining an encoding mode for encoding a second block of video data; and
not encoding a value of a second syntax element indicating whether the encoding mode will be used for the second block of video data in the event that the second block of video data is associated with a number of non-zero transform coefficients below the limit.
19. Apparatus configured to decode video data, characterized by the fact that it comprises:
a memory configured to store the video data; and
one or more processors in communication with the memory, the one or more processors configured to:
receive a first block of video data;
receive a first syntax element indicating whether an encoding mode will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit, wherein the encoding mode is at least one of an intra reference sample smoothing mode or a position-dependent intra prediction combination (PDPC) mode, wherein a filter is applied to one or more intra reference samples in the intra reference sample smoothing mode, and wherein the PDPC mode applies weights to one or more intra reference samples or an intra predictor;
explicitly decode a value of the first received syntax element; and
apply the encoding mode to the first block of video data in accordance with the value of the first syntax element.
20. Apparatus according to claim 19, characterized by the fact that, to receive the first syntax element, the one or more processors are further configured to receive the first syntax element based on an intra prediction mode used to encode the first block of video data.
21. Apparatus according to claim 19, characterized by the fact that, to receive the first syntax element, the one or more processors are further configured to receive the first syntax element in the event that the first block of video data is greater than or equal to a predetermined size.
22. Apparatus according to claim 19, characterized by the fact that the limit is one of 1, 2 or 3 non-zero coefficients.
23. Apparatus according to claim 19, characterized by the fact that the number of non-zero transform coefficients includes the number of non-zero transform coefficients for both the luma and chroma components of the first block of video data.
24. Apparatus according to claim 19, characterized by the fact that the first block of video data includes a single block of video data, and the number of non-zero transform coefficients includes the number of non-zero transform coefficients for the luma block of the video data.
25. Apparatus according to claim 19, characterized by the fact that the first block of video data is not a transform skip block.
26. Apparatus according to claim 19, characterized by the fact that the one or more processors are further configured to:
determine a context for decoding the first syntax element based on an intra prediction mode used to encode the first block of video data; and
decode the first syntax element using the determined context.
27. Apparatus according to claim 19, characterized by the fact that the one or more processors are further configured to:
receive a second block of video data;
infer a value of a second syntax element indicating whether the encoding mode will be used for the second block of video data in the event that the second block of video data is associated with a number of non-zero transform coefficients below the limit; and
apply the encoding mode in accordance with the inferred value of the second syntax element.
28. Apparatus configured to encode video data, characterized by the fact that it comprises:
a memory configured to store the video data; and
one or more processors in communication with the memory, the one or more processors configured to:
determine an encoding mode for encoding a first block of video data, wherein the encoding mode is at least one of an intra reference sample smoothing mode or a position-dependent intra prediction combination (PDPC) mode, wherein a filter is applied to one or more intra reference samples in the intra reference sample smoothing mode, and wherein the PDPC mode applies weights to one or more intra reference samples and an intra predictor;
explicitly encode a first syntax element indicating whether the encoding mode will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit; and
signal the first syntax element in an encoded video bitstream.
29. Apparatus according to claim 28, characterized by the fact that, to explicitly encode the first syntax element, the one or more processors are further configured to explicitly encode the first syntax element based on an intra prediction mode used to encode the first block of video data.
30. Apparatus according to claim 28, characterized by the fact that, to explicitly encode the first syntax element, the one or more processors are further configured to explicitly encode the first syntax element in the event that the first block of video data is greater than or equal to a predetermined size.
31. Apparatus according to claim 28, characterized by the fact that the limit is one of 1, 2 or 3 non-zero coefficients.
32. Apparatus according to claim 28, characterized by the fact that the number of non-zero transform coefficients includes the number of non-zero transform coefficients for both the luma and chroma components of the first block of video data.
33. Apparatus according to claim 28, characterized by the fact that the first block of video data includes a single block of video data, and the number of non-zero transform coefficients includes the number of non-zero transform coefficients for the luma block of the video data.
34. Apparatus according to claim 28, characterized by the fact that the first block of video data is not a transform skip block.
35. Apparatus according to claim 28, characterized by the fact that the one or more processors are further configured to:
determine a context for encoding the first syntax element based on an intra prediction mode used to encode the first block of video data; and
encode the first syntax element using the determined context.
36. Apparatus according to claim 28, characterized by the fact that the one or more processors are further configured to:
determine an encoding mode for encoding a second block of video data; and
not encode a value of a second syntax element indicating whether the encoding mode will be used for the second block of video data in the event that the second block of video data is associated with a number of non-zero transform coefficients below the limit.
37. Apparatus configured to decode video data, characterized by the fact that it comprises:
means for receiving a first block of video data;
means for receiving a first syntax element indicating whether an encoding mode will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit, wherein the encoding mode is at least one of an intra reference sample smoothing mode or a position-dependent intra prediction combination (PDPC) mode, wherein a filter is applied to one or more intra reference samples in the intra reference sample smoothing mode, and wherein the PDPC mode applies weights to one or more intra reference samples or an intra predictor;
means for explicitly decoding a value of the first received syntax element; and
means for applying the encoding mode to the first block of video data in accordance with the value of the first syntax element.
38. Apparatus configured to encode video data, characterized by the fact that it comprises:
means for determining an encoding mode for encoding a first block of video data, wherein the encoding mode is at least one of an intra reference sample smoothing mode or a position-dependent intra prediction combination (PDPC) mode, wherein a filter is applied to one or more intra reference samples in the intra reference sample smoothing mode, and wherein the PDPC mode applies weights to one or more intra reference samples and an intra predictor;
means for explicitly encoding a first syntax element indicating whether the encoding mode will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit; and
means for signaling the first syntax element in an encoded video bitstream.
39. Computer-readable storage medium storing instructions that, when executed, cause one or more processors of a device configured to decode video data to:
receive a first block of video data;
receive a first syntax element indicating whether an encoding mode will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit, wherein the encoding mode is at least one of an intra reference sample smoothing mode or a position-dependent intra prediction combination (PDPC) mode, wherein a filter is applied to one or more intra reference samples in the intra reference sample smoothing mode, and wherein the PDPC mode applies weights to one or more intra reference samples or an intra predictor;
explicitly decode a value of the first received syntax element; and
apply the encoding mode to the first block of video data in accordance with the value of the first syntax element.
40. Computer-readable storage medium storing instructions that, when executed, cause one or more processors of a device configured to encode video data to:
determine an encoding mode for encoding a first block of video data, wherein the encoding mode is at least one of an intra reference sample smoothing mode or a position-dependent intra prediction combination (PDPC) mode, wherein a filter is applied to one or more intra reference samples in the intra reference sample smoothing mode, and wherein the PDPC mode applies weights to one or more intra reference samples and an intra predictor;
explicitly encode a first syntax element indicating whether the encoding mode will be used for the first block of video data in the event that the first block of video data is associated with a number of non-zero transform coefficients greater than or equal to a limit; and
signal the first syntax element in an encoded video bitstream.
Family patents:
Publication number | Publication date
US20200021818A1 | 2020-01-16
WO2018165397A1 | 2018-09-13
SG11201906877QA | 2019-09-27
US11146795B2 | 2021-10-12
CN110393010A | 2019-10-29
US20180262763A1 | 2018-09-13
EP3593531A1 | 2020-01-15
TW201841503A | 2018-11-16
Legal status:
2021-10-19 | B350 | Update of information on the portal [chapter 15.35 patent gazette]
Priority:
Application number | Filing date | Patent title
US201762470099P | 2017-03-10 |
US201762475739P | 2017-03-23 |
US15/914,514 | 2018-03-07 | Intra filtering flag in video coding
PCT/US2018/021495 | 2018-03-08 | Intra filtering flag in video coding