JPEG is a widely used lossy compression standard for images, developed by the Joint Photographic Experts Group.
JPEG compression can be roughly divided into five steps: 1) Color space transformation, 2) Downsampling, 3) Discrete Cosine Transform (DCT), 4) Quantization, and 5) Entropy coding.
Here, I will mainly analyze the third step, “Discrete Cosine Transform,” and the fourth step, “Quantization,” and provide the corresponding Python code implementation.
Discrete Cosine Transform (DCT)
The Discrete Cosine Transform (DCT) is commonly used in signal and image processing to transform an image from the spatial domain to the frequency domain. The idea behind the DCT is that any finite signal can be represented as a sum of cosine functions of different frequencies. The DCT is a reversible process and is itself lossless.
The DCT can be computed using a DCT transformation matrix:
$$
T = C F C^T
$$

where $F$ is the original image block and $T$ is the block after the DCT. Assuming the matrices are of size $N \times N$, the DCT matrix $C$ is defined as:

$$
C(u,v)=\begin{cases}
\sqrt{\dfrac{1}{N}} & u=0,\ 0\le v \le N-1 \\[2mm]
\sqrt{\dfrac{2}{N}}\cos\dfrac{(2v+1)u\pi}{2N} & 1\le u \le N-1,\ 0\le v \le N-1
\end{cases}
$$
With $u$ as the row index and $v$ as the column index, the matrix form of $C$ is:

$$
C=\begin{bmatrix}
\sqrt{\frac{1}{N}} & \sqrt{\frac{1}{N}} & \cdots & \sqrt{\frac{1}{N}} \\
\sqrt{\frac{2}{N}}\cos\frac{\pi}{2N} & \sqrt{\frac{2}{N}}\cos\frac{3\pi}{2N} & \cdots & \sqrt{\frac{2}{N}}\cos\frac{(2N-1)\pi}{2N} \\
\vdots & \vdots & \ddots & \vdots \\
\sqrt{\frac{2}{N}}\cos\frac{(N-1)\pi}{2N} & \sqrt{\frac{2}{N}}\cos\frac{3(N-1)\pi}{2N} & \cdots & \sqrt{\frac{2}{N}}\cos\frac{(2N-1)(N-1)\pi}{2N}
\end{bmatrix}
$$
The transpose of $C$ is:

$$
C^T=\begin{bmatrix}
\sqrt{\frac{1}{N}} & \sqrt{\frac{2}{N}}\cos\frac{\pi}{2N} & \cdots & \sqrt{\frac{2}{N}}\cos\frac{(N-1)\pi}{2N} \\
\sqrt{\frac{1}{N}} & \sqrt{\frac{2}{N}}\cos\frac{3\pi}{2N} & \cdots & \sqrt{\frac{2}{N}}\cos\frac{3(N-1)\pi}{2N} \\
\vdots & \vdots & \ddots & \vdots \\
\sqrt{\frac{1}{N}} & \sqrt{\frac{2}{N}}\cos\frac{(2N-1)\pi}{2N} & \cdots & \sqrt{\frac{2}{N}}\cos\frac{(2N-1)(N-1)\pi}{2N}
\end{bmatrix}
$$
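To make the matrix form concrete, here is a minimal sketch (not part of the original pipeline) that builds $C$ from the definition above and checks two properties: $C$ is orthogonal, which is why the transform is reversible and lossless, and $C F C^T$ agrees with OpenCV's `cv2.dct` on the same block. The random block `F` is just a stand-in for real image data.

```python
import numpy as np
import cv2

N = 8
# Build the DCT matrix C from the definition above
u = np.arange(N).reshape(-1, 1)   # row index
v = np.arange(N).reshape(1, -1)   # column index
C = np.sqrt(2 / N) * np.cos((2 * v + 1) * u * np.pi / (2 * N))
C[0, :] = np.sqrt(1 / N)          # the u = 0 row is constant

F = np.random.default_rng(0).random((N, N))   # stand-in 8x8 image block
T = C @ F @ C.T                                # forward DCT via the matrix form

print(np.allclose(C @ C.T, np.eye(N)))   # True: C is orthogonal
print(np.allclose(T, cv2.dct(F)))        # True: matches OpenCV's DCT
print(np.allclose(C.T @ T @ C, F))       # True: the inverse transform recovers F exactly
```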
The basis functions above can be visualized as an image, known as the basis image. The construction is as follows: for an $N \times N$ DCT matrix, multiply the $m$-th row of $C$ (written as an $N \times 1$ column vector) by the $n$-th column of $C^T$ (a $1 \times N$ row vector). This yields $N^2$ small matrices, one for each pair $(m, n)$. Tiling these matrices according to their positions $(m, n)$ produces a basis image of size $N^2 \times N^2$. Any image block can be expressed as a weighted combination of these cosine patterns; in fact, the DCT computes the contribution of each cosine pattern, and these contributions are what we call the DCT coefficients.
```python
import numpy as np
import matplotlib.pyplot as plt
import math
import cv2
import skimage.io as io

# DCT matrix
N = 8  # matrix shape: 8*8
DCT = np.zeros((N, N))
for m in range(N):
    for n in range(N):
        if m == 0:
            DCT[m][n] = math.sqrt(1/N)
        else:
            DCT[m][n] = math.sqrt(2/N)*math.cos((2*n+1)*math.pi*m/(2*N))

# DCT basis image: tile (m, n) is the outer product of row m of C and row n of C
basis = np.zeros((N*N, N*N))
for m in range(N):
    for n in range(N):
        pos_m = m*N
        pos_n = n*N
        DCT_v = DCT[m, :].reshape(-1, 1)       # row m of C, as a column vector
        DCT_T_h = DCT.T[:, n].reshape(-1, N)   # column n of C^T, as a row vector
        basis[pos_m:pos_m+N, pos_n:pos_n+N] = np.matmul(DCT_v, DCT_T_h)

# Shift and scale the values into [0, 1] for display
basis += np.absolute(np.amin(basis))
scale = np.around(1/np.amax(basis), decimals=3)
for m in range(basis.shape[0]):
    for n in range(basis.shape[1]):
        basis[m][n] = np.around(basis[m][n]*scale, decimals=3)

# Show basis image
plt.figure(figsize=(4, 4))
plt.gray()
plt.axis('off')
plt.title('DCT Basis Image')
plt.imshow(basis, vmin=0)
```
This is the basis image of an 8×8 DCT matrix, with a size of 64×64. By observing the image, we can notice that the horizontal frequencies increase from left to right, while the vertical frequencies increase from top to bottom. The constant basis function in the top left corner is commonly referred to as the DC (Direct Current) basis function, and the corresponding DCT coefficient is known as the DC coefficient.
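As a small aside (an extra check, not from the original post): because the DC basis function is constant, the DC coefficient of an 8×8 block is simply 8 times the block's mean value (more generally, $N$ times the mean for an $N \times N$ block). This is easy to confirm with the libraries already imported above; the random block is a stand-in for real data.

```python
# Minimal check: for the orthonormal DCT, the DC coefficient equals N * mean
block = np.random.default_rng(1).random((8, 8))   # stand-in 8x8 block
coeff = cv2.dct(block)
print(coeff[0, 0], 8 * block.mean())              # the two values agree up to float error
```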
Any 8×8 block can be composed by overlaying weighted combinations of the 64 patterns in this basis image. Since we set the size of the DCT matrix to 8×8, we divide the image into small 8×8 squares, referred to as blocks, and process the image block by block. Below, we demonstrate 8 DCT coefficient matrices along with the blocks they reconstruct.
```python
# DCT coefficient blocks: block i has a single nonzero coefficient at position (0, i)
blocks = np.zeros((8*8, 8))
for i in range(8):
    blocks[i*8][i] = 1

# IDCT -> original (spatial-domain) images
blocks_idct = np.zeros((8*8, 8))
for i in range(8):
    block = blocks[i*8:i*8+8, :]
    data = cv2.idct(block)
    blocks_idct[i*8:i*8+8, :] = data

# Show the DCT coefficient blocks
plt.figure(figsize=(16, 3))
for i in range(8):
    plt.subplot(1, 8, i+1)
    block = blocks[i*8:i*8+8, :]
    plt.gray()
    plt.axis('off')
    plt.imshow(block, vmin=0)
```
```python
# Show the reconstructed (original) images
plt.figure(figsize=(16, 3))
for i in range(8):
    plt.subplot(1, 8, i+1)
    block_idct = blocks_idct[i*8:i*8+8, :]
    plt.gray()
    plt.xticks([])
    plt.yticks([])
    plt.imshow(block_idct, vmin=0)
```
Let's number the reconstructed images above 1-8 from left to right. Image 1 is exactly the top-left tile of the DCT basis image, which means its coefficient matrix has a nonzero weight only in the top-left corner and zeros everywhere else. Image 2 is identical to the tile in the first row and second column of the basis image, so its coefficient matrix has a nonzero weight only at (0, 1). The same pattern holds for the remaining images. This is the essence of the DCT.
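To tie this back to the basis image, the following small check (my addition, reusing the `DCT` matrix built earlier) places a single 1 at an arbitrary position $(m, n)$ of an otherwise-zero coefficient block; the IDCT of that block is exactly the outer product of the corresponding rows of the DCT matrix, i.e. the $(m, n)$ tile of the basis image before the display shift and scale.

```python
# Reconstruct the basis function for one coefficient position (m0, n0)
m0, n0 = 1, 2                              # arbitrary example position
coeff = np.zeros((8, 8))
coeff[m0, n0] = 1.0                        # a single nonzero DCT coefficient
tile = cv2.idct(coeff)                     # its spatial-domain appearance
print(np.allclose(tile, np.outer(DCT[m0, :], DCT[n0, :])))   # True
```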
The basic idea of JPEG compression
The DCT itself is lossless; its purpose is to transform the image from the spatial domain to the frequency domain. To achieve compression, we introduce a step between the DCT and the IDCT. The human eye is less sensitive to high-frequency information, so the compression strategy is to discard high-frequency information from the image. In the DCT coefficient matrix, the top-left corner holds the low-frequency information, while the bottom-right corner holds the high-frequency information. Below, I demonstrate the basic idea of JPEG compression by retaining only 1, 3, and 10 low-frequency coefficients.
```python
# Read image
img = io.imread("img3.jpg", as_gray=True)

# Obtain a mask through zigzag scanning
def z_scan_mask(C, N):
    mask = np.zeros((N, N))
    mask_m = 0
    mask_n = 0
    for i in range(C):
        if i == 0:
            mask[mask_m, mask_n] = 1
        else:
            # If (row + column) is even, move up and to the right
            if (mask_m + mask_n) % 2 == 0:
                mask_m -= 1
                mask_n += 1
                # If it exceeds the upper boundary, move down
                if mask_m < 0:
                    mask_m += 1
                # If it exceeds the right boundary, move left
                if mask_n >= N:
                    mask_n -= 1
            # If odd, move down and to the left
            else:
                mask_m += 1
                mask_n -= 1
                # If it exceeds the lower boundary, move up
                if mask_m >= N:
                    mask_m -= 1
                # If it exceeds the left boundary, move right
                if mask_n < 0:
                    mask_n += 1
            mask[mask_m, mask_n] = 1
    return mask

# Overlay the mask on the DCT coefficients, discarding the high-frequency components
def Compress(img, mask, N):
    img_dct = np.zeros((img.shape[0]//N*N, img.shape[1]//N*N))
    for m in range(0, img_dct.shape[0], N):
        for n in range(0, img_dct.shape[1], N):
            block = img[m:m+N, n:n+N]
            # DCT
            coeff = cv2.dct(block)
            # IDCT, keeping only the coefficients where the mask is 1
            iblock = cv2.idct(coeff*mask)
            img_dct[m:m+N, n:n+N] = iblock
    return img_dct

# Images keeping only 1, 3, and 10 low-frequency coefficients
plt.figure(figsize=(16, 4))
plt.gray()
plt.subplot(141)
plt.title('Original image')
plt.imshow(img)
plt.axis('off')
plt.subplot(142)
plt.title('Keep 1 coefficient')
plt.imshow(Compress(img, z_scan_mask(1, 8), 8))
plt.axis('off')
plt.subplot(143)
plt.title('Keep 3 coefficients')
plt.imshow(Compress(img, z_scan_mask(3, 8), 8))
plt.axis('off')
plt.subplot(144)
plt.title('Keep 10 coefficients')
plt.imshow(Compress(img, z_scan_mask(10, 8), 8))
plt.axis('off')
plt.show()
```
For the images above, the more coefficients you retain, the better the image quality. In the DCT coefficient matrix, the DC component in the top-left corner carries most of the information. If you keep only the DC component, each block is reconstructed from that single value, so the whole block becomes a uniform patch. Retaining more coefficients overlays more cosine components, restoring more detail in the image.
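To make "more coefficients, better quality" a bit more quantitative, here is a rough sketch (my addition, reusing `img`, `z_scan_mask`, and `Compress` from above) that prints the mean squared error against the original for a few mask sizes. The exact numbers depend on the input image, so treat them as illustrative only.

```python
# Reconstruction error versus number of retained coefficients
h, w = img.shape[0] // 8 * 8, img.shape[1] // 8 * 8   # crop to a multiple of the block size
for k in (1, 3, 10, 30):
    rec = Compress(img, z_scan_mask(k, 8), 8)
    mse = np.mean((img[:h, :w] - rec) ** 2)
    print(f"keep {k:2d} coefficients -> MSE = {mse:.6f}")
```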
Quantization
The idea behind quantization is similar to the masking step demonstrated above. The quantization matrix is user-defined: smaller quantization values retain more information from the image, while larger values retain less. For example, consider a 2×2 DCT coefficient matrix:
$$
C=\begin{bmatrix} 211 & 22 \\ 13 & 2 \end{bmatrix}
$$
If the quantization matrix is:
$$
Q=\begin{bmatrix} 6 & 6 \\ 6 & 6 \end{bmatrix}
$$
If we divide $C$ by $Q$ element-wise, round the resulting values, and then multiply them back by $Q$, we obtain the quantized matrix $C_q$:
$$
C_q=\begin{bmatrix} 210 & 24 \\ 12 & 0 \end{bmatrix}
$$
Indeed, we can see that the value 2 in $C$ becomes 0 after quantization and is discarded. If we reduce the quantization value, for example to 2, that coefficient is no longer discarded. Conversely, if we increase the quantization value, larger coefficients such as 13 may also be discarded. In software such as Photoshop and other image-processing tools, adjusting the "quality" setting corresponds to changing the quantization coefficients. The quantization matrix can also have a different value at each position.
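The 2×2 example above can be reproduced in a couple of lines (a quick sketch using NumPy's rounding):

```python
import numpy as np

C = np.array([[211., 22.],
              [ 13.,  2.]])
Q = np.full((2, 2), 6.)
C_q = np.around(C / Q) * Q      # divide element-wise, round, multiply back
print(C_q)                      # [[210.  24.]
                                #  [ 12.   0.]]
```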
```python
# Quantization matrix (the standard JPEG luminance quantization table)
q_mat = np.array([[16, 11, 10, 16, 24, 40, 51, 61],
                  [12, 12, 14, 19, 26, 58, 60, 55],
                  [14, 13, 16, 24, 40, 57, 69, 56],
                  [14, 17, 22, 29, 51, 87, 80, 62],
                  [18, 22, 37, 56, 68, 109, 103, 77],
                  [24, 35, 55, 64, 81, 104, 113, 92],
                  [49, 64, 78, 87, 103, 121, 120, 101],
                  [72, 92, 95, 98, 112, 100, 103, 99]])

# Quantization
def Compress_2(img, N):
    img_dct = np.zeros((img.shape[0]//N*N, img.shape[1]//N*N))
    for m in range(0, img_dct.shape[0], N):
        for n in range(0, img_dct.shape[1], N):
            block = img[m:m+N, n:n+N]
            # DCT
            coeff = cv2.dct(block)
            # Quantization (pixel values are in [0, 1], so scale coefficients by 255 first)
            q_coeff = np.around(coeff*255/q_mat)
            # Inverse quantization
            iq_coeff = q_coeff*q_mat
            # IDCT, scaled back to the [0, 1] range of the input
            iblock = cv2.idct(iq_coeff)/255
            img_dct[m:m+N, n:n+N] = iblock
    return img_dct

# Show images
plt.figure(figsize=(8, 4))
plt.gray()
plt.subplot(121)
plt.title('Original')
plt.imshow(img)
plt.axis('off')
plt.subplot(122)
plt.title('Quantized')
plt.imshow(Compress_2(img, 8))
plt.axis('off')
plt.show()
```