Denoising Images with a Convolutional Autoencoder in TensorFlow
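An autoencoder is a network architecture for data compression in which the compression and decompression functions are learned from the data itself rather than designed by hand. Its two core parts are the encoder and the decoder: the encoder compresses the input into a latent representation, and the decoder reconstructs the output from that representation. Both parts are neural networks, and the whole model is built and trained like an ordinary neural network, by minimizing the difference between the input and the output.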
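To make the encode, decode, and minimize-the-difference idea concrete, here is a minimal sketch in plain NumPy. It is not the convolutional model built later in this article: the toy data, the single linear layer on each side, the learning rate, and all names (`w_enc`, `w_dec`, `reconstruct`) are assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (an assumption): 200 samples in 8 dimensions that lie on a 2-D subspace.
latent_true = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 8))
x = latent_true @ mixing
x = x / x.std()  # rescale so a fixed learning rate behaves well

# Encoder and decoder are single linear layers here (hypothetical, for illustration).
w_enc = rng.normal(scale=0.1, size=(8, 2))
w_dec = rng.normal(scale=0.1, size=(2, 8))


def reconstruct(x, w_enc, w_dec):
    code = x @ w_enc            # compress: 8 -> 2
    return code, code @ w_dec   # decompress: 2 -> 8


def mse(a, b):
    return float(np.mean((a - b) ** 2))


initial_error = mse(x, reconstruct(x, w_enc, w_dec)[1])

lr = 0.02
for step in range(2000):
    code, x_hat = reconstruct(x, w_enc, w_dec)
    err = x_hat - x
    # Gradients of the squared reconstruction error, averaged over samples
    grad_dec = code.T @ err / len(x)
    grad_enc = x.T @ (err @ w_dec.T) / len(x)
    w_dec -= lr * grad_dec
    w_enc -= lr * grad_enc

final_error = mse(x, reconstruct(x, w_enc, w_dec)[1])
print("reconstruction error before/after training:", initial_error, final_error)
```

The input serves as its own training target, which is the property that the convolutional version later in the article relies on as well.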
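Autoencoders are mainly used for:
1. Image denoising;
2. Data compression and dimensionality reduction.
However, an autoencoder compresses worse than traditional hand-engineered codecs (such as JPEG for images or MP3 for audio), and it has trouble generalizing to datasets other than the one it was trained on.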
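Our data comes from the MNIST dataset. First download the data and place it in the MNIST_data directory; you can get it from the link at the end of this article or find it elsewhere online.
[Figure: directory structure]
[Figure: the MNIST dataset]
Load the dataset: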
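# Jupyter magic: render plots inside the notebook
%matplotlib inline
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
# The tutorials module is only available in TensorFlow 1.x
from tensorflow.examples.tutorials.mnist import input_data

# validation_size=0 keeps every training image in mnist.train
mnist = input_data.read_data_sets('MNIST_data/', validation_size=0)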
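View a single image:
# Each image is stored as a flat 784-element vector, so reshape it to 28x28
img = mnist.train.images[2]
plt.imshow(img.reshape((28, 28)), cmap='Greys_r')
Output:
[Figure: the selected MNIST digit rendered in grayscale]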
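The encoder part of the network is a typical convolutional pyramid: every convolutional layer is followed by a max-pooling layer that reduces the spatial dimensions. The decoder has to convert a narrow representation back into a wide reconstructed image. For example, the representation could be the 4x4x8 output of a max-pooling layer. This is the output of the encoder and also the input to the decoder. We want a 28x28x1 image out of the decoder, so we need to work our way back up from the narrow decoder input layer. Here is a schematic of the network:
[Figure: schematic of the network]
Here the final encoder layer has size 4x4x8 = 128. The original image has size 28x28x1 = 784, so the encoded vector is roughly 16% of the size of the original image. These are only suggested sizes for each layer. The depth and width of the network can be changed, but remember that the goal is to find a small representation of the input data.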
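The size arithmetic above can be checked with a few lines of Python. This sketch assumes three stride-2 max-pooling layers with 'same' padding, for which the output size is the ceiling of the input size divided by the stride; the helper name `same_pool_output` is made up for this example.

```python
import math


def same_pool_output(size, stride=2):
    """Spatial size after a pooling layer with 'same' padding: ceil(size / stride)."""
    return math.ceil(size / stride)


size = 28
sizes = [size]
for _ in range(3):  # three max-pooling layers in the encoder
    size = same_pool_output(size)
    sizes.append(size)

encoded_units = sizes[-1] * sizes[-1] * 8  # 8 channels in the last encoder layer
input_units = 28 * 28 * 1
ratio = encoded_units / input_units

print("spatial sizes:", sizes)
print("encoded units:", encoded_units, "of", input_units)
print("ratio: {:.1%}".format(ratio))
```

Note that 7 is odd, so the last pooling step rounds up to 4 rather than down to 3. That is why the decoder has to resize to explicit target sizes instead of simply doubling.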
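In the encoder we use convolutional and max-pooling layers to keep shrinking the input. In the decoder we have to bring the 4x4x8 representation back up to the original 28x28x1. This is often done with transposed convolution (loosely called deconvolution); for background on it, see this article. Here we instead upsample the feature maps by resizing them and then apply an ordinary convolution. In TensorFlow the resizing is easy to do with tf.image.resize_images or tf.image.resize_nearest_neighbor. The code is as follows: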
inputs_ = tf.placeholder(tf.float32, (None, 28, 28, 1), name='inputs')
targets_ = tf.placeholder(tf.float32, (None, 28, 28, 1), name='targets')

### Encoder: compression
conv1 = tf.layers.conv2d(inputs_, 16, (3,3), padding='same', activation=tf.nn.relu)
# Current shape: 28x28x16
maxpool1 = tf.layers.max_pooling2d(conv1, (2,2), (2,2), padding='same')
# Current shape: 14x14x16
conv2 = tf.layers.conv2d(maxpool1, 8, (3,3), padding='same', activation=tf.nn.relu)
# Current shape: 14x14x8
maxpool2 = tf.layers.max_pooling2d(conv2, (2,2), (2,2), padding='same')
# Current shape: 7x7x8
conv3 = tf.layers.conv2d(maxpool2, 8, (3,3), padding='same', activation=tf.nn.relu)
# Current shape: 7x7x8
encoded = tf.layers.max_pooling2d(conv3, (2,2), (2,2), padding='same')
# Current shape: 4x4x8

### Decoder: reconstruction
upsample1 = tf.image.resize_nearest_neighbor(encoded, (7,7))
# Current shape: 7x7x8
conv4 = tf.layers.conv2d(upsample1, 8, (3,3), padding='same', activation=tf.nn.relu)
# Current shape: 7x7x8
upsample2 = tf.image.resize_nearest_neighbor(conv4, (14,14))
# Current shape: 14x14x8
conv5 = tf.layers.conv2d(upsample2, 8, (3,3), padding='same', activation=tf.nn.relu)
# Current shape: 14x14x8
upsample3 = tf.image.resize_nearest_neighbor(conv5, (28,28))
# Current shape: 28x28x8
conv6 = tf.layers.conv2d(upsample3, 16, (3,3), padding='same', activation=tf.nn.relu)
# Current shape: 28x28x16

# Final 1-channel layer without activation; the sigmoid is applied separately
logits = tf.layers.conv2d(conv6, 1, (3,3), padding='same', activation=None)
# Current shape: 28x28x1

# Reconstructed image with pixel values in (0, 1)
decoded = tf.nn.sigmoid(logits, name='decoded')

# Compute the loss: per-pixel sigmoid cross-entropy, averaged over all pixels
loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=targets_, logits=logits)
cost = tf.reduce_mean(loss)
# Minimize the loss with the Adam optimizer
opt = tf.train.AdamOptimizer(0.001).minimize(cost)
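To show what the nearest-neighbor upsampling step in the decoder does to a feature map, here is a small NumPy sketch. The function `resize_nearest` and its index rule (floor of the output index times the input/output size ratio) are assumptions for illustration; TensorFlow's own implementation may round indices differently depending on its options.

```python
import numpy as np


def resize_nearest(images, out_hw):
    """Nearest-neighbor resize of a batch shaped (N, H, W, C)."""
    out_h, out_w = out_hw
    _, in_h, in_w, _ = images.shape
    # For every output row/column, pick the source row/column it maps back to
    rows = np.floor(np.arange(out_h) * in_h / out_h).astype(int)
    cols = np.floor(np.arange(out_w) * in_w / out_w).astype(int)
    return images[:, rows][:, :, cols]


# A 4x4 single-channel "feature map" with distinct values 0..15
encoded_demo = np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1)
upsampled_demo = resize_nearest(encoded_demo, (7, 7))

print(upsampled_demo.shape)
print(upsampled_demo[0, :, :, 0])
```

Resizing only copies existing values, so it has no trainable parameters. The convolution that follows each resize is what learns to fill in detail.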
Train the network:
sess = tf.Session()

epochs = 20
batch_size = 200
sess.run(tf.global_variables_initializer())
for e in range(epochs):
    for ii in range(mnist.train.num_examples // batch_size):
        batch = mnist.train.next_batch(batch_size)
        # Reshape the flat 784-pixel vectors into 28x28x1 images
        imgs = batch[0].reshape((-1, 28, 28, 1))
        # The input is also the target: the network learns to reconstruct it
        batch_cost, _ = sess.run([cost, opt], feed_dict={inputs_: imgs,
                                                         targets_: imgs})
    # Report the loss of the last batch in each epoch
    print("Epoch: {}/{}...".format(e + 1, epochs),
          "Training loss: {:.4f}".format(batch_cost))
Compare the original test images (top row) with their reconstructions (bottom row):
fig, axes = plt.subplots(nrows=2, ncols=10, sharex=True, sharey=True, figsize=(20,4))
in_imgs = mnist.test.images[:10]
reconstructed = sess.run(decoded, feed_dict={inputs_: in_imgs.reshape((10, 28, 28, 1))})

for images, row in zip([in_imgs, reconstructed], axes):
    for img, ax in zip(images, row):
        ax.imshow(img.reshape((28, 28)), cmap='Greys_r')
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)
fig.tight_layout(pad=0.1)
sess.close()
For denoising, the network keeps the same structure but uses more filters in each layer:
inputs_ = tf.placeholder(tf.float32, (None, 28, 28, 1), name='inputs')
targets_ = tf.placeholder(tf.float32, (None, 28, 28, 1), name='targets')

### Encoder
conv1 = tf.layers.conv2d(inputs_, 32, (3,3), padding='same', activation=tf.nn.relu)
# Current shape: 28x28x32
maxpool1 = tf.layers.max_pooling2d(conv1, (2,2), (2,2), padding='same')
# Current shape: 14x14x32
conv2 = tf.layers.conv2d(maxpool1, 32, (3,3), padding='same', activation=tf.nn.relu)
# Current shape: 14x14x32
maxpool2 = tf.layers.max_pooling2d(conv2, (2,2), (2,2), padding='same')
# Current shape: 7x7x32
conv3 = tf.layers.conv2d(maxpool2, 16, (3,3), padding='same', activation=tf.nn.relu)
# Current shape: 7x7x16
encoded = tf.layers.max_pooling2d(conv3, (2,2), (2,2), padding='same')
# Current shape: 4x4x16

### Decoder
upsample1 = tf.image.resize_nearest_neighbor(encoded, (7,7))
# Current shape: 7x7x16
conv4 = tf.layers.conv2d(upsample1, 16, (3,3), padding='same', activation=tf.nn.relu)
# Current shape: 7x7x16
upsample2 = tf.image.resize_nearest_neighbor(conv4, (14,14))
# Current shape: 14x14x16
conv5 = tf.layers.conv2d(upsample2, 32, (3,3), padding='same', activation=tf.nn.relu)
# Current shape: 14x14x32
upsample3 = tf.image.resize_nearest_neighbor(conv5, (28,28))
# Current shape: 28x28x32
conv6 = tf.layers.conv2d(upsample3, 32, (3,3), padding='same', activation=tf.nn.relu)
# Current shape: 28x28x32

# Final 1-channel layer without activation; the sigmoid is applied separately
logits = tf.layers.conv2d(conv6, 1, (3,3), padding='same', activation=None)
# Current shape: 28x28x1

# Reconstructed image with pixel values in (0, 1)
decoded = tf.nn.sigmoid(logits, name='decoded')

# Per-pixel sigmoid cross-entropy, averaged, minimized with Adam
loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=targets_, logits=logits)
cost = tf.reduce_mean(loss)
opt = tf.train.AdamOptimizer(0.001).minimize(cost)
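The loss is computed from the raw logits rather than from the sigmoid output, which avoids taking the logarithm of values that have been rounded to 0 or 1. As a rough illustration, the sketch below writes out the commonly used numerically stable form, max(x, 0) - x*z + log(1 + exp(-|x|)), in NumPy and compares it with the textbook definition. The function names are made up for this example and it is not TensorFlow's actual implementation.

```python
import numpy as np


def sigmoid_cross_entropy(logits, labels):
    """Stable per-element cross-entropy between sigmoid(logits) and labels."""
    return (np.maximum(logits, 0.0)
            - logits * labels
            + np.log1p(np.exp(-np.abs(logits))))


def naive_cross_entropy(logits, labels):
    """Textbook definition; only safe for moderate logits."""
    p = 1.0 / (1.0 + np.exp(-logits))
    return -(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p))


x_demo = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])   # logits
z_demo = np.array([0.0, 1.0, 0.3, 0.9, 1.0])     # pixel targets in [0, 1]

stable = sigmoid_cross_entropy(x_demo, z_demo)
naive = naive_cross_entropy(x_demo, z_demo)
print(stable)
print(naive)

# The stable form stays finite even for very large logits
extreme = sigmoid_cross_entropy(np.array([1000.0, -1000.0]), np.array([0.0, 1.0]))
print(extreme)
```

The targets here are pixel intensities between 0 and 1 rather than hard 0/1 labels, which this loss accepts.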
Train the network with noisy images as inputs and the clean originals as targets:
sess = tf.Session()
epochs = 100
batch_size = 200
# Sets how much noise we add to the MNIST images
noise_factor = 0.5
sess.run(tf.global_variables_initializer())
for e in range(epochs):
    for ii in range(mnist.train.num_examples // batch_size):
        batch = mnist.train.next_batch(batch_size)
        # Get images from the batch
        imgs = batch[0].reshape((-1, 28, 28, 1))

        # Add random Gaussian noise to the input images
        noisy_imgs = imgs + noise_factor * np.random.randn(*imgs.shape)
        # Clip the images to be between 0 and 1
        noisy_imgs = np.clip(noisy_imgs, 0., 1.)

        # Noisy images as inputs, original images as targets
        batch_cost, _ = sess.run([cost, opt], feed_dict={inputs_: noisy_imgs,
                                                         targets_: imgs})

    # Report the loss of the last batch in each epoch
    print("Epoch: {}/{}...".format(e + 1, epochs),
          "Training loss: {:.4f}".format(batch_cost))
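The corruption step in the loop above can be pulled out into a small helper, which makes it easy to reuse the same noise at test time. This is a sketch with assumed names (`add_gaussian_noise`, `clean_demo`); random arrays stand in for MNIST batches so that it runs without the dataset.

```python
import numpy as np


def add_gaussian_noise(images, noise_factor=0.5, rng=None):
    """Add zero-mean Gaussian noise and clip the result back to [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = images + noise_factor * rng.standard_normal(images.shape)
    return np.clip(noisy, 0.0, 1.0)


demo_rng = np.random.default_rng(42)
# Stand-in for a batch of MNIST images: values in [0, 1), shape (N, 28, 28, 1)
clean_demo = demo_rng.random((4, 28, 28, 1))
noisy_demo = add_gaussian_noise(clean_demo, noise_factor=0.5, rng=demo_rng)

print(noisy_demo.shape, noisy_demo.min(), noisy_demo.max())
```

With noise_factor = 0.5, many pixels saturate at 0 or 1 after clipping, so the network cannot simply subtract the noise; it has to learn what digits look like.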
Check the denoising result on test images (top row: noisy inputs, bottom row: reconstructions):
fig, axes = plt.subplots(nrows=2, ncols=10, sharex=True, sharey=True, figsize=(20,4))
in_imgs = mnist.test.images[:10]
# Corrupt the test images the same way as during training
noisy_imgs = in_imgs + noise_factor * np.random.randn(*in_imgs.shape)
noisy_imgs = np.clip(noisy_imgs, 0., 1.)

reconstructed = sess.run(decoded, feed_dict={inputs_: noisy_imgs.reshape((10, 28, 28, 1))})

for images, row in zip([noisy_imgs, reconstructed], axes):
    for img, ax in zip(images, row):
        ax.imshow(img.reshape((28, 28)), cmap='Greys_r')
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)

fig.tight_layout(pad=0.1)
Source code and data: see here
Original article: https://blog.csdn.net/qq_34464926/article/details/80936150