[Deep Learning] Final Summary



week 9

Clustering

Feature scaling required!! (distance-based clustering is sensitive to feature scales)

Agglomerative Clustering

  • linkage: data (DataFrame/array), metric, method → returns the linkage matrix
  • dendrogram: linkage matrix (link_dist), labels
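A minimal sketch with SciPy (X is an illustrative feature matrix; scale first):

from scipy.cluster.hierarchy import linkage, dendrogram
from sklearn.preprocessing import StandardScaler

X_scaled = StandardScaler().fit_transform(X)   # scaling first!
link_dist = linkage(X_scaled, metric='euclidean', method='ward')  # linkage matrix
dendrogram(link_dist, labels=list(sample_names))  # sample_names is illustrative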

KMeans

  • KMeans: n_clusters
    • labels_, cluster_centers_
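For example (a minimal sklearn sketch; X_scaled as above):

from sklearn.cluster import KMeans

km = KMeans(n_clusters=3, random_state=0).fit(X_scaled)
print(km.labels_)           # cluster index per sample
print(km.cluster_centers_)  # centroid coordinates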

DBSCAN

  • core point, border point, noise point

  • DBSCAN: eps, min_samples, metric

    • labels_ = 0..n: cluster assignments per sample
    • labels_ = -1: noise points
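A minimal sketch (same X_scaled):

from sklearn.cluster import DBSCAN

labels = DBSCAN(eps=0.5, min_samples=5, metric='euclidean').fit_predict(X_scaled)
print(labels)  # cluster ids 0..n; -1 marks noise points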


Dimension Reduction

  • SelectPercentile: score_func, percentile
  • PCA: n_components
  • TSNE: n_components, perplexity (related to the number of nearest neighbors used in other manifold learning algorithms)
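A minimal sketch of all three (X, y are illustrative data and labels; SelectPercentile needs y):

from sklearn.feature_selection import SelectPercentile, f_classif
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X_sel  = SelectPercentile(score_func=f_classif, percentile=50).fit_transform(X, y)
X_pca  = PCA(n_components=2).fit_transform(X)
X_tsne = TSNE(n_components=2, perplexity=30).fit_transform(X)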



week 10

TensorFlow 1.x

# TF1-style graph execution via the compat module
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# placeholder(): graph inputs that are fed at run time
X = tf.placeholder("float")  # slot to hold X
Y = tf.placeholder("float")  # slot to hold Y
W = tf.Variable(np.random.randn(), name="W")
b = tf.Variable(np.random.randn(), name="b")

# session: operations only produce values inside a Session
x1 = tf.constant([1, 2, 3, 4])
x2 = tf.constant([5, 6, 7, 8])
result = tf.multiply(x1, x2)

with tf.Session() as sess:
  output = sess.run(result)
  print(output)

# tf.global_variables_initializer: variables must be initialized before use
sess = tf.Session()
sess.run(tf.global_variables_initializer())
lossHistory = []

# feed_dict supplies values for the placeholders at run time.
# cost and the update ops are assumed to be defined earlier, e.g.:
#   cost = tf.reduce_mean(tf.square(W * X + b - Y))
#   W_update = W.assign_sub(learning_rate * tf.gradients(cost, W)[0])
#   b_update = b.assign_sub(learning_rate * tf.gradients(cost, b)[0])
for i in range(300):
    sess.run([W_update, b_update], feed_dict={X: x, Y: y})
    cost_val, W_val, b_val = sess.run([cost, W, b], feed_dict={X: x, Y: y})
    lossHistory.append(cost_val)

sess.close()


TensorFlow 2.x

# tf.GradientTape() records operations for automatic differentiation
# (W, b are tf.Variable; x, y, learning_rate, lossHistory defined earlier)
for i in range(300):
    with tf.GradientTape() as tape:
        y_pred = W * x + b
        cost = tf.reduce_mean(tf.square(y_pred - y))

    W_grad, b_grad = tape.gradient(cost, [W, b])  # dCost/dW, dCost/db

    W.assign_sub(learning_rate * W_grad)
    b.assign_sub(learning_rate * b_grad)
    # or equivalently, with a built-in optimizer:
    # optimizer = tf.optimizers.Adam(learning_rate)
    # optimizer.apply_gradients(zip([W_grad, b_grad], [W, b]))
    lossHistory.append(cost.numpy())
    if i % 10 == 0:
        print("{:5}|{:10.4f}|{:10.4f}|{:10.6f}".format(i, W.numpy(), b.numpy(), cost.numpy()))
       


Keras

from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Flatten, Dense
from tensorflow.keras.optimizers import SGD, Adam

model = Sequential()

model.add(Flatten(input_shape=(1,)))
model.add(Dense(2, activation='sigmoid'))
# or, more commonly, pass input_shape to the first layer itself:
# model.add(Dense(2, activation='sigmoid', input_shape=(1,)))
model.summary()
'''
Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
flatten (Flatten)            (None, 1)                 0         
_________________________________________________________________
dense_4 (Dense)              (None, 2)                 4         
=================================================================
Total params: 4
Trainable params: 4
Non-trainable params: 0
_________________________________________________________________
'''
# regression
model.compile(optimizer=SGD(learning_rate=0.1), 
              loss='mse',
              metrics=['mae'])   # accuracy is not meaningful for regression
# classification
model.compile(optimizer=Adam(learning_rate=1e-3), 
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=10, 
          batch_size=100,
          verbose=0,
          validation_split=0.2)

model.evaluate(x_test, y_test, batch_size=10)  # evaluate() takes no epochs argument
model.predict(x_input_data, batch_size=100)

model.save("model_name.h5")
# and later
model = load_model("model_name.h5")



week 11

Keras Introduction

# Functional API
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense

inputs = Input(shape=(784,))
x = Dense(64, activation="relu")(inputs)
x = Dense(64, activation="relu")(x)
outputs = Dense(10)(x)
# create a Model by specifying its inputs and outputs in the graph of layers
model = Model(inputs=inputs, outputs=outputs, name="mnist_model")
model.summary()
'''
Model: "mnist_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_3 (InputLayer)         [(None, 784)]             0         
_________________________________________________________________
dense_48 (Dense)             (None, 64)                50240     
_________________________________________________________________
dense_49 (Dense)             (None, 64)                4160      
_________________________________________________________________
dense_50 (Dense)             (None, 10)                650       
=================================================================
Total params: 55,050
Trainable params: 55,050
Non-trainable params: 0
_________________________________________________________________
- 784 * 64 + 64 = 50240
- 64 * 64 + 64 = 4160
- 64 * 10 + 10 = 650
'''

# Sequential API
model = Sequential()
model.add(Dense(64, input_shape=(784,), activation='relu')) # specify input_shape on the first layer
model.add(Dense(64, activation='relu'))
model.add(Dense(10))
model.summary()
'''
Model: "sequential_17"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_54 (Dense)             (None, 64)                50240     
_________________________________________________________________
dense_55 (Dense)             (None, 64)                4160      
_________________________________________________________________
dense_56 (Dense)             (None, 10)                650       
=================================================================
Total params: 55,050
Trainable params: 55,050
Non-trainable params: 0
_________________________________________________________________
'''


Optimizers

Gradient Descent

Stochastic gradient descent

Mini-batch gradient descent

Momentum

NAG (Nesterov accelerated gradient)

Adagrad

  • A different update for each weight (per-parameter learning rates)
  • Parameters that have changed little in the past get a larger learning rate

RMSProp

  • Adagrad's learning rate can decay too sharply, so it may never reach the global optimum
  • Instead of summing every gradient G_t from the start, use an exponential moving average that favors recent gradients

Adam

  • RMSProp + Momentum

Summary
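The slide figures showed the per-optimizer update rules; the standard textbook forms (reconstructed, not copied from the slides) are:

$$
\begin{aligned}
\text{SGD:}\quad & \theta \leftarrow \theta - \eta\,\nabla_\theta J(\theta)\\
\text{Momentum:}\quad & v \leftarrow \gamma v + \eta\,\nabla_\theta J(\theta),\qquad \theta \leftarrow \theta - v\\
\text{NAG:}\quad & v \leftarrow \gamma v + \eta\,\nabla_\theta J(\theta - \gamma v),\qquad \theta \leftarrow \theta - v\\
\text{Adagrad:}\quad & G \leftarrow G + (\nabla_\theta J)^2,\qquad \theta \leftarrow \theta - \tfrac{\eta}{\sqrt{G+\epsilon}}\,\nabla_\theta J\\
\text{RMSProp:}\quad & G \leftarrow \gamma G + (1-\gamma)(\nabla_\theta J)^2,\qquad \theta \leftarrow \theta - \tfrac{\eta}{\sqrt{G+\epsilon}}\,\nabla_\theta J\\
\text{Adam:}\quad & m \leftarrow \beta_1 m + (1-\beta_1)\nabla_\theta J,\quad v \leftarrow \beta_2 v + (1-\beta_2)(\nabla_\theta J)^2,\quad \theta \leftarrow \theta - \tfrac{\eta}{\sqrt{\hat v}+\epsilon}\,\hat m
\end{aligned}
$$

(with $\hat m$, $\hat v$ the bias-corrected moment estimates in Adam.)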



week 12

MLP

  • parameters = (input_shape) * (# of neurons) + (# of neurons)
# Transfer learning
from tensorflow.keras.applications import VGG16
from tensorflow.keras import models, layers, optimizers

conv_base = VGG16(weights='imagenet',   # weights to load
                  include_top=False,
                  input_shape=(150, 150, 3))
conv_base.trainable = False  # freeze the convolutional base

model = models.Sequential()
model.add(conv_base) # feature extractor
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

model.summary()

'''
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
vgg16 (Functional)           (None, 4, 4, 512)         14714688  
_________________________________________________________________
flatten (Flatten)            (None, 8192)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 256)               2097408   
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 257       
=================================================================
Total params: 16,812,353
Trainable params: 16,812,353
Non-trainable params: 0
_________________________________________________________________
'''
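The Dense counts match the MLP formula above: dense_2 = 8192 * 256 + 256 = 2,097,408 and dense_3 = 256 * 1 + 1 = 257. Note that this summary was captured before freezing; with conv_base.trainable = False applied, the 14,714,688 VGG16 parameters are reported as non-trainable, leaving 2,097,665 trainable.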

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=1e-5),
              metrics=['acc'])


# fit_generator is deprecated in recent TF; model.fit accepts generators directly
history = model.fit_generator(
    generator=train_generator, 
    steps_per_epoch=100,
    epochs=30,
    validation_data=validation_generator,
    validation_steps=50)

 


CNN

  • parameters = (kernel_size) * (input_depth) * (# of neurons) + (# of neurons)
model = models.Sequential()

model.add(layers.Conv2D(32,(3,3), activation = 'relu', 
                        input_shape=(img_width, img_height, 3)))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(64,(3,3), activation = 'relu'))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(128,(3,3), activation = 'relu'))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(128,(3,3), activation = 'relu'))
model.add(layers.MaxPooling2D((2,2)))

model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

model.summary()

'''
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 148, 148, 32)      896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 74, 74, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 72, 72, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 36, 36, 64)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 34, 34, 128)       73856     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 17, 17, 128)       0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 15, 15, 128)       147584    
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 7, 7, 128)         0         
_________________________________________________________________
flatten (Flatten)            (None, 6272)              0         
_________________________________________________________________
dense (Dense)                (None, 512)               3211776   
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 513       
=================================================================
Total params: 3,453,121
Trainable params: 3,453,121
Non-trainable params: 0
_________________________________________________________________
'''
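These counts follow the CNN formula above: conv2d = 3*3*3*32 + 32 = 896, conv2d_1 = 3*3*32*64 + 64 = 18,496, and dense = 6272*512 + 512 = 3,211,776.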

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=1e-4),
              metrics=['acc'])
# Use ImageDataGenerator
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1./255)
train_generator = datagen.flow_from_directory(directory=train_dir,
                                              target_size=(img_width, img_height),
                                              classes=['dogs', 'cats'],
                                              class_mode='binary',
                                              batch_size=20)

validation_generator = datagen.flow_from_directory(directory=validation_dir,
                                                   target_size=(img_width, img_height),
                                                   classes=['dogs', 'cats'],
                                                   class_mode='binary',
                                                   batch_size=20)

history = model.fit_generator(
    generator=train_generator, 
    steps_per_epoch=100,
    epochs=10,
    validation_data=validation_generator,
    validation_steps=50)



week 13

Embedding/RNN

  • Embedding
    • parameters = (input_dim, vocabulary size) * (output_dim, word-vector dimension)
    • output = (None, None, output_dim)
    • when input_length is specified:
      • output = (None, input_length, output_dim)
  • SimpleRNN:
    • parameters = (embedding output_dim + # of neurons) * (# of neurons) + (# of neurons)
    • output = (None, # of neurons)
    • when return_sequences = True:
      • output = (None, input_length, # of neurons)
    • One difference with SimpleRNN is that, like every other Keras layer, it processes batches of sequences rather than a single sequence as in a NumPy example: it takes input of shape (batch_size, timesteps, input_features) instead of (timesteps, input_features).
    • Like all recurrent layers in Keras, SimpleRNN can run in two modes: it can return the full sequence of outputs for every timestep (a 3D tensor of shape (batch_size, timesteps, output_features)), or only the last output for each input sequence (a 2D tensor of shape (batch_size, output_features)). The mode is selected with the return_sequences constructor argument.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN

model = Sequential()
model.add(Embedding(10000, 32)) # vocabulary size 10000, word-vector dimension 32
model.add(SimpleRNN(32))
model.summary()
model.input_shape, model.output_shape

'''
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, None, 32)          320000    
_________________________________________________________________
simple_rnn (SimpleRNN)       (None, 32)                2080      
=================================================================
Total params: 322,080
Trainable params: 322,080
Non-trainable params: 0
_________________________________________________________________

((None, None), (None, 32))

- embedding(in_dim, out_dim): 10000 * 32 = 320000
- simpleRNN: (32 + 32)*32 + 32 = 2080
'''

# To connect a Flatten or Dense layer afterwards, the sequence length must be fixed
model = Sequential()
model.add(Embedding(10000, 32, input_length=20)) # fixed input length of 20
model.add(SimpleRNN(32))
model.summary()
model.input_shape, model.output_shape

'''
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, 20, 32)            320000    
_________________________________________________________________
simple_rnn_1 (SimpleRNN)     (None, 32)                2080      
=================================================================
Total params: 322,080
Trainable params: 322,080
Non-trainable params: 0
_________________________________________________________________

((None, 20), (None, 32))
'''

# Keep the RNN layer's intermediate outputs
model = Sequential()
model.add(Embedding(10000, 32, input_length=20))
model.add(SimpleRNN(32, return_sequences=True)) # return the full output sequence
model.summary()

'''
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_4 (Embedding)      (None, 20, 32)            320000    
_________________________________________________________________
simple_rnn_4 (SimpleRNN)     (None, 20, 32)            2080      
=================================================================
Total params: 322,080
Trainable params: 322,080
Non-trainable params: 0
_________________________________________________________________
'''


  • To increase the representational power of the network, it can be useful to stack several recurrent layers one after another. In such a setup, the intermediate layers must return the full output sequence:
model = Sequential()
model.add(Embedding(10000, 32, input_length=20))
model.add(SimpleRNN(32, return_sequences=True))
model.add(SimpleRNN(32, return_sequences=True))
model.add(SimpleRNN(32, return_sequences=True))
model.add(SimpleRNN(32))  # only the top layer returns just the last output
model.summary()

'''
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_12 (Embedding)     (None, 20, 32)            320000    
_________________________________________________________________
simple_rnn_12 (SimpleRNN)    (None, 20, 32)            2080      
_________________________________________________________________
simple_rnn_13 (SimpleRNN)    (None, 20, 32)            2080      
_________________________________________________________________
simple_rnn_14 (SimpleRNN)    (None, 20, 32)            2080      
_________________________________________________________________
simple_rnn_15 (SimpleRNN)    (None, 32)                2080      
=================================================================
Total params: 328,320
Trainable params: 328,320
Non-trainable params: 0
_________________________________________________________________
'''


Vectorizing

  • CountVectorizer
  • TfidfVectorizer: tokenizer, max_features
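TF-IDF weighs a term by its frequency in a document, discounted by how common the term is across documents. In the standard form (sklearn's TfidfVectorizer uses a smoothed, L2-normalized variant):

$$\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \times \log\frac{N}{\mathrm{df}(t)}$$

where tf(t, d) is the count of term t in document d, N is the number of documents, and df(t) is the number of documents containing t.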
from sklearn.feature_extraction.text import TfidfVectorizer

# twitter_tokenizer: a custom Korean tokenizer defined elsewhere (e.g. with KoNLPy)
cv = TfidfVectorizer(tokenizer=twitter_tokenizer, max_features=3000)
X_train = cv.fit_transform(text_train)
X_test = cv.transform(text_test) # never fit_transform on the test set

X_train.shape, y_train.shape, X_test.shape, y_test.shape

# ((2000, 3000), (2000,), (1000, 3000), (1000,))
# CountVectorizer
from sklearn.feature_extraction.text import CountVectorizer
corpus = [
        'This is the first document.',
        'This document is the second document.',
        'And this is the third one.',
        'Is this the first document?',
]
vectorizer1 = CountVectorizer()
X = vectorizer1.fit_transform(corpus)
print(vectorizer1.get_feature_names())  # get_feature_names_out() in sklearn >= 1.0
print(X.toarray())

'''
['and', 'document', 'first', 'is', 'one', 'second', 'the', 'third', 'this']
[[0 1 1 1 0 0 1 0 1]
 [0 2 0 1 0 1 1 0 1]
 [1 0 0 1 1 0 1 1 1]
 [0 1 1 1 0 0 1 0 1]]
'''

# TfidfVectorizer
vectorizer2 = TfidfVectorizer()
X = vectorizer2.fit_transform(corpus)
print(vectorizer2.get_feature_names())
print(X.toarray().round(2))

'''
['and', 'document', 'first', 'is', 'one', 'second', 'the', 'third', 'this']
[[0.   0.47 0.58 0.38 0.   0.   0.38 0.   0.38]
 [0.   0.69 0.   0.28 0.   0.54 0.28 0.   0.28]
 [0.51 0.   0.   0.27 0.51 0.   0.27 0.51 0.27]
 [0.   0.47 0.58 0.38 0.   0.   0.38 0.   0.38]]
'''


  • Word2Vec: sentence_list, sg (skip-gram), size, window, min_count
from gensim.models import Word2Vec

model = Word2Vec(data,         # data: a list of tokenized sentences
                 sg=1,         # 0: CBOW, 1: Skip-gram
                 size=100,     # word-vector dimension (renamed vector_size in gensim >= 4)
                 window=3,     # context window (3 words before and after)
                 min_count=3,  # ignore words that appear fewer than 3 times
                 workers=4)    # worker threads (set roughly to the number of cores)

model.wv['대한민국']
'''
array([-2.4107585e-02, -7.4946046e-02,  1.5689157e-03,  1.7300507e-02,
        7.7659652e-02, -4.3071166e-02,  8.3631985e-02,  1.6745523e-01,
       -8.2903586e-02, -1.7553378e-02,  3.9016213e-02, -1.0054115e-01,
        4.1688729e-02,  1.7242630e-01, -1.8903978e-02,  1.2952442e-01,
        4.8356697e-02,  4.0910381e-01, -7.0913650e-02, -5.0823655e-02,
        1.4685905e-01, -1.2997684e-01,  2.2543812e-02, -3.7712879e-02,
        9.6920088e-02,  1.3099691e-01, -1.3746825e-01, -1.0660959e-01,
        1.1127534e-01,  1.2975276e-01, -2.8525587e-02, -1.2853998e-01,
       -8.3741836e-02, -9.9310517e-02, -2.4495709e-01, -4.1113162e-01,
        1.0418992e-02,  7.9034410e-02,  1.3711397e-01, -5.1028132e-02,
       -1.4102933e-01, -4.6473064e-02, -7.5484976e-02, -6.2391542e-02,
       -4.0519308e-02, -1.5226401e-01, -1.3334070e-01, -1.7248647e-01,
       -9.5049895e-02,  9.9440172e-02, -2.9708706e-02,  8.7483376e-02,
        8.1404611e-02,  1.3708833e-01, -1.1457676e-01, -9.5910830e-03,
       -6.4596653e-02, -2.4731688e-01,  3.0563422e-02,  1.2345860e-01,
       -3.4807574e-02,  1.6530770e-01,  1.2371200e-01, -1.2324062e-02,
        1.4210464e-01, -1.4213949e-01,  1.7249145e-01, -7.8410409e-02,
       -6.2629886e-02, -9.0875283e-02,  2.9489502e-02,  2.1956262e-01,
        3.4037119e-01,  1.0848373e-01,  3.6547065e-02, -1.5146755e-01,
        5.6681294e-02,  6.6085658e-03,  1.9274153e-02,  1.9991216e-01,
       -1.5090431e-01,  9.0067700e-02,  5.1970325e-02,  2.0268182e-01,
        4.6885550e-02, -5.2929554e-02,  6.6083498e-02, -5.8406308e-02,
       -1.1952946e-01,  5.5076398e-02,  1.2351151e-04, -3.8982730e-02,
       -1.3962780e-01,  1.2789361e-01, -1.5078008e-01, -1.4386822e-01,
       -1.3026667e-01, -1.1459819e-01, -7.1221814e-02,  1.1928054e-01],
      dtype=float32)
'''

print(model.wv.most_similar("대한민국"))
'''
[('대한', 0.9968054294586182), ('민국', 0.9958725571632385), ('터닝포인트', 0.9953158497810364), ('근', 0.9948737621307373), ('터닝', 0.994050920009613), ('마감', 0.993889570236206), ('국내증시', 0.9935024976730347), ('정치인', 0.992567777633667), ('글로벌', 0.9920015335083008), ('외국인', 0.9918369650840759)]
'''

# analogy a:b = c:?  (here 서울:한국 = 미국:?, computed as 한국 - 서울 + 미국)
model.wv.most_similar(positive=['한국', '미국'], negative=['서울'])
'''
[('핵', 0.6568202376365662),
 ('미', 0.6307210922241211),
 ('북핵', 0.6297447681427002),
 ('북', 0.6209843754768372),
 ('북ㆍ미', 0.6095261573791504),
 ('ㆍ', 0.6072773337364197),
 ('성명', 0.601407527923584),
 ('정상회담', 0.6000897884368896),
 ('변', 0.5984941720962524),
 ('월말', 0.5965142250061035)]
'''

print(model.wv.similarity("한국","미국"))
print(model.wv.similarity("한국","일본"))
print(model.wv.similarity("미국","일본"))
'''
0.19900209
0.45370853
0.7131777
'''



week 14

AutoEncoder

# MLP autoencoder
from tensorflow import keras
from tensorflow.keras import layers

input_img = keras.Input(shape=(784,))
encoded = layers.Dense(128, activation='relu')(input_img)
encoded = layers.Dense(64, activation='relu')(encoded)
encoded = layers.Dense(32, activation='relu')(encoded)

decoded = layers.Dense(64, activation='relu')(encoded)
decoded = layers.Dense(128, activation='relu')(decoded)
decoded = layers.Dense(784, activation='sigmoid')(decoded)

encoder = keras.Model(input_img, encoded)
autoencoder = keras.Model(input_img, decoded)

autoencoder.compile(optimizer='adam', loss='mse')
# input = output = X
autoencoder.fit(x_train, x_train, epochs=10, batch_size=128, shuffle=True,
                validation_data=(x_test, x_test))
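Once trained, the two models can be used separately (a minimal sketch; x_test as above):

codes = encoder.predict(x_test)               # 32-dimensional latent codes
reconstructed = autoencoder.predict(x_test)   # 784-dimensional reconstructions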

# CNN
# Encoder
input_img = keras.Input(shape=(28, 28, 1))

x = layers.Conv2D(16, (3, 3), padding='same', activation='relu')(input_img)
x = layers.MaxPooling2D(2, 2)(x)
x = layers.Conv2D(32, (3, 3), padding='same', activation='relu')(x)
x = layers.MaxPooling2D(2, 2)(x)
x = layers.Conv2D(64, (3, 3), activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D(2, 2)(x)
encoder = keras.Model(input_img, encoded)
# Decoder
# at this point the representation of 'encoded' is (3, 3, 64) 
# Deconvolution (transposed convolution)
x = layers.Conv2DTranspose(32, kernel_size=3, strides=2, activation='relu', 
                           padding='valid')(encoded)
x = layers.Conv2DTranspose(16, kernel_size=3, strides=2, activation='relu', 
                           padding='same')(x)
x = layers.Conv2DTranspose(1, kernel_size=3, strides=2, activation='sigmoid', 
                           padding='same')(x)
decoded = layers.Reshape([28,28])(x)

autoencoder = keras.Model(input_img, decoded)
autoencoder.summary()

'''
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 28, 28, 1)]       0         
_________________________________________________________________
conv2d (Conv2D)              (None, 28, 28, 16)        160       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 16)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 14, 14, 32)        4640      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 7, 7, 32)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 7, 7, 64)          18496     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 3, 3, 64)          0         
_________________________________________________________________
conv2d_transpose (Conv2DTran (None, 7, 7, 32)          18464     
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 14, 14, 16)        4624      
_________________________________________________________________
conv2d_transpose_2 (Conv2DTr (None, 28, 28, 1)         145       
_________________________________________________________________
reshape (Reshape)            (None, 28, 28)            0         
=================================================================
Total params: 46,529
Trainable params: 46,529
Non-trainable params: 0
_________________________________________________________________
'''


VAE

Encoder

# original_dim = 28 * 28
# intermediate_dim = 64
# latent_dim = 2

inputs = keras.Input(shape=(28*28,))
h = layers.Dense(64, activation='relu')(inputs)
z_mean = layers.Dense(2)(h)
z_log_sigma = layers.Dense(2)(h)

z_mean.shape, z_log_sigma.shape
# (TensorShape([None, 2]), TensorShape([None, 2]))

from tensorflow.keras import backend as K

def sampling(args):
    z_mean, z_log_sigma = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], 2),
                              mean=0., stddev=0.1)
    return z_mean + K.exp(z_log_sigma) * epsilon # reparameterization trick: sample z from the latent distribution

z = layers.Lambda(sampling)([z_mean, z_log_sigma]) # make sampling layer (Lambda)

# Create encoder
encoder = keras.Model(inputs, [z_mean, z_log_sigma, z], name='encoder')
encoder.summary()
'''
Model: "encoder"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_5 (InputLayer)            [(None, 784)]        0                                            
__________________________________________________________________________________________________
dense_20 (Dense)                (None, 64)           50240       input_5[0][0]                    
__________________________________________________________________________________________________
dense_21 (Dense)                (None, 2)            130         dense_20[0][0]                   
__________________________________________________________________________________________________
dense_22 (Dense)                (None, 2)            130         dense_20[0][0]                   
__________________________________________________________________________________________________
lambda_3 (Lambda)               (None, 2)            0           dense_21[0][0]                   
                                                                 dense_22[0][0]                   
==================================================================================================
Total params: 50,500
Trainable params: 50,500
Non-trainable params: 0
__________________________________________________________________________________________________
'''


Decoder

# Create decoder
latent_inputs = keras.Input(shape=(2,), name='z_sampling')
x = layers.Dense(64, activation='relu')(latent_inputs)
outputs = layers.Dense(28*28, activation='sigmoid')(x)

decoder = keras.Model(latent_inputs, outputs, name='decoder')

decoder.summary()
keras.utils.plot_model(decoder, "decoder_info.png", show_shapes=True)

'''
Model: "decoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
z_sampling (InputLayer)      [(None, 2)]               0         
_________________________________________________________________
dense_23 (Dense)             (None, 64)                192       
_________________________________________________________________
dense_24 (Dense)             (None, 784)               50960     
=================================================================
Total params: 51,152
Trainable params: 51,152
Non-trainable params: 0
_________________________________________________________________
'''


VAE (Encoder+Decoder)

outputs = decoder(encoder(inputs)[2])    # take only z (index 2: the sampled latent vector)
vae = keras.Model(inputs, outputs, name='vae_mlp')
vae.summary()
keras.utils.plot_model(vae, "vae_info.png", show_shapes=True)

'''
Model: "vae_mlp"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_5 (InputLayer)         [(None, 784)]             0         
_________________________________________________________________
encoder (Functional)         [(None, 2), (None, 2), (N 50500     
_________________________________________________________________
decoder (Functional)         (None, 784)               51152     
=================================================================
Total params: 101,652
Trainable params: 101,652
Non-trainable params: 0
_________________________________________________________________
'''

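The notes stop at the model definition. A minimal sketch of the usual VAE training setup (reconstruction loss plus KL divergence), treating z_log_sigma as the log-variance as in the standard Keras VAE example:

# KL divergence between the approximate posterior and a unit Gaussian
kl_loss = -0.5 * K.sum(1 + z_log_sigma - K.square(z_mean) - K.exp(z_log_sigma), axis=-1)
vae.add_loss(K.mean(kl_loss))

# reconstruction term via the compiled loss; the inputs double as targets
vae.compile(optimizer='adam', loss='binary_crossentropy')
vae.fit(x_train, x_train, epochs=10, batch_size=128, validation_data=(x_test, x_test))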



Lab3

Caption testing on a pre-trained model

  1. Download the model
  2. Import the libraries
  3. Load the trained model, the image, and the tokenizer

Training using a pre-trained model and a smaller dataset

  1. Extract features from the photo images.
    • load a model to extract features from our images
  2. Prepare the text data (photograph - description pairs)
    • load the file containing all of the descriptions
    • map each description to its corresponding photo
    • clean the description text
    • summarize the size of the vocabulary (a smaller vocabulary trains faster)
    • save the dictionary of image identifiers - descriptions

Retrain model using the prepared data

  1. Load the training data
  2. Load the prepared descriptions
  3. Load the prepared photos
  4. Load the tokenizer
  5. Transform data to input-output pairs for training the model.
  6. Define the model
  7. Fit the model

Evaluate the trained model on the test dataset

  1. Load the dataset, its features and descriptions, and the tokenizer
  2. Define a function that can generate a description for a photo using the trained model
  3. Evaluate a trained model against a given test dataset (BLEU score)
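For the BLEU scoring in step 3, a minimal sketch with NLTK (actual is a list of tokenized reference-caption lists, predicted a list of tokenized generated captions; both names are illustrative):

from nltk.translate.bleu_score import corpus_bleu

print('BLEU-1: %f' % corpus_bleu(actual, predicted, weights=(1.0, 0, 0, 0)))
print('BLEU-2: %f' % corpus_bleu(actual, predicted, weights=(0.5, 0.5, 0, 0)))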
