[Deep Learning] Final Summary
week 9
Clustering
Scaling required!!
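Distance-based clustering is sensitive to feature scale, which is presumably why the note insists on scaling. Below is a minimal sketch of z-score standardization in plain Python. The sample values and the helper name `standardize` are made up for illustration; a library scaler would normally do this job.

```python
import math

def standardize(column):
    """Z-score a list of numbers: subtract the mean, divide by the population std."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column]

# Hypothetical features that live on very different scales.
heights_cm = [150.0, 160.0, 170.0, 180.0]
incomes = [2_000_000.0, 3_000_000.0, 5_000_000.0, 9_000_000.0]

scaled_heights = standardize(heights_cm)
scaled_incomes = standardize(incomes)
```

After scaling, both columns have mean 0 and unit variance, so neither dominates a Euclidean distance.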
Agglomerative Clustering
linkage
: dataframe, metric, method
dendrogram
: link_dist, labels
KMeans
KMeans
: n_clusters
- labels_, cluster_centers_
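To make `n_clusters`, `labels_` and `cluster_centers_` concrete, here is a rough sketch of the iteration behind k-means (Lloyd's algorithm) in plain Python. The points and the fixed starting centers are assumptions chosen for the example; library implementations pick initial centers differently.

```python
import math

def kmeans(points, centers, n_iter=10):
    """Lloyd's algorithm: alternate between assigning points and moving centers."""
    labels = []
    for _ in range(n_iter):
        # Assignment step: index of the nearest center for every point.
        labels = [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
                  for p in points]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[k])  # leave an empty cluster's center unchanged
        centers = new_centers
    return labels, centers

points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, cluster_centers = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(labels)   # [0, 0, 0, 1, 1, 1]
```

Here the number of starting centers plays the role of `n_clusters`, and the two return values correspond to `labels_` and `cluster_centers_`.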
DBSCAN
- core point, border point, noise point
DBSCAN
: eps, min_samples, metric
- 0~n: class samples
- -1: noise points
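The three point types can be sketched directly from their definitions. This is an illustrative plain-Python version, not the library's implementation. The helper name `classify_points` and the sample points are assumptions. The point itself is counted toward `min_samples`.

```python
import math

def classify_points(points, eps, min_samples):
    """Label each point 'core', 'border' or 'noise' following the DBSCAN definitions."""
    # Neighborhood of a point: every point within eps (the point itself included).
    neighbors = [[j for j, q in enumerate(points) if math.dist(p, q) <= eps]
                 for p in points]
    is_core = [len(nb) >= min_samples for nb in neighbors]
    kinds = []
    for i in range(len(points)):
        if is_core[i]:
            kinds.append("core")
        elif any(is_core[j] for j in neighbors[i]):
            kinds.append("border")   # not dense itself, but within eps of a core point
        else:
            kinds.append("noise")    # these are the points reported with label -1
    return kinds

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0), (10.0, 10.0)]
kinds = classify_points(points, eps=1.0, min_samples=4)
print(kinds)   # ['noise', 'border', 'border', 'core', 'border', 'noise']
```

Only (1, 1) has four points within `eps`, so it is the single core point. Its direct neighbors become border points, and everything else is noise.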
Dimension Reduction
SelectPercentile
: score_func, percentile
PCA
: n_components
TSNE
: n_components, perplexity (related to the number of nearest neighbors used in other manifold learning algorithms)
week 10
Tensorflow 1.0
# version 1
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
# placeholder()
X = tf.placeholder("float") # placeholder that will hold X
Y = tf.placeholder("float") # placeholder that will hold Y
W = tf.Variable(np.random.randn(), name = "W")
b = tf.Variable(np.random.randn(), name = "b")
# session
x1 = tf.constant([1,2,3,4])
x2 = tf.constant([5,6,7,8])
result = tf.multiply(x1, x2)
with tf.Session() as sess:
    output = sess.run(result)
    print(output)
# tf.global_variables_initializer
sess = tf.Session()
sess.run(tf.global_variables_initializer())
lossHistory = []
# feed_dict (W_update, b_update, cost, x and y are not defined in this excerpt)
for i in range(300):
    sess.run([W_update, b_update], feed_dict={X: x, Y: y})
    cost_val, W_val, b_val = sess.run([cost, W, b], feed_dict={X: x, Y: y})
    lossHistory.append(cost_val)
sess.close()
Tensorflow 2.0
# tf.GradientTape() (tape.gradient())
for i in range(300):
    with tf.GradientTape() as tape:
        y_pred = W * x + b
        cost = tf.reduce_mean(tf.square(y_pred - y))
    W_grad, b_grad = tape.gradient(cost, [W,b]) # dCost/dW, dCost/db
    W.assign_sub(learning_rate * W_grad)
    b.assign_sub(learning_rate * b_grad)
    # optimizer = tf.optimizers.Adam( learning_rate )
    # optimizer.apply_gradients(zip(grads, [W,b]))
    lossHistory.append(cost.numpy())
    if i % 10 == 0:
        print("{:5}|{:10.4f}|{:10.4f}|{:10.6f}".format(i, W.numpy(), b.numpy(), cost.numpy()))
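For intuition about what `tape.gradient` returns here, the same loop can be written with the gradients of the mean squared error derived by hand. This is a sketch in plain Python under assumed toy data (points on y = 2x + 1) and an assumed learning rate. It does not use TensorFlow.

```python
# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

W, b = 0.0, 0.0
learning_rate = 0.05
loss_history = []

for step in range(2000):
    n = len(xs)
    errors = [W * x + b - y for x, y in zip(xs, ys)]
    cost = sum(e * e for e in errors) / n                        # mean squared error
    W_grad = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n    # dCost/dW
    b_grad = 2.0 * sum(errors) / n                               # dCost/db
    W -= learning_rate * W_grad                                  # same role as W.assign_sub(...)
    b -= learning_rate * b_grad
    loss_history.append(cost)

print(round(W, 4), round(b, 4))   # 2.0 1.0
```

The tape computes the same two derivatives automatically, which is why the TensorFlow version needs no explicit formulas.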
Keras
model = Sequential()
model.add(Flatten(input_shape=(1,)))
model.add(Dense(2, activation='sigmoid'))
# or
# model.add(Dense(2, activation='sigmoid', input_shape=(1,))) # more common
model.summary()
'''
Model: "sequential_4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten (Flatten) (None, 1) 0
_________________________________________________________________
dense_4 (Dense) (None, 2) 4
=================================================================
Total params: 4
Trainable params: 4
Non-trainable params: 0
_________________________________________________________________
'''
# regression
model.compile(optimizer=SGD(learning_rate=0.1),
              loss='mse',
              metrics=['mae']) # accuracy is not a meaningful metric for regression
# classification
model.compile(optimizer=Adam(learning_rate=1e-3),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10,
          batch_size=100,
          verbose=0,
          validation_split=0.2)
model.evaluate(x_test, y_test, batch_size=10) # evaluate() has no epochs argument
model.predict(x_input_data, batch_size=100)
model.save("model_name.h5")
# and later
model = load_model("model_name.h5")
week 11
Keras Introduction
# Functional API
inputs = Input(shape=(784,))
x = Dense(64, activation="relu")(inputs)
x = Dense(64, activation="relu")(x)
outputs = Dense(10)(x)
# create a Model by specifying its inputs and outputs in the graph of layers
model = Model(inputs=inputs, outputs=outputs, name="mnist_model")
model.summary()
'''
Model: "mnist_model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_3 (InputLayer) [(None, 784)] 0
_________________________________________________________________
dense_48 (Dense) (None, 64) 50240
_________________________________________________________________
dense_49 (Dense) (None, 64) 4160
_________________________________________________________________
dense_50 (Dense) (None, 10) 650
=================================================================
Total params: 55,050
Trainable params: 55,050
Non-trainable params: 0
_________________________________________________________________
- 784 * 64 + 64 = 50240
- 64 * 64 + 64 = 4160
- 64 * 10 + 10 = 650
'''
# Sequential API
model = Sequential()
model.add(Dense(64, input_shape=(784,), activation='relu')) # specify input_shape in the first layer
model.add(Dense(64, activation='relu'))
model.add(Dense(10))
model.summary()
'''
Model: "sequential_17"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_54 (Dense) (None, 64) 50240
_________________________________________________________________
dense_55 (Dense) (None, 64) 4160
_________________________________________________________________
dense_56 (Dense) (None, 10) 650
=================================================================
Total params: 55,050
Trainable params: 55,050
Non-trainable params: 0
_________________________________________________________________
'''
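The parameter counts in both summaries follow from weights plus biases per Dense layer. A small sketch that reproduces them (the helper name `dense_params` is made up for this example):

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

layer_sizes = [784, 64, 64, 10]   # input width followed by the units of each Dense layer
per_layer = [dense_params(i, o) for i, o in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)   # [50240, 4160, 650] 55050
```

The Functional and Sequential versions build the same layers, so they end up with the same 55,050 parameters.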
Optimizers
Gradient Descent
Stochastic gradient descent
Mini-batch gradient descent
Momentum
NAG(Nesterov accelerated gradient)
Adagrad
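- A different update for each weight
- Applies a larger learning rate to parameters that have changed little so far
RMSProp
- With Adagrad the learning rate can decay too quickly, so training may fail to reach the global optimum
- Instead of accumulating every (squared) gradient into G_t from the start, it uses an exponential moving average so that recent gradients dominate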
Adam
- RMSProp + Momentum
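The differences between Adagrad, RMSProp and Adam are easiest to see as update rules for a single parameter. The sketch below uses the common textbook forms with assumed hyperparameters (`lr`, `rho`, `beta1`, `beta2`, `eps`) and a toy objective f(w) = w². It is meant as an illustration, not as any library's exact implementation.

```python
import math

def adagrad_step(w, g, state, lr=0.1, eps=1e-8):
    state["G"] = state.get("G", 0.0) + g * g                        # sum of all squared gradients so far
    return w - lr * g / (math.sqrt(state["G"]) + eps)

def rmsprop_step(w, g, state, lr=0.1, rho=0.9, eps=1e-8):
    state["G"] = rho * state.get("G", 0.0) + (1 - rho) * g * g      # exponential moving average instead of a sum
    return w - lr * g / (math.sqrt(state["G"]) + eps)

def adam_step(w, g, state, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    t = state["t"] = state.get("t", 0) + 1
    state["m"] = beta1 * state.get("m", 0.0) + (1 - beta1) * g      # momentum-style average of gradients
    state["v"] = beta2 * state.get("v", 0.0) + (1 - beta2) * g * g  # RMSProp-style average of squared gradients
    m_hat = state["m"] / (1 - beta1 ** t)                           # bias correction for the zero start
    v_hat = state["v"] / (1 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)

def minimize(step_fn, w=5.0, n_steps=500):
    """Minimize f(w) = w**2 (gradient 2w) with the given update rule."""
    state = {}
    for _ in range(n_steps):
        w = step_fn(w, 2.0 * w, state)
    return w

results = {fn.__name__: minimize(fn) for fn in (adagrad_step, rmsprop_step, adam_step)}
print(results)
```

In Adagrad the accumulator `G` only grows, so the effective step keeps shrinking. The moving average in RMSProp and Adam lets it recover.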
Summary
week 12
MLP
parameters = (input_shape) * (# of neurons) + (# of neurons)
# Transfer learning
conv_base = VGG16(weights = 'imagenet', # weights to load
                  include_top=False,
                  input_shape=(150, 150, 3))
conv_base.trainable = False
model = models.Sequential()
model.add(conv_base) # feature extractor
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.summary()
'''
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
vgg16 (Functional) (None, 4, 4, 512) 14714688
_________________________________________________________________
flatten (Flatten) (None, 8192) 0
_________________________________________________________________
dense_2 (Dense) (None, 256) 2097408
_________________________________________________________________
dense_3 (Dense) (None, 1) 257
=================================================================
Total params: 16,812,353
Trainable params: 16,812,353
Non-trainable params: 0
_________________________________________________________________
'''
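The summary above lists every parameter as trainable. If the `conv_base.trainable = False` setting is in effect when the model is built, one would instead expect only the new classifier head to count as trainable. Here is a sketch of that arithmetic, using only numbers taken from the summary:

```python
vgg16_base = 14_714_688            # from the summary: VGG16 without its top layers
dense_256 = 8192 * 256 + 256       # Flatten gives 4 * 4 * 512 = 8192 features
dense_1 = 256 * 1 + 1

total = vgg16_base + dense_256 + dense_1
trainable_if_frozen = dense_256 + dense_1   # only the new Dense layers would be updated
print(total, trainable_if_frozen)           # 16812353 2097665
```

Under that assumption the VGG16 block's 14,714,688 parameters would be reported as non-trainable.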
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=1e-5), # lr is the deprecated argument name
              metrics=['acc'])
# fit_generator() is deprecated in TF 2.x; fit() accepts generators directly
history = model.fit(
    train_generator,
    steps_per_epoch=100,
    epochs=30,
    validation_data=validation_generator,
    validation_steps=50)
CNN
parameters = (kernel_height * kernel_width) * (input_depth) * (# of filters) + (# of filters)
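As a worked example of this formula, the sketch below applies it to a stack of four 3x3 convolutions with 32, 64, 128 and 128 filters. Each convolution is followed by 2x2 max pooling. The 150 x 150 RGB input, the 'valid' padding and the helper names are assumptions made for the illustration.

```python
def conv2d_params(kernel_h, kernel_w, in_depth, n_filters):
    """Each filter has kernel_h * kernel_w * in_depth weights plus one bias."""
    return kernel_h * kernel_w * in_depth * n_filters + n_filters

def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after max pooling (floor division drops an odd border row/column)."""
    return size // pool

size, depth = 150, 3          # assumed input: 150 x 150 RGB images
param_counts = []
for n_filters in (32, 64, 128, 128):
    param_counts.append(conv2d_params(3, 3, depth, n_filters))
    size = pool_out(conv_out(size))
    depth = n_filters

flat = size * size * depth    # number of features after Flatten
print(param_counts, flat)     # [896, 18496, 73856, 147584] 6272
```

Note that the parameter count depends on the input depth and the number of filters, not on the image width or height.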
model = models.Sequential()
model.add(layers.Conv2D(32,(3,3), activation = 'relu',
                        input_shape=(img_width, img_height, 3)))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(64,(3,3), activation = 'relu'))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(128,(3,3), activation = 'relu'))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(128,(3,3), activation = 'relu'))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.summary()
'''
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 148, 148, 32) 896
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 74, 74, 32) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 72, 72, 64) 18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 36, 36, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 34, 34, 128) 73856
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 17, 17, 128) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 15, 15, 128) 147584
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 7, 7, 128) 0
_________________________________________________________________
flatten (Flatten) (None, 6272) 0
_________________________________________________________________
dense (Dense) (None, 512) 3211776
_________________________________________________________________
dense_1 (Dense) (None, 1) 513
=================================================================
Total params: 3,453,121
Trainable params: 3,453,121
Non-trainable params: 0
_________________________________________________________________
'''
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=1e-4), # lr is the deprecated argument name
              metrics=['acc'])
# Use ImageDataGenerator
datagen = ImageDataGenerator(rescale = 1./255)
train_generator = datagen.flow_from_directory(directory=train_dir,
                                              target_size=(img_width,img_height),
                                              classes=['dogs','cats'],
                                              class_mode='binary',
                                              batch_size=20)
validation_generator = datagen.flow_from_directory(directory=validation_dir,
                                                   target_size=(img_width,img_height),
                                                   classes=['dogs','cats'],
                                                   class_mode='binary',
                                                   batch_size=20)
# fit_generator() is deprecated in TF 2.x; fit() accepts generators directly
history = model.fit(
    train_generator,
    steps_per_epoch=100,
    epochs=10,
    validation_data=validation_generator,
    validation_steps=50)
week 13
Embedding/RNN
- Embedding
parameters = (input_dim (vocabulary size)) * (output_dim (word vector dimension))
output = (None, None, output_dim)
- when input_length is specified
output = (None, input_length, output_dim)
- SimpleRNN:
parameters = (input_features + # of neurons) * (# of neurons) + (# of neurons), where input_features is the Embedding output_dim
output = (None, # of neurons)
- when return_sequences = True is specified
output = (None, input_length, # of neurons)
- One difference from the NumPy example is that SimpleRNN, like other Keras layers, processes batches of sequences rather than a single sequence. That is, it takes input of shape (batch_size, timesteps, input_features) rather than (timesteps, input_features).
- Like all recurrent layers in Keras, SimpleRNN can run in two modes. It can return the full sequence of outputs for every timestep (a 3D tensor of shape (batch_size, timesteps, output_features)), or only the last output for each input sequence (a 2D tensor of shape (batch_size, output_features)). The mode is selected with the return_sequences argument when the layer is created.
model = Sequential()
model.add(Embedding(10000, 32)) # vocabulary size 10000, word vector dimension 32
model.add(SimpleRNN(32))
model.summary()
model.input_shape, model.output_shape
'''
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, None, 32) 320000
_________________________________________________________________
simple_rnn (SimpleRNN) (None, 32) 2080
=================================================================
Total params: 322,080
Trainable params: 322,080
Non-trainable params: 0
_________________________________________________________________
((None, None), (None, 32))
- embedding(in_dim, out_dim): 10000 * 32 = 320000
- simpleRNN: (32 + 32)*32 + 32 = 2080
'''
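The two counts in this summary can be reproduced with a few lines. This is a sketch with made-up helper names. The last line uses a hypothetical 64-unit layer to show that the input width and the number of units enter separately.

```python
def embedding_params(vocab_size, embed_dim):
    """One embed_dim-sized vector per vocabulary entry."""
    return vocab_size * embed_dim

def simple_rnn_params(input_features, units):
    """Input kernel + recurrent kernel + bias."""
    return input_features * units + units * units + units

emb = embedding_params(10000, 32)
rnn = simple_rnn_params(32, 32)
rnn_wide = simple_rnn_params(32, 64)     # hypothetical: 64 units on the same 32-dim embedding
print(emb, rnn, emb + rnn, rnn_wide)     # 320000 2080 322080 6208
```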
# The sequence length must be fixed in order to connect a Flatten or Dense layer afterwards
model = Sequential()
model.add(Embedding(10000, 32, input_length=20)) # input sequences of length 20
model.add(SimpleRNN(32))
model.summary()
model.input_shape, model.output_shape
'''
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_1 (Embedding) (None, 20, 32) 320000
_________________________________________________________________
simple_rnn_1 (SimpleRNN) (None, 32) 2080
=================================================================
Total params: 322,080
Trainable params: 322,080
Non-trainable params: 0
_________________________________________________________________
((None, 20), (None, 32))
'''
# Keep the intermediate outputs of the RNN layer
model = Sequential()
model.add(Embedding(10000, 32, input_length=20))
model.add(SimpleRNN(32, return_sequences=True)) # return the output at every timestep
model.summary()
'''
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_4 (Embedding) (None, 20, 32) 320000
_________________________________________________________________
simple_rnn_4 (SimpleRNN) (None, 20, 32) 2080
=================================================================
Total params: 322,080
Trainable params: 322,080
Non-trainable params: 0
_________________________________________________________________
'''
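What `return_sequences` changes can be sketched with a plain-Python forward pass for a single sequence. The batch dimension is left out for brevity. The random weights, the seed and the helper names are assumptions. The recurrence is the usual tanh form of a simple RNN.

```python
import math
import random

def simple_rnn(sequence, W_x, W_h, bias, return_sequences=False):
    """Run h_t = tanh(x_t @ W_x + h_{t-1} @ W_h + bias) over one sequence."""
    units = len(bias)
    h = [0.0] * units                      # initial state
    outputs = []
    for x_t in sequence:
        h = [math.tanh(sum(x_t[i] * W_x[i][j] for i in range(len(x_t)))
                       + sum(h[k] * W_h[k][j] for k in range(units))
                       + bias[j])
             for j in range(units)]
        outputs.append(h)
    # Either every timestep's state, or only the final one.
    return outputs if return_sequences else outputs[-1]

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
timesteps, features, units = 20, 32, 32
W_x, W_h, bias = rand_matrix(features, units), rand_matrix(units, units), [0.0] * units
sequence = rand_matrix(timesteps, features)   # stands in for one embedded sentence

last = simple_rnn(sequence, W_x, W_h, bias)
full = simple_rnn(sequence, W_x, W_h, bias, return_sequences=True)
print(len(last), (len(full), len(full[0])))   # 32 (20, 32)
```

The weights are identical in both modes, which is why the parameter count stays at 2,080 and only the output shape changes.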
- To increase the representational power of a network, it is sometimes useful to stack several recurrent layers one after another. In that setup, the intermediate layers must be configured to return their full output sequence:
model = Sequential()
model.add(Embedding(10000, 32, input_length=20))
model.add(SimpleRNN(32, return_sequences=True))
model.add(SimpleRNN(32, return_sequences=True))
model.add(SimpleRNN(32, return_sequences=True))
model.add(SimpleRNN(32)) # only the top layer returns just the last output
model.summary()
'''
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_12 (Embedding) (None, 20, 32) 320000
_________________________________________________________________
simple_rnn_12 (SimpleRNN) (None, 20, 32) 2080
_________________________________________________________________
simple_rnn_13 (SimpleRNN) (None, 20, 32) 2080
_________________________________________________________________
simple_rnn_14 (SimpleRNN) (None, 20, 32) 2080
_________________________________________________________________
simple_rnn_15 (SimpleRNN) (None, 32) 2080
=================================================================
Total params: 328,320
Trainable params: 328,320
Non-trainable params: 0
_________________________________________________________________
'''
Vectorizing
CountVectorizer
TfidfVectorizer
: tokenizer, max_features
![Class similarity analysis with TF-IDF and building a recommendation server, part 1 | Class101 tech blog](https://class101.dev/images/thumbnails/tf-idf.png)
cv = TfidfVectorizer(tokenizer=twitter_tokenizer, max_features=3000)
X_train = cv.fit_transform(text_train)
X_test = cv.transform(text_test) # transform only; do not call cv.fit_transform(text_test) on the test set
X_train.shape, y_train.shape, X_test.shape, y_test.shape
# ((2000, 3000), (2000,), (1000, 3000), (1000,))
# CountVectorizer
from sklearn.feature_extraction.text import CountVectorizer
corpus = [
    'This is the first document.',
    'This document is the second document.',
    'And this is the third one.',
    'Is this the first document?',
]
vectorizer1 = CountVectorizer()
X = vectorizer1.fit_transform(corpus)
print(vectorizer1.get_feature_names_out().tolist()) # get_feature_names() was removed in scikit-learn 1.2
print(X.toarray())
'''
['and', 'document', 'first', 'is', 'one', 'second', 'the', 'third', 'this']
[[0 1 1 1 0 0 1 0 1]
[0 2 0 1 0 1 1 0 1]
[1 0 0 1 1 0 1 1 1]
[0 1 1 1 0 0 1 0 1]]
'''
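# TfidfVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer
vectorizer2 = TfidfVectorizer()
X = vectorizer2.fit_transform(corpus)
print(vectorizer2.get_feature_names_out().tolist()) # get_feature_names() was removed in scikit-learn 1.2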
print(X.toarray().round(2))
'''
['and', 'document', 'first', 'is', 'one', 'second', 'the', 'third', 'this']
[[0. 0.47 0.58 0.38 0. 0. 0.38 0. 0.38]
[0. 0.69 0. 0.28 0. 0.54 0.28 0. 0.28]
[0.51 0. 0. 0.27 0.51 0. 0.27 0.51 0.27]
[0. 0.47 0.58 0.38 0. 0. 0.38 0. 0.38]]
'''
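The numbers in this matrix can be reproduced by hand. The sketch below assumes scikit-learn's default settings. Those are smoothed idf, ln((1 + n) / (1 + df)) + 1, raw term counts, and L2 normalization of each row. It recomputes the first row in plain Python.

```python
import math
import re

corpus = [
    'This is the first document.',
    'This document is the second document.',
    'And this is the third one.',
    'Is this the first document?',
]

# Default-style tokenization: lowercase, tokens of two or more word characters.
docs = [re.findall(r"\b\w\w+\b", text.lower()) for text in corpus]
vocab = sorted({tok for doc in docs for tok in doc})
n_docs = len(docs)

def idf(term):
    df = sum(term in doc for doc in docs)              # number of documents containing the term
    return math.log((1 + n_docs) / (1 + df)) + 1       # smoothed inverse document frequency

def tfidf_row(doc):
    raw = [doc.count(term) * idf(term) for term in vocab]
    norm = math.sqrt(sum(v * v for v in raw))          # L2 norm of the row
    return [v / norm for v in raw]

first_row = [round(v, 2) for v in tfidf_row(docs[0])]
second_row = [round(v, 2) for v in tfidf_row(docs[1])]
print(vocab)
print(first_row)   # [0.0, 0.47, 0.58, 0.38, 0.0, 0.0, 0.38, 0.0, 0.38]
```

Words that occur in every document ('is', 'the', 'this') get the lowest idf, so 'first' ends up with the largest weight in the first row.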
Word2Vec
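: sentence_list, sg(skip-gram), vector_size (size before gensim 4.0), window, min_count
model = Word2Vec(data, # data as a list (of tokenized sentences)
                 sg=1, # 0: CBOW, 1: Skip-gram
                 vector_size=100, # vector dimension (this argument was named size before gensim 4.0)
                 window=3, # context width (3 words before and after)
                 min_count=3, # minimum word frequency (words appearing fewer than 3 times are ignored)
                 workers=4) # number of parallel workers (set close to the number of cores)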
model.wv['대한민국']
'''
array([-2.4107585e-02, -7.4946046e-02, 1.5689157e-03, 1.7300507e-02,
7.7659652e-02, -4.3071166e-02, 8.3631985e-02, 1.6745523e-01,
-8.2903586e-02, -1.7553378e-02, 3.9016213e-02, -1.0054115e-01,
4.1688729e-02, 1.7242630e-01, -1.8903978e-02, 1.2952442e-01,
4.8356697e-02, 4.0910381e-01, -7.0913650e-02, -5.0823655e-02,
1.4685905e-01, -1.2997684e-01, 2.2543812e-02, -3.7712879e-02,
9.6920088e-02, 1.3099691e-01, -1.3746825e-01, -1.0660959e-01,
1.1127534e-01, 1.2975276e-01, -2.8525587e-02, -1.2853998e-01,
-8.3741836e-02, -9.9310517e-02, -2.4495709e-01, -4.1113162e-01,
1.0418992e-02, 7.9034410e-02, 1.3711397e-01, -5.1028132e-02,
-1.4102933e-01, -4.6473064e-02, -7.5484976e-02, -6.2391542e-02,
-4.0519308e-02, -1.5226401e-01, -1.3334070e-01, -1.7248647e-01,
-9.5049895e-02, 9.9440172e-02, -2.9708706e-02, 8.7483376e-02,
8.1404611e-02, 1.3708833e-01, -1.1457676e-01, -9.5910830e-03,
-6.4596653e-02, -2.4731688e-01, 3.0563422e-02, 1.2345860e-01,
-3.4807574e-02, 1.6530770e-01, 1.2371200e-01, -1.2324062e-02,
1.4210464e-01, -1.4213949e-01, 1.7249145e-01, -7.8410409e-02,
-6.2629886e-02, -9.0875283e-02, 2.9489502e-02, 2.1956262e-01,
3.4037119e-01, 1.0848373e-01, 3.6547065e-02, -1.5146755e-01,
5.6681294e-02, 6.6085658e-03, 1.9274153e-02, 1.9991216e-01,
-1.5090431e-01, 9.0067700e-02, 5.1970325e-02, 2.0268182e-01,
4.6885550e-02, -5.2929554e-02, 6.6083498e-02, -5.8406308e-02,
-1.1952946e-01, 5.5076398e-02, 1.2351151e-04, -3.8982730e-02,
-1.3962780e-01, 1.2789361e-01, -1.5078008e-01, -1.4386822e-01,
-1.3026667e-01, -1.1459819e-01, -7.1221814e-02, 1.1928054e-01],
dtype=float32)
'''
print(model.wv.most_similar("대한민국"))
'''
[('대한', 0.9968054294586182), ('민국', 0.9958725571632385), ('터닝포인트', 0.9953158497810364), ('근', 0.9948737621307373), ('터닝', 0.994050920009613), ('마감', 0.993889570236206), ('국내증시', 0.9935024976730347), ('정치인', 0.992567777633667), ('글로벌', 0.9920015335083008), ('외국인', 0.9918369650840759)]
'''
# a:b = c: ?
model.wv.most_similar(positive=['한국', '미국'], negative=['서울'])
'''
[('핵', 0.6568202376365662),
('미', 0.6307210922241211),
('북핵', 0.6297447681427002),
('북', 0.6209843754768372),
('북ㆍ미', 0.6095261573791504),
('ㆍ', 0.6072773337364197),
('성명', 0.601407527923584),
('정상회담', 0.6000897884368896),
('변', 0.5984941720962524),
('월말', 0.5965142250061035)]
'''
print(model.wv.similarity("한국","미국"))
print(model.wv.similarity("한국","일본"))
print(model.wv.similarity("미국","일본"))
'''
0.19900209
0.45370853
0.7131777
'''
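The similarity scores above are cosine similarities between word vectors. A minimal sketch of that computation follows. The 3-dimensional vectors are invented for illustration, whereas the model above uses 100 dimensions.

```python
import math

def cosine_similarity(u, v):
    """Dot product divided by the product of the vector lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Hypothetical toy "word vectors".
vec_a = [1.0, 2.0, 0.0]
vec_b = [2.0, 4.0, 0.0]    # same direction as vec_a, twice the length
vec_c = [0.0, 0.0, 3.0]    # orthogonal to vec_a

sim_ab = cosine_similarity(vec_a, vec_b)   # close to 1: same direction
sim_ac = cosine_similarity(vec_a, vec_c)
print(sim_ac)   # 0.0
```

Only the direction of the vectors matters, not their length, so frequent and rare words can still be compared.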
week 14
AutoEncoder
# MLP
input_img = keras.Input(shape=(784,))
encoded = layers.Dense(128, activation='relu')(input_img)
encoded = layers.Dense(64, activation='relu')(encoded)
encoded = layers.Dense(32, activation='relu')(encoded)
decoded = layers.Dense(64, activation='relu')(encoded)
decoded = layers.Dense(128, activation='relu')(decoded)
decoded = layers.Dense(784, activation='sigmoid')(decoded)
encoder = keras.Model(input_img, encoded)
autoencoder = keras.Model(input_img, decoded)
autoencoder.compile(optimizer='adam', loss='mse')
# input = output = X
autoencoder.fit(x_train, x_train, epochs=10, batch_size=128, shuffle=True,
                validation_data=(x_test, x_test))
# CNN
# Encoder
input_img = keras.Input(shape=(28, 28, 1))
x = layers.Conv2D(16, (3, 3), padding='same', activation='relu')(input_img)
x = layers.MaxPooling2D(2, 2)(x)
x = layers.Conv2D(32, (3, 3), padding='same', activation='relu')(x)
x = layers.MaxPooling2D(2, 2)(x)
x = layers.Conv2D(64, (3, 3), activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D(2, 2)(x)
encoder = keras.Model(input_img, encoded)
# Decoder
# at this point the representation of 'encoded' is (3, 3, 64)
# Deconvolution (transposed convolution)
x = layers.Conv2DTranspose(32, kernel_size=3, strides=2, activation='relu',
                           padding='valid')(encoded)
x = layers.Conv2DTranspose(16, kernel_size=3, strides=2, activation='relu',
                           padding='same')(x)
x = layers.Conv2DTranspose(1, kernel_size=3, strides=2, activation='sigmoid',
                           padding='same')(x)
decoded = layers.Reshape([28,28])(x)
autoencoder = keras.Model(input_img, decoded)
autoencoder.summary()
'''
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 28, 28, 1)] 0
_________________________________________________________________
conv2d (Conv2D) (None, 28, 28, 16) 160
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 16) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 14, 14, 32) 4640
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 7, 7, 32) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 7, 7, 64) 18496
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 3, 3, 64) 0
_________________________________________________________________
conv2d_transpose (Conv2DTran (None, 7, 7, 32) 18464
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 14, 14, 16) 4624
_________________________________________________________________
conv2d_transpose_2 (Conv2DTr (None, 28, 28, 1) 145
_________________________________________________________________
reshape (Reshape) (None, 28, 28) 0
=================================================================
Total params: 46,529
Trainable params: 46,529
Non-trainable params: 0
_________________________________________________________________
'''
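The decoder's 3 -> 7 -> 14 -> 28 progression follows from the output-size rule for transposed convolutions. The sketch below uses the rule as I understand Keras to apply it when no output padding is given. The helper names are made up for this example.

```python
def conv_transpose_out(size, kernel, stride, padding):
    """Output length of a transposed convolution along one spatial axis."""
    if padding == "same":
        return size * stride
    # 'valid': the kernel overhang beyond the stride adds to the output.
    return size * stride + max(kernel - stride, 0)

def conv_transpose_params(kernel, in_depth, n_filters):
    """kernel * kernel * in_depth weights per filter, plus one bias per filter."""
    return kernel * kernel * in_depth * n_filters + n_filters

size = 3
sizes = [size]
for padding in ("valid", "same", "same"):
    size = conv_transpose_out(size, kernel=3, stride=2, padding=padding)
    sizes.append(size)

params = [conv_transpose_params(3, d_in, d_out) for d_in, d_out in ((64, 32), (32, 16), (16, 1))]
print(sizes, params)   # [3, 7, 14, 28] [18464, 4624, 145]
```

The first layer uses 'valid' padding precisely because 3 * 2 = 6 would not lead back to 28. The extra pixel from the kernel overhang gives 7.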
VAE
Encoder
# original_dim = 28 * 28
# intermediate_dim = 64
# latent_dim = 2
inputs = keras.Input(shape=(28*28,))
h = layers.Dense(64, activation='relu')(inputs)
z_mean = layers.Dense(2)(h)
z_log_sigma = layers.Dense(2)(h)
z_mean.shape, z_log_sigma.shape
# (TensorShape([None, 2]), TensorShape([None, 2]))
from tensorflow.keras import backend as K
def sampling(args):
    z_mean, z_log_sigma = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], 2),
                              mean=0., stddev=0.1)
    return z_mean + K.exp(z_log_sigma) * epsilon # latent space
z = layers.Lambda(sampling)([z_mean, z_log_sigma]) # make sampling layer (Lambda)
# Create encoder
encoder = keras.Model(inputs, [z_mean, z_log_sigma, z], name='encoder')
encoder.summary()
'''
Model: "encoder"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_5 (InputLayer) [(None, 784)] 0
__________________________________________________________________________________________________
dense_20 (Dense) (None, 64) 50240 input_5[0][0]
__________________________________________________________________________________________________
dense_21 (Dense) (None, 2) 130 dense_20[0][0]
__________________________________________________________________________________________________
dense_22 (Dense) (None, 2) 130 dense_20[0][0]
__________________________________________________________________________________________________
lambda_3 (Lambda) (None, 2) 0 dense_21[0][0]
dense_22[0][0]
==================================================================================================
Total params: 50,500
Trainable params: 50,500
Non-trainable params: 0
__________________________________________________________________________________________________
'''
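The Lambda layer implements the reparameterization step. The randomness comes from a separate noise term, so `z` stays a deterministic function of `z_mean` and `z_log_sigma`. Here is a plain-Python sketch of the same formula. The encoder outputs, the seed and the helper name `sample_z` are assumptions. The noise scale of 0.1 mirrors the code above.

```python
import math
import random

def sample_z(z_mean, z_log_sigma, rng, stddev=0.1):
    """z = mean + exp(log_sigma) * epsilon, with epsilon ~ N(0, stddev**2) per dimension."""
    return [m + math.exp(s) * rng.gauss(0.0, stddev)
            for m, s in zip(z_mean, z_log_sigma)]

rng = random.Random(0)
z_mean, z_log_sigma = [1.0, -2.0], [0.0, -1.0]   # hypothetical encoder outputs, latent_dim = 2
samples = [sample_z(z_mean, z_log_sigma, rng) for _ in range(5000)]

# Averaged over many draws, the samples center on z_mean.
mean_z = [sum(s[d] for s in samples) / len(samples) for d in range(2)]
print([round(m, 1) for m in mean_z])   # [1.0, -2.0]
```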
Decoder
# Create decoder
latent_inputs = keras.Input(shape=(2,), name='z_sampling')
x = layers.Dense(64, activation='relu')(latent_inputs)
outputs = layers.Dense(28*28, activation='sigmoid')(x)
decoder = keras.Model(latent_inputs, outputs, name='decoder')
decoder.summary()
keras.utils.plot_model(decoder, "decoder_info.png", show_shapes=True)
'''
Model: "decoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
z_sampling (InputLayer) [(None, 2)] 0
_________________________________________________________________
dense_23 (Dense) (None, 64) 192
_________________________________________________________________
dense_24 (Dense) (None, 784) 50960
=================================================================
Total params: 51,152
Trainable params: 51,152
Non-trainable params: 0
_________________________________________________________________
'''
VAE (Encoder+Decoder)
outputs = decoder(encoder(inputs)[2]) # take only z-value
vae = keras.Model(inputs, outputs, name='vae_mlp')
vae.summary()
keras.utils.plot_model(vae, "vae_info.png", show_shapes=True)
'''
Model: "vae_mlp"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_5 (InputLayer) [(None, 784)] 0
_________________________________________________________________
encoder (Functional) [(None, 2), (None, 2), (N 50500
_________________________________________________________________
decoder (Functional) (None, 784) 51152
=================================================================
Total params: 101,652
Trainable params: 101,652
Non-trainable params: 0
_________________________________________________________________
'''
Lab3
Caption testing on pre-trained model
- Download the model
- Import the libraries
- Load the trained model, image, and tokenizer
Training using pre-trained model and smaller dataset
- Extract features on photo images.
- load a model to extract features from our images
- Prepare text data (photograph - description)
- load the file containing all of the descriptions
- map the descriptions to corresponding photo
- clean the text of descriptions
- summarize the size of the vocabulary. (faster training)
- save the dictionary of image identifiers - descriptions
Retrain model using the prepared data
- Load the training data
- Load the prepared descriptions
- Load the prepared photos
- Load the tokenizer
- Transform data to input-output pairs for training the model.
- Define the model
- Fit the model
Evaluate the trained model with test dataset
- Load the dataset, its features and descriptions, and the tokenizer
- Define a function that can generate a description for a photo using the trained model
- Evaluate a trained model against a given test dataset (BLEU score)
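To give a feel for the metric in the last step, here is a simplified unigram-only BLEU for one reference and one non-empty candidate caption. Full BLEU combines 1- to 4-gram precisions over a whole corpus, and the lab most likely relied on a library implementation. The helper name `bleu1` and the example captions are invented.

```python
import math
from collections import Counter

def bleu1(reference, candidate):
    """Unigram BLEU for one reference: clipped precision times the brevity penalty."""
    ref_counts, cand_counts = Counter(reference), Counter(candidate)
    # A candidate word only counts as often as it appears in the reference.
    clipped = sum(min(count, ref_counts[word]) for word, count in cand_counts.items())
    precision = clipped / len(candidate)
    # Penalize candidates that are shorter than the reference.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * precision

reference = "a dog is running on the grass".split()
candidate = "a dog runs on grass".split()
score = bleu1(reference, candidate)
print(round(score, 3))   # 0.536
```

Four of the five candidate words appear in the reference (precision 0.8), and the brevity penalty exp(1 - 7/5) lowers the score because the caption is shorter than the reference.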