5. Introduction to modules, layers, and models

2022. 3. 12. 13:54 | Tool/TensorFlow

To do machine learning in TensorFlow, you are likely to need to define, save, and restore a model.

A model is, abstractly:

  • A function that computes something on tensors (a forward pass)
  • Some variables that can be updated in response to training

import tensorflow as tf
from datetime import datetime

%load_ext tensorboard

Defining models and layers in TensorFlow

In TensorFlow, most high-level implementations of layers and models are built on the same foundational class: tf.Module.
Modules and Layers have internal state, and methods that use that state.

class SimpleDense(tf.Module):
    def __init__(self, in_f, out_f, name=None):
        super().__init__(name=name)

        self.w = tf.Variable(tf.random.normal([in_f, out_f]), name='w')
        # b is non-trainable, so it appears in variables but not in trainable_variables
        self.b = tf.Variable(tf.zeros([out_f]), trainable=False, name='b')

    def __call__(self, x):
        y = tf.matmul(x, self.w) + self.b
        return tf.nn.relu(y)

simple_dense = SimpleDense(2, 2)
print('Model result: ', simple_dense(tf.constant([[3.0, 2.0]])))
print('\ntrainable variables: ')
for var in simple_dense.trainable_variables:
    print(var, '\n')
print('\nVariables: ')
for var in simple_dense.variables:
    print(var, '\n')
Model result:  tf.Tensor([[2.545136  7.0265546]], shape=(1, 2), dtype=float32)

trainable variables: 
<tf.Variable 'w:0' shape=(2, 2) dtype=float32, numpy=
array([[0.43900734, 1.0092689 ],
       [0.61405694, 1.9993739 ]], dtype=float32)> 


Variables: 
<tf.Variable 'b:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)> 

<tf.Variable 'w:0' shape=(2, 2) dtype=float32, numpy=
array([[0.43900734, 1.0092689 ],
       [0.61405694, 1.9993739 ]], dtype=float32)> 
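
The second half of the abstract definition, "variables that can be updated in response to training", can be sketched with a single manual gradient step. This is only an illustration under assumptions: TinyDense, the all-ones initial weights, the squared-error loss, and the learning rate of 0.01 are hypothetical choices that do not come from the original post, and the ReLU is left out so the gradient cannot vanish.

```python
import tensorflow as tf

class TinyDense(tf.Module):
    """Hypothetical linear layer used only for this illustration."""
    def __init__(self, in_f, out_f, name=None):
        super().__init__(name=name)
        self.w = tf.Variable(tf.ones([in_f, out_f]), name='w')
        # Non-trainable on purpose, mirroring SimpleDense above
        self.b = tf.Variable(tf.zeros([out_f]), trainable=False, name='b')

    def __call__(self, x):
        return tf.matmul(x, self.w) + self.b

layer = TinyDense(2, 1)
x = tf.constant([[3.0, 2.0]])
target = tf.constant([[1.0]])

# Record the forward pass so the loss can be differentiated
with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(layer(x) - target))

# Only trainable variables are updated; b is skipped
grads = tape.gradient(loss, layer.trainable_variables)
learning_rate = 0.01  # assumed value
for var, grad in zip(layer.trainable_variables, grads):
    var.assign_sub(learning_rate * grad)

new_loss = tf.reduce_mean(tf.square(layer(x) - target))
print('loss before:', float(loss), 'loss after:', float(new_loss))
```

The loss should drop after the step, and only w changes because b was created with trainable=False.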

By subclassing tf.Module, any tf.Variable or tf.Module instances assigned to this object's properties are automatically collected, and the collection is recursive: variables and modules inside submodules are found as well.
This allows you to manage collections of tf.Modules with a single model instance, and to save and load variables or whole models.

class SequentialDense(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)

        self.dense_1 = SimpleDense(in_f=3, out_f=3)
        self.dense_2 = SimpleDense(in_f=3, out_f=2)

    def __call__(self, x):
        x = self.dense_1(x)
        return self.dense_2(x)

seq_dense = SequentialDense()

print('Model result: ', seq_dense(tf.constant([[3.0, 2.0, 1.0]])))
print('\nSubmodules: ', seq_dense.submodules)
print('\nVariables: ')
for var in seq_dense.variables:
    print(var, '\n')
Model result:  tf.Tensor([[0. 0.]], shape=(1, 2), dtype=float32)

Submodules:  (<__main__.SimpleDense object at 0x000001C18D683E08>, <__main__.SimpleDense object at 0x000001C18D685348>)

Variables: 
<tf.Variable 'b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)> 

<tf.Variable 'w:0' shape=(3, 3) dtype=float32, numpy=
array([[-1.114024  , -0.80356526,  0.3159744 ],
       [ 1.478429  ,  0.7592096 ,  0.5036542 ],
       [-0.44235912, -0.14899522, -0.806919  ]], dtype=float32)> 

<tf.Variable 'b:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)> 

<tf.Variable 'w:0' shape=(3, 2) dtype=float32, numpy=
array([[-0.5780724 ,  0.39984643],
       [ 0.8962458 ,  0.27314118],
       [-0.07528466, -0.07791077]], dtype=float32)> 
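
The recursive collection described above also reaches modules stored inside ordinary Python containers. The sketch below assumes two hypothetical classes, Dense and Stack, that are not part of the original post. It keeps the layers in a plain list and checks that tf.Module still finds every submodule and variable.

```python
import tensorflow as tf

class Dense(tf.Module):
    """Hypothetical dense layer, similar to SimpleDense but fully trainable."""
    def __init__(self, in_f, out_f, name=None):
        super().__init__(name=name)
        self.w = tf.Variable(tf.random.normal([in_f, out_f]), name='w')
        self.b = tf.Variable(tf.zeros([out_f]), name='b')

    def __call__(self, x):
        return tf.nn.relu(tf.matmul(x, self.w) + self.b)

class Stack(tf.Module):
    """Hypothetical container that chains Dense layers of the given sizes."""
    def __init__(self, sizes, name=None):
        super().__init__(name=name)
        # Modules held in a plain Python list are still tracked
        self.layers = [Dense(i, o) for i, o in zip(sizes[:-1], sizes[1:])]

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

stack = Stack([3, 4, 4, 2])
result = stack(tf.constant([[3.0, 2.0, 1.0]]))

print('Submodules found:', len(stack.submodules))
print('Variables found: ', len(stack.variables))
print('Output shape:    ', result.shape)
```

Three layers with one weight and one bias each should give three submodules and six variables.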

Waiting to create variables

By deferring variable creation to the first time the module is called with a specific input shape, you do not need to specify the input size up front.

class FlexibleDense(tf.Module):
    def __init__(self, out_f, name=None):
        super().__init__(name=name)

        self.is_built = False
        self.out_f = out_f

    def __call__(self, x):

        if not self.is_built:
            self.w = tf.Variable(tf.random.normal([x.shape[-1], self.out_f]), name='w')
            self.b = tf.Variable(tf.zeros([self.out_f]), name='b')
            self.is_built = True

        y = tf.matmul(x, self.w) + self.b
        return tf.nn.relu(y)

class FlexSeqDense(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)

        self.dense_1 = FlexibleDense(out_f=3)
        self.dense_2 = FlexibleDense(out_f=2)

    def __call__(self, x):
        x = self.dense_1(x)
        return self.dense_2(x)

flex_seq_dense = FlexSeqDense()

print('Model result: ', flex_seq_dense(tf.constant([[3.0, 2.0, 1.0]])))
Model result:  tf.Tensor([[0. 0.]], shape=(1, 2), dtype=float32)
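
To make the effect of deferred creation visible, the following sketch uses a hypothetical LazyDense class that follows the same pattern as FlexibleDense. It builds two instances and feeds them inputs of different widths. Before the first call neither module owns any variables. Afterwards each weight matrix has taken its first dimension from the input it saw.

```python
import tensorflow as tf

class LazyDense(tf.Module):
    """Hypothetical layer that creates its variables on the first call."""
    def __init__(self, out_f, name=None):
        super().__init__(name=name)
        self.out_f = out_f
        self.is_built = False

    def __call__(self, x):
        if not self.is_built:
            # The input width is read from the last dimension of x
            self.w = tf.Variable(tf.random.normal([x.shape[-1], self.out_f]), name='w')
            self.b = tf.Variable(tf.zeros([self.out_f]), name='b')
            self.is_built = True
        return tf.nn.relu(tf.matmul(x, self.w) + self.b)

narrow = LazyDense(out_f=3)
wide = LazyDense(out_f=3)

vars_before = len(narrow.variables)  # nothing has been created yet
narrow(tf.ones([1, 2]))              # creates w with shape [2, 3]
wide(tf.ones([4, 5]))                # creates w with shape [5, 3]

print('variables before first call:', vars_before)
print('narrow.w shape:', narrow.w.shape)
print('wide.w shape:  ', wide.w.shape)
```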

Saving weights

Checkpoints are just the weights (that is, the values of the set of variables inside the module and its submodules).
Checkpoints consist of two kinds of files: the data itself and an index file for metadata.

  • The index file keeps track of what is actually saved and the numbering of checkpoints.
  • The checkpoint data contains the variable values and their attribute lookup paths.

# save weights
ckpt_path = 'my_ckpt'
ckpt = tf.train.Checkpoint(model=flex_seq_dense)
ckpt.write(ckpt_path)

# list the saved variables, keyed by their attribute path in the Python object graph
for var in tf.train.list_variables(ckpt_path):
    print(var)

new_model = FlexSeqDense()

# load weights
new_ckpt = tf.train.Checkpoint(model=new_model)
new_ckpt.restore(ckpt_path)

print('New model result: ', new_model(tf.constant([[30.0, 20.0, 10.0]])))
('_CHECKPOINTABLE_OBJECT_GRAPH', [])
('model/dense_1/b/.ATTRIBUTES/VARIABLE_VALUE', [3])
('model/dense_1/w/.ATTRIBUTES/VARIABLE_VALUE', [3, 3])
('model/dense_2/b/.ATTRIBUTES/VARIABLE_VALUE', [2])
('model/dense_2/w/.ATTRIBUTES/VARIABLE_VALUE', [3, 2])
New model result:  tf.Tensor([[0. 0.]], shape=(1, 2), dtype=float32)
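
Because ReLU can turn every output into zero, comparing model outputs is a weak check that the weights really were restored. The sketch below compares the weights directly instead. The Dense class and the temporary directory are hypothetical, and it uses tf.train.Checkpoint.read, the counterpart of write, in place of the restore call shown above, so that assert_consumed can confirm every saved value was matched.

```python
import os
import tempfile
import tensorflow as tf

class Dense(tf.Module):
    """Hypothetical layer whose variables are created eagerly."""
    def __init__(self, in_f, out_f, name=None):
        super().__init__(name=name)
        self.w = tf.Variable(tf.random.normal([in_f, out_f]), name='w')
        self.b = tf.Variable(tf.zeros([out_f]), name='b')

    def __call__(self, x):
        return tf.nn.relu(tf.matmul(x, self.w) + self.b)

source = Dense(3, 2)
target = Dense(3, 2)  # separate instance with its own random weights

# Write the source weights to a throwaway location
prefix = os.path.join(tempfile.mkdtemp(), 'demo_ckpt')
save_path = tf.train.Checkpoint(model=source).write(prefix)

# Read them into the target and check that nothing was left unmatched
status = tf.train.Checkpoint(model=target).read(save_path)
status.assert_consumed()

weights_match = bool(tf.reduce_all(tf.equal(source.w, target.w)))
print('weights match after loading:', weights_match)
```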

Saving Graphs

To run a model without the original Python code, TensorFlow needs its own description of the computations. That description is a graph, which you create by decorating the forward pass with tf.function.

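# FlexSeqDense is redefined here from SimpleDense layers, with __call__ compiled by tf.function
class FlexSeqDense(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)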

        self.dense_1 = SimpleDense(in_f=3, out_f=3)
        self.dense_2 = SimpleDense(in_f=3, out_f=2)

    @tf.function
    def __call__(self, x):
        x = self.dense_1(x)
        return self.dense_2(x)

tf_dense = FlexSeqDense()
print('graph result: ', tf_dense(tf.constant([[1.0, 2.0, 3.0]])))
print('graph result: ', tf_dense(tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])))
graph result:  tf.Tensor([[0. 0.]], shape=(1, 2), dtype=float32)
graph result:  tf.Tensor(
[[0. 0.]
 [0. 0.]], shape=(2, 2), dtype=float32)
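
The two calls above use different batch sizes. One way to state up front which inputs a graph should accept is the input_signature argument of tf.function. The sketch below uses a hypothetical GraphDense class and an assumed input width of 3, and leaves the batch dimension as None so that any batch size matches the signature.

```python
import tensorflow as tf

class GraphDense(tf.Module):
    """Hypothetical layer whose forward pass is compiled into a graph."""
    def __init__(self, in_f, out_f, name=None):
        super().__init__(name=name)
        self.w = tf.Variable(tf.random.normal([in_f, out_f]), name='w')
        self.b = tf.Variable(tf.zeros([out_f]), name='b')

    # None leaves the batch dimension unspecified
    @tf.function(input_signature=[tf.TensorSpec(shape=[None, 3], dtype=tf.float32)])
    def __call__(self, x):
        return tf.nn.relu(tf.matmul(x, self.w) + self.b)

graph_dense = GraphDense(3, 2)

one_row = graph_dense(tf.constant([[1.0, 2.0, 3.0]]))
two_rows = graph_dense(tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]))

print('one row: ', one_row.shape)
print('two rows:', two_rows.shape)
```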

Visualize the graph by tracing it within a TensorBoard summary.

# set up logging
stamp = datetime.now().strftime('%Y%m%d-%H%M%S')
logdir = f"logs/func/{stamp}"
writer = tf.summary.create_file_writer(logdir)

# create new model to get a fresh trace
new_model = FlexSeqDense()

tf.summary.trace_on(graph=True)
#tf.profiler.experimental.start(logdir)
new_model(tf.constant([[2.0, 2.0, 2.0]]))
with writer.as_default():
    tf.summary.trace_export(
        name='flexible_dense',
        step=0,
        profiler_outdir=logdir
    )
%tensorboard --logdir logs/func
Reusing TensorBoard on port 6006 (pid 20208), started 0:04:19 ago. (Use '!kill 20208' to kill it.)

Creating a SavedModel

A SavedModel contains both a collection of functions and a collection of weights.
Models and layers can be loaded from this representation without creating an instance of the class that produced it.

# save model
tf.saved_model.save(tf_dense, 'the_saved_model')
new_model = tf.saved_model.load('the_saved_model')
print('Is new_model instance of FlexSeqDense: ', isinstance(new_model, FlexSeqDense))
print('SavedModel result: ', new_model(tf.constant([[2.0, 2.0, 2.0]])))
INFO:tensorflow:Assets written to: the_saved_model\assets
Is new_model instance of FlexSeqDense:  False
SavedModel result:  tf.Tensor([[0. 0.]], shape=(1, 2), dtype=float32)
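
The following sketch repeats the SavedModel round trip in a form that can be checked. ExportableDense and the temporary export directory are hypothetical. The layer is kept linear so that the outputs are not trivially zero, and an input_signature is used so that a graph exists to export even if the module has never been called.

```python
import os
import tempfile
import tensorflow as tf

class ExportableDense(tf.Module):
    """Hypothetical linear layer with a graph-compiled forward pass."""
    def __init__(self, in_f, out_f, name=None):
        super().__init__(name=name)
        self.w = tf.Variable(tf.random.normal([in_f, out_f]), name='w')
        self.b = tf.Variable(tf.zeros([out_f]), name='b')

    @tf.function(input_signature=[tf.TensorSpec(shape=[None, 3], dtype=tf.float32)])
    def __call__(self, x):
        return tf.matmul(x, self.w) + self.b

module = ExportableDense(3, 2)
export_dir = os.path.join(tempfile.mkdtemp(), 'demo_saved_model')

tf.saved_model.save(module, export_dir)
restored = tf.saved_model.load(export_dir)

x = tf.constant([[2.0, 2.0, 2.0]])
same_class = isinstance(restored, ExportableDense)
max_diff = float(tf.reduce_max(tf.abs(module(x) - restored(x))))

print('restored object is an ExportableDense:', same_class)
print('largest output difference:', max_diff)
```

The restored object is not an instance of the original class, yet it should reproduce the original outputs because the graph and the weights travel together.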

Keras models and layers

In this section, you will examine how Keras uses tf.Module.

Keras layers

tf.keras.layers.Layer is the base class of all Keras layers, and it inherits from tf.Module.
You can convert a module into a Keras layer just by swapping out the parent and then changing __call__ to call:

class KerasDense(tf.keras.layers.Layer):
    def __init__(self, in_f, out_f, **kwargs):
        super().__init__(**kwargs)

        self.w = tf.Variable(tf.random.normal([in_f, out_f]), name='w')
        self.b = tf.Variable(tf.zeros([out_f]), name='b')

    def call(self, x):
        y = tf.matmul(x, self.w) + self.b
        return tf.nn.relu(y)

keras_dense = KerasDense(in_f=3, out_f=3)
print('keras dense result: ', keras_dense([[2.0, 2.0, 2.0]]))
keras dense result:  tf.Tensor([[0.        0.        1.1862065]], shape=(1, 3), dtype=float32)

The build step

Keras layers come with an extra lifecycle step that allows you more flexibility in how you define your layers. This step is defined in the build method, which receives the shape of the input the first time the layer is called.
Since build is only called once, later inputs will be rejected if their shape is not compatible with the layer's variables.

class FlexibleDense(tf.keras.layers.Layer):
    def __init__(self, out_f, **kwargs):
        super().__init__(**kwargs)

        self.out_f = out_f

    def build(self, input_shape):
        self.w = tf.Variable(tf.random.normal([input_shape[-1], self.out_f]), name='w')
        self.b = tf.Variable(tf.zeros([self.out_f]), name='b')

    def call(self, x):
        y = tf.matmul(x, self.w) + self.b
        return tf.nn.relu(y)

flexible_dense = FlexibleDense(out_f=3)
print('flexible dense result: ', flexible_dense(tf.constant([[2.0, 2.0, 2.0]])))
flexible dense result:  tf.Tensor([[0.        2.3967142 1.7811047]], shape=(1, 3), dtype=float32)
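
The rejection of incompatible inputs can be seen by calling a built layer with a different input width. The sketch below is an illustration under assumptions: BuiltDense is a hypothetical class, it creates its variables with the Keras add_weight helper rather than tf.Variable, and it catches a generic Exception because the exact error type raised for the shape mismatch may differ between TensorFlow and Keras versions.

```python
import tensorflow as tf

class BuiltDense(tf.keras.layers.Layer):
    """Hypothetical layer that creates its variables in build."""
    def __init__(self, out_f, **kwargs):
        super().__init__(**kwargs)
        self.out_f = out_f

    def build(self, input_shape):
        # Runs once, on the first call, with the shape of that first input
        self.w = self.add_weight(name='w', shape=(input_shape[-1], self.out_f),
                                 initializer='random_normal')
        self.b = self.add_weight(name='b', shape=(self.out_f,),
                                 initializer='zeros')

    def call(self, x):
        return tf.nn.relu(tf.matmul(x, self.w) + self.b)

layer = BuiltDense(out_f=3)
layer(tf.constant([[2.0, 2.0, 2.0]]))  # build runs here with input width 3
w_shape = tuple(layer.w.shape)

try:
    # Width 4 no longer matches the 3-row weight matrix
    layer(tf.constant([[2.0, 2.0, 2.0, 2.0]]))
    rejected = False
except Exception as err:  # exact exception type is version dependent
    rejected = True
    print('Rejected with', type(err).__name__)

print('w shape:', w_shape, '| second input rejected:', rejected)
```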

Keras models

You can define your model as nested Keras layers.
Keras also provides a full-featured model class called tf.keras.Model which inherits from tf.keras.layers.Layer.

class KerasModel(tf.keras.Model):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

        self.dense_1 = FlexibleDense(out_f=3)
        self.dense_2 = FlexibleDense(out_f=2)

    def call(self, x):
        x = self.dense_1(x)
        return self.dense_2(x)

keras_model = KerasModel()
print('keras model result: ', keras_model(tf.constant([[1., 2., 3.]])))
keras model result:  tf.Tensor([[0. 0.]], shape=(1, 2), dtype=float32)

Saving Keras Models

Keras models are saved with tf.keras.Model.save and loaded with tf.keras.models.load_model.
A Keras SavedModel also saves metric, loss, and optimizer states.

# save model
keras_model.save('keras_model')

# load model
new_model = tf.keras.models.load_model("keras_model")
print('new model result: ', new_model(tf.constant([[1., 2., 3.]])))
INFO:tensorflow:Assets written to: keras_model\assets
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
new model result:  tf.Tensor([[0. 0.]], shape=(1, 2), dtype=float32)
