3. Making new Layers and Models via subclassing


import tensorflow as tf
import numpy as np
from tensorflow import keras

The Layer class: the combination of state (weights) and some computation

A layer encapsulates both a state (the layer's "weights") and a transformation from inputs to outputs (a "call", the layer's forward pass).

class CustomLinear1(keras.layers.Layer):
    def __init__(self, d_in, d_out):
        super().__init__()

        # State: a weight matrix and a bias vector, created eagerly
        # because the input and output dimensions are known up front.
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(
            initial_value=w_init(shape=(d_in, d_out), dtype='float32'),
            trainable=True,
        )
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(
            initial_value=b_init(shape=(d_out,), dtype='float32'),
            trainable=True,
        )

    def call(self, inputs):
        # Computation: the layer's forward pass.
        return tf.matmul(inputs, self.w) + self.b

x = tf.ones((2, 2))
linear_layer = CustomLinear1(2, 4)
y = linear_layer(x)
print('result: ', y)
result:  tf.Tensor(
[[0.03218924 0.00798053 0.04283624 0.02556972]
 [0.03218924 0.00798053 0.04283624 0.02556972]], shape=(2, 4), dtype=float32)
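
Note that w and b are automatically tracked by the layer once they are set as attributes. A quick check (not part of the original snippet):

assert linear_layer.weights == [linear_layer.w, linear_layer.b]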

Note you also have access to a quicker shortcut for adding weights to a layer: the add_weight() method:

class CustomLinear2(keras.layers.Layer):
    def __init__(self, d_in, d_out):
        super().__init__()

        self.w = self.add_weight(
            shape=(d_in, d_out), initializer='random_normal', trainable=True
        )
        # The bias is deliberately non-trainable here, to demonstrate
        # the non_trainable_weights property below.
        self.b = self.add_weight(
            shape=(d_out,), initializer='zeros', trainable=False
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

x = tf.ones((2, 2))
linear_layer = CustomLinear2(2, 4)
y = linear_layer(x)
print('result: ', y)
print('\nweights: ', linear_layer.weights)
print('\nnon-trainable weights: ', linear_layer.non_trainable_weights)
print('\ntrainable weights: ', linear_layer.trainable_weights)
result:  tf.Tensor(
[[ 0.19004892  0.07329603 -0.06572916 -0.14658041]
 [ 0.19004892  0.07329603 -0.06572916 -0.14658041]], shape=(2, 4), dtype=float32)

weights:  [<tf.Variable 'Variable:0' shape=(2, 4) dtype=float32, numpy=
array([[ 0.03415267,  0.04970514, -0.04784032, -0.15157548],
       [ 0.15589625,  0.02359089, -0.01788885,  0.00499506]],
      dtype=float32)>, <tf.Variable 'Variable:0' shape=(4,) dtype=float32, numpy=array([0., 0., 0., 0.], dtype=float32)>]

non-trainable weights:  [<tf.Variable 'Variable:0' shape=(4,) dtype=float32, numpy=array([0., 0., 0., 0.], dtype=float32)>]

trainable weights:  [<tf.Variable 'Variable:0' shape=(2, 4) dtype=float32, numpy=
array([[ 0.03415267,  0.04970514, -0.04784032, -0.15157548],
       [ 0.15589625,  0.02359089, -0.01788885,  0.00499506]],
      dtype=float32)>]

Deferring weight creation until the shape of the inputs is known

In many cases, you may not know the size of your inputs in advance, and you would like to lazily create weights when that value becomes known, some time after instantiating the layer.
In the Keras API, we recommend creating layer weights in the build(self, input_shape) method of your layer. The __call__() method of your layer will automatically run build() the first time it is called.

class CustomLinear3(keras.layers.Layer):
    def __init__(self, d_out):
        super().__init__()
        self.d_out = d_out

    def build(self, input_shape):
        # Weights are created lazily, once the input shape is known.
        self.w = self.add_weight(
            shape=(input_shape[-1], self.d_out),
            initializer='random_normal',
            trainable=True
        )
        self.b = self.add_weight(
            shape=(self.d_out,),
            initializer='random_normal',
            trainable=True
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

linear_layer = CustomLinear3(32)
y = linear_layer(x)
print('result: ', y)
result:  tf.Tensor(
[[-0.0533622  -0.05335495 -0.00513582 -0.01051114 -0.0926936   0.1039307
  -0.14676946  0.03698617 -0.11198049 -0.12905014 -0.02261585 -0.03237192
   0.06992531  0.06582534  0.0221484   0.00312662  0.0376885   0.04436354
  -0.01268116 -0.08066177  0.02846039 -0.1235311   0.07653584  0.0125522
   0.033754   -0.05295909 -0.04116435 -0.0397236  -0.04736534 -0.13913187
   0.00132135  0.18220991]
 [-0.0533622  -0.05335495 -0.00513582 -0.01051114 -0.0926936   0.1039307
  -0.14676946  0.03698617 -0.11198049 -0.12905014 -0.02261585 -0.03237192
   0.06992531  0.06582534  0.0221484   0.00312662  0.0376885   0.04436354
  -0.01268116 -0.08066177  0.02846039 -0.1235311   0.07653584  0.0125522
   0.033754   -0.05295909 -0.04116435 -0.0397236  -0.04736534 -0.13913187
   0.00132135  0.18220991]], shape=(2, 32), dtype=float32)
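
Since the weights only come into existence in build(), a fresh CustomLinear3 has no weights until its first call. A small check to illustrate:

lazy_layer = CustomLinear3(32)
print(lazy_layer.weights)        # [] -- build() has not run yet
_ = lazy_layer(tf.ones((2, 2)))
print(len(lazy_layer.weights))   # 2 -- w and b created on the first call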

Layers are recursively composable

If you assign a Layer instance as an attribute of another Layer, the outer layer will start tracking the weights created by the inner layer.
We recommend creating such sublayers in the __init__() method and leaving it to the first __call__() to trigger building their weights.

class Linears(keras.layers.Layer):
    def __init__(self):
        super().__init__()
        self.linear1 = CustomLinear3(32)
        self.linear2 = CustomLinear3(32)
        self.linear3 = CustomLinear3(1)

    def call(self, inputs):
        x = self.linear1(inputs)
        x = tf.nn.relu(x)
        x = self.linear2(x)
        x = tf.nn.relu(x)
        return self.linear3(x)

linears = Linears()
y = linears(tf.ones(shape=(3, 64)))
print('weights: ', len(linears.weights))
print('trainable weights: ', len(linears.trainable_weights))
weights:  6
trainable weights:  6

The add_loss() and add_metric() methods

When writing the call() method of a layer, you can create loss and metric tensors that you will want to use later, when writing your training loop. This is doable by calling self.add_loss(loss_value) and self.add_metric(metric_value).
These values can be retrieved via layer.losses and layer.metrics. The losses property is reset at the start of every __call__() to the top-level layer, so it always contains the values created during the last forward pass. When training with fit(), these layer losses are summed into the main loss, and added metrics are tracked alongside the main metrics.

class LogisticEndpoint(keras.layers.Layer):
    def __init__(self):
        super().__init__()

        self.loss_fn = keras.losses.BinaryCrossentropy(from_logits=True)
        self.accuracy_fn = keras.metrics.BinaryAccuracy()

    def call(self, targets, logits):
        # Compute the training-time loss and add it to the layer.
        loss = self.loss_fn(targets, logits)
        self.add_loss(loss)

        # Log accuracy as a metric and add it to the layer.
        acc = self.accuracy_fn(targets, logits)
        self.add_metric(acc)

        return tf.nn.softmax(logits)

layer = LogisticEndpoint()

targets = tf.ones((2, 2))
logits = tf.ones((2, 2))
y = layer(targets, logits)

print('layer.metrics: ', layer.metrics)
print('layer.losses: ', layer.losses)

inputs = keras.Input(shape=(3,), name='inputs')
targets = keras.Input(shape=(10,), name='targets')
logits = keras.layers.Dense(10)(inputs)
predictions = LogisticEndpoint()(targets, logits)

model = keras.Model(inputs=[inputs, targets], outputs=predictions)
model.compile(optimizer='adam')

data = {
    'inputs': np.random.random((3, 3)),
    'targets': np.random.random((3, 10))
}
model.fit(data)
layer.metrics:  [<keras.metrics.BinaryAccuracy object at 0x7f39090a4990>]
layer.losses:  [<tf.Tensor: shape=(), dtype=float32, numpy=0.3132617>]
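
Because layer.losses is reset at the start of every top-level __call__(), repeated calls do not accumulate loss values. A quick check:

layer = LogisticEndpoint()
_ = layer(tf.ones((2, 2)), tf.ones((2, 2)))
_ = layer(tf.ones((2, 2)), tf.ones((2, 2)))
assert len(layer.losses) == 1  # only the loss from the most recent call is kept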

You can optionally enable serialization on your layers

If you need your custom layers to be serializable as part of a Functional model, you can optionally implement the get_config() method.
Note that the __init__() method of the base Layer class takes some keyword arguments, in particular a name and a dtype. It's good practice to pass these arguments to the parent class in __init__() and to include them in the layer config.

class CustomLinear4(keras.layers.Layer):
    def __init__(self, d_out=32, **kwargs):
        super(CustomLinear4, self).__init__(**kwargs)
        self.d_out = d_out

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.d_out),
            initializer='random_normal',
            trainable=True
        )
        self.b = self.add_weight(
            shape=(self.d_out,),
            initializer='random_normal',
            trainable=True
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        config = super(CustomLinear4, self).get_config()
        config.update({'d_out': self.d_out})
        return config

custom_layer = CustomLinear4(64)
config = custom_layer.get_config()
print('config: ', config)
new_layer = CustomLinear4.from_config(config)
config:  {'name': 'custom_linear4_3', 'trainable': True, 'dtype': 'float32', 'd_out': 64}
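
When a saved Functional model contains this layer, the class also has to be supplied at loading time so that Keras can find it. A minimal sketch, assuming a hypothetical save path 'model_with_custom_linear':

inputs = keras.Input(shape=(16,))
outputs = CustomLinear4(64)(inputs)
model = keras.Model(inputs, outputs)
model.save('model_with_custom_linear')

# The custom class must be passed via custom_objects when loading.
restored = keras.models.load_model(
    'model_with_custom_linear',
    custom_objects={'CustomLinear4': CustomLinear4},
)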

Privileged training argument in the call() method

Some layers have different behaviors during training and inference. For such layers, it is standard practice to expose a training (boolean) argument in the call() method.
By exposing this argument in call(), you enable the built-in training and evaluation loops to correctly use the layer in training and inference.

class CustomDropout(keras.layers.Layer):
    def __init__(self, rate, **kwargs):
        super(CustomDropout, self).__init__(**kwargs)
        self.rate = rate

    def call(self, inputs, training=None):
        # Only apply dropout during training; at inference time,
        # pass the inputs through unchanged.
        if training:
            return tf.nn.dropout(inputs, rate=self.rate)
        return inputs
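
Inside fit(), Keras passes training=True to this layer automatically; when calling the layer directly, you can set it yourself. A small usage sketch:

dropout = CustomDropout(0.5)
x = tf.ones((2, 4))
print(dropout(x, training=True))   # some units zeroed, the rest scaled by 1/(1 - rate)
print(dropout(x))                  # inference: inputs pass through unchanged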

Privileged mask argument in the call() method

The other privileged argument supported by call() is the mask argument.
Keras will automatically pass the correct mask argument to __call__() for layers that support it, when a mask is generated by a prior layer. Mask-generating layers are the Embedding layer configured with mask_zero=True, and the Masking layer.
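
As an illustration (not from the original post), a mask-consuming layer simply declares mask in its call() signature. Here is a sketch of masked mean pooling over the time axis:

class MaskedMeanPool(keras.layers.Layer):
    def call(self, inputs, mask=None):
        # inputs: (batch, time, features); mask: (batch, time) booleans
        if mask is None:
            return tf.reduce_mean(inputs, axis=1)
        mask = tf.cast(mask, inputs.dtype)[:, :, tf.newaxis]
        # Average only over the unmasked timesteps.
        return tf.reduce_sum(inputs * mask, axis=1) / tf.reduce_sum(mask, axis=1)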

The Model class

The Model class has the same API as Layer, with the following differences:

  • It exposes built-in training, evaluation, and prediction loops (model.fit(), model.evaluate(), model.predict())
  • It exposes the list of its inner layers, via the model.layers property
  • It exposes saving and serialization APIs (save(), save_weights())

 

Ask yourself:

  • Will I need to call fit() on it?
  • Will I need to call save() on it?

If so, use Model. If not, use Layer.

class CustomModel(tf.keras.Model):
    def __init__(self, d_out, **kwargs):
        super(CustomModel, self).__init__(**kwargs)
        self.layer1 = CustomLinear4(d_out)
        self.layer2 = CustomLinear4(d_out)

    def call(self, inputs):
        x = self.layer1(inputs)
        return self.layer2(x)

custom_model = CustomModel(5)
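
Since CustomModel subclasses Model, it immediately supports the built-in loops and saving. A quick usage sketch with made-up random data (the weights path is hypothetical):

custom_model.compile(optimizer='adam', loss='mse')
x_train = np.random.random((16, 8))
y_train = np.random.random((16, 5))
custom_model.fit(x_train, y_train, epochs=1)
custom_model.save_weights('custom_model_weights')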
