All posts (108)
-
[PyTorch] Writing Distributed Applications with PyTorch
Introduction The distributed package included in PyTorch (i.e., torch.distributed) enables researchers and practitioners to easily parallelize their computations across processes and clusters of machines. To do so, it leverages message-passing semantics, allowing each process to communicate data to any of the other processes. Setup In order to get started, we need the ability to run multiple proces..
2022.06.10 -
[Paper Review] Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Abstract In this work, we present our techniques for training very large transformer models and implement a simple, efficient intra-layer model parallel approach that enables training transformer models with billions of parameters. We sustain 15.1 PetaFLOPs across the entire application with 76% scaling efficiency when compared to a strong single GPU baseline that sustains 39 TeraFLOPs, wh..
2022.06.07 -
[Paper Review] LaMDA: Language Models for Dialog Applications
Abstract We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of (1) safety and (2) factual grounding. We also explore the use of LaMDA in (3) the domains of education and content recommendations to investigate its potential and shortcomings. 1. Introduction According t..
2022.05.24 -
3. Making new Layers and Models via subclassing
import tensorflow as tf import numpy as np from tensorflow import keras The Layer class: the combination of state (weights) and some computation A layer encapsulates both a state (the layer's "weights") and a transformation from inputs to outputs (a "call", the layer's forward pass). class CustomLinear1(keras.layers.Layer): def __init__(self, d_in, d_out): super().__init__() w_init = tf.random_n..
2022.03.12 -
2. The Functional API
The main idea is that a deep learning model is usually a directed acyclic graph (DAG) of layers, so the functional API is a way to build graphs of layers. import tensorflow as tf from tensorflow import keras from tensorflow.keras import layers Create a model with the functional API To build this model using the functional API: Creating an input node (keras.Input) which returns information about the shape an..
2022.03.12 -
6. Basic training loops
You have learned about tensors, variables, gradient tape, and modules. In this guide, you will fit these all together to train models. import tensorflow as tf import matplotlib.pyplot as plt Solving machine learning problems Solving a machine learning problem usually consists of the following steps: Obtain training data Define the model Define a loss function Run through the training data, calcu..
2022.03.12