강정노트

[PyTorch] Writing Distributed Applications with PyTorch

Introduction The distributed package included in PyTorch (i.e., torch.distributed) enables researchers and practitioners to easily parallelize their computations across processes and clusters of machines. To do so, it leverages message-passing semantics, allowing each process to communicate data to any of the other processes. Setup In order to get started we need the ability to run multiple proces..

2022.06.10

[Paper Review] Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism

Abstract In this work, we present our techniques for training very large transformer models and implement a simple, efficient intra-layer model parallel approach that enables training transformer models with billions of parameters. We sustain $15.1$ PetaFLOPs across the entire application with $76\%$ scaling efficiency when compared to a strong single GPU baseline that sustains $39$ TeraFLOPs, wh..

2022.06.07

[Paper Review] LaMDA: Language Models for Dialog Applications

Abstract We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of 1) safety and 2) factual grounding. We also explore the use of LaMDA in 3) the domains of education and content recommendations to investigate its potential and shortcomings. 1. Introduction According t..

2022.05.24

3. Making new Layers and Models via subclassing

import tensorflow as tf import numpy as np from tensorflow import keras The Layer class: the combination of state (weights) and some computation A layer encapsulates both a state (the layer's "weights") and a transformation from inputs to outputs (a "call", the layer's forward pass). class CustomLinear1(keras.layers.Layer): def __init__(self, d_in, d_out): super().__init__() w_init = tf.random_n..

2022.03.12

2. The Functional API

The main idea is that a deep learning model is usually a directed acyclic graph (DAG) of layers. So the functional API is a way to build graphs of layers. import tensorflow as tf from tensorflow import keras from tensorflow.keras import layers Create model by functional API To build this model using the functional API: Creating an input node (keras.Input) which returns information about the shape an..

2022.03.12

6. Basic training loops

You have learned about tensors, variables, gradient tape, and modules. In this guide, you will fit these all together to train models. import tensorflow as tf import matplotlib.pyplot as plt Solving machine learning problems Solving a machine learning problem usually consists of the following steps: Obtain training data Define the model Define a loss function Run through the training data, calcu..

2022.03.12
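
The steps listed in the "Basic training loops" excerpt (obtain training data, define the model, define a loss function, run through the training data) can be sketched without any framework. The example below is a minimal illustration, not the post's own code: it uses plain Python with hand-derived gradients instead of TensorFlow's gradient tape, and the names `make_data`, `predict`, `mse_loss`, and `train`, as well as the target values 3.0 and 2.0, are assumptions chosen for the example.

```python
import random


def make_data(n=200, true_w=3.0, true_b=2.0, seed=0):
    """Step 1: obtain training data (synthetic y = w*x + b plus small noise)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    ys = [true_w * x + true_b + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def predict(w, b, x):
    """Step 2: define the model (a single linear unit)."""
    return w * x + b


def mse_loss(w, b, xs, ys):
    """Step 3: define a loss function (mean squared error)."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, lr=0.1, epochs=200):
    """Step 4: run through the data, compute gradients, update the parameters."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the MSE with respect to w and b, derived by hand.
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in zip(xs, ys)) / n
        # Plain gradient descent update.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs, ys = make_data()
w, b = train(xs, ys)
final_loss = mse_loss(w, b, xs, ys)
```

After training, `w` and `b` should land close to the values used to generate the data, and `final_loss` should be near the noise variance. In a TensorFlow version, the two hand-written gradient lines would typically be replaced by automatic differentiation.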
