Transformers for machine learning : (Record no. 5955)
000 - LEADER | |
fixed length control field | 10112cam a2200577 i 4500 |
001 - CONTROL NUMBER | |
control field | 9781003170082 |
003 - CONTROL NUMBER IDENTIFIER | |
control field | FlBoTFG |
005 - DATE AND TIME OF LATEST TRANSACTION | |
control field | 20240213122832.0 |
006 - FIXED-LENGTH DATA ELEMENTS--ADDITIONAL MATERIAL CHARACTERISTICS | |
fixed length control field | m o d |
007 - PHYSICAL DESCRIPTION FIXED FIELD--GENERAL INFORMATION | |
fixed length control field | cr cnu|||unuuu |
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION | |
fixed length control field | 220412s2022 xx eo 000 0 eng d |
040 ## - CATALOGING SOURCE | |
Original cataloging agency | OCoLC-P |
Language of cataloging | eng |
Description conventions | rda |
-- | pn |
Transcribing agency | OCoLC-P |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER | |
International Standard Book Number | 9781003170082 |
Qualifying information | (electronic bk.) |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER | |
International Standard Book Number | 1003170080 |
Qualifying information | (electronic bk.) |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER | |
International Standard Book Number | 9781000587074 |
Qualifying information | (electronic bk. ; |
-- | PDF) |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER | |
International Standard Book Number | 100058707X |
Qualifying information | (electronic bk. ; |
-- | PDF) |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER | |
International Standard Book Number | 9781000587098 |
Qualifying information | (electronic bk. ; |
-- | EPUB) |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER | |
International Standard Book Number | 1000587096 |
Qualifying information | (electronic bk. ; |
-- | EPUB) |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER | |
Canceled/invalid ISBN | 9780367771652 |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER | |
Canceled/invalid ISBN | 9780367767341 |
024 7# - OTHER STANDARD IDENTIFIER | |
Standard number or code | 10.1201/9781003170082 |
Source of number or code | doi |
035 ## - SYSTEM CONTROL NUMBER | |
System control number | (OCoLC)1310470794 |
035 ## - SYSTEM CONTROL NUMBER | |
System control number | (OCoLC-P)1310470794 |
050 #4 - LIBRARY OF CONGRESS CALL NUMBER | |
Classification number | QA76.87 |
072 #7 - SUBJECT CATEGORY CODE | |
Subject category code | COM |
Subject category code subdivision | 044000 |
Source | bisacsh |
072 #7 - SUBJECT CATEGORY CODE | |
Subject category code | COM |
Subject category code subdivision | 042000 |
Source | bisacsh |
072 #7 - SUBJECT CATEGORY CODE | |
Subject category code | COM |
Subject category code subdivision | 016000 |
Source | bisacsh |
072 #7 - SUBJECT CATEGORY CODE | |
Subject category code | UYQ |
Source | bicssc |
082 04 - DEWEY DECIMAL CLASSIFICATION NUMBER | |
Classification number | 006.3/2 |
Edition number | 23/eng/20220218 |
100 1# - MAIN ENTRY--PERSONAL NAME | |
Personal name | Kamath, Uday, |
Relator term | author. |
245 10 - TITLE STATEMENT | |
Title | Transformers for machine learning : |
Remainder of title | a deep dive / |
Statement of responsibility, etc. | Uday Kamath, Kenneth Graham, Wael Emara. |
250 ## - EDITION STATEMENT | |
Edition statement | First edition. |
264 #1 - PRODUCTION, PUBLICATION, DISTRIBUTION, MANUFACTURE, AND COPYRIGHT NOTICE | |
Place of production, publication, distribution, manufacture | [Place of publication not identified] : |
Name of producer, publisher, distributor, manufacturer | Chapman and Hall/CRC, |
Date of production, publication, distribution, manufacture, or copyright notice | 2022. |
300 ## - PHYSICAL DESCRIPTION | |
Extent | 1 online resource (xxvi, 257 pages) |
336 ## - CONTENT TYPE | |
Content type term | text |
Content type code | txt |
Source | rdacontent |
337 ## - MEDIA TYPE | |
Media type term | computer |
Media type code | c |
Source | rdamedia |
338 ## - CARRIER TYPE | |
Carrier type term | online resource |
Carrier type code | cr |
Source | rdacarrier |
505 0# - FORMATTED CONTENTS NOTE | |
Formatted contents note | List of Figures List of Tables Author Bios Foreword Preface Contributors Deep Learning and Transformers: An Introduction 1.1 DEEP LEARNING: A HISTORIC PERSPECTIVE 1.2 TRANSFORMERS AND TAXONOMY 1.2.1 Modified Transformer Architecture 1.2.1.1 Transformer block changes 1.2.1.2 Transformer sublayer changes 1.2.2 Pretraining Methods and Applications 1.3 RESOURCES 1.3.1 Libraries and Implementations 1.3.2 Books 1.3.3 Courses, Tutorials, and Lectures 1.3.4 Case Studies and Details Transformers: Basics and Introduction 2.1 ENCODER-DECODER ARCHITECTURE 2.2 SEQUENCE TO SEQUENCE 2.2.1 Encoder 2.2.2 Decoder 2.2.3 Training 2.2.4 Issues with RNN-based Encoder Decoder 2.3 ATTENTION MECHANISM 2.3.1 Background 2.3.2 Types of Score-Based Attention 2.3.2.1 Dot Product (multiplicative) 2.3.2.2 Scaled Dot Product or multiplicative 2.3.2.3 Linear, MLP, or additive 2.3.3 Attention-based Sequence to Sequence 2.4 TRANSFORMER 2.4.1 Source and Target Representation 2.4.1.1 Word Embedding 2.4.1.2 Positional Encoding 2.4.2 Attention Layers 2.4.2.1 Self-Attention 2.4.2.2 Multi-Head Attention 2.4.2.3 Masked Multi-Head Attention 2.4.2.4 Encoder-Decoder Multi-Head Attention 2.4.3 Residuals and Layer Normalization 2.4.4 Position-wise Feed-Forward Networks 2.4.5 Encoder 2.4.6 Decoder 2.5 CASE STUDY: MACHINE TRANSLATION 2.5.1 Goal 2.5.2 Data, Tools and Libraries 2.5.3 Experiments, Results and Analysis 2.5.3.1 Exploratory Data Analysis 2.5.3.2 Attention 2.5.3.3 Transformer 2.5.3.4 Results and Analysis 2.5.3.5 Explainability Bidirectional Encoder Representations from Transformers (BERT) 3.1 BERT 3.1.1 Architecture 3.1.2 Pre-training 3.1.3 Fine-tuning 3.2 BERT VARIANTS 3.2.1 RoBERTa 3.3 APPLICATIONS 3.3.1 TaBERT 3.3.2 BERTopic 3.4 BERT INSIGHTS 3.4.1 BERT Sentence Representation 3.4.2 BERTology 3.5 CASE STUDY: TOPIC MODELING WITH TRANSFORMERS 3.5.1 Goal 3.5.2 Data, Tools, and Libraries 3.5.2.1 Data 3.5.2.2 Compute embeddings 3.5.3 Experiments, Results, and Analysis 3.5.3.1 Building Topics 3.5.3.2 Topic size distribution 3.5.3.3 Visualization of topics 3.5.3.4 Content of topics 3.6 CASE STUDY: FINE-TUNING BERT 3.6.1 Goal 3.6.2 Data, Tools and Libraries 3.6.3 Experiments, Results and Analysis Multilingual Transformer Architectures 4.1 MULTILINGUAL TRANSFORMER ARCHITECTURES 4.1.1 Basic Multilingual Transformer 4.1.2 Single-Encoder Multilingual NLU 4.1.2.1 mBERT 4.1.2.2 XLM 4.1.2.3 XLM-RoBERTa 4.1.2.4 ALM 4.1.2.5 Unicoder 4.1.2.6 INFOXLM 4.1.2.7 AMBER 4.1.2.8 ERNIE-M 4.1.2.9 HITCL 4.1.3 Dual-Encoder Multilingual NLU 4.1.3.1 LaBSE 4.1.3.2 mUSE 4.1.4 Multilingual NLG 4.2 MULTILINGUAL DATA 4.2.1 Pre-training Data 4.2.2 Multilingual Benchmarks 4.2.2.1 Classification 4.2.2.2 Structure Prediction 4.2.2.3 Question Answering 4.2.2.4 Semantic Retrieval 4.3 MULTILINGUAL TRANSFER LEARNING INSIGHTS 4.3.1 Zero-shot Cross-lingual Learning 4.3.1.1 Data Factors 4.3.1.2 Model Architecture Factors 4.3.1.3 Model Tasks Factors 4.3.2 Language-agnostic Cross-lingual Representations 4.4 CASE STUDY 4.4.1 Goal 4.4.2 Data, Tools, and Libraries 4.4.3 Experiments, Results, and Analysis 4.4.3.1 Data Preprocessing 4.4.3.2 Experiments Transformer Modifications 5.1 TRANSFORMER BLOCK MODIFICATIONS 5.1.1 Lightweight Transformers 5.1.1.1 Funnel-Transformer 5.1.1.2 DeLighT 5.1.2 Connections between Transformer Blocks 5.1.2.1 RealFormer 5.1.3 Adaptive Computation Time 5.1.3.1 Universal Transformers (UT) 5.1.4 Recurrence Relations between Transformer Blocks 5.1.4.1 Transformer-XL 5.1.5 Hierarchical Transformers 5.2 TRANSFORMERS WITH MODIFIED MULTI-HEAD SELF-ATTENTION 5.2.1 Structure of Multi-head Self-Attention 5.2.1.1 Multi-head self-attention 5.2.1.2 Space and time complexity 5.2.2 Reducing Complexity of Self-attention 5.2.2.1 Longformer 5.2.2.2 Reformer 5.2.2.3 Performer 5.2.2.4 Big Bird 5.2.3 Improving Multi-head-attention 5.2.3.1 Talking-Heads Attention 5.2.4 Biasing Attention with Priors 5.2.5 Prototype Queries 5.2.5.1 Clustered Attention 5.2.6 Compressed Key-Value Memory 5.2.6.1 Luna: Linear Unified Nested Attention 5.2.7 Low-rank Approximations 5.2.7.1 Linformer 5.3 MODIFICATIONS FOR TRAINING TASK EFFICIENCY 5.3.1 ELECTRA 5.3.1.1 Replaced token detection 5.3.2 T5 5.4 TRANSFORMER SUBMODULE CHANGES 5.4.1 Switch Transformer 5.5 CASE STUDY: SENTIMENT ANALYSIS 5.5.1 Goal 5.5.2 Data, Tools, and Libraries 5.5.3 Experiments, Results, and Analysis 5.5.3.1 Visualizing attention head weights 5.5.3.2 Analysis Pretrained and Application-Specific Transformers 6.1 TEXT PROCESSING 6.1.1 Domain-Specific Transformers 6.1.1.1 BioBERT 6.1.1.2 SciBERT 6.1.1.3 FinBERT 6.1.2 Text-to-text Transformers 6.1.2.1 ByT5 6.1.3 Text generation 6.1.3.1 GPT: Generative Pre-training 6.1.3.2 GPT-2 6.1.3.3 GPT-3 6.2 COMPUTER VISION 6.2.1 Vision Transformer 6.3 AUTOMATIC SPEECH RECOGNITION 6.3.1 Wav2vec 2.0 6.3.2 Speech2Text2 6.3.3 HuBERT: Hidden Units BERT 6.4 MULTIMODAL AND MULTITASKING TRANSFORMER 6.4.1 Vision-and-Language BERT (VilBERT) 6.4.2 Unified Transformer (UniT) 6.5 VIDEO PROCESSING WITH TIMESFORMER 6.5.1 Patch embeddings 6.5.2 Self-attention 6.5.2.1 Spatiotemporal self-attention 6.5.2.2 Spatiotemporal attention blocks 6.6 GRAPH TRANSFORMERS 6.6.1 Positional encodings in a graph 6.6.1.1 Laplacian positional encodings 6.6.2 Graph transformer input 6.6.2.1 Graphs without edge attributes 6.6.2.2 Graphs with edge attributes 6.7 REINFORCEMENT LEARNING 6.7.1 Decision Transformer 6.8 CASE STUDY: AUTOMATIC SPEECH RECOGNITION 6.8.1 Goal 6.8.2 Data, Tools, and Libraries 6.8.3 Experiments, Results, and Analysis 6.8.3.1 Preprocessing speech data 6.8.3.2 Evaluation Interpretability and Explainability Techniques for Transformers 7.1 TRAITS OF EXPLAINABLE SYSTEMS 7.2 RELATED AREAS THAT IMPACT EXPLAINABILITY 7.3 EXPLAINABLE METHODS TAXONOMY 7.3.1 Visualization Methods 7.3.1.1 Backpropagation-based 7.3.1.2 Perturbation-based 7.3.2 Model Distillation 7.3.2.1 Local Approximation 7.3.2.2 Model Translation 7.3.3 Intrinsic Methods 7.3.3.1 Probing Mechanism 7.3.3.2 Joint Training 7.4 ATTENTION AND EXPLANATION 7.4.1 Attention is not Explanation 7.4.1.1 Attention Weights and Feature Importance 7.4.1.2 Counterfactual Experiments 7.4.2 Attention is not not Explanation 7.4.2.1 Is attention necessary for all tasks? 7.4.2.2 Searching for Adversarial Models 7.4.2.3 Attention Probing 7.5 QUANTIFYING ATTENTION FLOW 7.5.1 Information flow as DAG 7.5.2 Attention Rollout 7.5.3 Attention Flow 7.6 CASE STUDY: TEXT CLASSIFICATION WITH EXPLAINABILITY 7.6.1 Goal 7.6.2 Data, Tools, and Libraries 7.6.3 Experiments, Results and Analysis 7.6.3.1 Exploratory Data Analysis 7.6.3.2 Experiments 7.6.3.3 Error Analysis and Explainability Bibliography Alphabetical Index |
520 ## - SUMMARY, ETC. | |
Summary, etc. | Transformers are becoming a core part of many neural network architectures, employed in a wide range of applications such as NLP, speech recognition, time series, and computer vision. Transformers have gone through many adaptations and alterations, resulting in newer techniques and methods. Transformers for Machine Learning: A Deep Dive is the first comprehensive book on transformers. Key Features: A comprehensive reference book with detailed explanations of every algorithm and technique related to transformers. 60+ transformer architectures covered in a comprehensive manner. A book for understanding how to apply transformer techniques in speech, text, time series, and computer vision. Practical tips and tricks for each architecture and how to use it in the real world. Hands-on case studies and code snippets for theory and practical real-world analysis using the tools and libraries, all ready to run in Google Colab. The theoretical explanations of the state-of-the-art transformer architectures will appeal to postgraduate students and researchers (academic and industry), as they provide a single entry point with deep discussions of a quickly moving field. The practical hands-on case studies and code will appeal to undergraduate students, practitioners, and professionals, as they allow for quick experimentation and lower the barrier to entry into the field. |
588 ## - SOURCE OF DESCRIPTION NOTE | |
Source of description note | OCLC-licensed vendor bibliographic record. |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM | |
Topical term or geographic name entry element | Neural networks (Computer science) |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM | |
Topical term or geographic name entry element | Computational intelligence. |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM | |
Topical term or geographic name entry element | Machine learning. |
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM | |
Topical term or geographic name entry element | COMPUTERS |
General subdivision | Neural Networks. |
Source of heading or term | bisacsh |
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM | |
Topical term or geographic name entry element | COMPUTERS |
General subdivision | Natural Language Processing. |
Source of heading or term | bisacsh |
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM | |
Topical term or geographic name entry element | COMPUTERS |
General subdivision | Computer Vision & Pattern Recognition. |
Source of heading or term | bisacsh |
700 1# - ADDED ENTRY--PERSONAL NAME | |
Personal name | Graham, Kenneth L. |
700 1# - ADDED ENTRY--PERSONAL NAME | |
Personal name | Emara, Wael |
856 40 - ELECTRONIC LOCATION AND ACCESS | |
Materials specified | Taylor & Francis |
Uniform Resource Identifier | <a href="https://www.taylorfrancis.com/books/e/9781003170082">https://www.taylorfrancis.com/books/e/9781003170082</a> |
856 42 - ELECTRONIC LOCATION AND ACCESS | |
Materials specified | OCLC metadata license agreement |
Uniform Resource Identifier | <a href="http://www.oclc.org/content/dam/oclc/forms/terms/vbrl-201703.pdf">http://www.oclc.org/content/dam/oclc/forms/terms/vbrl-201703.pdf</a> |