- The Annotated Transformer - Harvard NLP's line-by-line, code-annotated guide to the Transformer, explaining the architecture and its components.
- The Unreasonable Effectiveness of Recurrent Neural Networks - Andrej Karpathy's post on the power and applications of RNNs.
- Understanding LSTMs - A detailed explanation of Long Short-Term Memory networks and their mechanisms.
- Seq2Seq Learning with Neural Networks - Paper by Ilya Sutskever et al. on sequence-to-sequence learning.
- Distributed Representations - Geoffrey Hinton's paper on distributed representations of concepts in neural networks.
- Distilling the Knowledge in a Neural Network - Paper on model compression techniques.
- ImageNet Classification with Deep Convolutional Neural Networks - The AlexNet paper by Krizhevsky, Sutskever, and Hinton on large-scale image classification with deep CNNs.
- Batch Normalization - Paper introducing the technique to accelerate deep network training by reducing internal covariate shift.
- BERT: Pre-training of Deep Bidirectional Transformers - Paper on BERT, a method for pre-training language representations.
- ResNet: Deep Residual Learning for Image Recognition - Paper on Residual Networks, whose skip connections make very deep networks trainable.
- Adam: A Method for Stochastic Optimization - Paper on the Adam optimization algorithm, widely used in training neural networks.
- Attention Is All You Need - The influential paper introducing the Transformer model.
- Transformers for Image Recognition at Scale - The Vision Transformer (ViT) paper, applying Transformer models directly to image recognition.
- Generative Adversarial Nets - The original paper on GANs by Goodfellow et al.
- Neural Machine Translation by Jointly Learning to Align and Translate - Paper on neural machine translation models that align and translate simultaneously.
- One Model to Learn Them All - Paper on a single model that learns tasks across multiple domains.
- WaveNet: A Generative Model for Raw Audio - Paper introducing WaveNet, a deep generative model for producing audio.
- Attention Is All You Need for Speech Recognition - Paper exploring the application of Transformer models to speech recognition.
- Auto-Encoding Variational Bayes - Paper on variational autoencoders (VAEs), a type of generative model.
- Deep Convolutional Generative Adversarial Networks - Paper on DCGANs, combining CNNs with GANs for generative tasks.
- Memory Networks - Paper on neural networks with a memory component for reasoning tasks.
- Graph Neural Networks - A comprehensive survey on graph neural networks (GNNs).
- Category Theory for Computing Science - Michael Barr and Charles Wells's textbook on category theory and its applications in computer science.
- Machine Super Intelligence - Shane Legg's doctoral thesis on defining and measuring machine intelligence.
- Kolmogorov Complexity and Algorithmic Randomness - A comprehensive book on Kolmogorov complexity and its applications.
- CS231n: Convolutional Neural Networks for Visual Recognition - Stanford's course materials on CNNs for visual recognition tasks.
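Several of the entries above (Attention Is All You Need, BERT, ViT, the Annotated Transformer) center on the Transformer's attention mechanism. As a minimal orientation before reading them, here is a sketch of scaled dot-product attention in NumPy; the function name and toy shapes are illustrative, not taken from any of the papers:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention, as described in "Attention Is All You Need".

    Q, K: (seq_len, d_k) query and key matrices; V: (seq_len, d_v) values.
    """
    d_k = Q.shape[-1]
    # Similarity of each query with every key, scaled by sqrt(d_k) to
    # keep the softmax from saturating for large d_k.
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax turns scores into attention weights that sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value vectors.
    return weights @ V

# Toy example with random queries, keys, and values.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

The full Transformer wraps this in multi-head projections, residual connections, and layer normalization, but this single operation is the core the papers build on.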
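The Adam paper listed above defines a short update rule that is easy to read alongside the paper's Algorithm 1. The following is a minimal sketch of one Adam step (not a reference implementation; the learning rate and toy objective are arbitrary choices for illustration):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (Kingma & Ba); returns new params and moment estimates."""
    m = beta1 * m + (1 - beta1) * grad        # first moment: running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment: running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for zero initialization
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(x) = x^2 starting from x = 5.
theta = np.array([5.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 2001):
    grad = 2 * theta                          # gradient of x^2
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.1)
print(theta)
```

The bias-correction terms are the part most easily missed on a first read: without them, the early steps are biased toward zero because `m` and `v` start at zero.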