This is a living document and will continue to be updated. Last update 25-October-2022
Free Online Classes & Learning Resources
One of the great things about the fields of data science and machine learning is the number of free courses from industry leaders.
fastai
Practical Deep Learning for Coders - A beginner-friendly video lesson series using Python and PyTorch, from fastai's Jeremy Howard & Rachel Thomas. The only prerequisite is that you know how to code (a year of experience is enough), preferably in Python, and that you have at least followed a high school math course. No special hardware is needed, as the lessons can be run in Jupyter Notebooks on Google Colab. The accompanying book is Deep Learning for Coders with fastai and PyTorch: AI Applications Without a PhD (Howard, J. and Gugger, S.).
Fastbook Reading Group - In-depth tutorial video lessons to go along with the fastai book. Highly recommended to augment your learning as you follow the fastai course & book (above).
A Walk With Fastai - Zach Mueller walks through Deep Learning examples & tips using the fastai library and PyTorch. Lessons include Vision, Tabular & Audio.
Awesome fastai - A curated list of awesome projects and resources related to fastai. Thanks to Tanishq Abraham.
Fast AI Recommended Learning Resources for Python - A curated list of Python learning resources from Beginner to Advanced. Something for everyone. If you aren’t already a member of the fastai forums, now would be a great time to sign up. It is a welcoming, positive and helpful ML community.
NYU Deep Learning SP21 - Video Course with accompanying website with lectures from Neural Net pioneer Yann LeCun and tutorials by Alfredo Canziani.
Computational Thinking MIT Spring 2021 - A computational thinking course in the Julia language. Julia is the up-and-coming language for DS and ML that may well displace Python & R in the future. Also see Introduction to Computational Thinking.
Programming of Simulation, Analysis, and Learning Systems - UQ - A practical way to learn Julia through simulation and analysis. From bubble sort to Monte Carlo.
Khan Academy - Learn and refresh skills across Math, Statistics and more.
3Blue1Brown - Math concepts explained in video with beautiful animations and visualisations. You can spend many enjoyable hours here even if you don’t think you are a ‘Math person’.
StatQuest - To quote host Josh Starmer, “StatQuest breaks down complicated Statistics and Machine Learning methods into small, bite-sized pieces that are easy to understand. StatQuest doesn’t dumb down the material, instead, it builds you up so that you are smarter and have a better understanding of Statistics and Machine Learning.” Triple bam!
The Missing Semester of your CS Education - Fill in the missing skills needed for modern Data Science and Machine Learning, such as command-line usage and version control (Git).
Calm code - Short and simple video lessons, starting from scratch, on using many open source tools, mainly in Python.
Kaggle - Kaggle is a great place to find datasets to practice on, even if you don’t want to enter competitions. You get free processing time in their (Jupyter-like) kernels, and you can learn by viewing solutions from other people. There are also numerous short courses to explore Data Visualisation, Feature Engineering, Geospatial analysis, Pandas, SQL, Time Series and many more.
Data School - Many short how-to videos (most under 5 minutes) covering Data Science & ML tools like scikit-learn, pandas etc.
R Shiny Tutorial - Turn your R code into interactive analysis apps.
🤗 Hugging Face - This course will teach you about natural language processing (NLP) using libraries from the Hugging Face ecosystem. Covers 🤗 Transformers, 🤗 Datasets, 🤗 Tokenizers, and 🤗 Accelerate, as well as the Hugging Face Hub. It’s completely free and without ads.
Getting Started with Object Detection using IceVision - IceVision is a Framework for object detection and deep learning that makes it easier to prepare data, train an object detection model, and use that model for inference. The IceVision Framework provides a layer across multiple deep learning engines, libraries, models, and data sets.
28 Jupyter Tips & Tricks - These tips will increase your productivity in Jupyter, from key shortcuts to must-know plugins.
ML News & Interviews
ML News. Regular news style updates. A simple way to keep up with current happenings in ML.
Machine Learning Street Talk. In-depth topic interviews.
Lex Fridman Interviews - Lex interviews the great minds of machine learning and artificial intelligence on his video podcast.
Two Minute Papers - Catch the latest papers in video, summarised in two minutes.
Microsoft Research Podcast - Interviews with researchers at Microsoft in diverse research areas. What is their current focus area, and what problems are they trying to solve?
Tools & Libraries
The choices have varied over the years, but they currently fall into two camps: TensorFlow and PyTorch. In the end you will probably learn both, but currently more new papers use PyTorch than TensorFlow. It is not too hard to move from one to the other.
TensorFlow - Ease of use has improved considerably since V2, which uses Keras as the primary API. If you want to run ML in the browser via JavaScript, TensorFlow (with TensorFlow.js) is a better option than PyTorch. The Keras API can be used with multiple backends, not just TensorFlow, so there may be future systems that will use it.
PyTorch - Whilst Google was still inflicting dreadful programming interfaces with TensorFlow V1, much of the research world moved to PyTorch. It is flexible, extensible and well supported with blogs and books.
fastai - The fastai library extends PyTorch with a layered API, providing SOTA defaults, a best-practice training loop and increased productivity.
ML.NET - ML.NET provides AutoML tools and an ML pipeline for use on .NET platforms. It has promise, but documentation & examples are limited and often broken.
Flax - Flax is a high-performance neural network library and ecosystem for JAX that is designed for flexibility. The new cool kid on the block.
PyTorch Image Models (timm) - timm is a deep-learning library created by Ross Wightman. It is a collection of SOTA computer vision models, layers, utilities, optimizers, schedulers, data-loaders and augmentations, plus training/validation scripts with the ability to reproduce ImageNet training results.
Hugging Face Spaces - Share your (Python) ML model applications in a few minutes. Spaces are a simple way to host ML demo apps directly on your profile or your organization’s profile. This allows you to create your ML portfolio, showcase your projects at conferences or to stakeholders, and work collaboratively with other people in the ML ecosystem.
Shiny - Shiny is an R package that makes it easy to build interactive web apps straight from R. You can host standalone apps on a webpage or embed them in R Markdown documents or build dashboards. You can also extend your Shiny apps with CSS themes, htmlwidgets, and JavaScript actions.
TorchSharp - TorchSharp is a .NET wrapper library that provides access to the library that powers PyTorch, allowing C# and F# to use PyTorch more easily on Windows & Linux. It is part of the .NET Foundation. See also the examples repo.
Papers
Finding and reading papers is a key skill to develop, particularly in Deep Learning, where you may have to implement a technique yourself because it may not make it into your favourite tools for a while.
Papers with Code - I’m a firm believer that results in papers should be replicable. Papers with Code is an excellent resource for machine learning papers that have provided code. It also shows the SOTA (state of the art) results for papers and related benchmark datasets.
Semantic Scholar - An excellent free AI-powered research tool for scientific literature. Results include citations (and the egotistical H-Index).
W&B Paper Reading Group - Aman Arora hosts this Deep Learning paper reading group for beginners. Step-by-step video walk-throughs of key papers covering ML architectures such as ResNets, DETR, Squeeze & Excitation Nets etc. A great place to begin learning through reading papers.
Two Minute Papers - Catch the latest papers in video, summarised in two minutes.
NLP Progress - Tracks the state of the art (SOTA) for numerous Natural Language Processing tasks.
Trending Papers - Displays trending ML papers in a recent, weekly or monthly feed view. Also has search.
Annotated Papers - A collection of simple PyTorch implementations of neural networks and related algorithms. These implementations are documented with explanations, and the website renders these as side-by-side formatted notes. We believe these would help you understand these algorithms better.
Blogging Tools
- Jekyll Cheatsheet - Options for use with Jekyll static sites. This cheat sheet serves as a quick reference of everything Jekyll can do.
- FastPages - Deprecated. Use Quarto instead; see the FastPages to Quarto Migration Guide. FastPages: Turn your Jupyter Notebooks, Word and Markdown documents into blog posts. fastpages automates the process of creating blog posts via GitHub Actions, so you don’t have to fuss with conversion scripts.
- Quarto - Quarto® is an open-source scientific and technical publishing system built on Pandoc.
  - Create dynamic content with Python, R, Julia, and Observable.
  - Author documents as plain text markdown or Jupyter notebooks.
  - Publish high-quality articles, reports, presentations, websites, blogs, and books in HTML, PDF, MS Word, ePub, and more.
  - Author with scientific markdown, including equations, citations, crossrefs, figure panels, callouts, advanced layout, and more.
- Markdown Guide for Jupyter - A guide to the Markdown format, so you can get the most from your Jupyter Notebook writing.