PyData Seattle 2023

Introduction to Ray for distributed and machine learning applications in Python
04-26, 09:00–10:30 (America/Los_Angeles), Hood

This introductory, hands-on guided tutorial is a coding tour through the core features of Ray 2.0, which provides powerful yet easy-to-use design patterns for scaling compute and implementing distributed systems in Python. The tutorial includes a brief talk that gives an overview of the concepts: what Ray is, why to use it, and how to write distributed Python applications and scale machine learning workloads.

Setup instructions for your laptop
To avoid wasting time doing it in class, please set up your laptops before coming to class.
If you want to follow along and have hands-on experience, please follow instructions on
how to set up your laptop with Ray.
https://github.com/dmatrix/ray-core-tutorial#-setup-instructions-for-local-laptop-


This tutorial is an introduction to Ray (https://www.ray.io/), the system for scaling your Python and machine learning applications from a laptop to a cluster. We'll start with a hands-on exploration of the Ray Core API, covering its basic patterns for distributed and ML workloads:

  • Remote Python functions as tasks
  • Remote objects as futures
  • Remote Python classes as stateful actors
  • Multi-model training with Ray Core API patterns

If you are a data scientist, ML engineer, or Python developer, the key takeaways are:

  • Understand what Ray 2.0 is and why to use it
  • Learn the Ray Core Python APIs and convert Python functions and classes into distributed stateless tasks and stateful actors
  • Use the Ray Dashboard to inspect workloads and observe metrics
  • Learn how to use Ray Core for multi-model training

Prior Knowledge Expected

Previous knowledge expected

Jules S. Damji is a lead developer advocate at Anyscale Inc., an MLflow contributor, and co-author of Learning Spark, 2nd Edition. He is a hands-on developer with over 25 years of experience and has worked at leading companies, such as Sun Microsystems, Netscape, @Home, Opsware/LoudCloud, VeriSign, ProQuest, Hortonworks, and Databricks, building large-scale distributed systems. He holds a B.Sc and M.Sc in computer science (from Oregon State University and Cal State, Chico, respectively), and an MA in political advocacy and communication (from Johns Hopkins University).