Anthony Holten
Anthony Holten is a Senior Software Engineer at Interos, Inc., where he builds supply chain software that calculates and tracks risk profiles for hundreds of millions of companies worldwide. Previously, as a Data Engineer at Deloitte, Anthony empowered government clients’ internal policy analysis through natural language processing. He is a published photographer whose formal education is in International Relations by way of Washington, DC and Beijing, China.
Sessions
When Pandas becomes a bottleneck for data workloads, data practitioners turn to distributed computing frameworks such as Spark, Dask, and Ray. The problem is that porting existing code to these frameworks usually requires substantial rewrites. Drop-in replacements exist that need only a changed import statement, but the resulting code is still tied to the Pandas interface, which is a poor grammar for many distributed computing problems. In this tutorial, we will go over some scenarios where the Pandas interface can't scale and show how to port existing code to a distributed backend with minimal rewrites.
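As a rough illustration of the porting idea described in the abstract, the sketch below keeps the business logic in a plain function that only ever sees one group of rows, and keeps the "how it is executed" part separate. This is not taken from the tutorial itself: the names `add_group_share` and `run_locally`, and the sample columns `region` and `amount`, are hypothetical, and the local runner is only a stand-in for whatever grouped-map facility a distributed engine provides.

```python
import pandas as pd


def add_group_share(part: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical business logic: needs only the rows of a single group."""
    out = part.copy()
    # Each row's share of its own group's total amount.
    out["share"] = out["amount"] / out["amount"].sum()
    return out


def run_locally(df: pd.DataFrame, key: str, fn) -> pd.DataFrame:
    """Hypothetical local runner: apply fn to each group and reassemble.

    A distributed engine would replace this function, not add_group_share.
    """
    parts = [fn(group) for _, group in df.groupby(key)]
    # Restore the original row order after concatenating the groups.
    return pd.concat(parts).sort_index()


df = pd.DataFrame(
    {
        "region": ["east", "west", "east", "west"],
        "amount": [1.0, 3.0, 3.0, 1.0],
    }
)
result = run_locally(df, "region", add_group_share)
print(result)
```

Because `add_group_share` makes no assumption about global row order or about holding the whole dataset in memory, it is the kind of function that could, in principle, be handed unchanged to a distributed backend's per-group execution, leaving only the runner to be swapped.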