PyData Seattle 2023

Fabiana Clemente

Fabiana Clemente is the co-founder and CDO of YData, combining Data Understanding, Causality, and Privacy as her main fields of work and research, with the mission to make data actionable for organizations.

Passionate about data, Fabiana has vast experience leading data science teams in startups and multinational companies.

Host of the “When Machine Learning meets privacy” podcast, a guest speaker on Datacast and Privacy Please, and a previous WebSummit speaker, she was recently awarded “Founder of the Year” by the South Europe Startup Awards.

Sessions

04-27
11:00
45min
The Importance of Synthetic Data in Data-Centric AI
Fabiana Clemente

This talk covers the importance of synthetic data for the adoption and development of Data-Centric AI approaches. We’ll cover how generative models can be used to mimic real-world domains through the generation of synthetic data and demonstrate their application using the open-source Python package ydata-synthetic. For this talk, we’ll focus on tabular data and discuss the impact of synthetic data on different industries such as healthcare and finance. Finally, we’ll explain how to validate the quality of the generated synthetic data depending on the downstream application: privacy preservation or boosting ML performance.

Hood
04-28
15:00
90min
Panel: The living nature of data: exploring the Lifecycle and Management of Data at Scale
Alan Descoins, Fabiana Clemente, Yucheng Low, David Aronchick

As we continue to witness the exponential growth of data generation, especially with the proliferation of IoT devices, the widespread deployment of LLMs, and synthetic data, it is essential to understand the dynamic nature of data and its lifecycle. This panel will delve into the living nature of data, exploring its various stages, from creation to effective processing, augmentation, and beyond. We will discuss tools, experiences, and trends to look out for in 2023.

Baker