11-03, 10:15–10:55 (America/New_York), Radio City (Room 6604)
Building ML systems in business settings isn't just about algorithms; it's also about navigating unpredictable data, fostering efficient collaboration, and promoting constant experimentation. This talk covers how Metaflow addresses these challenges, focusing on its event-driven patterns, which streamline model training on infrastructure shared by multiple data scientists and engineers. In a hands-on demonstration, we'll show how Metaflow triggers asynchronous training when new data arrives, using the fine-tuning of a state-of-the-art LLM such as Llama 2 as the example (the framework itself is general and widely applicable). We'll also show how multiple developers can experiment with different models as new data becomes available.
This presentation covers patterns for building ML systems that support collaboration and experimentation across multiple developers, especially when working with ever-changing data on shared compute resources. We will demonstrate how Metaflow's event-driven mechanism can trigger asynchronous execution of ML pipelines across a range of use cases.
To keep the material practical, we'll draw on our experience from a project in which we trained LLMs such as Llama 2. We will showcase some eventing patterns through an example that initiates asynchronous LLM training when fresh data becomes available.
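As a rough illustration of the pattern described above (a data-arrival event kicking off several developers' training jobs asynchronously), here is a minimal standard-library sketch. It is a toy in-process event bus, not Metaflow's API: the `EventBus` class, the `make_finetune_job` helper, the `new_data` event name, the developer and model names, and the bucket path are all hypothetical and chosen only for the example.

```python
from collections import defaultdict
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, List


class EventBus:
    """Toy in-process event bus (hypothetical; not Metaflow's API)."""

    def __init__(self, max_workers: int = 4) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], str]]] = defaultdict(list)
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def subscribe(self, event_name: str, handler: Callable[[dict], str]) -> None:
        # Each developer registers their own pipeline for the same event.
        self._subscribers[event_name].append(handler)

    def publish(self, event_name: str, payload: dict) -> List[Future]:
        # Returns immediately; every subscribed handler runs in the background.
        return [self._pool.submit(h, payload) for h in self._subscribers[event_name]]


def make_finetune_job(developer: str, base_model: str) -> Callable[[dict], str]:
    """Build a stand-in for a fine-tuning pipeline owned by one developer."""

    def run(payload: dict) -> str:
        # A real pipeline would load payload["path"] and train here.
        return f"{developer}: finetuned {base_model} on {payload['path']}"

    return run


bus = EventBus()
bus.subscribe("new_data", make_finetune_job("alice", "llama-2-7b"))
bus.subscribe("new_data", make_finetune_job("bob", "llama-2-13b"))

# A data producer announces a fresh batch; both jobs start without blocking it.
futures = bus.publish("new_data", {"path": "s3://example-bucket/batch-001"})
results = sorted(f.result() for f in futures)
```

The point of the sketch is the decoupling: the producer only publishes an event and never needs to know which pipelines, or how many developers, are listening for it.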
Why is it compelling?
Transitioning an ML model from its research genesis to a regularly updated production entity is not just a challenging research endeavor; it is also a sophisticated systems problem, especially when several developers experiment with diverse models concurrently. The ML field, still in its relative infancy, lacks standardized, robust patterns for systems that must respond to fresh data while accommodating multiple developers.
Who is this for?
ML practitioners and data scientists applying ML techniques in commercial environments will benefit most from this discussion. Attendees should have a foundational grasp of machine learning.
The talk is rooted in engineering patterns and prioritizes the architecture of ML systems over the nuances of building individual models. To match the pragmatic needs of this audience, we will take a demonstration-driven approach.
Key Takeaways
- Discern patterns for building event-driven ML training systems.
- Grasp methodologies for triggering asynchronous ML training upon the receipt of new data.
- Understand collaborative techniques that allow multiple developers and data scientists to work together without stepping on each other’s work.
- Gain insights into leveraging Metaflow's capabilities for the development of robust ML systems.
No previous knowledge expected