Conversation with Brian Kent
First of all, I would like to know how you developed your passion for Data. Did you learn this was your field while studying Economics? Or did you catch the bug later on?
No, I think it was the other way around; as an undergrad I was more interested in political science but I got frustrated by the anecdotal nature of international relations research. I switched late to economics because I found quantitative empirical research more convincing and satisfying.
I probably should have anticipated my choice earlier: I spent many, many hours as a kid sorting and counting my baseball cards…and I didn’t even really like baseball.
In fact, it was after your PhD in Statistics that you started working as a Data Scientist at Turi. Tell me a little bit about the type of projects you were involved in, the models you were working with, and the tools you were using.
I wore two hats at Turi. I was primarily a software engineer, implementing ML algorithms in C++. I focused mainly on unsupervised methods, like clustering, nearest neighbors search, anomaly detection, and data matching.
My other hat was as a data scientist, helping our customers to use Turi’s toolkits effectively to solve business problems. One project area that stood out was data matching, i.e., entity resolution, record linkage, and master data management. It’s not an area that gets a lot of attention, but it can be super valuable. For these customer collaborations, I used whichever model best fit the task, working through Turi’s Python interface to the underlying C++ engine.
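To give a flavor of what data matching involves, here is a minimal, purely illustrative sketch (nothing to do with Turi’s actual toolkit): score candidate record pairs with a simple string similarity and keep the pairs above a threshold as likely matches.

```python
# Purely illustrative data-matching sketch: compare records from two
# sources by name similarity and flag likely matches.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Normalized string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_records(left, right, threshold=0.7):
    """Return (left_record, right_record, score) for likely matches."""
    return [
        (l, r, score)
        for l in left
        for r in right
        if (score := similarity(l["name"], r["name"])) >= threshold
    ]

customers = [{"name": "Acme Corp"}, {"name": "Globex Inc"}]
vendors = [{"name": "ACME Corporation"}, {"name": "Initech"}]
print(match_records(customers, vendors))
# -> [({'name': 'Acme Corp'}, {'name': 'ACME Corporation'}, 0.72)]
```

A real system adds blocking to avoid comparing every pair, richer features than a single name field, and a learned match score, but the shape of the problem is the same.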
What was the main challenge you faced during those days?
For me personally, the initial challenge was to learn how to be a professional software engineer, because that’s not something my statistics PhD emphasized. Turi was a fantastic opportunity for me, in that regard, and really set me on course for my career.
From the perspective of building an ML platform, it was an exciting but challenging time because both modeling and MLOps were changing quickly and it was hard to know where to focus our time. In the era before TensorFlow and PyTorch, for example, was it worth developing a deep learning toolkit of our own, even though most customers were better off with gradient boosted trees?
From a business perspective, a big challenge for our “pure ML” play was estimating how much value our product brought to the customer and how to set prices.
And then, Turi was acquired by Apple and you started a role as Machine Learning Engineer. I have a few questions about this transition. The first one: why did you move from a Data Scientist role to a Machine Learning Engineer role?
I don’t worry too much about these labels, to be honest, at least as applied to people. I’m much more interested in understanding what work needs to be done and how to match that work to people with the right skill sets.
I’ve been lucky to be able to more or less choose my own label. At Turi, I was primarily a software engineer but I wanted people to know that my “home base” was statistical modeling so I went with “data scientist”. At Apple, people already assumed I was a data scientist just by my association with Turi, and I chose “ML Engineer” to emphasize that my goal was to implement ML systems in production, not just deliver “insights”.
And regarding your role as ML Engineer at Apple, I think you were developing a new framework for prototyping and testing reinforcement learning agents. Could you give some more details about your work there?
There was a lot of excitement about RL at the time, with new landmark papers coming out every week. But I felt there were two problems. First, it was hard to reproduce the results in the papers, which made it hard to know if the agent logic was implemented correctly. So my first goal was just to implement the research ideas in a way that my users could trust, a lot like HuggingFace has done recently with deep learning papers.
The second was that it was hard to convert business problems into RL environments. OpenAI had created the Gym environment suite, but it was mostly games and simple control problems. I wanted to make it easier for domain experts to turn their business tasks into environments an RL agent could tackle.
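As a hypothetical illustration of what I mean (this is not the actual framework, just the shape of the idea): a domain expert with, say, an inventory-restocking problem should be able to express it in the standard Gym interface, so that any Gym-compatible agent can train against it.

```python
# Hypothetical illustration: a simplified inventory-restocking task
# expressed in the classic Gym interface. The reward and demand
# process are made up for the example.
import gym
import numpy as np
from gym import spaces

class RestockEnv(gym.Env):
    """Each step, the agent chooses how many units to order; demand
    is random. Reward is revenue minus holding and ordering costs."""

    def __init__(self, max_stock=100):
        self.max_stock = max_stock
        self.action_space = spaces.Discrete(max_stock + 1)  # units to order
        self.observation_space = spaces.Box(
            low=0, high=max_stock, shape=(1,), dtype=np.float32
        )

    def reset(self):
        self.stock = self.max_stock // 2
        return np.array([self.stock], dtype=np.float32)

    def step(self, action):
        self.stock = min(self.stock + action, self.max_stock)
        demand = np.random.poisson(10)          # made-up demand process
        sold = min(demand, self.stock)
        self.stock -= sold
        reward = 5.0 * sold - 0.1 * self.stock - 1.0 * action
        done = False                            # continuing task
        return np.array([self.stock], dtype=np.float32), reward, done, {}
```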
In fact, reinforcement learning has become quite a relevant topic in the world of Data. What do you think are the main success cases you have seen in this arena in the last few years?
I’m an eternal RL optimist because I tend to frame problems through a sequential decision making lens. That said, I’m afraid I don’t see RL as being relevant to the vast majority of data science applications.
I think of RL as a candidate for contextual policy learning when three things are true. First, the reward function is super clear. Second, the goal is to achieve performance better than human experts. Not just humans, but experts. And third, the team has to have a lot of resources, in terms of skill, money, time, and compute.
This doesn’t leave too many candidates for RL applications. Historically, it’s been limited to top AI labs “solving” games. Control of complex systems is a tantalizing use case; DeepMind achieved state-of-the-art results in nuclear fusion plasma control with an RL approach, and I think the same logic could be applied to things like bioreactors and distributed energy management as well.
Overall, how does Apple manage data initiatives? Is there a Data R&D team? Or are Data initiatives always driven by business needs? A bit of both?
I think it’s probably a bit of both in every company, including Apple. It’s good to have centralized data science expertise, including data engineers, modelers, MLOps engineers, and R&D, but business units will always feel pressure to build their own dedicated data science capacity. I’ve seen this even at the smaller and mid-size companies I’ve worked for and collaborated with.
And after your Apple days, you also had roles as Lead and Director of Data Science and Machine Learning teams. What do you think is the key to successfully managing a Data Team?
There are so many keys it’s hard to know where to begin. Where I start as a manager is to accept that leading a data team is a hard job. Things are never going to be perfect and the key is to keep learning and improving, individually and as a team. Humility goes a long way.
Doing work that matters is really important for a data team, whether it’s ML models in production or analytics dashboards that drive executive decisions. It sounds obvious, but it’s surprisingly hard to do, especially with new systems. When a team’s work isn’t having impact, morale drops very quickly.
The flip side is also important: balancing the fun, exciting, résumé-building work with the more tedious work needed to make data systems work in production. Modelers in particular tend to want to play with the latest hot model architectures and tools, but that’s usually not what’s most pressing for a company.
Lastly, I’ve learned the importance of understanding the goals and concerns of other stakeholders, in order to build empathy and reduce resentment. For example, if an analytics team feels overwhelmed by a surge of ad hoc requests from the marketing team, it’s important to understand where the marketing team is coming from and find a solution that works for everybody. Or, if an ML team finds discrepancies between the production and analytics databases, it doesn’t mean the data engineering team is bad at their job; it means the two systems operate under different requirements and constraints.
What I find interesting about your career path is that you have played different roles in the Data Value Chain, so I am going to ask you for your own definition of these roles and what you consider the OKRs of each one.
Data Scientist: I see “data science” as an umbrella term that comprises analytics, predictive modeling, and causal modeling and experimentation. Fundamentally, the goal is to extract meaning from data. That means understanding a company’s business objectives, it means understanding statistics and models, and it means being able to find the match between business goal and data method.
Data Engineer: Data engineers design and run the data pipes. Simple to describe, hard to do well. One challenge is that data engineers tend to serve multiple stakeholders, some of whom—like product engineering—have higher priority than data science.
ML Engineer: To me, there is a lot of overlap in the Venn diagram between Data Scientist and ML Engineer. ML Engineers build predictive models, specifically (vs. causal models or experiments or analytics) and they tend to think more about MLOps, i.e. the ML system as a whole, in production.
Lately you have been focusing on Machine Learning Ops. I would also love to get your own definition of “MLOps” and why you think MLOps is important for a company.
In plain language, it’s making machine learning models work in the real world, not just the classroom or Kaggle sandbox. It’s important because companies operate in the real world, and getting models to work reliably is the only way to capture the return on an ML investment.
So what does MLOps entail? Shreya Shankar et al. just posted a paper with a nice summary that I’ll borrow: MLOps comprises data collection, model training experiments, deployment and evaluation, and monitoring and response. A minimum viable ML product needs all four of these components to work.
And what type of profiles are needed to successfully run MLOps in a company?
It’s tempting to jump right into the required technical skills, but I think the foundation of MLOps is a mindset that the goal of ML is to bring measurable value to an organization. Often, data scientists come in with a bag of models and look around for data they can play with. A more mature mindset is to start with business objectives and look for places where the ML approach could add value.
The second ingredient is an ability to work cross-functionally and a sense of ownership over the ML system. Good MLOps requires commitment from data engineering, analytics, ML, back-end engineering, and the business stakeholders, and it takes a lot of work to get all those teams aligned and orchestrated.
I think the key technical skill for MLOps is an ability to learn broadly but shallowly. Spanning data, modeling, deployment, and monitoring is a lot of ground to cover. It’s important to be able to get things working in each area, but also not to get bogged down trying for perfection in any one area. In larger organizations, individual teams can own and perfect the work, but a good ML engineer still should be able to understand technically what each team is doing, in order to orchestrate the overall effort.
Sometimes there seems to be friction between Data Science teams (“we have developed a great model but the MLOps team is unable to deploy our fantastic work”) and MLOps teams (“oh, here we go again, these data scientists bringing models that cannot be deployed and get outdated very easily”). Since you have been on both sides, what is your recommendation for addressing this potential friction?
I have definitely seen this friction; it can be a real problem. I’m going to take a bit of a strong stand here: I think it’s on the data scientists to adapt to the constraints of the business and engineering teams they work with. I say this as somebody who loves ML modeling, but in this era where open source, state-of-the-art, pretrained models can be downloaded and used for free, super fancy ML models just aren’t that special. A simple model that works reliably and enables fast iteration is far more valuable.
A couple of caveats. First, I don’t mean to suggest that data science is dying, quite the opposite. Data scientists should be the leaders when it comes to understanding business goals and creatively applying ML models to meet those objectives. Data scientists are the ones who should do the essential but difficult work of quantifying business metrics and choosing corresponding ML metrics. And data scientists are probably also in the best position to “own” the MLOps pipeline, because we’re used to working cross-functionally and have at least some experience with each of the components.
In fact, one specific recommendation here for data scientists: if you haven’t implemented a model in production, try it. Even if it just runs in shadow mode, it’s a great learning exercise. For Pythonistas not sure where to start, check out FastAPI.
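Something like the following is enough to get going (a minimal sketch; the model file and feature names are made up, and it assumes a scikit-learn-style model saved with joblib):

```python
# Minimal sketch of serving a model in shadow mode with FastAPI.
# The model file and feature names are hypothetical.
import logging

import joblib
from fastapi import FastAPI
from pydantic import BaseModel

logging.basicConfig(level=logging.INFO)
app = FastAPI()
model = joblib.load("model.joblib")  # hypothetical pre-trained model

class Features(BaseModel):
    feature_a: float
    feature_b: float

@app.post("/predict")
def predict(features: Features):
    pred = model.predict([[features.feature_a, features.feature_b]])[0]
    # Shadow mode: log the prediction for offline comparison; no
    # downstream system acts on it yet.
    logging.info("shadow prediction %s for input %s", pred, features.dict())
    return {"prediction": float(pred)}
```

Run it with `uvicorn main:app`, POST JSON to `/predict`, and compare the logged shadow predictions against what the live system actually did. It’s a low-risk way to build both trust in the model and your own production skills.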
The second caveat is that data scientists should also push for long-run improvements to MLOps. For example, maybe today the production and analytics databases have too many discrepancies to train a certain model. Work with the data engineering team to find a long-run solution that enables the better model.
And what are the three DOs and three DON’Ts when talking about MLOps strategy?
DOs
Do emphasize the mindset shift: the goal is to bring meaningful, measurable value to the company and to mitigate risks.
Do emphasize cross-functional collaboration and buy-in. ML systems are fragile and die easily if everybody is on a different page.
Do start small and build incrementally. It’s never going to be perfect, even for companies with huge resources. Just keep making progress.
DON’Ts
Don’t present MLOps as the latest hot thing in AI, even if it is. Everybody’s tired of AI hype (including me), and it’s easy to dismiss the latest hot thing as empty buzz.
Don’t assume external vendors will solve MLOps for you. Lots of companies are moving into this space, but I’m skeptical because MLOps touches so many teams and requires so much custom logic.
Don’t expect everything to work seamlessly right away. There will be some rough edges that you just have to live with until the high priority tasks are done.
Last question: three years from now, what main changes do you foresee in terms of MLOps?
Hmm, this is a tough question!
Despite what I said previously about not assuming vendors will solve MLOps, I do think there will be an explosion of vendors and tools, followed by a coalescing of opinion around the “modern MLOps stack”, just like we’ve seen with the “modern data stack”.
On the technical side, I see a growing awareness of the importance of rule-based guardrails. The Shankar et al. paper mentions a chatbot example, where the ML model’s responses are overruled when it can’t possibly know the answer to a question. Everybody does this for models in production, but the guardrails are rarely treated as first-class citizens. I think that will change.
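As a toy sketch of the pattern (the rules, fallback message, and model interface here are all hypothetical), treating the guardrail as a first-class citizen means it gets its own logic, its own telemetry, and its own tests:

```python
# Toy sketch of a rule-based guardrail wrapped around a model; the
# rules, fallback message, and model interface are all hypothetical.
FALLBACK = "I'm not sure about that. Let me connect you with a human agent."

def out_of_scope(question: str) -> bool:
    """Rule: refuse topics the model can't possibly answer reliably."""
    blocked = ("medical", "legal", "account balance")
    return any(term in question.lower() for term in blocked)

def answer(question: str, model) -> dict:
    if out_of_scope(question):
        # First-class citizen: record that the guardrail fired, so it
        # can be monitored and evaluated just like the model itself.
        return {"response": FALLBACK, "guardrail_triggered": True}
    return {"response": model.generate(question), "guardrail_triggered": False}
```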
I think the trend of data scientists driving MLOps will continue. In particular, I think (and hope) there will be fewer situations where data scientists run model training experiments in a vacuum and then throw their models over the wall to an entirely different team for productionization and deployment.