Conversation with Ariadna Font (Part 1)
Let's start this conversation traveling back in time. In the early '90s you decided to study Translation and Interpreting in Barcelona. I am curious to know the reasons behind it.
I have always loved languages and getting to know people from different cultures, so one of the main reasons I picked translation and interpreting over a scientific career was that the program required me to study abroad for part of the year. This was the deciding factor for me. I ended up spending three semesters abroad (St Andrews, Bochum, and London).
My entire family are scientists with PhDs, and my aunt is a medical doctor…so I was the black sheep back then.
But somehow you got back to science and wrote your thesis: "A bottom-up chart parser for PATR unification grammars in Prolog". I guess that explains a lot of what happened in your professional life years later.
Yes! That was my first serious attempt at being a computational linguist. During my undergraduate studies I met Dr Toni Badia, who introduced me to Computational Linguistics and showed us how to write translation rules so that computers could do the translation instead of us. I was in! As it turns out, I did not actually enjoy translating; I found it to be a solitary activity. So I was very happy to teach computers to do it instead. I fell in love with Computational Linguistics, which is a key part of Artificial Intelligence (AI). All my electives were about Lexical-Functional Grammar (LFG) and other early Natural Language Processing (NLP) theories and techniques. It was super fun.
Let’s now jump to 2007, when you started working as Senior Computational Linguist at Vivisimo. What was the scope of your work there?
When I joined Vivisimo, after finishing my PhD in NLP at Carnegie Mellon, it was still a small startup, and I was the one and only Computational Linguist there, which was not even such a common role, especially for such a small company. But because we were building enterprise search solutions (those were the early days, when every company wanted an internal version of Google search), it actually made a lot of sense to have somebody with an NLP background. My first project was to implement a clustering algorithm that we used to enrich user keyword searches: essentially, semantic search.
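To make the idea concrete, here is a minimal, hypothetical sketch of what clustering search-result snippets can look like, using scikit-learn's TF-IDF vectorizer and k-means; this is not Vivisimo's actual (proprietary) algorithm, and the snippets are made up.

```python
# Illustrative sketch only: cluster search-result snippets with TF-IDF + k-means.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

snippets = [
    "quarterly sales report for the EMEA region",
    "EMEA revenue summary and sales figures",
    "employee onboarding checklist and HR policies",
    "HR handbook: vacation and onboarding policies",
]

# Represent each snippet as a TF-IDF vector, then group similar snippets together.
vectors = TfidfVectorizer(stop_words="english").fit_transform(snippets)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for snippet, label in zip(snippets, labels):
    print(label, snippet)
```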
After a couple of NLP-specific projects, I wore many hats, from software engineer/programmer to designer and user experience researcher. I loved that fluidity, and I ended up learning a lot about lean and agile software development, design thinking, user-centric design, and even being a team lead.
Since then, the field of computational linguistics has evolved significantly. From your perspective, what have been the major breakthroughs in recent years?
Indeed! When I was finishing my PhD, machine learning (ML) was just one graduate course that PhD and Master's students could take. Now, ML at CMU is a department with both Master's and PhD programs in ML/AI, and there is even an ML minor for undergrads.
There have been several breakthroughs since I first started in this field!
At CMU, I was working on statistical NLP, which involved using large corpora and probabilistic models to understand and generate language. Back then we were building small, purpose-built language models to solve very specific tasks. My Master’s thesis leveraged ML to identify the language of origin of proper names and see how that affected the pronunciation of that name via a speech synthesis system.
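As a toy illustration of that kind of task (and emphatically not the actual thesis system), a character n-gram classifier can guess a name's language of origin from orthographic cues; the names and labels below are invented.

```python
# Toy sketch: guess a proper name's language of origin from character n-grams.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

names  = ["Giulia", "Alessandro", "Hiroshi", "Yuki", "Margaret", "William"]
labels = ["it", "it", "ja", "ja", "en", "en"]

# Character n-grams capture spelling cues (e.g. "ssa", "shi", "iam").
model = make_pipeline(
    CountVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    MultinomialNB(),
)
model.fit(names, labels)
print(model.predict(["Takeshi", "Federica"]))  # tiny data, so treat the output as illustrative
```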
At that time, we were using Hidden Markov Models (HMMs), Support Vector Machines (SVMs), and Conditional Random Fields (CRFs), and one of the largest corpora was the Penn Treebank (1993), a bit over 1 million words. In 2006, Google’s Statistical Machine Translation (SMT) started showing really promising results. It set a new standard for machine translation systems and demonstrated the power of large-scale data-driven approaches.
And it wasn't until about 10 years ago that word embeddings were introduced as dense vector representations of words, capturing semantic similarities and relationships (Word2Vec). It was also around that time that Recurrent Neural Networks (RNNs) and their variant Long Short-Term Memory (LSTM) became popular for processing sequential data, making them well suited for tasks such as language modeling and machine translation.
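A minimal sketch of the word-embedding idea, using gensim's Word2Vec on a toy corpus (real embeddings are trained on billions of tokens), might look like this:

```python
# Minimal Word2Vec sketch: each word becomes a dense vector,
# and nearby vectors correspond to related words.
from gensim.models import Word2Vec

sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "pets"],
]

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50, seed=0)
print(model.wv.most_similar("cat", topn=3))
print(model.wv.similarity("cat", "dog"))
```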
A key breakthrough that propelled the field forward was the introduction of the transformer model in the 2017 paper "Attention Is All You Need": it revolutionized NLP by enabling efficient parallel processing and capturing long-range dependencies more effectively. Transformers became the foundation for state-of-the-art models in NLP, such as BERT (2018), GPT (2018-present), and T5 (2019). BERT set new state-of-the-art performance on a wide range of NLP tasks and highlighted the importance of pre-training on large text corpora followed by fine-tuning, whereas GPT models showcased the potential of scaling up language models and achieving human-like text generation capabilities.
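A hedged sketch of that pre-train-then-fine-tune pattern, using the Hugging Face transformers library and a tiny made-up sentiment dataset, could look like the following; real fine-tuning obviously uses full datasets and proper training loops.

```python
# Sketch of "pre-train, then fine-tune": load a pre-trained BERT encoder with a
# fresh classification head and take a few gradient steps on labeled examples.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

texts  = ["I loved this movie", "Terrible, a waste of time"]  # made-up examples
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few steps only, for illustration
    outputs = model(**batch, labels=labels)
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```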
In the last 4 years, technical advances have not only continued but accelerated. There has been what many call a true democratization of AI. These Large Language Models (LLMs), such as GPT-4 and Llama-3, are almighty models that can now tackle a variety of tasks, ranging from translation, summarization, and writing a poem to drafting a report, an analysis, or a technical review. This would have required multiple models in the past, one for each task.
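As an illustration of the "one model, many tasks" point, a single chat-style LLM can simply be prompted for each task in turn; the sketch below assumes the OpenAI Python client and a GPT-4-class model, but any comparable API would do.

```python
# Sketch: one model, many tasks, differing only in the prompt.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

tasks = {
    "translation":   "Translate to Catalan: 'Language models are everywhere.'",
    "summarization": "Summarize in one sentence: Transformers enabled parallel training of large language models.",
    "poem":          "Write a two-line poem about machine translation.",
}

for name, prompt in tasks.items():
    reply = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name; substitute whatever model you have access to
        messages=[{"role": "user", "content": prompt}],
    )
    print(name, "->", reply.choices[0].message.content)
```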
The biggest revolution of this new wave of AI is not just the technology but the user experience and alignment process that these models were subjected to. This is the era of Responsible AI (RAI).
What has made these models ready for production and usable via downstream applications, such as chatbots and co-pilots, is that they have undergone a lengthy and thorough Alignment process. During this process, they have been steered to prefer answers aligned with human values and preferences. These efforts aim to mitigate harmful biases and ensure responsible AI deployment.
Another key part of RAI is sustainability. These models consume a lot of energy and water during the training process, which has triggered lots of research and development around making models more efficient and smaller. Models like DistilBERT, TinyBERT, and MobileBERT focus on reducing the computational resources required while maintaining performance.
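The general recipe behind such smaller models is knowledge distillation: a compact "student" is trained to match a large "teacher". Below is a generic PyTorch sketch of a distillation loss; it illustrates the technique, not DistilBERT's exact training objective.

```python
# Generic knowledge-distillation loss: soft targets from the teacher plus hard labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-softened distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage with random logits for a 3-class problem.
student = torch.randn(4, 3, requires_grad=True)
teacher = torch.randn(4, 3)
labels = torch.tensor([0, 2, 1, 0])
print(distillation_loss(student, teacher, labels))
```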
And after almost 5 years at Vivisimo, you moved to IBM, where you were involved in several AI challenges, including Watson Discovery and Watson Knowledge Studio. I would love to hear about the main projects you were working on.
After Watson won Jeopardy! in 2011, there was this idea that AI was ready for prime time and could be infused into many business applications to improve efficiency and even accuracy. This was the premise of the IBM Watson Group, created in 2014. I was on one of the teams that got "internally acquired" into Watson (Vivisimo had been acquired by IBM's Big Data business unit two years prior).
When I first joined the Watson Group, I was leading teams of Designers as Program Director and Design Principal. I was designing the early AI business systems, including Watson Discovery (which was actually based on Vivisimo’s technology) and Watson Knowledge Studio. These first systems were using several different NLP techniques, but did not yet have the power to generalize the way LLMs do today.
I then spent about a year as the Chief of Staff (aka Executive Assistant or EA) of the Watson Group's SVP, Mike Rhodin, learning about how decisions at the highest level of IBM are made. During those months, I spent a lot more time interacting with customers and other leaders than with engineering and design teams, but it was an incredible learning experience.
You also catalyzed one of the IBM Quantum projects. Tell me more about it
Yes! My first role after the Watson SVP chief of staff role was Director of Emerging Technologies at IBM Research. And let me tell you, IBM Research is a very special place, full of incredible scientists.
The first project I worked on was assembling a team of researchers, engineers and designers to put our Quantum computer on the cloud, which had been completely inaccessible to anyone who wasn't an IBM Quantum physicist. This was an incredible milestone. Nobody had done this before; in fact, most believed it was not possible. The technology was considered to be too unstable and finicky. It took a lot of work and deep collaboration, but we did it; we were the first to put a real quantum computer on the cloud and give everyone worldwide access to it.
In fact, it was such a big success that IBM decided to create a new Business Unit for it: IBM Quantum. So I have personally witnessed and been a part of the creation of two business units during my 7 years at IBM. It was a huge team accomplishment, and I have fond memories of the teams I worked with and my time at IBM Research.
In 2019, you started working at Twitter as Director of Engineering. Let's first talk about your main responsibilities as ML Platform lead.
When I joined Twitter, I took on a dual role: Director of Engineering of Twitter's ML Platform and NY site lead for Engineering, Product, Research, and Design (EPDR).
As ML Platform Engineering Director, I led multiple teams, mainly in the US, responsible for building the ML tooling and infrastructure to help Twitter's ML engineers be more efficient and scale operations.
You were also responsible for META (ML Ethics, Transparency and Accountability) and Responsible ML. What were the main challenges and accomplishments?
Yes. In fact, this is a responsibility I picked up in 2020 when the manager of the then-small META team left Twitter. While my manager was thinking of dismantling the team, I instead advocated for doubling down and investing much more in Responsible ML/AI as a company, starting by increasing the budget and scope of the META team, but also turning RAI into a company-wide objective. I not only really believed RAI was critical for Twitter, but I also had a very clear vision about how to do it (from product instead of policy), and thus I offered to lead the charge. After a few months of pitching the vision, mission and roadmap to Jack Dorsey (founder and then CEO) and his leadership team, it got elevated to a top-level objective for the company and I managed to get headcount to hire a more senior leader for the team as well as to grow the team.
One of the main challenges we faced as a company came shortly after I had gotten RAI approved and funded. It's what I refer to as the image cropping crisis. At the end of September 2020, a user noticed that Zoom erased the face of one of his colleagues: the algorithm did not recognize his colleague's face because he was Black. He uploaded the images to Twitter to share his outrage and found that the preview Twitter automatically generated also showed only his own face, leaving his Black colleague out. Many joined in evaluating Twitter's image-cropping algorithm in real time and started experimenting with other images. When uploading images with the faces of two people, one Black and one white, with a lot of space in between, Twitter's preview overwhelmingly showed only the white person's face, even when changing the order and position of the images.
To give you some context, this was happening during the Black Lives Matter movement in the United States, with several cases of innocent Black people being killed by the police. As a result, the issue spread like wildfire and went viral.
Twitter introduced this saliency algorithm back in 2018 to frame images so that all photos would have the same size and to allow users to see more tweets without having to scroll. A saliency algorithm estimates the section of the image users want to see first to automatically determine how to crop the image and display it without requiring any manual work from the user. The saliency models are trained on how the human eye looks at a photo to prioritize what most people think is most important. The algorithm, trained with human eye-tracking data, predicts a saliency score for all regions of the image and chooses the point with the highest score as the center of the frame.
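In rough terms, once a saliency map has been predicted, the crop can simply be centered on the highest-scoring point; the NumPy sketch below assumes hypothetical image and crop sizes and uses a random array as a stand-in for the model's predicted saliency map.

```python
# Sketch of saliency-based cropping: center the crop on the most salient pixel,
# clamping the crop window to the image bounds.
import numpy as np

def crop_around_saliency(image, saliency, crop_h, crop_w):
    # Crop center = pixel with the highest predicted saliency score.
    cy, cx = np.unravel_index(np.argmax(saliency), saliency.shape)
    top  = int(np.clip(cy - crop_h // 2, 0, image.shape[0] - crop_h))
    left = int(np.clip(cx - crop_w // 2, 0, image.shape[1] - crop_w))
    return image[top:top + crop_h, left:left + crop_w]

image = np.random.rand(400, 600, 3)    # stand-in for a photo
saliency = np.random.rand(400, 600)    # stand-in for the model's per-pixel scores
preview = crop_around_saliency(image, saliency, crop_h=200, crop_w=400)
print(preview.shape)  # (200, 400, 3)
```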
From a product and user experience perspective, automatic image cropping has the advantage that it doesn't require users to edit the image before tweeting, which was reflected in our engagement metrics. In this case, it is also important to emphasize that before putting the algorithm into production, we conducted a series of tests to try to detect gender and racial biases with a public dataset. Although we had not trained the algorithm with race or gender features, and our initial analysis did not indicate bias, we immediately realized that any automatic cropping algorithm risks causing representational harm, as it limits users' ability to express themselves as they wish. And what is worse, it can reinforce social stereotypes, in this case, the subordination of certain groups in a way that affects their identity.
Instead of insisting on technicalities about the algorithm's accuracy, or the fact that there was roughly a 50% chance of choosing the white or the Black person, I advocated that we focus on admitting, in a blog post, the representational harm the algorithm had caused, and commit to finding a better solution that gives users the option to choose how their images are displayed in tweets.
Five months later, in March 2021, we began testing a new way of visualizing photos—without using any saliency algorithm. The goal was to give users more control over how their images appear on Twitter, while also improving how others see the images. After receiving positive feedback on this experience, in May we rolled out this change worldwide.
One of the main conclusions was that not everything on the Twitter platform is a good candidate for algorithmic decision-making, and in this case, how to frame an image is a decision that we want to leave to the author of the Tweet, not an algorithm.
As part of our commitment to transparency, we also published a detailed analysis (which can be found on ArXiv) and opened our source code so that everyone could reproduce and improve our analysis.
Our response to this crisis became a roadmap for the various aspects necessary to carry out responsible ML: accountability, by responding to user concerns and considering the impact our systems have on society; transparency, by sharing our audit and analysis; collaboration, by opening our code; and product change, when necessary.
In addition to managing crises such as this one, we also conducted detailed studies and analyses to detect potential harm that could be caused by the algorithms used by Twitter, including an analysis of content recommendations across political ideologies in seven countries and an analysis of algorithmic fairness to see whether the recommendations shown on Twitter's main page vary by racial subgroup. On the engineering side, the META team developed tools to enable analyses of multiple aspects, including ways to assess bias.
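As a purely illustrative example of one simple bias check of that flavor, one can compare outcome rates across subgroups; the column names, groups, and numbers below are invented and have nothing to do with Twitter's actual data or tooling.

```python
# Toy fairness check: compare how often an outcome occurs for each subgroup.
import pandas as pd

df = pd.DataFrame({
    "subgroup":    ["A", "A", "A", "B", "B", "B"],   # hypothetical subgroup labels
    "recommended": [1,   1,   0,   1,   0,   0],     # hypothetical binary outcome
})

rates = df.groupby("subgroup")["recommended"].mean()
print(rates)
print("rate gap:", rates.max() - rates.min())  # large gaps warrant deeper analysis
```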
May I ask why you left Twitter? Was it because of the whole Elon Musk mess?
Yes, that was it. Elon Musk came in, and less than a week later more than 50% of the company was laid off. My org was impacted in a major way. Needless to say, he did not care much for Responsible AI, or doing the right thing for Twitter users. He had other interests in mind.
For me, this was a blessing in disguise. It allowed me to take a step back around the same time that ChatGPT came out, and I immediately knew we were in for a ride! It was the most powerful combination of NLP+UX+RAI to date, and I knew it was going to be a hit. I also knew it was a double-edged sword, and companies were going to need a lot of help navigating the space and taming this technology to benefit their business and their customers.
P.S.: you can read the second part of the conversation here.