Conversation with Sofia Kosenko

Sofia Kosenko is a data enthusiast who has worked in IT for six years. She started as a Data Analyst and progressed to Data Governance Specialist. She is a strange animal who moved from translation to IT and landed in Data Governance with no regrets. She is also a writer, a Top Data Governance Voice on LinkedIn (she just loves writing), a slow runner, an animal lover and a person in constant search of happiness and balance. She is Ukrainian, an immigrant living in Spain.



I always start these conversations by traveling back in time. So, let’s jump to 2014, when you finished your studies in Language Interpretation and Translation. At that point you started your professional career in a completely different field. Tell me more about it.

Since I was eight, I was sure I would be a translator from Spanish. It was my childhood dream and I was sure I would fulfill it. And I did, but when I started working as a translator in my early student years, I felt the need to express my own opinions instead of repeating someone else's ideas in another language. Then I went through different roles in Finance, Project Management and even Customer Service. I moved to Spain in 2014 and a few years later I decided to take a risk and start an Executive MBA, which opened the path to the data world for me. Since then I have been in love with data, but that is our next question.

And when did you fall in love with Data? When did you realize you wanted to work in this field?

I never realized that I wanted to work in data until I started working in the field. How did that happen? Before my graduation, a Spanish analytics company called me because they had a project in Ukrainian (in the context of the conflict with Russia since 2014 and the development of fake news, bots and other ways of controlling society through social media) and they needed a Data Analyst. I had no real idea about Big Data and social media analytics, but my intuition is well developed and I felt that I just couldn’t refuse that offer. There I learned a lot about analytics, building knowledge graphs, SQL, databases, Python, how crazy you can get when you need data and have to go to S3, and even how magnificent a report can be and how difficult it sometimes is to understand for someone who didn’t build it. I had always worked with data, but as a user: as someone who needed it to make decisions or just to end the working day with some results. When I started to work as a Data Analyst, I understood what is behind it and how important data processes are.

I guess one important moment was when you started to work at Collibra, one of the most important tools used for Data Governance. Could you share some details about your time there and the biggest challenges customers face when adopting Data Governance tools?

Collibra was the one who approached me and offered a job opportunity. I had heard a lot about them before, but Data Governance was not my field, so I had never even talked about the solution with my customers. At that moment, data science was sexier than Data Governance (and it still is). I started in Business Development and I needed to learn more about the solution, how it works and why customers would need it instead of having an internal one or not having any. That was really challenging because I understood that I had a lot to learn. Data analysis is just one part that can be managed by data governance, but there is a lot more. I started learning through Collibra webinars, internal courses (thank you, Presales Collibra guys), other courses, YouTube, customers, writing posts, reading books and spending a lot of extra time learning about the topic. I fell in love even more. For me, Data Governance is a perfect mix between Business and IT. I feel like a translator between technology and strategy (my childhood dream came true in a perfect way that I wasn’t expecting 10 years ago), having my own voice and sharing my knowledge with others.

By the way, could I get your own definition of Data Governance?

Oh, I love this question. For me, Data Governance is a business process of collaboration with IT in order to manage your data efficiently on different levels. Data Governance has a change management function and it unifies different areas of data and business to give real meaning for users to understand the actions.

And why is it important to any organization?

When I was a Data Analyst, we created wonderful and beautiful reports with huge volumes of data. Those reports were used to make important business decisions. The executives never really cared where that data came from, what its quality was or what would happen next. Neither did we. One department receives a lot of data, does its best to analyze it without additional context, the data flows on, and the cool report and insights are ready. No one is able to understand what each line means except the person who created it and their manager. In the meeting everything is explained, everyone draws their own conclusions and afterwards some important decisions are made.

Why am I telling you all this, and what does it have to do with data governance? Data Governance helps you give more meaning to every action taken on the data you work with in your organization. It lets you make the process transparent, easier and understandable for IT and business users, and be sure at every level that the data you are using is of good quality and traceable.

What's the most important part of Data Governance?

I would divide this into essential must-haves, for example a Communication Plan (not just on paper, also in execution), a Maturity Assessment, a definition of needs and Executive Support. These points are fundamental for me.

But on the other hand, there are other important parts of Data Governance that usually depend on things like:

  • Use case

  • Criticality of data

  • External requirements (laws, regulations, acts, etc.)

  • Even worldwide events that may affect your business

All of these should be part of synchronized actions between IT and business. When Data Governance consists only of IT-related actions, it will fail.

3 Do’s and Don’ts when implementing Data Governance programs

3 Do’s:

  1. Understand the needs of your organization well

  2. Build good use cases on the most critical data and areas

  3. Create a communication plan, because a lack of communication will ruin even your best intentions

3 Don’ts:

  1. Don’t focus only on data quality and lineage and stop there

  2. Don’t skip the Maturity Assessment

  3. Don’t treat it as just an IT issue and cut the business out of the sequence (the business is a user and the one that gives you money)

Could you provide more details on Maturity Assessment? What is it about? 

A Maturity Assessment will show you where you as a company have to start. Why? Because it will show you how well your company is doing. It’s not about the numbers at the end of the fiscal year. It’s about the real situation: you understand where every business and technology area, users, providers and even customers stand, and how each step of the different data processes is performed.

I would say it’s not just about data processes. It’s also about your company and how different processes influence the data flow around the company. In this way you assess what stage your data governance is at, where to start its implementation and where to go. A few months ago I wrote a post about the Maturity Assessment and what usually happens with this step, and it reflects the reality of so many companies. That’s why a lot of them have issues with Data Governance and at some point start hating it. They don’t know where to start or why they are doing it. A Maturity Assessment can take weeks or months, but it’s a wonderful investment for your future actions.

As discussed before, you worked for Collibra, but I would like to know which are the main tools in the market and the pros and cons of each one.

Collibra is definitely a wonderful tool, but I’m agnostic about tools. There is no perfect tool, only the most suitable tool for you, taking into consideration your company’s needs, ecosystem, architecture and budget.

There is no point in buying a Ferrari if you are only going to drive 30 km a day between home and work at 90 km/h.

It’s not about brands and how great it is to have one tool or another. It is about how you will use it and how it will help your company get the maximum benefit for the price you’re paying.

Hence, can you start a Data Governance program without a tool?

Indeed, Data Governance is not just a tool. I would go further: don’t start from the tool, if we are talking about real beginnings. What do I mean? Once you have done your Maturity Assessment and you know which processes you already have implemented, you have roles defined, some possible use cases, identified sources (sources are another interesting topic to discuss) and a general idea of why you need it, then yes, a tool would be the next logical step. But a tool by itself can be a detractor if the previous work has not been done yet.

I would love companies to remember: A TOOL IS A HELPER, not a magical solution that will solve all your issues. The formula is people, processes and tools.

I have met several colleagues who mention Data Governance is not working in their organizations because they do not get the right empowerment or cannot scale. What is your recommendation to address this topic? Who should be leading Data Governance in an organization? 

I worked with different companies that raised that issue with us, and the usual problem was the lack of people and processes in the strategy planning. How does it happen? A company wants to do data governance and buys a tool. They start ingesting data, classifying it, even implementing quality processes and lineage and assigning responsibilities. Meanwhile the money flows and people spend hours and hours working on it. The data team builds timelines and action plans for how to proceed, but eventually they understand that instead of improving processes they are stuck. Sometimes they even see the chaos they have generated. Afterwards they may buy a new tool to try to resolve all the confusion, but the issue remains.

They never understood the company’s real need to make such a change, and they never put people and processes in sequence. The C-suite put all their faith in a tool alone, even if that tool is the best one on the market.

And which profiles should a company dedicate for Data Governance? Which skills should they have?

Information about profiles can be found on any web page, so it’s not something I want to discuss at length. But there is a common issue related to data governance profiles that I saw in almost every company I worked with. Data is still seen as a 100% IT issue, so the profiles are selected without taking business users into consideration.

Companies do Data Governance as part of an IT initiative without taking into consideration business needs and business people. When we do Data Governance, business people become your internal customers. Before starting anything that can impact their day-to-day life, you need to understand why you are doing it, how it will affect them and why they would switch from their well-known processes to something new. And something I would add from my experience and point of view: please, understand how business people see data.

Once you have all this together, work together, because Data Governance is a team game.

And how should a small / medium size business start its journey on Data Governance?

I will repeat myself over and over again: the beginning of a Data Governance implementation is almost the same for every company. Understand where you are by doing a maturity assessment or an analysis of your initial situation. Define 2-3 use cases. Identify the sources. Trace the process and your data lifecycle, then build a plan and sell it to your executives.

Once you have the fundamentals in place, you can have a tool. You don’t need the best one or the most expensive one. For a low maturity level of data management and data governance, you can find a basic solution that will allow you to progress on your strategy, create a glossary and a data catalog, and have basic data quality scores and lineage. Trust me, for the first steps it is more than enough. It takes time to gain maturity and become more experienced in Data Governance. You are not competing against anyone else.

How do you measure progress and impact of Data Governance initiatives? 

This question is not easy to answer because it’s too complex. Data Governance does not return results from day one. It’s a long-term process that requires involvement from all employees, so you need to involve them from the beginning. Once the plan is built, you divide it into priorities and tasks and define some quick wins. In parallel, develop KPIs that will measure them and tell you whether or not you are on track toward your objectives.

The KPIs are not static and may change once you reach your goals or your needs change. Those KPIs and quick wins should be tied to business strategy and business actions. Those quick wins will also help you justify your activity to your management and see the progress.

Although Data Quality and Data Governance come together, could you also elaborate on the differences between them? 

Data Quality is a standalone activity required to monitor the state of quality of your data. According to the DMBOK, Data Quality is the planning, implementation, and control of activities that apply quality management techniques to data, in order to assure it is fit for consumption and meets the needs of data consumers. For me, Data Quality is like the measurements or metrics of human health, but for data. There are no perfect measurements: it depends on your needs, your environment and other key indicators like height, weight, neighborhood, quality of food and water or other conditions. The first thing to understand about Data Quality is that there is no such thing as perfect data. Data Quality should be defined according to your needs and use cases.

So, to answer the question: Data Quality can come together with Data Governance, but it’s not mandatory. Can Data Quality be part of Data Governance? Yes, and an essential one. If you are doing Data Quality, are you doing Data Governance? You CAN’T substitute a Data Governance strategy by doing just Data Quality. Unfortunately, it’s one of the most common mistakes: companies see Data Governance as just Data Quality, Lineage or Architecture instead of seeing it as a more complex business and tech process where all those areas are united under the same umbrella.
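The idea that Data Quality is defined by needs and use cases can be illustrated with a minimal Python sketch. It is an illustration only, not something from the conversation: the `customers` rows, the use-case names and the thresholds are all hypothetical, and completeness is just one of many possible quality measures.

```python
from dataclasses import dataclass


@dataclass
class QualityRule:
    column: str
    min_completeness: float  # required share of non-empty values (hypothetical threshold)


def completeness(rows, column):
    """Share of rows where `column` is present and not empty."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(column) not in (None, ""))
    return filled / len(rows)


def evaluate(rows, rules):
    """Return {column: (score, passed)} for every rule."""
    results = {}
    for rule in rules:
        score = completeness(rows, rule.column)
        results[rule.column] = (score, score >= rule.min_completeness)
    return results


# Hypothetical sample data: one missing email and one missing phone.
customers = [
    {"id": 1, "email": "a@example.com", "phone": "600000001"},
    {"id": 2, "email": "", "phone": "600000002"},
    {"id": 3, "email": "c@example.com", "phone": None},
    {"id": 4, "email": "d@example.com", "phone": "600000004"},
]

# The same dataset is judged differently depending on the use case.
USE_CASES = {
    "email_campaign": [QualityRule("email", 0.95)],
    "trend_report": [QualityRule("email", 0.70), QualityRule("phone", 0.70)],
}

for name, rules in USE_CASES.items():
    print(name, evaluate(customers, rules))
```

With these made-up thresholds, the same four rows are good enough for the trend report but not for the email campaign, which is the sense in which there is no single measure of perfect data.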

How do you recommend embedding Data Governance when a Data Warehouse needs to be built? And when the Data Warehouse is already up and running?

You can have a Data Warehouse and not have Data Governance in place. But if you have Data Governance, it will help you to better understand your data, classify it and set policies for its better and more efficient use. Inside the Data Warehouse, the data should be governed and should serve the business to generate money. Data Governance will maintain the health of your DW.

Data Contracts and Data Lineage are also relevant in any Data Governance initiative. Any thoughts on this?

The same answer as for the Data Quality question. 

Lineage is very important. A lot of stakeholders don’t even know all their sources, how data appears and how it flows through the systems. Companies should have their business and technical lineage for different objectives and scopes. It will help companies to see how the data flows, what sources they have and if those sources are secure. And Data Governance will unify them to work better and in a more efficient way.

Maybe it is just a “semantics” issue but some people claim we should move from Data Governance to Data Management. What is your perspective?

Sure, it just means that people don’t understand well the difference between Data Governance and Data Management. Data Governance is a framework within Data Management that helps you set the policies, processes and metrics needed to manage your data in a more distributed and efficient way. So, these two processes should be managed in parallel. Data Management means managing your data on an operational level, focusing just on data processes. If not, what does it mean to you? Data Governance stands for the political and governmental view of Data Management across the whole company, unifying IT and business, process and strategy, people and technology.

Those two processes should go together, as we say in Spanish “sí o sí” (stands for “no matter what”).

How is the arrival of LLMs and Generative AI affecting Data Governance programs? Any recommendations on how to cope with it?

At this moment, LLMs and GenAI are just hype and hot topics. Not because they are not needed, but because most companies don’t understand what they are about. The same happened about five to seven years ago with Data Governance.

A lot of the time, talking to someone from the executive business team about AI and GenAI comes down to a need for just a Python script that automates a dataset search instead of the VLOOKUP function in Excel. Not everyone really understands what it’s about. Unfortunately, the same still happens with DG.

So, Data Governance and LLMs or GenAI are not rivals. They work together. I would go further: Data Governance can exist without LLMs and GenAI, but GenAI and LLMs can be supported by Data Governance. For me, the alternative is asking AI to bring you results without good Data Quality and Data Governance. Will it work? Maybe… Are you sure about the results? Ohhh, let’s see. It might seem like a joke, but it’s not.

Looking ahead, what are the biggest changes we will see in the Data Governance field in the next 5 years?

Ohhh, it’s not easy to answer because we don’t know. Three years ago AI and GenAI were not hot topics and today they are. Ten years ago only a few companies were thinking about DG, and today it’s a must-have for some industries due to governmental requirements and standards. So, I think the main goal we need to work on now is evangelization about why we need it: why data is so important for every employee (and not only for employees) and why we need to govern it.

Our job now is to make sure that IT and business work together. Once we achieve that, Data Governance will stop being so confusing and difficult.

To finish the conversation, I would like to know your perspective on the gender gap in STEM. According to recent statistics, it remains significant, with women making up only 28% of the STEM workforce. Why do you think this is happening?

Fourteen years ago I had to choose between an Engineering and a Translator’s career at two different universities. Don’t misunderstand me, I love languages and I’m good at them, but I was thinking about an Aeronautical Engineering career, as I had been offered a university scholarship for that faculty. Then my family and close circle persuaded me that this career was not for girls and that I should be a translator (a more “girlish” career), as I had always wanted. I do not regret my choices, but I can still see the pressure women face in the IT industry, working on data.

Every day there are more and more of us in different industries wrongly considered “male”, raising our voices, making changes and showing that we are capable of everything, but the process is still too slow.

And what can we do about it? 

We live in the era of data as data citizens, and women are part of it. We need more education and encouragement for all girls in schools and for women who want to change their profession to STEM. It’s not easy, sometimes it is hard as fuck and there is a huge lack of support. We need more associations. Include women in more projects, don’t interrupt us when we are speaking, listen to our opinions, give us the same opportunities, wages and promotions. We definitely have a lot to say in the fields where we work!
