Data Scientist is a new profession, and as such, there isn't a clearly defined career path for it. People end up in data science from different backgrounds, such as computer science, mathematics, economics… Some of them had prior experience in related professions.
We can look at the career path of a Data Scientist along four main axes: a data axis, an engineering axis, a business axis, and a product axis. The role of Data Scientist is multidisciplinary, and we can see the career path within each axis as a continuation skewed towards some of these disciplines.
Within the data axis, there are two typical career progressions: one as a data leader/manager, and one as a senior individual contributor (Senior IC).
There is no common understanding of what makes a data leader. For some, a data leader is merely a people manager. For others, a data leader needs to be able to act as a mentor to more junior data specialists. Others need him/her to be able to instill a data-driven culture within an organization or lead complex data projects. There are vast differences in requirements for data leaders, depending on the nature of the organization and its stage of maturity in leveraging data.
During the "pitch" phase of a company's data maturity, the data leader needs to set up the foundation for a data-driven culture: convincing stakeholders to adopt data-driven processes, doing proofs of concept, and managing vendors to set up these foundations.
In the "startup" phase, the focus is on recruiting and internalizing key resources and on setting up the data project foundations, such as a data lake, the initial master data management strategy, the reporting infrastructure, etc. The focus for the data leader is very much on recruitment and managing these different projects.
When "scaling up", the focus for data leaders needs to be on growing the team, mentoring and up-skilling existing members, and pushing to leverage more advanced analytics, deferring decisions to data and algorithms, and integrating predictions into production systems.
In the "run" phase, with an already fairly mature team, a different type of focus is needed. The focus can be geared more towards people management, handling data as a strategic asset, data management and governance, rallying the organization around improving certain aspects of data quality, or longer-term products and projects leveraging data.
The needs for a data leader differ considerably depending on the stage of maturity an organization is in, and they also differ depending on the specific data focus areas.
To step up into a data leader role, your skills must match what the organization needs from a data leader at that particular point in time.
Senior Data IC:
As with data leaders, there is no commonly accepted measure of what makes an individual contributor "senior." There is a lot of variance across companies and departments, and there might not even be clear expectations as to what a senior data IC looks like.
There is quite a wide range of factors on which to judge an IC. Being classified as a Senior IC means that you have passed the threshold in these areas.
Where to draw the line between a regular data IC and a senior one is more a matter of the particular organization employing them.
Data Science & Data Knowledge: There is a wide range of knowledge that an individual contributor can acquire, both in data science techniques and in the underlying datasets s/he is using. It is often quite difficult to compare individuals with specialized knowledge against those with broad knowledge.
- Techniques: There are numerous statistical, machine learning, deep learning, and operations research techniques that can be applied in data science. Having a wide arsenal of techniques helps solve different classes of problems. The depth of knowledge in this area also helps distinguish a Senior IC. Having deep knowledge of the models and their pitfalls, and knowing what types of transformations work best with them, allows us to truly leverage the potential of the data with the given model.
- Datasets: If we take the example of an e-commerce website, some data specialists might have profound knowledge of web-browsing clickstream data but have never touched the logistics datasets. Knowing a wide range of datasets allows us to connect different dots and work on problems holistically as well as end to end.
Engineering craft: Engineering skills are an important part of data science, especially when the focus is on data or ML engineering. There are multiple areas where engineering craft can be useful, from being able to put models in production, to creating reusable data structures, to being able to navigate someone else's codebase and make changes there.
A few areas can help distinguish a Senior data IC:
- Coding knowledge: Familiarity navigating the codebase and the libraries being used, breaking code up into libraries and reusable components, thinking about exceptions and test cases, producing clear, legible code, insightful code reviews…
- Architecture: Being able to provide architecture insights from both a system and a data perspective, knowing the different trade-offs between the available solutions…
- Productionalizing: Being able to automate the process and make the code production ready, helping set up the deployment of the code, and looking at the end-to-end picture.
Product & Business impact: There are many ways for data scientists to have product or business impact, from convincing stakeholders to look at specific metrics and base their decisions on them, to getting part of the organization to focus on data-driven initiatives.
- Strategy & Roadmap: One way data scientists can have product or business impact is by pushing items onto the roadmap through analysis or data projects, and by helping prioritize the different components of the roadmap through a quantification of their business impact and potential.
- Lead large projects: Another way for data scientists to generate large product or business impact is to lead large projects, particularly XFN (cross-functional) projects, for which data scientists, with their multidisciplinary competence, are a good match.
Being a data scientist requires leveraging quite a few product skills. It therefore comes as no surprise that Product Management is a common career path for many data scientists.
There are many aspects to being a product manager, from quantitative knowledge and technical ability to design sense, product sense, and communication. Product managers typically sit at very different points on that spectrum. Data scientists typically fare pretty well on the quantitative and technical parts of the spectrum and, if they are able to meet the minimum required on the other aspects, make good candidates for a Product Management role.
Quantitative: Part of the role of a product manager requires being able to provide estimates for business cases, planning how to achieve the target, and being numerically literate enough to drive the product to success.
Technical: Having a technical background is a requirement in many organizations for product management roles. PMs often need to be able to understand the complexity and time estimates of the different approaches. They can also benefit from being able to follow the overall development by following commits, as well as from understanding the need for refactoring code and properly handling the trade-off between delivery and technical debt.
Product Sense: Defining what product sense is can be quite difficult, but it revolves around having a focus on both problem and solution, being able to define and express the requirements for the product, defining the right metrics for evaluation, and being open to user feedback. This is an area that data scientists who are part of a product team can already partially contribute to before switching to a product manager role, allowing them to demonstrate some of these skills.
Design Sense: This is an area data scientists are not particularly well suited for. A product manager working on a front-end application is required to have at least some design basics, but for backend-oriented products this requirement can be completely overlooked.
Communication: Being a product manager requires interfacing with the different stakeholders to gather requirements and communicate the product direction, as well as with the engineering team to communicate the product direction and requirements.
Data Scientists are well equipped to embark on product management roles, provided they have adequate product sense and communication skills. If they work on user-facing products they should also make sure that they understand the basic design principles.
Data Scientists need to tackle a fair amount of technical work, and some even come from a computer science or software engineering background. A few progressions towards engineering are fairly natural, such as machine learning engineer or data engineer, but given the evolution of data engineering towards software engineering, most backend software engineering roles could be within reach for the more technical data scientists.
Data Scientists can leverage their knowledge of big data and distributed systems, coding ETL or ML pipelines… But there are a few attention points and skills that data scientists need to tackle when moving towards a more engineering-oriented role, notably code quality, data architecture, and system design.
- Code quality: Typically, data scientists tend to be more focused on coding fast than on coding with the same level of conscientiousness as most software engineers. Data scientists wishing to move towards engineering need to start writing cleaner code, break their code into more reusable components, write unit tests…
- Data Architecture: One aspect that they should also work on is the architecture of the pipelines they build: setting up common and reusable data structures, and properly handling data warehouse topics such as changing dimensions and database normalization.
- System design: Data Scientists need to get a better grasp of the different components in their data landscape and their advantages and drawbacks, from the different data stores to the processing layer, message brokers, caching layer, etc., and get a sense of how to design systems leveraging them.
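As a rough illustration of the code quality point above, here is a minimal sketch of what moving from a one-off notebook calculation to a reusable, tested component might look like. The function name `conversion_rate`, its arguments, and the test class are hypothetical and chosen only for illustration; they do not come from any particular codebase.

```python
import unittest


def conversion_rate(orders: int, sessions: int) -> float:
    """Return orders per session, or 0.0 when there were no sessions.

    Hypothetical helper: the kind of small, reusable component that
    replaces an inline `orders / sessions` scattered across notebooks.
    """
    if orders < 0 or sessions < 0:
        raise ValueError("orders and sessions must be non-negative")
    if sessions == 0:
        # Explicit edge case instead of a ZeroDivisionError downstream.
        return 0.0
    return orders / sessions


class ConversionRateTest(unittest.TestCase):
    """Unit tests covering the regular case and the edge cases."""

    def test_regular_case(self):
        self.assertAlmostEqual(conversion_rate(5, 200), 0.025)

    def test_no_sessions(self):
        self.assertEqual(conversion_rate(0, 0), 0.0)

    def test_negative_input_is_rejected(self):
        with self.assertRaises(ValueError):
            conversion_rate(-1, 10)
```

Saved in a module, the tests can be run with `python -m unittest`. The point is less the metric itself than the habit: edge cases are decided once, in one place, and checked automatically.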
Data scientists intending to move towards this area should take on more data engineering and software engineering tasks related to data, such as building APIs to serve models' predictions in production, working on real-time data ingestion, helping set up some of the data infrastructure, CI/CD pipelines…
Another path for data scientists is to move more towards the business they have been advising and helping to operate. The position in which data scientists end up depends heavily on the nature of the business they worked with.
Data Scientists who have worked in CRM or digital analytics can end up in Marketing Manager positions, while those who have worked in supply chain might end up in a Supply Chain Program Manager position, for instance.
To make a smooth transition to this type of role, the data scientist needs to have a good knowledge of the business and, usually, good project management skills.
- Business Knowledge: Data scientists should be armed to the teeth with business knowledge from doing analyses, deep-diving into the key metrics, getting exposed to the business problems to formulate questions…
- Project management: Most business roles require some level of project management, formulating a plan, coordinating between multiple parties, managing dependencies, informing stakeholders, etc.
There are quite varied career path options for data scientists. Because data science is a multidisciplinary profession, it gives exposure and exit options in different areas. Focusing on one particular area, whether engineering, product, or business, and making sure that the required knowledge and skills have been acquired opens the door to moving towards these careers. For those who would rather keep this multidisciplinary aspect, there are still options to grow within the data track as data leaders or senior ICs.
More from me on Hacking Analytics:
- On the evolution of Data Engineering
- What Should be the Analytics Organization Structure?
- The death of data-scientists
- New roles of Analytics – The Data Product Owner & Analytics Translator
- E-commerce Analysis: Data-Structures and Applications