Structuring a data science team for success is highly context-dependent. Businesses differ hugely in type, level of analytical maturity, size, existing structures and so on. Despite this, there are a number of good articles that try to provide generalizable advice on structuring a data science team for success.
During some desk research on this topic, I synthesized a number of these articles into the following amalgam of ideas:
Integrated - Data scientists are embedded within product teams and have their primary relationships with product managers. They report to a data science manager for admin purposes. This achieves strong buy-in and empowerment with business teams, but creates a lack of cohesion between DSs. It also fosters knowledge silos in the business.
Centralized - Data scientists sit in a tight-knit team managed by a technical DS manager. This creates much better cross-skilling and cohesion, but results in a lack of domain context when solving problems because the analysts are detached from the product teams. This can result in DS teams being ‘order takers’ or a pure support function.
Hybrid / Center of Excellence (COE) - This approach uses a COE to set standards, provide training and foster collaboration among data scientists, while the data scientists themselves remain strongly embedded within product teams. The best case is having clusters of DSs who can serve multiple teams, with regular rotations to enable upskilling. Here the hybrid team is not a support function, but a genuine data product team that is responsible for delivering data products.
While this is quite context-driven, the current trend is for data scientists to have full end-to-end control of the analysis, from ETL to deployment into production. This has several benefits:
- Reduced tension between engineering and analysts
- Upskilling of analysts
- Engineering can focus on generalizable, horizontally scalable solutions that give users autonomy
Data scientists are often categorized as one of two types:
- Type A (Analysts) are pure statistical, consulting-type data scientists with ‘human targets’, i.e. business people.
- Type B (Builders) are machine-targeted DSs who build algorithms directly for production deployment.
These are the traits of high-performing teams:
- Discoverability - Publish results and analysis internally and openly
- Empowerment - Right tools for the right job
- Collaboration - Shared understanding and standards
- Automation - Results are reliably provided and automatically refreshed
- Deployment - Laser-like focus on production deployment
- Excellence - Deliberately deciding to be awesome at data analytics
Where to put a data science team is not clear-cut; it depends on a few factors:
- Where can they influence the org the best?
- Which executive is the biggest advocate?
- Where can they get ready engineering support?
- Where does tech / BI currently sit?
I would love to get some feedback on the above approaches, as well as war stories about what has worked and what hasn’t.