Data Scientist, Systems Intelligence
Amazon.com, Inc.
Shoreline, WA

The mission of the Systems Intelligence team is to provide situational awareness to Amazon's leaders, enabling them to make decisions that directly shape the culture of software development. A Data Scientist is critical to the success of this program: you will analyze the data we collect and identify the statistical correlations that drive future business decisions. Our team provides visibility into the metadata collected from Amazon's internal systems, software development tools, developer output metrics, and internal surveys in order to highlight differences across organizations. These variances prompt discussions and investigations into areas that present potential bottlenecks or risks to software delivery. From what we learn in these investigations, we will identify best practices that can be shared with all levels of the organization to drive continuous improvement. A Data Scientist will complete the feedback cycle for our team by synthesizing the data we collect, providing correlation analysis, and guiding future investigations.

As a Data Scientist, you will use historical data to predict future performance, helping us reduce the uncertainty in our planning. While business intelligence tends to work with structured data, data science leans more toward the unstructured. You must be comfortable with incomplete, messy, and unorganized data that is not usable until it has been cleaned and prepared. You will generate predictive insights and new product innovations by applying advanced analytical techniques and algorithms, using statistical packages, SQL, Hadoop, and open source tools such as Python and Perl.

In 2019, our team will deliver a dashboard for software development managers that provides in-depth insights and business metrics about their teams. It will offer historical trend analysis along with comparisons against organizational averages, guiding managers toward opportunities to improve development agility. By gathering datasets such as deployments, code submissions, code reviews, and team hierarchy, the dashboard will also give technical leaders a view for driving cross-cutting initiatives such as SDE Ratios, remote code contributions, and migration to native AWS. Success will be measured by whether this dashboard, built on Systems Intelligence (our internal data lake), improves development agility. Examples include quantifying the efficiency gained by migrating to optimized platforms (pre-compute queries and deployment automation) and identifying teams that could benefit from these services. Other examples include enabling teams to track increases in deployment velocity, increases in code coverage, and reductions in technical debt.

With the cost of engineering resources constantly on the rise, leaders must seek opportunities to increase the efficiency of software development. Quantifying software agility and baselining the maturity of software development teams has been a long-standing challenge because of the complexity of the development process and the varied forms its output takes. Providing visibility into the outcomes of software development enables teams to identify maturity opportunities within their own processes and to better understand the impact of changes.