Data Engineer

Amazon.com Services, Inc.
Seattle, WA

Description

Compensation Science is building economic models and algorithms from the ground up to design and scale pay for hundreds of thousands of Amazon employees worldwide. This fast-growing, interdisciplinary team works at the intersection of research, economics, machine learning, and product development. Our mission is to use science to assist and measurably improve every pay decision made at Amazon. The ideal candidate will be passionate about working with large datasets and have the expertise to use them to answer business questions and drive best-in-class analytics, dashboard design, and innovative approaches.

The ideal candidate will be entrepreneurial and innovative, thriving on solving challenging, ambiguous problems.

Key Responsibilities:

· Develop the end-to-end automation of data pipelines, making datasets readily consumable by machine learning platforms;

· Work with applied scientists to source data for machine learning algorithms;

· Participate in the design, development and evaluation of highly innovative, scalable models and algorithms;

· Manage AWS resources including EC2, Redshift, S3, etc., and explore and learn the latest AWS technologies to provide new capabilities and increase efficiency;

· Work with product and software engineering teams to manage the integration of successful models and algorithms in complex, real-time production systems at very large scale;

· Foster cross-team collaboration with other science, product, UX, and tech partners;

· Analyze and convey impact and results to senior management and stakeholders.

Basic Qualifications

· 3+ years of experience as a Data Engineer or in a similar role

· Experience with data modeling, data warehousing, and building ETL pipelines

· Experience in SQL

· Bachelor's degree in Computer Science, Engineering, Statistics, Mathematics, Finance or related field.

· Hands-on experience with Python/Scala, writing complex data transformations in Spark, and automating pipelines.

· Comfortable using AWS services such as EMR (Hive/Spark), Redshift, S3, Glue, and CloudFormation.

· Experience with data storage and compression on Hadoop file systems: S3 (EMRFS) and HDFS

· Exceptional technical writing and communication skills, with the ability to make research results understandable to non-technical audiences

· Highly adaptable, scrappy, and creative; thrives in a fast-paced, agile work environment

· Experience interfacing with other technology teams to extract, transform, and load data from a wide variety of data sources using SQL and AWS big data technologies

Preferred Qualifications

· Master’s degree in a related quantitative field (Mathematics, Statistics, Computer Science, Finance, Economics, or similar)

· Industry experience as a Data Engineer, Business Intelligence Engineer, Data Scientist, or related field with a track record of manipulating, processing, and extracting value from large datasets.

· Knowledge of software engineering best practices across the development lifecycle, including agile methodologies, coding standards, code reviews, source management, build processes, testing, and operations. Hands-on experience implementing scalable data lakes with real-time, near-real-time, and batch processing use cases.

· Excellent knowledge of advanced SQL and experience working with large datasets.

· Experience building data products incrementally and integrating and managing datasets from multiple sources.

· Experience with AWS technologies (S3, Redshift, QuickSight, etc.) and with Tableau Server deployment and maintenance.

· Familiarity with AWS Data Pipeline and AWS Lambda

· Comfortable with Linux environments.
