Amazon’s Onsite, Offsite, and Perception Measurement (O2PM) team (a part of Customer Behavior Analytics) is hiring a talented, self-directed Data Engineer to support the rapid growth of our Marketing Measurement solutions. You will design, develop, implement, test, document, and operate large-scale, high-volume, high-performance data structures and pipelines for our internal customers. Implement data structures using best practices in data modeling and ETL/ELT processes. Gather business and functional requirements and translate these requirements into robust, scalable, operable solutions that work well within the overall data architecture. Analyze source data systems and drive best practices in source teams. Participate in the full development life cycle, end-to-end, from design, implementation and testing, to documentation, delivery, support, and maintenance. Produce comprehensive, usable dataset documentation and metadata. Set up and maintain a compliant system of credential authentication. Evaluate and make decisions around dataset implementations and new or existing software products and tools. Educate and mentor scientists in best data querying practices to improve efficiencies and accelerate.
The ideal candidate relishes working with a science team to develop scalable products that ingest large volumes of data to understand customer preferences, enjoys working independently and taking on the challenge of highly complex technical contexts, and, above all else, is passionate about data and analytics. They are expert in data modeling, ETL design, and business intelligence tools and passionately partner with the business to identify strategic opportunities where improvements in data infrastructure create outsized business impact. They are a self-starter, comfortable with ambiguity, able to think big (while paying careful attention to detail), and enjoy working in a fast-paced team. The ideal candidate needs to possess exceptional technical expertise in large-scale data warehouses, data lakes, and BI systems, with hands-on knowledge of SQL, distributed/MPP data storage, and AWS services (S3, Redshift, EMR, RDS).
Key job responsibilities
- Design, implement, and support a platform providing ad hoc access to large datasets
- Interface with other technology teams to extract, transform, and load data from a wide variety of data sources using SQL
- Implement data structures using best practices in data modeling, ETL/ELT processes, SQL, and Redshift
- Build robust and scalable data integration (ETL) pipelines using SQL, Python, Spark and Scala
- Build and deliver high quality datasets to support business analysis and customer reporting needs
- Interface with internal customers and scientists, gathering requirements and delivering complete data structures
About the team
The Customer Behavior Analytics (CBA) organization owns Amazon’s insights pipeline, from data collection to deep analytics. We aspire to be the place where Amazon teams come for answers, a trusted source for data and insights that empower our systems and business leaders to make better decisions. Our outputs shape Amazon product and marketing teams’ decisions and thus how Amazon customers see, use, and value their experience.
- Degree in Computer Science, Engineering, Mathematics, or a related field or 4+ years industry experience
- 3+ years of experience as a Data Engineer
- Experience with data modeling, data warehousing, and building ETL pipelines
- Experience writing complex, highly-optimized SQL queries across large data sets
- Experience with AWS technologies such as Redshift, EMR, and S3
- Coding proficiency in at least one modern programming language or framework (e.g., Python, Scala, Spark)
- Experience building/operating large scale pipelines using distributed systems for data extraction, ingestion, and processing of large data sets
- Experience building data products incrementally and integrating and managing datasets from multiple sources
- Experience working with a science team and/or familiarity with Machine Learning
- Familiarity with Spark and/or Scala
- Experience leading large-scale data warehousing and analytics projects, including with AWS technologies such as Redshift, S3, EMR, and Glue
- Experience providing technical leadership and mentoring scientists and other engineers on best practices in the data engineering space
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.