Pocket empowers people to discover, organize, consume, and share content that matters to them. Our apps and platform are essential ways that tens of millions of people discover and consume content on the web. Pocket is the Web, curated: for you and by you.

The opportunity
For content recommendations, everything starts with data. Pocket's Data Products team builds systems that combine machine learning with editorial expertise to surface high-quality content from across the internet. Ensuring data privacy when collecting, distributing, validating, and securing data at scale is no small task, and every engineer on our team plays a vital role in shaping each user's experience.
We are looking for a Lead Data Pipeline Engineer to own the design and development of data pipeline applications for complex, extensible, and highly scalable cloud-based data platforms. Are you passionate about building intuitive data models? Do you excel at taking vague requirements and crystallizing them into scalable data solutions? We invite you to apply!
People who excel on our team thrive in small, dynamic environments. We cover many areas, including machine learning, product engineering, machine learning operations, and data modeling, among others.

Who You Are

- Enjoy working on small, dynamic teams.
- Understand the data lifecycle and concepts such as lineage, governance, privacy, retention, and anonymity.
- Conceptually familiar with AWS cloud resources (S3, EC2, RDS, etc.).
- A trusted authority in distributed data processing patterns.
- Highly proficient in at least one of Java, Python, or Scala.
- Comfortable with complex SQL.
- Experience designing, building, and maintaining data lakes.

What You'll Do

- Build and maintain data pipeline applications.
- Design, create and maintain the data platform data model at the conceptual, logical, and physical levels.
- Establish data security, quality, load, transport and performance models.
- Research, design, document and modify data pipeline software specifications throughout the production life cycle.
- Develop and maintain stakeholder documentation, operations procedures, programs, security, etc., and assist in eliminating redundancy and automating manual processes.
- Assist in developing standards and criteria for the successful implementation of new systems.
- Perform code reviews and mentor other engineers.
Technologies you'll work with

- Cloud warehouses: Snowflake, BigQuery, Redshift
- Feature stores: SageMaker, Databricks, Vertex
- Orchestrators: Airflow, Prefect
- Compute frameworks: AWS Glue, Spark, Hadoop, Athena
- Streaming data: Kinesis, Kafka
- Data modeling: dbt

Commitment to diversity, equity, inclusion, and belonging

Mozilla understands that valuing diverse creative practices and forms of knowledge is crucial to, and enriches, the company's core mission. We encourage applications from everyone, including members of all equity-seeking communities, such as (but certainly not limited to) women, racialized and Indigenous persons, persons with disabilities, and persons of all sexual orientations, gender identities, and expressions.
We will ensure that qualified individuals with disabilities are provided reasonable accommodations to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment, as appropriate. Please contact us at hiringaccommodation@ to request accommodation.
We are an equal opportunity employer. We do not discriminate on the basis of race (including hairstyle and texture), religion (including religious grooming and dress practices), gender, gender identity, gender expression, color, national origin, pregnancy, ancestry, domestic partner status, disability, sexual orientation, age, genetic predisposition, medical condition, marital status, citizenship status, military or veteran status, or any other basis covered by applicable laws. Mozilla will not tolerate discrimination or harassment based on any of these characteristics or any other unlawful behavior, conduct, or purpose.
Req ID: R1866