Senior Data Engineer
Predoc
Data Science
United States
Posted on Sep 3, 2025
Overview
We are looking for a U.S.-based (onshore), remote Senior Data Engineer to join our team. You will be responsible for designing, developing, and maintaining the data pipelines and structures that support our data science practice and customer-facing data products.
Responsibilities
- Design and Development: Design, develop, test, and maintain data pipelines using Python and orchestration tools such as Airflow or Kestra.
- Code Quality: Write clean, maintainable, and efficient code, following best practices for coding standards, security, testing, and deployment.
- Database Management: Design and optimize database tables, write efficient SQL queries, and manage database migration scripts.
- Documentation: Develop and maintain documentation of data sources, their composition requirements, and the transformations applied from ingestion to final data structures. Ensure accurate tracking throughout the data lifecycle.
- Collaboration: Work with the product team, data scientists, and other stakeholders to define and implement data solutions using new and existing data sources and technologies.
- Continuous Improvement: Participate in code reviews, contribute to team learning, and stay current with industry trends and technologies.
Technical Qualifications
- Python
  - Demonstrable experience in data-focused libraries (Pandas, DuckDB, etc.)
  - Experience working with DAGs or equivalent structures
  - Experience in process automation in Python
  - Experience integrating with third-party APIs
- Proven Experience in a Data Engineering or Similar Role (ideally 5+ years)
- Understanding of Data Lineage and Strategies
- ETL Pipeline Design/Development
- Data Modeling Experience
- Experience Building Scalable Data Lakes/Warehouses
- Experience analyzing and organizing large data sets
- Experience in Event-Based Data Processing
- Strong Data Documentation Experience
- Strong SQL (Postgres RDBMS) Experience
  - Table design and optimization
  - Advanced Query Building and Optimization
  - Advanced Data Aggregation Strategies
- Experience with ETL/Workflow Automation & Tools (Kestra, Airflow, or similar)
- Git SCM (GitLab)
- Experience in Regulated Industries (Healthcare, Banking, etc.)
Bonus:
- AWS (S3, Step Functions, Batch, Athena, Glue)
- Experience in Data Analysis
- Experience working with Data Science/ML teams
- Experience with TypeScript