Data Engineer

Los Angeles, CA, United States; New York, NY, United States; Remote • $130k - $180k


Role Locations

  • Los Angeles, CA, United States
  • New York, NY, United States
  • Remote


101 - 250 people


5 Pennsylvania Plaza
New York City, NY, 10001, US

Tech Stack

  • JavaScript
  • React
  • Python
  • Java
  • Go
  • AWS

Role Description


Your primary responsibility will be developing transformation logic against disparate datasets in the Aetion Evidence Platform. You will work closely with our Product and Science teams to develop custom transformation logic for longitudinal data, written in Java, Python, Scala, and/or R and executed on a Spark cluster. You will also be integral to developing and enhancing our platform and its connections to Spark and other big data infrastructure.

Duties include, but are not limited to:

  • Develop transformation logic to convert disparate datasets into Aetion’s proprietary format.

  • Work with the Science team to develop transformations in Spark SQL and UDFs executed over a Spark cluster.

  • Assess, develop, troubleshoot, and enhance our measure system, which uses a combination of Java, Scala, and Python.

  • Work on a full-stack rapid-cycle analytic application.

  • Develop highly effective, performant, and scalable components capable of handling large amounts of data for over 100 million patients.

  • Work with the Science and Product teams to understand and assess client needs, and to ensure optimal system efficiency.

  • Take ownership from software development and prototyping through implementation.

  • Build proprietary cloud-based big data analytics for healthcare and improve core back-end and cloud-based data services.

Qualifications Required:

  • Bachelor’s degree or equivalent in Computer Science, Computer Engineering, Information Systems, or a related field.

  • 3 - 5 years of experience (or equivalent) in the position offered or a related position, including 2 years of experience designing, developing, and maintaining large-scale data ETL pipelines using Java/Scala on AWS, Hadoop, and Spark, and using Databricks to manage Apache Spark infrastructure.

  • Experience working with programming languages such as Java, Python, SQL, and Scala.

  • Experience with or knowledge of building and optimizing ETL pipelines.

  • Experience building systems with large data sets.

  • Experience with or working knowledge of distributed systems.

  • Experience translating requirements from Product and DevOps teams into technology solutions following the SDLC.

About Aetion

Aetion is a health care analytics SaaS platform that informs health care’s most critical decisions—what works best, for whom, and when.

Company Culture

  • Lead at all levels
  • Lifelong students
  • Eclectic collective
  • Facts not flash
  • Thoughtful
  • Own it