EARNEST ANALYTICS
Earnest Analytics is a VC-backed data innovation startup driven to change the way professionals understand consumer and business behavior. Working with world-class data partners, we transform raw data into a source for business and investment professionals to ask better questions so they can make better decisions. We believe, in the right hands, data has the power to change the way we work.
SENIOR DATA ENGINEER
Earnest Analytics is seeking a Senior Data Engineer to join our Datasets Team. The Datasets Team is responsible for the ingestion, transformation, and productization of all our datasets. As a Senior Data Engineer on our Datasets Team, you will be instrumental in creating the next generation of Earnest’s products and play a leading role in building our internal and client-facing data pipelines, infrastructure, and tooling. This is a chance to work on a cross-functional team with modern managed cloud services, functional programming, and lots of data. The work we do will directly drive Earnest’s next stage of growth.
Technologies
- We work in a DevOps environment where teams own their code and infrastructure in Production.
- The technologies we use across Earnest include, but are not limited to, Google Cloud Platform, Scala, Python, BigQuery, Docker, Airflow, and Kubernetes.
- We favor type-safe, functional programming languages and tend towards Scala for building distributed data pipelines.
- We also want to hear your ideas on the latest-and-greatest technologies we could use!
RESPONSIBILITIES
- Collaborate with product owners and data analysts in the development and delivery of new product features across a multitude of datasets
- Build and maintain integrated data pipelines, systems, and internal tooling in functional Scala, Python, and SQL to power the company’s products
- Define ETL/ELT logic for processing terabytes of raw data, including writing BigQuery SQL and Dataflow (Apache Beam) jobs and orchestrating Airflow tasks
- Ensure high data quality and pipeline stability by maintaining a high-quality code base with high test coverage
- Work with the engineering organization to build Earnest’s data platform, in particular interfacing with our data science group
- Assist analysts with troubleshooting data issues and leverage technology to increase their productivity
QUALIFICATIONS
Required:
- Experience processing large amounts of structured and semi-structured data
- Programming experience in Scala/Java, Python, SQL
- 2+ years of experience writing and maintaining ETL at terabyte scale
- 1+ years of experience working with Hadoop applications (Spark/Scalding) or Dataflow (Apache Beam)
- Experience with version control systems (Git) and CI/CD practices
- Substantial SQL and data modeling experience, particularly focussed on efficient transformations
- Industrious and conscientious with the ability to work both independently and in a collaborative environment
- Effective interpersonal, written and verbal communication with engineers and non-engineers
Preferred:
- Knowledge of Google Cloud Platform (GCP), especially BigQuery, Dataflow and GKE
- Knowledge of Amazon Web Services (AWS), especially EMR, and columnar storage-style databases including Snowflake
- Code-based orchestration and scheduling of data transformations with Apache Airflow or similar tools
- Scala experience, either with microservices or distributed big data transformation tools like Spark/Scalding
- Experience with Docker containerization and Kubernetes
- Experience with cross-timezone code reviews and CI/CD toolchains
- Experience with infrastructure as code and tools like Terraform
- Knowledge of statistics and analytics
- Data warehouse modeling experience
- Experience with or willingness to learn functional programming paradigms
- Experience with unit testing, property checking, and type-driven development
- Experience automating data quality checks with dbt (data build tool), Great Expectations, or similar tools
Benefits & Perks
- A strong tech community with training and support to develop your skills
- Ability to make an immediate impact on our products
- Input into the architectural design and technologies used in our platform
- Distributed environment, flexible working arrangements, competitive salary, generous annual leave, health insurance for you and your family, pension/401(k) plans with employer matching, and a health and fitness reimbursement program