About Us
XEqualTo Analytics is seeking a highly experienced Senior AWS Databricks Developer to design and manage robust data pipelines. This role requires a deep understanding of AWS services and the Databricks platform to build enterprise-grade data solutions.
About the Role
We’re seeking a seasoned Senior AWS Databricks Developer to join our Cloud Data Platform team. In this role, you will design, build and optimize large-scale data processing pipelines on Databricks, integrate them with AWS services, and collaborate with data scientists and analysts to deliver high-impact insights.
You’ll take ownership of the end-to-end lifecycle—from requirements gathering and data modeling through deployment, monitoring and performance tuning.
Responsibilities:
- Design, implement and maintain ETL/ELT pipelines in Databricks using Apache Spark (Scala, Python or SQL), ingesting data from diverse sources (S3, RDS, Kinesis, on-prem).
- Deploy notebooks or jobs via CI/CD (Git, Azure DevOps, or AWS CodePipeline), manage Databricks job schedules, and configure robust monitoring/alerting (CloudWatch, Databricks metrics).
- Partner with data science, BI and engineering teams to translate business requirements into technical solutions.
- Maintain clear Source-to-Target Mapping (STTM) documentation, runbooks and best-practice guidelines.
- Enforce data governance policies through Unity Catalog or Lakehouse permissions, implement encryption-at-rest/in-transit, and ensure compliance with organizational security standards.
Experience:
- 6+ years of experience in Data Engineering or a closely related field
- 4+ years of hands-on expertise in SQL—including complex queries, stored procedures, and user-defined functions—ideally on AWS RDS/Redshift and Databricks SQL
- 3+ years of working experience with AWS Databricks (Delta Lake, Unity Catalog, MLflow, Delta Live Tables)
- 3+ years of broader AWS experience (S3, Glue, EMR, Redshift, Lambda, IAM)
- Strong understanding of data modeling principles and data-warehousing best practices (star/snowflake schemas, partitioning strategies) in a Lakehouse architecture
- Proven ability to design and optimize large-scale Apache Spark pipelines (PySpark/Scala) for performance, cost efficiency, and scalability on Databricks
- Experience with infrastructure-as-code (Terraform or CloudFormation) to provision Databricks workspaces and AWS resources
- Familiarity with orchestration and ETL tools (AWS Glue, Apache Airflow, Step Functions) integrated into Databricks workflows
- Hands-on skill in monitoring and alerting using AWS CloudWatch metrics and Databricks job metrics
- Excellent problem-solving and analytical skills, with the ability to troubleshoot Spark job failures and data quality issues
- Strong communication and collaboration capabilities, partnering effectively with data science, analytics, and DevOps teams
Bonus Points:
- Experience with streaming data (Kafka, Kinesis, Spark Structured Streaming)
- Exposure to machine-learning workflows in Databricks
- Knowledge of Python libraries (Pandas, PySpark, scikit-learn)
- Understanding of DevOps practices: CI/CD pipelines, GitOps
What We Offer:
- 100% Remote Work – Work from anywhere in the world!
- Career Growth – Be part of impactful data projects that shape business strategies.
- Competitive Compensation – Recognizing your expertise and dedication.
- Flexible Work Arrangements – Start on a contract basis with a strong possibility of conversion to full-time.
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.