AWS Cloud and Spark Architect

Lovefreedom Solution
Location: Bloomfield, CT, USA
Published: 6/14/2022
Technology
Full Time

Job Description

Primary Job Title: Lead AWS Cloud Apache Spark Architect

Industry Sector: Enterprise Cloud Data Engineering and Big Data Analytics. We design, deploy and operate high-scale AWS-native data platforms and analytics pipelines for enterprise customers—supporting batch and real-time ML/BI workloads across finance, healthcare, and adtech. This is an onsite U.S. role focused on architecting secure, cost-efficient Spark-based processing at scale.

Role Responsibilities

  • Architect and deliver AWS-native big data platforms and data lake solutions using S3, EMR, Glue, Redshift and EKS—designing for performance, scale and resiliency.
  • Lead migration efforts from on-prem Hadoop/Cloudera ecosystems to AWS (EMR/EKS/Glue), defining cutover strategies, data validation, and rollback plans.
  • Optimize Apache Spark (PySpark/Scala) jobs and clusters for throughput, latency and cost—tuning shuffle, partitioning, memory/executor settings and job scheduling.
  • Implement IaC with Terraform/CloudFormation and production-grade CI/CD for data pipelines (Jenkins, GitLab CI), including automated testing and deployment safeguards.
  • Define and enforce security, governance and networking best practices (IAM, VPC design, encryption, data lineage, access controls) for enterprise workloads.
  • Mentor engineering teams, run architecture reviews, set operational runbooks, and drive capacity planning and observability standards.

Skills & Qualifications

  • Must-Have: 7+ years hands-on AWS experience (EMR, S3, Glue, Redshift, EC2) and deep Apache Spark expertise (PySpark and/or Scala) including production performance tuning and debugging.
  • Must-Have: Proven track record migrating on-prem Hadoop or legacy ETL to AWS and operating Spark in EMR/EKS at enterprise scale.
  • Must-Have: Strong IaC and CI/CD skills (Terraform/CloudFormation, Jenkins/GitLab/GitHub Actions), containerization (Docker) and Kubernetes/EKS experience.
  • Preferred: Experience with streaming (Kafka/Kinesis), Spark Structured Streaming, Delta Lake or Iceberg and event-driven architectures.
  • Preferred: Solid understanding of security and compliance (IAM, encryption, SOC2/HIPAA awareness), VPC/networking and observability tooling (CloudWatch, Prometheus, Grafana).
  • Preferred: Bachelor’s/Master’s in CS or related field and prior leadership/architect role in enterprise data platform projects.

Benefits & Culture Highlights

  • On-site U.S. role with ownership of high-impact modernization projects and visible cross-functional influence.
  • Engineering-first culture that values mentorship, technical excellence, and measurable business outcomes.
  • Learning and development support: conferences, certifications, and hands-on opportunities to build large-scale, production data systems.

Location & Work Type: United States, onsite (candidate must be based in or willing to relocate to the U.S. and work from the office).

Keywords: AWS, Apache Spark, EMR, PySpark, Scala, Terraform, EKS, Kafka, Glue, Redshift, S3, data-lake, streaming, performance tuning, migration, IaC, CI/CD, security, observability.
