Job Description
Position: Databricks Data Engineer
Location: Remote (U.S.-based); preference for candidates in, or willing to relocate to, Washington, DC or Indianapolis, IN for periodic on-site support
Citizenship Requirement: U.S. Citizen
Role Summary:
We are seeking a Databricks Data Engineer to develop and support data pipelines and analytics environments within an Azure-based data lake. The engineer will translate business requirements into scalable data engineering solutions and support ongoing ETL operations, with a focus on data quality and data management.
Key Responsibilities:
- Design, build, and optimize scalable data solutions using Databricks and Medallion Architecture.
- Develop ingestion routines for multi-terabyte datasets across multiple projects and Databricks workspaces.
- Integrate structured and unstructured data sources to enable high-quality business insights.
- Apply data analysis techniques to extract insights from large datasets.
- Implement data management strategies to ensure data integrity, availability, and accessibility.
- Identify and execute cost optimization strategies in data storage, processing, and analytics.
- Monitor and respond to user requests, addressing performance issues, cluster stability, Spark optimization, and configuration management.
- Collaborate with cross-functional teams to support AI-driven analytics and data science workflows.
- Integrate with Azure services, including:
  - Azure Functions
  - Azure Storage
  - Azure Data Factory
  - Azure Log Analytics
  - User management
- Provision and manage infrastructure using Infrastructure-as-Code (IaC).
- Apply best practices for data security, governance, and compliance, supporting federal regulations and public trust standards.
- Work closely with technical and non-technical teams to gather requirements and translate business needs into data solutions.
Preferred Experience:
- Hands-on experience with the above Azure services.
- Strong foundation in AI and machine learning technologies.
- Experience with Databricks, Spark, and Python.
- Familiarity with .NET is a plus.
Job Requirements:
To be considered for this position, candidates must possess:
Education & Experience:
- Bachelor’s degree in Computer Science or related field with 3+ years of experience, or
- Master’s degree with 2+ years of experience.
Technical Expertise:
- 3+ years of experience designing and developing ingestion flows for structured, streaming, and unstructured data using cloud platform services, with a focus on data quality.
- Databricks Data Engineer certification and 2+ years of experience maintaining the Databricks platform and developing in Apache Spark.
- Proficiency in Python, Spark, and R.
- Strong knowledge of data governance practices, including:
- Metadata management
- Enterprise data catalog
- Design standards
- Data quality governance
- Data security
Client Interaction & Communication:
- Ability to work directly with clients and provide front-line support.
- Skilled in documenting and presenting solutions using architecture and interface diagrams.
Development Methodologies:
- Experience with Agile methodologies.
- Familiarity with CI/CD automation and cloud-based development (Azure, AWS).
Preferred Qualifications:
- Certifications in Azure Cloud.
- Knowledge of FinOps principles and cost management.
Eligibility:
- This role requires U.S. citizenship and the ability to obtain a Public Trust clearance. No Corp-to-Corp (C2C) arrangements.
