Job Description
Position: Databricks Data Engineer
Location: Remote (U.S.-based); preference for candidates in, or willing to relocate to, Washington, DC or Indianapolis, IN for periodic on-site support
Citizenship Requirement: U.S. Citizen
Role Summary:
We are seeking a Databricks Data Engineer to develop and support data pipelines and analytics environments within an Azure-based data lake. The engineer will translate business requirements into scalable data engineering solutions and support ongoing ETL operations, with a focus on data quality and data management.
Key Responsibilities:
- Design, build, and optimize scalable data solutions using Databricks and Medallion Architecture.
- Develop ingestion routines for multi-terabyte datasets across multiple projects and Databricks workspaces.
- Integrate structured and unstructured data sources to enable high-quality business insights.
- Apply data analysis techniques to extract insights from large datasets.
- Implement data management strategies to ensure data integrity, availability, and accessibility.
- Identify and execute cost optimization strategies in data storage, processing, and analytics.
- Monitor and respond to user requests, addressing performance issues, cluster stability, Spark optimization, and configuration management.
- Collaborate with cross-functional teams to support AI-driven analytics and data science workflows.
- Integrate with Azure services, including:
  - Azure Functions
  - Azure Storage
  - Azure Data Factory
  - Azure Log Analytics
  - User management
- Provision and manage infrastructure using Infrastructure-as-Code (IaC).
- Apply best practices for data security, governance, and compliance, supporting federal regulations and public trust standards.
- Work closely with technical and non-technical teams to gather requirements and translate business needs into data solutions.
Preferred Experience:
- Hands-on experience with the above Azure services.
- Strong foundation in AI and machine learning technologies.
- Experience with Databricks, Spark, and Python.
- Familiarity with .NET is a plus.
Job Requirements:
To be considered for this position, candidates must possess:
Education & Experience:
- Bachelor’s degree in Computer Science or related field with 3+ years of experience, or
- Master’s degree with 2+ years of experience.
Technical Expertise:
- 3+ years of experience designing and developing ingestion flows for structured, streaming, and unstructured data using cloud platform services, with a focus on data quality.
- Databricks Data Engineer certification and 2+ years of experience maintaining the Databricks platform and developing in Apache Spark.
- Proficiency in Python, Spark, and R.
- Strong knowledge of data governance practices, including:
- Metadata management
- Enterprise data catalog
- Design standards
- Data quality governance
- Data security
Client Interaction & Communication:
- Ability to work directly with clients and provide front-line support.
- Skilled in documenting and presenting solutions using architecture and interface diagrams.
Development Methodologies:
- Experience with Agile methodologies.
- Familiarity with CI/CD automation and cloud-based development (Azure, AWS).
Preferred Qualifications:
- Certifications in Azure Cloud.
- Knowledge of FinOps principles and cost management.
Eligibility:
- This role requires U.S. citizenship and the ability to obtain a Public Trust clearance. No Corp-to-Corp (C2C) arrangements.
