Ventura Tec Insights

Senior Data Engineer

Written by Fernanda Buonfiglio | Nov 5, 2024 2:21:23 PM
 

Looking for a seasoned Tech Lead or Senior Data Engineer to optimize your data infrastructure?

This candidate brings extensive experience in distributed computing and data engineering, with a track record of high-performance solutions for large-scale data processing. They design resilient data pipelines and tune them for performance, so your data-driven operations run smoothly and cost-effectively.

Key Career Achievements:

  • Optimized Data Processing: Re-engineered a data processing application that handled over 7 million records during critical maintenance windows, cutting runtime from over 3 hours to 30 minutes by modernizing outdated components and improving data handling efficiency.
  • Credit Score System Migration: Led the migration of credit score models from mainframes to a distributed computing platform built on Apache Spark and AWS. Update times dropped from 24 hours to around 6 hours, yielding significant cost savings.
  • Enhanced Data Traffic Management: Developed a feature for a media platform that handles high-traffic events by redirecting data flow to alternate endpoints in a controlled way during peak times. Data collection continued without interruption while processing proceeded gradually, keeping the system stable during high-demand periods.

Professional Summary:

With expertise in distributed computing frameworks (Hadoop, Apache Spark), data pipeline tools (Kafka, Airflow), and cloud platforms (AWS, GCP, Azure), this engineer is well-equipped to lead data-driven projects in high-scale environments. They are proficient in CI/CD with GitLab, Jenkins, and GitHub Actions, and skilled in database development with PostgreSQL, Redshift, MongoDB, and other databases.

Tech Stack:

Distributed Computing & Data Pipelines: Hadoop, Apache Spark, Kafka, Airflow
Programming Languages: Scala, Python, Java
CI/CD Tools: GitLab, Jenkins, GitHub Actions, Bitbucket
Databases: PostgreSQL, Redshift, ClickHouse, Oracle, SQL Server, MongoDB, HBase
Cloud Platforms: AWS, GCP, Azure

Discover how this data engineering expert can elevate your data infrastructure.
Get in touch to talk with us.