The position requires you to source data from internal and third-party systems for storage in the data warehouse, conduct code reviews, and optimize SQL code. DevOps experience is highly beneficial. Additionally, you will support the data science, BI, and insights teams in troubleshooting issues.
- Review and optimize the SQL in existing deployed ETL scripts on Redshift.
- Review Python scripts deployed for ETL pipelines.
- Maintain and manage the Redshift cluster.
- Monitor ETL jobs running on Airflow.
- Deploy automations and alerts for monitoring the infrastructure and for reporting data anomalies in business metrics.
- Build APIs for serving data products to customers.
- Manage infrastructure cost efficiency.
- Bachelor’s degree in Computer Science, Software Engineering, or a related field.
- At least 4 years of experience as a Data Engineer.
- Cloud certifications - Solution Architect (nice to have).
- API development experience using Python.
- SQL and data modeling experience.
- Experience building data processing pipelines in Python for handling unstructured data.
- Experience managing an Apache Airflow setup.
- Experience using AWS CloudWatch, Redshift, Athena, ECS, and EC2.
- Experience using Docker to create production-optimized containers.
- Experience with Terraform (infrastructure as code) is a plus.
- Experience with Apache Kafka is a plus.