Farzad Jamil
Programming Languages & Technologies
- Python, Java, Spark
- UNIX/Linux, Shell Scripting
- Informatica PowerCenter
- SAP HANA, QuickBooks
- MySQL, NoSQL
- Glue, Lambda, Crawler
- Apache Airflow, DAG
- RESTful API, Kafka
- Hadoop, YARN, EMR
- Redshift, Teradata, RDB
- CloudWatch, EventBridge
- S3, Google Cloud Storage
- AWS, GCP (Trained)
- Athena, Ad-Hoc Query
- ERWin, ER Studio, MDM
- Power BI, QuickSight
Optimization
Database partitioning, indexing, compression, normalization, caching, parallel processing
Strategic Skills
Agile Project Management, Cost Optimization, Resource Utilization, Data Governance
Relevant Experience
Cognizant Technology Solutions | [AUG 2021 – JUN 2024]
AI & Data Engineer
Project: PepsiCo | Data Engineer
- Collaborated with stakeholders and cross-functional teams to assess specific, measurable, actionable, relevant, and time-bound key performance indicators, informing the strategic planning and execution of database engineering requirements.
- Implemented Agile methodology and the STAR method for planning database design projects and reporting progress to stakeholders via Jira.
- Designed and implemented scalable data solutions using AWS services, integrating multiple heterogeneous data sources into a unified data warehouse.
- Built and deployed RESTful APIs to extract KPIs from middleware systems (Salesforce, SAP HANA, Oracle) and load them into an S3 data lake.
- Developed and optimized ETL pipelines using Informatica PowerCenter Designer, enhancing query performance by 40% and reducing Redshift costs by 10% through effective indexing.
- Engineered custom data solutions with AWS Athena, enabling ad-hoc analysis for data scientists and analysts.
- Monitored and optimized resource utilization and performance using CloudWatch, ensuring efficient ETL job scheduling and data processing in Informatica PowerCenter.
- Tuned ETL job schedules to improve resource utilization, managed data processing units (DPUs) in AWS Glue to balance cost and performance, and adjusted Redshift cluster capacity for query performance and storage efficiency.
- Planned and integrated Redshift Spectrum and S3 storage with auto-tiering strategies to categorize hot and cold data, enabling efficient reporting in Power BI and QuickSight.
Project: PepsiCo | Associate Data Engineer
- Contributed to the design and implementation of Terraform infrastructure on AWS, establishing secure data pipelines between AWS and GCP.
- Developed data warehouse and big data solutions incorporating automated MDM and data quality capabilities, utilizing industry-standard data modeling tools (ERWin, ER Studio).
- Developed and maintained ETL pipelines using Python and SQL to process over 3 TB of data daily, improving data accuracy and processing speed by 40%.
- Implemented data warehousing solutions using AWS services, enabling real-time data access for business intelligence teams.
- Created complex SQL queries and optimized existing scripts to improve data extraction and transformation processes, resulting in a 50% performance improvement.
- Collaborated with cross-functional teams, including BI engineers, software developers, and product managers, to deliver data-driven solutions that supported business growth.
Education
Texas A&M University, College Station, TX
B.Sc. in Computer Engineering, minor in Computer Science [Graduation: 2023]