
Data Engineering With GCP

  • 24/7 Support
  • 5 Months
  • 120 Sessions

Course Description

Dive into the realm of modern data engineering with a comprehensive course designed to equip professionals with the skills and knowledge to harness the full potential of Google Cloud Platform (GCP). This course is tailored for data engineers, software developers, and IT professionals who seek to architect, build, and optimize data pipelines for efficient data processing, storage, and analysis using GCP services.

One-to-One Personalized Training Schedule for Data Engineering With GCP

EkasCloud provides flexible training to all its students. Here is our training schedule. In case you find these timings difficult, please let us know; we will try to arrange suitable timings based on your convenience.

28-04-2024 (Sunday) | Weekdays Regular (Monday - Friday) | 08:00 AM (IST) | 1 - 1.5 hrs per session
30-04-2024 (Tuesday) | Weekdays Regular (Monday - Friday) | 08:00 AM (IST) | 1 - 1.5 hrs per session
02-05-2024 (Thursday) | Weekdays Regular (Monday - Friday) | 08:00 AM (IST) | 1 - 1.5 hrs per session
03-05-2024 (Friday) | Weekdays Regular (Monday - Friday) | 08:00 AM (IST) | 1 - 1.5 hrs per session

Course Detail

The course covers the following areas:

  1. Foundations of Data Engineering on GCP:

    • Acquire a solid understanding of data engineering concepts, methodologies, and their applications.
    • Explore the core components of GCP, including BigQuery, Cloud Storage, and Cloud Dataprep, as the foundation for data engineering projects.
  2. Data Ingestion and Integration:

    • Learn best practices for data ingestion from various sources using GCP services such as Cloud Storage, Cloud Pub/Sub, and Dataflow.
    • Explore techniques for integrating and harmonizing diverse datasets to create unified and actionable insights.
  3. Building Scalable Data Pipelines:

    • Master the art of building scalable and resilient data pipelines with GCP's Cloud Composer and Cloud Dataflow.
    • Utilize Apache Beam for stream and batch processing, ensuring optimal performance and efficiency.
  4. Data Transformation and Cleaning:

    • Explore GCP tools like Cloud Dataprep and Dataflow for data transformation and cleaning.
    • Implement data wrangling techniques to ensure data quality and reliability in downstream processes.
  5. Data Storage and Management:

    • Understand the various storage options provided by GCP, including Bigtable, Cloud Storage, and Cloud SQL.
    • Explore best practices for data partitioning, indexing, and optimization for efficient data storage and retrieval.
  6. BigQuery for Data Warehousing:

    • Harness the power of BigQuery for building scalable and high-performance data warehouses.
    • Learn to optimize queries, partition data, and design schemas for effective data warehousing on GCP.
  7. Real-time Data Processing:

    • Explore real-time data processing using GCP's Pub/Sub and Dataflow for streaming analytics.
    • Implement real-time dashboards and monitoring solutions for actionable insights.
  8. Security and Compliance in Data Engineering:

    • Gain insights into securing data pipelines, managing access controls, and ensuring compliance with industry standards.
    • Understand encryption techniques and best practices for protecting sensitive data in transit and at rest.
  9. Monitoring and Optimization:

    • Learn how to monitor and optimize data pipelines for performance, cost efficiency, and reliability.
    • Implement logging, monitoring, and alerting solutions using GCP's operations suite (formerly Stackdriver) for proactive system management.
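Several of the ideas above, partitioning in particular, can be previewed without a GCP account. Here is a minimal pure-Python sketch of date-based partitioning, the concept underlying BigQuery's partitioned tables covered in sections 5 and 6 (record fields are illustrative):

```python
from collections import defaultdict
from datetime import date

# Toy event records, similar to rows that would land in a partitioned table.
events = [
    {"user": "a", "ts": date(2024, 4, 28), "amount": 10},
    {"user": "b", "ts": date(2024, 4, 28), "amount": 5},
    {"user": "a", "ts": date(2024, 4, 29), "amount": 7},
]

def partition_by_day(rows):
    """Group rows into per-day buckets: the core idea behind
    date-partitioned storage, where queries that filter on the
    partition column only scan the matching buckets."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row["ts"].isoformat()].append(row)
    return dict(buckets)

partitions = partition_by_day(events)
print(sorted(partitions))             # ['2024-04-28', '2024-04-29']
print(len(partitions["2024-04-28"]))  # 2
```

A query for 2024-04-28 would touch only one of the two buckets; BigQuery applies the same pruning at warehouse scale.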

Prerequisites:

Participants should have a basic understanding of data engineering concepts, programming (preferably Python or Java), and cloud computing fundamentals. Familiarity with GCP services is beneficial but not mandatory.

Who Should Enroll:

  • Data engineers and developers aiming to enhance their skills in data engineering using GCP.
  • IT professionals and software developers involved in building and managing data pipelines.
  • Anyone interested in leveraging GCP for scalable and efficient data processing and analysis.

Delivery Format:

This course is delivered through a blend of video lectures, hands-on labs, collaborative projects, and interactive discussions. Participants will have access to GCP environments for practical exercises, fostering a dynamic and engaging learning experience.

  • We check your knowledge before we start the sessions.
  • We build foundational topics first and core topics next.
  • Theory classes with a real-time case study.
  • A demo on every topic.
  • You will learn how to design architecture diagrams for each service.
  • A mock exam on every topic you cover.
  • Exam preparation.
  • Interview preparation.

Data Engineering With GCP Syllabus


  • 4 Months Course
  • 50% Theory, 50% Lab
  • Daily Homework
  • Real-time Projects

Topics Covered

Module 1: Introduction & Prerequisites

  • Course overview
  • Introduction to GCP
  • Docker and docker-compose
  • Running Postgres locally with Docker
  • Setting up infrastructure on GCP with Terraform
  • Preparing the environment for the course
  • Homework

Module 2: Workflow Orchestration

  • Data Lake
  • Workflow orchestration
  • Workflow orchestration with Mage
  • Homework
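As a taste of what an orchestrator does, here is a minimal sketch using Python's standard-library `graphlib`. The task names are hypothetical; real tools such as Cloud Composer (Airflow) or Mage add scheduling, retries, and monitoring on top of this core dependency-ordering idea:

```python
from graphlib import TopologicalSorter

# A toy DAG of pipeline tasks: each task maps to the set of upstream
# tasks that must finish before it can run.
dag = {
    "extract_api": set(),
    "extract_db": set(),
    "load_to_bq": {"extract_api", "extract_db"},
    "transform": {"load_to_bq"},
}

# static_order() yields tasks so that every task appears after
# all of its dependencies.
order = list(TopologicalSorter(dag).static_order())
print(order)
```

Both extracts appear before the load, and the load before the transform; an orchestrator executes (and retries) tasks in exactly this kind of order.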

Module 3: Data Warehouse

  • Data Warehouse
  • BigQuery
  • Partitioning and clustering
  • BigQuery best practices
  • Internals of BigQuery
  • Integrating BigQuery with Airflow
  • BigQuery Machine Learning
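For illustration, this is the shape of DDL you would run in BigQuery to create a date-partitioned, clustered table. The project, dataset, table, and column names are made up, and the sketch only assembles the statement as a string, so it runs without a GCP account:

```python
# Sketch: assemble BigQuery DDL for a date-partitioned, clustered table.
# All identifiers here are illustrative.
def partitioned_table_ddl(table, partition_col, cluster_cols):
    return (
        f"CREATE TABLE `{table}` (\n"
        "  event_ts TIMESTAMP,\n"
        "  user_id STRING,\n"
        "  amount NUMERIC\n"
        ")\n"
        f"PARTITION BY DATE({partition_col})\n"
        f"CLUSTER BY {', '.join(cluster_cols)}"
    )

ddl = partitioned_table_ddl("my-project.analytics.events", "event_ts", ["user_id"])
print(ddl)
```

Partitioning prunes whole date buckets at query time, while clustering sorts data within each partition so filters on the cluster columns scan less data.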

Module 4: Analytics engineering

  • Basics of analytics engineering
  • dbt (data build tool)
  • BigQuery and dbt
  • Postgres and dbt
  • dbt models
  • Testing and documenting
  • Deployment to the cloud and locally
  • Visualizing the data with Google Data Studio (Looker Studio) and Metabase

Module 5: Batch processing

  • Batch processing
  • What is Spark
  • Spark Dataframes
  • Spark SQL
  • Internals: GroupBy and joins
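The "internals" topic can be previewed with a pure-Python sketch of the build-and-probe hash join that engines like Spark use for equi-joins (the datasets and column names are illustrative):

```python
from collections import defaultdict

def hash_join(left, right, key):
    """Build a hash table on one input, then probe it with each row
    of the other: the basic strategy behind a hash equi-join."""
    table = defaultdict(list)
    for row in right:                    # build phase (smaller input)
        table[row[key]].append(row)
    joined = []
    for row in left:                     # probe phase
        for match in table[row[key]]:
            joined.append({**row, **match})
    return joined

users = [{"id": 1, "name": "ada"}, {"id": 2, "name": "bob"}]
orders = [{"id": 1, "total": 30}, {"id": 1, "total": 12}]
result = hash_join(orders, users, "id")
print(result)
```

Both orders match user 1, so the join yields two rows, each carrying the user's name; unmatched user 2 is dropped, as in an inner join.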

Module 6: Streaming

  • Introduction to Kafka
  • Schemas (Avro)
  • Kafka Streams
  • Kafka Connect and KSQL
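A toy in-memory "topic" illustrating Kafka's core idea: messages are appended to a log and each consumer tracks its own offset, rather than messages being removed from a queue (the class and field names here are our own):

```python
class Topic:
    """A minimal log-structured topic: produce appends, consume reads
    from a caller-supplied offset without removing anything."""

    def __init__(self):
        self.log = []

    def produce(self, msg):
        self.log.append(msg)

    def consume(self, offset):
        """Return all messages at/after offset, plus the new offset."""
        return self.log[offset:], len(self.log)

t = Topic()
t.produce({"event": "signup"})
t.produce({"event": "purchase"})
msgs, new_offset = t.consume(0)   # first consumer reads from the start
again, _ = t.consume(0)           # a second consumer re-reads the same log
print(len(msgs), new_offset)      # 2 2
```

Because the log is never drained, independent consumers (and replays after a failure) just read from their own offsets, which is why Kafka pairs naturally with the streaming tools in this module.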


Module 7: Project

  • Putting everything we learned to practice


 

Instructor

Huzefa Mohammed

Huzefa has a Bachelor's degree in Engineering. He is based in South India, has trained more than 9,000 students over the last fifteen years on technologies ranging from networking to cloud, and is deeply passionate about their success.

Huzefa is AWS Solutions Architect Professional, AWS Security, Azure Expert, and CCSP certified.

 

Frequently Asked Questions

Q: What if I miss a class?
A: We will pause the course for you until you return, since classes are one-to-one or one-to-two students only.

Q: What if I am not an Engineer/Programmer? Can I still do the Data Science course?
A: Data Science is not a separate domain but a tool/technology which can be used in any field. Our course is designed to address the needs of non-programmers and candidates who have no IT knowledge. Anyone who has an interest in Data Science can take up this course.

Q: Will I get placement assistance?
A: Yes. You will get it once you finish the course.

Q: What if I have queries after I complete this course?
A: You can check our blog or send your queries via social media such as Facebook, LinkedIn, Instagram, Twitter, and YouTube.

Q: How soon after signing up will I get access to the course?
A: Once you join the course, our counselor will book a slot with our trainer based on your and the trainer's availability.

Q: Is the course material accessible to the students even after the course training is over?
A: Yes, you can access our course material even after the training is over. You can also watch short videos on our YouTube channel.

Q: What is the average salary of a Data Engineering professional?
A: $229,868


Data Engineering With GCP Fees