Learn the principles of effective data engineering. Build your skills in the high-demand field of data engineering and learn how you can deliver real business value by applying a core set of principles and strategies for developing data systems.
Introduction to Data Engineering
In the last decade, pretty much every industry has become digital. Digital communications are pervasive, and digital data is replacing pieces of paper as the primary mechanism by which information is stored in healthcare, finance, manufacturing, education, tech, and pretty much every other industry.
But with this tidal wave of data come risks and new challenges for how to store and process all that data to create value. Whether your organization is looking to improve how it serves customers or outmaneuver the competition, having good data pipelines is increasingly critical, which is why there's also skyrocketing demand for data engineers.
Data-Centric AI:
The discipline of systematically engineering the data used to build an AI system.
Program Outline
Course 1: Introduction to Data Engineering
- Identify key upstream and downstream collaborators and stakeholders for data engineers
- Articulate a mental framework for building data engineering solutions
- Identify some of the necessary considerations for requirements gathering at the start of a new project
- Describe the structure of the data engineering lifecycle and its undercurrents, and how to think about data engineering problems through this lens
- Identify some of the key technologies that can be employed in different stages of the data engineering lifecycle
- Evaluate technologies and tools against the context of requirements and good data architecture
- Design a data architecture on AWS based on stakeholder requirements
- Implement a batch and streaming pipeline on AWS to support a product recommendation system
Course 2: Source Systems, Data Ingestion, and Pipelines
- Identify different data formats and determine appropriate source systems for generating each type of data
- Explain at a high level how data is generated, stored, and retrieved in various source systems, including relational databases, NoSQL databases, object storage, and streaming systems
- Explain the basics of cloud networking