Live Course Module: Secoda Course for Data Engineering
Total Duration: 32 Hours (4 Weeks)
Week 1: Foundations of Secoda & Metadata / Catalog Basics
Total Time: ~8 hours
- Introduction & Context (1 hr)
  - What is Secoda? Concepts: data catalog, metadata, governance, observability
  - Why metadata and data catalogs matter in a data engineering stack
- Secoda Architecture & Components (1.5 hrs)
  - Key modules: catalog, lineage, monitoring, governance, AI agents
  - Metadata ingestion, integrations, and how Secoda connects to data sources (databases, BI tools, ETL pipelines)
- Setup & Initial Integration (2 hrs – Lab)
  - Provisioning a Secoda workspace / account
  - Connecting a sample data warehouse / data lake (e.g. PostgreSQL, Snowflake, BigQuery)
  - Ingesting metadata (schemas, tables, columns) and lineage
- Exploring Catalog & Discovery (1.5 hrs)
  - Navigating the dashboard, search interface, glossary, metadata viewer
  - Natural language / AI search capabilities (Secoda AI)
- Hands-On & Assignment (2 hrs)
  - Ingest metadata for a sample dataset
  - Search and explore lineage for a selected table
  - Assignment: Document one dataset (tables, columns, descriptions) in Secoda
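Exploring lineage for a selected table, as in the Week 1 hands-on session, comes down to walking a dependency graph. The sketch below is a minimal, tool-agnostic illustration in Python. It does not call Secoda's API, and the table names, the `LINEAGE` mapping, and the `downstream` helper are all hypothetical.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the tables listed as its value.
LINEAGE = {
    "raw.orders": ["staging.orders"],
    "raw.customers": ["staging.customers"],
    "staging.orders": ["marts.order_facts"],
    "staging.customers": ["marts.order_facts", "marts.customer_dim"],
    "marts.order_facts": [],
    "marts.customer_dim": [],
}


def downstream(table, edges):
    """Return every table reachable from `table`, in breadth-first order."""
    seen = []
    queue = deque(edges.get(table, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.append(current)
        queue.extend(edges.get(current, []))
    return seen


if __name__ == "__main__":
    print(downstream("raw.customers", LINEAGE))
```

A catalog's lineage view answers the same question interactively: which assets are affected if this table changes.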
Week 2: Data Lineage, Observability, and Monitoring in Secoda
Total Time: ~8 hours
- Lineage Concepts & Implementation (1.5 hrs)
  - Table-level and column-level lineage
  - Auto lineage vs manual contributions in Secoda
  - Understanding lineage graphs and dependency chains
- Data Observability & Quality (1.5 hrs)
  - Secoda’s monitoring / observability modules
  - Data Quality Score (DQS): concept, metrics, scoring method
- Alerts, Anomalies and Data Health (1.5 hrs)
  - Setting up monitors, alerting thresholds
  - Detecting anomalies, alert workflows
- Hands-On Lab (2 hrs)
  - Create lineage for an ETL pipeline (ingestion → transformation → table)
  - Configure a monitor on a metric (e.g. row count change)
  - Trigger an alert in case data health degrades
- Assignment (1.5 hrs)
  - Define 2–3 key metrics to monitor in a dataset
  - Document the lineage and link to metrics in Secoda
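The row count monitor from the Week 2 lab can be reasoned about as a simple threshold check. The following Python sketch is an assumption-laden illustration, not Secoda's monitor implementation: the `row_count_alert` function and the 20% default threshold are hypothetical choices made for the example.

```python
def row_count_alert(previous, current, max_change=0.2):
    """Return an alert message if the row count moved by more than
    `max_change` (a fraction of the previous count), otherwise None."""
    if previous <= 0:
        raise ValueError("previous row count must be positive")
    change = (current - previous) / previous
    if abs(change) > max_change:
        return f"row count changed by {change:+.0%} ({previous} -> {current})"
    return None


if __name__ == "__main__":
    print(row_count_alert(1000, 1100))  # within the threshold
    print(row_count_alert(1000, 700))   # large drop
```

A fixed threshold is the simplest option. Anomaly detection, listed in the Alerts section, would typically replace it with a baseline learned from historical values.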
Week 3: Governance, Access, and Automation with Secoda
Total Time: ~8 hours
- Governance and Permissions (1.5 hrs)
  - Role-based access control (RBAC), team/permission settings
  - Data classification, PII tagging, masking in Secoda
- Workflows & Automation (1.5 hrs)
  - Automating metadata updates, syncs, documentation workflows
  - Secoda agents: documentation agent, automation agent, governance agent
- Data Cost Governance (1 hr)
  - Secoda’s concept of data cost governance: identifying underused assets, cost optimization
- Version Control & Collaboration (1 hr)
  - Git integration for metadata changes, versioning in Secoda
  - Collaboration (comments, ownership, change reviews)
- Hands-On & Assignment (2 hrs)
  - Define roles and access for datasets
  - Create a workflow: On new table ingestion, auto-tag, notify steward
  - Assignment: Design a governance plan for a sample data domain
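The Week 3 workflow exercise (on new table ingestion, auto-tag, notify steward) can be prototyped outside any tool to clarify the logic first. The Python sketch below assumes a naive name-based PII heuristic. The patterns, `tag_columns`, and `on_table_ingested` are hypothetical and are not part of Secoda's automation features.

```python
import re

# Hypothetical naming patterns that suggest a column holds personal data.
PII_PATTERNS = {
    "email": re.compile(r"e[-_]?mail", re.IGNORECASE),
    "phone": re.compile(r"phone|mobile", re.IGNORECASE),
    "name": re.compile(r"(first|last|full)_?name", re.IGNORECASE),
}


def tag_columns(columns):
    """Map each column whose name matches a pattern to a suggested PII tag."""
    tags = {}
    for column in columns:
        for tag, pattern in PII_PATTERNS.items():
            if pattern.search(column):
                tags[column] = tag
                break  # first matching pattern wins
    return tags


def on_table_ingested(table, columns, steward, notify=print):
    """Tag a newly ingested table and notify its steward if PII is suspected."""
    tags = tag_columns(columns)
    if tags:
        notify(f"{steward}: review PII tags on {table}: {sorted(tags)}")
    return tags


if __name__ == "__main__":
    on_table_ingested("crm.contacts", ["id", "email", "first_name"], "data-steward")
```

Passing `notify` as a parameter keeps the sketch testable. In practice it would be swapped for a Slack or webhook call, both of which appear in the tooling list for this course.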
Week 4: Advanced Use, Embedding, and Capstone Project
Total Time: ~8 hours
- Advanced Features: Secoda AI & Agents (1.5 hrs)
  - Using Secoda AI for natural language queries, insights, suggestions
  - Memory agent, search agent usage
- Embedding & External Interfaces (1 hr)
  - Embedding metadata / lineage in external apps or portals
  - APIs and webhooks
- Performance & Scaling (1 hr)
  - Scaling integrations, metadata volume handling
  - Best practices for large data environments
- Capstone Project (3 hrs)
  - Build a full Secoda setup for a data domain:
    - Ingest metadata + lineage
    - Define monitors / data quality metrics
    - Set governance policies & workflows
    - Enable AI search and user access
  - Present project: catalog, lineage, alerts & governance flow
- Review & Q&A (1.5 hrs)
  - Review projects
  - Discuss challenges & best practices
  - Wrap-up and next steps in adoption
🔍 Deliverables & Assessments
- Mini Tasks / Assignments each week (ingestion, lineage, monitoring, governance)
- Capstone Project at end of Week 4
- Quizzes / checkpoints on key concepts (metadata, lineage, RBAC, observability)
- Tooling / integrations covered: data warehouses, ETL pipelines, Git, API, Slack/webhooks