Key Responsibilities:
- Automate data quality and reconciliation checks across varied storage layers, including Snowflake, SQL, and RDF/SPARQL databases
- Test and verify data lineage, governance, and visualization components using Snowflake, data catalogs (e.g., DataHub), ThoughtSpot, and other visualization tools
- Integrate test suites into the core infrastructure, which is orchestrated by Apache Airflow and uses the Iceberg table format, and monitor data pipeline health, alerting, and observability metrics with Prometheus and Grafana Cloud
- Establish AI evaluation loops (evals) and guardrails: build rigorous verification protocols (including structural tests, checks, and watchdog agents) to validate AI-generated artifacts, catch false positives, and ensure all automated outputs are secure, reliable, and free from hallucinations
- Integrate automated testing workflows into CI/CD pipelines using GitHub Actions, ensuring continuous stability and quality gates across all deployment environments
- Validate ETL and dbt transformations across data lakehouses, rigorously testing data progression through a Medallion architecture
- Test and automate complex API workflows, validating data payloads across OpenAPI integrations, third-party APIs, GraphQL, and AWS APIs (specifically S3)

Must Haves:
- Data engineering & data testing: dbt, data lakehouse concepts, Medallion architecture
- Databases & storage testing: SQL, Snowflake, AWS S3, Iceberg
- Integrating quality checks into data pipelines: Apache Airflow
- API testing & automation: REST/OpenAPI, GraphQL
- Integrating test automation into CI/CD: GitHub Actions (or similar, such as ArgoCD or GoCD)
- Cloud / infrastructure and observability basics: Kubernetes (K8s), Prometheus, Grafana

Nice to Have:
- Graph databases: RDF / SPARQL
- Data governance & analytics tools: DataHub, ThoughtSpot
- AI/ML testing & MLOps: AI evals, guardrails, RAG, vector databases, AI drift monitoring
- Advanced / emerging data tech: StarRocks, DuckDB
- Regulated environments: GxP, 21 CFR Part 11, HIPAA
- Clinical / domain-specific data standards: CDISC, ODM, FHIR
- AI-native tooling: Cursor, Claude Code, Copilot, QA Wolf

Data Engineer - Quality Assurance
TRANSCENDA
Caxias do Sul, Espírito Santo