Key Responsibilities:
- Automate data quality and reconciliation checks across varied storage layers, including Snowflake, SQL, and RDF/SPARQL databases
- Test and verify data lineage, governance, and visualization components using Snowflake, data catalogs (e.g. DataHub), ThoughtSpot, and other visualization tools
- Integrate test suites into the core infrastructure orchestrated by Apache Airflow and using the Iceberg table format, while monitoring data pipeline health, alerting, and observability metrics using Prometheus and Grafana Cloud
- Establish AI Evaluation Loops (Evals) and Guardrails: build rigorous verification protocols (including structural tests, checks, and watchdog agents) to validate AI-generated artifacts, catch false positives, and ensure all automated outputs are secure, reliable, and free from hallucinations
- Integrate automated testing workflows into CI/CD pipelines using GitHub Actions, ensuring continuous stability and quality gates across all deployment environments
- Validate ETL and dbt transformations across Data Lakehouses, rigorously testing data progression through a Medallion Architecture
- Test and automate complex API workflows, validating data payloads across OpenAPI integrations, third-party APIs, GraphQL, and AWS APIs (specifically S3)

Must Haves:
- Data engineering & data testing: dbt, data lakehouse concepts, Medallion architecture
- Databases & storage testing: SQL, Snowflake, AWS S3, Iceberg
- Integrating quality checks into data pipelines: Apache Airflow
- API testing & automation: REST/OpenAPI, GraphQL
- Integrating test automation into CI/CD: GitHub Actions (or similar, such as ArgoCD/GoCD)
- Cloud / infrastructure and observability basics: Kubernetes (K8s), Prometheus, Grafana

Nice to Have:
- Graph databases: RDF / SPARQL
- Data governance & analytics tools: DataHub, ThoughtSpot
- AI/ML testing & MLOps: AI evals, guardrails, RAG, vector databases, AI drift monitoring
- Advanced / emerging data tech: StarRocks, DuckDB
- Regulated environments: GxP, 21 CFR Part 11, HIPAA
- Clinical / domain-specific data standards: CDISC, ODM, FHIR
- AI-native tooling: Cursor, Claude Code, Copilot, QA Wolf
Data Engineer - Quality Assurance
TRANSCENDA
Campos dos Goytacazes, Rio de Janeiro