This is a fully remote position on a 6-month contract (PJ model), with a pay range of USD 25-30 per hour.
Must-haves:
5+ years of professional experience as a Data Engineer
Experience with Python and Bash – shell script wrappers and Python for building data pipelines
Experience working with big data and making sure its processing is optimized
Airflow experience for orchestration
Snowflake/Redshift database experience
Terraform experience, AWS – S3, SQS
SQL – ad hoc queries and SQL for data pipelines
Plusses:
AI-assisted code generation – Claude, Codex, Copilot
Streaming and batch experience – Spark, Kafka
The Search and AI Platform is our client's agentic data platform, which powers their products and their next-generation LLM-powered research systems.
The platform uses agentic services to interrogate our rich knowledge graphs, search and recommendation systems, and our unparalleled collection of research data to deliver insights to the scientific community so they can collaborate more effectively, work smarter, and deliver quality research more quickly.
We’re looking for an innovative, passionate Senior Data Engineer I to be the senior data engineer on the new AI Content team. The role helps design, build and maintain scalable pipelines that ingest content into a centralized content storage system, a scalable content enrichment and retrieval system, and pipelines that load content search indexes. The newly built systems are in early-stage implementation, and this team has the remit to ingest massive amounts of content to support AI-powered products across the organization.
You will be:
Designing, prototyping, and building robust and scalable pipelines using AI-assisted, spec-driven development, following clean code and best-practice software engineering principles
Working with technologies including Python (FastAPI), Spark, Airflow, Snowflake, Iceberg, Kafka and RDBMS
Building cloud infrastructure in AWS to host and monitor the services, automating common tasks mercilessly