Yahoo – Junior AI Engineer (Champaign, IL)

Posted: 4.1.26

About Yahoo!

Yahoo is an American web portal that provides the search engine Yahoo Search and related services including My Yahoo, Yahoo Mail, Yahoo News, Yahoo Finance, Yahoo Sports, y!entertainment, yahoo!life, and its advertising platform, Yahoo Native.

Yahoo builds, improves, and maintains one of the most scalable platforms in the world. Their team of engineers works on next-generation data platforms that transform how users connect every single day. Yahoo’s Central Data Platform drives some of the most demanding applications in the industry. The system handles billions of requests a day and runs on some of the largest Hadoop clusters ever built. At 50,000 nodes strong, with several multi-thousand-node clusters, it brings scalable computing to a whole new level.

They work on problems that cover a wide spectrum, from web services to operating systems and networking layers. Their biggest challenge ahead is designing efficient cloud-native data platforms.

About the Role

We’re looking for a motivated Junior AI Engineer to join our AI & ML team in Champaign, Illinois. As part of the Central Data Platform Team, you’ll have the opportunity to design and build scalable, high-performance tools in the areas of data governance, orchestration, and query technologies. In this role, you will be instrumental in developing and deploying AI-powered features. Your responsibilities will include analyzing requirements, supporting prompt engineering and RAG workflows, and collaborating across product and engineering teams to integrate generative AI into production systems. This role also contributes to delivering high-quality software and platform products that underpin Yahoo’s enterprise data ecosystem.

Responsibilities

  • Assist in building AI features: use Python, LLM APIs (OpenAI, Anthropic, etc.), and vector embedding pipelines.
  • Support prompt engineering and RAG workflows: design, test, and iterate on prompt templates, and integrate vector search.
  • Help build and maintain AI-model monitoring and observability dashboards that track model accuracy, latency, and drift, and work with backend engineers to integrate AI services into the product.
  • Participate in experimenting with AI workflows: multi-agent orchestration, model fine-tuning, system prompts.
  • Work through documents and conversations with colleagues to understand product requirements for new features.
  • Work closely with cross-functional teams to understand product and technical roadmaps, identifying potential impacts on system operability and proposing proactive solutions for Cloud environments.
  • Lead initiatives to enhance and optimize existing cloud infrastructure, drive improvements in scalability, efficiency, and resilience, and oversee large-scale projects related to cloud platforms, automation, and performance optimization.
  • Foster cross-functional collaboration between development, infrastructure, and operations teams to improve the overall performance, reliability, and security of services on cloud.

Requirements

  • A solid Computer Science foundation in data structures and algorithms, object-oriented programming, and modern software engineering practices, gained through a degree in CS or a similar engineering discipline.
  • A self-starter who can work independently and within a team, with excellent design, coding, debugging, and testing skills.
  • Proactive in staying updated with evolving AI trends and new LLM releases.
  • Skilled at diagnosing and solving complex, ambiguous problems with curiosity and a product-focused mindset.
  • Strong communication and presentation skills, able to convey complex analysis clearly and actionably.
  • Experience working with the latest Large Language Models (LLMs) and AI advancements, cloud-native AI services such as SageMaker and Vertex AI, and LLM-orchestration libraries such as LangChain or LlamaIndex.
  • The ability to use an object-oriented programming language like Java or C++, or a scripting language like Python or Perl, on Unix or Linux systems.
  • Knowledge of SQL and distributed query engines (e.g., Presto, Trino, Athena, BigQuery). Familiarity with data concepts such as joins, aggregation, projection, and explosion.
  • The ability to work with large-scale distributed systems.
  • Strong analytical and problem-solving skills with the ability to work effectively in a cross-functional, collaborative environment.
  • Great team-working capabilities in an agile development environment.
  • Willingness to engage productively with others in the industry.
  • The passion to build great products, work with great people and change the world.

Preferred Qualifications

  • Working knowledge of AWS and GCP cloud environments, including core data and compute services (e.g., EMR, MWAA, S3, Lambda, ECS, BigQuery, Dataproc).
  • Experience with data pipeline orchestration tools and frameworks such as Oozie and Airflow.
  • Experience with query execution and optimization: designing and optimizing queries to run efficiently on platforms such as BigQuery, Hive, Pig, and Spark, ensuring high performance and scalability.
  • Familiarity with modern data architectures, including lakehouse and Medallion design patterns.
  • Understanding of data processing and data governance concepts.
  • Familiarity with AI-assisted engineering tools (e.g., Cursor, MCP, Copilot, agentic AI frameworks) and emerging AI/ML technologies that enhance data engineering productivity.
  • Experience working with Infrastructure as Code (IaC) tools such as Terraform, Ansible, or CloudFormation to automate and manage cloud infrastructure deployments.
  • Familiarity and working experience with Kubernetes and container-based orchestration.

By joining the Central Data Platform Team, candidates will have the opportunity to work on complex data platform systems and products, interact with cutting-edge technologies, and contribute to key projects that drive Yahoo’s business forward.

To apply, please send resumes to resume-champaign-data@yahooinc.com