Search - Workchat - Principal Data Scientist

Elastic
Onsite
Canada
Software Engineering & IT

What Is The Role:

The Search Data Science team develops and integrates statistical tools and machine learning models within the Search domain in support of semantic search, RAG, agentic search, and chat applications. As a Data Scientist in this area, you will work closely with our Product teams to lead the innovation, incubation, and prototyping phases that evolve and transform our AI/ML-driven Search experiences and solutions, with a focus on quickly bringing new ideas to production and into the hands of our customers. Your primary focus will be driving research and development to improve semantic search with proprietary models and customized open source models, developing techniques and models for query and document understanding, implementing RAG and LLM-driven search experiences, and developing tooling to help customers design and implement successful end-to-end RAG systems. You will also investigate aspects of modern agentic search, including reasoning engines, prompt engineering techniques, query understanding, and more. This work requires exploring and benchmarking new open source models and existing proprietary Elastic models while keeping up to date with the latest major advancements in NLP and information retrieval.

What You Will Be Doing:

  • Explore, select and benchmark open source and Elastic proprietary models
  • Implement RAG and other LLM-based search experiences
  • Design evaluation protocols for semantic search, tool selection, and generation in LLM-based search experiences
  • Keep up to date with the most significant recent developments in NLP and information retrieval
  • Engage with the NLP and information retrieval communities (blogs, documentation, Python examples, conference talks, academic papers, etc.)
  • Collaborate with cross-functional teams of data scientists, engineers, and product managers
  • Promote knowledge sharing and collaboration in a distributed team

What You Will Bring:

  • 8+ years of proven experience building and applying NLP to production use cases
  • 8+ years of professional software development experience in Python
  • Experience in Generative AI, Retrieval Augmented Generation, and information retrieval
  • Experience with libraries and frameworks such as PyTorch, transformers, and Pandas
  • Experience using collaborative notebook-based workflows (e.g. Jupyter) for prototyping and knowledge sharing
  • Expertise in AI/ML quality evaluation and improvement, including balancing tuning techniques with cost/benefit tradeoffs
  • Self-motivated, with a collaborative style, open communication, and experience in a distributed team
  • Good attention to detail and highly organized
  • Real passion for data, analysis and achieving excellence
  • Experience with Elasticsearch is useful
  • An academic background in the domain is also a plus

If this sounds interesting, we would love to hear from you! Please include whatever info you believe is relevant: resume, GitHub profile, code samples, blog posts and writing samples, links to personal projects, etc.