- Collaborate with technical and non-technical stakeholders to design and execute research on how humans evaluate LLM features.
- Conduct research studies with human participants to establish and validate automated benchmarks for LLM evaluations.
- Excellent understanding of machine learning principles, particularly in the context of LLMs.
- Knowledge of LLM evaluation techniques, including human evaluation and automated benchmarks.
- Ability to own and pursue a research agenda, selecting impactful problems and autonomously conducting projects.
- Proficiency in at least one statistical programming language, preferably Python.
- Demonstrated background in collecting data from human participants (surveys, experiments) with a focus on data quality and validity.
- Excellent verbal and written communication skills for effective collaboration in virtual teams.
- PhD or advanced degree in computer science, machine learning, cognitive science, psychology, economics, or a related field.