SURF Mentoring
Potential projects/topics: The project will focus on the interpretability of language models (LMs). We aim to build a first-principles understanding of LMs and use the resulting insights to control their behavior at the neuron, feature, or representation level. We are interested in perspectives from information theory, compression theory, and other methods with theoretical guarantees.
Potential skills gained: Students are expected to gain basic skills in controlling LMs, such as how to edit the knowledge stored in an LM and how to change its behavior. We also expect students to learn how to put theory to practical use in LM research.
Required qualifications or skills: A basic background in machine learning and NLP is required. We prefer students majoring in computer science, computer engineering, math, or statistics.
Direct mentor: Faculty/P.I., Graduate Student