Davis Brown
@davisbrownr
Research in science of {deep learning, AI security, safety}. PhD student at UPenn & RS at @PNNLab
Excited to share our new paper: "Instruction Following by Boosting Attention of Large Language Models"! We introduce Instruction Attention Boosting (InstABoost), a simple yet powerful method to steer LLM behavior by making them pay more attention to instructions. (馃У1/7)
New Anthropic blog: We benchmark approaches to making classifiers more cost-effective by reusing activations from the model being queried. We find that using linear probes or retraining just a single layer of the model can push the cost-effectiveness frontier. 馃У1/
How well can LLMs predict future events? Recent studies suggest LLMs approach human performance. But evaluating forecasters presents unique challenges compared to standard LLM evaluations. We identify key issues with forecasting evaluations 馃У (1/7)
3.7 sonnet: *hands behind back* yes the tests do pass. why do you ask. what did you hear 4o: yes you are Jesus Christ's brother. now go. Nanjing awaits o3: Listen, sorry, I owe you a straight explanation. This was once revealed to me in a dream