Showing 1–4 of 4 results for author: Bigelow, E

  1. arXiv:2404.09932  [pdf, other]

    cs.LG cs.AI cs.CL cs.CY

    Foundational Challenges in Assuring Alignment and Safety of Large Language Models

    Authors: Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, et al. (17 additional authors not shown)

    Abstract: This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three different categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose $200+$ concrete research questions.

    Submitted 5 September, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

  2. arXiv:2310.17639  [pdf, other]

    cs.AI cs.CL cs.LG

    In-Context Learning Dynamics with Random Binary Sequences

    Authors: Eric J. Bigelow, Ekdeep Singh Lubana, Robert P. Dick, Hidenori Tanaka, Tomer D. Ullman

    Abstract: Large language models (LLMs) trained on huge corpora of text datasets demonstrate intriguing capabilities, achieving state-of-the-art performance on tasks they were not explicitly trained for. The precise nature of LLM capabilities is often mysterious, and different prompts can elicit different capabilities through in-context learning. We propose a framework that enables us to analyze in-context l…

    Submitted 15 April, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

  3. arXiv:2211.08422  [pdf, other]

    cs.LG cs.CV

    Mechanistic Mode Connectivity

    Authors: Ekdeep Singh Lubana, Eric J. Bigelow, Robert P. Dick, David Krueger, Hidenori Tanaka

    Abstract: We study neural network loss landscapes through the lens of mode connectivity, the observation that minimizers of neural networks retrieved via training on a dataset are connected via simple paths of low loss. Specifically, we ask the following question: are minimizers that rely on different mechanisms for making their predictions connected via simple paths of low loss? We provide a definition of…

    Submitted 1 June, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

    Comments: ICML, 2023

  4. arXiv:1701.06236  [pdf, other]

    cs.SI cs.CY

    Tales of Two Cities: Using Social Media to Understand Idiosyncratic Lifestyles in Distinctive Metropolitan Areas

    Authors: Tianran Hu, Eric Bigelow, Jiebo Luo, Henry Kautz

    Abstract: Lifestyles are a valuable model for understanding individuals' physical and mental lives, comparing social groups, and making recommendations for improving people's lives. In this paper, we examine and compare lifestyle behaviors of people living in cities of different sizes, utilizing freely available social media data as a large-scale, low-cost alternative to traditional survey methods. We use t…

    Submitted 22 January, 2017; originally announced January 2017.

    Comments: Published in IEEE Transactions on Big Data