Stars
3D LEGO models and mosaics from images using R and #tidyverse
AI Logging for Interpretability and Explainability🔬
LLM training code for Databricks foundation models
A Survey on Data Selection for Language Models
Vietnews dataset for a Vietnamese summarization benchmark
Probabilistic programming with HuggingFace language models
[ICML 2023] Code for our paper “Compositional Exemplars for In-context Learning”.
Apéro is a Hugo theme for personal websites. A Hugo theme you'll want to hang out with 🌌. This is the source for the theme files to install.
Fine-tune Mistral-7B on 3090s, A100s, H100s
Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from easy questions to hard
OpenICL is an open-source framework to facilitate research, development, and prototyping of in-context learning.
[ASE 2023] Software Entity Recognition with Noise-Robust Learning
Gram2Vec is a document embedding algorithm that embeds documents into a higher-dimensional space based on grammatical style.
[ACL 2023] Explanation-based Finetuning Makes Models More Robust to Spurious Correlation
Syntax Regex Matcher is a package for applying regular expressions to parse trees to find syntactic constructions in English sentences.
Multilingual syllable annotation pipeline component for spaCy
The original Backpack Language Model implementation, a fork of FlashAttention
In-context Example Selection with Influences
Learning to Compress Prompts with Gist Tokens - https://arxiv.org/abs/2304.08467
Fast & Simple repository for pre-training and fine-tuning T5-style models
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
A PyTorch implementation of Multimodal Few-Shot Learning with Frozen Language Models with OPT.
Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
Accessible large language models via k-bit quantization for PyTorch.
Source code for paper "Learning from Noisy Labels for Entity-Centric Information Extraction", EMNLP 2021
Code for paper "CrossFit 🏋️: A Few-shot Learning Challenge for Cross-task Generalization in NLP" (https://arxiv.org/abs/2104.08835)