Natural Language Processing

Large Language Model

1121 papers with code • 0 benchmarks • 5 datasets

Benchmarks

These leaderboards are used to track progress on the Large Language Model task.

You can find evaluation results in the subtasks. You can also submit evaluation metrics for this task.

Libraries

Use these libraries to find Large Language Model implementations.

hiyouga/llama-factory

5 papers

27,980

Datasets

Subtasks

Most implemented papers

CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis

salesforce/CodeGen • 25 Mar 2022

To democratize this, we train and release a family of large language models up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open source the training library JAXFORMER.

Generative Agents: Interactive Simulacra of Human Behavior

joonspk-research/generative_agents • 7 Apr 2023

Believable proxies of human behavior can empower interactive applications ranging from immersive environments to rehearsal spaces for interpersonal communication to prototyping tools.

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena

lm-sys/fastchat • NeurIPS 2023

Evaluating large language model (LLM) based chat assistants is challenging due to their broad capabilities and the inadequacy of existing benchmarks in measuring human preferences.

Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data

project-baize/baize-chatbot • 3 Apr 2023

Furthermore, we propose a new technique called Self-Distill with Feedback, to further improve the performance of the Baize models with feedback from ChatGPT.

MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models

vision-cair/minigpt-4 • 20 Apr 2023

Our work, for the first time, uncovers that properly aligning the visual features with an advanced large language model can possess numerous advanced multi-modal abilities demonstrated by GPT-4, such as detailed image description generation and website creation from hand-drawn drafts.

Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following

ziyuguo99/point-bind_point-llm • 1 Sep 2023

We introduce Point-Bind, a 3D multi-modality model aligning point clouds with 2D image, language, audio, and video.

Efficient Memory Management for Large Language Model Serving with PagedAttention

vllm-project/vllm • 12 Sep 2023

On top of it, we build vLLM, an LLM serving system that achieves (1) near-zero waste in KV cache memory and (2) flexible sharing of KV cache within and across requests to further reduce memory usage.

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

PKU-YuanGroup/Video-LLaVA • 16 Nov 2023

In this work, we unify visual representation into the language feature space to advance the foundational LLM towards a unified LVLM.

Fast Transformer Decoding: One Write-Head is All You Need

thudm/chatglm-6b • 6 Nov 2019

Multi-head attention layers, as used in the Transformer neural sequence model, are a powerful alternative to RNNs for moving information across and between sequences.

Muse: Text-To-Image Generation via Masked Generative Transformers

lucidrains/muse-pytorch • 2 Jan 2023

Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding.
