In this paper, we propose a highly parameter-efficient approach to scaling pre-trained language models (PLMs) to a deeper model depth.
A new branch of research focuses on the parameter-efficient adaptation of PLMs, optimizing a small portion of the model parameters while keeping the rest fixed.
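To make that idea concrete, below is a minimal PyTorch-style sketch (not taken from any of the papers discussed here): the pre-trained weights are frozen and only a small low-rank adapter is optimized; the adapter rank, placement, and layer sizes are illustrative assumptions.

```python
# Minimal sketch of parameter-efficient adaptation (assumed setup, not from the paper):
# freeze the pre-trained weights and train only a small add-on module.
import torch
import torch.nn as nn

class LowRankAdapter(nn.Module):
    """A small bottleneck added after a frozen block (rank 8 is an illustrative choice)."""
    def __init__(self, dim, rank=8):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)
        self.up = nn.Linear(rank, dim, bias=False)
        nn.init.zeros_(self.up.weight)  # start as an identity-preserving residual

    def forward(self, x):
        return x + self.up(self.down(x))

# Stand-in for a pre-trained block; in practice this would be a loaded PLM.
base = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))
for p in base.parameters():
    p.requires_grad = False  # keep the pre-trained parameters fixed

model = nn.Sequential(base, LowRankAdapter(512))

# Only the adapter's parameters receive gradients and are passed to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-3)
print(sum(p.numel() for p in trainable), "trainable of",
      sum(p.numel() for p in model.parameters()), "total parameters")
```

Because only the adapter is updated, the number of values optimized during adaptation is a small fraction of the full model size.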
Parameter-efficient fine-tuning techniques have also been explored for code generation with large language models.
This paper adopts the matrix product operator (MPO, a tensor decomposition from quantum many-body physics) to reconstruct the parameter matrix in the expert layer.
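As a rough illustration of the underlying operation (a sketch under assumed shapes and bond dimensions, not the authors' implementation), a weight matrix can be reshaped into a higher-order tensor and factorized into a chain of small local tensors via sequential truncated SVDs; the helper mpo_decompose below is hypothetical.

```python
# Minimal sketch: factorize a weight matrix into a matrix product operator (MPO)
# with sequential truncated SVDs. Shapes and the bond-dimension cap are assumptions.
import numpy as np

def mpo_decompose(W, in_dims, out_dims, max_bond=16):
    """Factorize W of shape (prod(in_dims), prod(out_dims)) into a chain of cores.

    Core k has shape (left_bond, in_dims[k], out_dims[k], right_bond).
    """
    n = len(in_dims)
    # Reshape to (i1, ..., in, j1, ..., jn), then interleave to (i1, j1, i2, j2, ...).
    T = W.reshape(list(in_dims) + list(out_dims))
    order = [x for k in range(n) for x in (k, n + k)]
    T = T.transpose(order)

    cores, left_bond = [], 1
    for k in range(n - 1):
        # Split off the k-th (input, output) index pair with a truncated SVD.
        T = T.reshape(left_bond * in_dims[k] * out_dims[k], -1)
        U, S, Vt = np.linalg.svd(T, full_matrices=False)
        bond = min(max_bond, len(S))
        cores.append(U[:, :bond].reshape(left_bond, in_dims[k], out_dims[k], bond))
        T = np.diag(S[:bond]) @ Vt[:bond]
        left_bond = bond
    cores.append(T.reshape(left_bond, in_dims[-1], out_dims[-1], 1))
    return cores

# Example: a 64x64 matrix factorized as a 3-site MPO (4*4*4 = 64 on each side).
W = np.random.randn(64, 64)
cores = mpo_decompose(W, in_dims=(4, 4, 4), out_dims=(4, 4, 4), max_bond=8)
print([c.shape for c in cores])
```

Truncating the bond dimension is what yields the parameter saving: the three cores in this example store far fewer values than the original 64x64 matrix, at the cost of an approximation error controlled by the discarded singular values.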
Recently, the Mixture-of-Experts (MoE) architecture has achieved remarkable success in increasing the model capacity of large-scale language models.