
Mengzi Pre-training Model

Introduction to Technology

The Mengzi pre-training model is a large-scale pre-trained language model built on the team's in-house technology. It handles multilingual and multimodal data and supports a range of text understanding and text generation tasks. Mengzi is based on the Transformer architecture, contains 1 billion parameters, and was trained on hundreds of gigabytes of high-quality corpus covering Internet web pages, communities, news, e-commerce, finance, and other domains.


Technical Solutions


Support Multiple Model Architectures

  • Autoregressive models, such as GPT
  • Autoencoding models, such as BERT
  • Encoder-decoder models, such as T5
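The three architecture families differ chiefly in how self-attention is masked. A minimal NumPy sketch of the masking patterns (illustrative only, not Mengzi's actual implementation):

```python
import numpy as np

def causal_mask(n):
    """Autoregressive (GPT-style): token i attends only to positions <= i."""
    return np.tril(np.ones((n, n), dtype=bool))

def bidirectional_mask(n):
    """Autoencoding (BERT-style): every token attends to every position."""
    return np.ones((n, n), dtype=bool)

def encoder_decoder_masks(n_src, n_tgt):
    """T5-style: bidirectional over the source, causal over the target,
    plus full cross-attention from target positions to source positions."""
    return {
        "encoder_self": bidirectional_mask(n_src),
        "decoder_self": causal_mask(n_tgt),
        "cross": np.ones((n_tgt, n_src), dtype=bool),
    }
```

Supporting all three families means a task can pick the masking that fits it: generation favors causal masks, understanding favors bidirectional ones.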

Lightweight Model Performance Enhancement

  • Fusion of multiple pre-training tasks
  • SMART adversarial training
  • Knowledge distillation
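Knowledge distillation, for example, trains a lightweight student to match a larger teacher's softened output distribution. A minimal sketch of the standard temperature-scaled KL loss (the Hinton-style form below is a generic assumption, not Mengzi's internal recipe):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 so gradients keep a comparable magnitude."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge; in practice it is mixed with the ordinary task loss.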

Knowledge Graph Based Enhancement

  • Enhancements with entity extraction
  • Knowledge graph enhancement (is-a relations)
  • Knowledge graph to text conversion
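Knowledge-graph-to-text conversion can be pictured as rendering (head, relation, tail) triples into sentences the language model can consume as ordinary text. A toy sketch (the `isa` template string is an illustrative assumption):

```python
def triples_to_text(triples, templates=None):
    """Render (head, relation, tail) triples as sentences so graph facts
    can be fed to a language model as plain text."""
    default = "{h} {r} {t}."
    templates = templates or {"isa": "{h} is a kind of {t}."}
    return " ".join(
        templates.get(r, default).format(h=h, r=r, t=t) for h, r, t in triples
    )
```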

Linguistic Knowledge Based Enhancement

  • Mask mechanism enhanced by syntactic information
  • Semantic role embedding enhancement
  • Attention weight constrained pruning of dependencies
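Syntax-informed masking replaces random subword masking with masking of whole syntactic spans (e.g. noun phrases supplied by a parser). A toy sketch, assuming span boundaries come from an external parser:

```python
import random

def span_mask(tokens, spans, mask_token="[MASK]", ratio=0.15, seed=0):
    """Mask whole syntactic spans instead of isolated subwords, choosing
    spans at random until roughly `ratio` of the tokens are covered."""
    rng = random.Random(seed)
    out = list(tokens)
    budget = max(1, int(len(tokens) * ratio))
    masked = 0
    for start, end in rng.sample(spans, len(spans)):
        if masked >= budget:
            break
        for i in range(start, end):
            out[i] = mask_token
        masked += end - start
    return out
```

Masking linguistically coherent units forces the model to reconstruct whole phrases from context rather than guessing single subwords.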

Few-Shot/Zero-Shot Learning

  • Prompt template construction
  • Multi-task learning technique
  • Out-of-the-box support for common information extraction scenarios
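Prompt template construction can be sketched as filling a cloze template with a few labeled demonstrations and leaving the query's label slot empty for the model to complete. The template string below is a hypothetical example:

```python
def build_prompt(template, demonstrations, query):
    """Fill a cloze-style template with labeled demonstrations, then append
    the unlabeled query with its label slot left blank."""
    lines = [template.format(text=t, label=l) for t, l in demonstrations]
    lines.append(template.format(text=query, label="").rstrip())
    return "\n".join(lines)
```

With zero demonstrations this degrades gracefully to a zero-shot prompt, which is what makes the same mechanism serve both regimes.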

Retrieval Based Enhancement

  • Knowledge decoupling
  • Strong interpretability
  • External knowledge components are updated in real time
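The retrieval-based design can be sketched as: fetch relevant passages from an external corpus, then prepend them to the query, so updating knowledge means updating the corpus rather than retraining the model. The word-overlap scorer below is a stand-in for a real BM25 or dense retriever:

```python
def retrieve(query, corpus, k=1):
    """Rank documents by word overlap with the query (a toy stand-in for
    BM25 or dense retrieval) and return the top-k."""
    q = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def augment(query, corpus, k=1):
    """Prepend retrieved passages to the query. Because the knowledge lives
    in the external corpus, refreshing it needs no model retraining."""
    context = " ".join(retrieve(query, corpus, k))
    return f"Context: {context}\nQuestion: {query}"
```

The retrieved passages also make answers interpretable: the supporting evidence is visible in the prompt itself.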

Technical Advantages


It achieves better performance than conventional models on multiple tasks


It supports BERT, GPT, T5 and other architectures, with different scenarios covered


It supports dual-modal image-and-text input, handling image-text tasks more effectively


It supports rapid optimization for vertical domains, and offers models scaling from 10M to 1B parameters

CLUE Leaderboards

*Ranking as of July 30, 2021

| Ranking | Model       | Scale | Total Score | AFQMC | TNEWS | IFLYTEK | OCNLI | WSC2020 | CSL   | CMRC2018 | CHID  | C3    |
|---------|-------------|-------|-------------|-------|-------|---------|-------|---------|-------|----------|-------|-------|
| 1       | Mengzi      | 1B    | 82.90       | 79.82 | 64.68 | 65.08   | 81.87 | 96.55   | 89.87 | 82.25    | 96.00 | 89.98 |
| 2       | Motian      | 1B    | 82.15       | 78.30 | 57.42 | 65.46   | 84.97 | 94.83   | 90.17 | 85.30    | 94.43 | 88.49 |
| 3       | BETRTSG     | 10B   | 81.80       | 79.85 | 57.42 | 64.54   | 85.93 | 95.17   | 89.00 | 83.80    | 93.06 | 87.44 |
| –       | Human Level | –     | 86.68       | 81.00 | 71.00 | 80.30   | 90.30 | 98.00   | 84.00 | 92.40    | 87.10 | 96.00 |

Application Scenarios


Announcement Extraction

The model can extract announcement information from a large amount of text, which is convenient for quickly obtaining important information.

Fiction Generation

The model can automatically generate novel content based on the information provided by users.

Sentiment Classification

The model can perform sentiment analysis on the text to distinguish positive, negative or neutral sentiment in the text.

Research Reports Classification

The model can categorize research reports by theme.

News Digest

The model can automatically generate news summaries and quickly provide news key information.

Knowledge Graph Construction

The model can build a knowledge graph based on existing knowledge, which is convenient for quick query.

Q&A System

The model can provide answers to questions through semantic analysis.

Image-Text Mutual Retrieval

The model can measure the relevance between text and images.

Customer Stories


Hithink RoyalFlush Information Network Co., Ltd.

Together with RoyalFlush, Langboat Technology focuses on cognitive intelligence, jointly innovating NLP technology and upgrading products and services in the fintech field to bring customers a better user experience.


Business Cooperation Email

bd@langboat.com


Address

11F, Block A, Dinghao DH3 Building, No.3 Haidian Street, Haidian District, Beijing, China


© 2023, Langboat Co., Limited. All rights reserved.
