
Unlocking Mixture-of-Experts (MoE) LLMs: Your MoE Model Can Serve as an Embedding Model for Free

csdh11 2025-02-09 11:57

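I recently came across an interesting paper titled "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free". Recent LLM architectures are mostly decoder-only, and because of their causal attention they are generally considered a poor fit for use as embedding models. The authors show, however, that a Mixture-of-Experts (MoE) LLM can serve as an embedding model across a range of embedding-focused tasks without any further fine-tuning. In this article I first review MoE, then explain how the method works, and finally walk through a practical implementation.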

Table of contents

  1. What is Mixture of Experts (MoE)?
  2. How does MoE work as an embedding model?
  3. Practical implementation: MoEE with BERTopic

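1. What is Mixture of Experts (MoE)?

Mixture of Experts (MoE) is an architecture with multiple sub-networks called "experts", each specializing in different tasks or aspects of the data. One advantage of MoE is that a model can be pretrained with far less compute than a dense model of the same or larger size while matching or improving quality. So on a limited budget, MoE can yield a better model than a conventional dense model of similar size. As a recent success story, Mixtral 8x7B outperforms LLaMA 2 70B on many evaluation datasets.

From here on, let's look at the architecture of MoE. Recent successful MoEs are built on the Transformer, so I will focus on the popular Transformer-based MoE architecture. It has two main components, described below.

  • MoE layer

In the Transformer architecture, MoE replaces the feed-forward network (FFN) layer with an MoE layer. Each MoE layer contains several experts (four in the accompanying figure), and each expert is a simple FFN. Note that the other Transformer components, such as the self-attention layers, are shared across experts. The total parameter count is therefore not a simple multiple of the expert count. For example, Mixtral 8x7B has about 47B parameters rather than 8 x 7 = 56B, because everything outside the MoE layers is shared.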
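
The arithmetic behind that parameter count can be sketched as follows. The configuration values used here (hidden size 4096, FFN size 14336, 32 layers, 8 experts, top-2 routing, key/value width 1024, vocabulary 32000) are my assumptions based on the published Mixtral 8x7B configuration and are not stated in this article, so treat the result as an estimate.

```python
def moe_param_counts(hidden=4096, ffn=14336, layers=32, experts=8,
                     top_k=2, kv_dim=1024, vocab=32000):
    """Estimate total and per-token active parameters of a Mixtral-style MoE."""
    # Attention: query and output projections plus the narrower key/value projections
    attention = 2 * hidden * hidden + 2 * hidden * kv_dim
    # One expert is a gated FFN with three weight matrices
    expert = 3 * hidden * ffn
    # The router is a single linear layer producing one score per expert
    router = hidden * experts
    # Two normalization layers per block
    norms = 2 * hidden
    shared_per_layer = attention + router + norms
    # Input embeddings, output head, and the final normalization
    embeddings = 2 * vocab * hidden + hidden

    total = layers * (shared_per_layer + experts * expert) + embeddings
    active = layers * (shared_per_layer + top_k * expert) + embeddings
    return total, active


total, active = moe_param_counts()
print(f"total:  {total / 1e9:.1f}B")
print(f"active: {active / 1e9:.1f}B")
```

With these assumed values the estimate comes to roughly 46.7B total parameters, in line with the 47B figure above, of which only about 12.9B are used for any single token.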

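  • Gating network

The gating network, or router, is the key component of MoE. It takes the input tokens and selects the most relevant experts for each token. In the figure, for example, the router on the left sends the token "more" to the second expert, while the other router sends the token "Parameters" to the first expert. In general, the gating network selects the top-k experts most relevant to a given token and sends the token to them; Mixtral 8x7B, for instance, selects the top 2.

How do we choose the top-k experts? We use a softmax function to compute, for each expert, the probability that it is relevant, and keep the k experts with the highest probabilities. The figure for this step shows only the gating part of the earlier diagram.

The gating network has its own weights. We apply the softmax function to the dot product between the input token's representation and the gating weights, which gives the probability that each expert is relevant to that token. From these probabilities we select the top-k experts. An MoE with this kind of gating network is called a sparse MoE.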
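
The gating step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code of any particular model: the function names, the toy dimensions, and the random weights are all assumptions, and real implementations differ in details such as whether the selected probabilities are renormalized.

```python
import numpy as np


def softmax(x):
    # Subtract the maximum for numerical stability
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def top_k_gate(h, w_gate, k=2):
    """Route one token.

    h:      token representation, shape (d,)
    w_gate: gating weights, shape (d, num_experts)
    Returns the indices of the k most probable experts and all probabilities.
    """
    probs = softmax(h @ w_gate)            # one probability per expert
    top = np.argsort(probs)[::-1][:k]      # indices sorted by descending probability
    return top, probs


# Toy example: a 16-dimensional token and 4 experts (hypothetical sizes)
rng = np.random.default_rng(0)
h = rng.normal(size=16)
w_gate = rng.normal(size=(16, 4))
experts, probs = top_k_gate(h, w_gate, k=2)
print("selected experts:", experts)
print("routing probabilities:", probs.round(3))
```

The full probability vector, not only the selected indices, is what the embedding method in the next section reuses.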

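That is the background needed to understand how MoE works as an embedding model. Now let's look at how it actually does so.

2. How does MoE work as an embedding model?

A quick review of embeddings

Before getting into the main topic of this section, let's quickly review embeddings. An embedding is the internal representation of input data inside a deep learning model: a condensed vector that carries the semantic information of the data. We usually extract the last hidden state of the network as the embedding.

We usually use encoder-based models to extract embeddings because, unlike decoder-only models, they use bidirectional attention to capture semantics. Decoder-only models typically use causal attention, in which each token interacts only with the tokens before it; as a result, they cannot capture context as richly as encoder-based models.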
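
The difference between the two attention patterns is easy to see from their masks. The sketch below is only an illustration with a hypothetical helper name; a True entry at row i, column j means that token i may attend to token j.

```python
import numpy as np


def attention_mask(seq_len, causal):
    """Boolean mask: entry [i, j] is True if token i may attend to token j."""
    full = np.ones((seq_len, seq_len), dtype=bool)
    # Causal attention keeps only the lower triangle (current and earlier tokens)
    return np.tril(full) if causal else full


print("causal (decoder-only):")
print(attention_mask(4, causal=True).astype(int))
print("bidirectional (encoder):")
print(attention_mask(4, causal=False).astype(int))
```

In the causal mask the first token sees only itself, so only the last position has access to the whole sentence.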

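How does MoE work as an embedding model?

It used to be widely believed that decoder models were not suitable for embedding extraction. The authors found, however, that the routing weights in MoE provide information that complements the decoder's embedding. The routing weights in each layer reflect the reasoning choices made for the input tokens, so they contain semantic information about the input that the hidden-state embedding may lose. Mathematically, we can describe this as:

Here g is the softmax function and H denotes the hidden state. We concatenate the routing weights from all MoE layers so that none of the model's reasoning choices are lost.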
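
Written out with these symbols, the concatenated routing-weight embedding might look like the following. The layer index l, the number of MoE layers L, the gating weight matrix W_g, and the name e_RW are notation I am assuming for illustration; the paper's exact notation may differ.

```latex
\mathbf{e}_{\mathrm{RW}}
  = \left[\, g\!\left(W_g^{(1)} H^{(1)}\right);\;
             g\!\left(W_g^{(2)} H^{(2)}\right);\;
             \dots;\;
             g\!\left(W_g^{(L)} H^{(L)}\right) \right]
```

Each term is the vector of expert probabilities produced by the router of one MoE layer, so the result has (number of MoE layers) x (number of experts) entries.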

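To make full use of both the routing weights and the decoder embedding, the authors propose a method called MoE Embedding (MoEE) that forms a more comprehensive embedding. MoEE comes in two variants. The first is a concatenation-based combination.

This approach is simple: we just concatenate the routing weights and the decoder embedding. The authors call it MoEE (concat). It preserves the distinct information captured by each set of routing weights while letting downstream tasks use the combined representation.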
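
As a minimal sketch of the concatenation, assuming the hidden-state embedding and the per-layer routing weights have already been extracted and pooled into one vector each. The function name is hypothetical, and the sizes (hidden size 2048, 16 MoE layers, 64 experts) follow the OLMoE model printout shown later in this article.

```python
import numpy as np


def moee_concat(hidden_state, routing_weights_per_layer):
    """Concatenate a hidden-state embedding with routing weights from all MoE layers.

    hidden_state:              shape (hidden_size,)
    routing_weights_per_layer: list of arrays, each of shape (num_experts,)
    """
    rw = np.concatenate(routing_weights_per_layer)   # (num_layers * num_experts,)
    return np.concatenate([hidden_state, rw])


# Random stand-ins for real model outputs
rng = np.random.default_rng(0)
hs = rng.normal(size=2048)
rw_layers = [rng.dirichlet(np.ones(64)) for _ in range(16)]
embedding = moee_concat(hs, rw_layers)
print(embedding.shape)
```

The two parts live on different scales (the routing weights of each layer sum to 1), so in practice some normalization before concatenation may be worth considering; that choice is not covered here.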

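The other variant is a weighted-sum integration. It takes a weighted sum of the similarity scores computed from the routing weights and from the hidden-state (HS) embeddings, and is denoted MoEE (sum). This method is used for tasks that compare two sentences, such as semantic textual similarity.

α is a hyperparameter that controls the contribution of the routing weights. After computing the similarity score for each pair, we compute a rank correlation, such as Spearman's rank correlation, between the computed similarity scores and the ground-truth similarities.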
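
The weighted-sum variant and its evaluation can be sketched as below. Using cosine similarity, the default alpha of 0.5, and the helper names are my assumptions for illustration; the rank correlation is a plain NumPy version that is only valid when the scores contain no ties.

```python
import numpy as np


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def moee_sum_similarity(hs_a, rw_a, hs_b, rw_b, alpha=0.5):
    """Similarity of two sentences: hidden-state similarity plus
    alpha times routing-weight similarity."""
    return cosine(hs_a, hs_b) + alpha * cosine(rw_a, rw_b)


def spearman(x, y):
    """Spearman rank correlation for score lists without ties."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])


# Toy check with random stand-ins for the two kinds of embeddings
rng = np.random.default_rng(0)
hs_a, hs_b = rng.normal(size=8), rng.normal(size=8)
rw_a, rw_b = rng.random(4), rng.random(4)
print(moee_sum_similarity(hs_a, rw_a, hs_b, rw_b, alpha=0.5))
```

In an evaluation, `spearman` would be applied to the list of predicted similarities and the list of human-annotated similarities for the same sentence pairs.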

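For practical use, I think MoEE (concat) is the easier of the two. In addition, the authors use the PromptEOL technique [4] to enhance MoEE. This technique prompts the LLM with the following template, which constrains it to put the semantic information of the sentence into its prediction of the next token.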
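
As an illustration, the template can be applied with a one-line helper. The wording below is the PromptEOL template as I recall it from the paper cited as [4]; treat the exact spacing and punctuation as an assumption, since the MoE-Embedding repository may format it slightly differently. The function name is hypothetical.

```python
def prompteol(text: str) -> str:
    """Wrap a sentence in a PromptEOL-style template (wording assumed, see note above)."""
    return f'This sentence : "{text}" means in one word:"'


print(prompteol("MoE routers choose experts per token."))
```

The hidden state (and, for MoEE, the routing weights) at the last prompt token is then used as the representation of the whole sentence.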

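Here is the performance table across MTEB tasks.

MoEE with PromptEOL works better than the supervised and self-supervised methods it is compared against. Note that this leaderboard is not up to date, so the result is not SOTA. The value of the method is that it gives decent results on embedding tasks and can be used without any further training.

So far we have covered how MoEE works. In the next section, we implement MoEE and use it with BERTopic to cluster sentences.

3. Practical implementation: MoEE with BERTopic

In this section, we extract embeddings from a pretrained MoE LLM and use them with BERTopic on the 20 Newsgroups dataset. For reference, BERTopic is a convenient topic modeling library that goes beyond traditional statistical topic modeling. It uses Transformer embeddings for topic clustering, so I think it is a good fit for checking how well the embeddings work. First, let's prepare the environment.

Environment setup

I used a conda environment with Python 3.10. I ran the experiments on Ubuntu 20.04 with CUDA 12.4 and 16 GB of VRAM. You may need 32 GB of RAM to download the model weights.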

conda create -n moee python=3.10 -y
conda activate moee

Next, install the following libraries with pip.

pip install transformers torch bitsandbytes bertopic accelerate

MoE models usually need a lot of VRAM because the whole model has to be loaded into VRAM up front. We therefore use the quantization package bitsandbytes to save VRAM.
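
A rough back-of-the-envelope estimate shows why quantization matters on a 16 GB card. The parameter count of 7e9 is an assumed round figure for a "7B" model, and the estimate covers the weights only: activations, caches, and quantization metadata are ignored.

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Memory needed just to hold the weights, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


n_params = 7e9  # assumption: round figure for a 7B-parameter model
for name, bits in [("bf16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{name}: {weight_memory_gib(n_params, bits):.1f} GiB")
```

Under these assumptions the bf16 weights alone take about 13 GiB, which leaves little of a 16 GB card for activations, while 4-bit weights need roughly a quarter of that.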

We also need to clone the official GitHub repository.

git clone https://github.com/tianyi-lab/MoE-Embedding.git

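All the preparation is done. Now let's use MoEE with BERTopic for topic clustering.

Using MoEE with BERTopic

We will now use MoEE as the embedding model for BERTopic and try topic clustering. The original repository lets us use small MoE models such as Qwen-1.5-MoE-A2.7B or OLMoE-1B-7B. In this article I use OLMoE-1B-7B, which can run inference within 16 GB of VRAM. First, we load OLMoE-1B-7B.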

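import torch
from moee import MOEE  # provided by the cloned MoE-Embedding repository

# Arguments used when loading the model
kwargs = {
    "base_model": 'allenai/OLMoE-1B-7B-0924',
    "normalized": False,
    "torch_dtype": torch.bfloat16,
    "mode": "embedding",
    "pooling_method": "mean",
    "attn_implementation": "sdpa",
    "attn": "bbcc",
}

# Options passed to encode() later: PromptEOL prompting and the MoEE embedding type
config = {
    'embed_method': 'prompteol',
    'emb_info': 'MoEE'
}

embedding_model = MOEE(model_name_or_path='allenai/OLMoE-1B-7B-0924', **kwargs)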

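Next, we compute embeddings for the 20 Newsgroups dataset so that we can pass them to BERTopic.

import numpy as np
import torch
from sklearn.datasets import fetch_20newsgroups
from torch.utils.data import DataLoader
from tqdm import tqdm

docs = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))['data']

# MyDataset is a small Dataset wrapper around the documents (defined in the full code at the end)
dataset = MyDataset(docs)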
dataloader = DataLoader(dataset=dataset, batch_size=8)
embeddings = None

for batch in tqdm(dataloader):
    with torch.no_grad():      
        embedding = embedding_model.encode(batch, **config)
        
        if embeddings is None:
            embeddings = embedding[0]
        else:
            embeddings = np.vstack((embeddings, embedding[0]))
    
    torch.cuda.empty_cache()

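To compute the embeddings in advance, we use torch.utils.data.DataLoader as an iterator and encode each batch of documents. Note that the embeddings must be passed to BERTopic as a NumPy array.

If you want to use your own MoE model, you have to implement the extraction of routing weights from each MoE layer yourself. For the hidden-state embedding, we can rely on Hugging Face transformers: we only need to pass output_hidden_states=True at inference time.

Now we can run topic modeling.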

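from umap import UMAP
from hdbscan import HDBSCAN
from sklearn.feature_extraction.text import CountVectorizer
from bertopic import BERTopic
from bertopic.representation import KeyBERTInspired
from bertopic.vectorizers import ClassTfidfTransformer

# Step 2 - Reduce dimensionality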
umap_model = UMAP(n_neighbors=15, n_components=5, min_dist=0.0, metric='cosine')

# Step 3 - Cluster reduced embeddings
hdbscan_model = HDBSCAN(min_cluster_size=15, metric='euclidean', cluster_selection_method='eom', prediction_data=True)

# Step 4 - Tokenize topics
vectorizer_model = CountVectorizer(stop_words="english")

# Step 5 - Create topic representation
ctfidf_model = ClassTfidfTransformer()

# Step 6 - (Optional) Fine-tune topic representations with 
# a `bertopic.representation` model
representation_model = KeyBERTInspired()

# All steps together
topic_model = BERTopic(
  embedding_model=embedding_model,          # Step 1 - Extract embeddings
  umap_model=umap_model,                    # Step 2 - Reduce dimensionality
  hdbscan_model=hdbscan_model,              # Step 3 - Cluster reduced embeddings
  vectorizer_model=vectorizer_model,        # Step 4 - Tokenize topics
  ctfidf_model=ctfidf_model,                # Step 5 - Extract topic words
  representation_model=representation_model # Step 6 - (Optional) Fine-tune topic representations
)

# topic modeling using BERTopic model
topics, probs = topic_model.fit_transform(docs, embeddings)

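With the default settings we get 42 topics; some examples are shown below. Although I picked the topics at random, they capture the semantics well.

Here is also the topic cluster visualization.

Look at the red circle in the topic cluster visualization. It marks topic 0, which is related to computers. The nearby topics are also related to computing terms, such as graphics, digital, and printers.

The method shows that we can obtain decent embeddings without any training. Although there is still room to improve quality before it matches SOTA supervised models, the paper's findings are a good step toward better training-free embedding extraction.

The full code is given below for reference. You need to place this file at the top level of the MoE-Embedding directory.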

import sys
sys.path.append('.')
import re
import numpy as np

import torch
from torch.utils.data import Dataset, DataLoader
from tqdm import tqdm

from umap import UMAP
from hdbscan import HDBSCAN
from sklearn.feature_extraction.text import CountVectorizer

from bertopic import BERTopic
from bertopic.representation import KeyBERTInspired
from bertopic.vectorizers import ClassTfidfTransformer
from moee import MOEE

device = 'cuda' if torch.cuda.is_available() else 'cpu'
device
'cuda'

Load dataset

from sklearn.datasets import fetch_20newsgroups

docs = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))['data']
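
def remove_punctuation(x: str) -> str:
    # Replace punctuation, newlines and spaces with a single space.
    # The hyphen is escaped and every character is listed explicitly,
    # so the class contains no accidental character ranges.
    cleaned = re.sub(r"[!\"#$%&'()*+,\-./:;<=>?@\[\]^_`{|}~\n ]", " ", x)
    return cleaned


def clean_caption(x: str) -> str:
    # lowercase the text
    x = x.lower()

    # remove URLs and punctuation
    x = re.sub(r"http\S+", "", x)
    x = re.sub(r"www\.\S+", "", x)
    x = remove_punctuation(x)
    # replace each pair of spaces with a single space (one pass)
    x = re.sub(r"  ", " ", x)

    return x


docs = [clean_caption(doc) for doc in docs]

Define MoEE and BERTopic
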
kwargs = {
        "base_model": 'allenai/OLMoE-1B-7B-0924',
        "normalized": False,
        "torch_dtype": torch.bfloat16,
        "mode": "embedding",
        "pooling_method": "mean",
        "attn_implementation": "sdpa",
        "attn": "bbcc",
    }

config = {
    'embed_method': 'prompteol',
    'emb_info': 'MoEE'
    }

embedding_model = MOEE(model_name_or_path='allenai/OLMoE-1B-7B-0924', **kwargs)
Loading checkpoint shards: 100%|██████████| 3/3 [00:52<00:00, 17.59s/it]
self.model:  OlmoeForCausalLM(
  (model): OlmoeModel(
    (embed_tokens): Embedding(50304, 2048, padding_idx=1)
    (layers): ModuleList(
      (0-15): 16 x OlmoeDecoderLayer(
        (self_attn): OlmoeSdpaAttention(
          (q_proj): Linear4bit(in_features=2048, out_features=2048, bias=False)
          (k_proj): Linear4bit(in_features=2048, out_features=2048, bias=False)
          (v_proj): Linear4bit(in_features=2048, out_features=2048, bias=False)
          (o_proj): Linear4bit(in_features=2048, out_features=2048, bias=False)
          (q_norm): OlmoeRMSNorm((2048,), eps=1e-05)
          (k_norm): OlmoeRMSNorm((2048,), eps=1e-05)
        )
        (mlp): OlmoeSparseMoeBlock(
          (gate): Linear4bit(in_features=2048, out_features=64, bias=False)
          (experts): ModuleList(
            (0-63): 64 x OlmoeMLP(
              (gate_proj): Linear4bit(in_features=2048, out_features=1024, bias=False)
              (up_proj): Linear4bit(in_features=2048, out_features=1024, bias=False)
              (down_proj): Linear4bit(in_features=1024, out_features=2048, bias=False)
              (act_fn): SiLU()
            )
          )
        )
        (input_layernorm): OlmoeRMSNorm((2048,), eps=1e-05)
        (post_attention_layernorm): OlmoeRMSNorm((2048,), eps=1e-05)
      )
    )
    (norm): OlmoeRMSNorm((2048,), eps=1e-05)
    (rotary_emb): OlmoeRotaryEmbedding()
  )
  (lm_head): Linear(in_features=2048, out_features=50304, bias=False)
)

class MyDataset(Dataset):
    """Wrap the documents so a DataLoader can batch them, truncating long ones."""

    def __init__(self, docs, truncate_token_num: int = 300):
        self.docs = docs
        # Note: despite the name, this limit is applied to characters, not tokens
        self.truncate_token_num = truncate_token_num

    def __len__(self):
        return len(self.docs)

    def __getitem__(self, idx):
        # Slicing returns the whole string when it is shorter than the limit
        return self.docs[idx][:self.truncate_token_num]


dataset = MyDataset(docs)
dataloader = DataLoader(dataset=dataset, batch_size=16)
embeddings = None

for batch in tqdm(dataloader):
    with torch.no_grad():      
        embedding = embedding_model.encode(batch, **config)
        
        if embeddings is None:
            embeddings = embedding[0]
        else:
            embeddings = np.vstack((embeddings, embedding[0]))
    
    torch.cuda.empty_cache()
100%|██████████| 2356/2356 [43:44<00:00,  1.11s/it]
np.save('embedding.npy', embeddings)
# Step 2 - Reduce dimensionality
umap_model = UMAP(n_neighbors=15, n_components=5, min_dist=0.0, metric='cosine')

# Step 3 - Cluster reduced embeddings
hdbscan_model = HDBSCAN(min_cluster_size=15, metric='euclidean', cluster_selection_method='eom', prediction_data=True)

# Step 4 - Tokenize topics
vectorizer_model = CountVectorizer(stop_words="english")

# Step 5 - Create topic representation
ctfidf_model = ClassTfidfTransformer()

# Step 6 - (Optional) Fine-tune topic representations with 
# a `bertopic.representation` model
representation_model = KeyBERTInspired()

# All steps together
topic_model = BERTopic(
  embedding_model=embedding_model,          # Step 1 - Extract embeddings
  umap_model=umap_model,                    # Step 2 - Reduce dimensionality
  hdbscan_model=hdbscan_model,              # Step 3 - Cluster reduced embeddings
  vectorizer_model=vectorizer_model,        # Step 4 - Tokenize topics
  ctfidf_model=ctfidf_model,                # Step 5 - Extract topic words
  representation_model=representation_model # Step 6 - (Optional) Fine-tune topic representations
)
topics, probs = topic_model.fit_transform(docs, embeddings)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
topic_model.get_topic_info()
Topic	Count	Name	Representation	Representative_Docs
0	-1	5271	-1_christian_church_believe_read	[christian, church, believe, read, god, eviden...	[i have come across what i consider to be an e...
1	0	4110	0_dos_os_windows_microsoft	[dos, os, windows, microsoft, ms, pc, mac, dis...	[ \t munch \t munch following is reformatted...
2	1	1057	1_scripture_christianity_christians_bible	[scripture, christianity, christians, bible, c...	[ this is something i ve always found a littl...
3	2	1022	2_flyers_puck_nhl_leafs	[flyers, puck, nhl, leafs, sabres, bruins, pla...	[the flyers closed out the season last night w...
4	3	963	3_riding_driving_wheel_bike	[riding, driving, wheel, bike, ride, honda, bi...	[sixteen days i had put off test driving the h...
5	4	902	4_comics_hulk_sale_list	[comics, hulk, sale, list, wolverine, forsale,...	[the following comics are for auction the hig...
6	5	696	5_firearms_guns_handgun_gun	[firearms, guns, handgun, gun, crime, criminal...	[ because the gun loonies were firing on vehic...
7	6	626	6_infections_clinical_diseases_infection	[infections, clinical, diseases, infection, ca...	[ one of the responsibilities of a licensed ph...
8	7	567	7_maybe_mailing_probably_does	[maybe, mailing, probably, does, say, hope, gu...	[ oh yes i m quite sure they will , \ti looke...
9	8	480	8_nasa_spacecraft_shuttle_satellite	[nasa, spacecraft, shuttle, satellite, orbit, ...	[ in fact you probably want to avoid us govern...
10	9	478	9_clipper_encryption_decrypt_cryptography	[clipper, encryption, decrypt, cryptography, c...	[it looks like dorothy denning s wrong headed ...
11	10	380	10____	[, , , , , , , , , ]	[, , ]
12	11	290	11_palestinians_israeli_israelis_gaza	[palestinians, israeli, israelis, gaza, gazans...	[many of you ask me whether i approve of sever...
13	12	249	12_ax_9f_qax_b8f	[ax, 9f, qax, b8f, kn, 6um, pl, m9, max, k8]	[ part 13 of 14 mtm 3v9f0 7ey 7e...
14	13	206	13_armenians_armenian_armenia_azerbaijanis	[armenians, armenian, armenia, azerbaijanis, a...	[accounts of anti armenian human rights violat...
15	14	167	14_archive_graphics_formats_information	[archive, graphics, formats, information, data...	[archive name graphics resources list part1 la...
16	15	145	15_grounded_grounding_ground_outlets	[grounded, grounding, ground, outlets, wiring,...	[ no no nooo the ground green wire is for ...
17	16	133	16_scorer_pittsburgh_pts_pp	[scorer, pittsburgh, pts, pp, stl, 78, 43, det...	[scoring stats for the swedish nhl players apr...
18	17	103	17____	[, , , , , , , , , ]	[ , and a vga monitor e mail , cica indiana ...
19	18	97	18_supplementation_vitamin_vitamins_cancer	[supplementation, vitamin, vitamins, cancer, c...	[ i ll tell you all that i know about chromium...
20	19	87	19_batteries_radio_battery_electronics	[batteries, radio, battery, electronics, elect...	[ in order to emit blue light a semiconductor ...
21	20	86	20_nasa_spacecraft_saturn_astronomy	[nasa, spacecraft, saturn, astronomy, satellit...	[archive name space references last modified ...
22	21	75	21_investigation_bombing_evidence_news	[investigation, bombing, evidence, news, witne...	[i told some friends of mine two weeks ago tha...
23	22	54	22_stephanopoulos_briefing_secretary_president	[stephanopoulos, briefing, secretary, presiden...	[the white house office of the press...
24	23	51	23_send_entries_dos_fpu	[send, entries, dos, fpu, slip, pktmux, guidel...	[here are the standings after game 1 of each o...
25	24	50	24____	[, , , , , , , , , ]	[there seems to be a p pds slot in the above p...
26	25	44	25_islamic_islam_quran_qur	[islamic, islam, quran, qur, muslim, muslims, ...	[ secular laws seem to value criminal life mor...
27	26	42	26_nonsense_claims_censorship_argument	[nonsense, claims, censorship, argument, claim...	[ i m going to cut rex s ramblings down a bit ...
28	27	37	27_paintshop_contacting_sold_sent	[paintshop, contacting, sold, sent, thanks, f5...	[found it thanks i got several offers for help...
29	28	35	28_homosexuality_homosexual_homosexuals_hetero...	[homosexuality, homosexual, homosexuals, heter...	[ can someone tell me why when mr cramer spo...
30	29	35	29_sphere_triangulation_algorithms_perpendicular	[sphere, triangulation, algorithms, perpendicu...	[ good i had a bad feeling about this prob...
31	30	32	30_shortstop_pitchers_outfielder_hitters	[shortstop, pitchers, outfielder, hitters, bas...	[ he s not gone yet the position opening is d...
32	31	32	31_skepticism_geb_n3jxp_gordon	[skepticism, geb, n3jxp, gordon, intellect, in...	[ senile keratoses have nothing to do with th...
33	32	31	32_militia_amendment_constitution_firearm	[militia, amendment, constitution, firearm, li...	[ actually the words a well regulated milita ...
34	33	30	33_subscribe_unsubscribe_subscrive_email	[subscribe, unsubscribe, subscrive, email, wan...	[please subscribe me , please subscribe me , p...
35	34	28	34_speeding_manslaughter_policeman_cop	[speeding, manslaughter, policeman, cop, court...	[pmoloney maths tcd ie paul moloney writes n...
36	35	28	35_modems_modem_mhz_tcp	[modems, modem, mhz, tcp, digital, signal, mai...	[ db 25\tdb 9 pin \tpin \tname\teia\tccitt\tdt...
37	36	24	36_dial_0055_800_930314	[dial, 0055, 800, 930314, number, 9000, 8287, ...	[1 800 832 4778 western digital s voice mail ...
38	37	24	37_inkjet_inkjets_printers_laserjet	[inkjet, inkjets, printers, laserjet, deskjet,...	[fyi the actual horizontal dot placement reso...
39	38	24	38_rangers_adams_quakers_ivy	[rangers, adams, quakers, ivy, douglass, hope,...	[ i think that they go to divisional records b...
40	39	21	39_homosexual_percent_sexual_majority	[homosexual, percent, sexual, majority, percen...	[ from the santa rosa cal press democrat apr...
41	40	19	40_irony_cycnicism_sarcasm_acetone	[irony, cycnicism, sarcasm, acetone, humour, k...	[ \t1 they are religious parodies not atheisti...
42	41	15	41_autobiography_author_book_books	[autobiography, author, book, books, bookstore...	[this is the story of kent the archetype finn ...
topic_model.get_topic(0)
[('dos', np.float32(0.45857304)),
 ('os', np.float32(0.43415424)),
 ('windows', np.float32(0.40028214)),
 ('microsoft', np.float32(0.32284227)),
 ('ms', np.float32(0.31080914)),
 ('pc', np.float32(0.28627717)),
 ('mac', np.float32(0.2705468)),
 ('disk', np.float32(0.26714522)),
 ('scsi', np.float32(0.24755469)),
 ('cx', np.float32(0.2305391))]
topic_model.get_topic(2)
[('flyers', np.float32(0.5347663)),
 ('puck', np.float32(0.4863899)),
 ('nhl', np.float32(0.4710263)),
 ('leafs', np.float32(0.4642067)),
 ('sabres', np.float32(0.45007592)),
 ('bruins', np.float32(0.41095752)),
 ('playoffs', np.float32(0.39904732)),
 ('hockey', np.float32(0.3952221)),
 ('pitching', np.float32(0.39289254)),
 ('braves', np.float32(0.37793285))]
topic_model.get_topic(29)
[('sphere', np.float32(0.42566895)),
 ('triangulation', np.float32(0.42115515)),
 ('algorithms', np.float32(0.37481007)),
 ('perpendicular', np.float32(0.36362517)),
 ('algorithm', np.float32(0.35225672)),
 ('3d', np.float32(0.351159)),
 ('coplanar', np.float32(0.31972635)),
 ('circle', np.float32(0.29665813)),
 ('vertices', np.float32(0.28228626)),
 ('bisector', np.float32(0.2748276))]
topic_model.visualize_topics()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File /opt/conda/envs/moee/lib/python3.10/site-packages/IPython/core/formatters.py:925, in IPythonDisplayFormatter.__call__(self, obj)
    923 method = get_real_method(obj, self.print_method)
    924 if method is not None:
--> 925     method()
    926     return True

File /opt/conda/envs/moee/lib/python3.10/site-packages/plotly/basedatatypes.py:832, in BaseFigure._ipython_display_(self)
    829 import plotly.io as pio
    831 if pio.renderers.render_on_display and pio.renderers.default:
--> 832     pio.show(self)
    833 else:
    834     print(repr(self))

File /opt/conda/envs/moee/lib/python3.10/site-packages/plotly/io/_renderers.py:394, in show(fig, renderer, validate, **kwargs)
    389         raise ValueError(
    390             "Mime type rendering requires ipython but it is not installed"
    391         )
    393     if not nbformat or Version(nbformat.__version__) < Version("4.2.0"):
--> 394         raise ValueError(
    395             "Mime type rendering requires nbformat>=4.2.0 but it is not installed"
    396         )
    398     ipython_display.display(bundle, raw=True)
    400 # external renderers

ValueError: Mime type rendering requires nbformat>=4.2.0 but it is not installed

References:

[1] https://arxiv.org/pdf/2410.10814

[2] https://huggingface.co/blog/moe

[3] https://arxiv.org/pdf/2101.03961

[4] https://arxiv.org/pdf/2307.16645
