songsenand
  • Joined on 2024-01-17
songsenand pushed to main at songsenand/SUIME 2026-03-01 10:53:27 +08:00
8a29e86c37 docs(README): fix formatting issues in the client frontend description
songsenand pushed to main at songsenand/SUIME 2026-03-01 10:51:48 +08:00
085d90b5d3 feat(tokenizer): add custom tokenizer configuration file
songsenand pushed to main at songsenand/SUIME 2026-02-28 17:56:28 +08:00
7abf0edce1 Update the Linux client fcitx5 plugin name to fcitx5-ext
songsenand pushed to main at songsenand/SUIME 2026-02-28 17:21:47 +08:00
e715845d19 feat(readme): add project architecture, technical details, and development roadmap
songsenand pushed to main at songsenand/SUIME 2026-02-28 16:19:32 +08:00
84f33d0845 chore: add basic Cargo project files and main function
songsenand created repository songsenand/SUIME 2026-02-28 16:08:59 +08:00
songsenand pushed to main at songsenand/SUInput 2026-02-28 09:42:41 +08:00
d88a68e421 feat(model): add MoEModelWithNeck class and attention pooling module
songsenand pushed to main at songsenand/SUInput 2026-02-26 23:03:57 +08:00
4031a668da feat(model): add tensorboard dependency and refactor training monitoring logic
songsenand pushed to main at songsenand/SUInput 2026-02-26 14:36:46 +08:00
43c8349d51 fix(trainer): remove fixed total step count; compute warmup steps from the actual stopping batch
songsenand pushed to main at songsenand/SUInput 2026-02-26 14:30:51 +08:00
1178f87713 fix(dataset): return None with 6% probability to increase data diversity
songsenand pushed to main at songsenand/SUInput 2026-02-26 14:14:03 +08:00
dfcce1f1ed feat(dataset): adjust sampling and processing logic of the pinyin input dataset to improve results
songsenand pushed to main at songsenand/SUInput 2026-02-26 01:19:35 +08:00
66c2f78dda fix(model): remove gradient NaN check and run the optimizer step directly
songsenand pushed to main at songsenand/SUInput 2026-02-26 01:00:37 +08:00
b0a4ce9ac8 fix(model): fix evaluation loss calculation to avoid division by zero
songsenand pushed to main at songsenand/SUInput 2026-02-26 00:58:27 +08:00
dc718cde5b fix(dataset): add token_type_ids to the hint field of the collate function
songsenand pushed to main at songsenand/SUInput 2026-02-26 00:48:34 +08:00
7c90633ebc refactor(model): replace span pooling with attention pooling and support token_type_ids
songsenand pushed to main at songsenand/SUInput 2026-02-25 16:56:43 +08:00
93dced50c7 feat(model): update model architecture, use GELU activation, and optimize expert network parameters
songsenand pushed to main at songsenand/SUInput 2026-02-24 01:06:17 +08:00
db90516fcf fix(encoder): fix missing src_key_padding_mask argument in encoder call
songsenand pushed to main at songsenand/SUInput 2026-02-24 00:53:01 +08:00
b1f78668dc fix(model): correct the pooling layer's input source so the feature vector is computed correctly
songsenand pushed to main at songsenand/SUInput 2026-02-24 00:49:04 +08:00
be6b686bd1 refactor(model): remove the padding_mask argument from the encoder call
songsenand pushed to main at songsenand/SUInput 2026-02-24 00:10:38 +08:00
63efc49aa6 feat(model): add training-completion notification that sends a WeChat message via ServerChan