songsenand
0 Followers · 0 Following
Joined on 2024-01-17
Repositories
12
Public Activity
songsenand pushed to main at songsenand/SUInput
2026-02-15 01:07:06 +08:00
94b44e6f71 Add loss-weight support and refactor part of the module structure
songsenand pushed to main at songsenand/SUInput
2026-02-15 00:25:48 +08:00
515f261824 Fix model loading: use the correct instance method to load the state dict
songsenand pushed to main at songsenand/SUInput
2026-02-15 00:08:59 +08:00
fd913748ca Adjust the dropout probabilities of the residual blocks and classification head, and add a residual module to the MoE model
songsenand pushed to main at songsenand/SUInput
2026-02-14 23:34:41 +08:00
e91f823d65 feat: optimize model input handling and expert count; improve compatibility between training and inference
songsenand pushed to main at songsenand/SUInput
2026-02-14 17:07:42 +08:00
9fad2bf1d4 Fix loss computation: use NLLLoss instead of the original criterion
songsenand pushed to main at songsenand/SUInput
2026-02-14 15:51:11 +08:00
f89635b201 Add package-data configuration to include extra data files for the trainer and suinput modules
songsenand pushed to main at songsenand/SUInput
2026-02-14 15:43:31 +08:00
d60997438e Update the evaluation dataset sample file
songsenand pushed to main at songsenand/SUInput
2026-02-14 15:29:35 +08:00
b68f75b09d Fix char_info.pinyin access: use dictionary-style access to ensure compatibility
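songsenand pushed to main at songsenand/SUInput
2026-02-14 15:27:16 +08:00
d2d65c7efa Reorder imports and fix the pickle saving logic
134c8a09cf feat: refactor the pinyin input dataset and MoE model structure; optimize expert network configuration and evaluation logic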
Compare 2 commits »
songsenand pushed to main at songsenand/SUInput
2026-02-13 16:13:21 +08:00
7eb00c6207 feat(model): optimize the expert output structure and add expert bias support
songsenand pushed to main at songsenand/SUInput
2026-02-13 15:44:52 +08:00
f4be47df78 feat(trainer): compute the output dimension from hidden_size instead of d_model and add a pooling layer
songsenand pushed to main at songsenand/SUInput
2026-02-13 14:20:15 +08:00
d82c80f3a9 Fix the classification head output dimension: use d_model instead of hidden_size
songsenand pushed to main at songsenand/SUInput
2026-02-13 14:15:51 +08:00
6923870171 Fix the output dimension calculation error: use d_model instead of input_dim
songsenand pushed to main at songsenand/SUInput
2026-02-13 12:58:38 +08:00
0e3418798e Add custom learning-rate schedule support and optimize the default optimizer configuration
songsenand pushed to main at songsenand/SUInput
2026-02-13 12:12:45 +08:00
335540d8c2 Adjust the learning-rate threshold and improve log output precision
songsenand pushed to main at songsenand/SUInput
2026-02-13 11:31:09 +08:00
02f851205f Fix the incorrect average loss calculation during periodic evaluation
songsenand pushed to main at songsenand/SUInput
2026-02-13 11:05:58 +08:00
92b12ef703 Adjust the dataset shuffle buffer size and optimize the sample processing logic
songsenand pushed to main at songsenand/SUInput
2026-02-13 10:55:17 +08:00
54ac5af876 feat: optimize data loading and training logic; add custom learning-rate schedule support
songsenand pushed to main at songsenand/SUInput
2026-02-13 01:45:22 +08:00
982d0521d5 Add logging and ensure the model is in training mode
songsenand pushed to main at songsenand/SUInput
2026-02-13 01:32:03 +08:00
35e835f618 Use the hint field instead of the raw input_ids and attention_mask for inference