Joined on 2025-10-12
zchen pushed to dev2 at zchen/b2txt25 2025-10-17 11:57:40 +08:00
zchen pushed to dev2 at zchen/b2txt25 2025-10-17 11:45:22 +08:00
57360bec8a Remove CPU optimization call and add logging for TPU strategy and data pipeline performance
zchen pushed to dev2 at zchen/b2txt25 2025-10-17 11:38:59 +08:00
zchen pushed to dev2 at zchen/b2txt25 2025-10-17 10:54:09 +08:00
zchen pushed to dev2 at zchen/b2txt25 2025-10-17 02:09:16 +08:00
zchen pushed to dev2 at zchen/b2txt25 2025-10-17 02:01:53 +08:00
zchen pushed to dev2 at zchen/b2txt25 2025-10-17 01:58:32 +08:00
zchen pushed to dev2 at zchen/b2txt25 2025-10-17 01:54:44 +08:00
zchen pushed to dev2 at zchen/b2txt25 2025-10-17 01:49:07 +08:00
zchen pushed to dev2 at zchen/b2txt25 2025-10-17 01:36:10 +08:00
zchen pushed to dev2 at zchen/b2txt25 2025-10-17 01:26:08 +08:00
0a72143513 legacy adam
zchen pushed to dev2 at zchen/b2txt25 2025-10-17 01:07:05 +08:00
7df78244e6 adamw to adam
zchen pushed to dev2 at zchen/b2txt25 2025-10-17 00:52:13 +08:00
a96e272f7b fix twice gradient cut
zchen pushed to dev2 at zchen/b2txt25 2025-10-16 23:06:11 +08:00
7a43ebfb71 refactor: streamline model building and ensure dtype consistency in L2 loss calculation
zchen pushed to dev2 at zchen/b2txt25 2025-10-16 23:06:00 +08:00
9453b70fad remove quick test script for TensorFlow implementation fixes
zchen pushed to dev2 at zchen/b2txt25 2025-10-16 22:42:40 +08:00
zchen pushed to dev2 at zchen/b2txt25 2025-10-16 22:20:11 +08:00
zchen pushed to dev2 at zchen/b2txt25 2025-10-16 22:02:13 +08:00
zchen pushed to dev2 at zchen/b2txt25 2025-10-16 21:51:46 +08:00
zchen pushed to dev2 at zchen/b2txt25 2025-10-16 21:42:26 +08:00