Jim Lai

grimjim

AI & ML interests

Experimenting with 7B-9B parameter text completion models. Not all models are intended for direct use; some exist for educational and/or merge purposes.

Posts 4

I propose "merge densification", a style of merge that attempts to transfer the benefits of a denser model to a base model. The weight given to the denser model in this case is 0.02, which is atypically small for merges but high compared to the learning rate used during training. The expected result is more creative text generation. More details below:
grimjim/kunoichi-lemon-royale-v3-32K-7B
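
To make the 0.02 weight concrete, here is a minimal sketch of what such a low-weight merge could look like. It assumes plain linear interpolation between two models with identical parameter names and shapes; the post does not state which merge method was actually used, so the function `densify`, the toy parameter name `layer.weight`, and the values are all hypothetical.

```python
def densify(base, donor, weight=0.02):
    """Blend a small fraction of the donor's weights into the base, tensor by tensor.

    Assumption: simple linear interpolation, merged = base + weight * (donor - base).
    Tensors are represented as flat lists of floats to keep the sketch dependency-free.
    """
    if base.keys() != donor.keys():
        raise ValueError("models must share the same parameter names")
    merged = {}
    for name, base_vals in base.items():
        donor_vals = donor[name]
        if len(base_vals) != len(donor_vals):
            raise ValueError(f"shape mismatch for {name}")
        # Each parameter moves 2% of the way from the base value toward the donor value.
        merged[name] = [b + weight * (d - b) for b, d in zip(base_vals, donor_vals)]
    return merged


# Toy example with a single three-element "tensor" (hypothetical values).
base = {"layer.weight": [1.0, -2.0, 0.5]}
donor = {"layer.weight": [2.0, -1.0, 0.0]}
merged = densify(base, donor)
```

With a weight this small, every merged parameter stays close to the base model, which is the sense in which the donor only nudges the result.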