Recent Activity

FlameF0X  updated a dataset about 1 month ago
SHA-index/model-dna-index
FlameF0X  updated a Space about 1 month ago
SHA-index/Search-SHA

FlameF0X 
posted an update 1 day ago
I did some testing on the scalability of FWKV. It hits a speed bottleneck at 1B parameters due to the T4’s memory bandwidth limitations. In theory, it should match RWKV’s inference speed on a GPU with more bandwidth, so the speed measured at 1B is not representative.
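
A rough way to see why bandwidth could be the limit is a back-of-envelope bound, not a measurement of FWKV. It assumes fp16 weights, batch size 1, and that every weight is read from GPU memory once per generated token. The 320 GB/s figure is NVIDIA's published peak for the T4, and the function name is only illustrative.

```python
# Back-of-envelope upper bound on decode speed when inference is limited
# by memory bandwidth. Assumptions (not taken from FWKV itself): fp16
# weights, batch size 1, every weight read once per generated token,
# activations and recurrent state ignored.

T4_PEAK_BANDWIDTH_GB_S = 320.0  # NVIDIA's published peak for the T4


def max_decode_tokens_per_s(n_params: float,
                            bytes_per_param: float,
                            bandwidth_gb_s: float) -> float:
    """Tokens/s ceiling if each token requires streaming all weights once."""
    weight_bytes = n_params * bytes_per_param
    return bandwidth_gb_s * 1e9 / weight_bytes


if __name__ == "__main__":
    for label, n_params in [("29M", 29e6), ("1B", 1e9)]:
        ceiling = max_decode_tokens_per_s(n_params, 2, T4_PEAK_BANDWIDTH_GB_S)
        print(f"{label}: at most ~{ceiling:.0f} tokens/s on a T4")
```

Under these assumptions a 1B model tops out around 160 tokens/s on a T4 no matter how cheap the per-token compute is, while a 29M model stays far below that ceiling. Real throughput will be lower than the bound.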
FlameF0X 
posted an update 3 days ago
Greetings Hugging Face!

I started a new project called **FWKV** (Feed-forward Weighted Key Value, or Floored Weighted Key Value), an RWKV-style LM that uses FFNNs (feed-forward neural networks) instead of an RNN, along with floor(W·K·V). I'm hoping to make it much more efficient and scalable than RWKV.

So far I have:

- FlameF0X/FWKV-29M — this one is undertrained and doesn't have a Space yet. In the attached image you can see its speed on a T4 compared to models with the same configuration.

The only model that's fully working right now is:
- FlameF0X/FWKV-TinyStories — trained on TinyStories for one epoch. The demo Space is FlameF0X/FWKV-demo.
FlameF0X 
updated a Space 5 months ago
FlameF0X 
published a Space 5 months ago
FlameF0X 
posted an update 9 months ago
I am very sad to say that the budget for creating the SnowflakeCore-G1 1B and 7B MoE models has run out, so I can't pre-train them anymore.
FlameF0X 
posted an update 9 months ago
Training for SnowflakeCore-G1-1B and 7B will resume, because I have now implemented DeepSpeed and managed to use two GPUs.
FlameF0X 
posted an update 10 months ago
The development of SnowflakeCore-G1-7B-MoE is being delayed. In the meantime, I am working on SnowflakeCore-G1-1B-MoE, which will be a pre-trained chatbot.
FlameF0X 
posted an update 10 months ago
On the development of SnowflakeCore-G1-7B-MoE: I can't say yet when it will be published, because it's big and requires a lot of computational power.
FlameF0X 
posted an update 10 months ago
FlameF0X 
posted an update 10 months ago
Hello! Important announcement: I will rename SnowflakeCore-G1-Medium to SnowflakeCore-G1-Tiny2, because it will have the same number of parameters as the Tiny version but is trained on more data.
FlameF0X 
posted an update 10 months ago
Currently working on SnowflakeCore-G1-Medium. [Updated loss curve]
FlameF0X 
posted an update 10 months ago
FlameF0X 
posted an update 10 months ago
FlameF0X 
posted an update 11 months ago
FlameF0X 
posted an update 11 months ago
SnowflakeCore-G1 Update:
Got it running and training! Context window is currently set to 2048 tokens.
Training is active and stable. Will share results once I have some metrics to report.