sha-index

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

FlameF0X updated a dataset about 1 month ago

SHA-index/model-dna-index

FlameF0X updated a Space about 1 month ago

SHA-index/Search-SHA

FlameF0X

posted an update 1 day ago

Post

I did some testing on the scalability of FWKV. It hits a speed bottleneck at 1B due to the T4’s bandwidth limitations. Theoretically, it should match RWKV’s inference speed if the GPU had more bandwidth. So the speed measured at the 1B size is not accurate.
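To put the bandwidth point in numbers, here is a rough back-of-the-envelope sketch. It is not from the post: it assumes fp16 weights (2 bytes per parameter), that every weight is read once per generated token, and a T4 peak memory bandwidth of about 320 GB/s. The function name `max_tokens_per_second` is made up for illustration, and the real FWKV numbers may differ.

```python
# Rough ceiling on decode speed when inference is memory-bandwidth bound.
# Assumptions (not from the post): fp16 weights, each weight read once per
# token, and ~320 GB/s peak bandwidth for a T4.

def max_tokens_per_second(n_params: float, bytes_per_param: float,
                          bandwidth_gb_s: float) -> float:
    """Bytes per second available divided by bytes read per token."""
    bytes_per_token = n_params * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token


if __name__ == "__main__":
    # Compare a small model with a 1B model on the same assumed GPU.
    for name, n_params in [("29M", 29e6), ("1B", 1e9)]:
        ceiling = max_tokens_per_second(n_params, 2, 320.0)
        print(f"{name}: about {ceiling:.0f} tokens/s at most")
```

Under these assumptions the ceiling drops in proportion to model size, which would be consistent with a small model being compute-bound on a T4 while a 1B model runs into the bandwidth limit.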

FlameF0X

posted an update 3 days ago

Post

139

Greetings Hugging Face!

I started a new project called **FWKV** (Feed-forward Weighted Key Value, or Floored Weighted Key Value), an RWKV-style LM that uses FFNNs (Feed-Forward Neural Networks) instead of an RNN, along with floor(W·K·V). I'm hoping to make it much more efficient and scalable than RWKV.

So far I have:

- FlameF0X/FWKV-29M — this one is undertrained and doesn't have a Space yet. In the attached image you can see its speed on a T4 compared to models with the same configuration.

The only model that's fully working right now is:
- FlameF0X/FWKV-TinyStories — trained on TinyStories for one epoch. The demo Space is FlameF0X/FWKV-demo.

2 replies

FlameF0X

updated a dataset about 1 month ago

SHA-index/model-dna-index

Viewer • Updated Apr 7 • 11.3k • 14

FlameF0X

updated a Space about 1 month ago

Search SHA

🕵

Trace model weight origins using SHA256 hashes
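As a minimal sketch of the idea, a weight file can be hashed with SHA256 and the digest looked up in a table of known hashes. The Space's actual implementation is not shown here. The helper names `sha256_of_file` and `trace_origin`, and the plain-dictionary index, are hypothetical stand-ins for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large weight files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def trace_origin(path, index):
    """Look up a file's digest in a {sha256_hex: repo_id} mapping.

    `index` is a hypothetical in-memory stand-in for a hash index.
    Returns the matching repo id, or None if the hash is unknown.
    """
    return index.get(sha256_of_file(path))
```

Two files with identical bytes produce the same digest, so a match suggests a weight file was copied unchanged from a known repository. Any fine-tuning or re-serialization changes the hash, so a miss does not prove the weights are original.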

FlameF0X

in SHA-index/model-dna-index 5 months ago

[bot] Conversion to Parquet

#1 opened 5 months ago by

parquet-converter

FlameF0X

updated a Space 5 months ago

SHA-Index

🕵

FlameF0X

published a Space 5 months ago

SHA-Index

🕵

FlameF0X

published a dataset 5 months ago

SHA-index/model-dna-index

Viewer • Updated Apr 7 • 11.3k • 14

FlameF0X

published a Space 5 months ago

Search SHA

🕵

Trace model weight origins using SHA256 hashes

FlameF0X

posted an update 9 months ago

Post

4371

I am very sad to say that the budget for creating the SnowflakeCore-G1 1B and 7B MoE models has run out, and I can't pre-train them anymore.

7 replies

FlameF0X

posted an update 9 months ago

Post

819

Training for SnowflakeCore-G1-1B and 7B will resume because I have now implemented DeepSpeed and managed to use two GPUs.

FlameF0X

posted an update 10 months ago

Post

277

The development of SnowflakeCore-G1-7B-MoE is getting delayed. In the meantime, I am working on SnowflakeCore-G1-1B-MoE, which will be a pre-trained chatbot.

1 reply

FlameF0X

posted an update 10 months ago

Post

2958

SnowflakeCore-G1-7B-MoE is in development. I can't say yet when it will be published because it's big and requires a lot of computational power.

1 reply

FlameF0X

posted an update 10 months ago

Post

293

I just finished the benchmarks for https://huggingface.co/FlameF0X/SnowflakeCore-G1-Tiny and https://huggingface.co/FlameF0X/SnowflakeCore-G1-Tiny2 in comparison with openai-community/gpt2.

FlameF0X

posted an update 10 months ago

Post

315

Hello! Important announcement: I will rename SnowflakeCore-G1-Medium to SnowflakeCore-G1-Tiny2 because it will have the same parameters as the Tiny version, but it is trained on more data.

1 reply

FlameF0X

posted an update 10 months ago

Post

747

Currently working on SnowflakeCore-G1-Medium. [Updated loss curve]

3 replies

FlameF0X

posted an update 10 months ago

Post

157

Hello there world! I am happy to announce that you can now fine-tune https://huggingface.co/FlameF0X/SnowflakeCore-G1-Tiny; the code for that is in the model card.

I also lost the training log 😐

FlameF0X

posted an update 10 months ago

Post

1208

Hello! I am sad to say that fine-tuning https://huggingface.co/FlameF0X/SnowflakeCore-G1-Tiny is complicated, and the instruct version will have to wait a while.

2 replies

FlameF0X

posted an update 11 months ago

Post

231

SnowflakeCore-G1-Tiny has landed on Hugging Face! 🚀 Give it a try and let me know what you think: https://huggingface.co/FlameF0X/SnowflakeCore-G1-Tiny

FlameF0X

posted an update 11 months ago

Post

258

SnowflakeCore-G1 Update:
Got it running and training! Context window is currently set to 2048 tokens.
Training is active and stable. Will share results once I have some metrics to report.

2 replies
