ai21labs/Jamba-v0.1 · Discussions

Install & run ai21labs/Jamba-v0.1 easily using llmpm

#53 opened 3 months ago by

sarthak-saxena

Update README.md

#52 opened about 1 year ago by

Motit

TypeError: forward() got an unexpected keyword argument 'num_logits_to_keep'

#51 opened almost 2 years ago by

shajiu

Adding Evaluation Results

#50 opened almost 2 years ago by

leaderboard-pr-bot

AttributeError: 'HybridMambaAttentionDynamicCache' object has no attribute '_modules'

7

#48 opened almost 2 years ago by

xxrjun

Adding Evaluation Results

#47 opened almost 2 years ago by

leaderboard-pr-bot

AI21 instance not runnable with LangChain

1

#45 opened almost 2 years ago by

LordSahu

Is there any SFT or Chat model?

2

#41 opened about 2 years ago by

chuyi777

How to use accelerate to evaluate Jamba

#40 opened about 2 years ago by

Xidong

Jamba Evaluation Task on GSM8K

➕ 3

#39 opened about 2 years ago by

ssparks

Do you have plans to release papers on Jamba's architecture or miniature models?

👍 2

#38 opened about 2 years ago by

badrabbitt

Are there any weight files for pre-trained models?

#37 opened about 2 years ago by

aidenxy

Memory usage on single A100*80GB in training

👍 2

#36 opened about 2 years ago by

DavidWu1116

Fast Mamba

5

#34 opened about 2 years ago by

Praneethkeerthi

Why does throughput increase with longer context window?

3

#33 opened about 2 years ago by

jingyu-q

Request: DOI

#32 opened about 2 years ago by

kozolex

GGUF quants?

1

#31 opened about 2 years ago by

6346y9uey

Any release plans for the 7B Jamba model without MoE?

➕ 2

2

#30 opened about 2 years ago by

danielpark

Why is there an MLP in the Mamba Layer?

👀 3

#28 opened about 2 years ago by

naston

Complex vs Real parametrization.

#27 opened about 2 years ago by

YuvMil

How to fine-tune Jamba on Google Colab?

7

#26 opened about 2 years ago by

Ateeqq

Layer-Selective Rank Reduction

#25 opened about 2 years ago by

mizinovmv

Update README.md

#23 opened about 2 years ago by

rombodawg

Is there a chance Jamba could be trained with 1.58-bit weights?

👍 6

1

#22 opened about 2 years ago by

shing3232

Anyone else currently experimenting with fine-tuning Jamba?

3

#21 opened about 2 years ago by

Severian

IndentationError: unindent does not match any outer indentation level

#19 opened about 2 years ago by

thebeline

ModuleNotFoundError: No module named 'transformers_modules.ai21labs.Jamba-v0'

5

#17 opened about 2 years ago by

hjewr

Fast Mamba kernels are not available

10

#16 opened about 2 years ago by

MohamedRashad

Do all safetensors files need to be downloaded to use this model on Colab?

2

#14 opened about 2 years ago by

Kv-boii

How many pretraining tokens?

👍 12

#13 opened about 2 years ago by

CyberNative

Smaller version to ease implementation experiments?

🔥👍 6

7

#12 opened about 2 years ago by

compilade

Coding performance of base model?

4

#11 opened about 2 years ago by

rombodawg

Jambaleo

🤯🔥 12

#10 opened about 2 years ago by

pszemraj

Can you give a short explanation about the benefits and the architecture?

👍 2

2

#7 opened about 2 years ago by

SicariusSicariiStuff

A Bang Up Job

🔥 1

2

#4 opened about 2 years ago by

nightvision04

Multiple GPUs?

3

#3 opened about 2 years ago by

bdambrosio

Just a solid congrats and thank you to your team

🔥❤️ 54

1

#1 opened about 2 years ago by

Severian