Issues: open-compass/opencompass
Issues list
[Bug] Example code opencompass ./examples/eval_simpleqa.py fails to run
#1885
opened Feb 20, 2025 by
superzhangmch
2 tasks done
[Feature] Can we support terminating dlc and volc tasks when the oc evaluation task is terminated?
#1884
opened Feb 20, 2025 by
zhulinJulia24
1 task
[Bug] human-eval assertion failed while testing partial data
#1880
opened Feb 19, 2025 by
dengbinbox
2 tasks done
[Feature] I need to run the MMLU benchmark on a locally deployed Qwen-110B model. How should I do this?
#1877
opened Feb 18, 2025 by
hi112233445566
1 task
[Feature] Are there plans to support the datasets that appear in R1, such as Codeforces, SWE Verified, and Aider-Polyglot?
#1875
opened Feb 17, 2025 by
linbeyoung
1 task
[Bug] Medbench dataset only provides test data, not the entire dataset
#1874
opened Feb 16, 2025 by
ryan0980
2 tasks done
[Bug] Only Debug Mode can perform eval tasks correctly.
#1859
opened Feb 8, 2025 by
GenerallyCovetous
2 tasks done
[Bug] Why is lcb_test_output 0 when evaluating the DeepSeek-R1-Distill-Qwen-1.5B model on the livecodebench dataset?
#1856
opened Feb 7, 2025 by
guoguo1314
2 tasks done
[Bug] MBPP score significantly lower than official results
#1855
opened Feb 7, 2025 by
GenerallyCovetous
2 tasks done
Missing one required positional argument in tools/collect_code_preds.py#L190
#1844
opened Jan 23, 2025 by
grassFlamingo
[Feature] Could there be a summary of all evaluation datasets included in opencompass, with their metrics and links (paper/github)?
#1820
opened Jan 14, 2025 by
141forever
1 task