Resources for "Measure what Matters: Psychometric Evaluation of AI with Situational Judgment Tests" (https://arxiv.org/abs/2510.22170)
AI & ML interests
We work with you to develop a high-impact AI strategy for your industry, refine your data foundations, and design meaningful human-AI interactions. We also empower you to develop, integrate, and test the latest AI technologies responsibly.
models 10
thoughtworks/DeepSeek-R1-Distill-Qwen-14B-Eagle3
Text Generation • Updated • 273
thoughtworks/DeepSeek-R1-Distill-Qwen-7B-Eagle3
Text Generation • Updated • 266
thoughtworks/Qwen2.5-7B-Instruct-Eagle3
Text Generation • Updated • 264
thoughtworks/Llama-3.2-3B-Instruct-Eagle3
Text Generation • Updated • 260
thoughtworks/Qwen3-32B-Eagle3
Text Generation • Updated • 249
thoughtworks/Qwen3-14B-Eagle3
Text Generation • Updated • 239
thoughtworks/Qwen3-8B-Eagle3
Text Generation • Updated • 241
thoughtworks/Qwen2.5-14B-Instruct-Eagle3
Text Generation • Updated • 225
thoughtworks/Llama-3.1-8B-Instruct-Eagle3
Text Generation • Updated • 221
thoughtworks/GLM-4.7-Flash-Eagle3
Text Generation • 0.1B • Updated • 196 • 2
datasets 12
thoughtworks/ablation_psychometrics_personas
Viewer • Updated • 500 • 20
thoughtworks/gemma_psychometrics_personas_responses
Viewer • Updated • 3.98M • 185 • 1
thoughtworks/psychometric_personas
Viewer • Updated • 23.6k • 132
thoughtworks/psychometric_sjts_analysis
Viewer • Updated • 1.85k • 76
thoughtworks/psychometric_personas_responses
Viewer • Updated • 4.57M • 160 • 1
thoughtworks/CulturalCounterfactuals
Updated • 8
thoughtworks/psychometric_human_annotations
Viewer • Updated • 55 • 8
thoughtworks/parliamentary_personas
Viewer • Updated • 2.2k • 11
thoughtworks/psychometric_personas_temp
Viewer • Updated • 50 • 11
thoughtworks/wiki_bio
Viewer • Updated • 728k • 24