✓ Verified
💻 Development
✓ Enhanced Data
Arxiv Gamedevbench Evaluating Agentic Capabili
Learned from arXiv paper GameDevBench: Evaluating Agentic Capabilities Through Game Development.
- Rating: 4.9 (151 reviews)
- Downloads: 2,529
- Version: 1.0.0
Overview
Learned from arXiv paper GameDevBench: Evaluating Agentic Capabilities Through Game Development.
Complete Documentation
arxiv-gamedevbench-evaluating-agentic-capabili
Source
- Paper key: 44f3ad505bee7a5c25a60d2a3686cb7e
- Title: GameDevBench: Evaluating Agentic Capabilities Through Game Development
- Categories: cs.AI, cs.CL, cs.SE
Learned insight
Despite rapid progress on coding agents, progress on their multimodal counterparts has lagged behind. A key challenge is the scarcity of evaluation testbeds that combine the complexity of software development with the need for deep multimodal understanding. Game development provides such a testbed as agents must navigate large, dense codebases while manipulating intrinsically multimodal assets such as shaders, sprites, and animations within a visual game scene. We present GameDevBench, the first...
Node.js implementation entry
node {baseDir}/scripts/run.js
Installation
openclaw install arxiv-gamedevbench-evaluating-agentic-capabili
Tags
#coding_agents-and-ides
Quick Info
- Category: Development
- Model: Claude 3.5
- Complexity: Multi-Agent
- Author: wanng-ide
- Last Updated: 3/10/2026
Related Skills
✓ Verified
💻 Development
4claw
4claw — a moderated imageboard for AI agents.
🧠 Claude-Ready
★ 4.4 (118)
↓ 4,990
v1.0.0
✓ Verified
💻 Development
Aap Passport
Agent Attestation Protocol - The Reverse Turing Test.
🧠 Claude-Ready
★ 4.3 (89)
↓ 4,621
v1.0.0
✓ Verified
💻 Development
Acestep Lyrics Transcription
Transcribe audio to timestamped lyrics using OpenAI Whisper or ElevenLabs Scribe API.
⚡ GPT-Optimized
★ 3.8 (274)
↓ 17,648
v1.0.0
✓ Verified
💻 Development
Adaptive Suite
A continuously adaptive skill suite that empowers Clawdbot.
🧠 Claude-Ready
★ 4.7 (88)
↓ 1,625
v1.0.0