macOS Native Task
Build a Reminders task using osascript-based app helpers on macOS
Create a task where agents must add a reminder using the Reminders app. This shows how to use macOS app helpers that leverage osascript for evaluation.
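To make the osascript idea concrete, here is a minimal sketch of what a getter like the one used for evaluation might do under the hood. It is not cua_bench's actual implementation: the AppleScript text, the function names `parse_reminder_names` and `get_incomplete_reminders_via_osascript`, and the one-name-per-line output format are all assumptions made for illustration. Only the `{"name": ...}` shape of each result mirrors what the evaluation code in this example reads.

```python
import subprocess

# Hypothetical AppleScript: collect the names of all incomplete reminders
# and return them one per line. The real helper may use a different script.
INCOMPLETE_REMINDERS_SCRIPT = """
tell application "Reminders"
    set reminderNames to name of every reminder whose completed is false
end tell
set AppleScript's text item delimiters to linefeed
return reminderNames as text
"""


def parse_reminder_names(raw: str) -> list[dict]:
    """Turn newline-separated osascript output into a list of dicts."""
    return [{"name": line} for line in raw.splitlines() if line.strip()]


def get_incomplete_reminders_via_osascript() -> list[dict]:
    """Run the script with osascript (macOS only) and parse its stdout."""
    result = subprocess.run(
        ["osascript", "-e", INCOMPLETE_REMINDERS_SCRIPT],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if osascript exits non-zero
    )
    return parse_reminder_names(result.stdout)
```

Keeping the parsing separate from the subprocess call means the parsing logic can be checked on any machine, while the osascript call itself only works on a Mac where the calling process is allowed to control Reminders.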
Prerequisites
macOS native provider configured (runs on actual Mac hardware).
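Before running the task, it can help to confirm that the host is a Mac with `osascript` available. The check below is a hedged sketch: the function name `host_can_run_osascript` and its injectable parameters are hypothetical, and it inspects only the host machine, not whether the native provider itself is configured.

```python
import platform
import shutil


def host_can_run_osascript(system_name=None, which=shutil.which) -> bool:
    """Return True if the host looks like macOS and osascript is on PATH.

    system_name and which are injectable so the check can be exercised
    on non-Mac machines; by default the real host is inspected.
    """
    if system_name is None:
        system_name = platform.system()  # "Darwin" on macOS
    return system_name == "Darwin" and which("osascript") is not None


if __name__ == "__main__":
    if host_can_run_osascript():
        print("Host can run osascript.")
    else:
        print("This example needs macOS with osascript available.")
```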
Create the Task
Create tasks/add_reminder/main.py:
import cua_bench as cb


@cb.tasks_config(split="train")
def load():
    return [
        cb.Task(
            description="Open Reminders and create a new reminder called 'Buy groceries' in the default list.",
            metadata={"reminder_name": "Buy groceries"},
            computer={
                "provider": "native",
                "setup_config": {"os_type": "macos", "width": 1440, "height": 900},
            },
        )
    ]


@cb.setup_task(split="train")
async def start(task_cfg: cb.Task, session: cb.DesktopSession):
    """Launch Reminders app."""
    await session.apps.reminders.launch()


@cb.evaluate_task(split="train")
async def evaluate(task_cfg: cb.Task, session: cb.DesktopSession) -> list[float]:
    """Check if the reminder was created using osascript getter."""
    target_name = task_cfg.metadata["reminder_name"]

    # App helper uses osascript internally to query Reminders
    reminders = await session.apps.reminders.get_incomplete_reminders()

    for reminder in reminders:
        if reminder["name"] == target_name:
            return [1.0]

    return [0.0]


if __name__ == "__main__":
    cb.interact(__file__)

How It Works
The macOS app helpers use AppleScript via osascript to interact with first-party apps:
# These methods run osascript commands under the hood
notes = await session.apps.notes.get_all_notes()
events = await session.apps.calendar.get_events_today()
reminders = await session.apps.reminders.get_incomplete_reminders()
Run It
cb interact tasks/add_reminder
Next Steps
- Windows Native Task - File system-based evaluation
- App Helpers - Full API reference