
mac-control

Control Mac via mouse/keyboard automation using cliclick and AppleScript. Use for clicking UI elements, taking screenshots, getting window bounds, handling coordinate scaling on Retina displays, and automating UI interactions like clicking Chrome extension icons, dismissing dialogs, or toolbar buttons.

Packaged view

This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.

Stars: 3,077.

Hot score: 99.

Updated: March 20, 2026.

Overall rating: C (4.0).

Composite score: 4.0.

Best-practice grade: B (71.9).

Install command

npx @skill-hub/cli install openclaw-skills-mac-control

Repository

openclaw/skills

Skill path: skills/easonc13/mac-control


Best for

Primary workflow: Ship Full Stack.

Technical facets: Full Stack, Frontend.

Target audience: everyone.

License: Unknown.

Original source

Catalog source: SkillHub Club.

Repository owner: openclaw.

This is still a mirrored public skill entry. Review the repository before installing into production workflows.

What it helps with

  • Install mac-control into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
  • Review https://github.com/openclaw/skills before adding mac-control to shared team environments
  • Use mac-control for development workflows

Works across

Claude Code, Codex CLI, Gemini CLI, OpenCode

Favorites: 0.

Sub-skills: 0.

Aggregator: No.

Original source / Raw SKILL.md

---
name: mac-control
description: Control Mac via mouse/keyboard automation using cliclick and AppleScript. Use for clicking UI elements, taking screenshots, getting window bounds, handling coordinate scaling on Retina displays, and automating UI interactions like clicking Chrome extension icons, dismissing dialogs, or toolbar buttons.
---

# Mac Control

Automate Mac UI interactions using cliclick (mouse/keyboard) and system tools.

## Tools

- **cliclick**: `/opt/homebrew/bin/cliclick` - mouse/keyboard control
- **screencapture**: Built-in screenshot tool
- **magick**: ImageMagick for image analysis
- **osascript**: AppleScript for window info

## Coordinate System (Eason's Mac Mini)

**Current setup**: 1920x1080 display, **1:1 scaling** (no conversion needed!)

- Screenshot coords = cliclick coords
- If screenshot shows element at (800, 500), click at (800, 500)

### For Retina Displays (2x)

If the screenshot is 2x the logical resolution, halve each coordinate before clicking:
```bash
# Convert: cliclick_coords = screenshot_coords / 2
cliclick c:$((screenshot_x / 2)),$((screenshot_y / 2))
```
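
The conversion above can be wrapped in a small function so the same call works for both 1:1 and 2x displays. This is a minimal sketch: `to_cliclick_coords` is a hypothetical helper name, not part of cliclick or of the original skill, and it assumes an integer scale factor.

```bash
#!/bin/bash
# Hypothetical helper: convert screenshot pixel coordinates to cliclick
# (logical) coordinates. Assumes an integer scale factor
# (1 = no scaling, 2 = Retina); defaults to 2.
to_cliclick_coords() {
    local x="$1" y="$2" scale="${3:-2}"
    echo "$(( x / scale )),$(( y / scale ))"
}

# Usage on macOS (assumes cliclick at the path used in this document):
#   /opt/homebrew/bin/cliclick "c:$(to_cliclick_coords 1600 1000 2)"
to_cliclick_coords 1600 1000 2   # prints 800,500
```

Passing `1` as the third argument leaves coordinates unchanged, which matches the 1:1 setup described above.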

### Calibration Script

Run to verify your scale factor:
```bash
/Users/eason/clawd/scripts/calibrate-cursor.sh
```

## cliclick Commands

```bash
# Click at coordinates
/opt/homebrew/bin/cliclick c:500,300

# Move mouse (no click) - Note: may not visually update cursor
/opt/homebrew/bin/cliclick m:500,300

# Double-click
/opt/homebrew/bin/cliclick dc:500,300

# Right-click
/opt/homebrew/bin/cliclick rc:500,300

# Click and drag
/opt/homebrew/bin/cliclick dd:100,100 du:200,200

# Type text
/opt/homebrew/bin/cliclick t:"hello world"

# Press key (return, esc, tab, etc.; run `cliclick -h` for the full key list)
/opt/homebrew/bin/cliclick kp:return
/opt/homebrew/bin/cliclick kp:esc

# Key with modifier (cmd+w to close window)
/opt/homebrew/bin/cliclick kd:cmd t:w ku:cmd

# Get current mouse position
/opt/homebrew/bin/cliclick p

# Wait before action (ms)
/opt/homebrew/bin/cliclick -w 100 c:500,300
```

## Screenshots

```bash
# Full screen (silent)
/usr/sbin/screencapture -x /tmp/screenshot.png

# With cursor (may not work for custom cursor colors)
/usr/sbin/screencapture -C -x /tmp/screenshot.png

# Interactive region selection
screencapture -i region.png

# Delayed capture
screencapture -T 3 -x delayed.png  # 3 second delay
```

## Workflow: Screenshot → Analyze → Click

**Best practice for reliable clicking:**

1. **Take screenshot**
   ```bash
   /usr/sbin/screencapture -x /tmp/screen.png
   ```

2. **View screenshot** (Read tool) to find target coordinates

3. **Click at those coordinates** (1:1 on 1920x1080)
   ```bash
   /opt/homebrew/bin/cliclick c:X,Y
   ```

4. **Verify** by taking another screenshot

### Example: Click a button

```bash
# 1. Screenshot
/usr/sbin/screencapture -x /tmp/before.png

# 2. View image, find button at (850, 450)
# (Use Read tool on /tmp/before.png)

# 3. Click
/opt/homebrew/bin/cliclick c:850,450

# 4. Verify
/usr/sbin/screencapture -x /tmp/after.png
```

## Window Bounds

```bash
# Get Chrome window bounds
osascript -e 'tell application "Google Chrome" to get bounds of front window'
# Returns: 0, 38, 1920, 1080  (left, top, right, bottom)
```
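
Because the bounds come back as edges rather than a position and size, it can help to convert them before computing click targets inside the window. The following is a sketch only; `bounds_to_rect` is a hypothetical helper and assumes the comma-separated output format shown above.

```bash
#!/bin/bash
# Hypothetical helper: turn "left, top, right, bottom" (as printed by the
# osascript call above) into a position and size.
bounds_to_rect() {
    local left top right bottom
    # Split on commas and spaces
    IFS=', ' read -r left top right bottom <<< "$1"
    echo "x=$left y=$top w=$(( right - left )) h=$(( bottom - top ))"
}

# Usage on macOS:
#   bounds_to_rect "$(osascript -e 'tell application "Google Chrome" to get bounds of front window')"
bounds_to_rect "0, 38, 1920, 1080"   # prints x=0 y=38 w=1920 h=1042
```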

## Common Patterns

### Chrome Extension Icon (Browser Relay)

Use AppleScript to find exact button position:

```bash
# Find Clawdbot extension button position
osascript -e '
tell application "System Events"
    tell process "Google Chrome"
        set toolbarGroup to group 2 of group 3 of toolbar 1 of group 1 of group 1 of group 1 of group 1 of group 1 of window 1
        set allButtons to every pop up button of toolbarGroup
        repeat with btn in allButtons
            if description of btn contains "Clawdbot" then
                return position of btn & size of btn
            end if
        end repeat
    end tell
end tell
'
# Output: 1755, 71, 34, 34 (x, y, width, height)

# Click center of button
# center_x = x + width/2 = 1755 + 17 = 1772
# center_y = y + height/2 = 71 + 17 = 88
/opt/homebrew/bin/cliclick c:1772,88
```
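
The center arithmetic in the comments above can be scripted so the AppleScript output feeds straight into a click. This is a sketch under the assumption that the output has the `x, y, width, height` form shown; `center_of` is a hypothetical helper name.

```bash
#!/bin/bash
# Hypothetical helper: compute the center point of an element from
# "x, y, width, height" and print it as "cx,cy" for cliclick.
center_of() {
    local x y w h
    # Split on commas and spaces
    IFS=', ' read -r x y w h <<< "$1"
    echo "$(( x + w / 2 )),$(( y + h / 2 ))"
}

# Usage on macOS, with the button geometry stored in $geometry:
#   /opt/homebrew/bin/cliclick "c:$(center_of "$geometry")"
center_of "1755, 71, 34, 34"   # prints 1772,88
```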

### Clicking by Color Detection

If you need to find a specific colored element:

```bash
# Find red (#FF0000) pixels in screenshot
magick /tmp/screen.png txt:- | grep "#FF0000" | head -5

# Calculate center of colored region (prints nothing if no pixel matches)
magick /tmp/screen.png txt:- | grep "#FF0000" | awk -F'[,:]' '
  BEGIN{sx=0;sy=0;c=0}
  {sx+=$1;sy+=$2;c++}
  END{if (c > 0) printf "Center: (%d, %d)\n", sx/c, sy/c}'
```

### Dialog Button Click

1. Screenshot the dialog
2. Find button coordinates visually
3. Click (no scaling on 1920x1080)

```bash
# Example: Click "OK" button at (960, 540)
/opt/homebrew/bin/cliclick c:960,540
```

### Type in Text Field

```bash
# Click to focus, then type
/opt/homebrew/bin/cliclick c:500,300
sleep 0.2
/opt/homebrew/bin/cliclick t:"Hello world"
/opt/homebrew/bin/cliclick kp:return
```

## Helper Scripts

Located in `/Users/eason/clawd/scripts/`:

- `calibrate-cursor.sh` - Calibrate coordinate scaling
- `click-at-visual.sh` - Click at screenshot coordinates
- `get-cursor-pos.sh` - Get current cursor position
- `attach-browser-relay.sh` - Auto-click Browser Relay extension

## Keyboard Navigation (When Clicks Fail)

**Google OAuth and protected pages block synthetic mouse clicks!** Use keyboard navigation:

```bash
# Tab to navigate between elements
osascript -e 'tell application "System Events" to keystroke tab'

# Shift+Tab to go backwards
osascript -e 'tell application "System Events" to key code 48 using shift down'

# Enter to activate focused element
osascript -e 'tell application "System Events" to keystroke return'

# Full workflow: Tab 3 times then Enter
osascript -e '
tell application "System Events"
    keystroke tab
    delay 0.15
    keystroke tab
    delay 0.15
    keystroke tab
    delay 0.15
    keystroke return
end tell
'
```

**When to use keyboard instead of mouse:**
- Google OAuth / login pages (anti-automation protection)
- Popup dialogs with focus trapping
- When mouse clicks consistently fail after verification

## Chrome Browser Relay & Multiple Windows

**Problem**: Browser Relay may list tabs from multiple Chrome windows, causing `snapshot` to fail on the desired tab.

**Solution**:
1. Close extra Chrome windows before automation
2. Or ensure only the target window has relay attached

**Check tabs visible to relay**:
```bash
# In agent code
browser action=tabs profile=chrome
```

If the target tab is missing from the list, the relay is attached to the wrong window.

**Verify single window**:
```bash
osascript -e 'tell application "Google Chrome" to return count of windows'
```

## Verify-Before-Click Workflow

**Critical**: Always verify coordinates BEFORE clicking important buttons.

```bash
# 1. Take screenshot
osascript -e 'do shell script "/usr/sbin/screencapture -x /tmp/before.png"'

# 2. View screenshot (Read tool), note target position

# 3. Move mouse to verify position (optional)
python3 -c "import pyautogui; pyautogui.moveTo(X, Y)"
osascript -e 'do shell script "/usr/sbin/screencapture -C -x /tmp/verify.png"'

# 4. Check cursor is on target, THEN click
/opt/homebrew/bin/cliclick c:X,Y

# 5. Take screenshot to confirm action worked
osascript -e 'do shell script "/usr/sbin/screencapture -x /tmp/after.png"'
```

## Troubleshooting

**Click lands wrong**: Verify scale factor with calibration script

**cliclick m: doesn't move cursor visually**: Use `c:` (click) instead, or check with `cliclick p` to confirm position changed

**Permission denied**: System Settings → Privacy & Security → Accessibility → Add the binary that launches these commands (here `/opt/homebrew/bin/node`)

**Window not found**: Check exact app name:
```bash
osascript -e 'tell application "System Events" to get name of every process whose background only is false'
```

**Clicks ignored on OAuth/protected pages**: These pages block synthetic events. Use keyboard navigation (Tab + Enter) instead.

**pyautogui vs cliclick coordinates differ**: Stick with cliclick for consistency. pyautogui may have different coordinate mapping.

**Quartz CGEvent clicks don't work**: Some pages (Google OAuth) block low-level mouse events too. Keyboard is the only reliable method.


---

## Skill Companion Files

> Additional files collected from the skill directory layout.

### _meta.json

```json
{
  "owner": "easonc13",
  "slug": "mac-control",
  "displayName": "Mac Control",
  "latest": {
    "version": "1.0.0",
    "publishedAt": 1771176634127,
    "commit": "https://github.com/openclaw/skills/commit/ef91676c3ac88781fe38060c5a0e804c69b6e937"
  },
  "history": []
}

```

### scripts/calibrate.sh

```bash
#!/bin/bash
# Mac Control Calibration Script
# Records the scale factor between screenshot coords and cliclick coords
# (currently a hardcoded empirical value; the screenshot is captured but not analyzed)
# Outputs: SCALE_FACTOR to stdout, saves calibration to ~/.clawdbot/mac-control-calibration.json

CALIBRATION_FILE="$HOME/.clawdbot/mac-control-calibration.json"
TMP_DIR="/tmp/mac-calibrate-$$"
mkdir -p "$TMP_DIR"
mkdir -p "$(dirname "$CALIBRATION_FILE")"

# Known test positions (cliclick logical coords)
TEST_X=500
TEST_Y=300

echo "🔧 Mac Control Calibration" >&2
echo "Moving mouse to cliclick ($TEST_X, $TEST_Y)..." >&2

# Move mouse to test position
/opt/homebrew/bin/cliclick m:$TEST_X,$TEST_Y
sleep 0.3

# Capture screenshot with cursor
/usr/sbin/screencapture -C -x "$TMP_DIR/cursor.png"

# Get screenshot dimensions
SCREENSHOT_WIDTH=$(sips -g pixelWidth "$TMP_DIR/cursor.png" | tail -1 | awk '{print $2}')
SCREENSHOT_HEIGHT=$(sips -g pixelHeight "$TMP_DIR/cursor.png" | tail -1 | awk '{print $2}')

echo "Screenshot: ${SCREENSHOT_WIDTH}x${SCREENSHOT_HEIGHT}" >&2

# Find cursor position in screenshot using image analysis
# We'll use a simple approach: crop regions and look for cursor shape
# For now, use the measured scale factor based on testing

# Standard Retina would be 2x, but this Mac uses different scaling
# Calculate based on screenshot vs expected logical resolution
LOGICAL_WIDTH=$((SCREENSHOT_WIDTH / 2))
LOGICAL_HEIGHT=$((SCREENSHOT_HEIGHT / 2))

# If screenshot is 3840x2160 and logical is 1920x1080, scale is 2.0
# But actual clicking shows different behavior

# Use empirical measurement: cliclick 500,200 appears at display ~200,80
# Scale factor = cliclick / display = 500/200 = 2.5
SCALE_FACTOR="2.5"

# For more precise calibration, we'd need image processing to find cursor
# For now, output the known scale factor

echo "Calibration complete." >&2
echo "Scale factor: $SCALE_FACTOR" >&2

# Save calibration
cat > "$CALIBRATION_FILE" << EOF
{
  "timestamp": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
  "screenshotWidth": $SCREENSHOT_WIDTH,
  "screenshotHeight": $SCREENSHOT_HEIGHT,
  "logicalWidth": $LOGICAL_WIDTH,
  "logicalHeight": $LOGICAL_HEIGHT,
  "scaleFactor": $SCALE_FACTOR,
  "note": "cliclick_coords = display_coords * scaleFactor"
}
EOF

echo "Saved to: $CALIBRATION_FILE" >&2

# Output just the scale factor for scripting
echo "$SCALE_FACTOR"

# Cleanup
rm -rf "$TMP_DIR"

```

### scripts/click-at-display.sh

```bash
#!/bin/bash
# Click at display coordinates (from screenshot viewer)
# Usage: click-at-display.sh <display_x> <display_y> [--double] [--right]
#
# This script converts display coordinates (what you see in image viewer)
# to cliclick logical coordinates using the calibrated scale factor.

CALIBRATION_FILE="$HOME/.clawdbot/mac-control-calibration.json"

# Default scale factor if not calibrated
DEFAULT_SCALE=2.5

# Parse args
DISPLAY_X=$1
DISPLAY_Y=$2
CLICK_TYPE="c"  # single click

shift 2
while [[ $# -gt 0 ]]; do
    case $1 in
        --double) CLICK_TYPE="dc" ;;
        --right) CLICK_TYPE="rc" ;;
        *) ;;
    esac
    shift
done

if [[ -z "$DISPLAY_X" || -z "$DISPLAY_Y" ]]; then
    echo "Usage: click-at-display.sh <display_x> <display_y> [--double] [--right]" >&2
    echo "" >&2
    echo "Coordinates should be from the displayed image (e.g., 2000x1125 display of 3840x2160 screenshot)" >&2
    exit 1
fi

# Get scale factor from calibration or use default
if [[ -f "$CALIBRATION_FILE" ]]; then
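    SCALE=$(grep scaleFactor "$CALIBRATION_FILE" | sed 's/.*: *\([0-9.]*\).*/\1/')
    # Fall back to the default if the file exists but has no parsable scaleFactor
    SCALE=${SCALE:-$DEFAULT_SCALE}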
else
    SCALE=$DEFAULT_SCALE
    echo "⚠️  No calibration found, using default scale: $SCALE" >&2
    echo "   Run: bash scripts/calibrate.sh" >&2
fi

# The displayed image is typically 2000x1125 showing a 3840x2160 screenshot
# Display ratio: 3840/2000 = 1.92
DISPLAY_RATIO=1.92

# Convert display coords to original screenshot coords
# (for reference only; ORIG_X/ORIG_Y are not used for the click below)
ORIG_X=$(echo "$DISPLAY_X * $DISPLAY_RATIO" | bc)
ORIG_Y=$(echo "$DISPLAY_Y * $DISPLAY_RATIO" | bc)

# Convert display coords to cliclick coords.
# In principle cliclick = original / screenshot_scale, but multiplying the
# display coords by SCALE directly was found to work better in practice.
CLICLICK_X=$(echo "$DISPLAY_X * $SCALE" | bc | cut -d. -f1)
CLICLICK_Y=$(echo "$DISPLAY_Y * $SCALE" | bc | cut -d. -f1)

echo "Display: ($DISPLAY_X, $DISPLAY_Y) → cliclick: ($CLICLICK_X, $CLICLICK_Y)" >&2

# Perform click
/opt/homebrew/bin/cliclick "$CLICK_TYPE:$CLICLICK_X,$CLICLICK_Y"

echo "Clicked at ($CLICLICK_X, $CLICLICK_Y)"

```
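
The calibration lookup and the coordinate scaling in this script can be sketched as two standalone functions, which makes them easier to test away from a Mac. Both names (`read_scale`, `scale_coord`) are hypothetical. Unlike the `bc | cut` pipeline above, which truncates, `scale_coord` rounds to the nearest integer using awk.

```bash
#!/bin/bash
# Hypothetical helper: read "scaleFactor" from a calibration JSON file,
# falling back to a default when the file or the key is missing.
read_scale() {
    local file="$1" default="${2:-2.5}" value=""
    if [[ -f "$file" ]]; then
        value=$(sed -n 's/.*"scaleFactor": *\([0-9.]*\).*/\1/p' "$file" | head -1)
    fi
    echo "${value:-$default}"
}

# Hypothetical helper: multiply a non-negative coordinate by the scale
# and round to the nearest integer.
scale_coord() {
    awk -v v="$1" -v s="$2" 'BEGIN { printf "%d\n", v * s + 0.5 }'
}

# Usage:
#   SCALE=$(read_scale "$HOME/.clawdbot/mac-control-calibration.json" 2.5)
#   X=$(scale_coord 200 "$SCALE")
scale_coord 200 2.5   # prints 500
```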

### scripts/crop-image.sh

```bash
#!/bin/bash
# Crop a region from an image using sips
# Usage: crop-image.sh <input> <output> <x> <y> <width> <height>
#
# Note: sips uses a confusing syntax where -c takes HEIGHT WIDTH (reversed!)
# and, per the sips man page, --cropOffset takes offsetY offsetX (Y first as well)

INPUT="$1"
OUTPUT="$2"
X="$3"
Y="$4"
WIDTH="$5"
HEIGHT="$6"

if [ $# -lt 6 ]; then
    echo "Usage: crop-image.sh <input> <output> <x> <y> <width> <height>"
    echo ""
    echo "Crops a region starting at (x,y) with given width and height."
    echo "Coordinates use top-left as origin (0,0)."
    echo ""
    echo "Example: crop-image.sh screen.png toolbar.png 1600 0 400 100"
    exit 1
fi

if [ ! -f "$INPUT" ]; then
    echo "Error: Input file '$INPUT' not found"
    exit 1
fi

# Get original dimensions
ORIG_WIDTH=$(sips -g pixelWidth "$INPUT" | tail -1 | awk '{print $2}')
ORIG_HEIGHT=$(sips -g pixelHeight "$INPUT" | tail -1 | awk '{print $2}')

# Validate crop region
if [ $((X + WIDTH)) -gt "$ORIG_WIDTH" ] || [ $((Y + HEIGHT)) -gt "$ORIG_HEIGHT" ]; then
    echo "Warning: Crop region extends beyond image bounds"
    echo "Image: ${ORIG_WIDTH}x${ORIG_HEIGHT}, Crop: ${X},${Y} + ${WIDTH}x${HEIGHT}"
fi

# Copy input to output first (sips modifies in place)
cp "$INPUT" "$OUTPUT"

# sips -c HEIGHT WIDTH crops to that size (centered by default)
# --cropOffset Y X sets where the crop starts (from top-left)
# Note: both options take the vertical value first (counterintuitive!)
sips --cropOffset "$Y" "$X" -c "$HEIGHT" "$WIDTH" "$OUTPUT" > /dev/null 2>&1

echo "Cropped: ${WIDTH}x${HEIGHT} from (${X},${Y}) -> $OUTPUT"

```

### scripts/find-element.sh

```bash
#!/bin/bash
# Find element position interactively
# Takes screenshot, crops around mouse position, helps identify precise coordinates
#
# Usage: find-element.sh [--move x,y] [--crop x,y,w,h]

TMP_DIR="/tmp/mac-find-element-$$"
mkdir -p "$TMP_DIR"

# Get current mouse position
MOUSE_POS=$(/opt/homebrew/bin/cliclick p)
MOUSE_X=$(echo $MOUSE_POS | cut -d, -f1)
MOUSE_Y=$(echo $MOUSE_POS | cut -d, -f2)

echo "Current mouse position (cliclick): $MOUSE_X, $MOUSE_Y"

# Parse args
MOVE_TO=""
CROP_REGION=""

while [[ $# -gt 0 ]]; do
    case $1 in
        --move)
            MOVE_TO=$2
            shift 2
            ;;
        --crop)
            CROP_REGION=$2
            shift 2
            ;;
        *)
            shift
            ;;
    esac
done

# Move mouse if requested
if [[ -n "$MOVE_TO" ]]; then
    /opt/homebrew/bin/cliclick m:$MOVE_TO
    MOUSE_POS=$MOVE_TO
    MOUSE_X=$(echo $MOUSE_POS | cut -d, -f1)
    MOUSE_Y=$(echo $MOUSE_POS | cut -d, -f2)
    echo "Moved to: $MOUSE_X, $MOUSE_Y"
fi

# Take screenshot with cursor
/usr/sbin/screencapture -C -x "$TMP_DIR/full.png"
echo "Screenshot saved: $TMP_DIR/full.png"

# Get dimensions
WIDTH=$(sips -g pixelWidth "$TMP_DIR/full.png" | tail -1 | awk '{print $2}')
HEIGHT=$(sips -g pixelHeight "$TMP_DIR/full.png" | tail -1 | awk '{print $2}')
echo "Screenshot dimensions: ${WIDTH}x${HEIGHT}"

# Crop around mouse if no specific region
if [[ -z "$CROP_REGION" ]]; then
    # Scale factor from cliclick to screenshot coords
    SCALE_FACTOR=2  # Standard assumption: screenshot = 2x cliclick
    
    # Calculate crop region centered on mouse
    CROP_SIZE=400
    CROP_X=$((MOUSE_X * SCALE_FACTOR - CROP_SIZE / 2))
    CROP_Y=$((MOUSE_Y * SCALE_FACTOR - CROP_SIZE / 2))
    
    # Ensure within bounds
    [[ $CROP_X -lt 0 ]] && CROP_X=0
    [[ $CROP_Y -lt 0 ]] && CROP_Y=0
    
    CROP_REGION="$CROP_X,$CROP_Y,$CROP_SIZE,$CROP_SIZE"
fi

# Parse crop region
IFS=',' read -r CX CY CW CH <<< "$CROP_REGION"

echo "Cropping region: x=$CX, y=$CY, w=$CW, h=$CH"

# Use sips to crop
# sips crops from top-left, we need to calculate properly
sips -c $CH $CW --cropOffset $CY $CX "$TMP_DIR/full.png" --out "$TMP_DIR/cropped.png" 2>/dev/null

echo "Cropped image: $TMP_DIR/cropped.png"
echo ""
echo "To view: open $TMP_DIR/cropped.png"
echo "Full screenshot: $TMP_DIR/full.png"

# Output paths for scripting
echo "FULL=$TMP_DIR/full.png"
echo "CROPPED=$TMP_DIR/cropped.png"

```

### scripts/get-screen-info.sh

```bash
#!/bin/bash
# Get screen resolution and scale factor
# Outputs: physical resolution, logical resolution, and scale factor

echo "=== Display Info ==="
system_profiler SPDisplaysDataType 2>/dev/null | grep -E "Display Type|Resolution|Retina|Main Display"

echo ""
echo "=== Logical Screen Size (what apps see) ==="
# Get logical resolution via system_profiler or defaults
osascript -e 'tell application "Finder" to get bounds of window of desktop' 2>/dev/null || \
    system_profiler SPDisplaysDataType 2>/dev/null | grep "UI Looks like"

echo ""
echo "=== Quick Reference ==="
# Parse physical resolution
PHYSICAL=$(system_profiler SPDisplaysDataType 2>/dev/null | grep "Resolution:" | head -1 | sed 's/.*: //')
echo "Physical: $PHYSICAL"

# Determine scale factor (common Retina widths all map to the same hint)
if echo "$PHYSICAL" | grep -qE "3840|2560|5120"; then
    echo "Likely scale: 2x (divide screenshot coords by 2 for cliclick)"
else
    echo "Scale: Check 'UI Looks like' vs 'Resolution' to determine"
fi

```
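
When the script falls through to the "check manually" branch, the scale factor is the ratio of the physical width to the logical ("UI Looks like") width. A minimal sketch of that arithmetic follows; `scale_factor` is a hypothetical helper, and the two widths are assumed to have been read off the `system_profiler` output by hand.

```bash
#!/bin/bash
# Hypothetical helper: compute the screenshot-to-cliclick scale factor
# from the physical width and the logical ("UI Looks like") width.
scale_factor() {
    awk -v p="$1" -v l="$2" 'BEGIN {
        if (l <= 0) exit 1          # avoid division by zero
        printf "%.2f\n", p / l
    }'
}

scale_factor 3840 1920   # prints 2.00
scale_factor 1920 1920   # prints 1.00
```

A result of `1.00` corresponds to the 1:1 case, where screenshot coordinates can be passed to cliclick unchanged.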

### scripts/get-window-bounds.sh

```bash
#!/bin/bash
# Get window bounds for an application
# Usage: get-window-bounds.sh [AppName]
# If no app specified, gets frontmost window

APP_NAME="${1:-}"

if [ -z "$APP_NAME" ]; then
    # Get frontmost application's window
    osascript -e '
    tell application "System Events"
        set frontApp to first application process whose frontmost is true
        set appName to name of frontApp
        tell frontApp
            set win to front window
            set {x, y} to position of win
            set {w, h} to size of win
        end tell
    end tell
    return "app:" & appName & " x:" & x & " y:" & y & " width:" & w & " height:" & h
    '
else
    # Get specific app's window
    osascript -e "
    tell application \"System Events\"
        tell process \"$APP_NAME\"
            set win to front window
            set {x, y} to position of win
            set {w, h} to size of win
        end tell
    end tell
    return \"x:\" & x & \" y:\" & y & \" width:\" & w & \" height:\" & h
    "
fi

```
