mac-control
Control Mac via mouse/keyboard automation using cliclick and AppleScript. Use for clicking UI elements, taking screenshots, getting window bounds, handling coordinate scaling on Retina displays, and automating UI interactions like clicking Chrome extension icons, dismissing dialogs, or toolbar buttons.
Packaged view
This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.
Install command
npx @skill-hub/cli install openclaw-skills-mac-control
Repository
Skill path: skills/easonc13/mac-control
Best for
Primary workflow: Ship Full Stack.
Technical facets: Full Stack, Frontend.
Target audience: everyone.
License: Unknown.
Original source
Catalog source: SkillHub Club.
Repository owner: openclaw.
This is still a mirrored public skill entry. Review the repository before installing into production workflows.
What it helps with
- Install mac-control into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://github.com/openclaw/skills before adding mac-control to shared team environments
- Use mac-control for development workflows
Original source / Raw SKILL.md
---
name: mac-control
description: Control Mac via mouse/keyboard automation using cliclick and AppleScript. Use for clicking UI elements, taking screenshots, getting window bounds, handling coordinate scaling on Retina displays, and automating UI interactions like clicking Chrome extension icons, dismissing dialogs, or toolbar buttons.
---
# Mac Control
Automate Mac UI interactions using cliclick (mouse/keyboard) and system tools.
## Tools
- **cliclick**: `/opt/homebrew/bin/cliclick` - mouse/keyboard control
- **screencapture**: Built-in screenshot tool
- **magick**: ImageMagick for image analysis
- **osascript**: AppleScript for window info
## Coordinate System (Eason's Mac Mini)
**Current setup**: 1920x1080 display, **1:1 scaling** (no conversion needed!)
- Screenshot coords = cliclick coords
- If screenshot shows element at (800, 500), click at (800, 500)
### For Retina Displays (2x)
If the screenshot is twice the logical resolution, divide its coordinates by 2 before clicking:
```bash
# Convert: cliclick_coords = screenshot_coords / 2
cliclick c:$((screenshot_x / 2)),$((screenshot_y / 2))
```
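The divide-by-two rule generalizes to any scale factor. A minimal sketch, assuming a hypothetical helper named `to_click_coords` (it is not part of this skill) and a scale factor you have already measured; it uses awk so that non-integer factors also work:

```bash
# Hypothetical helper: convert screenshot pixel coords to cliclick coords.
# Usage: to_click_coords <screenshot_x> <screenshot_y> [scale]
# scale defaults to 2 (standard Retina); pass 1 for a 1:1 display.
to_click_coords() {
  awk -v x="$1" -v y="$2" -v s="${3:-2}" 'BEGIN {
    if (s <= 0) { print "scale must be positive" > "/dev/stderr"; exit 1 }
    # Round to the nearest whole point; cliclick expects integers.
    printf "%d,%d\n", int(x / s + 0.5), int(y / s + 0.5)
  }'
}

# Example (macOS only): click the element seen at (1600, 1000) in a 2x screenshot
# /opt/homebrew/bin/cliclick "c:$(to_click_coords 1600 1000 2)"
```

With a scale of 1 the function returns its input unchanged, so the same call can be used on the 1920x1080 setup described above.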
### Calibration Script
Run to verify your scale factor:
```bash
/Users/eason/clawd/scripts/calibrate-cursor.sh
```
## cliclick Commands
```bash
# Click at coordinates
/opt/homebrew/bin/cliclick c:500,300
# Move mouse (no click) - Note: may not visually update cursor
/opt/homebrew/bin/cliclick m:500,300
# Double-click
/opt/homebrew/bin/cliclick dc:500,300
# Right-click
/opt/homebrew/bin/cliclick rc:500,300
# Click and drag
/opt/homebrew/bin/cliclick dd:100,100 du:200,200
# Type text
/opt/homebrew/bin/cliclick t:"hello world"
# Press key (Return, Escape, Tab, etc.)
/opt/homebrew/bin/cliclick kp:return
/opt/homebrew/bin/cliclick kp:escape
# Key with modifier (cmd+w to close window)
/opt/homebrew/bin/cliclick kd:cmd t:w ku:cmd
# Get current mouse position
/opt/homebrew/bin/cliclick p
# Wait 100 ms after each event in this invocation (-w takes milliseconds)
/opt/homebrew/bin/cliclick -w 100 c:500,300
```
## Screenshots
```bash
# Full screen (silent)
/usr/sbin/screencapture -x /tmp/screenshot.png
# With cursor (may not work for custom cursor colors)
/usr/sbin/screencapture -C -x /tmp/screenshot.png
# Interactive region selection
screencapture -i region.png
# Delayed capture
screencapture -T 3 -x delayed.png # 3 second delay
```
## Workflow: Screenshot → Analyze → Click
**Best practice for reliable clicking:**
1. **Take screenshot**
```bash
/usr/sbin/screencapture -x /tmp/screen.png
```
2. **View screenshot** (Read tool) to find target coordinates
3. **Click at those coordinates** (1:1 on 1920x1080)
```bash
/opt/homebrew/bin/cliclick c:X,Y
```
4. **Verify** by taking another screenshot
### Example: Click a button
```bash
# 1. Screenshot
/usr/sbin/screencapture -x /tmp/before.png
# 2. View image, find button at (850, 450)
# (Use Read tool on /tmp/before.png)
# 3. Click
/opt/homebrew/bin/cliclick c:850,450
# 4. Verify
/usr/sbin/screencapture -x /tmp/after.png
```
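The steps above can be wrapped in one function so that every click leaves a before/after pair on disk. This is only a sketch: `click_with_evidence` and the two overridable path variables are hypothetical names, and the 0.3 second pause is an arbitrary guess at how long the UI needs to react.

```bash
# Tool paths; set either one to "echo" for a dry run that only prints the arguments.
SCREENCAPTURE="${SCREENCAPTURE:-/usr/sbin/screencapture}"
CLICLICK="${CLICLICK:-/opt/homebrew/bin/cliclick}"

# Hypothetical helper: screenshot, click, screenshot again.
# Usage: click_with_evidence <x> <y> [output_dir]
click_with_evidence() (
  dir="${3:-/tmp}"
  # Stop at the first failing step so a failed capture never leads to a blind click.
  "$SCREENCAPTURE" -x "$dir/before.png" &&
    "$CLICLICK" "c:$1,$2" &&
    sleep 0.3 &&
    "$SCREENCAPTURE" -x "$dir/after.png"
)

# Example (macOS only): click_with_evidence 850 450
```

The coordinates still have to come from viewing the first screenshot; the function only automates the capture and click around that decision.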
## Window Bounds
```bash
# Get Chrome window bounds
osascript -e 'tell application "Google Chrome" to get bounds of front window'
# Returns: 0, 38, 1920, 1080 (left, top, right, bottom)
```
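Because the bounds are in screen coordinates, a position measured relative to the window's top-left corner can be shifted into a clickable point. A small sketch, assuming a hypothetical helper `window_to_screen` and the comma-separated output format shown above; note that the offset is relative to the window frame, so it includes the title bar and toolbar:

```bash
# Hypothetical helper: turn a window-relative offset into screen coordinates.
# Usage: window_to_screen "<left>, <top>, <right>, <bottom>" <dx> <dy>
window_to_screen() {
  echo "$1" | awk -F', *' -v dx="$2" -v dy="$3" '{
    left = $1; top = $2; right = $3; bottom = $4
    x = left + dx; y = top + dy
    # Refuse points that would land outside the window.
    if (x < left || x >= right || y < top || y >= bottom) {
      print "offset falls outside the window" > "/dev/stderr"; exit 1
    }
    printf "%d,%d\n", x, y
  }'
}

# Example (macOS only):
# bounds=$(osascript -e 'tell application "Google Chrome" to get bounds of front window')
# /opt/homebrew/bin/cliclick "c:$(window_to_screen "$bounds" 100 50)"
```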
## Common Patterns
### Chrome Extension Icon (Browser Relay)
Use AppleScript to find exact button position:
```bash
# Find Clawdbot extension button position
osascript -e '
tell application "System Events"
tell process "Google Chrome"
set toolbarGroup to group 2 of group 3 of toolbar 1 of group 1 of group 1 of group 1 of group 1 of group 1 of window 1
set allButtons to every pop up button of toolbarGroup
repeat with btn in allButtons
if description of btn contains "Clawdbot" then
return position of btn & size of btn
end if
end repeat
end tell
end tell
'
# Output: 1755, 71, 34, 34 (x, y, width, height)
# Click center of button
# center_x = x + width/2 = 1755 + 17 = 1772
# center_y = y + height/2 = 71 + 17 = 88
/opt/homebrew/bin/cliclick c:1772,88
```
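The center arithmetic in the comments above can be done in the shell rather than by hand. A minimal sketch, assuming a hypothetical helper `rect_center` that receives the `x, y, width, height` string exactly as osascript prints it:

```bash
# Hypothetical helper: center of an "x, y, width, height" rectangle.
# Usage: rect_center "<x>, <y>, <width>, <height>"
rect_center() {
  echo "$1" | awk -F', *' '{
    # Same formula as above: center = origin + size / 2 (rounded down).
    printf "%d,%d\n", $1 + int($3 / 2), $2 + int($4 / 2)
  }'
}

# Example (macOS only): /opt/homebrew/bin/cliclick "c:$(rect_center "1755, 71, 34, 34")"
```

For the sample output above this yields `1772,88`, the same point that was computed manually.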
### Clicking by Color Detection
If you need to find a specific colored element:
```bash
# Find red (#FF0000) pixels in screenshot
magick /tmp/screen.png txt:- | grep "#FF0000" | head -5
# Calculate center of colored region
magick /tmp/screen.png txt:- | grep "#FF0000" | awk -F'[,:]' '
BEGIN{sx=0;sy=0;c=0}
{sx+=$1;sy+=$2;c++}
END{if (c) printf "Center: (%d, %d)\n", sx/c, sy/c; else print "No matching pixels"}'
```
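For scripting it helps to get the centroid as a bare `x,y` pair and a failing exit status when nothing matches. A sketch under the assumption that the `txt:` lines have the `x,y: (r,g,b)  #HEX  name` shape the awk command above relies on; `color_centroid` is a hypothetical name:

```bash
# Hypothetical helper: centroid of all pixels whose hex color matches.
# Reads ImageMagick "txt:" output on stdin; lines are assumed to look like
#   12,34: (255,0,0)  #FF0000  red
# Usage: ... | color_centroid "#FF0000"
color_centroid() {
  grep -i -- "$1" | awk -F'[,:]' '
    { sx += $1; sy += $2; n++ }
    END {
      if (n == 0) { print "no matching pixels" > "/dev/stderr"; exit 1 }
      printf "%d,%d\n", sx / n, sy / n
    }'
}

# Example (needs ImageMagick):
# magick /tmp/screen.png txt:- | color_centroid "#FF0000"
```

The result is in screenshot pixels, so on a scaled display it still needs the coordinate conversion described earlier. A centroid is also only meaningful when the color appears in a single region.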
### Dialog Button Click
1. Screenshot the dialog
2. Find button coordinates visually
3. Click (no scaling on 1920x1080)
```bash
# Example: Click "OK" button at (960, 540)
/opt/homebrew/bin/cliclick c:960,540
```
### Type in Text Field
```bash
# Click to focus, then type
/opt/homebrew/bin/cliclick c:500,300
sleep 0.2
/opt/homebrew/bin/cliclick t:"Hello world"
/opt/homebrew/bin/cliclick kp:return
```
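cliclick accepts several commands in one invocation and has its own `w:` wait command, so the focus, type, and Return sequence above can be a single call. A sketch with a hypothetical wrapper `type_into_field`; the 200 ms pause mirrors the `sleep 0.2` above:

```bash
# Path to cliclick; set CLICLICK=echo for a dry run that only prints the commands.
CLICLICK="${CLICLICK:-/opt/homebrew/bin/cliclick}"

# Hypothetical helper: click to focus, wait, type, then press Return.
# Usage: type_into_field <x> <y> <text>
type_into_field() {
  "$CLICLICK" "c:$1,$2" w:200 "t:$3" kp:return
}

# Example (macOS only): type_into_field 500 300 "Hello world"
```

Quoting `"t:$3"` keeps text containing spaces in one argument.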
## Helper Scripts
Located in `/Users/eason/clawd/scripts/`:
- `calibrate-cursor.sh` - Calibrate coordinate scaling
- `click-at-visual.sh` - Click at screenshot coordinates
- `get-cursor-pos.sh` - Get current cursor position
- `attach-browser-relay.sh` - Auto-click Browser Relay extension
## Keyboard Navigation (When Clicks Fail)
**Google OAuth and protected pages block synthetic mouse clicks!** Use keyboard navigation:
```bash
# Tab to navigate between elements
osascript -e 'tell application "System Events" to keystroke tab'
# Shift+Tab to go backwards
osascript -e 'tell application "System Events" to key code 48 using shift down'
# Enter to activate focused element
osascript -e 'tell application "System Events" to keystroke return'
# Full workflow: Tab 3 times then Enter
osascript -e '
tell application "System Events"
keystroke tab
delay 0.15
keystroke tab
delay 0.15
keystroke tab
delay 0.15
keystroke return
end tell
'
```
**When to use keyboard instead of mouse:**
- Google OAuth / login pages (anti-automation protection)
- Popup dialogs with focus trapping
- When mouse clicks consistently fail after verification
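
The "Tab 3 times then Enter" workflow can be generated for any number of tabs instead of being written out by hand. A sketch, assuming a hypothetical generator `tab_then_return_script` whose output is piped to osascript (which reads the script from standard input when no file or `-e` is given):

```bash
# Hypothetical helper: print an AppleScript that presses Tab N times, then Return.
# Usage: tab_then_return_script <count> [delay_seconds]
tab_then_return_script() (
  count="$1"
  delay="${2:-0.15}"
  echo 'tell application "System Events"'
  i=0
  while [ "$i" -lt "$count" ]; do
    echo '  keystroke tab'
    echo "  delay $delay"
    i=$((i + 1))
  done
  echo '  keystroke return'
  echo 'end tell'
)

# Example (macOS only): tab_then_return_script 3 | osascript
```

How many tabs a given page needs still has to be found by taking screenshots between key presses.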
## Chrome Browser Relay & Multiple Windows
**Problem**: Browser Relay may list tabs from multiple Chrome windows, causing `snapshot` to fail on the desired tab.
**Solution**:
1. Close extra Chrome windows before automation
2. Or ensure only the target window has relay attached
**Check tabs visible to relay**:
```bash
# In agent code
browser action=tabs profile=chrome
```
If the target tab is missing from the list, the relay is attached to the wrong window.
**Verify single window**:
```bash
osascript -e 'tell application "Google Chrome" to return count of windows'
```
## Verify-Before-Click Workflow
**Critical**: Always verify coordinates BEFORE clicking important buttons.
```bash
# 1. Take screenshot
osascript -e 'do shell script "/usr/sbin/screencapture -x /tmp/before.png"'
# 2. View screenshot (Read tool), note target position
# 3. Move mouse to verify position (optional)
python3 -c "import pyautogui; pyautogui.moveTo(X, Y)"
osascript -e 'do shell script "/usr/sbin/screencapture -C -x /tmp/verify.png"'
# 4. Check cursor is on target, THEN click
/opt/homebrew/bin/cliclick c:X,Y
# 5. Take screenshot to confirm action worked
osascript -e 'do shell script "/usr/sbin/screencapture -x /tmp/after.png"'
```
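Step 5 can be given a crude automatic check by comparing the two screenshots byte for byte. This is a sketch with a hypothetical helper `screen_changed`, not a reliable detector: identical files strongly suggest the click did nothing, but differing files prove little, since a clock tick in the menu bar also changes the image.

```bash
# Hypothetical helper: succeed (exit 0) if two screenshots differ byte-for-byte.
# Usage: screen_changed <before.png> <after.png>
# Exit status: 0 = files differ, 1 = files identical, 2 = a file is missing.
screen_changed() {
  if [ ! -f "$1" ] || [ ! -f "$2" ]; then
    echo "missing screenshot" >&2
    return 2
  fi
  ! cmp -s "$1" "$2"
}

# Example:
# screen_changed /tmp/before.png /tmp/after.png || echo "screen unchanged, click may have been ignored"
```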
## Troubleshooting
**Click lands wrong**: Verify scale factor with calibration script
**cliclick m: doesn't move cursor visually**: Use `c:` (click) instead, or check with `cliclick p` to confirm position changed
**Permission denied**: System Settings → Privacy & Security → Accessibility → add the process that runs these commands (on this setup, `/opt/homebrew/bin/node`)
**Window not found**: Check exact app name:
```bash
osascript -e 'tell application "System Events" to get name of every process whose background only is false'
```
**Clicks ignored on OAuth/protected pages**: These pages block synthetic events. Use keyboard navigation (Tab + Enter) instead.
**pyautogui vs cliclick coordinates differ**: Stick with cliclick for consistency. pyautogui may have different coordinate mapping.
**Quartz CGEvent clicks don't work**: Some pages (Google OAuth) block low-level mouse events too. Keyboard is the only reliable method.
---
## Skill Companion Files
> Additional files collected from the skill directory layout.
### _meta.json
```json
{
"owner": "easonc13",
"slug": "mac-control",
"displayName": "Mac Control",
"latest": {
"version": "1.0.0",
"publishedAt": 1771176634127,
"commit": "https://github.com/openclaw/skills/commit/ef91676c3ac88781fe38060c5a0e804c69b6e937"
},
"history": []
}
```
### scripts/calibrate.sh
```bash
#!/bin/bash
# Mac Control Calibration Script
# Discovers the actual scale factor between screenshot coords and cliclick coords
# Outputs: SCALE_FACTOR to stdout, saves calibration to ~/.clawdbot/mac-control-calibration.json
CALIBRATION_FILE="$HOME/.clawdbot/mac-control-calibration.json"
TMP_DIR="/tmp/mac-calibrate-$$"
mkdir -p "$TMP_DIR"
mkdir -p "$(dirname "$CALIBRATION_FILE")"
# Known test positions (cliclick logical coords)
TEST_X=500
TEST_Y=300
echo "🔧 Mac Control Calibration" >&2
echo "Moving mouse to cliclick ($TEST_X, $TEST_Y)..." >&2
# Move mouse to test position
/opt/homebrew/bin/cliclick m:$TEST_X,$TEST_Y
sleep 0.3
# Capture screenshot with cursor
/usr/sbin/screencapture -C -x "$TMP_DIR/cursor.png"
# Get screenshot dimensions
SCREENSHOT_WIDTH=$(sips -g pixelWidth "$TMP_DIR/cursor.png" | tail -1 | awk '{print $2}')
SCREENSHOT_HEIGHT=$(sips -g pixelHeight "$TMP_DIR/cursor.png" | tail -1 | awk '{print $2}')
echo "Screenshot: ${SCREENSHOT_WIDTH}x${SCREENSHOT_HEIGHT}" >&2
# Find cursor position in screenshot using image analysis
# We'll use a simple approach: crop regions and look for cursor shape
# For now, use the measured scale factor based on testing
# Standard Retina would be 2x, but this Mac uses different scaling
# Calculate based on screenshot vs expected logical resolution
LOGICAL_WIDTH=$((SCREENSHOT_WIDTH / 2))
LOGICAL_HEIGHT=$((SCREENSHOT_HEIGHT / 2))
# If screenshot is 3840x2160 and logical is 1920x1080, scale is 2.0
# But actual clicking shows different behavior
# Use empirical measurement: cliclick 500,200 appears at display ~200,80
# Scale factor = cliclick / display = 500/200 = 2.5
SCALE_FACTOR="2.5"
# For more precise calibration, we'd need image processing to find cursor
# For now, output the known scale factor
echo "Calibration complete." >&2
echo "Scale factor: $SCALE_FACTOR" >&2
# Save calibration
cat > "$CALIBRATION_FILE" << EOF
{
"timestamp": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
"screenshotWidth": $SCREENSHOT_WIDTH,
"screenshotHeight": $SCREENSHOT_HEIGHT,
"logicalWidth": $LOGICAL_WIDTH,
"logicalHeight": $LOGICAL_HEIGHT,
"scaleFactor": $SCALE_FACTOR,
"note": "cliclick_coords = display_coords * scaleFactor"
}
EOF
echo "Saved to: $CALIBRATION_FILE" >&2
# Output just the scale factor for scripting
echo "$SCALE_FACTOR"
# Cleanup
rm -rf "$TMP_DIR"
```
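The script above measures the screenshot size but then writes a hardcoded factor. One possible alternative, which is not the author's empirical method, is to compute the ratio of screenshot width to logical width directly. The sketch assumes a hypothetical helper `compute_scale` and that the logical width is already known (for example from the Finder desktop bounds queried in `get-screen-info.sh`). This ratio is screenshot pixels per cliclick point, a different quantity from the `scaleFactor` saved above, which relates cliclick coordinates to displayed-image coordinates.

```bash
# Hypothetical helper: screenshot pixels per logical point.
# Usage: compute_scale <screenshot_width> <logical_width>
compute_scale() {
  awk -v s="$1" -v l="$2" 'BEGIN {
    if (l <= 0) { print "logical width must be positive" > "/dev/stderr"; exit 1 }
    printf "%.2f\n", s / l
  }'
}

# Example: a 3840-pixel-wide screenshot of a 1920-point-wide desktop
# compute_scale 3840 1920
```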
### scripts/click-at-display.sh
```bash
#!/bin/bash
# Click at display coordinates (from screenshot viewer)
# Usage: click-at-display.sh <display_x> <display_y> [--double] [--right]
#
# This script converts display coordinates (what you see in image viewer)
# to cliclick logical coordinates using the calibrated scale factor.
CALIBRATION_FILE="$HOME/.clawdbot/mac-control-calibration.json"
# Default scale factor if not calibrated
DEFAULT_SCALE=2.5
# Parse args
DISPLAY_X=$1
DISPLAY_Y=$2
CLICK_TYPE="c" # single click
shift 2
while [[ $# -gt 0 ]]; do
case $1 in
--double) CLICK_TYPE="dc" ;;
--right) CLICK_TYPE="rc" ;;
*) ;;
esac
shift
done
if [[ -z "$DISPLAY_X" || -z "$DISPLAY_Y" ]]; then
echo "Usage: click-at-display.sh <display_x> <display_y> [--double] [--right]" >&2
echo "" >&2
echo "Coordinates should be from the displayed image (e.g., 2000x1125 display of 3840x2160 screenshot)" >&2
exit 1
fi
# Get scale factor from calibration or use default
if [[ -f "$CALIBRATION_FILE" ]]; then
SCALE=$(grep '"scaleFactor"' "$CALIBRATION_FILE" | sed 's/.*: *\([0-9.]*\).*/\1/')
else
SCALE=$DEFAULT_SCALE
echo "⚠️ No calibration found, using default scale: $SCALE" >&2
echo " Run: bash scripts/calibrate.sh" >&2
fi
# The displayed image is typically 2000x1125 showing a 3840x2160 screenshot
# Display ratio: 3840/2000 = 1.92
DISPLAY_RATIO=1.92
# Convert display coords to original screenshot coords
ORIG_X=$(echo "$DISPLAY_X * $DISPLAY_RATIO" | bc)
ORIG_Y=$(echo "$DISPLAY_Y * $DISPLAY_RATIO" | bc)
# Convert screenshot coords to cliclick coords
# cliclick = original / (screenshot_scale)
# But we found that cliclick_coords = display_coords * SCALE directly works better
CLICLICK_X=$(echo "$DISPLAY_X * $SCALE" | bc | cut -d. -f1)
CLICLICK_Y=$(echo "$DISPLAY_Y * $SCALE" | bc | cut -d. -f1)
echo "Display: ($DISPLAY_X, $DISPLAY_Y) → cliclick: ($CLICLICK_X, $CLICLICK_Y)" >&2
# Perform click
/opt/homebrew/bin/cliclick "$CLICK_TYPE:$CLICLICK_X,$CLICLICK_Y"
echo "Clicked at ($CLICLICK_X, $CLICLICK_Y)"
```
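The calibration file also contains a `note` string that mentions `scaleFactor`, so a plain text search for that word matches two lines. A sketch of a stricter reader, assuming a hypothetical helper `read_scale_factor` and the JSON layout written by `calibrate.sh`; it matches only the quoted key and falls back to a default when the file or key is missing:

```bash
# Hypothetical helper: read the numeric "scaleFactor" value from the calibration JSON.
# Usage: read_scale_factor <file> [default]
# Prints the default (2.5 unless given) if the file or the key is missing.
read_scale_factor() (
  value=$(sed -n 's/^[[:space:]]*"scaleFactor"[[:space:]]*:[[:space:]]*\([0-9.][0-9.]*\).*/\1/p' "$1" 2>/dev/null | head -n 1)
  echo "${value:-${2:-2.5}}"
)

# Example: SCALE=$(read_scale_factor "$HOME/.clawdbot/mac-control-calibration.json")
```

Where `jq` is installed, `jq -r .scaleFactor` would be the more conventional way to read the value; the sed version avoids that dependency.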
### scripts/crop-image.sh
```bash
#!/bin/bash
# Crop a region from an image using sips
# Usage: crop-image.sh <input> <output> <x> <y> <width> <height>
#
# Note: sips uses a confusing syntax where -c takes HEIGHT WIDTH (reversed!)
# and --cropOffset likewise takes Y X (the man page lists "offsetY offsetH")
INPUT="$1"
OUTPUT="$2"
X="$3"
Y="$4"
WIDTH="$5"
HEIGHT="$6"
if [ $# -lt 6 ]; then
echo "Usage: crop-image.sh <input> <output> <x> <y> <width> <height>"
echo ""
echo "Crops a region starting at (x,y) with given width and height."
echo "Coordinates use top-left as origin (0,0)."
echo ""
echo "Example: crop-image.sh screen.png toolbar.png 1600 0 400 100"
exit 1
fi
if [ ! -f "$INPUT" ]; then
echo "Error: Input file '$INPUT' not found"
exit 1
fi
# Get original dimensions
ORIG_WIDTH=$(sips -g pixelWidth "$INPUT" | tail -1 | awk '{print $2}')
ORIG_HEIGHT=$(sips -g pixelHeight "$INPUT" | tail -1 | awk '{print $2}')
# Validate crop region
if [ $((X + WIDTH)) -gt "$ORIG_WIDTH" ] || [ $((Y + HEIGHT)) -gt "$ORIG_HEIGHT" ]; then
echo "Warning: Crop region extends beyond image bounds"
echo "Image: ${ORIG_WIDTH}x${ORIG_HEIGHT}, Crop: ${X},${Y} + ${WIDTH}x${HEIGHT}"
fi
# Copy input to output first (sips modifies in place)
cp "$INPUT" "$OUTPUT"
# sips -c HEIGHT WIDTH crops to that size (centered by default)
# --cropOffset Y X sets where the crop starts (from top-left)
# Note: both options take the vertical value first (counterintuitive!)
sips --cropOffset "$Y" "$X" -c "$HEIGHT" "$WIDTH" "$OUTPUT" > /dev/null 2>&1
echo "Cropped: ${WIDTH}x${HEIGHT} from (${X},${Y}) -> $OUTPUT"
```
### scripts/find-element.sh
```bash
#!/bin/bash
# Find element position interactively
# Takes screenshot, crops around mouse position, helps identify precise coordinates
#
# Usage: find-element.sh [--move x,y] [--crop x,y,w,h]
TMP_DIR="/tmp/mac-find-element-$$"
mkdir -p "$TMP_DIR"
# Get current mouse position
MOUSE_POS=$(/opt/homebrew/bin/cliclick p)
MOUSE_X=$(echo $MOUSE_POS | cut -d, -f1)
MOUSE_Y=$(echo $MOUSE_POS | cut -d, -f2)
echo "Current mouse position (cliclick): $MOUSE_X, $MOUSE_Y"
# Parse args
MOVE_TO=""
CROP_REGION=""
while [[ $# -gt 0 ]]; do
case $1 in
--move)
MOVE_TO=$2
shift 2
;;
--crop)
CROP_REGION=$2
shift 2
;;
*)
shift
;;
esac
done
# Move mouse if requested
if [[ -n "$MOVE_TO" ]]; then
/opt/homebrew/bin/cliclick m:$MOVE_TO
MOUSE_POS=$MOVE_TO
MOUSE_X=$(echo $MOUSE_POS | cut -d, -f1)
MOUSE_Y=$(echo $MOUSE_POS | cut -d, -f2)
echo "Moved to: $MOUSE_X, $MOUSE_Y"
fi
# Take screenshot with cursor
/usr/sbin/screencapture -C -x "$TMP_DIR/full.png"
echo "Screenshot saved: $TMP_DIR/full.png"
# Get dimensions
WIDTH=$(sips -g pixelWidth "$TMP_DIR/full.png" | tail -1 | awk '{print $2}')
HEIGHT=$(sips -g pixelHeight "$TMP_DIR/full.png" | tail -1 | awk '{print $2}')
echo "Screenshot dimensions: ${WIDTH}x${HEIGHT}"
# Crop around mouse if no specific region
if [[ -z "$CROP_REGION" ]]; then
# Scale factor from cliclick to screenshot coords
SCALE_FACTOR=2 # Standard assumption: screenshot = 2x cliclick
# Calculate crop region centered on mouse
CROP_SIZE=400
CROP_X=$((MOUSE_X * SCALE_FACTOR - CROP_SIZE / 2))
CROP_Y=$((MOUSE_Y * SCALE_FACTOR - CROP_SIZE / 2))
# Ensure within bounds
[[ $CROP_X -lt 0 ]] && CROP_X=0
[[ $CROP_Y -lt 0 ]] && CROP_Y=0
CROP_REGION="$CROP_X,$CROP_Y,$CROP_SIZE,$CROP_SIZE"
fi
# Parse crop region
IFS=',' read -r CX CY CW CH <<< "$CROP_REGION"
echo "Cropping region: x=$CX, y=$CY, w=$CW, h=$CH"
# Use sips to crop
# sips crops from top-left, we need to calculate properly
sips -c $CH $CW --cropOffset $CY $CX "$TMP_DIR/full.png" --out "$TMP_DIR/cropped.png" 2>/dev/null
echo "Cropped image: $TMP_DIR/cropped.png"
echo ""
echo "To view: open $TMP_DIR/cropped.png"
echo "Full screenshot: $TMP_DIR/full.png"
# Output paths for scripting
echo "FULL=$TMP_DIR/full.png"
echo "CROPPED=$TMP_DIR/cropped.png"
```
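The script above clamps the crop origin at zero but not at the right or bottom edge, so a cursor near those edges yields a region that runs off the image. A sketch of the full clamp, assuming a hypothetical helper `centered_crop` that prints the same `x,y,w,h` format the `--crop` option accepts:

```bash
# Hypothetical helper: crop box centered on a cliclick point, kept inside the image.
# Usage: centered_crop <mouse_x> <mouse_y> <scale> <crop_size> <img_w> <img_h>
# Prints "x,y,w,h" in screenshot pixels.
centered_crop() {
  awk -v mx="$1" -v my="$2" -v s="$3" -v size="$4" -v w="$5" -v h="$6" 'BEGIN {
    cw = (size + 0 < w + 0) ? size : w   # never wider than the image
    ch = (size + 0 < h + 0) ? size : h   # never taller than the image
    x = int(mx * s - cw / 2)
    y = int(my * s - ch / 2)
    if (x < 0) x = 0
    if (y < 0) y = 0
    if (x + cw > w) x = w - cw           # pull back from the right edge
    if (y + ch > h) y = h - ch           # pull back from the bottom edge
    printf "%d,%d,%d,%d\n", x, y, cw, ch
  }'
}

# Example: centered_crop 500 300 2 400 3840 2160
```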
### scripts/get-screen-info.sh
```bash
#!/bin/bash
# Get screen resolution and scale factor
# Outputs: physical resolution, logical resolution, and scale factor
echo "=== Display Info ==="
system_profiler SPDisplaysDataType 2>/dev/null | grep -E "Display Type|Resolution|Retina|Main Display"
echo ""
echo "=== Logical Screen Size (what apps see) ==="
# Get logical resolution via system_profiler or defaults
osascript -e 'tell application "Finder" to get bounds of window of desktop' 2>/dev/null || \
system_profiler SPDisplaysDataType 2>/dev/null | grep "UI Looks like"
echo ""
echo "=== Quick Reference ==="
# Parse physical resolution
PHYSICAL=$(system_profiler SPDisplaysDataType 2>/dev/null | grep "Resolution:" | head -1 | sed 's/.*: //')
echo "Physical: $PHYSICAL"
# Determine scale factor
if echo "$PHYSICAL" | grep -qE "3840|2560|5120"; then
echo "Likely scale: 2x (divide screenshot coords by 2 for cliclick)"
else
echo "Scale: Check 'UI Looks like' vs 'Resolution' to determine"
fi
```
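Instead of guessing from well-known widths, the two lines the script already greps for can be compared directly. This sketch assumes that `system_profiler` prints lines shaped like `Resolution: 3840 x 2160 ...` and `UI Looks like: 1920 x 1080 ...`; that format is an assumption, the second line is not present on every system, and `scale_from_profiler` is a hypothetical name. Only the first display found is considered.

```bash
# Hypothetical helper: derive the scale factor from system_profiler text on stdin.
# Usage: system_profiler SPDisplaysDataType | scale_from_profiler
scale_from_profiler() {
  awk '
    /Resolution:/    && !phys { phys = $2 }   # first physical width
    /UI Looks like:/ && !ui   { ui = $4 }     # first logical width
    END {
      if (!phys || !ui) { print "could not find both widths" > "/dev/stderr"; exit 1 }
      printf "%.2f\n", phys / ui
    }'
}
```

When the helper fails, the manual comparison suggested by the script's final `else` branch is still the fallback.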
### scripts/get-window-bounds.sh
```bash
#!/bin/bash
# Get window bounds for an application
# Usage: get-window-bounds.sh [AppName]
# If no app specified, gets frontmost window
APP_NAME="${1:-}"
if [ -z "$APP_NAME" ]; then
# Get frontmost application's window
osascript -e '
tell application "System Events"
set frontApp to first application process whose frontmost is true
set appName to name of frontApp
tell frontApp
set win to front window
set {x, y} to position of win
set {w, h} to size of win
end tell
end tell
return "app:" & appName & " x:" & x & " y:" & y & " width:" & w & " height:" & h
'
else
# Get specific app's window
osascript -e "
tell application \"System Events\"
tell process \"$APP_NAME\"
set win to front window
set {x, y} to position of win
set {w, h} to size of win
end tell
end tell
return \"x:\" & x & \" y:\" & y & \" width:\" & w & \" height:\" & h
"
fi
```
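The `x:… y:… width:… height:…` string this script prints can be used as a sanity check before clicking. A sketch, assuming a hypothetical helper `point_in_window` that succeeds only when a point lies inside the reported window; tokens without a recognized key, such as the words of an app name, are ignored:

```bash
# Hypothetical helper: succeed if point (px, py) lies inside the reported window.
# Usage: point_in_window "<output of get-window-bounds.sh>" <px> <py>
point_in_window() {
  echo "$1" | awk -v px="$2" -v py="$3" '{
    # Collect key:value tokens such as x:0 or width:1920.
    for (i = 1; i <= NF; i++) {
      n = index($i, ":")
      if (n > 0) v[substr($i, 1, n - 1)] = substr($i, n + 1)
    }
    # Adding 0 forces numeric rather than string comparison.
    x = v["x"] + 0; y = v["y"] + 0; w = v["width"] + 0; h = v["height"] + 0
    inside = (px >= x && px < x + w && py >= y && py < y + h)
    exit(inside ? 0 : 1)
  }'
}

# Example (macOS only):
# bounds=$(bash scripts/get-window-bounds.sh "Google Chrome")
# point_in_window "$bounds" 850 450 && /opt/homebrew/bin/cliclick c:850,450
```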