preset
Intelligently deploys Azure OpenAI models to optimal regions by analyzing capacity across all available regions. Automatically checks current region first and shows alternatives if needed. USE FOR: quick deployment, optimal region, best region, automatic region selection, fast setup, multi-region capacity check, high availability deployment, deploy to best location. DO NOT USE FOR: custom SKU selection (use customize), specific version selection (use customize), custom capacity configuration (use customize), PTU deployments (use customize).
Packaged view
This page reorganizes the original catalog entry to put fit, installability, and workflow context first. The original raw source appears below.
Install command
npx @skill-hub/cli install microsoft-azure-skills-preset
Repository
Skill path: .github/plugins/azure-skills/skills/microsoft-foundry/models/deploy-model/preset
Best for
Primary workflow: Run DevOps.
Technical facets: Full Stack, DevOps.
Target audience: everyone.
License: MIT.
Original source
Catalog source: SkillHub Club.
Repository owner: microsoft.
This is a mirrored public skill entry. Review the repository before installing it into production workflows.
What it helps with
- Install preset into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://github.com/microsoft/azure-skills before adding preset to shared team environments
- Use preset for development workflows
Works across
Favorites: 0.
Sub-skills: 0.
Aggregator: No.
Original source / Raw SKILL.md
---
name: preset
description: "Intelligently deploys Azure OpenAI models to optimal regions by analyzing capacity across all available regions. Automatically checks current region first and shows alternatives if needed. USE FOR: quick deployment, optimal region, best region, automatic region selection, fast setup, multi-region capacity check, high availability deployment, deploy to best location. DO NOT USE FOR: custom SKU selection (use customize), specific version selection (use customize), custom capacity configuration (use customize), PTU deployments (use customize)."
license: MIT
metadata:
  author: Microsoft
  version: "1.0.1"
---
# Deploy Model to Optimal Region
Automates intelligent Azure OpenAI model deployment by checking capacity across regions and deploying to the best available option.
## What This Skill Does
1. Verifies Azure authentication and project scope
2. Checks capacity in current project's region
3. If no capacity: analyzes all regions and shows available alternatives
4. Filters projects by selected region
5. Supports creating new projects if needed
6. Deploys model with GlobalStandard SKU
7. Monitors deployment progress
## Prerequisites
- Azure CLI installed and configured
- Active Azure subscription with Cognitive Services read/create permissions
- Azure AI Foundry project resource ID (`PROJECT_RESOURCE_ID` env var or provided interactively)
  - Format: `/subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/{account}/projects/{project}`
  - Found in: Azure AI Foundry portal → Project → Overview → Resource ID
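The resource ID can be exported once per shell session so the skill picks it up automatically. The ID below is a placeholder, not a real resource, and the format check is an optional convenience, not part of the skill itself:

```shell
# Placeholder ID for illustration only; substitute your own from the
# Azure AI Foundry portal (Project -> Overview -> Resource ID).
export PROJECT_RESOURCE_ID="/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-demo/providers/Microsoft.CognitiveServices/accounts/demo-account/projects/demo-project"

# Optional sanity check against the expected ARM format.
if [[ "$PROJECT_RESOURCE_ID" =~ /subscriptions/[^/]+/resourceGroups/[^/]+/providers/.+/accounts/[^/]+/projects/[^/]+$ ]]; then
  echo "Resource ID format looks valid"
else
  echo "Unexpected resource ID format" >&2
fi
```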
## Quick Workflow
### Fast Path (Current Region Has Capacity)
```
1. Check authentication → 2. Get project → 3. Check current region capacity
→ 4. Deploy immediately
```
### Alternative Region Path (No Capacity)
```
1. Check authentication → 2. Get project → 3. Check current region (no capacity)
→ 4. Query all regions → 5. Show alternatives → 6. Select region + project
→ 7. Deploy
```
---
## Deployment Phases
| Phase | Action | Key Commands |
|-------|--------|-------------|
| 1. Verify Auth | Check Azure CLI login and subscription | `az account show`, `az login` |
| 2. Get Project | Parse `PROJECT_RESOURCE_ID` ARM ID, verify exists | `az cognitiveservices account show` |
| 3. Get Model | List available models, user selects model + version | `az cognitiveservices account list-models` |
| 4. Check Current Region | Query capacity using GlobalStandard SKU | `az rest --method GET .../modelCapacities` |
| 5. Multi-Region Query | If no local capacity, query all regions | Same capacity API without location filter |
| 6. Select Region + Project | User picks region; find or create project | `az cognitiveservices account list`, `az cognitiveservices account create` |
| 7. Deploy | Generate unique name, calculate capacity (50% available, min 50 TPM), create deployment | `az cognitiveservices account deployment create` |
For detailed step-by-step instructions, see [workflow reference](references/workflow.md).
---
## Error Handling
| Error | Symptom | Resolution |
|-------|---------|------------|
| Auth failure | `az account show` returns error | Run `az login` then `az account set --subscription <id>` |
| No quota | All regions show 0 capacity | Defer to the [quota skill](../../../quota/quota.md) for increase requests and troubleshooting; check existing deployments; try alternative models |
| Model not found | Empty capacity list | Verify model name with `az cognitiveservices account list-models`; check case sensitivity |
| Name conflict | "deployment already exists" | Append suffix to deployment name (handled automatically by `generate_deployment_name` script) |
| Region unavailable | Region doesn't support model | Select a different region from the available list |
| Permission denied | "Forbidden" or "Unauthorized" | Verify Cognitive Services Contributor role: `az role assignment list --assignee <user>` |
---
## Advanced Usage
```bash
# Custom capacity
az cognitiveservices account deployment create ... --sku-capacity <value>
# Check deployment status
az cognitiveservices account deployment show --name <acct> --resource-group <rg> --deployment-name <name> --query "{Status:properties.provisioningState}"
# Delete deployment
az cognitiveservices account deployment delete --name <acct> --resource-group <rg> --deployment-name <name>
```
## Notes
- **SKU:** GlobalStandard only — **API Version:** 2024-10-01 (GA stable)
---
## Related Skills
- **microsoft-foundry** - Parent skill for Azure AI Foundry operations
- **[quota](../../../quota/quota.md)** — For quota viewing, increase requests, and troubleshooting quota errors, defer to this skill
- **azure-quick-review** - Review Azure resources for compliance
- **azure-cost-estimation** - Estimate costs for Azure deployments
- **azure-validate** - Validate Azure infrastructure before deployment
---
## Referenced Files
> The following files are referenced in this skill and included for context.
### references/workflow.md
```markdown
# Preset Deployment Workflow — Step-by-Step
Condensed implementation reference for preset (optimal region) model deployment. See [SKILL.md](../SKILL.md) for overview.
**Table of Contents:** [Phase 1: Verify Authentication](#phase-1-verify-authentication) · [Phase 2: Get Current Project](#phase-2-get-current-project) · [Phase 3: Get Model Name](#phase-3-get-model-name) · [Phase 4: Check Current Region Capacity](#phase-4-check-current-region-capacity) · [Phase 5: Query Multi-Region Capacity](#phase-5-query-multi-region-capacity) · [Phase 6: Select Region and Project](#phase-6-select-region-and-project) · [Phase 7: Deploy Model](#phase-7-deploy-model)
---
## Phase 1: Verify Authentication
```bash
az account show --query "{Subscription:name, User:user.name}" -o table
```
If not logged in: `az login`
Switch subscription:
```bash
az account list --query "[].[name,id,state]" -o table
az account set --subscription <subscription-id>
```
---
## Phase 2: Get Current Project
Read `PROJECT_RESOURCE_ID` from env or prompt user. Format:
`/subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/{account}/projects/{project}`
Parse ARM ID components:
```bash
SUBSCRIPTION_ID=$(echo "$PROJECT_RESOURCE_ID" | sed -n 's|.*/subscriptions/\([^/]*\).*|\1|p')
RESOURCE_GROUP=$(echo "$PROJECT_RESOURCE_ID" | sed -n 's|.*/resourceGroups/\([^/]*\).*|\1|p')
ACCOUNT_NAME=$(echo "$PROJECT_RESOURCE_ID" | sed -n 's|.*/accounts/\([^/]*\)/projects.*|\1|p')
PROJECT_NAME=$(echo "$PROJECT_RESOURCE_ID" | sed -n 's|.*/projects/\([^/?]*\).*|\1|p')
```
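The `sed` extraction above can be exercised against a placeholder ID to confirm each component lands in the right variable (all names below are hypothetical):

```shell
# Worked example of the ARM ID parsing with a dummy resource ID.
PROJECT_RESOURCE_ID="/subscriptions/sub-123/resourceGroups/rg-demo/providers/Microsoft.CognitiveServices/accounts/acct-demo/projects/proj-demo"
SUBSCRIPTION_ID=$(echo "$PROJECT_RESOURCE_ID" | sed -n 's|.*/subscriptions/\([^/]*\).*|\1|p')
RESOURCE_GROUP=$(echo "$PROJECT_RESOURCE_ID" | sed -n 's|.*/resourceGroups/\([^/]*\).*|\1|p')
ACCOUNT_NAME=$(echo "$PROJECT_RESOURCE_ID" | sed -n 's|.*/accounts/\([^/]*\)/projects.*|\1|p')
PROJECT_NAME=$(echo "$PROJECT_RESOURCE_ID" | sed -n 's|.*/projects/\([^/?]*\).*|\1|p')
echo "$SUBSCRIPTION_ID $RESOURCE_GROUP $ACCOUNT_NAME $PROJECT_NAME"
# → sub-123 rg-demo acct-demo proj-demo
```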
Verify project exists and get region:
```bash
az account set --subscription "$SUBSCRIPTION_ID"
PROJECT_REGION=$(az cognitiveservices account show \
--name "$PROJECT_NAME" \
--resource-group "$RESOURCE_GROUP" \
--query location -o tsv)
```
---
## Phase 3: Get Model Name
If model not provided as parameter, list available models:
```bash
az cognitiveservices account list-models \
--name "$PROJECT_NAME" \
--resource-group "$RESOURCE_GROUP" \
--query "[].name" -o tsv | sort -u
```
Get versions for selected model:
```bash
az cognitiveservices account list-models \
--name "$PROJECT_NAME" \
--resource-group "$RESOURCE_GROUP" \
--query "[?name=='$MODEL_NAME'].{Name:name, Version:version, Format:format}" \
-o table
```
---
## Phase 4: Check Current Region Capacity
```bash
CAPACITY_JSON=$(az rest --method GET \
--url "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/providers/Microsoft.CognitiveServices/locations/$PROJECT_REGION/modelCapacities?api-version=2024-10-01&modelFormat=OpenAI&modelName=$MODEL_NAME&modelVersion=$MODEL_VERSION")
CURRENT_CAPACITY=$(echo "$CAPACITY_JSON" | jq -r '.value[] | select(.properties.skuName=="GlobalStandard") | .properties.availableCapacity')
```
If `CURRENT_CAPACITY > 0` → skip to Phase 7. Otherwise continue to Phase 5.
---
## Phase 5: Query Multi-Region Capacity
```bash
ALL_REGIONS_JSON=$(az rest --method GET \
--url "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/providers/Microsoft.CognitiveServices/modelCapacities?api-version=2024-10-01&modelFormat=OpenAI&modelName=$MODEL_NAME&modelVersion=$MODEL_VERSION")
```
Extract available regions (capacity > 0):
```bash
AVAILABLE_REGIONS=$(echo "$ALL_REGIONS_JSON" | jq -r '.value[] | select(.properties.skuName=="GlobalStandard" and .properties.availableCapacity > 0) | "\(.location)|\(.properties.availableCapacity)"')
```
Extract unavailable regions:
```bash
UNAVAILABLE_REGIONS=$(echo "$ALL_REGIONS_JSON" | jq -r '.value[] | select(.properties.skuName=="GlobalStandard" and (.properties.availableCapacity == 0 or .properties.availableCapacity == null)) | "\(.location)|0"')
```
If no regions have capacity, defer to the [quota skill](../../../../quota/quota.md) for increase requests. Suggest checking existing deployments or trying alternative models like `gpt-4o-mini`.
---
## Phase 6: Select Region and Project
Present available regions to user. Store selection as `SELECTED_REGION`.
Find projects in selected region:
```bash
PROJECTS_IN_REGION=$(az cognitiveservices account list \
--query "[?kind=='AIProject' && location=='$SELECTED_REGION'].{Name:name, ResourceGroup:resourceGroup}" \
--output json)
```
**If no projects exist — create new:**
```bash
az cognitiveservices account create \
--name "$HUB_NAME" \
--resource-group "$RESOURCE_GROUP" \
--location "$SELECTED_REGION" \
--kind "AIServices" \
--sku "S0" --yes
az cognitiveservices account create \
--name "$NEW_PROJECT_NAME" \
--resource-group "$RESOURCE_GROUP" \
--location "$SELECTED_REGION" \
--kind "AIProject" \
--sku "S0" --yes
```
---
## Phase 7: Deploy Model
Generate unique deployment name using `scripts/generate_deployment_name.sh`:
```bash
DEPLOYMENT_NAME=$(bash scripts/generate_deployment_name.sh "$ACCOUNT_NAME" "$RESOURCE_GROUP" "$MODEL_NAME")
```
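The script itself is not reproduced in this entry. As a rough sketch of the behavior it is described as having (base name plus a random hex suffix on conflict, per Example 6), something like the following would work; the logic here is an assumption, not the actual script:

```shell
# Sketch only: derive a deployment name from the model name and append a
# short random hex suffix to avoid collisions with existing deployments.
MODEL_NAME="gpt-4o"   # example value
SUFFIX=$(head -c 2 /dev/urandom | od -An -tx1 | tr -d ' \n')
DEPLOYMENT_NAME="${MODEL_NAME}-${SUFFIX}"
echo "$DEPLOYMENT_NAME"   # e.g. gpt-4o-7b9e
```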
Calculate capacity — 50% of available, minimum 50 TPM:
```bash
SELECTED_CAPACITY=$(echo "$ALL_REGIONS_JSON" | jq -r ".value[] | select(.location==\"$SELECTED_REGION\" and .properties.skuName==\"GlobalStandard\") | .properties.availableCapacity")
DEPLOY_CAPACITY=$(( SELECTED_CAPACITY / 2 ))
[ "$DEPLOY_CAPACITY" -lt 50 ] && DEPLOY_CAPACITY=50
```
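Two worked cases of the rule above, with hypothetical capacities:

```shell
# Plenty of capacity: half of 120000 stays well above the floor.
SELECTED_CAPACITY=120000
DEPLOY_CAPACITY=$(( SELECTED_CAPACITY / 2 ))
[ "$DEPLOY_CAPACITY" -lt 50 ] && DEPLOY_CAPACITY=50
echo "$DEPLOY_CAPACITY"   # → 60000

# Scarce capacity: half of 60 is 30, so the 50 TPM floor applies.
SELECTED_CAPACITY=60
DEPLOY_CAPACITY=$(( SELECTED_CAPACITY / 2 ))
[ "$DEPLOY_CAPACITY" -lt 50 ] && DEPLOY_CAPACITY=50
echo "$DEPLOY_CAPACITY"   # → 50
```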
Create deployment:
```bash
az cognitiveservices account deployment create \
--name "$ACCOUNT_NAME" \
--resource-group "$RESOURCE_GROUP" \
--deployment-name "$DEPLOYMENT_NAME" \
--model-name "$MODEL_NAME" \
--model-version "$MODEL_VERSION" \
--model-format "OpenAI" \
--sku-name "GlobalStandard" \
--sku-capacity "$DEPLOY_CAPACITY"
```
Monitor with `az cognitiveservices account deployment show ... --query "properties.provisioningState"` until `Succeeded` or `Failed`.
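Monitoring can be wrapped in a simple polling loop. The helper below is a sketch, not part of the skill's scripts, and the 5-second interval and attempt count are arbitrary choices:

```shell
# Poll a status command until it reports Succeeded or Failed, or we run
# out of attempts.
wait_for_state() {
  local attempts=$1; shift
  local state
  for ((i = 0; i < attempts; i++)); do
    state=$("$@")
    case "$state" in
      Succeeded|Failed) echo "$state"; return 0 ;;
    esac
    sleep 5
  done
  echo "TimedOut"; return 1
}

# Usage (assumes the variables from earlier phases are set):
# wait_for_state 60 az cognitiveservices account deployment show \
#   --name "$ACCOUNT_NAME" --resource-group "$RESOURCE_GROUP" \
#   --deployment-name "$DEPLOYMENT_NAME" \
#   --query "properties.provisioningState" -o tsv
```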
```
### ../../../quota/quota.md
```markdown
# Microsoft Foundry Quota Management
Quota and capacity management for Microsoft Foundry. Quotas are **subscription + region** level.
> ⚠️ **Important:** This is the **authoritative skill** for all Foundry quota operations. When a user asks about quota, capacity, TPM, PTU, quota errors, or deployment limits, **always invoke this skill** rather than using MCP tools (azure-quota, azure-documentation, azure-foundry) directly. This skill provides structured workflows and error handling that direct tool calls lack.
> **Important:** All quota operations are **control plane (management)** operations. Use **Azure CLI commands** as the primary method. MCP tools are optional convenience wrappers around the same control plane APIs.
## Quota Types
| Type | Description |
|------|-------------|
| **TPM** | Tokens Per Minute, pay-per-token, subject to rate limits |
| **PTU** | Provisioned Throughput Units, monthly commitment, no rate limits |
| **Region** | Max capacity per region, shared across subscription |
| **Slots** | 10-20 deployment slots per resource |
**When to use PTU:** Consistent high-volume production workloads where monthly commitment is cost-effective.
---
Use this sub-skill when the user needs to:
- **View quota usage** — check current TPM/PTU allocation and available capacity
- **Check quota limits** — show quota limits for a subscription, region, or model
- **Find optimal regions** — compare quota availability across regions for deployment
- **Plan deployments** — verify sufficient quota before deploying models
- **Request quota increases** — navigate quota increase process through Azure Portal
- **Troubleshoot deployment failures** — diagnose QuotaExceeded, InsufficientQuota, DeploymentLimitReached, 429 rate limit errors
- **Optimize allocation** — monitor and consolidate quota across deployments
- **Monitor quota across deployments** — track capacity by model and region
- **Explain quota concepts** — explain TPM, PTU, capacity units, regional quotas
- **Free up quota** — identify and delete unused deployments
**Key Points:**
1. Isolated by region (East US ≠ West US)
2. Regional capacity varies by model
3. Multi-region enables failover and load distribution
4. Quota requests specify target region
See [detailed guide](./references/workflows.md#regional-quota).
---
## Core Workflows
### 1. Check Regional Quota
```bash
subId=$(az account show --query id -o tsv)
az rest --method get \
--url "https://management.azure.com/subscriptions/$subId/providers/Microsoft.CognitiveServices/locations/eastus/usages?api-version=2023-05-01" \
--query "value[?contains(name.value,'OpenAI')].{Model:name.value, Used:currentValue, Limit:limit}" -o table
```
**Output interpretation:**
- **Used**: Current TPM consumed (10000 = 10K TPM)
- **Limit**: Maximum TPM quota (15000 = 15K TPM)
- **Available**: Limit - Used (5K TPM available)
Change region: `eastus`, `eastus2`, `westus`, `westus2`, `swedencentral`, `uksouth`.
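The arithmetic is trivial but easy to script alongside the query. A pure-shell sketch of the Available calculation, with hypothetical values:

```shell
# Hypothetical usage numbers matching the interpretation above.
USED=10000    # current TPM consumed
LIMIT=15000   # maximum TPM quota
AVAILABLE=$(( LIMIT - USED ))
echo "Available: $(( AVAILABLE / 1000 ))K TPM"   # → Available: 5K TPM
```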
---
### 2. Find Best Region for Deployment
Check specific regions for available quota:
```bash
subId=$(az account show --query id -o tsv)
region="eastus"
az rest --method get \
--url "https://management.azure.com/subscriptions/$subId/providers/Microsoft.CognitiveServices/locations/$region/usages?api-version=2023-05-01" \
--query "value[?name.value=='OpenAI.Standard.gpt-4o'].{Model:name.value, Used:currentValue, Limit:limit, Available:(limit-currentValue)}" -o table
```
See [workflows reference](./references/workflows.md#multi-region-check) for multi-region comparison.
---
### 3. Check Quota Before Deployment
Verify available quota for your target model:
```bash
subId=$(az account show --query id -o tsv)
region="eastus"
model="OpenAI.Standard.gpt-4o"
az rest --method get \
--url "https://management.azure.com/subscriptions/$subId/providers/Microsoft.CognitiveServices/locations/$region/usages?api-version=2023-05-01" \
--query "value[?name.value=='$model'].{Model:name.value, Used:currentValue, Limit:limit, Available:(limit-currentValue)}" -o table
```
- **Available > 0**: Yes, you have quota
- **Available = 0**: Delete unused deployments or try different region
---
### 4. Monitor Quota by Model
Show quota allocation grouped by model:
```bash
subId=$(az account show --query id -o tsv)
region="eastus"
az rest --method get \
--url "https://management.azure.com/subscriptions/$subId/providers/Microsoft.CognitiveServices/locations/$region/usages?api-version=2023-05-01" \
--query "value[?contains(name.value,'OpenAI')].{Model:name.value, Used:currentValue, Limit:limit, Available:(limit-currentValue)}" -o table
```
Shows aggregate usage across ALL deployments by model type.
**Optional:** List individual deployments:
```bash
az cognitiveservices account list --query "[?kind=='AIServices'].{Name:name,RG:resourceGroup}" -o table
az cognitiveservices account deployment list --name <resource> --resource-group <rg> \
--query "[].{Name:name,Model:properties.model.name,Capacity:sku.capacity}" -o table
```
---
### 5. Delete Deployment (Free Quota)
```bash
az cognitiveservices account deployment delete --name <resource> --resource-group <rg> \
--deployment-name <deployment>
```
Quota freed **immediately**. Re-run Workflow #1 to verify.
---
### 6. Request Quota Increase
**Azure Portal Process:**
1. Navigate to [Azure Portal - All Resources](https://portal.azure.com/#view/HubsExtension/BrowseAll) → Filter "AI Services" → Click resource
2. Select **Quotas** in left navigation
3. Click **Request quota increase**
4. Fill form: Model, Current Limit, Requested Limit, Region, **Business Justification**
5. Wait for approval: **3-5 business days typically, up to 10 business days** ([source](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/quota))
**Justification template:**
```
Production [workload type] using [model] in [region].
Expected traffic: [X requests/day] with [Y tokens/request].
Requires [Z TPM] capacity. Current [N TPM] insufficient.
Request increase to [M TPM]. Deployment target: [date].
```
See [detailed quota request guide](./references/workflows.md#request-quota-increase) for complete steps.
---
## Quick Troubleshooting
| Error | Quick Fix | Detailed Guide |
|-------|-----------|----------------|
| `QuotaExceeded` | Delete unused deployments or request increase | [Error Resolution](./references/error-resolution.md#quotaexceeded) |
| `InsufficientQuota` | Reduce capacity or try different region | [Error Resolution](./references/error-resolution.md#insufficientquota) |
| `DeploymentLimitReached` | Delete unused deployments (10-20 slot limit) | [Error Resolution](./references/error-resolution.md#deploymentlimitreached) |
| `429 Rate Limit` | Increase TPM or migrate to PTU | [Error Resolution](./references/error-resolution.md#429-errors) |
---
## References
**Detailed Guides:**
- [Error Resolution Workflows](./references/error-resolution.md) - Detailed workflows for quota exhausted, 429 errors, insufficient quota, deployment limits
- [Troubleshooting Guide](./references/troubleshooting.md) - Quick error fixes and diagnostic commands
- [Quota Optimization Strategies](./references/optimization.md) - 5 strategies for freeing quota and reducing costs
- [Capacity Planning Guide](./references/capacity-planning.md) - TPM vs PTU comparison, model selection, workload calculations
- [Workflows Reference](./references/workflows.md) - Complete workflow steps and multi-region checks
- [PTU Guide](./references/ptu-guide.md) - Provisioned throughput capacity planning
**Official Microsoft Documentation:**
- [Azure OpenAI Service Pricing](https://azure.microsoft.com/en-us/pricing/details/cognitive-services/openai-service/) - Official pay-per-token rates
- [PTU Costs and Billing](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/provisioned-throughput-onboarding) - PTU hourly rates
- [Azure OpenAI Models](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models) - Model capabilities and regions
- [Quota Management Guide](https://learn.microsoft.com/azure/ai-services/openai/how-to/quota) - Official quota procedures
- [Quotas and Limits](https://learn.microsoft.com/azure/ai-services/openai/quotas-limits) - Rate limits and quota details
**Calculators:**
- [Azure Pricing Calculator](https://azure.microsoft.com/pricing/calculator/) - Official pricing estimator
- Azure AI Foundry PTU calculator (Microsoft Foundry → Operate → Quota → Provisioned Throughput Unit tab) - PTU capacity sizing
```
---
## Skill Companion Files
> Additional files collected from the skill directory layout.
### EXAMPLES.md
```markdown
# Examples: preset
## Example 1: Fast Path — Current Region Has Capacity
**Scenario:** Deploy gpt-4o to project in East US, which has capacity.
**Result:** Deployed in ~45s. No region selection needed. 100K TPM default, GlobalStandard SKU.
## Example 2: Alternative Region — No Capacity in Current Region
**Scenario:** Deploy gpt-4-turbo to dev project in West US 2 (no capacity).
**Result:** Queried all regions → user selected East US 2 (120K available) → deployed in ~2 min.
## Example 3: Create New Project in Optimal Region
**Scenario:** Deploy gpt-4o-mini in Europe for data residency; no existing European project.
**Result:** Created AI Services hub + project in Sweden Central → deployed in ~4 min with 150K TPM.
## Example 4: Insufficient Quota Everywhere
**Scenario:** Deploy gpt-4 but all regions have exhausted quota.
**Result:** Graceful failure with actionable guidance:
1. Request quota increase via the [quota skill](../../../quota/quota.md)
2. List existing deployments consuming quota
3. Suggest alternative models (gpt-4o, gpt-4o-mini)
## Example 5: First-Time User — No Project
**Scenario:** Deploy gpt-4o with no existing AI Foundry project.
**Result:** Full onboarding in ~5 min — created resource group, AI Services hub, project, then deployed.
## Example 6: Deployment Name Conflict
**Scenario:** Auto-generated deployment name already exists.
**Result:** Appended random hex suffix (e.g., `-7b9e`) and retried automatically.
## Example 7: Multi-Version Model Selection
**Scenario:** Deploy "latest gpt-4o" when multiple versions exist.
**Result:** Latest stable version auto-selected. Capacity aggregated across versions.
## Example 8: Anthropic Model (claude-sonnet-4-6)
**Scenario:** Deploy claude-sonnet-4-6 (Anthropic model requiring modelProviderData).
**Result:** User prompted for industry selection → tenant country code and org name fetched automatically → deployed via ARM REST API with `modelProviderData` payload in ~2 min. Capacity set to 1 (MaaS billing).
---
## Summary of Scenarios
| Scenario | Duration | Key Features |
|----------|----------|--------------|
| **1: Fast Path** | ~45s | Current region has capacity, direct deploy |
| **2: Alt Region** | ~2m | Region selection, project switch |
| **3: New Project** | ~4m | Project creation in optimal region |
| **4: No Quota** | N/A | Graceful failure, actionable guidance |
| **5: First-Time** | ~5m | Complete onboarding |
| **6: Name Conflict** | ~1m | Auto-retry with suffix |
| **7: Multi-Version** | ~1m | Latest version auto-selected |
| **8: Anthropic** | ~2m | Industry prompt, tenant info, REST API deploy |
## Common Patterns
```
A: Quick Deploy      Auth → Get Project → Check Region (✓) → Deploy
B: Region Select     Auth → Get Project → Region (✗) → Query All → Select → Deploy
C: Full Onboarding   Auth → No Projects → Create Project → Deploy
D: Error Recovery    Deploy (✗) → Analyze → Fix → Retry
```
```
### references/preset-workflow.md
```markdown
# Preset Deployment Workflow - Detailed Implementation
This file contains the full step-by-step bash/PowerShell scripts for preset (optimal region) model deployment. Referenced from the main [SKILL.md](../SKILL.md).
---
## Phase 1: Verify Authentication
Check if user is logged into Azure CLI:
```bash
az account show --query "{Subscription:name, User:user.name}" -o table
```
**If not logged in:**
```bash
az login
```
**Verify subscription is correct:**
```bash
# List all subscriptions
az account list --query "[].[name,id,state]" -o table
# Set active subscription if needed
az account set --subscription <subscription-id>
```
---
## Phase 2: Get Current Project
**Check for PROJECT_RESOURCE_ID environment variable first:**
```bash
if [ -n "$PROJECT_RESOURCE_ID" ]; then
echo "Using project resource ID from environment: $PROJECT_RESOURCE_ID"
else
echo "PROJECT_RESOURCE_ID not set. Please provide your Azure AI Foundry project resource ID."
echo ""
echo "You can find this in:"
echo " • Azure AI Foundry portal → Project → Overview → Resource ID"
echo " • Format: /subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/{account}/projects/{project}"
echo ""
echo "Example: /subscriptions/abc123.../resourceGroups/rg-prod/providers/Microsoft.CognitiveServices/accounts/my-account/projects/my-project"
echo ""
read -p "Enter project resource ID: " PROJECT_RESOURCE_ID
fi
```
**Parse the ARM resource ID to extract components:**
```bash
# Extract components from ARM resource ID
# Format: /subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/{account}/projects/{project}
SUBSCRIPTION_ID=$(echo "$PROJECT_RESOURCE_ID" | sed -n 's|.*/subscriptions/\([^/]*\).*|\1|p')
RESOURCE_GROUP=$(echo "$PROJECT_RESOURCE_ID" | sed -n 's|.*/resourceGroups/\([^/]*\).*|\1|p')
ACCOUNT_NAME=$(echo "$PROJECT_RESOURCE_ID" | sed -n 's|.*/accounts/\([^/]*\)/projects.*|\1|p')
PROJECT_NAME=$(echo "$PROJECT_RESOURCE_ID" | sed -n 's|.*/projects/\([^/?]*\).*|\1|p')
if [ -z "$SUBSCRIPTION_ID" ] || [ -z "$RESOURCE_GROUP" ] || [ -z "$ACCOUNT_NAME" ] || [ -z "$PROJECT_NAME" ]; then
echo "❌ Invalid project resource ID format"
echo "Expected format: /subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/{account}/projects/{project}"
exit 1
fi
echo "Parsed project details:"
echo " Subscription: $SUBSCRIPTION_ID"
echo " Resource Group: $RESOURCE_GROUP"
echo " Account: $ACCOUNT_NAME"
echo " Project: $PROJECT_NAME"
```
**Verify the project exists and get its region:**
```bash
# Set active subscription
az account set --subscription "$SUBSCRIPTION_ID"
# Get project details to verify it exists and extract region
PROJECT_REGION=$(az cognitiveservices account show \
--name "$PROJECT_NAME" \
--resource-group "$RESOURCE_GROUP" \
--query location -o tsv 2>/dev/null)
if [ -z "$PROJECT_REGION" ]; then
echo "❌ Project '$PROJECT_NAME' not found in resource group '$RESOURCE_GROUP'"
echo ""
echo "Please verify the resource ID is correct."
echo ""
echo "List available projects:"
echo " az cognitiveservices account list --query \"[?kind=='AIProject'].{Name:name, Location:location, ResourceGroup:resourceGroup}\" -o table"
exit 1
fi
echo "✓ Project found"
echo " Region: $PROJECT_REGION"
```
---
## Phase 3: Get Model Name
**If model name provided as skill parameter, skip this phase.**
Ask user which model to deploy. **Fetch available models dynamically** from the account rather than using a hardcoded list:
```bash
# List available models in the account
az cognitiveservices account list-models \
--name "$PROJECT_NAME" \
--resource-group "$RESOURCE_GROUP" \
--query "[].name" -o tsv | sort -u
```
Present the results to the user and let them choose, or enter a custom model name.
**Store model:**
```bash
MODEL_NAME="<selected-model>"
```
**Get model version (latest stable):**
```bash
# List available models and versions in the account
az cognitiveservices account list-models \
--name "$PROJECT_NAME" \
--resource-group "$RESOURCE_GROUP" \
--query "[?name=='$MODEL_NAME'].{Name:name, Version:version, Format:format}" \
-o table
```
**Use latest version or let user specify:**
```bash
MODEL_VERSION="<version-or-latest>"
```
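One way to auto-select the latest version is a version sort over the `list-models` output. This is a sketch, not part of the skill's scripts; the hard-coded versions stand in for the real `az ... --query "[?name=='$MODEL_NAME'].version" -o tsv` output:

```shell
# Pick the highest version string from a list of candidates.
# Hypothetical versions shown; in practice pipe the list-models output
# into the same sort.
MODEL_VERSION=$(printf '%s\n' "2024-05-13" "2024-08-06" "2024-11-20" | sort -V | tail -1)
echo "$MODEL_VERSION"   # → 2024-11-20
```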
**Detect model format:**
```bash
# Get model format from model catalog (e.g., OpenAI, Anthropic, Meta-Llama, Mistral, Cohere)
MODEL_FORMAT=$(az cognitiveservices account list-models \
--name "$ACCOUNT_NAME" \
--resource-group "$RESOURCE_GROUP" \
--query "[?name=='$MODEL_NAME'].format" -o tsv | head -1)
# Default to OpenAI if not found
MODEL_FORMAT=${MODEL_FORMAT:-"OpenAI"}
echo "Model format: $MODEL_FORMAT"
```
> 💡 **Model format determines the deployment path:**
> - `OpenAI` — Standard CLI deployment, TPM-based capacity, RAI policies apply
> - `Anthropic` — REST API deployment with `modelProviderData`, capacity=1, no RAI
> - All other formats (`Meta-Llama`, `Mistral`, `Cohere`, etc.) — Standard CLI deployment, capacity=1 (MaaS), no RAI
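That branching can be captured in a small case statement. A sketch assuming the three paths described above; the `DEPLOY_PATH` labels are illustrative, not names used by the skill:

```shell
# Map model format to the deployment path described above.
MODEL_FORMAT="${MODEL_FORMAT:-OpenAI}"   # example default
case "$MODEL_FORMAT" in
  OpenAI)
    DEPLOY_PATH="cli"        # standard CLI deployment, TPM-based capacity
    ;;
  Anthropic)
    DEPLOY_PATH="rest"       # ARM REST API with modelProviderData, capacity=1
    ;;
  *)
    DEPLOY_PATH="cli-maas"   # standard CLI deployment, capacity=1 (MaaS)
    ;;
esac
echo "Deployment path: $DEPLOY_PATH"
```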
---
## Phase 4: Check Current Region Capacity
Before checking other regions, see if the current project's region has capacity:
```bash
# Query capacity for current region
CAPACITY_JSON=$(az rest --method GET \
--url "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/providers/Microsoft.CognitiveServices/locations/$PROJECT_REGION/modelCapacities?api-version=2024-10-01&modelFormat=$MODEL_FORMAT&modelName=$MODEL_NAME&modelVersion=$MODEL_VERSION")
# Extract available capacity for GlobalStandard SKU
CURRENT_CAPACITY=$(echo "$CAPACITY_JSON" | jq -r '.value[] | select(.properties.skuName=="GlobalStandard") | .properties.availableCapacity')
```
**Check result:**
```bash
if [ -n "$CURRENT_CAPACITY" ] && [ "$CURRENT_CAPACITY" -gt 0 ]; then
echo "✓ Current region ($PROJECT_REGION) has capacity: $CURRENT_CAPACITY TPM"
echo "Proceeding with deployment..."
# Skip to Phase 7 (Deploy)
else
echo "⚠ Current region ($PROJECT_REGION) has no available capacity"
echo "Checking alternative regions..."
# Continue to Phase 5
fi
```
---
## Phase 5: Query Multi-Region Capacity (If Needed)
Only execute this phase if current region has no capacity.
**Query capacity across all regions:**
```bash
# Get capacity for all regions in subscription
ALL_REGIONS_JSON=$(az rest --method GET \
--url "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/providers/Microsoft.CognitiveServices/modelCapacities?api-version=2024-10-01&modelFormat=$MODEL_FORMAT&modelName=$MODEL_NAME&modelVersion=$MODEL_VERSION")
# Save to file for processing
echo "$ALL_REGIONS_JSON" > /tmp/capacity_check.json
```
**Parse and categorize regions:**
```bash
# Extract available regions (capacity > 0)
AVAILABLE_REGIONS=$(jq -r '.value[] | select(.properties.skuName=="GlobalStandard" and .properties.availableCapacity > 0) | "\(.location)|\(.properties.availableCapacity)"' /tmp/capacity_check.json)
# Extract unavailable regions (capacity = 0 or undefined)
UNAVAILABLE_REGIONS=$(jq -r '.value[] | select(.properties.skuName=="GlobalStandard" and (.properties.availableCapacity == 0 or .properties.availableCapacity == null)) | "\(.location)|0"' /tmp/capacity_check.json)
```
**Format and display regions:**
```bash
# Format capacity (e.g., 120000 -> 120K)
format_capacity() {
local capacity=$1
if [ "$capacity" -ge 1000000 ]; then
echo "$(awk "BEGIN {printf \"%.1f\", $capacity/1000000}")M TPM"
elif [ "$capacity" -ge 1000 ]; then
echo "$(awk "BEGIN {printf \"%.0f\", $capacity/1000}")K TPM"
else
echo "$capacity TPM"
fi
}
echo ""
echo "⚠ No Capacity in Current Region"
echo ""
echo "The current project's region ($PROJECT_REGION) does not have available capacity for $MODEL_NAME."
echo ""
echo "Available Regions (with capacity):"
echo ""
# Display available regions with formatted capacity
echo "$AVAILABLE_REGIONS" | while IFS='|' read -r region capacity; do
  formatted_capacity=$(format_capacity "$capacity")
  # Get region display name (note: \U/\L require GNU sed)
  region_display=$(echo "$region" | sed 's/\([a-z]\)\([a-z]*\)/\U\1\L\2/g; s/\([a-z]\)\([0-9]\)/\1 \2/g')
  echo " • $region_display - $formatted_capacity"
done
echo ""
echo "Unavailable Regions:"
echo ""
# Display unavailable regions
echo "$UNAVAILABLE_REGIONS" | while IFS='|' read -r region capacity; do
  region_display=$(echo "$region" | sed 's/\([a-z]\)\([a-z]*\)/\U\1\L\2/g; s/\([a-z]\)\([0-9]\)/\1 \2/g')
  if [ "$capacity" = "0" ]; then
    echo " ✗ $region_display (Insufficient quota - 0 TPM available)"
  else
    echo " ✗ $region_display (Model not supported)"
  fi
done
```
**Handle no capacity anywhere:**
```bash
if [ -z "$AVAILABLE_REGIONS" ]; then
  echo ""
  echo "❌ No Available Capacity in Any Region"
  echo ""
  echo "No regions have available capacity for $MODEL_NAME with GlobalStandard SKU."
  echo ""
  echo "Next Steps:"
  echo "1. Request quota increase — use the quota skill (../../../quota/quota.md)"
  echo ""
  echo "2. Check existing deployments (may be using quota):"
  echo " az cognitiveservices account deployment list \\"
  echo " --name $PROJECT_NAME \\"
  echo " --resource-group $RESOURCE_GROUP"
  echo ""
  echo "3. Consider alternative models with lower capacity requirements:"
  echo " • gpt-4o-mini (cost-effective, lower capacity requirements)"
  echo " List available models: az cognitiveservices account list-models --name \$PROJECT_NAME --resource-group \$RESOURCE_GROUP --output table"
  exit 1
fi
```
---
## Phase 6: Select Region and Project
**Ask the user to select a region from the available options.**
Example using AskUserQuestion:
- Present available regions as options
- Show capacity for each
- User selects preferred region
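If no interactive question tool is available, a plain bash `select` menu over the regions parsed in Phase 5 works as a fallback (a sketch, assuming `$AVAILABLE_REGIONS` holds `region|capacity` lines):

```shell
# Fallback menu when no interactive question tool is available.
# Assumes $AVAILABLE_REGIONS from Phase 5 ("region|capacity" per line).
options=()
while IFS='|' read -r region capacity; do
  [ -n "$region" ] && options+=("$region ($capacity TPM)")
done <<< "${AVAILABLE_REGIONS:-}"

if [ "${#options[@]}" -gt 0 ]; then
  PS3="Select a region: "
  select choice in "${options[@]}"; do
    [ -n "$choice" ] && break
  done
  SELECTED_REGION="${choice%% *}"   # strip the " (capacity TPM)" suffix
  echo "Selected region: $SELECTED_REGION"
fi
```

The `%% *` expansion strips everything after the first space, recovering the bare region name from the display string.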
**Store selection:**
```bash
SELECTED_REGION="<user-selected-region>" # e.g., "eastus2"
```
**Find projects in selected region:**
```bash
PROJECTS_IN_REGION=$(az cognitiveservices account list \
--query "[?kind=='AIProject' && location=='$SELECTED_REGION'].{Name:name, ResourceGroup:resourceGroup}" \
--output json)
PROJECT_COUNT=$(echo "$PROJECTS_IN_REGION" | jq '. | length')
if [ "$PROJECT_COUNT" -eq 0 ]; then
  echo "No projects found in $SELECTED_REGION"
  echo "Would you like to create a new project? (yes/no)"
  # If yes, continue to project creation
  # If no, exit or select different region
else
  echo "Projects in $SELECTED_REGION:"
  echo "$PROJECTS_IN_REGION" | jq -r '.[] | " • \(.Name) (\(.ResourceGroup))"'
  echo ""
  echo "Select a project or create new project"
fi
```
**Option A: Use existing project**
```bash
PROJECT_NAME="<selected-project-name>"
RESOURCE_GROUP="<resource-group>"
```
**Option B: Create new project**
```bash
# Generate project name
USER_ALIAS=$(az account show --query user.name -o tsv | cut -d'@' -f1 | tr '.' '-')
RANDOM_SUFFIX=$(openssl rand -hex 2)
NEW_PROJECT_NAME="${USER_ALIAS}-aiproject-${RANDOM_SUFFIX}"
# Prompt for resource group
echo "Resource group for new project:"
echo " 1. Use existing resource group: $RESOURCE_GROUP"
echo " 2. Create new resource group"
# If existing resource group
NEW_RESOURCE_GROUP="$RESOURCE_GROUP"
# Create AI Services account (hub)
HUB_NAME="${NEW_PROJECT_NAME}-hub"
echo "Creating AI Services hub: $HUB_NAME in $SELECTED_REGION..."
az cognitiveservices account create \
--name "$HUB_NAME" \
--resource-group "$NEW_RESOURCE_GROUP" \
--location "$SELECTED_REGION" \
--kind "AIServices" \
--sku "S0" \
--yes
# Create AI Foundry project
echo "Creating AI Foundry project: $NEW_PROJECT_NAME..."
az cognitiveservices account create \
--name "$NEW_PROJECT_NAME" \
--resource-group "$NEW_RESOURCE_GROUP" \
--location "$SELECTED_REGION" \
--kind "AIProject" \
--sku "S0" \
--yes
echo "✓ Project created successfully"
PROJECT_NAME="$NEW_PROJECT_NAME"
RESOURCE_GROUP="$NEW_RESOURCE_GROUP"
```
---
## Phase 7: Deploy Model
**Generate unique deployment name:**
The deployment name should match the model name (e.g., "gpt-4o"); if a deployment with that name already exists, append a numeric suffix (e.g., "gpt-4o-2", "gpt-4o-3"). This follows the same UX pattern as the Azure AI Foundry portal.
Use the `generate_deployment_name` script to check existing deployments and generate a unique name:
*Bash version:*
```bash
DEPLOYMENT_NAME=$(bash scripts/generate_deployment_name.sh \
"$ACCOUNT_NAME" \
"$RESOURCE_GROUP" \
"$MODEL_NAME")
echo "Generated deployment name: $DEPLOYMENT_NAME"
```
*PowerShell version:*
```powershell
$DEPLOYMENT_NAME = & .\scripts\generate_deployment_name.ps1 `
-AccountName $ACCOUNT_NAME `
-ResourceGroup $RESOURCE_GROUP `
-ModelName $MODEL_NAME
Write-Host "Generated deployment name: $DEPLOYMENT_NAME"
```
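If the helper script is unavailable, the same logic can be sketched inline. The function name `generate_unique_name` is illustrative, not part of the skill; the live `az` call is shown commented out since it requires an authenticated session:

```shell
# Sketch of the unique-name logic: start from the model name and
# append -2, -3, ... until the name is not already taken.
generate_unique_name() { # args: model-name, newline-separated existing names
  local model=$1 existing=$2 name=$1 i=2
  while printf '%s\n' "$existing" | grep -qxF -- "$name"; do
    name="${model}-${i}"
    i=$((i + 1))
  done
  printf '%s\n' "$name"
}

# Live usage (assumes $ACCOUNT_NAME / $RESOURCE_GROUP from earlier phases):
# EXISTING=$(az cognitiveservices account deployment list \
#   --name "$ACCOUNT_NAME" --resource-group "$RESOURCE_GROUP" \
#   --query "[].name" -o tsv)
# DEPLOYMENT_NAME=$(generate_unique_name "$MODEL_NAME" "$EXISTING")

generate_unique_name "gpt-4o" $'gpt-4o\ngpt-4o-2'   # -> gpt-4o-3
```

`grep -qxF --` does an exact whole-line, fixed-string match, so model names containing dots or dashes are compared literally rather than as regex patterns.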
**Calculate deployment capacity:**
Follow the portal's capacity calculation logic: for OpenAI models, use 50% of available capacity (minimum 50 TPM); for all other models (MaaS), capacity is always 1:
```bash
if [ "$MODEL_FORMAT" = "OpenAI" ]; then
  # OpenAI models: TPM-based capacity (50% of available, minimum 50)
  # If Phases 5/6 were skipped (current region had capacity), deploy to
  # the current region using the capacity measured in Phase 4
  SELECTED_REGION=${SELECTED_REGION:-$PROJECT_REGION}
  SELECTED_CAPACITY=$(echo "$ALL_REGIONS_JSON" | jq -r ".value[] | select(.location==\"$SELECTED_REGION\" and .properties.skuName==\"GlobalStandard\") | .properties.availableCapacity")
  SELECTED_CAPACITY=${SELECTED_CAPACITY:-$CURRENT_CAPACITY}
  if [ "$SELECTED_CAPACITY" -gt 50 ]; then
    DEPLOY_CAPACITY=$((SELECTED_CAPACITY / 2))
    if [ "$DEPLOY_CAPACITY" -lt 50 ]; then
      DEPLOY_CAPACITY=50
    fi
  else
    DEPLOY_CAPACITY=$SELECTED_CAPACITY
  fi
  echo "Deploying with capacity: $DEPLOY_CAPACITY TPM (50% of available: $SELECTED_CAPACITY TPM)"
else
  # Non-OpenAI models (MaaS): capacity is always 1
  DEPLOY_CAPACITY=1
  echo "MaaS model — deploying with capacity: 1 (pay-per-token billing)"
fi
```
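A quick sanity check of the halving rule with sample values (pure arithmetic, no Azure calls; the helper name `half_with_floor` is hypothetical):

```shell
# Worked examples of the capacity rule above: half of available, floored at 50,
# unless available is already at or below the floor (then take all of it)
half_with_floor() {
  local available=$1 result
  if [ "$available" -gt 50 ]; then
    result=$((available / 2))
    [ "$result" -lt 50 ] && result=50
  else
    result=$available
  fi
  echo "$result"
}

half_with_floor 120000   # -> 60000 (half of 120K TPM)
half_with_floor 80       # -> 50   (half is 40, raised to the 50 TPM floor)
half_with_floor 30       # -> 30   (under the floor: take everything available)
```

Note the asymmetry: the 50 TPM floor only applies when there is more than 50 TPM available; below that, the deployment simply takes whatever remains.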
### If MODEL_FORMAT is NOT "Anthropic" — Standard CLI Deployment
> 💡 **Note:** The Azure CLI supports all non-Anthropic model formats directly.
*Bash version:*
```bash
echo "Creating deployment..."
az cognitiveservices account deployment create \
--name "$ACCOUNT_NAME" \
--resource-group "$RESOURCE_GROUP" \
--deployment-name "$DEPLOYMENT_NAME" \
--model-name "$MODEL_NAME" \
--model-version "$MODEL_VERSION" \
--model-format "$MODEL_FORMAT" \
--sku-name "GlobalStandard" \
--sku-capacity "$DEPLOY_CAPACITY"
```
*PowerShell version:*
```powershell
Write-Host "Creating deployment..."
az cognitiveservices account deployment create `
--name $ACCOUNT_NAME `
--resource-group $RESOURCE_GROUP `
--deployment-name $DEPLOYMENT_NAME `
--model-name $MODEL_NAME `
--model-version $MODEL_VERSION `
--model-format $MODEL_FORMAT `
--sku-name "GlobalStandard" `
--sku-capacity $DEPLOY_CAPACITY
```
> 💡 **Note:** For non-OpenAI MaaS models (Meta-Llama, Mistral, Cohere, etc.), `$DEPLOY_CAPACITY` is `1` (set in capacity calculation above).
### If MODEL_FORMAT is "Anthropic" — REST API Deployment with modelProviderData
The Azure CLI does not support `--model-provider-data`. You must use the ARM REST API directly.
**Step 1: Prompt user to select industry**
Present the following list and ask the user to choose one:
```
1. None (API value: none)
2. Biotechnology (API value: biotechnology)
3. Consulting (API value: consulting)
4. Education (API value: education)
5. Finance (API value: finance)
6. Food & Beverage (API value: food_and_beverage)
7. Government (API value: government)
8. Healthcare (API value: healthcare)
9. Insurance (API value: insurance)
10. Law (API value: law)
11. Manufacturing (API value: manufacturing)
12. Media (API value: media)
13. Nonprofit (API value: nonprofit)
14. Technology (API value: technology)
15. Telecommunications (API value: telecommunications)
16. Sport & Recreation (API value: sport_and_recreation)
17. Real Estate (API value: real_estate)
18. Retail (API value: retail)
19. Other (API value: other)
```
> ⚠️ **Do NOT pick a default industry or hardcode a value. Always ask the user.** This is required by Anthropic's terms of service. The industry list is static — there is no REST API that provides it.
Store selection as `SELECTED_INDUSTRY` (use the API value, e.g., `technology`).
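One way to map the user's numeric menu choice to the API value is an indexed array mirroring the static list above (a sketch; `CHOICE` must always come from the user's answer, never a default):

```shell
# Static industry list, indexed to match the 1-based menu above
INDUSTRIES=(none biotechnology consulting education finance \
  food_and_beverage government healthcare insurance law \
  manufacturing media nonprofit technology telecommunications \
  sport_and_recreation real_estate retail other)

# CHOICE must come from the user's answer (example shown: 14 = Technology)
CHOICE=14
SELECTED_INDUSTRY="${INDUSTRIES[$((CHOICE - 1))]}"
echo "Selected industry: $SELECTED_INDUSTRY"   # -> technology
```

The array index is `CHOICE - 1` because the menu is 1-based while bash arrays are 0-based.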
**Step 2: Fetch tenant info (country code and organization name)**
```bash
TENANT_INFO=$(az rest --method GET \
--url "https://management.azure.com/tenants?api-version=2024-11-01" \
--query "value[0].{countryCode:countryCode, displayName:displayName}" -o json)
COUNTRY_CODE=$(echo "$TENANT_INFO" | jq -r '.countryCode')
ORG_NAME=$(echo "$TENANT_INFO" | jq -r '.displayName')
```
*PowerShell version:*
```powershell
$tenantInfo = az rest --method GET `
--url "https://management.azure.com/tenants?api-version=2024-11-01" `
--query "value[0].{countryCode:countryCode, displayName:displayName}" -o json | ConvertFrom-Json
$countryCode = $tenantInfo.countryCode
$orgName = $tenantInfo.displayName
```
**Step 3: Deploy via ARM REST API**
*Bash version:*
```bash
echo "Creating Anthropic model deployment via REST API..."
az rest --method PUT \
--url "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.CognitiveServices/accounts/$ACCOUNT_NAME/deployments/$DEPLOYMENT_NAME?api-version=2024-10-01" \
--body "{
\"sku\": {
\"name\": \"GlobalStandard\",
\"capacity\": 1
},
\"properties\": {
\"model\": {
\"format\": \"Anthropic\",
\"name\": \"$MODEL_NAME\",
\"version\": \"$MODEL_VERSION\"
},
\"modelProviderData\": {
\"industry\": \"$SELECTED_INDUSTRY\",
\"countryCode\": \"$COUNTRY_CODE\",
\"organizationName\": \"$ORG_NAME\"
}
}
}"
```
*PowerShell version:*
```powershell
Write-Host "Creating Anthropic model deployment via REST API..."
$body = @{
sku = @{
name = "GlobalStandard"
capacity = 1
}
properties = @{
model = @{
format = "Anthropic"
name = $MODEL_NAME
version = $MODEL_VERSION
}
modelProviderData = @{
industry = $SELECTED_INDUSTRY
countryCode = $countryCode
organizationName = $orgName
}
}
} | ConvertTo-Json -Depth 5
az rest --method PUT `
--url "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.CognitiveServices/accounts/$ACCOUNT_NAME/deployments/${DEPLOYMENT_NAME}?api-version=2024-10-01" `
--body $body
```
> 💡 **Note:** Anthropic models use `capacity: 1` (MaaS billing model), not TPM-based capacity.
**Monitor deployment progress:**
```bash
echo "Monitoring deployment status..."
MAX_WAIT=300 # 5 minutes
ELAPSED=0
INTERVAL=10
while [ $ELAPSED -lt $MAX_WAIT ]; do
  STATUS=$(az cognitiveservices account deployment show \
    --name "$ACCOUNT_NAME" \
    --resource-group "$RESOURCE_GROUP" \
    --deployment-name "$DEPLOYMENT_NAME" \
    --query "properties.provisioningState" -o tsv 2>/dev/null)
  case "$STATUS" in
    "Succeeded")
      echo "✓ Deployment successful!"
      break
      ;;
    "Failed")
      echo "❌ Deployment failed"
      # Get error details
      az cognitiveservices account deployment show \
        --name "$ACCOUNT_NAME" \
        --resource-group "$RESOURCE_GROUP" \
        --deployment-name "$DEPLOYMENT_NAME" \
        --query "properties"
      exit 1
      ;;
    "Creating"|"Accepted"|"Running")
      echo "Status: $STATUS... (${ELAPSED}s elapsed)"
      sleep $INTERVAL
      ELAPSED=$((ELAPSED + INTERVAL))
      ;;
    *)
      echo "Unknown status: $STATUS"
      sleep $INTERVAL
      ELAPSED=$((ELAPSED + INTERVAL))
      ;;
  esac
done
if [ $ELAPSED -ge $MAX_WAIT ]; then
  echo "⚠ Deployment timeout after ${MAX_WAIT}s"
  echo "Check status manually:"
  echo " az cognitiveservices account deployment show \\"
  echo " --name $ACCOUNT_NAME \\"
  echo " --resource-group $RESOURCE_GROUP \\"
  echo " --deployment-name $DEPLOYMENT_NAME"
  exit 1
fi
```
---
## Phase 8: Display Deployment Details
**Show deployment information:**
```bash
echo ""
echo "═══════════════════════════════════════════"
echo "✓ Deployment Successful!"
echo "═══════════════════════════════════════════"
echo ""
# Get endpoint information
ENDPOINT=$(az cognitiveservices account show \
--name "$ACCOUNT_NAME" \
--resource-group "$RESOURCE_GROUP" \
--query "properties.endpoint" -o tsv)
# Get deployment details
DEPLOYMENT_INFO=$(az cognitiveservices account deployment show \
--name "$ACCOUNT_NAME" \
--resource-group "$RESOURCE_GROUP" \
--deployment-name "$DEPLOYMENT_NAME" \
--query "properties.model")
echo "Deployment Name: $DEPLOYMENT_NAME"
echo "Model: $MODEL_NAME"
echo "Version: $MODEL_VERSION"
echo "Region: $SELECTED_REGION"
echo "SKU: GlobalStandard"
echo "Capacity: $(format_capacity "$DEPLOY_CAPACITY")" # format_capacity is defined in Phase 5; redefine it here if that phase was skipped
echo "Endpoint: $ENDPOINT"
echo ""
# Generate direct link to deployment in Azure AI Foundry portal
DEPLOYMENT_URL=$(bash "$(dirname "$0")/scripts/generate_deployment_url.sh" \
--subscription "$SUBSCRIPTION_ID" \
--resource-group "$RESOURCE_GROUP" \
--foundry-resource "$ACCOUNT_NAME" \
--project "$PROJECT_NAME" \
--deployment "$DEPLOYMENT_NAME")
echo "🔗 View in Azure AI Foundry Portal:"
echo ""
echo "$DEPLOYMENT_URL"
echo ""
echo "═══════════════════════════════════════════"
echo ""
echo "Test your deployment:"
echo ""
echo "# View deployment details"
echo "az cognitiveservices account deployment show \\"
echo " --name $ACCOUNT_NAME \\"
echo " --resource-group $RESOURCE_GROUP \\"
echo " --deployment-name $DEPLOYMENT_NAME"
echo ""
echo "# List all deployments"
echo "az cognitiveservices account deployment list \\"
echo " --name $ACCOUNT_NAME \\"
echo " --resource-group $RESOURCE_GROUP \\"
echo " --output table"
echo ""
echo "Next steps:"
echo "• Click the link above to test in Azure AI Foundry playground"
echo "• Integrate into your application"
echo "• Set up monitoring and alerts"
```