AcuOps Operations Runbook¶
Standard Operating Procedures for managing AcuOps CI/CD pipeline clients.
Table of Contents¶
- Client Onboarding
- Client Offboarding
- Daily Operations
- Incident Response
- Deployment Support
- Monthly Reporting
- Maintenance Windows
- Escalation Procedures
Client Onboarding¶
Automated Onboarding¶
Use the onboarding script for standard client setup:
./scripts/onboard-client.sh \
--client <client-id> \
--name "Client Name" \
--acumatica-url https://client.acumatica.com \
--project-name ClientCustomizations \
--tier standard
This script:
1. Creates a private repo from the AcuOps template
2. Sets GitHub Variables (project name, co-publish list)
3. Prompts for and sets Acumatica API credentials as GitHub Secrets
4. Enables branch protection on main
5. Generates acuops.yaml via setup.sh --non-interactive
6. Triggers initial CI validation
7. Registers client in acuops-clients.yaml
Dry Run First¶
Always dry-run before onboarding a new client:
./scripts/onboard-client.sh --dry-run --client acme --name "ACME" \
--acumatica-url https://acme.acumatica.com --project-name AcmeProject
Post-Onboarding Checklist¶
- [ ] Verify repo created at
https://github.com/<org>/acuops-<client> - [ ] Confirm branch protection is enabled on
main - [ ] Test API credentials: trigger a deploy workflow manually
- [ ] Client exports customization project from Acumatica SM204505
- [ ] First deploy succeeds
- [ ] Client added to monitoring (
acuops-manage.py health) - [ ] Share USAGE.md and TROUBLESHOOTING.md with client team
- [ ] Schedule onboarding call (Standard/Premium tiers)
Manual Steps (if automated onboarding fails)¶
- Create repo:
gh repo create <org>/acuops-<client> --template <template> --private - Set secrets:
gh secret set ACUMATICA_PROD_URL --body "..." --repo <org>/acuops-<client> - Set variables:
gh variable set CUSTOMIZATION_PROJECT_NAME --body "..." --repo <org>/acuops-<client> - Register:
python scripts/acuops-manage.py add-client --id <id> --name "..." --repo <org>/acuops-<client>
Client Offboarding¶
Standard Offboarding¶
- Notify client of offboarding timeline (30 days for Standard, 60 for Premium)
- Export final backup of customization package
- Remove client from monitoring registry
- Archive the GitHub repository
- Delete GitHub Secrets (credentials)
- Remove from
acuops-clients.yaml - Send final report
# Remove from registry
python scripts/acuops-manage.py remove-client <client-id>
# Archive the repo (preserves history)
gh repo archive <org>/acuops-<client> --yes
# Delete secrets (rotate client's Acumatica API password)
gh secret delete ACUMATICA_PROD_URL --repo <org>/acuops-<client>
gh secret delete ACUMATICA_PROD_USERNAME --repo <org>/acuops-<client>
gh secret delete ACUMATICA_PROD_PASSWORD --repo <org>/acuops-<client>
gh secret delete ACUMATICA_PROD_TENANT --repo <org>/acuops-<client>
Important¶
- Always advise client to rotate their Acumatica API credentials after offboarding
- Retain archived repo for 90 days before deletion (contractual obligation)
- Generate final monthly report before archiving
Daily Operations¶
Health Monitoring¶
Run daily health check across all clients:
Expected output: all clients show "healthy" status. Address any failures immediately.
Health Check Criteria¶
| Check | Healthy | Unhealthy |
|---|---|---|
| Last deploy | Within 30 days | No deploys in 30+ days |
| Recent CI | Passing | 3+ consecutive failures |
| Workflow status | Enabled | Disabled or missing |
| API connectivity | 200 OK | Timeout or auth failure |
Triage Priority¶
| Tier | Response Time | Severity |
|---|---|---|
| Premium | 4 hours | Any failure = P1 |
| Standard | 8 hours | Deploy failure = P1, CI failure = P2 |
| Self-Service | Best effort | Critical only |
Incident Response¶
Deploy Failure¶
Symptoms: Workflow run shows "failure" conclusion.
Steps:
1. Check workflow run logs: gh run view <run-id> --repo <repo> --log-failed
2. Identify failure stage (validation, import, publish, polling)
3. Common causes and fixes:
| Error | Cause | Fix |
|---|---|---|
401 Unauthorized |
Expired API credentials | Rotate password, update secret |
Import failed |
Invalid package structure | Validate project.xml, check file paths |
Publish timeout |
Large package or server load | Increase poll_timeout in acuops.yaml |
Co-publish conflict |
Incompatible customization projects | Review conflicting code, coordinate with other project owners |
Validation error |
project.xml schema issues | Run python scripts/validate-project.py locally |
- If credentials issue: coordinate with client to rotate API password
- If package issue: review recent commits, revert if necessary
- Document incident in client status notes
CI Failure¶
Symptoms: Lint or test job fails on PR or push.
Steps:
1. Check CI logs: gh run view <run-id> --repo <repo> --log-failed
2. Common causes:
- Lint failures: shellcheck warnings, Python syntax
- Test failures: flaky tests, environment issues
3. For flaky tests: re-run the workflow
4. For real failures: review recent changes, fix, and push
Acumatica API Outage¶
Symptoms: All deploy and health checks fail with connection errors.
Steps:
1. Check Acumatica status page (if available)
2. Test API manually: curl -s https://client.acumatica.com/entity/auth/login
3. If Acumatica is down: wait and monitor, no action needed
4. If network issue: check firewall rules, IP allowlists
5. Notify affected clients (Standard/Premium)
Deployment Support¶
Manual Deploy¶
Trigger a deploy manually via GitHub Actions:
Deploy with Intelligence¶
Deploy Intelligence predicts success probability based on historical data:
# Check prediction before deploying
python scripts/intelligence.py predict
# Record outcome after deploy
python scripts/intelligence.py record --success
python scripts/intelligence.py record --failure --reason "co-publish conflict"
Rollback Procedure¶
- Download the pre-deploy backup (if backup is enabled):
- Import the backup package via the same deploy workflow
- Verify rollback succeeded
Staging Validation¶
For clients with staging environments:
# Deploy to staging first
gh workflow run acuops-deploy.yml \
--repo <org>/acuops-<client> \
--ref staging
# Verify staging, then merge to main for production
Monthly Reporting¶
Generate Reports¶
# Single client
python scripts/generate-report.py heritage --month 2026-03
# All clients
python scripts/generate-report.py --all --month 2026-03 --output-dir reports/2026-03
Report Contents¶
Each report includes: - Executive Summary — Key metrics at a glance - Deployment Statistics — Count, success rate, duration trends - CI/CD Health — Test pass rates, lint compliance - SLA Compliance — Target vs actual for the service tier - Recommendations — Actionable items for improvement
Delivery Schedule¶
| Tier | Delivery | Format |
|---|---|---|
| Premium | By 5th of month | Report + review call |
| Standard | By 10th of month | Report via email |
| Self-Service | On request | Self-serve via CLI |
Maintenance Windows¶
Scheduled Maintenance¶
- Acumatica publishes restart the app pool. Schedule after business hours.
- Coordinate with client for maintenance windows (Premium tier: pre-agreed schedule)
- Post maintenance notice to client's Slack/Teams if integrated
AcuOps Pipeline Updates¶
When updating the AcuOps template for all clients:
- Tag a new release:
git tag v1.x.x && git push --tags - Test changes on reference client first (Heritage Fabrics)
- For each managed client:
- Create PR from template update
- Run CI to verify compatibility
- Merge after review
- See
docs/UPGRADE.mdfor client-facing upgrade instructions
Escalation Procedures¶
Escalation Matrix¶
| Level | Criteria | Action | Timeline |
|---|---|---|---|
| L1 | CI failure, non-blocking | Fix in next business day | 24h |
| L2 | Deploy failure, client impacted | Investigate immediately | 4h (Premium), 8h (Standard) |
| L3 | Data loss, security incident | All hands, client notification | 1h |
Communication Templates¶
Deploy Failure (L2):
We detected a deployment failure for [project] at [time]. Our team is investigating. Current status: [investigating/identified/resolved]. Next update in [timeframe].
Planned Maintenance:
Scheduled maintenance for [project] on [date] at [time] ([duration] estimated). During this window, the Acumatica instance will restart. No action required from your team.
Resolution:
The deployment issue for [project] has been resolved. Root cause: [brief description]. Preventive measure: [action taken]. Full details in the monthly report.
Appendix¶
Environment Variables¶
| Variable | Required | Description |
|---|---|---|
GITHUB_TOKEN |
Yes | GitHub PAT with repo scope |
ACUOPS_CLIENTS |
No | Path to client registry (default: acuops-clients.yaml) |
Service Tier Comparison¶
| Feature | Self-Service | Standard | Premium |
|---|---|---|---|
| Pipeline access | BSL 1.1 | Managed | Managed |
| Deploy SLA | — | 95% success | 99% success |
| Response time | — | 8 hours | 4 hours |
| Monthly reports | — | Report + call | |
| Deploy Intelligence | Optional | Included | Included |
| Dedicated support | — | — | Named contact |
Useful Commands¶
# List all clients
python scripts/acuops-manage.py list-clients
# Check specific client health
python scripts/acuops-manage.py client-status heritage
# Run validation on a project
python scripts/validate-project.py Customization/_project/project.xml
# Check ISV certification readiness
python scripts/certification-checklist.py Customization/_project/project.xml
# View deploy intelligence report
python scripts/intelligence.py report