Migrating Alerts to SolarWinds Alert Central: Step-by-Step Checklist
Overview
A staged, low-risk migration keeps alerts reliable and noise to a minimum. This checklist assumes you have administrative access to both the source alerting system and SolarWinds Alert Central, plus a maintenance/change window if one is required.
Pre-migration preparations
- Inventory alerts
- Export or list all existing alerts: name, description, trigger conditions, severity, frequency, scope (hosts/groups), actions (email, webhook, ticket), escalation/time thresholds, custom fields.
- Classify and prioritize
- Mark each alert as Critical (migrate first), Important (migrate in the main rollout), or Historical/Obsolete (archive or delete).
- Map fields & actions
- Create a mapping table from source alert fields to Alert Central fields (name, condition syntax, severity levels, tags, notification endpoints, runbooks); see the inventory/mapping sketch after this list.
- Collect credentials & endpoints
- Service accounts, API keys, SMTP details, webhook URLs, ITSM instance credentials (ServiceNow/Service Desk).
- Backup current configuration
- Export alert definitions and actions from the source system; snapshot Alert Central's current settings if it is already in use.
- Plan testing scope
- Select 5–10 representative alerts (one per priority/type) for an initial pilot.
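A minimal sketch of the inventory-to-mapping step, assuming the source system can export alert definitions as a JSON array. The file names, field names, and classification rule are placeholders to adapt to your source system's actual export format.

```python
#!/usr/bin/env python3
"""Normalize a source-system alert export into a mapping-table CSV.
All field names below are assumptions -- adapt them to your export."""
import csv
import json

SOURCE_EXPORT = "source_alerts.json"   # hypothetical export file
MAPPING_TABLE = "alert_mapping.csv"

def classify(alert):
    """Example rule: anything at or above 'high' severity is Critical."""
    sev = str(alert.get("severity", "")).lower()
    if sev in ("critical", "high"):
        return "Critical"
    if alert.get("enabled", True):
        return "Important"
    return "Historical/Obsolete"

with open(SOURCE_EXPORT, encoding="utf-8") as f:
    alerts = json.load(f)

with open(MAPPING_TABLE, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["source_name", "severity", "condition", "actions",
                     "escalation", "priority", "alert_central_name"])
    for a in alerts:
        writer.writerow([
            a.get("name", ""),
            a.get("severity", ""),
            a.get("condition", ""),
            ";".join(a.get("actions", [])),
            a.get("escalation", ""),
            classify(a),
            "",   # Alert Central name: filled in by hand during mapping
        ])

print(f"Wrote {len(alerts)} rows to {MAPPING_TABLE}")
```

The empty alert_central_name column is deliberate: filling it in by hand forces a review of every alert during classification.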
Migration steps (pilot)
- Create corresponding alert categories/tags in Alert Central
- Recreate alert groups, severity taxonomy, and tags to match mapping.
- Recreate alert logic
- Translate source conditions to Alert Central expressions. Preserve thresholds and time windows.
- Recreate actions & integrations
- Configure notification channels (email, Slack, webhooks) and ITSM integrations (ServiceNow/SolarWinds Service Desk) using the integration instances workflow.
- Attach runbooks and playbooks
- Link existing escalation steps or paste runbooks into Alert Central action/description fields.
- Set deduplication and suppression rules
- Configure noise reduction: alert throttling, suppression windows, and correlation rules (a reference model of window-based suppression follows this list).
- Assign owners & permissions
- Set alert owners/teams and apply appropriate RBAC in Alert Central.
- Test end-to-end
- Trigger test events for each pilot alert; verify conditions, deduplication, notifications, ITSM ticket creation, and recovery/clear actions (a test-event sketch follows this list).
- Validate metrics & observability
- Confirm Alert Central records alert history and metric data, and that retention settings match your requirements.
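Suppression and correlation rules are configured in the Alert Central UI, but a small reference model helps you predict what a rule should do before you fire pilot tests. A minimal sketch of time-window suppression; the correlation key (host plus alert name) is an assumption to match against whatever correlation fields you actually configure.

```python
"""Reference model of time-window alert suppression.
The (host, alert_name) correlation key is an assumption."""
import time

SUPPRESSION_WINDOW_SECS = 300   # e.g. a five-minute suppression window
_last_seen = {}                 # (host, alert_name) -> last event time

def should_notify(host, alert_name, now=None):
    """Return True if this event falls outside the suppression window."""
    now = time.time() if now is None else now
    key = (host, alert_name)
    last = _last_seen.get(key)
    _last_seen[key] = now
    return last is None or (now - last) > SUPPRESSION_WINDOW_SECS

# Two identical events 10 seconds apart: only the first should notify.
assert should_notify("web-01", "High CPU", now=1000.0)
assert not should_notify("web-01", "High CPU", now=1010.0)
```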
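For the end-to-end test itself, a small driver that fires one synthetic event per pilot alert keeps the results consistent and repeatable. This sketch assumes each pilot alert can be triggered through an inbound webhook; the URLs and payload shape are placeholders for your environment.

```python
"""Fire a synthetic trigger event for each pilot alert and log the result.
Trigger URLs and payload fields are placeholders."""
import json
import urllib.request

PILOT_ALERTS = [
    # (alert name, hypothetical inbound trigger URL)
    ("High CPU - web tier", "https://alerts.example.com/trigger/high-cpu"),
    ("Disk full - db tier", "https://alerts.example.com/trigger/disk-full"),
]

def fire_test_event(name, url):
    payload = json.dumps({
        "alert": name,
        "severity": "critical",
        "test": True,            # mark the event clearly as a test
    }).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status

for name, url in PILOT_ALERTS:
    try:
        print(f"{name}: HTTP {fire_test_event(name, url)}")
    except Exception as exc:     # record failures for follow-up
        print(f"{name}: FAILED ({exc})")
```

After each firing, confirm the expected notification and ticket appeared, then send the matching clear/recovery event and confirm it resolves cleanly.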
Migration steps (full rollout)
- Schedule bulk migration
- Use the mapping table to batch-create alerts via API or UI; migrate by priority group (Critical → Important → Others).
- Automate where possible
- Use the Alert Central API or automation scripts to import standardized alert templates and reduce manual errors (see the batch-import sketch after this list).
- Migrate integrations
- Switch or duplicate integrations (email, webhooks, ITSM) to Alert Central and validate each integration's operational state.
- Staged cutover
- For each group: enable the Alert Central copy in parallel (shadow mode) for 24–72 hours, compare behavior (a comparison sketch follows this list), then disable the source alert or update it to route to Alert Central.
- Monitor for gaps
- Track missed alerts, duplicate incidents, or unexpected noise; adjust thresholds and suppression rules promptly.
- Communicate changes
- Notify stakeholders and on-call teams about new alert names, owners, and expected behaviors.
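If your Alert Central deployment exposes an API for creating alerts, a batch import driven by the mapping table keeps the rollout consistent. The endpoint path, token auth, and payload fields below are assumptions to verify against your version's API documentation, not confirmed Alert Central calls.

```python
"""Batch-create one priority group of alerts from the mapping-table CSV.
Endpoint, auth scheme, and payload fields are placeholders."""
import csv
import json
import urllib.request

API_URL = "https://alertcentral.example.com/api/alerts"   # hypothetical
API_TOKEN = "REPLACE_ME"
MAPPING_TABLE = "alert_mapping.csv"
BATCH_PRIORITY = "Critical"     # migrate one priority group at a time

with open(MAPPING_TABLE, newline="", encoding="utf-8") as f:
    rows = [r for r in csv.DictReader(f) if r["priority"] == BATCH_PRIORITY]

for row in rows:
    payload = json.dumps({
        "name": row["alert_central_name"] or row["source_name"],
        "severity": row["severity"],
        "condition": row["condition"],   # pre-translated expression
    }).encode("utf-8")
    req = urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {API_TOKEN}"},
    )
    with urllib.request.urlopen(req, timeout=15) as resp:
        print(f"{row['source_name']}: HTTP {resp.status}")
```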
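During the shadow-mode window, a set comparison of fired events from both systems surfaces gaps and drift quickly. This sketch assumes each system can export fired events to CSV with at least alert and host columns; adapt the file and column names to your exports.

```python
"""Compare shadow-mode alert activity between the two systems.
CSV file names and column names are placeholders."""
import csv

def load_events(path):
    with open(path, newline="", encoding="utf-8") as f:
        return {(r["alert"], r["host"]) for r in csv.DictReader(f)}

source = load_events("source_events.csv")
alert_central = load_events("alert_central_events.csv")

print("Fired only in source (possible migration gaps):")
for alert, host in sorted(source - alert_central):
    print(f"  {alert} @ {host}")

print("Fired only in Alert Central (possible noise or threshold drift):")
for alert, host in sorted(alert_central - source):
    print(f"  {alert} @ {host}")
```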
Post-migration tasks
- Audit and reconcile
- Compare alert counts and incident/ticket volumes across both systems for 7–30 days to confirm parity (a reconciliation sketch follows this list).
- Tune and optimize
- Reduce noise: merge duplicate alerts, tighten conditions, refine suppression/correlation rules.
- Document
- Update runbooks and escalation matrices, and keep a migration log recording decisions and mapping references.
- Decommission or archive
- Disable source alerts after a verification period; archive exported configs and remove obsolete integrations.
- Training
- Run a short training session or distribute a quick reference for on-call teams covering Alert Central workflows and alert naming conventions.
- Review retention & compliance
- Ensure Alert Central retention settings meet audit, compliance, and reporting needs.
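For the audit step, per-alert volume counts over the verification window catch parity problems that a one-off spot check misses. The file and column names and the 20% tolerance below are assumptions to tune for your environment.

```python
"""Tally incident volume per alert from both systems' exports and flag
large deviations. File/column names and tolerance are placeholders."""
import csv
from collections import Counter

def counts(path):
    with open(path, newline="", encoding="utf-8") as f:
        return Counter(r["alert"] for r in csv.DictReader(f))

src = counts("source_incidents.csv")
ac = counts("alert_central_incidents.csv")

for alert in sorted(set(src) | set(ac)):
    s, a = src[alert], ac[alert]
    drift = abs(s - a) / max(s, a, 1)       # avoid division by zero
    flag = "  <-- investigate" if drift > 0.20 else ""
    print(f"{alert}: source={s} alert_central={a}{flag}")
```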
Quick troubleshooting checklist
- Notifications not sent: verify SMTP/webhook credentials, the integration's operational state, and notification throttling (a quick connectivity check follows this list).
- Tickets not created: confirm the ITSM integration instance is enabled and that its credentials and field mappings are correct.
- Duplicate incidents: check deduplication/correlation settings and whether both systems are active for the same alerts.
- Missing alerts: validate condition translation and that the monitored objects are in scope (correct node/group IDs, case-sensitive hostnames).
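When notifications fail, the fastest first check is raw connectivity. A minimal sketch of SMTP-auth and webhook checks using only the standard library; the hostnames, port, credentials, and URL are placeholders for your environment.

```python
"""Quick connectivity checks for the two most common notification
failures: SMTP authentication and webhook reachability."""
import smtplib
import urllib.request

def check_smtp(host, port, user, password):
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.starttls()                # most relays require TLS
        smtp.login(user, password)     # raises on bad credentials
    print("SMTP login OK")

def check_webhook(url):
    req = urllib.request.Request(url, data=b'{"test": true}',
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        print(f"Webhook reachable: HTTP {resp.status}")

check_smtp("smtp.example.com", 587, "alerts@example.com", "REPLACE_ME")
check_webhook("https://hooks.example.com/alert-central-test")
```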
Example minimal timeline (for a medium-sized environment, ~200 alerts)
- Week 0: Inventory, mapping, and pilot selection
- Week 1: Pilot migration and validation
- Weeks 2–3: Bulk migration by priority groups (Critical → Important → Others)
- Week 4: Tuning, auditing, training, and decommissioning source alerts
Checklist (compact)
- Inventory complete
- Field/action mapping created
- Backups exported
- Pilot alerts migrated & validated
- Integrations configured & tested
- Bulk migration executed in stages
- Shadow-mode verification completed
- Auditing & tuning finished
- Documentation updated
- Source alerts archived/decommissioned
- Team trained