Everything you need to deploy DrainCtl, configure it for your environment, and start getting real-time drain mode visibility across your RDSH farm.
DrainCtl ships as a single MSI. Deploy it however you deploy software — your RMM, SCCM, Intune, GPO, or just double-click it.
Double-click the MSI to launch the interactive installer. It walks you through choosing an install mode (dashboard, registration, or standalone) and setting notification URLs, dashboard address, grace period, and other options.
msiexec /i LISSTech.DrainCtl.msi /qn
Pass these properties on the command line (or in your RMM/SCCM transform) to pre-configure the installation:
| Property | Values / Example | Description |
|---|---|---|
| INSTALL_MODE | dashboard, registration, or standalone | Sets the operational mode. dashboard enables the web UI, registration registers with an existing dashboard, standalone runs independently. |
| DASHBOARD_URL | https://dash:49470 | Dashboard URL to register with (used with registration mode) |
| DASHBOARD_PORT | 49470 | Port for the dashboard listener (used with dashboard mode) |
| DASHBOARD_GROUP | Domain Admins | AD group authorized for dashboard access |
| GRACE_PERIOD | 120 | Grace period in minutes before drain mode becomes an alert |
| POLL_INTERVAL | 60 | Service poll interval in seconds |
| SESSION_THRESHOLD | 80 | Session utilization warning threshold percentage; 0 disables it |
| ENABLE_PERF | 1 or 0 | Enable host performance monitoring |
| ENABLE_RFX | 1 or 0 | Enable optional RemoteFX counter collection |
| LOG_FILE_LEVEL | info | Daily file-log level: debug, info, warn, or error |
| LOG_EVENT_LEVEL | info | Windows Event Log sink level |
| DASHBOARD_ONLY | 1 or 0 | Run only the dashboard service on this host; skip local drain monitoring |

Notification targets are intentionally not MSI properties. Add webhook, ntfy, or email targets after
install with drainctl notify add-* or through the dashboard Configuration modal.
Example — register with a dashboard, no UI:
msiexec /i LISSTech.DrainCtl.msi /qn INSTALL_MODE=registration DASHBOARD_URL=https://dash:49470
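For scripted rollouts it can help to assemble the property list programmatically. The following is a minimal Python sketch, assuming the property names from the table above; `build_msiexec_command` is a hypothetical helper that is not part of DrainCtl, and it only builds the argument list without running anything.

```python
def build_msiexec_command(msi_path, properties):
    """Return an msiexec argument list for a silent install.

    `properties` maps MSI property names (for example INSTALL_MODE) to values.
    Hypothetical helper for illustration; it is not shipped with DrainCtl.
    """
    args = ["msiexec", "/i", msi_path, "/qn"]
    for name, value in properties.items():
        # MSI public properties are passed as upper-case NAME=value pairs.
        args.append(f"{name.upper()}={value}")
    return args


cmd = build_msiexec_command(
    "LISSTech.DrainCtl.msi",
    {"INSTALL_MODE": "registration", "DASHBOARD_URL": "https://dash:49470"},
)
print(" ".join(cmd))
```

Values that contain spaces (such as an AD group name) need quoting appropriate to your deployment tool; this sketch does not attempt it.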
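The installer copies drainctl.exe and drainctl.dll to
C:\Program Files\LISS Technologies\LISSTech DrainCtl\bin\
and adds that directory to PATH, so drainctl is available
immediately in any terminal. The PowerShell module is registered under
C:\Program Files\WindowsPowerShell\Modules\.
Data lives in C:\ProgramData\LISS Technologies\LISSTech DrainCtl\ and the configuration file is %ProgramData%\LISS Technologies\LISSTech DrainCtl\config.json.
The service is started as part of the install, so drainctl check will work right
away.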
If you just want the PowerShell cmdlets without the Windows Service, install directly from PSGallery:
Install-Module -Name LISSTech.DrainCtl -Scope AllUsers -AllowPrerelease
This gives you Get-RDSHDrainMode, Test-RDSHDrainMode, and friends — but without the service, queries go directly to the registry instead of the named pipe (slightly slower, no persistent audit store).
msiexec /x LISSTech.DrainCtl.msi /qn
This stops the service, removes all files, cleans up PATH, and deregisters the Event Log source. Your audit data in ProgramData is preserved (delete it manually if you want a clean slate).
Run the new MSI over an existing installation. The installer detects the previous version and skips all configuration dialogs — you go straight from the welcome screen to install.
config.json and servers.json are preserved. Silent upgrades work the same way: just run the new MSI with /qn.
After installation, verify everything is working:
# Is the service running?
Get-Service DrainCtl
# Quick status check (instant — talks to the service via named pipe)
drainctl check
# PowerShell way
Get-RDSHDrainMode | Format-List
You should see status=Healthy and connections_allowed=true. If drain mode is
active on this server, you'll see the current state and how long it's been active.
If drainctl check responds in under 100ms, it's talking to the service over the named pipe.
If it takes a second or two, the service isn't running and it's falling back to a direct registry read.
In that case, check Get-Service DrainCtl.
All settings live in a single JSON file and hot-reload automatically when you save it. No service restart needed.
%ProgramData%\LISS Technologies\LISSTech DrainCtl\config.json
{
"grace_period": 60,
"retention_days": 90,
"poll_interval": 60,
"audit_path": "C:\\ProgramData\\LISS Technologies\\LISSTech DrainCtl\\drainctl.db",
"memory_limit_mb": 256,
"log_file_level": "info",
"log_event_level": "info",
"notifications": [
{
"type": "webhook",
"url": "https://hooks.example.com/drainctl",
"triggers": ["drain_on", "drain_off", "alert", "healthy"],
"repeat_minutes": 30
},
{
"type": "ntfy",
"url": "https://ntfy.sh/my-rdsh-alerts",
"triggers": ["alert", "session_warning"],
"repeat_minutes": 0
},
{
"type": "email",
"url": "smtp://smtp.example.com:587",
"to": ["ops@example.com"],
"from": "drainctl@example.com",
"secret": "smtp-password",
"triggers": ["drain_on", "drain_off", "alert"]
}
],
"dashboard": {
"enabled": false,
"port": 49470,
"group": "Domain Admins",
"fetch_interval": 300
},
"session_warning_threshold": 80,
"performance": {
"enabled": true,
"sample_interval_sec": 30,
"cpu_warn_pct": 70,
"cpu_crit_pct": 85,
"mem_warn_pct": 20,
"mem_crit_pct": 10,
"input_delay_warn_ms": 50,
"input_delay_crit_ms": 100,
"input_delay_percentile": "p95",
"load_alert_delay_sec": 60,
"input_delay_alert_delay_sec": 90,
"collect_remotefx": false,
"collect_per_session": true
},
"evtspike": {
"enabled": false
},
"retention": {
"metrics_days": 30,
"audit_days": 365
},
"telemetry": {
"aggregator_interval_seconds": 60,
"retention_interval_minutes": 15
},
"update": {
"enabled": false,
"channel": "stable",
"poll_interval": "24h"
}
}
| Key | Default | What it does |
|---|---|---|
| grace_period | 60 | Minutes drain mode must be active before it becomes an alert. During the grace period, the status is "Grace" and exit code is 0. Performance thresholds may independently promote the status to "Warning" (any warning) or "Alert" (any critical). |
| retention_days | 90 | Days to keep audit records. Maximum 365. Older records are pruned daily. |
| poll_interval | 60 | Seconds between polls. The service also uses real-time registry notifications for drain mode changes; this interval governs performance monitoring and session checks. |
| audit_path | ProgramData path | SQLite database path. Legacy audit.jsonl values are normalized to drainctl.db in the same directory; the live audit trail, telemetry tiers, server registry, and event-spike history live in SQLite WAL mode. |
| memory_limit_mb | 256 | Go runtime soft memory limit for the service process. Dashboard installs raise this to 512 MB unless explicitly configured. Valid range: 32–4096 MB. |
| log_file_level / log_event_level | info | Minimum level for the daily file log and Windows Event Log sinks. |
| notifications | [] | Array of notification targets (see Notifications). |
| dashboard | see above | Dashboard server, registration, TLS pinning, and config-fetch settings (see Dashboard). |
| dashboard_only | false | Run the dashboard on this host without local drain monitoring. |
| session_warning_threshold | 80 | Session utilization percentage that triggers a session_warning notification. |
| performance | see above | Performance monitoring settings. Set enabled: true to collect CPU, memory, disk I/O, input delay, and optional RemoteFX metrics on each sample tick. Thresholds default to industry baselines (CPU 70/85%, memory 20/10% free, input delay 50/100ms). Set any threshold to -1 to disable it. Additional fields: sample_interval_sec (collection interval, default 30, range 10–300), input_delay_percentile ("p50" or "p95", default "p95"), load_alert_delay_sec (seconds CPU/memory must breach before firing, default 60), input_delay_alert_delay_sec (seconds input-delay must breach before firing, default 90). |
| retention | see above | SQLite retention windows: metrics_days defaults to 30, and audit_days defaults to 365. Legacy retention_days is still accepted for compatibility. |
| telemetry | see above | Background worker cadence: metric aggregation defaults to 60 seconds; retention sweeps default to every 15 minutes. |
| update | disabled | Opt-in self-update settings (see Auto-update). |

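
Before pushing a hand-edited config.json to many hosts, a quick pre-flight check against the documented ranges can catch typos. The sketch below is illustrative Python, not part of DrainCtl: `validate_config` is a hypothetical helper that checks only the limits stated in the table above (retention_days maximum, memory_limit_mb range, file-log levels, sample_interval_sec range, and percentile names). The service's own validation may differ.

```python
VALID_LOG_LEVELS = {"debug", "info", "warn", "error"}
VALID_PERCENTILES = {"p50", "p95"}


def validate_config(cfg):
    """Return a list of problems found in a parsed config.json dict.

    Hypothetical pre-flight check; only ranges stated in the docs are tested.
    """
    problems = []

    retention = cfg.get("retention_days")
    if retention is not None and retention > 365:
        problems.append("retention_days exceeds the documented maximum of 365")

    memory = cfg.get("memory_limit_mb")
    if memory is not None and not 32 <= memory <= 4096:
        problems.append("memory_limit_mb is outside the documented range 32-4096")

    level = cfg.get("log_file_level")
    if level is not None and level not in VALID_LOG_LEVELS:
        problems.append(f"log_file_level {level!r} is not a documented level")

    perf = cfg.get("performance", {})
    interval = perf.get("sample_interval_sec")
    if interval is not None and not 10 <= interval <= 300:
        problems.append("performance.sample_interval_sec is outside 10-300")

    percentile = perf.get("input_delay_percentile")
    if percentile is not None and percentile not in VALID_PERCENTILES:
        problems.append("performance.input_delay_percentile must be p50 or p95")

    return problems
```

Load the file with `json.load` and pass the resulting dict; an empty list means none of the documented limits were violated.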
Edit the file with any text editor:
%ProgramData%\LISS Technologies\LISSTech DrainCtl\config.json
The service picks up changes within seconds (it watches config.json with a file system watcher). No
restart needed. On dashboard-joined agents, edit through the dashboard Configuration modal
instead — runtime fields you change locally will be overwritten on the next pull. The
local file still owns bootstrap settings like dashboard.url,
dashboard.tls_fingerprint, log_file_level, and
memory_limit_mb.
The service reads config.json and writes it back with all
fields populated at their effective defaults. New fields (like sample_interval_sec,
load_alert_delay_sec, or evtspike) appear automatically after upgrading,
with no manual editing required. If migrating from a registry-based version, settings are imported
from HKLM\...\DrainCtl\Parameters on first run; the old values are left in place but no
longer read.
DrainCtl can tell you who changed drain mode — but it needs Windows to record that information first. Run this once (as admin):
drainctl audit-setup
# or
Install-RDSHDrainAudit
This does two things: it enables registry auditing through auditpol, and it configures auditing so that an event is recorded whenever TSServerDrainMode is modified.
Group Policy can override local auditpol settings. For persistence, configure the equivalent GPO: Computer Configuration > Policies > Windows Settings > Security Settings > Advanced
Audit Policy Configuration > Object Access > Audit Registry > Success
Without audit setup, DrainCtl still detects changes instantly — it just can't tell you who made
them. The changed_by field will be empty.
# Plain text (default — great for terminal and RMM scripts)
drainctl check
# JSON (great for parsing)
drainctl check --format json
# Table
drainctl check --format table
# Quiet mode (only final status line)
drainctl --quiet check
# Last 50 records (table format)
drainctl history
# Only state transitions
drainctl history --changes-only
# Last 10 records as JSON
drainctl history --limit 10 --format json
# Time-bounded queries (RFC 3339 timestamps)
drainctl history --since 2025-01-01T00:00:00Z
drainctl history --since 2025-06-01T00:00:00Z --until 2025-06-30T23:59:59Z
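The --since and --until flags expect RFC 3339 timestamps, which are easy to get subtly wrong by hand. A small Python sketch for generating a "last N days" window follows; `history_args` is a hypothetical helper that only builds the argument list and uses the flags shown above.

```python
from datetime import datetime, timedelta, timezone


def rfc3339_utc(moment):
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def history_args(now, days):
    """Build a drainctl history command covering the last `days` days."""
    since = now - timedelta(days=days)
    return [
        "drainctl", "history",
        "--since", rfc3339_utc(since),
        "--until", rfc3339_utc(now),
    ]


# Example: the week ending at the end of June 2025 (UTC).
window_end = datetime(2025, 6, 30, 23, 59, 59, tzinfo=timezone.utc)
print(" ".join(history_args(window_end, 7)))
```

Pass the list to `subprocess.run` on a host where drainctl is installed.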
# Table view (default) — session ID, user, station, state + summary line
drainctl sessions
# JSON — structured output with sessions array and summary object
drainctl sessions --format json
# CSV — for spreadsheet import or RMM custom fields
drainctl sessions --format csv
# Print all settings at a glance — no prompts, no side-effects
drainctl configure show
Prints version, config path, grace period, poll interval, retention days, session warning threshold,
dashboard settings, and all notification targets. Useful for confirming what is active on a machine
without opening
config.json or running the interactive wizard.
# Wipe in-memory evtspike detectors + baseline.json, in sync with the running service
drainctl baseline reset
Use after confirming the event-log anomaly baseline is poisoned (e.g., a real incident fired during
training and got absorbed as "normal"). The command routes through the service, wipes every
subscribed channel's detector state, and deletes baseline.json. Scoring resumes from the
prior on the next 10-second tick — subscriptions stay alive. See
Event-Log Anomaly Detection for background.
| Code | Meaning | When |
|---|---|---|
| 0 | Healthy / Grace | Connections allowed, or drain mode active but within grace period |
| 1 | Alert | Drain mode active beyond grace period — new connections are blocked |
| 2 | Error | Registry unreadable or other failure |
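
The relationship between drain state, grace period, and exit code can be sketched in a few lines of Python. This is an illustration of the table above, not DrainCtl's implementation: the function names are hypothetical, the behavior at exactly the grace-period boundary is an assumption, and performance-threshold promotions are ignored.

```python
EXIT_MEANINGS = {0: "Healthy / Grace", 1: "Alert", 2: "Error"}


def drain_status(drain_active, minutes_active, grace_period_minutes):
    """Return (status, exit_code) for a drain-mode state.

    Assumption: a duration equal to the grace period already counts as Alert.
    """
    if not drain_active:
        return ("Healthy", 0)
    if minutes_active < grace_period_minutes:
        return ("Grace", 0)
    return ("Alert", 1)


def describe_exit_code(code):
    """Translate a drainctl check exit code into the documented meaning."""
    return EXIT_MEANINGS.get(code, f"Unexpected exit code {code}")
```

In an RMM script, run `drainctl check`, read the process return code, and pass it to `describe_exit_code` to produce a readable result.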
Output formats for --format: plain (default for check), table (default for
history), csv, json.
The JSON output from drainctl check --format json includes session data — active
sessions, total capacity, and utilization percentage — useful for building custom dashboards or
feeding into monitoring systems.
# View notification status
drainctl notify status
# Add a webhook target
drainctl notify add-webhook https://hooks.example.com/drainctl
# Add webhook with HMAC signing secret and custom trigger filter
drainctl notify add-webhook https://hooks.example.com/drainctl \
--secret MY_SIGNING_SECRET \
--triggers drain_on,alert,healthy \
--repeat-minutes 60
# Update the first webhook target; fields not supplied are preserved
drainctl notify set-webhook https://hooks.example.com/drainctl --target-index 0 --secret NEW_SECRET
# Add an ntfy target
drainctl notify add-ntfy https://ntfy.sh/my-rdsh-alerts
# ntfy with session-warning notifications and 30-minute repeat throttle
drainctl notify add-ntfy https://ntfy.sh/my-rdsh-alerts \
--triggers drain_on,alert,healthy,session_warning \
--repeat-minutes 30
# Send a test notification to all configured targets
drainctl notify test
Use add-webhook, add-ntfy, or add-email to create a target.
Use set-webhook, set-ntfy, or set-email with --target-index
to update an existing target; omitted optional flags preserve the existing secret, triggers, and repeat
interval. Valid trigger names: drain_on, drain_off,
grace_entered, alert, healthy, session_warning,
cpu_warning, cpu_critical, memory_warning,
memory_critical, input_delay_warning, input_delay_critical,
event_spike.
Run drainctl configure without flags to step through every setting interactively. The
current value is shown in brackets — press Enter to keep it.
drainctl configure
When called with flags (e.g., by the MSI installer during silent install), it applies the values
directly and writes config.json without prompting:
drainctl configure --grace-period 120 --mode dashboard --dashboard-port 49470
# Agent registration with session warning threshold and custom poll interval
drainctl configure --mode registration --dashboard-url https://dash:49470 \
--session-warning-threshold 90 --poll-interval 30 --retention-days 90
To review every active setting at a glance without prompting or making changes, use the read-only
configure show subcommand:
drainctl configure show
Output covers: version, config file path, grace period, poll interval, retention days, session warning threshold, dashboard server status, agent registration URL, TLS pin status, and all notification targets (with HMAC signing status for webhook targets).
The module is auto-registered by the MSI. It works on both PowerShell 5.1 and 7+.
# Rich status object
Get-RDSHDrainMode
# Boolean check — perfect for scripts and alerts
if (Test-RDSHDrainMode) { "All good" } else { "ALERT: drain mode active!" }
# Audit history
Get-RDSHDrainHistory -Limit 20
# Only transitions (who changed what, when)
Get-RDSHDrainHistory -ChangesOnly | Format-Table Timestamp, DrainMode, ChangedBy
# Pipeline magic
Get-RDSHDrainHistory -ChangesOnly |
Where-Object { $_.DrainMode -ne "ALLOW_ALL_CONNECTIONS" } |
Select-Object Timestamp, ChangedBy, DrainMode
| Cmdlet | Returns | Description |
|---|---|---|
| Get-RDSHDrainMode | PSObject | Full drain mode state with audit data |
| Test-RDSHDrainMode | bool | $true if connections allowed |
| Get-RDSHDrainHistory | PSObject[] | Audit trail records |
| Install-RDSHDrainAudit | (none) | One-time audit configuration |
| Get-RDSHDrainNotificationTarget | PSObject[] | Lists all notification targets with type, URL, triggers, repeat interval |
| Add-RDSHDrainNotificationTarget | (none) | Adds a notification target (-Type, -URL, -Triggers, -RepeatMinutes) |
| Set-RDSHDrainNotificationTarget | (none) | Updates a notification target by type and index |
| Remove-RDSHDrainNotificationTarget | (none) | Removes a notification target by -URL |
| Test-RDSHDrainNotificationTarget | (none) | Sends a test notification to configured targets |
| Enable-RDSHDrainDashboard | (none) | Enable the multi-server dashboard on this server |
| Disable-RDSHDrainDashboard | (none) | Disable the dashboard on this server |
| Install-RDSHDrainCertificate | (none) | Install a custom TLS certificate for the dashboard |

Get pushed when something happens. DrainCtl supports multiple notification targets — any combination of webhooks (Slack, Teams, PagerDuty, custom HTTP endpoints), ntfy.sh topics, and email via SMTP. Each target has its own triggers and repeat interval.
# Add a webhook target (Slack, Teams, PagerDuty, custom endpoint, etc.)
drainctl notify add-webhook https://hooks.slack.com/services/T.../B.../xxx
# Add webhook with HMAC signing and only alert/healthy triggers
drainctl notify add-webhook https://hooks.slack.com/services/T.../B.../xxx \
--secret MY_SIGNING_SECRET \
--triggers drain_on,alert,healthy
# Add an ntfy target (push notifications on your phone!)
drainctl notify add-ntfy https://ntfy.sh/my-rdsh-alerts
# ntfy with session-warning alerts, repeat at most every 30 minutes
drainctl notify add-ntfy https://ntfy.sh/my-rdsh-alerts \
--triggers drain_on,alert,healthy,session_warning \
--repeat-minutes 30
# Add an email target (any SMTP relay — Gmail, Mailgun, O365, SendGrid)
drainctl notify add-email smtp://smtp.example.com:587 \
--to ops@example.com --from drainctl@example.com \
--secret smtp-password \
--triggers drain_on,drain_off,alert
# View notification status
drainctl notify status
# Send a test to make sure it works
drainctl notify test
Optional flags on add-webhook, add-ntfy, set-webhook, and set-ntfy:
--secret TEXT — HMAC-SHA256 signing secret sent as
X-DrainCtl-Signature (webhook only)
--triggers LIST — comma-separated events that fire this target; valid names:
drain_on, drain_off,
grace_entered, alert, healthy, session_warning,
cpu_warning, cpu_critical, memory_warning,
memory_critical, input_delay_warning, input_delay_critical,
event_spike
--repeat-minutes N — minimum minutes between repeated alert /
session_warning notifications; 0 means once per state change (default)
For set-* commands, omitted flags preserve the existing value, so you can update a single
field without touching the rest. To manage multiple targets with per-target settings, use the
dashboard UI or the CLI --target-index flag.
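On the receiving side, a webhook endpoint can verify the X-DrainCtl-Signature header before trusting a payload. The Python sketch below shows the general HMAC-SHA256 verification pattern only. It assumes the header carries a lowercase hex digest computed over the raw request body; this page does not specify the encoding or the exact signed bytes, so confirm both against a real delivery before relying on it. The function names are hypothetical.

```python
import hashlib
import hmac


def sign_body(secret, body):
    """Compute a hex HMAC-SHA256 of the raw body bytes.

    Assumption: DrainCtl signs the raw body and hex-encodes the digest.
    """
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(secret, body, header_value):
    """Constant-time comparison of the expected and received signatures."""
    expected = sign_body(secret, body)
    received = header_value.strip().lower()
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```

Always verify against the bytes exactly as received, before any JSON parsing or re-serialization, because re-encoding can change whitespace and break the digest.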
# List all targets
Get-RDSHDrainNotificationTarget
# Add a webhook for alerts only
Add-RDSHDrainNotificationTarget -Type webhook -URL 'https://hooks.example.com/drain' -Triggers alert -RepeatMinutes 15
# Add ntfy for all events
Add-RDSHDrainNotificationTarget -Type ntfy -URL 'https://ntfy.sh/drainctl'
# Add email target
Add-RDSHDrainNotificationTarget -Type email -URL 'smtp://smtp.example.com:587' `
-To 'ops@example.com' -From 'drainctl@example.com' -Secret 'smtp-password' `
-Triggers drain_on,alert
# Update the first webhook target in place
Set-RDSHDrainNotificationTarget -Type webhook -TargetIndex 0 `
-URL 'https://hooks.example.com/new-drain' -RepeatMinutes 30
# Send a test notification
Test-RDSHDrainNotificationTarget
# Remove a target
Remove-RDSHDrainNotificationTarget -URL 'https://hooks.example.com/drain'
Each notification target can subscribe to specific triggers. Assign them with
--triggers when adding a target, or edit config.json directly.
| Trigger | Description |
|---|---|
| drain_on | Drain mode activated |
| drain_off | Drain mode deactivated |
| grace_entered | Entered grace period |
| alert | Grace period exceeded |
| healthy | Returned to healthy |
| session_warning | Session utilization threshold exceeded (default 80%) |
| cpu_warning | CPU usage at or above warning threshold (default 70%) |
| cpu_critical | CPU usage at or above critical threshold (default 85%) |
| memory_warning | Available memory at or below warning threshold (default 20% free) |
| memory_critical | Available memory at or below critical threshold (default 10% free) |
| input_delay_warning | Input delay P95 at or above warning threshold (default 50ms) |
| input_delay_critical | Input delay P95 at or above critical threshold (default 100ms) |
| event_spike | Event-log anomaly detector (evtspike) confirmed a rate spike on a subscribed channel. Requires evtspike.enabled. |

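
When trigger lists are generated by scripts, validating them against the documented names before calling drainctl avoids silent typos. A minimal Python sketch follows, with the names copied from the table above; `parse_triggers` is a hypothetical helper.

```python
VALID_TRIGGERS = {
    "drain_on", "drain_off", "grace_entered", "alert", "healthy",
    "session_warning", "cpu_warning", "cpu_critical",
    "memory_warning", "memory_critical",
    "input_delay_warning", "input_delay_critical", "event_spike",
}


def parse_triggers(value):
    """Split a comma-separated --triggers value and reject unknown names."""
    names = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in names if name not in VALID_TRIGGERS]
    if unknown:
        raise ValueError(f"unknown trigger(s): {', '.join(unknown)}")
    return names
```

For example, `parse_triggers("drain_on,alert,healthy")` returns the three names as a list, while a misspelled name raises `ValueError`.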
All three target types share the same natural-language subjects, e.g.
"RDS01 — New remote connections disabled by DOMAIN\admin",
"RDS01 — Remote connections disabled for 2h 15m (exceeds 1h grace period)", or
"RDS01 — Session utilization at 85% (17/20 sessions)". For webhooks the subject appears
in the message field; for ntfy it's the notification title; for email it's the Subject
header and the HTML body heading.
Each target also has a repeat_minutes setting — set it to re-send alerts periodically
while the condition persists (0 = notify once).
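The repeat behavior can be pictured as a simple throttle. The Python sketch below illustrates the documented semantics (0 means notify once, N means re-send at most every N minutes while the condition persists); it is not DrainCtl's code, and `should_send` is a hypothetical name.

```python
from datetime import datetime, timedelta


def should_send(now, last_sent, repeat_minutes):
    """Decide whether a still-active condition should notify again.

    last_sent is None when nothing has been sent for the current condition.
    """
    if last_sent is None:
        return True  # first notification for this state change
    if repeat_minutes <= 0:
        return False  # 0 = notify once, never repeat
    return now - last_sent >= timedelta(minutes=repeat_minutes)
```

With `repeat_minutes` set to 30, an alert first sent at 14:00 would be eligible again at 14:30 and not before.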
{
"event": "transition",
"host": "MDS-LDC1-RDS5",
"drain_mode": "ALLOW_RECONNECTIONS_PREVENT_NEW_LOGONS",
"previous_mode": "ALLOW_ALL_CONNECTIONS",
"status": "Grace",
"message": "Drain mode active, within grace period (45m remaining).",
"changed_by": "MDS\\lissadmin",
"state_duration_seconds": 900,
"grace_period_seconds": 3600,
"connections_allowed": false,
"version": "26.95.0",
"timestamp": "2026-04-03T14:30:00-04:00",
"sessions": {
"active_sessions": 12,
"disconnected_sessions": 3,
"total_sessions": 15,
"max_sessions": 20,
"utilization_pct": 75
}
}
previous_mode is only present on transition events. sessions is
only present when session monitoring is active (see Session Tracking). The
connections_allowed field is false when drain mode is blocking new
connections, true when all connections are allowed — use this instead of parsing
status to gate automation.
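A receiver that gates automation on this payload might look like the following Python sketch. It uses only fields shown in the example payload and treats previous_mode and sessions as optional, as described above; the function names are hypothetical.

```python
import json


def should_pause_automation(payload):
    """True when the host is not accepting new connections."""
    return not payload["connections_allowed"]


def describe(payload):
    """Build a one-line summary, tolerating the optional fields."""
    parts = [payload["host"], payload["status"]]
    previous = payload.get("previous_mode")  # only present on transitions
    if previous:
        parts.append(f"{previous} -> {payload['drain_mode']}")
    sessions = payload.get("sessions")  # only present with session monitoring
    if sessions:
        parts.append(
            f"{sessions['total_sessions']}/{sessions['max_sessions']} sessions"
        )
    return " | ".join(parts)


raw = json.dumps({
    "host": "MDS-LDC1-RDS5",
    "drain_mode": "ALLOW_RECONNECTIONS_PREVENT_NEW_LOGONS",
    "previous_mode": "ALLOW_ALL_CONNECTIONS",
    "status": "Grace",
    "connections_allowed": False,
    "sessions": {"total_sessions": 15, "max_sessions": 20},
})
event = json.loads(raw)
print(describe(event), should_pause_automation(event))
```

Keying off `connections_allowed` rather than `status` keeps the gate correct during the grace period, when status is "Grace" but new connections are already blocked.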
ntfy messages arrive with priority high for alerts (red notification on your phone) and default for transitions.
Email notifications via SMTP. Use smtp:// for STARTTLS (port 587) or
smtps:// for implicit TLS (port 465). Works with any SMTP relay — Gmail, Mailgun,
SendGrid, O365. The secret field is the SMTP password. The from and
to fields are required.
{
"type": "email",
"url": "smtp://smtp.example.com:587",
"to": ["ops@example.com", "oncall@example.com"],
"from": "drainctl@example.com",
"secret": "smtp-password",
"triggers": ["drain_on", "drain_off", "alert"]
}
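The scheme-to-TLS mapping described above can be illustrated with a short Python sketch. `smtp_settings` is a hypothetical helper; falling back to port 587 or 465 when the URL omits a port is an assumption of this sketch, not documented DrainCtl behavior.

```python
from urllib.parse import urlsplit


def smtp_settings(url):
    """Derive host, port, and TLS mode from an smtp:// or smtps:// URL."""
    parts = urlsplit(url)
    if parts.scheme == "smtp":
        tls_mode, default_port = "starttls", 587
    elif parts.scheme == "smtps":
        tls_mode, default_port = "implicit", 465
    else:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no host")
    # Assumption: use the conventional port when none is given.
    return {
        "host": parts.hostname,
        "port": parts.port or default_port,
        "tls": tls_mode,
    }
```

For the example target above, `smtp_settings("smtp://smtp.example.com:587")` yields STARTTLS on port 587.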
DrainCtl monitors active RDS sessions via WTSEnumerateSessionsW and exposes utilization
data alongside drain mode state.
The drainctl sessions command lists every active RDS session on the local server in real
time, including session ID, username, station name, and connection state. A summary line shows totals
and utilization when a session cap is configured.
drainctl sessions
Example output (table format):
ID USER STATION STATE
--- ---- ------- -----
1 alice RDP-Tcp#0 Active
2 bob RDP-Tcp#1 Active
3 carol RDP-Tcp#2 Disconnected
Summary: 3/50 sessions (6% utilization)
Use --format json for structured output (a sessions array plus a
summary object), or --format csv for spreadsheet import.
Session counts and utilization appear automatically in drainctl check --format json output,
including active sessions, total capacity, and utilization percentage. This is useful for feeding into
monitoring systems or custom dashboards.
When session utilization exceeds the configured threshold, DrainCtl fires a
session_warning notification. The default threshold is 80% — change
it via the CLI or in config.json:
# Via CLI (0 = disabled)
drainctl configure --session-warning-threshold 90
# Or in config.json directly
{
"session_warning_threshold": 90
}
Add session_warning to a notification target's triggers to receive these alerts (see
Notifications).
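The utilization figures in this section are consistent with total sessions divided by the session cap (15/20 is 75%, 3/50 is 6%). A Python sketch of that arithmetic and the threshold check follows. Rounding to the nearest whole percent and using a strict "greater than" comparison are assumptions; the function names are hypothetical.

```python
def utilization_pct(total_sessions, max_sessions):
    """Percentage of the session cap in use, or None when no cap is set."""
    if max_sessions <= 0:
        return None
    return round(100 * total_sessions / max_sessions)


def session_warning(total_sessions, max_sessions, threshold=80):
    """True when utilization exceeds the threshold; 0 disables the check."""
    if threshold == 0:
        return False
    pct = utilization_pct(total_sessions, max_sessions)
    return pct is not None and pct > threshold
```

With the default threshold of 80, a host at 17 of 20 sessions (85%) would warn, while one at 15 of 20 (75%) would not.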
The multi-server dashboard displays per-server session gauges showing current utilization at a glance.
Managing multiple RDSH servers? The dashboard gives you a single, live view of drain mode state across your entire farm — with Windows Authentication so only the right people see it.
When dashboard.enabled is set to true in config.json, the
DrainCtl service serves the dashboard over HTTPS.
# Enable the dashboard server (writes to config.json)
drainctl dashboard enable --port 49470 --group "Domain Admins"
# Or with a custom AD group
drainctl dashboard enable --port 49470 --group "RDS Admins"
You can also edit config.json directly:
{
"dashboard": {
"enabled": true,
"port": 49470,
"group": "Domain Admins"
}
}
Changes are hot-reloaded — no service restart needed.
On each RDSH server you want to monitor, run this once:
# Register with explicit URL
drainctl register https://dashboard-server:49470
# Or auto-discover via DNS SRV record
drainctl register --auto
This does two things: it registers the server with the dashboard, and it writes
dashboard.url to the local config.json (so the service starts
reporting automatically on startup).
Instead of configuring dashboard.url on every agent, create a DNS SRV record and agents
will find the dashboard automatically. The service checks for
_drainctl._tcp.<domain> on startup when no URL is configured.
# PowerShell — create the SRV record in AD-integrated DNS
Add-DnsServerResourceRecord -ZoneName "contoso.com" `
-Name "_drainctl._tcp" -Srv `
-DomainName "dashboard.contoso.com" `
-Port 49470 -Priority 0 -Weight 0
# Verify
Resolve-DnsName -Name "_drainctl._tcp.contoso.com" -Type SRV
Once the SRV record exists, agents discover the dashboard without any per-machine config. Just install the MSI and the service registers itself.
# Open in your browser
drainctl dashboard
# Or navigate directly
https://dashboard-server:49470
The dashboard shows a live grid with each server's hostname, drain mode, status, state duration, who last changed it, and when it was last seen. Color-coded: green for healthy, amber for grace, red for alert, gray for offline. Each server card includes a session gauge showing current utilization, plus an Overview LOAD chart that overlays CPU, CPU P95, Memory, and Sessions for the entire fleet on one dual-axis uPlot canvas.
Clicking a server opens the Server Detail panel. The Host Load tile on that panel is a
single dual-axis chart showing CPU, CPU P95, Memory, and Sessions for that host with its
own 5M/1H/1D/3D/5D window pill (replacing the earlier layout of three separate small charts). A
swimlane tile below it plots confirmed event-log spikes per channel, with a
HEALTHY / TRAINING / DISABLED / ERROR state chip in the
header sourced from the host's evtspike detector. See
Event-Log Anomaly Detection for how those states are derived.
The Configuration modal (gear icon) is the single place operators edit runtime
settings. It exposes: CPU warn/critical and memory warn/critical thresholds, input-delay warn/critical
thresholds and percentile (P50/P95), the grace period, session warning threshold, agent poll interval,
sample interval, performance-monitoring toggle (including RemoteFX and per-session CPU sub-toggles), the
evtspike detector toggle, and every notification target (each with its own triggers — including
event_spike — and repeat interval). The Alert Sensitivity row (Chill / Anxious /
Twitchy presets) sets thresholds, percentile, and sustain windows in one click.
# List all registered servers
drainctl dashboard list-servers
# Remove a decommissioned server
drainctl dashboard remove-server RDSH-OLD
The dashboard supports two login methods:
The sign-in form accepts usernames in DOMAIN\username or username@domain format. Credentials are validated
against Active Directory via LogonUserW; passwords are never stored or logged.
Both methods create a session cookie (drainctl_session) with an 8-hour inactivity
timeout and 24-hour absolute lifetime. Sessions are stored in memory and do not survive service restarts.
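The two timeouts interact as follows: a session ends when either limit is reached, whichever comes first. The Python sketch below only illustrates that rule with the documented durations; it is not the dashboard's implementation, and `session_is_valid` is a hypothetical name.

```python
from datetime import datetime, timedelta

INACTIVITY_TIMEOUT = timedelta(hours=8)   # idle time allowed between requests
ABSOLUTE_LIFETIME = timedelta(hours=24)   # hard cap since login


def session_is_valid(created_at, last_seen_at, now):
    """A session is valid only while both limits are still unmet."""
    idle = now - last_seen_at
    age = now - created_at
    return idle < INACTIVITY_TIMEOUT and age < ABSOLUTE_LIFETIME
```

An operator who keeps the dashboard open and active all day is still asked to sign in again once 24 hours have passed since login.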
The dashboard always runs over HTTPS. On first start, the service auto-generates a self-signed TLS certificate (ECDSA P-256, valid 1 year). The certificate and private key are stored in the DrainCtl data directory:
%ProgramData%\LISS Technologies\LISSTech DrainCtl\dashboard-tls.crt # world-readable
%ProgramData%\LISS Technologies\LISSTech DrainCtl\dashboard-tls.key # SYSTEM + Admins only
The private key is ACL-restricted to SYSTEM and Administrators. The certificate auto-renews 24 hours before expiry.
To view the dashboard's current TLS certificate fingerprint:
drainctl dashboard fingerprint
Certificate pinning lets agents verify they're talking to the real dashboard, not an impersonator. Pinning is opt-in — without it, agents use trust-on-first-use (TOFU) for TLS, which is fine for most environments.
Auto-pin on registration: Pass --pin when registering to automatically
capture and save the dashboard's certificate fingerprint:
# Register and auto-pin the dashboard's TLS certificate
drainctl register --auto --pin
drainctl register https://dashboard:49470 --pin
The fingerprint is saved to dashboard.tls_fingerprint in the agent's
config.json. All subsequent reports will verify the dashboard's certificate matches.
Auto-pin via config (for service auto-registration): Set
"auto_pin": true in config.json and the service will capture the fingerprint
on its next registration:
{
"dashboard": {
"url": "https://dashboard:49470",
"auto_pin": true
}
}
MSI deployment: the MSI can set DASHBOARD_URL for registration, but it does
not currently expose an AUTO_PIN property. For pinned deployments, set
dashboard.auto_pin in config management after install, or run
drainctl register ... --pin once on each agent.
Manual pinning: If you prefer to set the fingerprint yourself (e.g., distributed via
GPO), run drainctl dashboard fingerprint on the dashboard server and set
dashboard.tls_fingerprint in each agent's config. A manually-set fingerprint is never
overwritten by auto-pin.
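Conceptually, pinning compares a SHA-256 digest of the presented certificate with the stored value. The Python sketch below shows that comparison in general terms. It assumes the fingerprint is a hex SHA-256 over the DER-encoded certificate, optionally written with colons; the exact format DrainCtl stores in dashboard.tls_fingerprint is not specified here, so treat the output of drainctl dashboard fingerprint as authoritative. The function names are hypothetical.

```python
import hashlib
import hmac


def sha256_fingerprint(cert_der):
    """Hex SHA-256 digest of DER-encoded certificate bytes."""
    return hashlib.sha256(cert_der).hexdigest()


def normalize_fingerprint(value):
    """Strip colons and spaces and lowercase, so common spellings compare equal."""
    return value.replace(":", "").replace(" ", "").lower()


def matches_pin(cert_der, pinned_fingerprint):
    """Constant-time comparison of the presented cert against the pin."""
    presented = sha256_fingerprint(cert_der).encode("ascii")
    pinned = normalize_fingerprint(pinned_fingerprint).encode("utf-8")
    return hmac.compare_digest(presented, pinned)
```

In Python, the DER bytes of a live server certificate can be obtained with `ssl.SSLSocket.getpeercert(binary_form=True)`.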
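After the dashboard certificate is renewed or replaced, pinned agents need the new fingerprint: run drainctl register --auto --pin on each agent, or
set "auto_pin": true in config and restart the service. Without pinning enabled, cert
renewal is seamless.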
To use a CA-issued or internal PKI certificate instead of the auto-generated self-signed one:
# CLI
drainctl dashboard install-cert C:\certs\dashboard.pem C:\certs\dashboard-key.pem
# PowerShell
Install-RDSHDrainCertificate -CertPath C:\certs\dashboard.pem -KeyPath C:\certs\dashboard-key.pem
This copies the PEM files into the DrainCtl data directory and updates config.json with the
paths. Restart the service to use the new certificate.
Both files must be PEM-encoded. The service loads them on startup and skips auto-generation when both are set. If the custom cert is invalid or expired, the service falls back to HTTP with a warning in the event log.
You can also set the paths directly in config.json:
{
"dashboard": {
"tls_cert": "C:\\certs\\dashboard.pem",
"tls_key": "C:\\certs\\dashboard-key.pem"
}
}
| Key | Default | Description |
|---|---|---|
| dashboard.enabled | false | Enable the HTTPS dashboard (set to true) |
| dashboard.port | 49470 | Port the dashboard listens on |
| dashboard.group | Domain Admins | AD group authorized to view the dashboard |
| dashboard.url | empty | Agent-side: URL of the dashboard to report to (set by drainctl register) |
| dashboard.tls_cert | empty | Path to PEM certificate file. If empty, a self-signed cert is auto-generated. |
| dashboard.tls_key | empty | Path to PEM private key file. Required when tls_cert is set. |
| dashboard.tls_fingerprint | empty | SHA-256 certificate fingerprint for agent-side pinning. Set manually or via --pin. |
| dashboard.auto_pin | false | Auto-capture dashboard cert fingerprint on service registration. |
| dashboard.fetch_interval | 300 | Seconds between dashboard-authoritative config pulls by registered agents. |

The dashboard is the single source of truth for runtime settings. When an agent has a
dashboard.url configured, it pulls notification targets, thresholds, the grace period, the
session warning threshold, the performance block, and the evtspike enabled flag from the dashboard
server. Edit settings once in the Configuration modal and every agent picks up the change on its next
poll — and immediately on its local config.json write.
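How it works: every dashboard.fetch_interval seconds (default 300), the agent requests
GET /api/v1/config from the dashboard using its machine
Kerberos ticket. It then writes the returned runtime settings to its local config.json
(an atomic write); the service's config-file watcher sees the change and hot-reloads live.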
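The pull-and-merge behavior described here can be sketched in Python as follows. This illustrates the pattern only: it assumes the dashboard response uses the same key names as config.json, the list of dashboard-owned keys is inferred from the description above, and both function names are hypothetical.

```python
import json
import os
import tempfile

# Keys the dashboard is described as owning (assumed to mirror config.json).
RUNTIME_KEYS = ("grace_period", "session_warning_threshold",
                "notifications", "performance")


def merge_runtime_settings(local, pulled):
    """Overlay dashboard-owned runtime fields; keep bootstrap fields local."""
    merged = dict(local)
    for key in RUNTIME_KEYS:
        if key in pulled:
            merged[key] = pulled[key]
    # Only the evtspike enabled flag is pulled; other evtspike knobs stay local.
    if "enabled" in pulled.get("evtspike", {}):
        evtspike = dict(merged.get("evtspike", {}))
        evtspike["enabled"] = pulled["evtspike"]["enabled"]
        merged["evtspike"] = evtspike
    return merged


def write_config_atomically(path, cfg):
    """Write to a temp file in the same directory, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(cfg, handle, indent=2)
        os.replace(tmp_path, path)  # atomic on the same volume
    except Exception:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Writing to a temporary file and renaming means a file watcher never observes a half-written config.json.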
No dashboard URL? Standalone agents (no dashboard.url) keep using their
local config.json — nothing changes for single-server deployments.
Avoid hand-editing runtime settings in config.json on dashboard-joined agents.
Once an agent is joined to a dashboard, any runtime field you change locally will be overwritten on the
next pull. Use the dashboard Configuration modal instead. The local file is useful for bootstrap
settings (dashboard.url, dashboard.tls_fingerprint, log levels,
memory_limit_mb) and for admin-only evtspike knobs (channel lists, thresholds, baseline
path) that the dashboard modal intentionally does not expose.
Leave notifications empty in each agent's config.json; the
dashboard owns them.
DrainCtl collects host-level and per-session performance metrics through the Windows PDH API. Enable it with one config key:
"performance": { "enabled": true }
With defaults, DrainCtl collects CPU, available memory, pages/sec, disk queue length, TCP retransmits, and per-session input delay on the configured sample interval. Metrics appear in the dashboard, audit trail, webhook payloads, and CLI JSON output.
| Trigger | Default | Fires when |
|---|---|---|
cpu_warning |
70% | CPU ≥ threshold, sustained for load_alert_delay_sec (default 60s) |
cpu_critical |
85% | CPU ≥ threshold, sustained for load_alert_delay_sec (default 60s) |
memory_warning |
20% free | Available memory ≤ threshold, sustained for load_alert_delay_sec (default 60s) |
memory_critical |
10% free | Available memory ≤ threshold, sustained for load_alert_delay_sec (default 60s) |
input_delay_warning |
50ms | Input delay ≥ threshold, sustained for input_delay_alert_delay_sec (default 90s) |
input_delay_critical |
100ms | Input delay ≥ threshold, sustained for input_delay_alert_delay_sec (default 90s) |
All performance triggers require a sustained breach before firing, eliminating flap noise from
transient spikes. CPU and memory default to a 60-second sustain window
(load_alert_delay_sec); input delay defaults to 90 seconds
(input_delay_alert_delay_sec) because it is inherently volatile. Both windows are expressed
in seconds and translate to a whole number of polls internally based on
sample_interval_sec (default 30). Input delay is evaluated against the 95th percentile (P95) by
default; set "input_delay_percentile": "p50" for median-based thresholds.
Set any threshold to 0 for the default, or -1 to disable.
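To make the seconds-to-polls translation concrete, here is a minimal Python sketch. It assumes the window is rounded up to a whole number of samples and that a trigger fires only when every sample in the window breaches; the source does not state the exact rounding rule, and the function names are hypothetical.

```python
import math

def polls_required(delay_sec: float, sample_interval_sec: float = 30) -> int:
    """Whole number of consecutive polls covering the sustain window.

    Assumption: the window is rounded up, with a minimum of one poll.
    """
    return max(1, math.ceil(delay_sec / sample_interval_sec))

def sustained_breach(samples, threshold, polls, higher_is_worse=True):
    """True when the most recent `polls` samples all breach the threshold."""
    if len(samples) < polls:
        return False
    recent = samples[-polls:]
    if higher_is_worse:
        return all(s >= threshold for s in recent)   # CPU, input delay
    return all(s <= threshold for s in recent)       # available memory

print(polls_required(60))  # load_alert_delay_sec default -> 2
print(polls_required(90))  # input_delay_alert_delay_sec default -> 3
```

Under these assumptions a single 30-second CPU spike never fires `cpu_warning`, because one breaching sample cannot fill a two-poll window.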
Set "collect_remotefx": true to collect RemoteFX Graphics and Network counters: output FPS,
encoding time, frame quality, RTT, packet loss, and skip rates. These counters are only available when
the RemoteFX role is installed; absence is not an error.
Each server card in the Overview grid shows live CPU/Memory/Sessions values color-coded by severity (green/amber/red). The fleet-wide LOAD chart overlays CPU, CPU P95, Memory, and Sessions on one dual-axis canvas; a P95 toggle surfaces spikes the average smooths over.
Opening a server reveals the Server Detail panel — a single dual-axis Host Load chart scoped to that host plots CPU, CPU P95, Memory, and Sessions together, with its own 5M/1H/1D/3D/5D window pill. Input delay, disk queue, and RemoteFX counters (when the role is installed) are surfaced in the metrics tiles alongside.
The dashboard shows a composite status for each server reflecting both drain mode and performance health. The worst severity wins:
| Status | Color | Condition |
|---|---|---|
| Healthy | Green | Drain off, no performance threshold breaches |
| Warning | Amber | Drain off, but one or more perf metrics at warning level (CPU, memory, or input delay) |
| Grace | Amber | Drain mode active, within configured grace period |
| Alert | Red | Drain mode exceeded grace period, or any performance metric at critical level |
| Offline | Grey | No report received in the last 10 minutes |
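The table can be read as a small precedence function. The Python sketch below is one way to express it, assuming Offline is checked first and that a report exactly 10 minutes old counts as offline; both are assumptions, and the function and parameter names are hypothetical.

```python
def composite_status(drain_on, within_grace, perf_level,
                     minutes_since_report, offline_after_min=10):
    """Return the card status; perf_level is 'ok', 'warning', or 'critical'."""
    if minutes_since_report >= offline_after_min:
        return "Offline"                      # assumption: checked before anything else
    if perf_level == "critical" or (drain_on and not within_grace):
        return "Alert"                        # worst severity wins
    if drain_on:
        return "Grace"                        # drain active, still inside the grace period
    if perf_level == "warning":
        return "Warning"
    return "Healthy"

print(composite_status(True, True, "critical", 1))
```

A host that is draining inside its grace period but has a critical CPU breach reports Alert, since the critical metric outranks the amber Grace state.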
Each notification target’s repeat_minutes acts as a cooldown window. Once a
notification fires, the same trigger type will not fire again for that target until the interval elapses.
If the metric is still breaching when the interval expires, a reminder is sent.
Cooldowns are never reset by metric oscillation. A metric that briefly dips below threshold and rises again does not produce a new notification — the original cooldown holds. This prevents notification storms from volatile metrics like input delay.
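The cooldown behaviour described above can be sketched in Python as follows. This is an illustration only: the class and method names are hypothetical, and the per-trigger bookkeeping is an assumption about how a target might track its state.

```python
class TargetCooldown:
    """Tracks the last notification time per trigger type for one target."""

    def __init__(self, repeat_minutes):
        self.repeat_sec = repeat_minutes * 60
        self.last_fired = {}  # trigger type -> timestamp (seconds) of last notification

    def should_notify(self, trigger, breaching, now):
        if not breaching:
            # A dip below threshold does NOT clear last_fired,
            # so oscillation cannot restart the cycle.
            return False
        last = self.last_fired.get(trigger)
        if last is not None and now - last < self.repeat_sec:
            return False                      # still inside the cooldown window
        self.last_fired[trigger] = now        # first fire, or a reminder after expiry
        return True

cd = TargetCooldown(repeat_minutes=30)
print(cd.should_notify("cpu_critical", True, now=0))
print(cd.should_notify("cpu_critical", True, now=800))
```

Keeping the timestamp when the metric recovers is what makes the original cooldown hold through a brief dip.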
The dashboard streams real-time updates to connected browsers via Server-Sent Events. When an agent reports or settings change, every open dashboard tab updates within 2 seconds — no manual refresh needed.
Endpoint: GET /api/v1/events (requires session cookie)
Event types: server_update (agent reports, local checks),
server_deleted (server removed via dashboard), and
settings_update (config changes)
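For consumers other than the browser, the stream can be read with any Server-Sent Events client. The Python sketch below parses standard SSE framing (`event:` and `data:` fields, with a blank line ending each event). The sample payloads are hypothetical, since the source does not document the JSON shape of each event.

```python
def parse_sse(stream_text):
    """Split a text/event-stream body into (event_type, data) tuples."""
    events = []
    event_type, data_lines = "message", []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                events.append((event_type, "\n".join(data_lines)))
            event_type, data_lines = "message", []
        elif line.startswith(":"):
            continue                           # comment / keep-alive line
        elif line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data_lines.append(value[1:] if value.startswith(" ") else value)
    return events

# Hypothetical payloads, for illustration only.
sample = (
    'event: server_update\ndata: {"server": "rdsh01"}\n\n'
    ': keep-alive\n\n'
    'event: settings_update\ndata: {}\n\n'
)
for kind, data in parse_sse(sample):
    print(kind, data)
```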
The dashboard serves an OpenAPI 3.1 spec at
/api/v1/openapi.yaml and a Swagger UI at /api/docs.
Both are public (no authentication required). The spec covers all API endpoints
including the SSE event stream.
DrainCtl includes an opt-in event-log anomaly detector (evtspike). It subscribes to a
configured set of Windows event-log channels, learns a per-channel, time-of-day rate baseline with a
robust-cap Bayesian model, and fires a notification on the event_spike trigger when a
confirmed rate anomaly is detected. It is designed to flag things like authentication-failure floods,
application-crash storms, or a misbehaving service that starts spewing warnings — without needing
a SIEM.
Open the dashboard, click the gear icon, scroll to the Event Log Anomaly Detection group, and tick Enable detector. The change propagates to every connected agent on the next pull and takes effect without a service restart.
Subscribed channels default to a curated set that includes Application, System, and
Microsoft-Windows-TerminalServices-LocalSessionManager/Operational. Channel lists,
thresholds, cooldowns, and the optional Security-channel opt-in are admin-only knobs and
live under "evtspike" in config.json on the dashboard server.
Each Server Detail panel carries an Event Spikes swimlane tile: one row per active channel, with confirmed spikes plotted as markers on a time axis (5M / 1H / 1D / 3D / 5D window pill). The tile header carries a state chip:
| State | Meaning |
|---|---|
HEALTHY |
Detector enabled; at least half of the subscribed channels have accumulated enough per-slot observations to score reliably. |
TRAINING |
Detector enabled and subscribed, but fewer than half of the channels are mature yet. Spikes can still fire for channels that have matured; the chip stays amber until the majority are in a usable state. |
DISABLED |
evtspike.enabled is false on this host. |
ERROR |
Zero channels subscribed successfully, or a startup error surfaced. Check the DrainCtl file
log for evtspike WARN lines — typically a missing channel,
access-denied, or the Security channel enabled without
SeSecurityPrivilege on the service token.
|
Every numeric knob below lives under "evtspike" in config.json on the
dashboard server and propagates to every connected agent on the next pull. Numeric fields are validated
against the ranges shown; a value outside the range silently falls back to the default. The only
runtime knob also editable from the dashboard UI is enabled; everything else is an
admin-only file edit.
A complete block with defaults:
"evtspike": {
"enabled": true,
"min_count": 10,
"threshold": 1e-4,
"cooldown_minutes": 10,
"slot_maturity_observations": 90,
"persist_interval_seconds": 900,
"half_life_buckets": 360,
"prior_strength": 60.0,
"mean_per_bucket_prior": 0.1,
"security_channel_enabled": false,
"disabled_channels": [],
"added_channels": []
}
These three work together. The detector emits a candidate spike when (a) the 10-second bucket
count is at least min_count, and (b) the Bayesian tail probability of seeing that
count under the learned baseline is below threshold. Two-of-three consecutive candidates
confirm a spike, and cooldown_minutes then suppresses repeats on the same channel.
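The two-condition candidate test and the two-of-three confirmation can be sketched in Python as below. This is an illustration under stated assumptions, not DrainCtl's implementation: it uses the textbook Gamma-Poisson predictive (a negative binomial) with shape `alpha` and rate `beta`, omits the robust-cap step, and treats "two-of-three" as a sliding window over the last three buckets. All names are hypothetical.

```python
import math
from collections import deque

def tail_probability(count, alpha, beta):
    """P(X >= count) for one bucket under a Gamma(alpha, beta) posterior on the rate."""
    log_p = math.log(beta / (beta + 1.0))
    log_q = math.log(1.0 / (beta + 1.0))
    cdf = 0.0
    for k in range(count):
        # Negative binomial pmf, computed in log space for stability.
        log_pmf = (math.lgamma(k + alpha) - math.lgamma(k + 1) - math.lgamma(alpha)
                   + alpha * log_p + k * log_q)
        cdf += math.exp(log_pmf)
    return max(0.0, 1.0 - cdf)  # limited by double precision for extremely small tails

def is_candidate(count, alpha, beta, min_count=10, threshold=1e-4):
    """Both conditions must hold: (a) the count floor and (b) a small enough tail probability."""
    return count >= min_count and tail_probability(count, alpha, beta) < threshold

class SpikeConfirmer:
    """Confirms a spike when at least two of the last three buckets were candidates."""

    def __init__(self):
        self.recent = deque(maxlen=3)

    def observe(self, candidate):
        self.recent.append(bool(candidate))
        return sum(self.recent) >= 2

# Quiet channel: posterior mean 6/60 = 0.1 events per bucket.
print(is_candidate(10, alpha=6.0, beta=60.0))
# Chatty channel: posterior mean 300/60 = 5 events per bucket.
print(is_candidate(10, alpha=300.0, beta=60.0))
```

The same count of 10 is a candidate on the quiet channel but not on the chatty one, which is why `min_count` and `threshold` have to be considered together.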
| Key | Default | Range | What it does — and when to tune |
|---|---|---|---|
min_count |
10 | 1 – 10000 |
Floor on the bucket count required to even consider scoring an anomaly. Rare channels where
two events/10 s is already catastrophic want 3–5. Very
chatty channels where background is 50+/10 s want 25–50 to
avoid scoring normal noise. Raising this is the single most effective lever for cutting
false positives on a noisy channel without retraining the baseline.
|
threshold |
1e-4 |
1e-9 – 0.1 |
Maximum tail probability for a count to qualify as anomalous. Lower = stricter. At
1e-4 the detector flags roughly one-in-ten-thousand buckets under a well-learned
baseline. Drop to 1e-6–1e-8 if you trust the baseline deeply
and only want the most extreme deviations. Raise to 1e-3–1e-2
for exploratory fleets where you want to see borderline activity.
|
cooldown_minutes |
10 | 1 – 1440 |
After a confirmed spike fires on a channel, further spikes on the same channel are
suppressed for this many minutes. Pair with your notification target's
repeat_minutes: the detector cooldown gates the first fire; the target cooldown
gates reminder notifications. Short cooldowns (2–5) are useful
on channels where distinct incidents can arrive back-to-back (auth failure storms). Long
cooldowns (60+) fit channels where a single incident produces a multi-hour
burst (patch storms, scheduled backup windows).
|
The baseline is a per-channel Gamma-Poisson posterior with one bucket per 15-minute time-of-day slot (96 slots/day). These knobs control how fast the baseline learns, how confidently it starts, and when a slot is considered mature enough for its own scoring.
| Key | Default | Range | What it does — and when to tune |
|---|---|---|---|
slot_maturity_observations |
90 | 1 – 100 |
Number of bucket observations a single 15-minute slot must accumulate before scoring
prefers it over the global posterior. The default of 90 means a slot is
declared mature only after a full 15-minute visit (90 ten-second buckets) in that
time-of-day window — eliminating the "first morning back after a long weekend" false
positive class. Lower values accept less evidence before trusting a slot's own
statistics; if you find the detector spends too long in TRAINING on quiet channels,
30–60 is a reasonable softer setting. Pre-009 the default
was 7 (≈70 s of coverage); persisted 7 values are NOT
auto-migrated, so existing installs keep their tuning.
|
half_life_buckets |
360 | 60 – 10000 |
Exponential-forgetting half-life in 10-second buckets. At 360 the baseline
weights the last hour heavily and older data fades; after one half-life the contribution
of a given observation drops to half. Shorter (120–180,
20–30 minutes) adapts faster to channel-behaviour drift but is more easily dragged
by a sustained background-rate shift. Longer (720+, two-plus hours) is more
stable across short incidents but slower to re-home on a permanent rate change (e.g., a
newly deployed application that changed the channel's idle rate).
|
prior_strength |
60.0 | 1.0 – 10000.0 |
How many bucket-equivalents of "pretend evidence" the initial prior is worth. With the
default, the baseline starts with the equivalent of 60 observations already seen at the
prior mean, so scoring is meaningful from minute one rather than exquisitely sensitive.
Raise this to 300+ to make the detector slower and more skeptical on freshly
installed agents; lower to 10–20 to let a real baseline take
over faster. Rarely worth touching outside a staging environment.
|
mean_per_bucket_prior |
0.1 | 0.0 – 1000.0 |
Expected events per 10-second bucket under "normal" conditions. Used only to seed the prior
before real data arrives. 0.1 reflects the assumption that most curated channels
are mostly quiet. For a known-chatty channel (SMB audit, Kerberos on a DC with many
short-lived tickets) you can seed a larger value so the first few minutes of observation
aren't flagged as anomalous simply because the prior was too low.
|
persist_interval_seconds |
900 | 60 – 86400 |
How often the in-memory detector state is written to baseline.json on disk.
Shorter = less learning lost on a hard service kill; longer = fewer disk writes on
constrained hosts. The default of 15 minutes loses at most one slot's worth of
observations on a crash. A clean service stop always forces a final save regardless of
cadence.
|
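To show how the learning knobs interact, here is a minimal Python sketch of a Gamma-Poisson baseline with exponential forgetting. It assumes the standard conjugate form (shape `alpha`, rate `beta`, per-bucket decay of `0.5 ** (1 / half_life_buckets)`); the actual detector also applies a robust cap and per-slot posteriors, which are not modelled here. Function names are hypothetical.

```python
def seed_prior(prior_strength=60.0, mean_per_bucket_prior=0.1):
    """Prior worth `prior_strength` pseudo-buckets at the prior mean."""
    alpha = prior_strength * mean_per_bucket_prior   # pseudo-events
    beta = prior_strength                            # pseudo-buckets
    return alpha, beta

def decay_factor(half_life_buckets=360):
    """Per-bucket weight so an observation's contribution halves after one half-life."""
    return 0.5 ** (1.0 / half_life_buckets)

def update(alpha, beta, count, half_life_buckets=360):
    """Fade old evidence, then add the newest 10-second bucket."""
    d = decay_factor(half_life_buckets)
    return d * alpha + count, d * beta + 1.0

alpha, beta = seed_prior()
for count in [0, 0, 1, 0, 0, 0]:        # six buckets = one minute of a quiet channel
    alpha, beta = update(alpha, beta, count)
print(alpha / beta)                      # posterior mean events per bucket
```

A larger `prior_strength` makes the first real observations move the mean less, while a shorter `half_life_buckets` makes every later observation move it more.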
The detector subscribes to a curated list of 54 default channels — Application, System, FSLogix, RemoteApp/RDP, SMB client/server, Terminal Services subsystems, auth (NTLM / Kerberos / LSA), infrastructure (DNS / TCPIP / CAPI2), user experience (GroupPolicy / User Profile Service / PrintService), and stability (WHEA / Windows Defender / Crashdump). Three knobs shape that list:
| Key | Default | What it does |
|---|---|---|
security_channel_enabled |
false |
Opt-in subscription to the Security channel. Disabled by default because
subscribing requires SeSecurityPrivilege on the service account — the
DrainCtl service runs as LocalSystem, which has this privilege, but if you
run it under a different service account you must grant the right first. Enabling this
can generate a lot of noise on a domain controller; pair with higher min_count.
|
disabled_channels |
[] |
Case-insensitive list of default channels to remove from the subscription set. Use
this to silence a default channel without editing the curated list. Does not affect
added_channels — so an admin who disables a default can still re-add it
explicitly via the added list if they want custom per-channel tuning later.
|
added_channels |
[] |
Extra channel names to subscribe to beyond the curated defaults. Any channel you actually
want the detector watching and that isn't already in the default set goes here —
application-specific channels (Microsoft-Windows-Hyper-V-Worker/Operational,
vendor-specific subsystems) are the typical case. Non-existent channels are logged as
skipped at service start and do not block other subscriptions.
|
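The way the three channel knobs combine can be sketched in Python as follows. This assumes the effective set is "defaults minus disabled, plus added", with case-insensitive matching; the function name is hypothetical and the Security opt-in is modelled as a plain flag.

```python
def effective_channels(defaults, disabled, added, security_channel_enabled=False):
    """Return the subscription list implied by the channel knobs."""
    disabled_keys = {name.casefold() for name in disabled}
    # disabled_channels only filters the curated defaults.
    result = [c for c in defaults if c.casefold() not in disabled_keys]
    if security_channel_enabled:
        result.append("Security")
    seen = {c.casefold() for c in result}
    # added_channels are appended afterwards, so a disabled default can be re-added.
    for name in added:
        if name.casefold() not in seen:
            result.append(name)
            seen.add(name.casefold())
    return result

defaults = ["Application", "System", "Microsoft-Windows-SMBClient/Audit"]
print(effective_channels(
    defaults,
    disabled=["microsoft-windows-smbclient/audit"],
    added=["Microsoft-Windows-Hyper-V-Worker/Operational"],
))
```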
To list the channel names available on a host, run wevtutil el. The case used in that output is the exact string
DrainCtl expects in added_channels. For a channel's log path, use
wevtutil gl <channel-name>.
Defaults ship for a general-purpose Terminal Services host running well-behaved line-of-business apps. Four common scenarios warrant different postures:
You just turned the detector on. You don't yet know what your baseline looks like. You want to see borderline activity so you can decide what to tune. Prefer false positives to false negatives.
"evtspike": {
"enabled": true,
"min_count": 5,
"threshold": 1e-3,
"cooldown_minutes": 30,
"slot_maturity_observations": 30
}
Lower min_count and looser threshold catch more. Longer cooldown
prevents a single noisy channel from pager-bombing you while you figure out if it's signal or normal.
Lower slot_maturity_observations than the default (90) lets per-slot scoring engage sooner
so you see borderline activity earlier — at the cost of more false positives in the first few hours.
The baseline has been learning for a month. Operators are tired of informational spikes. You want only real emergencies to fire. Defaults already give you the strict-direction slot maturity; tighten the detection knobs:
"evtspike": {
"enabled": true,
"min_count": 25,
"threshold": 1e-6,
"cooldown_minutes": 10
}
Higher min_count filters out small deviations. Tighter threshold demands
extreme tail events. The default slot_maturity_observations = 90 already means the
detector refuses to call a slot mature until it has seen a full 15-minute visit in that time-of-day
window, which eliminates the "first morning back after a three-day weekend" false positive class.
Global defaults work for 50+ channels but one specific channel (often Microsoft-Windows-SMBClient/Audit
or Microsoft-Windows-Kerberos/Operational on a DC) keeps firing and you trust it's
operational noise, not signal. Disable just that channel:
"evtspike": {
"enabled": true,
"disabled_channels": [
"Microsoft-Windows-SMBClient/Audit"
]
}
Prefer this to globally raising min_count — it preserves sensitivity on the other 53
channels. Reinstate the channel after the workload stabilises.
You're running DrainCtl on domain controllers and want auth-failure flood detection. Enable the Security channel and tighten the thresholds, since the baseline on auth channels is naturally high:
"evtspike": {
"enabled": true,
"min_count": 50,
"threshold": 1e-6,
"cooldown_minutes": 5,
"security_channel_enabled": true,
"mean_per_bucket_prior": 2.0,
"half_life_buckets": 720
}
Higher mean_per_bucket_prior stops the first few hours of runtime from flagging normal
Security churn. Longer half_life_buckets gives the baseline two hours of memory so short
spiky authentications don't nudge the baseline upward. Short cooldown is appropriate
because auth storms (password-spraying, service-account lockouts) are often distinct incidents arriving
minutes apart.
prior_strength, mean_per_bucket_prior, and half_life_buckets only
affect the initial seeding of a detector — an already-running detector keeps its current
posterior. Run drainctl baseline reset after changing any of these so the new values
actually take effect. Detection knobs (min_count, threshold, cooldown_minutes)
and channel knobs apply immediately on the next config pull; no reset needed.
The detector stores its per-channel sufficient statistics (Gamma-Poisson posteriors, robust-cap state, last-alert timestamps) to a single file:
%ProgramData%\LISS Technologies\LISSTech DrainCtl\baseline.json
The service rewrites this file on a fixed persistence cadence and once more on clean shutdown, so a restart resumes scoring from the same learned distribution instead of starting over.
If a real operational incident occurred during the training window (e.g., a Patch Tuesday burst got learned as normal), the baseline can become desensitised. Reset it via the CLI:
drainctl baseline reset
This routes through the running service so both halves stay in sync: the in-memory detectors for every
subscribed channel are wiped, and baseline.json is deleted. Subscriptions and the scoring
loop keep running; a fresh baseline starts accumulating from the next 10-second scoring tick.
Don't delete baseline.json by hand while the service runs.
The file is rewritten from in-memory state on the next persistence tick, so a manual delete without the
in-memory wipe achieves nothing. Always use drainctl baseline reset.
In the Configuration modal's notification target editor, add event_spike to any target's
trigger list. The same target can combine drain events and event spikes (e.g., PagerDuty gets
alert and event_spike; Slack gets everything). The target's
repeat_minutes cooldown applies per trigger type, so an event_spike cooldown is independent
of the drain-mode alert cooldown.
DrainCtl is designed to integrate with any RMM platform that can run a script and check an exit code.
drainctl check --quiet
exit $LASTEXITCODE
That's it. --quiet suppresses intermediate log lines — only the final status line is emitted. Works with any RMM that supports PowerShell or CMD scripts with exit code thresholds.
# Parse JSON output for custom dashboards or APIs
$result = drainctl check --format json | ConvertFrom-Json
if ($result.exit_code -ne 0) {
# Send to your ticketing system, dashboard, etc.
}
The agent can self-update by polling GitHub releases on a daily cadence, verifying the release manifest,
verifying the MSI Authenticode signature against the LISS code-signing certificate, and running
msiexec silently. Auto-update is off by default — it does not contact
GitHub at all until the operator opts in.
Edit config.json and add (or modify) the update object:
"update": {
"enabled": true,
"channel": "prerelease",
"poll_interval": "24h"
}
The service watches config.json and reloads within seconds — no restart needed. The
first poll fires 5–15 minutes after the change (jittered); subsequent polls run every
poll_interval ± 8% (so 24h ± 2h on the default).
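The jittered schedule can be illustrated with a short Python sketch. It assumes the jitter is drawn uniformly within ±8% of the interval; the source gives only the bounds, not the distribution, and the function name is hypothetical.

```python
import random

def next_poll_delay_sec(poll_interval_sec, jitter_fraction=0.08, rng=random):
    """Pick the delay before the next poll, spread around the configured interval."""
    low = poll_interval_sec * (1 - jitter_fraction)
    high = poll_interval_sec * (1 + jitter_fraction)
    return rng.uniform(low, high)   # assumption: uniform jitter

day = 24 * 60 * 60
delay = next_poll_delay_sec(day)
print(round(delay / 3600, 2), "hours until the next poll")
```

For the 24h default the bounds work out to roughly 22.1h and 25.9h, which matches the "24h ± 2h" shorthand. Jitter of this kind keeps a fleet of agents from polling at the same moment.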
| Field | Default | Notes |
|---|---|---|
enabled | false |
When false, no goroutine, no GitHub call. Default is opt-in by design —
auto-installing software without explicit consent on a host violates the principle of least
surprise. |
channel | "stable" |
Recognized values: "stable" and "prerelease". stable
polls /releases/latest (skips prereleases). prerelease polls
/releases?per_page=1 (most recent regardless of flag). Unknown values clamp to
"stable" with a warn log. |
poll_interval | "24h" |
Go-duration string ("24h", "6h30m", "72h"). Minimum
"1h"; lower values clamp. |
No stable release has been published yet, so channel: "stable" finds nothing on
GitHub and the agent logs update=no_stable_release on each poll. To actually receive
updates today, both "enabled": true AND "channel": "prerelease" are required.
Once a stable release is published, "channel": "stable" becomes operationally useful.
1. Poll api.github.com for the latest release on the configured channel.
2. If the remote version matches the running version, log update=up_to_date and reschedule.
3. Download release.json and release.json.sig, then verify the Ed25519 signature against the public key embedded in the binary.
4. Download the MSI to %TEMP%\drainctl-update-<random>.msi and verify its SHA-256 matches the signed manifest.
5. Verify the MSI's Authenticode signature against the LISS code-signing certificate, subject LISS Consulting, Corp. Anything else (missing signature, untrusted chain, wrong subject) is refused and the temp file is deleted.
6. Launch msiexec /i <path> /quiet /norestart as a detached process so it survives
the service's exit.
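If you want to reproduce the SHA-256 check by hand on a downloaded MSI, the Python sketch below shows the general technique. It is not the agent's code: the function names are hypothetical, and how the expected digest is stored inside release.json is not documented here, so the digest is passed in as a plain hex string.

```python
import hashlib
import hmac

def sha256_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large installers do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def msi_matches_manifest(path, expected_hex):
    """Compare the file's digest with the value taken from the signed manifest."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_file(path), expected)
```

The digest comparison only means something after the manifest signature itself has been verified, which is why the signature check comes first in the sequence.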
Set log_file_level to "debug" in config.json to see every poll
attempt. Then tail the daily file log:
Get-Content "$env:ProgramData\LISS Technologies\LISSTech DrainCtl\drainctl*.log" |
Select-String 'update='
You'll see lines like:
- update=poll_start: one per scheduled poll.
- update=up_to_date: remote is the same version we're running.
- update=not_modified: GitHub returned 304 against the cached ETag.
- update=no_stable_release: channel is "stable" and no non-prerelease release exists. Steady-state idle, not a failure.
- update=installing old=… new=…: install path triggered. Persists 7 days in the rotated daily logs.
- update=refused reason=…: manifest or Authenticode verification failed. Stays at the configured cadence (deliberate refusal, not a transient failure).

Set "enabled": false in config.json. The next poll tick (within
poll_interval) will see the change and skip the GitHub call. To stop all activity
immediately, restart the service after the edit.
The MSI handles service installation. These commands are for manual management or troubleshooting.
# Check service status (CLI — shows Running / Stopped)
drainctl service status
# Check service status (PowerShell — richer output)
Get-Service DrainCtl
# Stop / Start / Restart
Stop-Service DrainCtl
Start-Service DrainCtl
Restart-Service DrainCtl
# Manual install (if not using MSI)
drainctl service install
drainctl service start
# Manual uninstall
drainctl service stop
drainctl service uninstall
- Watches TSServerDrainMode via RegNotifyChangeKeyValue: instant detection
- Subscribes to audit events via EvtSubscribe: knows who made the change
- Stores state in a local database (drainctl.db, WAL mode) under the DrainCtl data directory: the audit trail, metrics tiers, server registry, and event-spike history all live here
- Serves a named pipe (\\.\pipe\drainctl) for instant CLI/PS responses
- Counts sessions via WTSEnumerateSessionsW for utilization alerts
- Hot-reloads on config.json changes

Open Event Viewer → Applications and Services Logs → DrainCtl:
| ID | Level | What happened |
|---|---|---|
| 1000 | Info | Service started |
| 1001 | Info | Service stopped |
| 1002 | Info | Check: healthy |
| 1004 | Info | State transition detected |
| 2000 | Warning | Drain mode in grace period |
| 3000 | Error | Drain mode alert (grace exceeded) |
Alongside the Event Log sink, the service writes a slog-formatted text log to:
%ProgramData%\LISS Technologies\LISSTech DrainCtl\drainctl.log
The file rotates at local midnight: the previous day's file is renamed to
drainctl-YYYY-MM-DD.log in the same directory, and a fresh drainctl.log is
opened. Archives older than 7 days are pruned automatically at rotation time. The
per-sink level is controlled by log_file_level (file) and log_event_level
(Windows Event Log) in config.json; both default to info. The CLI is
independent — pass --log-level debug for one-shot verbosity.
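If you archive or ship these logs elsewhere, it helps to know which files the next rotation will prune. The Python sketch below applies the same naming rule to a directory listing. It assumes "older than 7 days" means the date in the filename is more than seven days before today; the exact boundary is not stated in the source, and the function name is hypothetical.

```python
import re
from datetime import date, timedelta

ARCHIVE_RE = re.compile(r"^drainctl-(\d{4})-(\d{2})-(\d{2})\.log$")

def archives_to_prune(filenames, today, keep_days=7):
    """Return archive names whose date stamp falls before the retention cutoff."""
    cutoff = today - timedelta(days=keep_days)
    stale = []
    for name in filenames:
        match = ARCHIVE_RE.match(name)
        if not match:
            continue                      # drainctl.log and unrelated files are never pruned
        try:
            stamp = date(int(match[1]), int(match[2]), int(match[3]))
        except ValueError:
            continue                      # skip names with an impossible date
        if stamp < cutoff:
            stale.append(name)
    return stale

files = ["drainctl.log", "drainctl-2025-03-19.log", "drainctl-2025-03-12.log"]
print(archives_to_prune(files, today=date(2025, 3, 20)))
```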
Run drainctl audit-setup as admin. On domain-joined machines, also configure the GPO (see
Audit Setup).
Check the DrainCtl event log for DrainCtl errors (Event ID 3002). Common causes:
The service isn't running. drainctl check falls back to direct registry read + file-based
audit. Start the service: Start-Service DrainCtl.
- drainctl notify status: verify URLs are set
- drainctl notify test: sends a test message
- For email targets, check that the from address is authorized to send, and the relay allows connections from the server's IP
The server's DrainCtl service isn't reporting. Check:
- Is the service running? Get-Service DrainCtl
- Is dashboard.url set in %ProgramData%\LISS Technologies\LISSTech DrainCtl\config.json?
- Is the dashboard reachable? Test-NetConnection dashboard-server -Port 49470
Windows Authentication (Kerberos) issue:
- Use the fully qualified domain name (https://server.domain.com:49470); Kerberos needs the
full hostname
The reporting server isn't registered. Run drainctl register https://dashboard:49470 on
that server.
Open an issue on GitHub or contact LISS Technologies support.