CANARY Web Console

CANARY (Component Analytics & Near-term Advisory Risk Yardstick) predicts near-term security advisory risk for Jenkins plugins using publicly observable project signals. Rather than waiting for a vulnerability to be disclosed, CANARY estimates the likelihood that a plugin will appear in a Jenkins security advisory within the next 180 days — giving security teams a proactive prioritization signal.

This is a research prototype developed as part of a Doctor of Engineering praxis at The George Washington University. It is intended as a decision-support tool, not a replacement for security review.

1. Click the Scoring tab.
2. Type a Jenkins plugin name in the Plugin ID field (autocomplete is populated from the live registry).
3. Optionally select an ML model from the dropdown to add a probabilistic score alongside the heuristic one.
4. Click Score plugin.
5. Review the heuristic score, ML advisory probability, and SHAP-based feature drivers. Use Explain now (AI) for a plain-English summary.

The CANARY score (0.0–1.0) is the model's estimated probability of a Jenkins security advisory within the next 180 days. Supporting signals — including maintenance history, governance artifacts, and dependency risk — are shown alongside the score to provide interpretable context.

Risk level	ML score	Suggested action
Low	Score < 0.05	Normal patch hygiene — no special action needed.
Medium	0.05 ≤ score < 0.20	Monitor advisories; include in scheduled patch cycles.
High	Score ≥ 0.20	Prioritize review; consider alternatives for new pipelines.
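
The thresholds in the table can be expressed as a small lookup. The sketch below is illustrative only: `risk_level` is a hypothetical helper name, not part of CANARY's published code, and it treats a score of exactly 0.20 as High, following the High row.

```python
def risk_level(score: float) -> str:
    """Map an ML score in [0.0, 1.0] to the risk band from the table.

    Hypothetical helper for illustration; the band edges come from the table.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0.0, 1.0], got {score}")
    if score < 0.05:
        return "Low"
    if score < 0.20:
        return "Medium"
    return "High"  # score >= 0.20


# Example: band a few scores (the plugin names are made up).
for plugin, score in [("plugin-a", 0.01), ("plugin-b", 0.12), ("plugin-c", 0.35)]:
    print(plugin, score, risk_level(score))
```

Keeping the band edges in one function makes it easy to adjust them if your team wants a stricter or looser cut-off for review.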

CANARY uses only publicly observable data — no private telemetry or credentials are required. The most predictive signals come from Software Heritage archival data and GitHub Archive event history.

Signal	What it captures
`Days since last commit`	How long ago the repository was last updated — stale repos carry higher risk.
`Archive age`	How long the plugin has been publicly archived — older projects tend to be better-hardened.
`Release recency`	Time since the last published release — infrequent releases correlate with elevated risk.
`Security-fix commit count`	Commits whose messages reference security fixes — a positive maintenance signal.
`Advisory history`	Number and recency of previously published Jenkins security advisories.
`Governance artifacts`	Presence of SECURITY.md, Dependabot config, changelog, and CI workflows.
`Dependency risk`	Whether the plugin's dependencies have known advisories.
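
To make two of these signals concrete, here is a minimal sketch of how governance artifacts and commit staleness might be derived from a repository file listing and a commit timestamp. The file paths, function names, and dictionary keys are assumptions for illustration; the source names the artifact types but not how CANARY detects them.

```python
from datetime import datetime, timezone

# Assumed file locations for each artifact type (not confirmed by the source).
GOVERNANCE_PATHS = {
    "security_policy": ("SECURITY.md", ".github/SECURITY.md"),
    "dependabot": (".github/dependabot.yml", ".github/dependabot.yaml"),
    "changelog": ("CHANGELOG.md",),
}


def governance_flags(repo_files):
    """Return a dict of booleans, one per governance artifact type."""
    files = set(repo_files)
    flags = {
        name: any(path in files for path in paths)
        for name, paths in GOVERNANCE_PATHS.items()
    }
    # Assumption: CI workflows are detected as files under .github/workflows/.
    flags["ci_workflows"] = any(f.startswith(".github/workflows/") for f in files)
    return flags


def days_since(event_time, now):
    """Whole days elapsed between an event (e.g. last commit) and `now`."""
    return (now - event_time).days


# Example with a made-up file listing and made-up timestamps.
example_files = ["SECURITY.md", ".github/workflows/ci.yml", "pom.xml"]
example_flags = governance_flags(example_files)
example_age = days_since(
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 31, tzinfo=timezone.utc),
)
```

The same `days_since` helper would serve for release recency by passing the last release timestamp instead of the last commit timestamp.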

The Machine learning tab lets you explore pre-computed results across 64 model configurations. Use the three dropdowns to select an algorithm (XGBoost, LightGBM, Random Forest, Logistic Regression), a feature set (from advisory-history-only up to all 154 features), and an evaluation strategy (time split or group-time split). Where available, a feature selection panel shows which features are most important and whether a compact subset can match full-model performance.

Time split trains on earlier observations and tests on later ones, so the same plugins can appear in both training and testing; this corresponds to a continuous monitoring scenario. Group-time split withholds entire plugins from training, testing whether the model generalises to previously unseen plugins. The group-time design is the more conservative and realistic evaluation of the two.
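
The difference between the two strategies can be sketched on toy data. This assumes each row is one observation of a plugin at a point in time, that the time split uses a single cutoff date, and that the group-time split applies the same cutoff while also holding out a set of plugins. The row layout, function names, and cutoff are hypothetical and not taken from CANARY's pipeline.

```python
from datetime import date


def time_split(rows, cutoff):
    """Train on rows before the cutoff, test on rows at or after it."""
    train = [r for r in rows if r["month"] < cutoff]
    test = [r for r in rows if r["month"] >= cutoff]
    return train, test


def group_time_split(rows, cutoff, held_out):
    """Same cutoff, but held-out plugins are excluded from training
    and the test set contains only held-out plugins."""
    train = [r for r in rows
             if r["month"] < cutoff and r["plugin"] not in held_out]
    test = [r for r in rows
            if r["month"] >= cutoff and r["plugin"] in held_out]
    return train, test


def plugins(rows):
    """Set of plugin names present in a list of rows."""
    return {r["plugin"] for r in rows}


# Toy observations: two made-up plugins, each seen before and after the cutoff.
rows = [
    {"plugin": "a", "month": date(2023, 1, 1)},
    {"plugin": "a", "month": date(2024, 1, 1)},
    {"plugin": "b", "month": date(2023, 1, 1)},
    {"plugin": "b", "month": date(2024, 1, 1)},
]
cutoff = date(2023, 7, 1)

train, test = time_split(rows, cutoff)
overlap_time = plugins(train) & plugins(test)        # plugins seen in both sets

g_train, g_test = group_time_split(rows, cutoff, held_out={"b"})
overlap_group = plugins(g_train) & plugins(g_test)   # empty by construction
```

Under the time split, both plugins appear on each side of the cutoff, so the model is tested on plugins it has already seen. Under the group-time split, the test plugins are entirely new to the model, which is why that design gives the more conservative estimate.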

CANARY is scoped to the Jenkins plugin ecosystem only — it does not score npm, PyPI, Maven, or other package registries.
Scores reflect near-term advisory likelihood, not exploitability or severity in your specific environment.
A low score does not mean a plugin is safe — it means CANARY sees no strong signal of an imminent advisory based on publicly observable data.
Data is updated periodically, not in real time. Always consult the official Jenkins security advisories for the authoritative source.

The full source code, data pipeline, and research documentation are available on GitHub. The praxis document provides a detailed description of the methodology, ablation results, and future research directions.

View on GitHub | Jenkins Security Advisories

What is CANARY?

Score a plugin in 30 seconds

What the scores mean

Key signals used

Exploring model results

What CANARY is not

Going deeper