It's time once again for OpenNMS On the Horizon.

Since last time, we worked on documentation (backup/restore, events, provisioning, traps, surveillance categories, detectors), CI optimizations, Horizon Stream (Kubernetes, provisioning and detectors, alarm and queries, Keycloak integration, shared APIs, CI, Minion, GraphQL), device config backup logging, OpenTelemetry tracing, JUnit 5, telemetry replay, non-root, parallel time-series writes, ALEC algorithms and UI, the Elasticsearch Drift Plugin, flow processing refactoring, the Twin API, Kafka handling of zero values, Pollerd optimization, datachoices metric collection, Prometheus collector error handling, WebMonitor response times, time-series API off-heap storage, REST (alarms, device config, provisioning config), bridge topology classifying, alarm advanced search, (deep breath) aaaaand... password complexity validation.

Really gotta make sure I don't miss a week; we've got a lot more people working on the code now. :D

GitHub Project Updates

Internals, APIs, and Documentation

  • Mark Mahacek did more documentation work on backup/restore, events, and provisioning policies.
  • Morteza continued his work on making optimized CircleCI pipelines.
  • Jeffrey Kapp merged the Kubernetes operator into the Horizon Stream codebase.
  • Alex worked on making device config backup output logging more configurable.
  • Arthur did more work porting detectors to Horizon Stream.
  • DJ worked on moving from OpenTracing to OpenTelemetry.
  • Chandra fixed an issue with querying alarms in stream.
  • Alexander worked on updating our test infrastructure to JUnit 5.
  • Jesse worked on Telemetryd packet capture replay support.
  • Alex worked on some improvements to the non-root support in scripts.
  • Jason and Yang Li did more work on Keycloak integration in Horizon Stream.
  • Bonnie worked on migrating Trapd configs from the wiki to the docs.
  • Yang Li worked on some shared API code for Horizon Stream.
  • Chandra did more work on parallel time-series writes to multiple backends.
  • Yang Li fixed an issue with event DTO queries in stream.
  • Benjamin Janssens worked on Hellinger Distance support in ALEC.
  • Emily worked on modernizing some terminology relating to surveillance categories throughout the docs, plus detector documentation.
  • I worked on updating the Elasticsearch Drift Plugin CI to auto-build release tags.
  • Dustin worked on refactoring some of the flow processing code to split up processing and persistence.
  • Jason started on automation for integrating the operator environment in stream.
  • Łukasz worked on the gRPC twin subscriber.
  • Chandra worked on integrating the Minion in stream.
  • Alex fixed an issue with the handling of zero values in Kafka.
  • Jesse continued his work on optimization of Pollerd startup.
  • Scott did more work on provisioning in stream.
  • Thomas, Christian, and Alex added collected datachoices metrics for CPU count, memory, configured services, provisioning config, notifications, and business services.
  • Gerald added a PR template and default editor config to stream.
  • Dino made some improvements to the PrometheusCollector's error output.
  • Dino fixed a bug where the WebMonitor wouldn't report response times.
  • Jason worked on support for debugging over ssh in stream.
  • Morteza and I worked on automating portions of the release process.
  • Patrick worked on some changes to the time-series API for off-heap storage.
  • Antonio did some refactoring to simplify bridge connection classifying in Enlinkd.

Web, REST, UI, and Helm

  • Alberto fixed an NPE in the alarm REST service when an alarm's associated event(s) have been deleted.
  • Christian worked on a bug in deleting a graphml entry through the REST interface.
  • Pushkar worked on a REST interface for deleting device config backups.
  • Yang Li worked on a device REST API.
  • Chinh Le did more work on device listing in the new UI.
  • Christian fixed a bug in query URL escaping.
  • Anya did more work on an ALEC web UI.
  • Mike Rose made more fixes to new UI code, including support for better GraphQL error handling.
  • Dmitri worked on improvements to the provisiond-config REST API.
  • Scott worked on an HQL alarm query bug.
  • James worked on supporting negating some terms in the alarm advanced search.
  • Christian added some error handling to panel refresh in the topology UI.
  • James and Chandra worked on adding a Minion REST API to stream.
  • Chinh Le added a mock GraphQL server to stream.
  • Lars worked on enforcing password complexity in the new password dialog.

Contributors

Thanks to the following contributors for committing changes since last OOH:

  • Chandra Gorantla
  • Patrick Schweizer
  • Antonio Russo
  • Yang Li
  • Jason Berry
  • Morteza Ershad-Manesh
  • Alex May
  • Bonnie Robinson
  • Benjamin Janssens
  • Emily Marsh
  • Chinh Le
  • DJ Gregor
  • Mike Rose
  • Benjamin Reed
  • Anya Rybalova
  • Lars Schreiber
  • Christian Pape
  • Pushkar Suthar
  • Dustin Frisch
  • Mark Mahacek
  • Alberto Ramos
  • Gerald Humphries
  • Scott Theleman
  • Jeffrey-David Kapp
  • Sean Torres
  • Dino Yancey
  • Alexander Chadfield
  • Thomas Bigger
  • Jesse White
  • Łukasz Dywicki
  • James Hutchinson
  • Mark Frazier
  • Dmitri Herdt
  • Arthur Naseef

Coming Soon: JIRA Migration

We will be migrating our JIRA issue-tracker from a self-hosted version to Atlassian's cloud version.
I don't have a timeline for this yet, but expect it in the coming months.

If you currently have an account on the OpenNMS issue tracker, it should already be migrated to JIRA Cloud, but you will need to reset your password using the "Can't log in?" link before you can log in.

Releases and Roadmap

July 2022 Releases - Horizon 30.0.1, Meridians 2022.1.5, 2021.1.17, 2020.1.25, 2019.1.36

In July, we released updates to all OpenNMS Meridian versions under active support, as well as Horizon 30.0.1.

Meridian Stable Updates

Meridians 2019.1.36, 2020.1.25, and 2021.1.17 contain just a few small bug fixes and enhancements.

Additionally, Meridian 2022.1.5 contains all of the changes from the previous Meridian releases, as well as additional changes relating to Elasticsearch, running as non-root, web authentication, and more.

For a list of changes, see the release notes:

Horizon 30.0.1

Horizon 30.0.1 contains all of the changes included in Meridian 2022, as well as a number of additional enhancements and bug fixes, including added support for encrypting credentials.

For a high-level overview of what has changed in Horizon 30, see What’s New in OpenNMS Horizon 30.

For a complete list of changes, see the changelog.

The codename for Horizon 30.0.1 is Chinchilla.

Upcoming August Releases

OpenNMS is on a monthly release schedule, with releases happening on the second Wednesday of the month.

The next OpenNMS release day is August 10th, 2022.

We currently expect updates to Horizon 30 and all supported Meridians.
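If you'd like to work out future release days yourself, the "second Wednesday of the month" rule is easy to compute. Here is a minimal sketch in Python; the function name `second_wednesday` is made up for this illustration and is not part of any OpenNMS tooling.

```python
import calendar
from datetime import date


def second_wednesday(year: int, month: int) -> date:
    """Return the second Wednesday of the given month (hypothetical helper)."""
    # weekday() counts from Monday == 0, so Wednesday == 2.
    first_weekday = date(year, month, 1).weekday()
    # Days from the 1st to the first Wednesday (0 if the 1st is a Wednesday).
    offset = (calendar.WEDNESDAY - first_weekday) % 7
    # The second Wednesday is one week after the first.
    return date(year, month, 1 + offset + 7)


print(second_wednesday(2022, 8))  # prints 2022-08-10
```

The same rule gives July 13th for July 2022, which lines up with the "Release work (July 13)" issue in the resolved list. As the disclaimer says, actual release dates can still slip.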

Next Horizon: 31 (Q4 2022)

The next major Horizon release will be Horizon 31.

Since Horizon 30 was only recently released, there is nothing concrete on the roadmap for Horizon 31 yet.
Stay tuned for details when they come.

Next Meridian: 2023 (Q1 2023)

Meridian 2023 is still reasonably early in its development cycle, but you can expect it to contain, at the very least, the work that's going into Horizon 30.

Disclaimer

Note that this is just based on current plans; dates, features, and releases can change or slip depending on how development goes.

The statements contained herein may contain certain forward-looking statements relating to The OpenNMS Group that are based on the beliefs of the Group’s management as well as assumptions made by and information currently available to the Group’s management. These forward-looking statements are, by their nature, subject to significant risks and uncertainties.

...We apologize for the excessive disclaimers. Those responsible have been sacked.

Mynd you, møøse bites Kan be pretti nasti...

We apologise again for the fault in the disclaimers. Those responsible for sacking the people who have just been sacked have been sacked.

Calendar of Events

SCaLE - Los Angeles, California - July 28th through 31st, 2022

The OpenNMS Group will be an exhibitor and Gold Sponsor at the Southern California Linux Expo in Los Angeles, California, from July 28th through 31st.
Stop by our booth and say hi, and don't miss Jeff Gehlbach's presentation on Sunday, titled Network Visibility: The Heart of Modern Monitoring.

Grace Hopper Celebration - Orlando, FL - September 20th through 23rd

Veena Kannan will be presenting a virtual lightning talk titled "Open Source 101 – Myth Buster Edition" at the Grace Hopper Celebration.

We don't know the day or time of the presentation yet; more details to come.

All Things Open - Raleigh, NC - October 30th through November 2nd, 2022

The OpenNMS Group will be a live stream sponsor for All Things Open, and will have a booth in the exhibition hall.
A bunch of OpenNMS folks will be attending or helping out in the booth, so be sure to say hi!

Open Source Monitoring Conference - Nuremberg, Germany - November 14th through 16th

The OpenNMS Group is a gold sponsor of OSMC this year, and will have a booth as well.
Stop by and say hello!

Until Next Time…

If there’s anything you’d like me to talk about in a future OOH, or you just have a comment or criticism you’d like to share, don’t hesitate to say hi.

- Ben

Resolved Issues Since Last OOH

  • ALEC-113: Temporarily Present Old Situations List
  • ALEC-114: Temporarily Present Old Situations Details
  • ALEC-115: Define a list of items to collect from the user. (Alarms / Nodes)
  • ALEC-125: Workflow to Secure Permission to Collect Data
  • HELM-272: Filter panel does not work for affectedNodeCount
  • HS-46: Skaffold + Karaf: investigate use of Skaffold with Karaf for fast iteration of development
  • HS-75: Periodic drools issue, "query did not return a unique result"
  • HS-79: Initial load test for the POC
  • HS-84: POC demonstrate Kafka vs AMQ continuous flow disk limitation
  • HS-115: Merge Operator Repo into Horizon Stream
  • HS-136: Formalize code standards, guidelines, and documentation requirements
  • HS-147: Update checker failing to get pods correctly
  • HS-172: Migrate the SIMPLE detectors for provisioning
  • HS-173: Add ForeignSource CRUD and REST service
  • HS-174: App Header: Design + Iteration
  • HS-175: Convert REST API server to BFF component
  • HS-183: HttpClient connection to keycloak not closed correctly
  • HS-186: FE - Shared method of mock endpoints
  • HS-187: Backend: Onboard the default Minion
  • HS-189: DevOps: Default Minion deployment
  • HS-194: Frontend: Add a device button and modal
  • HS-200: Frontend: Button and modal to configure PagerDuty API Key
  • HS-204: Dev Environment: Add secrets management / credentials for keycloak mail server
  • HS-205: DevOps: Add secrets management / credentials for keycloak mail server
  • HS-207: Frontend: Device table endpoint integration
  • HS-209: Add GQL to Vue App, Test with Alarms
  • HS-210: BFF use the platform core DTO library
  • HS-221: shared-lib/ build and version management
  • HS-222: Add Rest API/GraphQL API to query Minion nodes
  • HS-225: Minion landing - UX wireframes and design options
  • HS-233: FE - mock local GraphQL server
  • HS-237: FE: GQL error plugin fix for 200 responses with separate status in error msg
  • HS-239: Enable and port forward debug ports in Skaffold dev
  • HS-241: Add Cucumber IT for validating minion end point
  • NMS-10634: event nodeCategoryMembershipChanged should be more verbose
  • NMS-12981: Clearing an alarm brings alarm not found message
  • NMS-13953: Upgrade Kafka components to 3.2.0
  • NMS-14043: Negate search terms in alarms advanced search
  • NMS-14096: Ensure OpenNMS docker images work with Macs with M1 chip
  • NMS-14108: figure out automating/deploying release versions of the elasticsearch drift plugin
  • NMS-14120: Debugging DCB scripts is a pain
  • NMS-14197: Support writing to multiple TSDB in parallel
  • NMS-14233: Add KPIs to datachoices telemetry for Provisiond config items
  • NMS-14279: Tag Netflow v9 packets as Ingress on the INPUT_SNMP ifindex and Egress on the OUTPUT_SNMP ifindex
  • NMS-14302: Upgrade JUnit from version 4 to 5
  • NMS-14331: Grafana Panel Internal Server Error when lasteventid is Null for an Alarm when Using HELM
  • NMS-14379: Topology UI Error when deleting a graphml
  • NMS-14383: Update Releases for elasticsearch-drift-plugin to allow compatible versions for M1
  • NMS-14403: Notification with Destination Path and Group, Interval Delay doesnt show
  • NMS-14410: Scripts invoke sudo even if running as root
  • NMS-14412: SNMP Interface Poller doc updates
  • NMS-14437: Add documentation to describe negate search terms in alarms advanced search
  • NMS-14438: Moving JUnit4 to JUnit5
  • NMS-14457: Mock/Practice release work
  • NMS-14458: Release work (July 13)
  • NMS-14462: Rename integration tests that are currently running as unit tests
  • NMS-14465: Add support for replaying packet captures to telemetryd
  • NMS-14467: Prefer ingressPhysicalInterface over INPUT_SNMP when processing flows
  • NMS-14468: Add script to manipulate flows
  • NMS-14469: Kafka metrics producer considers zero values optional
  • NMS-14478: External Requisition UI: foreignSource not set for VMware requisition
  • NMS-14479: Simplify BridgeSimpleConnection Class
  • NMS-14482: Add KPIs for CPU count and memory size to datachoices telemetry
  • NMS-14488: Add KPIs for notification entities to datachoices telemetry
  • NMS-14489: Add KPI for list of enabled service daemons to datachoices telemetry
  • NMS-14524: Pollerd take a long time to start on systems with large inventories
  • NMS-14528: Update documentation for policy matching
  • NMS-14537: Add option to not store DCB script output
  • NMS-14540: Move BridgeDiscovery to new project Enlinkd Adapters Discovers Bridge