Support for OpenTelemetry & Prometheus metrics #696

Open
Lucas-C opened this issue Jan 15, 2025 · 6 comments

Lucas-C commented Jan 15, 2025

Hi,

OpenTelemetry & Prometheus metrics were implemented last year in MOTIS 1 in PR #541.

This issue is meant to plan and track the work needed to bring those features back to MOTIS 2.

End goal

Integration in the MOTIS code base

  • trace the following metrics:
    • number of routing requests (nigiri + intermodal)
    • routing time histograms
    • routing request statistics, e.g. number of via stations
  • integrate tracing with the utl library, in utl::verify & utl::fail. This could also be an opportunity to add stack traces with boost::stacktrace at those places (at least for utl::verify failures), so we have more information about what is happening.
  • have one logging function in utl that everything inside MOTIS (including osr, nigiri, etc.) uses.
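
To make the "routing time histograms" item more concrete, here is a minimal sketch of a fixed-bucket histogram with Prometheus-style cumulative ("le") buckets. It is not MOTIS code: the type `routing_time_histogram`, its members and the bucket bounds are assumptions for illustration, and a real implementation would more likely reuse the histogram type of the chosen metrics library and would need to be thread-safe.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical fixed-bucket histogram for routing times (not MOTIS code).
// Slot i counts observations that fall into bucket i; the last slot is the
// overflow (+Inf) bucket. Not thread-safe.
struct routing_time_histogram {
  explicit routing_time_histogram(std::vector<double> upper_bounds_ms)
      : bounds_{std::move(upper_bounds_ms)},
        counts_(bounds_.size() + 1U, 0U) {
    assert(std::is_sorted(bounds_.begin(), bounds_.end()));
  }

  void observe(double const ms) {
    // First bound that is >= ms; the end iterator maps to the overflow slot.
    auto const it = std::lower_bound(bounds_.begin(), bounds_.end(), ms);
    ++counts_[static_cast<std::size_t>(it - bounds_.begin())];
    sum_ms_ += ms;
    ++n_;
  }

  // Number of observations <= bounds_[i] (i == bounds_.size() means +Inf).
  std::uint64_t cumulative(std::size_t const i) const {
    auto c = std::uint64_t{0U};
    for (auto j = std::size_t{0U}; j <= i; ++j) {
      c += counts_[j];
    }
    return c;
  }

  std::vector<double> bounds_;
  std::vector<std::uint64_t> counts_;
  double sum_ms_{0.0};
  std::uint64_t n_{0U};
};
```

A routing handler could call `observe()` with the elapsed milliseconds of each request; the cumulative counts, the sum and the total count are the values a Prometheus histogram reports.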

OpenTelemetry

  • export via OTLP over HTTP to a collector at localhost:4318 by default (4318 is the standard OTLP/HTTP port; 4317 is the OTLP/gRPC one)
  • allow configuration to be provided through config.yml

Prometheus

  • expose a /metrics endpoint
  • allow configuration to be provided through config.yml
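
For the /metrics endpoint, the response body follows the Prometheus text exposition format. The sketch below renders plain counters in that format using only the standard library. The `counter` struct, the `render_metrics` function and the metric name used in the usage note are hypothetical; in practice a library such as prometheus-cpp would probably do the serialization.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical counter record: help text plus current value (not MOTIS code).
struct counter {
  std::string help_;
  std::uint64_t value_{0U};
};

// Renders all counters in the Prometheus text exposition format:
//   # HELP <name> <help>
//   # TYPE <name> counter
//   <name> <value>
std::string render_metrics(std::map<std::string, counter> const& counters) {
  std::ostringstream out;
  for (auto const& [name, c] : counters) {
    out << "# HELP " << name << ' ' << c.help_ << '\n'
        << "# TYPE " << name << " counter\n"
        << name << ' ' << c.value_ << '\n';
  }
  return out.str();
}
```

An HTTP handler for /metrics could return the string produced by `render_metrics` for a map holding, say, a `motis_routing_requests_total` entry (an assumed name).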

@felixguendling: is this plan OK with you?
Do you see other angles that we should consider beforehand?

I think this could be implemented across several PRs, do you agree?

@felixguendling
Member

It would also be nice to have a shorter way of expressing something like this:
https://github.com/motis-project/nigiri/blob/0861ed983e0b84062f49e82c416b2e4aa000e972/src/routing/raptor_search.cc#L111-L139

  span->SetAttribute("nigiri.query.start_match_mode",
                     location_match_mode_str(q.start_match_mode_));
  span->SetAttribute("nigiri.query.destination_match_mode",
                     location_match_mode_str(q.dest_match_mode_));

// shorter

  set_attr(span, {
      {"nigiri.query.start_match_mode",
       location_match_mode_str(q.start_match_mode_)}, 
      {"nigiri.query.destination_match_mode",
       location_match_mode_str(q.dest_match_mode_)}});
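
One possible shape for such a `set_attr` helper is sketched below. It is only an illustration under a few assumptions: it handles string-like values only, it is templated on the span handle so it compiles without OpenTelemetry, and `recording_span` is a hypothetical stand-in used to show the behaviour. With a real OpenTelemetry span, the key and value may need converting to the library's own string view and attribute value types.

```cpp
#include <initializer_list>
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Sketch of the proposed helper: forwards each key/value pair to
// span->SetAttribute(key, value). SpanPtr is any pointer-like handle.
template <typename SpanPtr>
void set_attr(
    SpanPtr const& span,
    std::initializer_list<std::pair<std::string_view, std::string_view>>
        attrs) {
  for (auto const& [key, value] : attrs) {
    span->SetAttribute(key, value);
  }
}

// Hypothetical stand-in for a tracing span; it only records what was set.
struct recording_span {
  void SetAttribute(std::string_view key, std::string_view value) {
    attrs_.emplace_back(std::string{key}, std::string{value});
  }
  std::vector<std::pair<std::string, std::string>> attrs_;
};
```

Taking an `initializer_list` of pairs keeps the call site close to the shorter form shown above; supporting numeric or boolean attributes would need a variant-like value type instead of `std::string_view`.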

@felixguendling
Member

Regarding metrics, IMO a very important point is also the collection of real-time metrics per real-time endpoint.

This has been done here:

metrics_->total_entities_.Increment(stats.total_entities_);
metrics_->total_entities_success_.Increment(stats.total_entities_success_);
metrics_->total_entities_fail_.Increment(stats.total_entities_fail_);
metrics_->unsupported_deleted_.Increment(stats.unsupported_deleted_);
metrics_->unsupported_vehicle_.Increment(stats.unsupported_vehicle_);
metrics_->unsupported_alert_.Increment(stats.unsupported_alert_);
metrics_->unsupported_no_trip_id_.Increment(stats.unsupported_no_trip_id_);
metrics_->no_trip_update_.Increment(stats.no_trip_update_);
metrics_->trip_update_without_trip_.Increment(
    stats.trip_update_without_trip_);
metrics_->trip_resolve_error_.Increment(stats.trip_resolve_error_);
metrics_->unsupported_schedule_relationship_.Increment(
    stats.unsupported_schedule_relationship_);
metrics_->feed_timestamp_.Set(
    static_cast<double>(stats.feed_timestamp_.time_since_epoch().count()));
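
To illustrate what "per real-time endpoint" could mean in practice, the sketch below keeps separate totals for each feed, keyed by a tag. The `rt_stats` subset, the `rt_metrics` type and the tag strings are assumptions and not the existing MOTIS types; with a metrics library this would more likely be expressed as counters carrying an endpoint label.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical subset of the per-update statistics shown above.
struct rt_stats {
  std::uint64_t total_entities_{0U};
  std::uint64_t total_entities_success_{0U};
  std::uint64_t total_entities_fail_{0U};
};

// Hypothetical per-endpoint totals, keyed by a tag identifying each feed.
struct rt_metrics {
  void add(std::string const& endpoint, rt_stats const& s) {
    auto& t = totals_[endpoint];  // zero-initialized on first use
    t.total_entities_ += s.total_entities_;
    t.total_entities_success_ += s.total_entities_success_;
    t.total_entities_fail_ += s.total_entities_fail_;
  }

  std::map<std::string, rt_stats> totals_;
};
```

Each real-time update would call `add()` with the tag of the feed it came from, so a failing feed shows up on its own instead of being hidden in a global total.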

I would also monitor metrics not only for the routing endpoint but possibly for every endpoint.

But as you said: one small PR after another. Not everything has to be done at once. For now, it would be useful to just set up the basics. Telemetry can be added here and there as needed once we have the basics.

@Lucas-C
Author

Lucas-C commented Jan 15, 2025

Thank you for your answers! 👍

For reference, I'm working on PR motis-project/utl#25 to introduce:

one logging function in utl that everything inside MOTIS (including osr, nigiri, etc.) uses.

@felixguendling
Member

Maybe it's better to split the Doxygen part and logging. So could you please create another PR just for logging and we'll merge Doxygen support in utl/#25?

@Lucas-C
Author

Lucas-C commented Jan 16, 2025

For reference, this other PR in https://github.com/motis-project/utl finalizes the "unified logging API": motis-project/utl#27.

Feedback on it is welcome 🙂

@Lucas-C
Author

Lucas-C commented Jan 17, 2025

I opened issue motis-project/utl#28 to track further improvements to the logging system.
