Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

trace sdk: add samplers #355

Merged
merged 1 commit into from
Dec 16, 2023
Merged

Conversation

iRevive
Copy link
Contributor

@iRevive iRevive commented Nov 5, 2023

Reference Link
Spec https://opentelemetry.io/docs/specs/otel/trace/sdk/#sampler
Java implementation Sampler.java
ParentBasedSampler.java
TraceIdRatioBasedSampler.java
SamplingResult.java

I started decoupling #325 into smaller changes to simplify the review process.

* the desired ratio of sampling. Must be >= 0 and <= 1.0.
*/
def create(ratio: Double): Sampler = {
require(ratio >= 0 && ratio <= 1.0, "ratio must be >= 0 and <= 1.0")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use Either instead?

def create(ratio: Double): Either[IllegalArgumentException, Sampler] =

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opine that we should not—it makes the API much more annoying to use if you'r using it correctly. at some point, the user does just need to read the docs

isValid = alwaysValid || isValid
)

private case class SpanContextImpl(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A temporary solution. It can be removed once the SpanContext is updated.

@iRevive iRevive force-pushed the sdk-trace/samplers branch from 5de90d6 to ff70e75 Compare November 5, 2023 13:33
@iRevive iRevive added tracing Improvements to tracing module module:sdk Features and improvements to the sdk module labels Nov 6, 2023
NthPortal
NthPortal previously approved these changes Nov 8, 2023
@NthPortal NthPortal dismissed their stale review November 8, 2023 19:34

got tricked by a link into only reviewing a single commit

@iRevive iRevive marked this pull request as draft November 8, 2023 20:00
* cleared, so samplers should normally return the passed-in trace state if
* they do not intend to change it.
*/
def traceStateUpdater: SamplingResult.TraceStateUpdater
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Java SDK uses a default method in the interface: https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk/trace/src/main/java/io/opentelemetry/sdk/trace/samplers/SamplingResult.java#L134

Unfortunately, this approach does not work if we want to keep the trait sealed.

I've added TraceStateUpdater instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, this approach does not work if we want to keep the trait sealed.

Dumb question: why? I'm having trouble seeing the difference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the spec, the SamplingResult could be defined as:

sealed trait SamplingResult {
  def decision: SamplingDecision
  def attributes: Attributes
  def updatedTraceState: TraceState
}

But with this design, we cannot use constant sampling results (e.g. SamplingResult.Drop) anymore, because we always need to return the parent's trace state. So, we need to create a new sampling result on every call of Sampler.shouldSample.

Since only a few samplers in the contrib module (e.g. ConsistentSampler) provide a custom implementation of the getUpdatedTraceState Java SDK team decided to go their way, I assume.


We can still use a method, but encoding will be cumbersome:

sealed trait SamplingResult {
  ...
  def updateTraceState(state: TraceState): TraceState
}

def apply(
    decision: SamplingDecision,
    attributes: Attributes,
    updateState: TraceState => TraceState
): SamplingResult = ...

When we use a function instead of a TraceStateUpdater ADT, it's more complex to:

  1. Implement a proper hashing
  2. Informative toString
  3. Could be less intuitive

@iRevive iRevive marked this pull request as ready for review November 8, 2023 20:31
@iRevive iRevive force-pushed the sdk-trace/samplers branch 5 times, most recently from 3b37153 to 16bf4a9 Compare November 16, 2023 10:40
@iRevive iRevive requested a review from NthPortal November 16, 2023 10:40
@iRevive
Copy link
Contributor Author

iRevive commented Nov 16, 2023

I've updated the PR (merged changes from the main, primarily). May I ask for yet another review, please?
@armanbilge @NthPortal

val idUpperBound =
if (ratio == 0.0) Long.MinValue
else if (ratio == 1.0) Long.MaxValue
else (ratio * Long.MaxValue).toLong
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this correct? ratio == 0.5 should yield 0 not Long.MaxValue/2, shouldn't it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm struggling very hard to figure out how to do the maths without resorting to BigDecimal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The input ratio must be between 0 and 1.0. 0.1 means we sample only 10% of the incoming spans, and 1.0 means 100%.

The ratio-based sampler should be deterministic, so under the hood, we extract a long value from the trace id: https://github.com/typelevel/otel4s/pull/355/files#diff-fc234a9eef7868569981b776843221255b54d7835308e46da41ab99eb4bc315eR48.

The logic is the following: the span is marked as 'sampled' if the extracted long value is lower than the upper bound limit (ratio * Long.MaxValue).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me try to clarify.

the Long value is used to represent a sequence of bits and is in that sense "unsigned", but when doing maths on it, it's signed. so it can have any value from Long.MinValue to Long.MaxValue. even if the input ratio is 0.0, the current implementation will still sample any values in Long.MinValue until 0L, which is 50% not 0%

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The math.abs is used here, so the distribution of values from 0 to Long.MaxValue.

https://github.com/typelevel/otel4s/pull/355/files#diff-fc234a9eef7868569981b776843221255b54d7835308e46da41ab99eb4bc315eR54

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, I see. oops

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

edge case: math.abs(Long.MinValue) == Long.MinValue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid point. If I understand correctly, this rare case shouldn't affect the sampling ratio much.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except it will violate the expectations of a ratio of 0.0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, I didn't think about it that way. I made an issue for myself #404

@iRevive iRevive merged commit 6672e4d into typelevel:main Dec 16, 2023
10 checks passed
@iRevive iRevive deleted the sdk-trace/samplers branch December 16, 2023 11:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module:sdk Features and improvements to the sdk module tracing Improvements to tracing module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants