Add pick generator specialized for indexed sequences #874
@lambdani just to clarify: is the suggested behavior different from this one?

    def shuffledPick[T](n: Int, seq: IndexedSeq[T]): Gen[collection.Seq[T]] = {
      Arbitrary.arbitrary[Long].flatMap { seed =>
        val shuffledSeq = new scala.util.Random(seed).shuffle(seq)
        Gen.pick(n, shuffledSeq)
      }
    }

I mean, I realize it works in a different way, but I'm not sure whether there's a difference in the results the two provide.
No, it should have the same behavior. The only difference should be asymptotic efficiency (O(k log k) vs O(n)). But I don't know if it's faster in practice for enough use cases, and even then if it's worth the added complexity. I could try to write some benchmarks, but it's OK if you think it's not worth it :-).
Added tests to check red-black tree invariants.
Thanks! A quick benchmark would be interesting to see approximately what size collection is the break-even point.
     *
     * The elements are guaranteed to be permuted in random order.
     */
    def indexedPick[T](n: Int, l: IndexedSeq[T]): Gen[collection.Seq[T]] = {
Names that sort close to their relatives improve discoverability: how about pickIndexed?
    /** A generator that randomly picks a given number of elements from an IndexedSeq
     *
     * The elements are guaranteed to be permuted in random order.
     */
A quick comment on the runtime improvement over pick would be helpful. Perhaps also that it doesn't repeat elements.
Hi! Would there be interest in a pick generator specialized for IndexedSeqs? When choosing k elements from a sequence with n elements, the idea is to choose an index in the inclusive range [0,n-1], then another one in [0,n-2], and so on down to [0,n-k]. These indices must then be translated to the whole range [0,n-1] while avoiding repetitions. For this, one can use a modified version of an order statistic tree that selects the i-th non-negative integer not present in the tree.
This should pick k elements in O(k log k) time, using O(k) extra space for the tree. Additionally, the elements should be permuted in random order.
The names are horrible but I couldn't come up with better ones. Any help with that would be appreciated if you think it's worth adding this generator to ScalaCheck. What do you think?
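The index translation described above can be sketched as follows. This is only an illustration, not the code in this PR: the names `IndexedPickSketch`, `ithMissing`, and `pickIndices` are hypothetical, and a sorted `Vector` stands in for the order statistic tree. The lookup is a binary search, but inserting into the `Vector` is linear, so this sketch does not reach the O(k log k) bound; a balanced tree with subtree sizes would be needed for that.

```scala
import scala.util.Random

object IndexedPickSketch {

  /** The i-th (0-based) non-negative integer that is absent from `taken`.
    * Assumes `taken` is sorted ascending and contains no duplicates.
    */
  def ithMissing(i: Int, taken: Vector[Int]): Int = {
    // taken(j) - j is the number of absent integers below taken(j),
    // and it is non-decreasing in j, so we can binary search on it.
    var lo = 0
    var hi = taken.length
    while (lo < hi) {
      val mid = (lo + hi) >>> 1
      if (taken(mid) - mid <= i) lo = mid + 1 else hi = mid
    }
    // lo is the number of taken values smaller than the answer.
    i + lo
  }

  /** Picks k distinct indices from [0, n-1], in random order. */
  def pickIndices(k: Int, n: Int, rnd: Random): Vector[Int] = {
    require(0 <= k && k <= n, "k must be in [0, n]")
    var taken = Vector.empty[Int] // indices chosen so far, kept sorted
    val out = Vector.newBuilder[Int]
    for (step <- 0 until k) {
      val i = rnd.nextInt(n - step)  // uniform in [0, n-step-1]
      val idx = ithMissing(i, taken) // translate to an unused index
      out += idx
      // idx - i values in `taken` are smaller than idx: insert after them.
      taken = taken.patch(idx - i, List(idx), 0)
    }
    out.result()
  }
}
```

For example, `IndexedPickSketch.pickIndices(3, 10, new Random(1L)).map(seq)` would select three distinct elements of a ten-element `seq`; inside a generator the random source would come from ScalaCheck's seed rather than `scala.util.Random`.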