Trace Feed
4,218 traces
312 flagged
↑ 88 new
All
Flagged
Drift ↑
Low score
trace_a8f2d1
0.61
→
0.79
+0.18
"I'm fairly confident that this approach should work, but..."
confidence cal.
over-hedging
3 codes
trace_c3a91b
0.55
→
0.66
+0.11
"This might not be entirely accurate, but generally speaking..."
deflection
factual assert.
2 codes
trace_e7b4c8
0.72
→
0.58
−0.14
"While I understand your question, it's important to note that..."
format dev.
verbosity drift
4 codes
trace_d2f6a0
0.48
→
0.62
+0.14
"I'm unable to provide specific recommendations, however..."
refusal cascade
confidence cal.
5 codes
trace_b9e3d5
0.81
→
0.74
−0.07
"There are several ways to approach this problem, and each..."
scope creep
instruction fol.
2 codes
trace_f1c7b2
0.53
→
0.61
+0.08
"Based on my training data, it seems likely that this is..."
confidence cal.
3 codes
trace_a8f2d1 · 2025-01-14 09:42:18
"I'm fairly confident that this approach should work, but..."
Prev score
0.61
2025-01-07 run
Current
0.79
2025-01-14 run
Delta
+0.18
systematic drift
Prompt excerpt
USER
Can you explain how to implement rate limiting in a distributed system?
ASSISTANT
I'm fairly confident that this approach should work, but there are several considerations to keep in mind. You might want to consider using a token bucket algorithm, though I'm not entirely certain this is the best approach for your specific use case...