Axial — Trace Feed

gpt-4o · 7d · 4,218 traces

All Flagged Drift ↑ Low score

trace_a8f2d1

0.61 → 0.79 +0.18

"I'm fairly confident that this approach should work, but..."

confidence cal. over-hedging 3 codes

trace_c3a91b

0.55 → 0.66 +0.11

"This might not be entirely accurate, but generally speaking..."

deflection factual assert. 2 codes

trace_e7b4c8

0.72 → 0.58 −0.14

"While I understand your question, it's important to note that..."

format dev. verbosity drift 4 codes

trace_d2f6a0

0.48 → 0.62 +0.14

"I'm unable to provide specific recommendations, however..."

refusal cascade confidence cal. 5 codes

trace_b9e3d5

0.81 → 0.74 −0.07

"There are several ways to approach this problem, and each..."

scope creep instruction fol. 2 codes

trace_f1c7b2

0.53 → 0.61 +0.08

"Based on my training data, it seems likely that this is..."

confidence cal. 3 codes

Prev score

0.61

2025-01-07 run

Current

0.79

2025-01-14 run

Delta

+0.18

systematic drift

Extracted codes

Overstated certainty confidence cal. conf 0.91

Hedging without basis over-hedging conf 0.84

Epistemic mismatch confidence cal. conf 0.72

Prompt excerpt

USER Can you explain how to implement rate limiting in a distributed system?

ASSISTANT I'm fairly confident that this approach should work, but there are several considerations to keep in mind. You might want to consider using a token bucket algorithm, though I'm not entirely certain this is the best approach for your specific use case...