PR #1105 · March 2026

Autopsy of a SOTA Parameter-Golf GPT, Round 2

The PR #1019 autopsy said MLP was 4× more sensitive to quantization than attention. So we expanded MLP from 3× to 3.5× and switched to mixed int5/int6. Result: 1.1086 BPB, −0.005 over the previous SOTA. The tradeoff: calibration ECE jumped from 0.24% to 1.26% — mixed quantization introduced systematic overconfidence across 99.6% of tokens.

Sliding BPB: 1.1086
Parameters: 29.95M
vs PR #1019: −0.005 BPB
Artifact: 14.52 MB
Hardware: 8× H100 SXM


MLP Still Dominates — But the Gap Narrowed

Per-matrix int6 sensitivity, same methodology as PR #1019

Same test: quantize each matrix individually to int6, keep everything else at full precision. MLP_down accounts for 1,302 × 10⁻⁶ total sensitivity. All four attention matrix types combined: 599 × 10⁻⁶. The ratio dropped from 4× to 2.2× — the wider MLP spreads quantization damage across more parameters, making each individual weight less critical.

Each cell: quantize only that one matrix to int6, measure BPB degradation. Baseline: 1.150966 BPB. Full int6: +2.3×10⁻³.
All 6 matrices of each layer quantized to int6. Decoder layers 9–10 are most sensitive.
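
The per-matrix test can be sketched as follows. This is a minimal illustration, assuming symmetric per-row round-to-nearest quantization; the PR's actual quantizer is not described here, and `fake_quantize`, `per_matrix_sensitivity`, and the `eval_bpb` callable are hypothetical names standing in for the real evaluation harness.

```python
import numpy as np

def fake_quantize(w, bits):
    """Symmetric per-row round-to-nearest quantization, returned as floats."""
    qmax = 2 ** (bits - 1) - 1                   # 31 for int6, 15 for int5
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)     # guard all-zero rows
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale

def per_matrix_sensitivity(weights, eval_bpb, bits=6):
    """Quantize one matrix at a time, leaving the rest at full precision.

    weights:  dict mapping matrix name -> 2-D array
    eval_bpb: hypothetical callable mapping such a dict to a BPB score
    Returns a dict mapping name -> BPB degradation vs. the unquantized model.
    """
    baseline = eval_bpb(weights)
    deltas = {}
    for name, w in weights.items():
        trial = dict(weights)                    # shallow copy, swap one entry
        trial[name] = fake_quantize(w, bits)
        deltas[name] = eval_bpb(trial) - baseline
    return deltas
```

Summing the returned deltas by matrix type would give per-type totals comparable to the MLP_down and attention figures quoted above, provided the same evaluation set is used for every trial.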

Key Insight

Layer 9 is now the most sensitive (+403 × 10⁻⁶), overtaking Layer 10 from PR #1019. Full-model int6 penalty: +0.0023 BPB, down from +0.0083 in PR #1019. The wider MLP and mixed quantization strategy are working — less total damage per bit removed.

Mixed Precision: Where the Extra Bit Matters

Per-matrix int5 vs int6 — the knapsack that makes 3.5× MLP fit in 16 MB

Uniform int6 can't fit 29.95M parameters under 16 MB. The fix: default to int5 and selectively promote matrices where the extra bit pays for itself. All of the top-10 upgrade benefits belong to MLP_down matrices. L9 MLP_down gains 1,167 × 10⁻⁶ BPB from one extra bit. L10 MLP_down: 1,066 × 10⁻⁶. The first non-MLP_down entry is L9 V at rank #11. The efficiency ranking (BPB gain per bit spent) feeds the knapsack that decides the final allocation.

Upgrade benefit: how much BPB improves when switching a single matrix from int5 to int6. Baseline: 1.150966 BPB. Darker = more benefit from the extra bit.
Efficiency ranking: BPB improvement per additional bit spent. Legend: MLP, Attention. The knapsack uses this ranking to allocate the bit budget.
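
One way to picture the allocation is a greedy pass over the efficiency ranking. The sketch below is an assumption, not the PR's solver: it is a heuristic rather than an exact knapsack, and `Matrix`, `allocate_bits`, the parameter counts, and the budget are hypothetical. Only the benefit figures in the usage note come from the text above.

```python
from dataclasses import dataclass

@dataclass
class Matrix:
    name: str
    n_params: int     # hypothetical size; promoting costs one extra bit per parameter
    benefit: float    # BPB gain from int5 -> int6, in units of 1e-6

def allocate_bits(matrices, extra_bit_budget):
    """Start every matrix at int5, then promote to int6 in order of benefit per bit."""
    bits = {m.name: 5 for m in matrices}
    ranked = sorted(matrices, key=lambda m: m.benefit / m.n_params, reverse=True)
    remaining = extra_bit_budget
    for m in ranked:
        if m.benefit <= 0:
            break                     # everything after this point cannot help
        if m.n_params <= remaining:
            bits[m.name] = 6
            remaining -= m.n_params
    return bits
```

With L9 MLP_down (1,167), L10 MLP_down (1,066) and L3 V (−118) as inputs and a budget that covers only one MLP_down matrix, this promotes L9 MLP_down and leaves the other two at int5; the negative-benefit matrix is never promoted regardless of budget.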

Key Insight

Across all 66 matrices: MLP_down accounts for 6,058 × 10⁻⁶ of total upgrade benefit, 70% of the total. MLP_up adds 1,541 × 10⁻⁶ (18%). All attention matrices combined: 1,065 × 10⁻⁶ (12%). 11 matrices show negative upgrade benefits (int5 actually outperforms int6), concentrated in V and K matrices at layers 1–3 and 7–8. L3 V is the most negative at −118 × 10⁻⁶. At these magnitudes this is likely eval noise, but either way promoting those matrices would spend bits for no measurable BPB gain.

Condition Numbers: Still a Red Herring

Q up to 30,759, Out up to 25,556 — MLP still hurts more

Q's condition number dropped from 54,000 to 30,759 with the wider MLP, but the story is the same: high condition number does not predict quantization sensitivity. MLP matrices have condition numbers of 4–13 and dominate the damage. Stable rank remains the real predictor — MLP utilizes 95% of its rank capacity vs 73% for Q.

Higher condition number = expected to be more fragile. Q matrices top the chart — yet are least sensitive to quantization.
Bottom = high readability improvement. Right = high condition number. Top-right = encoder layers preparing skip features.
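
The contrast between the two metrics is easy to reproduce. The sketch below uses the standard definitions (condition number as the ratio of largest to smallest singular value, stable rank as squared Frobenius norm over squared spectral norm); whether the PR computed them exactly this way is an assumption, and `spectrum_stats` is a hypothetical helper name.

```python
import numpy as np

def spectrum_stats(w):
    """Return (condition number, stable rank) of a 2-D weight matrix."""
    s = np.linalg.svd(w, compute_uv=False)            # singular values, descending
    cond = s[0] / s[-1]                               # looks only at the two extremes
    stable_rank = float((s ** 2).sum() / s[0] ** 2)   # ||W||_F^2 / ||W||_2^2
    return cond, stable_rank
```

A matrix with one dominant direction and a tail of tiny singular values has a huge condition number but a stable rank near 1, while a matrix with a flat spectrum has a condition number near 1 and a stable rank near its full dimension. Dividing stable rank by the full rank gives a utilization ratio of the kind quoted above.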

Key Insight

MLP rank utilization: 95.2% (up from 94.4% in PR #1019). At 3.5× expansion the MLP is nearly fully packed. Further expansion would add parameters that actually get used. Q utilization at 73.3% means attention still has headroom — it's not parameter-starved.

Stable Rank Explains the Gap

Why low condition numbers hurt more than high ones

The singular value curves show the mechanism: Q concentrates energy in ~12% of its channels. The other 88% carry near-zero signal, so quantization errors there are harmless. MLP uses ~35% of its channels — 3× more active capacity, 3× more ways to accumulate rounding damage. Same pattern as PR #1019, now with wider matrices.

Each line is one layer. Steep decay = low effective rank. Flat = capacity fully used.

Key Insight

The 3.5× MLP increased MLP stable rank from ~160 to ~184 on average. More channels are active, but the utilization ratio (stable rank / full rank) held steady at ~95%. The expansion added capacity that the model actually uses — not dead parameters.

Layer 7 Still Does Most of the Work

Projecting each layer's residual stream through the unembedding matrix

Same logit lens methodology. Layer 7 contributes −3.82 bits/token (vs −4.35 in PR #1019). Layers 3–5 still show increased loss from skip connection reorganization. The pattern is structurally identical to PR #1019 despite the wider MLP and different quantization.

Legend: Loss (nats) · Top-1 accuracy (scaled) · Readability dropped
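
A minimal logit-lens sketch, under simplifying assumptions: the residual stream after each layer is projected directly through the unembedding matrix, skipping the final normalization a real model would typically apply first. The function names and array layout are hypothetical.

```python
import numpy as np

def logit_lens_bits(hidden, w_unembed, targets):
    """Per-layer mean cross-entropy in bits/token.

    hidden:    [layers, tokens, d_model] residual stream after each layer
    w_unembed: [d_model, vocab]
    targets:   [tokens] integer ids of the actual next tokens
    """
    logits = hidden @ w_unembed                             # [layers, tokens, vocab]
    logits = logits - logits.max(axis=-1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    tok = np.arange(targets.shape[0])
    nll_nats = -log_probs[:, tok, targets]                  # [layers, tokens]
    return nll_nats.mean(axis=1) / np.log(2)

def layer_contributions(bits_per_layer):
    """Change in readable loss from each layer to the next (negative = layer helps)."""
    return np.diff(bits_per_layer)
```

Under this convention a value such as −3.82 bits/token for a layer means the loss read off the residual stream drops by that much across the layer, and a positive value means the representation became less directly readable.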

Key Insight

Layer 7: −3.82 bits/token. Layer 10: −0.52 bits/token, still the weakest contributor. The readability cost in encoder layers 3–5 is larger here than in PR #1019 (+1.22, +0.99, +2.24 for L3–L5), suggesting the wider MLP is pushing more representational work through the skip connections.

Calibration Degraded: The Cost of Mixed Quantization

ECE = 1.26% — up from 0.24% in PR #1019

ECE jumped 5× from 0.24% to 1.26%. 99.6% of tokens land in overconfident bins — the model consistently believes it's more right than it is. This is a direct consequence of mixed int5/int6: the coarser int5 matrices introduce systematic confidence bias that uniform int6 did not.

Accuracy − confidence per bin. Zero = perfectly calibrated. 99.6% of tokens fall in overconfident bins.
Where loss comes from. Low P(correct) = model assigned little probability to the right answer — these tokens dominate loss.
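
For reference, a sketch of how ECE and the overconfident-token share can be computed. It assumes equal-width confidence bins over the top-1 probability; the binning actually used for the figures above is not stated, and `calibration_report` is a hypothetical name.

```python
import numpy as np

def calibration_report(confidence, correct, n_bins=10):
    """Return (ECE, fraction of tokens in bins where confidence exceeds accuracy).

    confidence: top-1 probability per token
    correct:    1 if the top-1 prediction was right, else 0
    """
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.digitize(confidence, edges[1:-1])    # bin index in 0 .. n_bins-1
    ece, overconfident = 0.0, 0.0
    for b in range(n_bins):
        mask = idx == b
        if not mask.any():
            continue
        weight = mask.mean()                      # share of tokens in this bin
        gap = correct[mask].mean() - confidence[mask].mean()   # accuracy - confidence
        ece += weight * abs(gap)
        if gap < 0:
            overconfident += weight
    return ece, overconfident
```

ECE is the token-weighted mean of the absolute per-bin gap, so a figure like 1.26% says the model's stated confidence is off by about that much on average, and the second return value corresponds to the share of tokens sitting in bins below the zero line.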

Key Insight

The BPB gain from mixed quantization (−0.005) is worth the calibration cost, but this model would benefit from temperature scaling, unlike PR #1019. 70% of total loss still comes from tokens where P(correct) < 5%. The bottleneck is still accuracy, but confidence is now slightly misaligned.
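
Temperature scaling fits a single scalar T on held-out data and divides the logits by it. The sketch below uses a plain grid search rather than a gradient-based fit; the grid range and the function names are assumptions for illustration, and nothing here was run against this model.

```python
import numpy as np

def mean_nll(logits, targets, temperature):
    """Mean negative log-likelihood (nats) of targets under softmax(logits / T)."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

def fit_temperature(logits, targets, grid=None):
    """Pick the scalar T that minimizes held-out NLL."""
    if grid is None:
        grid = np.linspace(0.5, 2.0, 151)        # candidate temperatures, step 0.01
    losses = [mean_nll(logits, targets, t) for t in grid]
    return float(grid[int(np.argmin(losses))])
```

For an overconfident model the fitted T comes out above 1, which flattens the distribution. The top-1 prediction is unchanged by the rescaling, so accuracy stays the same while the probabilities move closer to observed frequencies.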

Further Exploration

What Each Head Learns

Classifying 88 attention heads by function

28 previous-token heads (vs 22 in PR #1019), 1 induction head (down from 2), and 2 positional heads (new). The wider MLP didn't change the fundamental attention patterns — this model still relies on n-gram statistics, not in-context copying.

1 induction, 28 previous-token, 2 positional, 57 other
Higher values indicate stronger A B … A → B copying behavior
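
The two main head scores can be sketched from a single head's attention matrix. These are common definitions and may differ from the ones used for the counts above; the function names are hypothetical, and the classification thresholds are not stated in the source.

```python
import numpy as np

def previous_token_score(attn):
    """attn: [query, key] causal attention weights for one head.

    Mean weight that each query (from position 1 on) places on the token
    immediately before it.
    """
    return float(np.diagonal(attn, offset=-1).mean())

def induction_score(attn, period):
    """Induction score on a sequence built by repeating a random block of
    length `period`.

    For each query in the repeated part, take the weight on the token that
    followed the earlier occurrence of the current token (key = query - period + 1).
    """
    n = attn.shape[0]
    queries = np.arange(period, n)
    return float(attn[queries, queries - period + 1].mean())
```

A head that copies in the A B … A → B pattern scores near 1 on the second function, so a score of 0.020 indicates the behavior is barely present.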

Key Insight

The single induction head (L2H3, score 0.020) is marginal. 28 previous-token heads split roughly evenly between encoder (15) and decoder (13). The emergence of 2 positional heads in the decoder (L6H2, L6H3) is new. These attend to absolute position, possibly compensating for the larger MLP's representational demands.

Reading the Model's Mind

Token-level loss, top-k predictions at failure points, and generation vs. reality

Individual tokens, failure points, alternative predictions, and free-form generation compared against ground truth.

Loss Heatmap

Light = predicted well (low loss), dark = surprised (high loss).
<s>Insurance Company Declares Living Man Dead George Johannesen is very much alive. Which is why it was so surprising when the Canadian man received a letter addressedTo the Estate of George Johannesen.” Even more surprising is that it came from his insurance company, who should really be on top of such things. Now this wouldnt have been so terrible if Manitoba Public Insurance was giving Johannesens estate a fat check for his passing away. But thats not what happened. Instead the letter was to inform the estate that, since George was dead, his driver license and auto insurance had been cancelled in October. This poses a problem for Johannesen because, being alive, he continues to drive his car.I dont understand how this could have happened, he told the Toronto Sun.For me to be declared dead, someone would have to present a death certificate. For someone to get that, I guess I must have died sometime in October.” Now the 59-year-old worries that he will stop getting his pension and other government benefits. The Manitoba Public Insurance Company says they are trying to resolve the issue. They also claim they werent the ones who determined Johannesen was dead, but cryptically cant reveal the source of the confusion for confidentially reasons. Perhaps a pesky ghost is behind the mix up?<s>Exhausted and euphoric. Those are the words to describe me right now. Six days after boarding the "Good Morning America" Whistle-Stop '08 Tour train, and beginning the adventure of a lifetime, it's over. We just wrapped the last show of our little Odyssey from the Newseum in the nation's capitol, Washington D.C., and for me, it was an appropriate but bittersweet ending to our tale. Appropriate because the point of our tour was to go out and ask real people what was on their minds, to hear straight from them their concerns about our nation. 
By ending in Washington, D.C., we brought their thoughts, problems and hopes to the doorstep of the government -- to the people that can do something about them. But it was also bittersweet because I honestly didn't want the trip to end. I'm not going to lie to you and say that I loved everything about it (3:00 AM wake up calls being the main offender), but I was consistently surprised to find that even in the tougher times, when we had been blearily working for 18 hours straight, something or someone would come along to pick everyone up. From the absolute chaos of pre-show preparations, to the fleeting sparkle of pride in the production team's eyes when a show went just as planned, life on the train was crazy, grueling and complicated, but most of all, fun. Some moments I'll never forget. Like the celebration in Massachusetts after we pulled off what had never been done before -- the first live network television broadcast from a moving train. Or when Diane, Robin and Chris teamed up -- using Rick Klein and me as props -- to convince Sam that he was supposed to share his tiny room on the train with two roomates. Or when Chris put his life on a very secure line at Niagara Falls to dramatically bring the news from the brink of watery doom. Or, my personal favorite moment, when Sam, Chris and two producers played the most ridiculous game of Monopoly I've ever seen for four hours and a few of us, Sam included, cried from laughing so hard. But far more moving than the obvious and endearing camaraderie between the anchors was their care for the American people to whom they bring the news every morning. Never was this more obvious than yesterday, when I accidentally stumbled into an anchors' meeting where they discuss the content of the next day's show and, for some reason, I was allowed to stay. 
As an aspiring journalist myself, I can't express how inspiring it was to listen in on this discussion and know firsthand that whatever goes on the air, it's the fairest, most accurate and most informative report possible. Though they have their fun, when it comes to the news Diane, Robin, Chris and Sam are professionals in every sense of the word. But now I have to go -- have to return to "normal" life, and I don't want to. I have to shave my rail-beard, the result of a production-wide pact to not shave for the duration of the trip. I have to wash some extremely smelly clothes. I also have a feeling that the spontaneous dance parties that erupted on the train will be for some reason looked down upon in the office. These are all reasons to miss dragging myself aboard that cramped studio on rails well before the sun comes up. But maybe we'll be able to do it again sometime. We were told to get back to New York however we wanted. I think I'll take a train.<s>By registering on amnesty.org you can join in on the human rights conversation and ensure your contributions are combined with ours. If you come from a country that doesn't have an office you have the option to become an International member. Here, you will receive emails about human rights campaigns that are targeted to your interests and opportunities to take action for human rights impact. You can also become a volunteer, and lead on activism initiatives in your community. Furthermore, you will have full use of the Amnesty International online communities. We cant do it without you! Where do you live? We have 50 international offices. This information will help us ensure that you receive the appropriate services. If you don't have an office in your country, you can

Top-k Predictions at High-Loss Tokens

186 tokens above the 90th-percentile loss threshold. Sorted by loss (highest first).
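
Each entry reports the loss in nats and in bits, along with P(correct). The relationship between the three is a pair of standard conversions, sketched here with hypothetical helper names:

```python
import math

def nats_to_bits(loss_nats):
    """Convert a cross-entropy loss from nats to bits."""
    return loss_nats / math.log(2)

def prob_correct(loss_nats):
    """Probability the model assigned to the actual token."""
    return math.exp(-loss_nats)
```

For the first entry, 11.765 nats converts to about 16.973 bits, and the implied probability is below 0.001%, which displays as 0.00% at two decimal places.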

...sey from the Newseum in the nation...
Position 634 · Loss: 11.765 nats (16.973 bits) · P(correct): 0.00%
Rank · Predicted · Prob
1 · " Y" · 67.7%
2 · " O" · 8.1%
3 · " E" · 4.3%
4 · " W" · 3.4%
5 · " J" · 2.1%
Actual: "se" (rank >5, prob 0.00%)
... of us, Sam included, cried...
Position 1284 · Loss: 11.236 nats (16.210 bits) · P(correct): 0.00%
Rank · Predicted · Prob
1 · " and" · 61.5%
2 · "," · 17.6%
3 · "'" · 6.5%
4 · " &" · 0.9%
5 · " was" · 0.9%
Actual: " includ" (rank >5, prob 0.00%)
...es to the news Diane, Rob...
Position 1521 · Loss: 10.453 nats (15.081 bits) · P(correct): 0.00%
Rank · Predicted · Prob
1 · "," · 43.7%
2 · " and" · 7.6%
3 · " they" · 3.2%
4 · " that" · 3.2%
5 · " of" · 2.8%
Actual: " D" (rank >5, prob 0.00%)
... back to New York however we wanted...
Position 1764 · Loss: 9.776 nats (14.103 bits) · P(correct): 0.01%
Rank · Predicted · Prob
1 · "," · 11.6%
2 · " in" · 9.1%
3 · " on" · 7.1%
4 · " and" · 7.1%
5 · " C" · 6.2%
Actual: " how" (rank >5, prob 0.01%)
...in with two roomates. Or when Ch...
Position 1166 · Loss: 9.739 nats (14.051 bits) · P(correct): 0.01%
Rank · Predicted · Prob
1 · "m" · 89.2%
2 · "s" · 6.5%
3 · "-" · 1.0%
4 · " m" · 0.4%
5 · "b" · 0.3%
Actual: "ates" (rank >5, prob 0.01%)
... on rails well before the sun...
Position 1722 · Loss: 9.500 nats (13.706 bits) · P(correct): 0.01%
Rank · Predicted · Prob
1 · "." · 32.5%
2 · "," · 17.4%
3 · " and" · 7.2%
4 · " -" · 4.4%
5 · " in" · 3.4%
Actual: " well" (rank >5, prob 0.01%)
... that cramped studio on rail...
Position 1715 · Loss: 9.497 nats (13.701 bits) · P(correct): 0.01%
Rank · Predicted · Prob
1 · " tra" · 32.6%
2 · "," · 9.3%
3 · " t" · 4.4%
4 · " " · 3.4%
5 · " and" · 3.0%
Actual: " stud" (rank >5, prob 0.01%)
...in.<s>By registering on am...
Position 1785 · Loss: 9.441 nats (13.621 bits) · P(correct): 0.01%
Rank · Predicted · Prob
1 · " J" · 9.9%
2 · " M" · 8.7%
3 · " D" · 5.3%
4 · " S" · 5.3%
5 · " R" · 4.7%
Actual: " reg" (rank >5, prob 0.01%)
...For me to be declared dead...
Position 320 · Loss: 9.144 nats (13.193 bits) · P(correct): 0.01%
Rank · Predicted · Prob
1 · " al" · 36.1%
2 · " ab" · 8.1%
3 · " on" · 7.1%
4 · " in" · 7.1%
5 · " de" · 6.3%
Actual: " dec" (rank >5, prob 0.01%)
...haps a pesky ghost...
Position 508 · Loss: 8.973 nats (12.946 bits) · P(correct): 0.01%
Rank · Predicted · Prob
1 · "erson" · 26.0%
2 · "oss" · 8.4%
3 · "ol" · 7.4%
4 · "ot" · 7.4%
5 · "ens" · 5.8%
Actual: "es" (rank >5, prob 0.01%)
...-year-old worries that he will...
Position 380 · Loss: 8.930 nats (12.883 bits) · P(correct): 0.01%
Rank · Predicted · Prob
1 · " J" · 18.6%
2 · " is" · 10.0%
3 · " has" · 7.8%
4 · " man" · 6.0%
5 · " G" · 5.3%
Actual: " wor" (rank >5, prob 0.01%)
... also claim they werent the on...
Position 445 · Loss: 8.766 nats (12.647 bits) · P(correct): 0.02%
Rank · Predicted · Prob
1 · " have" · 24.9%
2 · " are" · 24.9%
3 · " will" · 9.2%
4 · "" · 7.1%
5 · " can" · 3.8%
Actual: " we" (rank >5, prob 0.02%)
...ributions are combined with our...
Position 1824 · Loss: 8.715 nats (12.574 bits) · P(correct): 0.02%
Rank · Predicted · Prob
1 · " pro" · 7.5%
2 · " made" · 5.8%
3 · " v" · 4.0%
4 · " re" · 4.0%
5 · " f" · 4.0%
Actual: " com" (rank >5, prob 0.02%)
... human rights impact. You can also...
Position 1918 · Loss: 8.536 nats (12.315 bits) · P(correct): 0.02%
Rank · Predicted · Prob
1 · " v" · 35.5%
2 · " ab" · 13.1%
3 · "." · 10.2%
4 · " is" · 9.0%
5 · " c" · 7.9%
Actual: " imp" (rank >5, prob 0.02%)
... when a show went just as planned...
Position 992 · Loss: 8.462 nats (12.208 bits) · P(correct): 0.02%
Rank · Predicted · Prob
1 · " on" · 20.5%
2 · " off" · 15.9%
3 · " down" · 14.1%
4 · " out" · 9.7%
5 · " l" · 5.2%
Actual: " just" (rank >5, prob 0.02%)
...p. I have to wash some extre...
Position 1629 · Loss: 8.296 nats (11.968 bits) · P(correct): 0.02%
Rank · Predicted · Prob
1 · " sh" · 8.9%
2 · " be" · 6.9%
3 · " re" · 6.1%
4 · " go" · 4.8%
5 · " st" · 3.3%
Actual: " was" (rank >5, prob 0.02%)
...ie between the anchors was their c...
Position 1329 · Loss: 8.275 nats (11.939 bits) · P(correct): 0.03%
Rank · Predicted · Prob
1 · " two" · 52.2%
2 · " produ" · 7.1%
3 · " c" · 4.3%
4 · " show" · 2.0%
5 · " te" · 2.0%
Actual: " an" (rank >5, prob 0.03%)
... dead, but cryptically can...
Position 471 · Loss: 8.242 nats (11.891 bits) · P(correct): 0.03%
Rank · Predicted · Prob
1 · "all" · 28.9%
2 · "ame" · 17.5%
3 · "ert" · 15.4%
4 · "ur" · 5.7%
5 · "le" · 5.0%
Actual: "ry" (rank >5, prob 0.03%)
...estate a fat check for his pass...
Position 172 · Loss: 8.200 nats (11.829 bits) · P(correct): 0.03%
Rank · Predicted · Prob
1 · "al" · 63.8%
2 · "her" · 23.5%
3 · "ally" · 3.6%
4 · " l" · 1.0%
5 · " b" · 0.7%
Actual: " che" (rank >5, prob 0.03%)
...ance company, who should really be on to...
Position 113 · Loss: 8.109 nats (11.699 bits) · P(correct): 0.03%
Rank · Predicted · Prob
1 · "se" · 15.6%
2 · " had" · 7.4%
3 · " was" · 5.7%
4 · " is" · 5.7%
5 · " re" · 2.7%
Actual: " should" (rank >5, prob 0.03%)
... cryptically cant reveal...
Position 475 · Loss: 8.091 nats (11.673 bits) · P(correct): 0.03%
Rank · Predicted · Prob
1 · " s" · 6.6%
2 · " re" · 6.6%
3 · " t" · 5.8%
4 · " said" · 4.0%
5 · " cl" · 4.0%
Actual: " can" (rank >5, prob 0.03%)
... Living Man Dead George...
Position 19 · Loss: 7.994 nats (11.533 bits) · P(correct): 0.03%
Rank · Predicted · Prob
1 · "age" · 69.1%
2 · "ag" · 17.5%
3 · "u" · 2.7%
4 · "if" · 1.3%
5 · "ual" · 1.0%
Actual: " De" (rank >5, prob 0.03%)
... Chris and two producers played the...
Position 1242 · Loss: 7.928 nats (11.438 bits) · P(correct): 0.04%
Rank · Predicted · Prob
1 · " other" · 27.2%
2 · " ro" · 24.0%
3 · " of" · 18.7%
4 · " fr" · 3.3%
5 · " b" · 1.5%
Actual: " produ" (rank >5, prob 0.04%)
.... From the absolute ch...
Position 945 · Loss: 7.917 nats (11.422 bits) · P(correct): 0.04%
Rank · Predicted · Prob
1 · " m" · 7.9%
2 · " be" · 7.0%
3 · " start" · 4.8%
4 · " first" · 4.2%
5 · " p" · 3.7%
Actual: " ab" (rank >5, prob 0.04%)
... people to whom they bring the news every...
Position 1347 · Loss: 7.896 nats (11.391 bits) · P(correct): 0.04%
Rank · Predicted · Prob
1 · " were" · 21.9%
2 · " had" · 13.3%
3 · " l" · 6.3%
4 · " c" · 3.8%
5 · "'" · 3.0%
Actual: " br" (rank >5, prob 0.04%)
... America" Whistle-St...
Position 575 · Loss: 7.882 nats (11.371 bits) · P(correct): 0.04%
Rank · Predicted · Prob
1 · " fl" · 8.2%
2 · " tra" · 6.3%
3 · " t" · 6.3%
4 · " a" · 4.4%
5 · " c" · 3.4%
Actual: " Wh" (rank >5, prob 0.04%)
...ction-wide pact to not shave...
Position 1609 · Loss: 7.796 nats (11.248 bits) · P(correct): 0.04%
Rank · Predicted · Prob
1 · "ol" · 12.9%
2 · "ub" · 6.1%
3 · "an" · 6.1%
4 · "and" · 6.1%
5 · "res" · 4.8%
Actual: "act" (rank >5, prob 0.04%)
...thing about it (3:00 AM...
Position 851 · Loss: 7.791 nats (11.240 bits) · P(correct): 0.04%
Rank · Predicted · Prob
1 · "and" · 18.9%
2 · "I" · 6.1%
3 · "w" · 5.4%
4 · "th" · 5.4%
5 · "in" · 4.8%
Actual: "3" (rank >5, prob 0.04%)
...ink of watery doom. Or,...
Position 1215 · Loss: 7.752 nats (11.184 bits) · P(correct): 0.04%
Rank · Predicted · Prob
1 · " w" · 6.4%
2 · " e" · 6.4%
3 · " d" · 4.4%
4 · " t" · 4.4%
5 · " h" · 3.9%
Actual: " do" (rank >5, prob 0.04%)
...terday, when I accidentally stumb...
Position 1374 · Loss: 7.697 nats (11.104 bits) · P(correct): 0.05%
Rank · Predicted · Prob
1 · " was" · 18.3%
2 · " w" · 7.6%
3 · " had" · 4.1%
4 · "'" · 4.1%
5 · " sa" · 3.2%
Actual: " acc" (rank >5, prob 0.05%)
...ave my rail-beard,...
Position 1592 · Loss: 7.693 nats (11.099 bits) · P(correct): 0.05%
Rank · Predicted · Prob
1 · "in" · 23.6%
2 · "w" · 18.4%
3 · "z" · 14.3%
4 · "ge" · 14.3%
5 · "g" · 11.2%
Actual: "il" (rank >5, prob 0.05%)
... registering on amnesty.or...
Position 1790 · Loss: 7.675 nats (11.073 bits) · P(correct): 0.05%
Rank · Predicted · Prob
1 · "line" · 24.1%
2 · " the" · 21.2%
3 · " our" · 11.4%
4 · " this" · 8.8%
5 · " A" · 2.9%
Actual: " am" (rank >5, prob 0.05%)
...), but I was consistently sur...
Position 877 · Loss: 7.642 nats (11.025 bits) · P(correct): 0.05%
Rank · Predicted · Prob
1 · "n" · 5.5%
2 · " also" · 4.9%
3 · " s" · 4.9%
4 · " just" · 4.3%
5 · " so" · 3.8%
Actual: " cons" (rank >5, prob 0.05%)
...fidentially reasons. Perh...
Position 498 · Loss: 7.628 nats (11.005 bits) · P(correct): 0.05%
Rank · Predicted · Prob
1 · "le" · 22.2%
2 · "port" · 19.6%
3 · "ce" · 19.6%
4 · "qu" · 10.5%
5 · "ve" · 8.2%
Actual: "as" (rank >5, prob 0.05%)
... obvious than yesterday, when...
Position 1367 · Loss: 7.557 nats (10.903 bits) · P(correct): 0.05%
Rank · Predicted · Prob
1 · " the" · 34.7%
2 · " when" · 11.3%
3 · " that" · 7.8%
4 · " in" · 4.2%
5 · " what" · 2.9%
Actual: " y" (rank >5, prob 0.05%)
... being the main offender),...
Position 868 · Loss: 7.532 nats (10.867 bits) · P(correct): 0.05%
Rank · Predicted · Prob
1 · " re" · 9.0%
2 · " c" · 8.0%
3 · " s" · 6.2%
4 · " e" · 4.3%
5 · " d" · 4.3%
Actual: " of" (rank >5, prob 0.05%)
... had been blearily working for 1...
Position 908 · Loss: 7.478 nats (10.788 bits) · P(correct): 0.06%
Rank · Predicted · Prob
1 · "ly" · 25.9%
2 · "ing" · 17.8%
3 · "he" · 17.8%
4 · "-" · 12.2%
5 · " and" · 6.5%
Actual: "ily" (rank >5, prob 0.06%)
... the result of a production-wide p...
Position 1603 · Loss: 7.398 nats (10.673 bits) · P(correct): 0.06%
Rank · Predicted · Prob
1 · " l" · 6.3%
2 · " m" · 4.9%
3 · " s" · 4.9%
4 · " c" · 3.8%
5 · " t" · 3.8%
Actual: " produ" (rank >5, prob 0.06%)
... appropriate but bittersweet...
Position 669 · Loss: 7.372 nats (10.636 bits) · P(correct): 0.06%
Rank · Predicted · Prob
1 · " time" · 7.3%
2 · " m" · 5.0%
3 · " t" · 5.0%
4 · " re" · 3.4%
5 · " e" · 3.4%
Actual: " but" (rank >5, prob 0.06%)
... boarding the "Good Morning...
Position 564 · Loss: 7.357 nats (10.614 bits) · P(correct): 0.06%
Rank · Predicted · Prob
1 · " pl" · 9.5%
2 · " bus" · 8.4%
3 · " fl" · 5.1%
4 · " a" · 5.1%
5 · " A" · 3.5%
Actual: " "" (rank >5, prob 0.06%)
... be for some reason looked down upon in...
Position 1679 · Loss: 7.342 nats (10.592 bits) · P(correct): 0.06%
Rank · Predicted · Prob
1 · " a" · 7.5%
2 · " un" · 6.6%
3 · " more" · 5.8%
4 · " the" · 5.8%
5 · " not" · 2.8%
Actual: " look" (rank >5, prob 0.06%)
..., Chris and two producers played...
Position 1241 · Loss: 7.338 nats (10.587 bits) · P(correct): 0.07%
Rank · Predicted · Prob
1 · " Ch" · 33.7%
2 · " I" · 26.2%
3 · " R" · 18.0%
4 · " me" · 4.6%
5 · " D" · 2.8%
Actual: " two" (rank >5, prob 0.07%)
... confusion for confidentially re...
Position 492 · Loss: 7.327 nats (10.570 bits) · P(correct): 0.07%
Rank · Predicted · Prob
1 · " the" · 23.4%
2 · " them" · 16.1%
3 · " J" · 14.2%
4 · " G" · 2.8%
5 · " M" · 2.8%
Actual: " con" (rank >5, prob 0.07%)
..., something or someone would come...
Position 926 · Loss: 7.285 nats (10.509 bits) · P(correct): 0.07%
Rank · Predicted · Prob
1 · " was" · 19.0%
2 · " we" · 6.2%
3 · " had" · 5.5%
4 · " that" · 5.5%
5 · " ha" · 4.8%
Actual: " or" (rank >5, prob 0.07%)
...:00 AM wake up calls...
Position 857 · Loss: 7.259 nats (10.472 bits) · P(correct): 0.07%
Rank · Predicted · Prob
1 · ")" · 25.1%
2 · "," · 13.4%
3 · " -" · 10.4%
4 · ")." · 7.2%
5 · " and" · 5.6%
Actual: " w" (rank >5, prob 0.07%)
... life on a very secure line at N...
Position 1182 · Loss: 7.189 nats (10.371 bits) · P(correct): 0.08%
Rank · Predicted · Prob
1 · " dif" · 8.7%
2 · " l" · 6.8%
3 · " b" · 5.3%
4 · " s" · 5.3%
5 · " t" · 5.3%
Actual: " sec" (rank >5, prob 0.08%)
...annesen is very much alive. Wh...
Position 33 · Loss: 7.150 nats (10.316 bits) · P(correct): 0.08%
Rank · Predicted · Prob
1 · " a" · 24.6%
2 · " the" · 16.9%
3 · " de" · 7.1%
4 · " one" · 3.3%
5 · " in" · 3.3%
Actual: " very" (rank >5, prob 0.08%)
... I'll take a train.<s>By...
Position 1779 · Loss: 7.146 nats (10.310 bits) · P(correct): 0.08%
Rank · Predicted · Prob
1 · " b" · 9.1%
2 · " look" · 9.1%
3 · " fe" · 7.1%
4 · " m" · 6.3%
5 · " l" · 5.5%
Actual: " tra" (rank >5, prob 0.08%)
...ed a letter addressedTo...
Position 67 · Loss: 7.049 nats (10.170 bits) · P(correct): 0.09%
Rank · Predicted · Prob
1 · " from" · 57.7%
2 · " s" · 7.8%
3 · " of" · 4.2%
4 · " to" · 2.5%
5 · " in" · 2.2%
Actual: " add" (rank >5, prob 0.09%)
... been blearily working for 18...
Position 909 · Loss: 7.034 nats (10.147 bits) · P(correct): 0.09%
Rank · Predicted · Prob
1 · " s" · 6.2%
2 · " dis" · 4.8%
3 · " d" · 4.3%
4 · " t" · 4.3%
5 · " m" · 3.8%
Actual: " work" (rank >5, prob 0.09%)
... I have to wash some extreme...
Position 1631 · Loss: 7.031 nats (10.143 bits) · P(correct): 0.09%
Rank · Predicted · Prob
1 · " my" · 85.6%
2 · " the" · 2.9%
3 · " off" · 1.1%
4 · " a" · 0.9%
5 · " it" · 0.8%
Actual: " some" (rank >5, prob 0.09%)
... ours. If you come from a count...
Position 1834 · Loss: 7.011 nats (10.115 bits) · P(correct): 0.09%
Rank · Predicted · Prob
1 · " are" · 22.1%
2 · " have" · 15.2%
3 · "'" · 9.2%
4 · "" · 8.1%
5 · " would" · 7.2%
Actual: " com" (rank >5, prob 0.09%)
...amed up -- using Rick Kle...
Position 1120 · Loss: 6.997 nats (10.094 bits) · P(correct): 0.09%
Rank · Predicted · Prob
1 · " and" · 15.4%
2 · " to" · 9.3%
3 · " the" · 9.3%
4 · " a" · 4.4%
5 · " in" · 3.4%
Actual: " us" (rank >5, prob 0.09%)
... now I have to go -- have to ret...
Position 1558 · Loss: 6.990 nats (10.084 bits) · P(correct): 0.09%
Rank · Predicted · Prob
1 · " back" · 22.5%
2 · " to" · 13.7%
3 · " out" · 12.1%
4 · " through" · 8.3%
5 · " into" · 3.5%
Actual: " -" (rank >5, prob 0.09%)
...e chaos of pre-show pre...
Position 954 · Loss: 6.988 nats (10.081 bits) · P(correct): 0.09%
Rank · Predicted · Prob
1 · " the" · 32.9%
2 · " our" · 12.1%
3 · " a" · 4.5%
4 · " W" · 3.1%
5 · " my" · 2.7%
Actual: " pre" (rank >5, prob 0.09%)
....” Now the 59-year-...
Position 373 · Loss: 6.960 nats (10.041 bits) · P(correct): 0.09%
Rank · Predicted · Prob
1 · "est" · 91.9%
2 · "2" · 1.7%
3 · "ve" · 0.9%
4 · "1" · 0.7%
5 · "ide" · 0.7%
Actual: "5" (rank >5, prob 0.09%)
... go out and ask real people what was on...
Position 705 · Loss: 6.950 nats (10.026 bits) · P(correct): 0.10%
Rank · Predicted · Prob
1 · " the" · 16.1%
2 · " for" · 14.2%
3 · " our" · 9.8%
4 · " people" · 5.9%
5 · " a" · 5.2%
Actual: " re" (rank >5, prob 0.10%)
...ging myself aboard that cr...
Position 1707 · Loss: 6.937 nats (10.008 bits) · P(correct): 0.10%
Rank · Predicted · Prob
1 · " to" · 21.0%
2 · " out" · 11.2%
3 · " into" · 8.7%
4 · " in" · 6.0%
5 · " down" · 5.3%
Actual: " ab" (rank >5, prob 0.10%)
... on the train with two roomates. O...
Position 1163 · Loss: 6.934 nats (10.004 bits) · P(correct): 0.10%
Rank · Predicted · Prob
1 · " the" · 21.0%
2 · " his" · 11.3%
3 · " a" · 7.7%
4 · " us" · 4.7%
5 · " her" · 4.1%
Actual: " two" (rank >5, prob 0.10%)
... the news from the brink of watery...
Position 1209 · Loss: 6.928 nats (9.994 bits) · P(correct): 0.10%
Rank · Predicted · Prob
1 · " tra" · 10.0%
2 · " c" · 5.3%
3 · " m" · 3.3%
4 · " p" · 2.9%
5 · " s" · 2.9%
Actual: " br" (rank >5, prob 0.10%)
... was to go out and ask real people what...
Position 703 · Loss: 6.881 nats (9.927 bits) · P(correct): 0.10%
Rank · Predicted · Prob
1 · " see" · 19.6%
2 · " exper" · 7.2%
3 · " en" · 4.4%
4 · " ex" · 4.4%
5 · " w" · 4.4%
Actual: " as" (rank >5, prob 0.10%)
... along to pick everyone up. Fr...
Position 937 · Loss: 6.867 nats (9.907 bits) · P(correct): 0.10%
Rank · Predicted · Prob
1 · " up" · 47.6%
2 · " us" · 37.1%
3 · " me" · 3.9%
4 · " our" · 1.4%
5 · " it" · 1.1%
Actual: " every" (rank >5, prob 0.10%)
...urance Company Declares L...
Position 9 · Loss: 6.864 nats (9.902 bits) · P(correct): 0.10%
Rank · Predicted · Prob
1 · " of" · 13.7%
2 · "," · 7.3%
3 · " (" · 5.7%
4 · " In" · 5.0%
5 · ":" · 3.5%
Actual: " De" (rank >5, prob 0.10%)
... be on top of such things. N...
Position 121 · Loss: 6.852 nats (9.885 bits) · P(correct): 0.11%
Rank · Predicted · Prob
1 · " the" · 33.2%
2 · " their" · 12.2%
3 · " his" · 9.5%
4 · " that" · 6.5%
5 · " it" · 5.8%
Actual: " su" (rank >5, prob 0.11%)
... bring the news every morning. Ne...
Position 1352 · Loss: 6.850 nats (9.883 bits) · P(correct): 0.11%
Rank · Predicted · Prob
1 · "." · 48.4%
2 · " to" · 12.2%
3 · "," · 7.4%
4 · " and" · 7.4%
5 · " from" · 5.8%
Actual: " every" (rank >5, prob 0.11%)
...hind the mix up?<s>Exh...
Position 521 · Loss: 6.841 nats (9.869 bits) · P(correct): 0.11%
Rank · Predicted · Prob
1 · "." · 29.6%
2 · " of" · 26.2%
3 · "ed" · 10.9%
4 · "?" · 8.5%
5 · "," · 7.5%
Actual: " up" (rank >5, prob 0.11%)
... straight from them their concerns...
Position 724 · Loss: 6.836 nats (9.863 bits) · P(correct): 0.11%
Rank · Predicted · Prob
1 · "." · 33.8%
2 · "," · 29.8%
3 · " and" · 8.5%
4 · " about" · 4.6%
5 · " what" · 2.2%
Actual: " their" (rank >5, prob 0.11%)
... you will have full use of the Amn...
Position 1964 · Loss: 6.802 nats (9.813 bits) · P(correct): 0.11%
Rank · Predicted · Prob
1 · " acc" · 65.3%
2 · " cont" · 16.5%
3 · " f" · 2.5%
4 · " right" · 2.2%
5 · " p" · 0.8%
Actual: " use" (rank >5, prob 0.11%)
... planned, life on the train...
Position 999 · Loss: 6.778 nats (9.778 bits) · P(correct): 0.11%
Rank · Predicted · Prob
1 · " I" · 21.7%
2 · " to" · 16.9%
3 · " we" · 14.9%
4 · " it" · 9.0%
5 · " the" · 7.0%
Actual: " l" (rank >5, prob 0.11%)
... absolute chaos of pre-...
Position 950 · Loss: 6.777 nats (9.777 bits) · P(correct): 0.11%
Rank · Predicted · Prob
1 · "ly" · 31.6%
2 · " best" · 8.0%
3 · " most" · 4.3%
4 · " wor" · 4.3%
5 · " s" · 2.9%
Actual: " ch" (rank >5, prob 0.11%)
...bined with ours. If you come...
Position 1830 · Loss: 6.770 nats (9.767 bits) · P(correct): 0.11%
Rank · Predicted · Prob
1 · " other" · 7.1%
2 · " m" · 5.5%
3 · " comm" · 4.3%
4 · " f" · 3.8%
5 · " c" · 2.3%
Actual: "s" (rank >5, prob 0.11%)
...ve ever seen for four hours...
Position 1269 · Loss: 6.764 nats (9.759 bits) · P(correct): 0.12%
Rank · Predicted · Prob
1 · "." · 28.2%
2 · "," · 24.9%
3 · " -" · 11.8%
4 · " in" · 8.1%
5 · " on" · 5.6%
Actual: " for" (rank >5, prob 0.12%)
... of pride in the production team's...
Position 978 · Loss: 6.751 nats (9.739 bits) · P(correct): 0.12%
Rank · Predicted · Prob
1 · " f" · 5.6%
2 · " p" · 3.9%
3 · " c" · 3.9%
4 · " m" · 3.0%
5 · " t" · 2.4%
Actual: " produ" (rank >5, prob 0.12%)
... his life on a very secure line at...
Position 1181 · Loss: 6.731 nats (9.711 bits) · P(correct): 0.12%
Rank · Predicted · Prob
1 · " p" · 8.4%
2 · " m" · 5.8%
3 · " t" · 5.8%
4 · " l" · 4.5%
5 · " b" · 4.5%
Actual: " very" (rank >5, prob 0.12%)
... me, it was an appropriate but...
Position 664 · Loss: 6.711 nats (9.682 bits) · P(correct): 0.12%
Rank · Predicted · Prob
1 · " ex" · 9.7%
2 · " am" · 9.7%
3 · " exper" · 6.7%
4 · " ad" · 6.7%
5 · " a" · 5.9%
Actual: " app" (rank >5, prob 0.12%)
...orstep of the government --...
Position 777 · Loss: 6.704 nats (9.672 bits) · P(correct): 0.12%
Rank · Predicted · Prob
1 · " c" · 5.9%
2 · " wor" · 5.9%
3 · " n" · 5.2%
4 · " p" · 2.8%
5 · " f" · 2.8%
Actual: " go" (rank >5, prob 0.12%)
...clares Living Man Dead...
Position 15 · Loss: 6.701 nats (9.668 bits) · P(correct): 0.12%
Rank · Predicted · Prob
1 · "e" · 20.7%
2 · "i" · 20.7%
3 · "aw" · 7.6%
4 · "ic" · 4.1%
5 · "ife" · 3.2%
Actual: "iv" (rank >5, prob 0.12%)
...tter addressedTo the Est...
Position 70 · Loss: 6.695 nats (9.658 bits) · P(correct): 0.12%
Rank · Predicted · Prob
1 · " to" · 93.3%
2 · " by" · 2.5%
3 · " at" · 0.6%
4 · " in" · 0.6%
5 · " as" · 0.4%
Actual: "" (rank >5, prob 0.12%)
...asons to miss dragging my...
Position 1699 · Loss: 6.652 nats (9.597 bits) · P(correct): 0.13%
Rank · Predicted · Prob
1 · " the" · 27.9%
2 · " this" · 11.6%
3 · " out" · 10.3%
4 · " my" · 7.0%
5 · " a" · 4.3%
Actual: " d" (rank >5, prob 0.13%)
...om. Or, my personal fav...
Position 1221 · Loss: 6.644 nats (9.586 bits) · P(correct): 0.13%
Rank · Predicted · Prob
1 · " when" · 52.5%
2 · " as" · 4.9%
3 · " if" · 3.0%
4 · " in" · 3.0%
5 · " like" · 2.3%
Actual: " my" (rank >5, prob 0.13%)
... we pulled off what had never been d...
Position 1063 · Loss: 6.636 nats (9.574 bits) · P(correct): 0.13%
Rank · Predicted · Prob
1 · " the" · 52.9%
2 · " our" · 25.0%
3 · " a" · 9.2%
4 · " an" · 1.4%
5 · " that" · 0.8%
Actual: " what" (rank >5, prob 0.13%)
... informative report possible. Though...
Position 1501 · Loss: 6.633 nats (9.570 bits) · P(correct): 0.13%
Rank · Predicted · Prob
1 · " I" · 60.2%
2 · "ing" · 5.6%
3 · " you" · 3.9%
4 · " of" · 3.9%
5 · " on" · 3.4%
Actual: " p" (rank >5, prob 0.13%)
... hopes to the doorstep of the...
Position 771 · Loss: 6.628 nats (9.562 bits) · P(correct): 0.13%
Rank · Predicted · Prob
1 · " wor" · 10.5%
2 · " f" · 5.6%
3 · " t" · 4.4%
4 · " e" · 3.4%
5 · " p" · 3.4%
Actual: " do" (rank >5, prob 0.13%)
...'08 Tour train, and beg...
Position 588 · Loss: 6.618 nats (9.548 bits) · P(correct): 0.13%
Rank · Predicted · Prob
1 · "," · 42.0%
2 · " of" · 5.7%
3 · "." · 3.9%
4 · " in" · 3.4%
5 · "ist" · 2.7%
Actual: " tra" (rank >5, prob 0.13%)
...ussion and know firsthand that whate...
Position 1467 · Loss: 6.615 nats (9.544 bits) · P(correct): 0.13%
Rank · Predicted · Prob
1 · " that" · 32.8%
2 · " how" · 12.1%
3 · " what" · 10.6%
4 · " the" · 6.5%
5 · " I" · 4.4%
Actual: " first" (rank >5, prob 0.13%)
...cribe me right now. Six d...
Position 551 · Loss: 6.609 nats (9.534 bits) · P(correct): 0.13%
Rank · Predicted · Prob
1 · "." · 42.4%
2 · "," · 8.3%
3 · ":" · 5.7%
4 · " as" · 5.7%
5 · " and" · 5.7%
Actual: " right" (rank >5, prob 0.13%)
... the train with two roomates. Or...
Position 1164 · Loss: 6.572 nats (9.482 bits) · P(correct): 0.14%
Rank · Predicted · Prob
1 · " of" · 18.3%
2 · " other" · 16.2%
3 · " fr" · 6.7%
4 · " people" · 3.6%
5 · " m" · 2.8%
Actual: " ro" (rank >5, prob 0.14%)
... have to go -- have to return to...
Position 1560 · Loss: 6.556 nats (9.458 bits) · P(correct): 0.14%
RankPredictedProbBar
1." and"21.1%
2." to"11.3%
3." I"6.0%
4." not"2.5%
5." in"2.5%
Actual: " have" (rank >5, prob 0.14%)
...ared dead, someone would have to p...
Position 327 · Loss: 6.535 nats (9.427 bits) · P(correct): 0.15%
RankPredictedProbBar
1." I"40.3%
2." it"9.0%
3." you"4.8%
4." and"4.2%
5." is"3.8%
Actual: " some" (rank >5, prob 0.15%)
...beard, the result of a production...
Position 1599 · Loss: 6.520 nats (9.406 bits) · P(correct): 0.15%
RankPredictedProbBar
1." s"5.5%
2." l"4.3%
3." b"4.3%
4." c"3.8%
5." p"3.4%
Actual: " res" (rank >5, prob 0.15%)
... letter was to inform the estate...
Position 205 · Loss: 6.507 nats (9.387 bits) · P(correct): 0.15%
RankPredictedProbBar
1."ld"53.2%
2." the"10.5%
3." s"2.6%
4."t"2.3%
5." J"2.3%
Actual: " in" (rank >5, prob 0.15%)
... down upon in the office. These are...
Position 1686 · Loss: 6.474 nats (9.340 bits) · P(correct): 0.15%
RankPredictedProbBar
1." f"13.9%
2." ne"10.8%
3." m"7.4%
4." e"5.8%
5." d"3.1%
Actual: " off" (rank >5, prob 0.15%)
... we had been blearily working for ...
Position 907 · Loss: 6.469 nats (9.333 bits) · P(correct): 0.16%
RankPredictedProbBar
1."ed"48.7%
2."nd"17.9%
3."ac"10.9%
4."ak"8.5%
5."w"4.5%
Actual: "ar" (rank >5, prob 0.16%)
... the fleeting sparkle of pride...
Position 970 · Loss: 6.439 nats (9.289 bits) · P(correct): 0.16%
RankPredictedProbBar
1." m"6.8%
2." s"5.3%
3." h"4.7%
4." d"4.1%
5." t"3.6%
Actual: " sp" (rank >5, prob 0.16%)
...ulous game of Monopoly I...
Position 1257 · Loss: 6.432 nats (9.279 bits) · P(correct): 0.16%
RankPredictedProbBar
1." the"44.6%
2." their"27.1%
3." our"4.7%
4." all"4.7%
5." my"1.1%
Actual: " M" (rank >5, prob 0.16%)
...rapped the last show of our little...
Position 620 · Loss: 6.422 nats (9.265 bits) · P(correct): 0.16%
RankPredictedProbBar
1." two"10.1%
2." of"6.9%
3." "4.8%
4." se"4.2%
5." th"4.2%
Actual: " show" (rank >5, prob 0.16%)
...ick Klein and me as props -...
Position 1128 · Loss: 6.410 nats (9.248 bits) · P(correct): 0.16%
RankPredictedProbBar
1." J"7.9%
2." M"6.2%
3." D"5.4%
4." B"5.4%
5." the"5.4%
Actual: " me" (rank >5, prob 0.16%)
... their care for the American people to...
Position 1338 · Loss: 6.371 nats (9.192 bits) · P(correct): 0.17%
RankPredictedProbBar
1." people"6.4%
2." c"4.4%
3." tra"3.9%
4." p"3.9%
5." l"3.4%
Actual: " A" (rank >5, prob 0.17%)
...ing alive, he continues to dri...
Position 275 · Loss: 6.349 nats (9.160 bits) · P(correct): 0.17%
RankPredictedProbBar
1." was"17.8%
2." would"9.5%
3." had"8.4%
4.""7.4%
5." d"5.8%
Actual: " cont" (rank >5, prob 0.17%)
...in and Chris teamed up --...
Position 1114 · Loss: 6.349 nats (9.159 bits) · P(correct): 0.17%
RankPredictedProbBar
1." were"13.9%
2.","8.4%
3." b"3.1%
4." c"2.7%
5." w"2.7%
Actual: " te" (rank >5, prob 0.17%)
...ive emails about human rights c...
Position 1880 · Loss: 6.348 nats (9.158 bits) · P(correct): 0.18%
RankPredictedProbBar
1." the"13.9%
2." new"12.3%
3." your"9.6%
4." up"8.4%
5." our"5.8%
Actual: " h" (rank >5, prob 0.18%)
...18 hours straight, som...
Position 918 · Loss: 6.347 nats (9.157 bits) · P(correct): 0.18%
RankPredictedProbBar
1." a"22.9%
2.","20.3%
3." at"6.6%
4." to"6.6%
5." and"5.8%
Actual: " st" (rank >5, prob 0.18%)
... to not shave for the duration of...
Position 1615 · Loss: 6.298 nats (9.086 bits) · P(correct): 0.18%
RankPredictedProbBar
1." my"51.0%
2." the"11.4%
3." a"3.7%
4." me"2.9%
5." our"2.5%
Actual: " for" (rank >5, prob 0.18%)
..., we brought their thoughts, pro...
Position 756 · Loss: 6.274 nats (9.052 bits) · P(correct): 0.19%
RankPredictedProbBar
1." our"13.2%
2." the"11.7%
3." a"10.3%
4." to"9.1%
5." back"4.9%
Actual: " their" (rank >5, prob 0.19%)
...ing sparkle of pride in the production...
Position 974 · Loss: 6.255 nats (9.023 bits) · P(correct): 0.19%
RankPredictedProbBar
1." the"19.6%
2." our"8.2%
3." a"3.4%
4." l"2.6%
5." m"2.6%
Actual: " pr" (rank >5, prob 0.19%)
... capitol, Washington D...
Position 647 · Loss: 6.225 nats (8.981 bits) · P(correct): 0.20%
RankPredictedProbBar
1." and"29.4%
2." where"6.6%
3." to"6.6%
4." which"4.0%
5." the"3.5%
Actual: " W" (rank >5, prob 0.20%)
... every morning. Never was this more o...
Position 1357 · Loss: 6.217 nats (8.969 bits) · P(correct): 0.20%
RankPredictedProbBar
1." The"15.8%
2." And"7.5%
3." It"5.8%
4." S"5.8%
5." I"5.8%
Actual: " Ne" (rank >5, prob 0.20%)
...ome an International member. Here...
Position 1864 · Loss: 6.203 nats (8.949 bits) · P(correct): 0.20%
RankPredictedProbBar
1." C"18.2%
2." P"8.6%
3." A"7.6%
4." S"5.9%
5." B"4.1%
Actual: " m" (rank >5, prob 0.20%)
... feeling that the spontaneous d...
Position 1654 · Loss: 6.200 nats (8.945 bits) · P(correct): 0.20%
RankPredictedProbBar
1." new"9.8%
2." p"4.1%
3." f"4.1%
4." ne"3.6%
5." t"3.2%
Actual: " sp" (rank >5, prob 0.20%)
..., to hear straight from them their con...
Position 720 · Loss: 6.178 nats (8.913 bits) · P(correct): 0.21%
RankPredictedProbBar
1."or"94.8%
2."u"1.7%
3."r"1.2%
4."ory"0.6%
5."at"0.2%
Actual: "ra" (rank >5, prob 0.21%)
...ake up calls being the main of...
Position 863 · Loss: 6.128 nats (8.841 bits) · P(correct): 0.22%
RankPredictedProbBar
1.","32.4%
2.")"13.5%
3." and"11.9%
4.")."5.6%
5." from"5.6%
Actual: " be" (rank >5, prob 0.22%)
... of our little Odyssey from...
Position 626 · Loss: 6.122 nats (8.832 bits) · P(correct): 0.22%
RankPredictedProbBar
1." t"8.2%
2." ad"4.4%
3." s"4.4%
4." c"4.4%
5." b"3.4%
Actual: " O" (rank >5, prob 0.22%)
... pension and other government ben...
Position 396 · Loss: 6.121 nats (8.830 bits) · P(correct): 0.22%
RankPredictedProbBar
1." p"5.7%
2." ins"5.0%
3." ex"5.0%
4." fin"4.4%
5." b"4.4%
Actual: " go" (rank >5, prob 0.22%)
... supposed to share his tiny...
Position 1150 · Loss: 6.103 nats (8.804 bits) · P(correct): 0.22%
RankPredictedProbBar
1." be"62.0%
2." have"5.8%
3." do"2.7%
4." get"2.1%
5." go"1.9%
Actual: " sh" (rank >5, prob 0.22%)
... behind the mix up?<s>Ex...
Position 520 · Loss: 6.094 nats (8.791 bits) · P(correct): 0.23%
RankPredictedProbBar
1."ess"17.9%
2."ur"12.3%
3."is"5.8%
4."at"5.8%
5."ist"3.1%
Actual: "ix" (rank >5, prob 0.23%)
...- the first live network tele...
Position 1078 · Loss: 6.092 nats (8.788 bits) · P(correct): 0.23%
RankPredictedProbBar
1." show"14.0%
2." per"7.5%
3." b"7.5%
4." m"5.8%
5."-"4.5%
Actual: " n" (rank >5, prob 0.23%)
... to share his tiny room on the...
Position 1154 · Loss: 6.073 nats (8.762 bits) · P(correct): 0.23%
RankPredictedProbBar
1."al"49.8%
2."ri"12.6%
3."ip"5.2%
4."est"4.6%
5."ast"3.2%
Actual: "in" (rank >5, prob 0.23%)
...es up. But maybe we'll...
Position 1734 · Loss: 6.066 nats (8.751 bits) · P(correct): 0.23%
RankPredictedProbBar
1." I"18.4%
2." the"6.0%
3." it"5.3%
4.","4.1%
5." they"4.1%
Actual: " may" (rank >5, prob 0.23%)
... the Canadian man received a...
Position 58 · Loss: 6.058 nats (8.740 bits) · P(correct): 0.23%
RankPredictedProbBar
1." go"21.1%
2." In"4.7%
3." P"4.7%
4." p"4.1%
5." A"3.2%
Actual: " man" (rank >5, prob 0.23%)
...ion to become an International member...
Position 1861 · Loss: 6.051 nats (8.730 bits) · P(correct): 0.24%
RankPredictedProbBar
1." off"34.9%
2." act"7.8%
3." ad"6.1%
4." am"5.4%
5." A"5.4%
Actual: " In" (rank >5, prob 0.24%)
... hard. But far more moving...
Position 1302 · Loss: 6.024 nats (8.692 bits) · P(correct): 0.24%
RankPredictedProbBar
1." I"9.1%
2." it"8.0%
3." the"7.1%
4." that"4.3%
5." we"3.8%
Actual: " f" (rank >5, prob 0.24%)
...C., and for me, it was an app...
Position 659 · Loss: 5.952 nats (8.587 bits) · P(correct): 0.26%
RankPredictedProbBar
1."g"18.2%
2."m"18.2%
3." the"18.2%
4." a"5.9%
5."c"4.1%
Actual: " me" (rank >5, prob 0.26%)
...s and a few of us, Sam includ...
Position 1279 · Loss: 5.951 nats (8.586 bits) · P(correct): 0.26%
RankPredictedProbBar
1." min"49.6%
2." d"16.1%
3." sec"8.6%
4." h"4.6%
5." more"4.1%
Actual: " of" (rank >5, prob 0.26%)
... company, who should really be on top...
Position 114 · Loss: 5.914 nats (8.533 bits) · P(correct): 0.27%
RankPredictedProbBar
1." have"66.1%
2." be"13.0%
3." not"4.8%
4.""1.8%
5."n"1.8%
Actual: " re" (rank >5, prob 0.27%)
...ramatically bring the news from...
Position 1202 · Loss: 5.912 nats (8.529 bits) · P(correct): 0.27%
RankPredictedProbBar
1." re"24.4%
2." inc"11.5%
3." imp"10.2%
4." ch"10.2%
5." al"5.4%
Actual: " br" (rank >5, prob 0.27%)
... have their fun, when it comes to the...
Position 1513 · Loss: 5.911 nats (8.528 bits) · P(correct): 0.27%
RankPredictedProbBar
1." they"24.4%
2." it"14.8%
3." I"11.5%
4." the"7.0%
5." their"4.2%
Actual: " when" (rank >5, prob 0.27%)
... the content of the next day's show...
Position 1402 · Loss: 5.880 nats (8.483 bits) · P(correct): 0.28%
RankPredictedProbBar
1." new"13.5%
2." show"10.5%
3." tra"5.6%
4." e"4.4%
5." b"3.9%
Actual: " ne" (rank >5, prob 0.28%)
... to return to "normal"...
Position 1566 · Loss: 5.866 nats (8.462 bits) · P(correct): 0.28%
RankPredictedProbBar
1." the"32.8%
2." my"15.5%
3." work"3.5%
4." W"3.5%
5." a"2.7%
Actual: " "" (rank >5, prob 0.28%)
...ed, cried from laughing...
Position 1290 · Loss: 5.840 nats (8.425 bits) · P(correct): 0.29%
RankPredictedProbBar
1." and"33.6%
2.","20.4%
3." out"6.6%
4."."4.5%
5." in"2.8%
Actual: " from" (rank >5, prob 0.29%)
...ramped studio on rails well...
Position 1717 · Loss: 5.836 nats (8.420 bits) · P(correct): 0.29%
RankPredictedProbBar
1." tra"43.3%
2." bus"5.9%
3." t"5.2%
4."."2.8%
5." a"2.5%
Actual: " on" (rank >5, prob 0.29%)
... few of us, Sam included, c...
Position 1282 · Loss: 5.817 nats (8.393 bits) · P(correct): 0.30%
RankPredictedProbBar
1." as"4.1%
2." were"3.6%
3." in"3.2%
4." and"3.2%
5." b"2.5%
Actual: " S" (rank >5, prob 0.30%)
... to lie to you and say that I l...
Position 837 · Loss: 5.817 nats (8.392 bits) · P(correct): 0.30%
RankPredictedProbBar
1.","26.8%
2."."16.3%
3." about"7.7%
4." that"7.7%
5." -"6.0%
Actual: " and" (rank >5, prob 0.30%)
... to shave my rail-be...
Position 1590 · Loss: 5.809 nats (8.380 bits) · P(correct): 0.30%
RankPredictedProbBar
1." he"23.8%
2." ha"12.8%
3." b"8.8%
4." s"5.3%
5." f"4.7%
Actual: " " (rank >5, prob 0.30%)
... their thoughts, problems and h...
Position 761 · Loss: 5.793 nats (8.358 bits) · P(correct): 0.30%
RankPredictedProbBar
1." their"18.9%
2." fe"8.9%
3." em"6.9%
4." "6.1%
5." and"4.8%
Actual: " pro" (rank >5, prob 0.30%)
..., and lead on activism init...
Position 1937 · Loss: 5.792 nats (8.356 bits) · P(correct): 0.31%
RankPredictedProbBar
1." the"18.9%
2." a"18.9%
3." your"8.9%
4." an"6.1%
5." our"4.8%
Actual: " act" (rank >5, prob 0.31%)
...esty International online communities....
Position 1975 · Loss: 5.789 nats (8.352 bits) · P(correct): 0.31%
RankPredictedProbBar
1." we"7.9%
2." A"7.0%
3." In"5.4%
4." C"5.4%
5." M"4.8%
Actual: " on" (rank >5, prob 0.31%)
... the confusion for confidentially...
Position 491 · Loss: 5.777 nats (8.334 bits) · P(correct): 0.31%
RankPredictedProbBar
1."."75.8%
2." that"2.3%
3.","2.0%
4." or"2.0%
5." and"1.8%
Actual: " for" (rank >5, prob 0.31%)
...cers played the most ridiculous...
Position 1248 · Loss: 5.772 nats (8.328 bits) · P(correct): 0.31%
RankPredictedProbBar
1." s"10.3%
2." show"4.3%
3." ro"3.8%
4." g"3.4%
5." m"3.4%
Actual: " most" (rank >5, prob 0.31%)
... Sam that he was supposed to sh...
Position 1145 · Loss: 5.757 nats (8.306 bits) · P(correct): 0.32%
RankPredictedProbBar
1."n"13.4%
2." go"9.2%
3." the"6.3%
4." a"6.3%
5." in"3.4%
Actual: " su" (rank >5, prob 0.32%)
...merican people to whom they bring the...
Position 1344 · Loss: 5.746 nats (8.289 bits) · P(correct): 0.32%
RankPredictedProbBar
1." be"10.6%
2."day"4.4%
3." get"3.4%
4." com"3.4%
5." l"3.0%
Actual: " wh" (rank >5, prob 0.32%)
...ings. Now this wouldnt have been...
Position 129 · Loss: 5.704 nats (8.229 bits) · P(correct): 0.33%
RankPredictedProbBar
1." is"49.5%
2." le"4.6%
3." man"2.5%
4." was"2.5%
5." m"1.9%
Actual: " would" (rank >5, prob 0.33%)
...lein and me as props -- to...
Position 1130 · Loss: 5.671 nats (8.182 bits) · P(correct): 0.34%
RankPredictedProbBar
1." the"21.3%
2." their"12.9%
3." a"12.9%
4." our"5.4%
5." an"2.2%
Actual: " pro" (rank >5, prob 0.34%)
...ime, it's over. We just wra...
Position 610 · Loss: 5.651 nats (8.153 bits) · P(correct): 0.35%
RankPredictedProbBar
1." time"31.6%
2." been"4.9%
3." a"4.9%
4." h"2.6%
5." the"2.6%
Actual: " over" (rank >5, prob 0.35%)
...assachusetts after we pulled off...
Position 1057 · Loss: 5.651 nats (8.153 bits) · P(correct): 0.35%
RankPredictedProbBar
1.","40.6%
2."."7.1%
3." of"5.5%
4." when"4.9%
5." and"3.8%
Actual: " after" (rank >5, prob 0.35%)
...wide pact to not shave for the...
Position 1611 · Loss: 5.647 nats (8.146 bits) · P(correct): 0.35%
RankPredictedProbBar
1." make"4.3%
2." be"3.4%
3." c"3.4%
4." re"3.4%
5." p"3.0%
Actual: " not" (rank >5, prob 0.35%)
... was dead, but cryptically can...
Position 470 · Loss: 5.644 nats (8.142 bits) · P(correct): 0.35%
RankPredictedProbBar
1." they"36.1%
2." the"9.1%
3." that"5.5%
4." it"3.8%
5." he"2.6%
Actual: " c" (rank >5, prob 0.35%)
...oin in on the human rights con...
Position 1805 · Loss: 5.607 nats (8.089 bits) · P(correct): 0.37%
RankPredictedProbBar
1." dis"29.2%
2." con"20.1%
3." f"6.5%
4." comm"2.4%
5." a"2.4%
Actual: " h" (rank >5, prob 0.37%)
...hausted and euphoric....
Position 531 · Loss: 5.602 nats (8.082 bits) · P(correct): 0.37%
RankPredictedProbBar
1." un"12.2%
2." t"4.5%
3." d"3.5%
4." st"3.1%
5." s"3.1%
Actual: " e" (rank >5, prob 0.37%)
... dance parties that erupted on...
Position 1664 · Loss: 5.584 nats (8.056 bits) · P(correct): 0.38%
RankPredictedProbBar
1." I"12.4%
2." are"6.7%
3." we"5.2%
4." have"5.2%
5." com"3.1%
Actual: " " (rank >5, prob 0.38%)
...ersweet because I honestly did...
Position 808 · Loss: 5.582 nats (8.054 bits) · P(correct): 0.38%
RankPredictedProbBar
1." was"20.5%
2." had"9.7%
3."'"8.6%
4." could"4.6%
5." d"3.6%
Actual: " h" (rank >5, prob 0.38%)
... an anchors' meeting where they dis...
Position 1388 · Loss: 5.578 nats (8.048 bits) · P(correct): 0.38%
RankPredictedProbBar
1." b"11.0%
2." ro"6.7%
3." l"5.2%
4." ch"4.1%
5." h"4.1%
Actual: " me" (rank >5, prob 0.38%)
...ent of the next day's show and,...
Position 1404 · Loss: 5.561 nats (8.023 bits) · P(correct): 0.38%
RankPredictedProbBar
1." e"11.2%
2." b"7.7%
3." tra"6.8%
4." m"4.7%
5." show"3.6%
Actual: " day" (rank >5, prob 0.38%)
...is teamed up -- using Rick...
Position 1118 · Loss: 5.540 nats (7.992 bits) · P(correct): 0.39%
RankPredictedProbBar
1." to"35.4%
2." with"27.5%
3." for"13.0%
4." in"7.9%
5." on"4.8%
Actual: " -" (rank >5, prob 0.39%)
... it came from his insurance company,...
Position 106 · Loss: 5.539 nats (7.992 bits) · P(correct): 0.39%
RankPredictedProbBar
1." fam"10.1%
2." p"8.9%
3." f"7.9%
4." w"7.9%
5." m"7.0%
Actual: " ins" (rank >5, prob 0.39%)
...yssey from the Newseum in the n...
Position 633 · Loss: 5.534 nats (7.984 bits) · P(correct): 0.39%
RankPredictedProbBar
1." be"4.8%
2." "3.8%
3." m"3.3%
4." p"3.3%
5." s"2.9%
Actual: " New" (rank >5, prob 0.39%)
... pre-show preparations, to...
Position 959 · Loss: 5.534 nats (7.983 bits) · P(correct): 0.40%
RankPredictedProbBar
1."er"13.1%
2."ers"13.1%
3." to"4.3%
4." time"3.3%
5." t"2.9%
Actual: " pre" (rank >5, prob 0.40%)
... for confidentially reasons. P...
Position 496 · Loss: 5.531 nats (7.980 bits) · P(correct): 0.40%
RankPredictedProbBar
1."ity"27.8%
2." information"21.6%
3." re"4.3%
4." d"4.3%
5." p"3.8%
Actual: "ly" (rank >5, prob 0.40%)
... up -- using Rick Klein and...
Position 1122 · Loss: 5.522 nats (7.967 bits) · P(correct): 0.40%
RankPredictedProbBar
1." the"19.3%
2." a"17.0%
3." their"11.7%
4." an"2.6%
5." s"1.8%
Actual: " R" (rank >5, prob 0.40%)
... when we had been blearily working for...
Position 906 · Loss: 5.521 nats (7.965 bits) · P(correct): 0.40%
RankPredictedProbBar
1."oth"17.0%
2."o"11.7%
3."rough"11.7%
4."itt"11.7%
5."re"10.3%
Actual: "le" (rank >5, prob 0.40%)
...ons to miss dragging mys...
Position 1700 · Loss: 5.515 nats (7.956 bits) · P(correct): 0.40%
RankPredictedProbBar
1."ance"28.2%
2."anc"17.1%
3."in"10.4%
4."r"6.3%
5."uring"6.3%
Actual: "ra" (rank >5, prob 0.40%)
...orge Johannesen is very...
Position 28 · Loss: 5.511 nats (7.950 bits) · P(correct): 0.40%
RankPredictedProbBar
1."n"99.0%
2."ans"0.6%
3."an"0.4% ← actual
4."ne"0.0%
5."r"0.0%
...efits. The Manitoba P...
Position 407 · Loss: 5.500 nats (7.935 bits) · P(correct): 0.41%
RankPredictedProbBar
1." "9.3%
2."re"9.3%
3."y"6.4%
4."n"5.6%
5." le"5.6%
Actual: " M" (rank >5, prob 0.41%)
....C., and for me, it was an...
Position 658 · Loss: 5.500 nats (7.934 bits) · P(correct): 0.41%
RankPredictedProbBar
1." we"8.2%
2." the"4.4%
3." then"3.9%
4." he"3.9%
5." I"3.9%
Actual: " for" (rank >5, prob 0.41%)
... beginning the adventure of a l...
Position 597 · Loss: 5.495 nats (7.927 bits) · P(correct): 0.41%
RankPredictedProbBar
1." t"7.3%
2." "5.7%
3." day"4.4%
4." j"4.4%
5." r"3.9%
Actual: " ad" (rank >5, prob 0.41%)
...isten in on this discussion and...
Position 1460 · Loss: 5.490 nats (7.921 bits) · P(correct): 0.41%
RankPredictedProbBar
1." m"6.5%
2."."4.4%
3." new"3.9%
4." s"3.0%
5." t"3.0%
Actual: " dis" (rank >5, prob 0.41%)
... reason, I was allowed to stay...
Position 1417 · Loss: 5.489 nats (7.920 bits) · P(correct): 0.41%
RankPredictedProbBar
1."n"12.1%
2." s"5.0%
3." not"3.9%
4." st"3.0%
5." a"2.7%
Actual: " all" (rank >5, prob 0.41%)
...ossible. Though they have their fun,...
Position 1507 · Loss: 5.468 nats (7.889 bits) · P(correct): 0.42%
RankPredictedProbBar
1." I"38.0%
2." it"10.9%
3." the"8.5%
4.","7.5%
5." my"3.1%
Actual: " they" (rank >5, prob 0.42%)
... nation. By ending in Was...
Position 737 · Loss: 5.461 nats (7.879 bits) · P(correct): 0.42%
RankPredictedProbBar
1." the"49.1%
2." that"3.6%
3." this"2.8%
4." now"2.4%
5." then"2.2%
Actual: " e" (rank >5, prob 0.42%)
... to. I have to shave my ra...
Position 1586 · Loss: 5.457 nats (7.872 bits) · P(correct): 0.43%
RankPredictedProbBar
1." re"12.5%
2." go"9.7%
3." be"8.6%
4." st"3.1%
5." l"2.8%
Actual: " sh" (rank >5, prob 0.43%)
... estate a fat check for his p...
Position 171 · Loss: 5.445 nats (7.856 bits) · P(correct): 0.43%
RankPredictedProbBar
1."air"23.6%
2."ull"23.6%
3."ree"12.6%
4."res"7.6%
5."our"3.2%
Actual: "at" (rank >5, prob 0.43%)
...- to the people that can do something...
Position 787 · Loss: 5.442 nats (7.851 bits) · P(correct): 0.43%
RankPredictedProbBar
1." were"11.2%
2." we"8.7%
3." are"6.8%
4." have"4.1%
5." l"4.1%
Actual: " can" (rank >5, prob 0.43%)
... word. But now I have to go -...
Position 1553 · Loss: 5.439 nats (7.846 bits) · P(correct): 0.43%
RankPredictedProbBar
1." they"11.2%
2." I"7.7%
3." it"6.8%
4." when"5.3%
5." the"5.3%
Actual: " now" (rank >5, prob 0.43%)
..., cried from laughing so...
Position 1291 · Loss: 5.415 nats (7.812 bits) · P(correct): 0.45%
RankPredictedProbBar
1." the"40.1%
2." our"7.9%
3." his"7.0%
4." a"7.0%
5." be"4.8%
Actual: " l" (rank >5, prob 0.45%)
... the anchors was their care for the A...
Position 1333 · Loss: 5.413 nats (7.810 bits) · P(correct): 0.45%
RankPredictedProbBar
1." the"16.7%
2.","4.2%
3." a"4.2%
4."."4.2%
5." when"3.7%
Actual: " their" (rank >5, prob 0.45%)
... everything about it (3:00 A...
Position 850 · Loss: 5.399 nats (7.789 bits) · P(correct): 0.45%
RankPredictedProbBar
1."."52.3%
2.","21.8%
3." and"6.2%
4." -"5.5%
5." so"1.6%
Actual: " (" (rank >5, prob 0.45%)
...s straight, something or some...
Position 922 · Loss: 5.382 nats (7.765 bits) · P(correct): 0.46%
RankPredictedProbBar
1." we"28.4%
2." the"10.5%
3." I"9.2%
4." it"7.2%
5." our"6.3%
Actual: " s" (rank >5, prob 0.46%)
... complicated, but most of all, fun...
Position 1021 · Loss: 5.360 nats (7.733 bits) · P(correct): 0.47%
RankPredictedProbBar
1." it"15.6%
2." I"12.1%
3." the"8.3%
4." we"5.7%
5." that"3.1%
Actual: " most" (rank >5, prob 0.47%)
... Odyssey from the Newseum in...
Position 631 · Loss: 5.352 nats (7.721 bits) · P(correct): 0.47%
RankPredictedProbBar
1.","6.5%
2." into"5.1%
3."."4.5%
4." T"4.5%
5." and"3.5%
Actual: " from" (rank >5, prob 0.47%)
...en the anchors was their care for the...
Position 1332 · Loss: 5.343 nats (7.709 bits) · P(correct): 0.48%
RankPredictedProbBar
1." and"38.0%
2.","15.8%
3." of"14.0%
4."."10.9%
5." -"4.0%
Actual: " was" (rank >5, prob 0.48%)
... laughing so hard. But...
Position 1296 · Loss: 5.339 nats (7.702 bits) · P(correct): 0.48%
RankPredictedProbBar
1." at"14.0%
2." and"12.4%
3.","10.9%
4."."8.5%
5." in"5.2%
Actual: " so" (rank >5, prob 0.48%)
...in. Or when Diane, Rob...
Position 1103 · Loss: 5.329 nats (7.687 bits) · P(correct): 0.49%
RankPredictedProbBar
1." we"26.5%
2." I"18.2%
3." the"16.1%
4." it"5.2%
5." a"4.6%
Actual: " D" (rank >5, prob 0.49%)
... to get that, I guess I must have...
Position 351 · Loss: 5.324 nats (7.680 bits) · P(correct): 0.49%
RankPredictedProbBar
1." would"38.7%
2.""14.2%
3." don"9.8%
4." have"7.6%
5." need"2.8%
Actual: " gu" (rank >5, prob 0.49%)
...errible if Manitoba Pub...
Position 142 · Loss: 5.311 nats (7.662 bits) · P(correct): 0.49%
RankPredictedProbBar
1."r"83.0%
2."s"2.5%
3."ich"1.2%
4."ar"1.1%
5."c"0.8%
Actual: "an" (rank >5, prob 0.49%)
... Manitoba Public Insur...
Position 146 · Loss: 5.305 nats (7.653 bits) · P(correct): 0.50%
RankPredictedProbBar
1." had"16.4%
2." was"12.8%
3.""11.3%
4." d"3.2%
5." p"2.2%
Actual: " P" (rank >5, prob 0.50%)
... me right now. Six days after bo...
Position 555 · Loss: 5.292 nats (7.635 bits) · P(correct): 0.50%
RankPredictedProbBar
1."o"31.1%
2."ome"14.7%
3."om"8.9%
4."in"4.8%
5."ure"2.9%
Actual: "ix" (rank >5, prob 0.50%)
...cussion and know firsthand that wh...
Position 1466 · Loss: 5.288 nats (7.629 bits) · P(correct): 0.51%
RankPredictedProbBar
1." to"18.9%
2." see"3.7%
3." how"3.7%
4." be"3.3%
5." the"3.3%
Actual: " know" (rank >5, prob 0.51%)
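
Each card header above states the same quantity three ways. Assuming the reported loss is the per-token cross-entropy in nats, the bits figure is the loss divided by ln 2 and P(correct) is exp(−loss). The following minimal sketch checks this against the Position 1699 card; the function names are illustrative and do not come from the report's code.

```python
import math


def nats_to_bits(loss_nats: float) -> float:
    """Convert a cross-entropy loss from nats to bits."""
    return loss_nats / math.log(2)


def prob_of_correct(loss_nats: float) -> float:
    """Probability the model assigned to the true token, given its loss in nats."""
    return math.exp(-loss_nats)


# Position 1699: Loss 6.652 nats (9.597 bits), P(correct) 0.13%
loss = 6.652
print(f"{nats_to_bits(loss):.3f} bits, P(correct) = {prob_of_correct(loss):.2%}")
# prints: 9.597 bits, P(correct) = 0.13%
```

Every card in this list sits between roughly 5.3 and 6.7 nats, which is why every P(correct) value falls in the 0.1% to 0.5% range.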

Side-by-Side Generation

50-token prompt, 200-token continuation (temp=0.8, seed=42). Highlighted = token matches at same position.

Position matches: 6/200 (3.0%) · Unique token overlap: 40 (33% of real)
Prompt (50 tokens)
<s>Insurance Company Declares Living Man Dead George Johannesen is very much alive. Which is why it was so surpr
Real continuation
ising when the Canadian man received a letter addressedTo the Estate of George Johannesen.” Even more surprising is that it came from his insurance company, who should really be on top of such things. Now this wouldnt have been so terrible if Manitoba Public Insurance was giving Johannesens estate a fat check for his passing away. But thats not what happened. Instead the letter was to inform the estate that, since George was dead, his driver license and auto insurance had been cancelled in October
Model generation
ising to be insurance company. Living Man is the only insurance company that makes property insurance. It is a company that makes vacation insurance, but it still has the same degree of property insurance. So how does a living man die? So say I think there are ways to teach people how to be insured. The first step is to measure the health of the living man and the stress of his situation. To measure the health of the living man, it is important to measure his or her weight. What is the hazard to the living man? Living Man is the only insurance company that specializes in repairs
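
The two summary figures for this comparison (position matches and unique token overlap) could be computed along the following lines. This is a sketch under stated assumptions: the report does not give exact definitions, so the function names are hypothetical, the overlap denominator is taken to be the number of unique tokens in the real continuation (suggested by "33% of real"), and whitespace-split words stand in for tokenizer IDs.

```python
def position_matches(real, generated):
    """Count positions where both sequences hold the same token."""
    return sum(r == g for r, g in zip(real, generated))


def unique_overlap(real, generated):
    """Return (shared unique tokens, share of the real sequence's unique tokens)."""
    real_set, gen_set = set(real), set(generated)
    shared = real_set & gen_set
    fraction = len(shared) / len(real_set) if real_set else 0.0
    return len(shared), fraction


# Toy stand-in: the first eight words of each continuation, not real token IDs.
real = "ising when the Canadian man received a letter".split()
generated = "ising to be insurance company. Living Man is".split()

matches = position_matches(real, generated)
shared, fraction = unique_overlap(real, generated)
print(f"Position matches: {matches}/{len(real)}")
print(f"Unique token overlap: {shared} ({fraction:.0%} of real)")
```

Position matching is a strict measure for sampled text at temp=0.8, so a low value is expected; the unique-token overlap is the more forgiving of the two because it ignores order.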

What Changed from PR #1019

Expanding MLP was the right call — stable rank confirmed it was parameter-starved. Mixed quantization bought us the bit budget to fit the larger model under 16 MB. The calibration hit is real but recoverable with temperature scaling.
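
As a rough illustration of the temperature scaling mentioned above, the sketch below fits a single scalar temperature on held-out logits by grid search and compares ECE before and after. It is not the report's evaluation code: the grid, the 10-bin top-1 ECE definition, the function names, and the toy logits are all assumptions made for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mean_nll(logit_rows, targets, temperature):
    """Average negative log-likelihood (nats) of the true tokens."""
    losses = [-math.log(softmax(row, temperature)[t])
              for row, t in zip(logit_rows, targets)]
    return sum(losses) / len(losses)


def fit_temperature(logit_rows, targets, grid=None):
    """Pick the temperature on the grid that minimises held-out NLL."""
    grid = grid or [0.5 + 0.05 * i for i in range(51)]  # 0.5 .. 3.0 (assumed range)
    return min(grid, key=lambda t: mean_nll(logit_rows, targets, t))


def expected_calibration_error(logit_rows, targets, temperature=1.0, n_bins=10):
    """Top-1 ECE: bin-weighted gap between mean confidence and accuracy."""
    bins = [[0, 0.0, 0.0] for _ in range(n_bins)]  # count, sum(conf), sum(correct)
    for row, target in zip(logit_rows, targets):
        probs = softmax(row, temperature)
        conf = max(probs)
        b = min(int(conf * n_bins), n_bins - 1)
        bins[b][0] += 1
        bins[b][1] += conf
        bins[b][2] += float(probs.index(conf) == target)
    return sum(abs(c - a) for n, c, a in bins if n) / len(targets)


# Toy overconfident model: about 96% confident, right 3 times out of 4.
rows = [[4.0, 0.0, 0.0]] * 4
targets = [0, 0, 0, 1]
T = fit_temperature(rows, targets)
print(f"T = {T:.2f}")
print(f"ECE before: {expected_calibration_error(rows, targets):.4f}")
print(f"ECE after:  {expected_calibration_error(rows, targets, T):.4f}")
```

A fitted temperature above 1 flattens the distribution, which is the direction needed to correct overconfidence. Because a single scalar rescales all logits equally, it leaves the top-1 prediction unchanged and adds no parameters to the artifact beyond one number.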

  1. MLP 3× → 3.5× worked. SVD showed 94.4% rank utilization at 3× — nearly packed. At 3.5× it's 95.2% and still climbing. The extra 2.88M parameters are active.
  2. Mixed int5/int6 fits the budget. The per-matrix analysis shows MLP_down captures 70% of total upgrade benefit. Promote MLP_down first, then MLP_up, then attention if bits remain.
  3. Calibration is now a concern. ECE: 0.24% → 1.26%. Temperature scaling was unnecessary before. It's worth investigating now.
  4. Layer 9 overtook Layer 10 as the most quantization-sensitive layer (+403 × 10⁻⁶ vs +229 × 10⁻⁶ BPB). If narrowing a layer, L10 is still the candidate: it has a low logit lens contribution (−0.52 bits/token) yet remains highly sensitive to quantization.
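
The promotion order in point 2 can be read as a greedy allocation: rank matrices by BPB gain per extra bit and promote from int5 to int6 until the spare budget runs out. The sketch below illustrates that idea only; it is a greedy approximation, not necessarily the knapsack solver used for this PR. The matrix shapes (MLP hidden width 1792 = 3.5 × 512, KV width 256), the L9 V gain, and the spare-bit budget are hypothetical; only the two MLP_down gains come from the analysis.

```python
from dataclasses import dataclass


@dataclass
class Matrix:
    name: str
    params: int       # weights in the matrix; promotion costs one extra bit each
    gain_e6: float    # BPB recovered by int5 -> int6, in units of 1e-6


def promote(matrices, spare_bits):
    """Greedily promote the matrices with the best gain per extra bit."""
    chosen = []
    ranked = sorted(matrices, key=lambda m: m.gain_e6 / m.params, reverse=True)
    for m in ranked:
        if m.params <= spare_bits:  # promotion fits in what is left
            chosen.append(m.name)
            spare_bits -= m.params
    return chosen, spare_bits


candidates = [
    Matrix("L9.mlp_down", 1792 * 512, 1167.0),   # gain from the per-matrix analysis
    Matrix("L10.mlp_down", 1792 * 512, 1066.0),  # gain from the per-matrix analysis
    Matrix("L9.attn_v", 512 * 256, 150.0),       # HYPOTHETICAL gain, for illustration
]

# HYPOTHETICAL budget of spare bits left after quantizing everything to int5.
chosen, left = promote(candidates, spare_bits=1_900_000)
print(chosen, left)
```

The sketch ignores per-group scale overhead and the effect of Brotli compression on the final artifact size, both of which would change the true cost of a promotion.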

11L GPT, 512d, 8H/4KV GQA, XSA-all, BigramHash 3072×112, MLP 3.5×, LeakyReLU(0.5)²

Mixed int5/int6 GPTQ, Brotli-11, Parallel Muon NS5, 8×H100 SXM, seed 314