Friday, April 23, 2010

Stem Cells

While the weather has taken a downturn I have been doing a lot of work on the 6+1 stems.

As previously intimated, I was far from happy with the 'usefulness' of the stems in the Collins OSL.

So after completing my beautiful spreadsheet of the Collins' top 250 I then went back to the old book and added in those that had been dropped - a stunning 138 stems. I really couldn't believe that switching dictionaries had made that dramatic a change.

I colour coded each stem depending on whether it was in both books, just Collins (dark pink) or just Chambers (pale pink).

I then created two ranking columns. The first is based on the probability of the stem and how many of the remaining tiles (assuming a full bag) the stem goes with. The second is based on the probability of the stem, plus the number of different letters it goes with, minus the number of remaining tiles it does not go with. As a final tie-breaker I used just the number of different letters the stem combines with.
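For anyone wanting to try this outside a spreadsheet, here is a rough sketch of the three counts those columns rely on. It assumes the standard English tile distribution and ignores the two blanks; the function name stem_stats is made up for illustration, and how the counts get weighted against the stem's probability is left out because that is a matter of taste.

```python
from collections import Counter

# Standard English tile counts, blanks left out (an assumption of this sketch).
TILES = {
    "A": 9, "B": 2, "C": 2, "D": 4, "E": 12, "F": 2, "G": 3, "H": 2, "I": 9,
    "J": 1, "K": 1, "L": 4, "M": 2, "N": 6, "O": 8, "P": 2, "Q": 1, "R": 6,
    "S": 4, "T": 6, "U": 4, "V": 2, "W": 2, "X": 1, "Y": 2, "Z": 1,
}

def stem_stats(stem, letters):
    """Return (tiles_hit, tiles_missed, distinct_letters) for a stem.

    `letters` holds the seventh letters that make a word with the stem.
    """
    # Take the stem's own tiles out of a full bag first.
    remaining = Counter(TILES)
    remaining.subtract(Counter(stem))
    useful = set(letters)
    tiles_hit = sum(remaining[c] for c in useful)
    tiles_missed = sum(remaining.values()) - tiles_hit
    return tiles_hit, tiles_missed, len(useful)

# ROADIE combines with D, L, N, S, V and X.
print(stem_stats("ROADIE", "DLNSVX"))  # (20, 72, 6)
```

ROADIE already uses one of the four Ds, which is why D only contributes three tiles to the count.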

Then I hit the sort button...

Highest placed Collins only stem was TINIES (Collins 94) coming in at number 57.
Highest placed Chambers only stem was STREET (Chambers 195), coming in at a surprising number 10 - a relatively high probability stem combining with all but 10 of the remaining tiles.
The bottom 47 were all from Collins only - a veritable sea of dark pink.
The lowest Chambers stem was DEPART at number 341.
ROADIE (Collins 121) was last - despite being a high probability stem it only combines with D, L, N, S, V and X...
TENIAS is top, RETAIN dropping to third place behind TORIES.

I think it shows that the Collins' stems put too much weight on the probability of drawing the stem, to the detriment of how useful the stem is once you have it.

Now on to the 8s...
Hmm - my probability calculation was a bit out. Hadn't treated duplicate letters correctly. Actually makes the Collins' stems even worse - the bottom 58 are now dark pink. NAILER is now the highest placed Collins only stem at number 65.
STREET dropped to 20th, with SINGLE now the highest Chambers entry at number 10.
The worst stem is TAENIA. It is bad in two ways. Not only does it make a 7 with just E, L, M, P and S, but all of the words it makes already appear in a higher ranked list!
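On the duplicate letter slip mentioned above: a minimal sketch of one way to count how many ways a stem can be drawn from a full bag, again assuming standard English tile counts with blanks ignored. The names stem_ways and naive_ways are made up, and the naive version is only a guess at the kind of mistake involved, not a record of the actual spreadsheet formula.

```python
from collections import Counter
from math import comb

# Standard English tile counts, blanks left out (an assumption of this sketch).
TILES = {
    "A": 9, "B": 2, "C": 2, "D": 4, "E": 12, "F": 2, "G": 3, "H": 2, "I": 9,
    "J": 1, "K": 1, "L": 4, "M": 2, "N": 6, "O": 8, "P": 2, "Q": 1, "R": 6,
    "S": 4, "T": 6, "U": 4, "V": 2, "W": 2, "X": 1, "Y": 2, "Z": 1,
}

def stem_ways(stem):
    """Ways to pick the stem's tiles: choose k of the n copies of each letter."""
    ways = 1
    for letter, k in Counter(stem).items():
        ways *= comb(TILES[letter], k)
    return ways

def naive_ways(stem):
    """Multiplies the bag count once per letter, so repeated letters are overcounted."""
    ways = 1
    for letter in stem:
        ways *= TILES[letter]
    return ways

print(stem_ways("RETAIN"), naive_ways("RETAIN"))  # 209952 209952
print(stem_ways("STREET"), naive_ways("STREET"))  # 23760 124416
```

The two counts agree for a stem with no repeats, but the naive one flatters anything with doubled letters, which fits STREET sliding down once the calculation was fixed. Dividing stem_ways by comb(98, 6) would turn the count into a probability; for ranking purposes the raw count gives the same order.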

