
Unofficial Theory Index: A List


Tempus


Update time! Added the first four pages of Cosmere Theories. As expected, it's been a ride! Cosmere board has 15-20 theories a page so far, about double the density of the other boards. It's amazing and great to see 1/2 to 3/4 of a page covered in theories. I've seen a lot of things buried in there - theories I've come to independently (though with the advantage of Words of Radiance and all its signings) that I never knew existed and couldn't find, theorists of exceptional quality that I've never met or seen before, and WoBs that were previously buried.

 

So, we're now up to 368 theories - 62 theories over three and a half pages (the last page was only half full). I've seen the rise of the Theorymaster Chaos (something I've been looking forward to) as he breaks out the Gold and pushes his way to the top. I've met Mad_Scientist - an amazing theorist who didn't post much (only 56 posts in the Cosmere boards) yet has managed to claw his way to the peak of the Quality Leaderboard on the Hall of Fame with the seven theories he presented in his short four months on the boards, every last one lengthy, well-formatted, and logical.

 

I just reached Windrunner's appearance, so I'm looking forward to seeing him and his many posts on the Cosmere boards. I'm also looking forward to more from Outis, WeiryWriter and his many posts, Observer, and maybe even a bit of Moogle and Argent (who have been missing as they are mostly Stormlight peeps). So far, Cosmere has lived up to expectations and been a most interesting and very high quality board.

 

 

Nooo!!! I'll get you for this, Catquisitor!

Seriously, Kurk is the best.

 

Did you see my parody profile pic of Kurk? I couldn't help myself but I think it's fantastic. Kurkistan would have the best regional magic ever.

 

Kurkistan.png

Link to comment
Share on other sites

Another quick update. Now up to 427 theories. One page had 24 theories on it! Cosmere boards are certainly dense. This latest update has brought the arrival of Odium's Shard, putting his big intelligent stamp on the boards. He'll have more theories to come in Stormlight as well. Windrunner has also clawed his way up to being one of the most prolific theorists. Plenty more to come in the next 20 pages of Cosmere.

 

I've also been more consistent about marking proven and disproven theories (though I'm sure I've missed some). There are 63 proven/disproven in total at the moment, about 15%. Speaking of percentages, at 306 theories...

 

Distribution         Actual    Expected

23 were Bandaids      7.5%     10%
105 were Normal      34.3%     35%
88 were Bronze       28.8%     25%
66 were Silver       21.6%     20%
24 were Gold          7.8%     10%
 

And now....

 

35 are Bandaids       8.2%     10%
144 are Normal       33.7%     35%
109 are Bronze       25.5%     25%
106 are Silver       24.8%     20%
33 are Gold           7.7%     10%
 

So you can really see how the Cosmere boards have raised the bottom line for quality a few points, which is fantastic considering they make up only 1/4 of the index so far.
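For anyone who wants to check the percentages, here is a minimal sketch of the arithmetic in Python. The counts are the ones listed above for the 427-theory update; the function name `tag_distribution` is made up for illustration and is not part of the Index tooling.

```python
def tag_distribution(counts):
    """Return each tag's share of the total as a percentage, rounded to 0.1."""
    total = sum(counts.values())
    return {tag: round(100 * n / total, 1) for tag, n in counts.items()}


# Tag counts as listed for the 427-theory update.
counts_427 = {"Bandaid": 35, "Normal": 144, "Bronze": 109, "Silver": 106, "Gold": 33}

shares = tag_distribution(counts_427)
for tag, pct in shares.items():
    # One row per tag: name, raw count, share of the total.
    print(f"{tag:<8}{counts_427[tag]:>5}{pct:>7.1f}%")
```

Swapping in the counts from the 306-theory table gives the earlier column the same way.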

 

Last note, I started organizing the topic key. It now lists all the topics used, and formats them based off popularity. Going into the Edit menu of the topic subsection will show a comment outlining the current state of evaluation for topics, and the changes I think might need to be made to already existing topics to make the list neater and easier to use.

Link to comment
Share on other sites

Nothing like a triple post! Ah well, I enjoy sharing my little adventures.
 
We're now up to 473 theories, and a bit less than halfway through the Cosmere boards! The theories have been tending to thin out a little lately, and also lower a bit in quality in general. Bit of a lull patch in late 2012, I suspect - Alloy of Law had been out for almost two years, Way of Kings threeish. And Emperor's Soul hadn't come out quite yet. TES should perk things up a bit, and then I expect it'll peter out for almost a year until late 2013 and WoR details start filtering out.
 
As for the Index itself, I've gone ahead and written a script which analyzes the metrics for me, and does a bunch of neat things like generate the Hall of Fame automatically, output some statistics, and put all data into a nice SQL database so I can query it as desired.
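The actual script isn't posted here, so the following is only a rough sketch of the idea: load theory records into a SQL database and query them for a Hall of Fame style ranking. It uses Python's built-in `sqlite3`; the table layout, the sample rows, the author names, and the `hall_of_fame` function are all assumptions made for illustration.

```python
import sqlite3

# In-memory database; the column layout is a guess at what such a table might hold.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE theories (title TEXT, author TEXT, board TEXT, tag TEXT, score REAL)"
)

# Made-up sample rows, purely for demonstration.
rows = [
    ("t1", "author_a", "Cosmere", "Gold", 30.0),
    ("t2", "author_a", "Cosmere", "Silver", 20.0),
    ("t3", "author_b", "Stormlight", "Bronze", 12.0),
    ("t4", "author_b", "Cosmere", "Normal", 8.0),
    ("t5", "author_c", "Cosmere", "Gold", 40.0),
]
conn.executemany("INSERT INTO theories VALUES (?, ?, ?, ?, ?)", rows)


def hall_of_fame(conn, min_theories=2):
    """Rank authors by average score, keeping those with enough theories."""
    query = (
        "SELECT author, COUNT(*) AS n, AVG(score) AS avg_score "
        "FROM theories GROUP BY author "
        "HAVING COUNT(*) >= ? ORDER BY avg_score DESC"
    )
    return conn.execute(query, (min_theories,)).fetchall()


for author, n, avg_score in hall_of_fame(conn):
    print(author, n, avg_score)
```

Once the data sits in a table like this, statistics such as theories per board or per tag are one `GROUP BY` away.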
 
To celebrate that, I quickly whipped up a page with some Google Charts! I'll be making a much nicer page for stats as I go along - I like that sort of thing.
 
 
A question to anyone who has perused the Index: Should I further refine the grading scheme?
 
Right now, the grading scheme is Underdeveloped Theory, Normal, Bronze, Silver, and Gold. Five categories. However, with the expected 900 or so theories, the Index can certainly support more than five categories. In fact, I'm already grading for six categories - some of the better Silver threads get labelled as 'Gold Candidates'. After reading Kurk's Motivation, Execution, Consequence: A Realmatic Theory, I've noted that there are also a few Golds I'm feeling should be raised higher than Gold. Should I do so? A new scheme could be something like...

  1. Ruby
  2. Crystal
  3. Gold
  4. Silver
  5. Bronze
  6. -
  7. Bandaid

That may not be 'intuitive' enough, though. I could grade on a letter scale

  1. A+
  2. A
  3. B
  4. C
  5. D
  6. -
  7. F

That might be a little discouraging, however. I could also attempt to make it cosmere related. Something like...

 

  1. Fifth Heightening
  2. Fourth Heightening
  3. Third Heightening
  4. Second Heightening
  5. First Heightening
  6. -
  7. Drab

 

Your thoughts?

Link to comment
Share on other sites

Sorry to cause you yet more suffering. I do hope for your sake that you didn't read the _entire_ MEC, as it's all rather long.

A thought: one way that it might be good to mark a few theories is to have some "Seminal Theory" tag for those theories that have had a very large influence on later discussion. Like "Everything has three aspects" or my own thread saying healing is a function of Cognitive aspects.

Link to comment
Share on other sites

Hey Kurk, I did read the entire theory, even two of the great Nepene wars. Love it. Best theory of the night, possibly best I've read period so far. As for the Seminal Theory, I do sort of have a moniker. The topic 'overarching theory' is used when I feel the theory attempts to explain the entirety of the category for the planet. Thus, a theory that attempts to explain all (or a wide swath) of Realmatics in the Cosmere gets 'Overarching Theory'. Likewise, a theory that attempted to describe and predict the entirety of the plot for the remaining eight Stormlight books would also get an 'Overarching Theory' topic.

 

It's not quite the same as a Seminal Theory as you describe it - those are more Theories that are important or have set the current knowledge on a subject. Unfortunately, I do find it difficult to judge theories like that. They are more subjective judgements, and I'm attempting to keep the rating tags as objective as I can. The other problem is that I haven't been on 17th Shard that long. I'm not certain what theories were completely unknown, which became the accepted theories, or when. If you or perhaps Chaos or Windrunner should nominate several, I'll see what I can do about marking them. Either that or I'll work them into the wiki summary I intend to begin to complete the index.

 

Any thoughts on further subdivision of the tags? Is that something that would be useful, or no?

Link to comment
Share on other sites

Well thank you, though once again I do pity you the time you spent. I also apologize that my tone becomes somewhat less than convivial at times during the War.

 

So far as marking "seminality" goes, I suppose it can wait for now. I do feel that it would be valuable to somehow mark out the theories that most people just walk around assuming these days, to some extent.

 

I really don't want to comment on the tags all that much because I'm trying to restrain the whole "Kurkistan is a narcissistic monster" streak of mine. Perhaps one or two more divisions if you feel the need, though I'd argue against using letter grades or the (let's admit, highly nerdy of us :P ) Heightenings. Besides, the Heightenings just end up being numerical anyway, and may result in overfitting because there are so many of them.

 

P.S. You read my big Forms thread too? My my, someone has a masochistic streak. :o

Link to comment
Share on other sites

I'm honestly not sure if you're being self-effacing or self-aggrandizing here Kurk! In a nutshell, though, your theories are good, well thought out, and super interesting. They tend to ramble, which probably puts some people off, but the underlying logic is usually sound and flows from your premise and assumptions. They're great theories! I'm not espousing anything just yet since I'm still picking away at my own things, but a lot of your material (on forms especially) I agree with. I'm sure we'll be having some more discussions on 'em in the future (*gets a strange and disturbing glint in his eye*)!

 

 

As for divisions, I'm not certain if I feel the need or if I don't. The tags have a threefold purpose. The first is to allow browsers of the Theory Index a brief glimpse at the quality of a theory before reading it, hopefully allowing them to sort through theories a little better. This is the lesser purpose however - many poorly presented theories have interesting ideas at their heart. For this purpose, staying with fewer divisions would be better, as it avoids confusion.

 

The second and more central purpose is to encourage theorizers to properly develop and present their theories so that they can get more imaginary internet points. I designed the system, and it still works for me - my theories since I started the Index have all been Gold Star theories because I won't let myself make something worse, haha. My hope is that theorists in general are motivated to increase both their output and their quality in order to see themselves improve on the Index. This purpose would benefit from a more closely refined division of score which provides superior feedback.

 

The last purpose is to allow new theorists and theory readers to identify people they can emulate, people who might support them, and people they can look to for reasoned and sourced responses in a thread. Your own reputation is quite fearsome, and when you arrive in a thread theorists tremble before you! There are many other well qualified theorists I'd like to recognize in such a fashion, for the above reasons. For this purpose, either method of tagging should be fine.

 

Since I'm having trouble deciding, I thought I'd ask for opinions - even if that opinion is only 'My theories should be ranked even better', that shows me purpose two is getting traction. ^_^

 

 

 

Edit:

 

 

I forgot my update! Thou shalt not waste posts. We're up to exactly 500 theories, with 15 pages to go in Cosmere boards. TES did indeed spawn theories, but even more came from the Sanderson reddit thread, which blew minds. Kurk has been climbing the ranks again, and we're starting to see the high quality / low quantity posters get crowded towards the bottom by more casual but prolific theorists. The WoR theorists are starting to pop up on the radar as well, even though we're only just in early 2013. I suspect many of the bigger names began to join starting at the time when the book was announced.

 

I also updated my script to do a rudimentary topic list automatically, so that's nice.

 

Hall of Fame and Graph Stats pages have been updated.

Edited by Tempus
Link to comment
Share on other sites

I am both, Tempus. Always and forever both. :P

 

Thank you, though I'll say that calling some of them "rambling" is understating it. I agree that my Formic theories are the most exciting/groundbreaking at the moment. The MEC, while it is an edifice of awesome, is not exactly spot on in all of its calls.

 

As for tagging, I suppose leaving it as is for now couldn't hurt.

Link to comment
Share on other sites

Thank you, though I'll say that calling some of them "rambling" is understating it. I agree that my Formic theories are the most exciting/groundbreaking at the moment. The MEC, while it is an edifice of awesome, is not exactly spot on in all of its calls.

 

Formic theories? (Ender's Game fan sniffs the air hungrily.)

Link to comment
Share on other sites

Sorry to excite you, Kobold, but that's just my bad pun that refers to the family of "Form" (read: "Ideal") theories I have. Like the Ideals that spren are based on.

Link to comment
Share on other sites

Sorry to excite you, Kobold, but that's just my bad pun that refers to the family of "Form" (read: "Ideal") theories I have. Like the Ideals that spren are based on.

Darn. Now I'm feeling a bit like an enemy gate. (Read: down).

Link to comment
Share on other sites

Short update so that no one thinks I'm dead.

 

I've been dreading getting cracking on the Stormlight Archive and Words of Radiance boards, as some might be able to tell. I was planning on finishing the Cosmere boards before my trip next Monday, and then starting the WoR boards after. Slogging through those 2500 posts to find all the tiny nuggets of joy, endless, countless hours preparing, watching, finding, reading, typing... so I said:

 

NO MORE

 

and wrote myself a script! I'm very proud of my script - it looks at all the threads in the boards, takes down all the relevant information, and heuristically analyzes the language in the post, the structure, the grammar, the spelling, the punctuation, the quotes, the references, and over 500 key Cosmere terms (Thanks R'Shara!) to produce a massive list.

 

The script sorts the items by class -> Class 1 is 90% confident it's a theory. Class 2 is 80%, Class 3 is 65%, Class 4 is everything else. It then grabs all the pertinent information and records it so I don't have to - author, date, title, link, etc. It then uses the first post analysis to generate the most likely Location, Category, and Topic of the post. It also produces a list of other probable topics, and rates them. It then assigns the theory a score based on the content, and chooses a tag according to this score. The metric system is the same I've been using in my manual evaluations, as best the computer can manage. In my tests against the existing theory list, theories were tagged with 87% consistency with my own ratings, locations were 97% consistent, categories were 76% consistent (the 'Plot' and 'Joke' categories are 0% consistent, sadly), and the topic was 45% consistent for the core topic, but 96% consistent that the topic I assigned was on the probable topic list. It also attempts to notice things where it feels it has got them badly wrong, and mark them for me to consider with attention.
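To make the class cutoffs and the consistency figures concrete, here is a small sketch. It assumes the class is chosen by thresholding a single confidence value at 90%, 80%, and 65%, and that "consistency" means the fraction of threads where the script's label matches the manual one. Both readings, and the names `assign_class`, `consistency`, and `in_candidates_rate`, are illustrative guesses and not the real script.

```python
def assign_class(confidence):
    """Map a 0..1 'is this a theory?' confidence to Class 1-4 (assumed cutoffs)."""
    if confidence >= 0.90:
        return 1
    if confidence >= 0.80:
        return 2
    if confidence >= 0.65:
        return 3
    return 4


def consistency(manual, predicted):
    """Fraction of entries where the predicted label equals the manual label."""
    matches = sum(1 for m, p in zip(manual, predicted) if m == p)
    return matches / len(manual)


def in_candidates_rate(manual_topics, candidate_lists):
    """Fraction of entries whose manual topic appears in the probable-topic list."""
    hits = sum(1 for t, cands in zip(manual_topics, candidate_lists) if t in cands)
    return hits / len(manual_topics)


# Tiny made-up example: four threads rated by hand and by the script.
manual_tags = ["Gold", "Silver", "Bronze", "Normal"]
script_tags = ["Gold", "Silver", "Bronze", "Bandaid"]
print(consistency(manual_tags, script_tags))
```

The same `consistency` call works for locations and categories, while `in_candidates_rate` corresponds to the looser check that the manual topic at least shows up among the probable topics.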

 

So overall, it's pretty darned good at what it does. It is of course quite possible that the inconsistent one in determining the original entries has been me on many occasions, and that the script judged it better. Nonetheless, before adding any of the entries to the Index I'll be reviewing them all. This will still take time, mind you, but significantly less time as not only will there be a lot less typing and copy-pasta, but there should be about 4 in 5 entries that I don't need to modify (out of the theories - non theories will have to be winnowed out and deleted).

 

 

 

Anyway, I started running the script, and you can find the results of Cosmere board pages 1-14 and the entire Stormlight Board here. Words of Radiance board will pop up later tonight, so it might be on there when you read it.

 

 

 

Some things you may note if you take a glance:

  • There are lots of bandaids. This is because it evaluates all threads. The ones that are not theories are highly likely to come up with a bandaid when they get evaluated, because they are of course not theories.
  • Class 4 is very big. This is also due to most of those threads being non-theories (but not all by any means)
  • The scoring system has some huge outliers at the upper end. The scores are based partly on exponential functions evaluating the criteria. As a result, the scoring system tends to break down around 35 points, and starts ramping up drastically after that, leading to some threads (especially unusual threads with inordinate amounts of quotes or formatting) being exceptionally high. This is intended, the score is merely an indicator of the presence of elements of well crafted theories, and identifying outliers is part of its job.
  • Some people have changed their display names since I scraped earlier, mostly my fault due to the Joke Profile Pic thread >.> This has messed with my table, haha. Reap what you sow!
Link to comment
Share on other sites

Nicely done, Tempus!  That's an impressive script to parse out that sort of information from basically a wall of text.  The programmer in me wants to know what language you wrote the script in.

Link to comment
Share on other sites

It's written in Python. Beautiful Soup provides the scraping tools, and the rest is built-in libraries. In a nutshell, the script searches the title, body, and tags (if there are any) for a variety of characteristics. The easy ones are things like word count, sentence count, paragraph count and a variety of formatting tags, quotes, spelling, and basic grammar. More complex functions look through a 16,000-character cross-indexed term dictionary and analyze by frequency, proximity, and category to produce a variety of indicative metrics.
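As a rough illustration of the "easy" metrics, here is a sketch that works on post text that has already been pulled out of the HTML (the Beautiful Soup step is left out so the example needs only the standard library). The tiny `COSMERE_TERMS` set is a stand-in for the real term dictionary, and `basic_metrics` is a made-up name.

```python
import re

# Stand-in for the real cross-indexed term dictionary (assumed contents).
COSMERE_TERMS = {"spren", "shard", "investiture", "allomancy", "cognitive"}


def basic_metrics(text):
    """Count paragraphs, sentences, words, and keyword hits in plain post text."""
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Sentences are split on runs of terminal punctuation (a crude heuristic).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    keyword_hits = sum(1 for w in words if w.lower() in COSMERE_TERMS)
    return {
        "paragraphs": len(paragraphs),
        "sentences": len(sentences),
        "words": len(words),
        "keyword_hits": keyword_hits,
    }


sample = (
    "Spren are bound to Ideals. Each Shard shapes its Investiture!\n\n"
    "Is the Cognitive Realm involved?"
)
print(basic_metrics(sample))
```

Proximity and category analysis would build on the same word list, for example by looking at how close together the keyword hits fall.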

 

When it's got all those metrics, it attempts to assemble them into indicators, usually modified with a fractional exponent to prevent bloating scores from overusing any single element. I then tested against my own evaluated threads, weighted the categories to match, added in a few simple heuristic rules to ensure certain things, and then presto.
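A minimal sketch of the fractional-exponent idea: raising a raw count to a power below 1 means the tenth quote adds less than the first, so no single element can bloat the score. The exponent of 0.5, the weights, and the names `dampened` and `indicator` are assumptions for illustration only.

```python
def dampened(count, exponent=0.5):
    """Apply a fractional exponent so large counts give diminishing returns."""
    return count ** exponent


def indicator(metrics, weights, exponent=0.5):
    """Weighted sum of dampened metrics (weights are made-up values)."""
    return sum(weights[name] * dampened(value, exponent) for name, value in metrics.items())


# 16 quotes count as 4, and 25 keywords count as 5 before weighting.
metrics = {"quotes": 16, "keywords": 25}
weights = {"quotes": 1.0, "keywords": 2.0}
print(indicator(metrics, weights))
```

Fitting the weights against already-rated threads would then be a matter of adjusting `weights` until the resulting tags line up with the manual ones.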

 

Example: Class 1 is basically 'Not crem and has the word 'theory' in the title or tag.' Class 2 does four or five checks against some weighted metrics, but it can mostly be boiled down to 'More assertions than anything else, and used a lot of keywords together.' Keywords being pretty much unique words from the books. It's not as complicated as it sounds when I summarize it as 'heuristic processing'. ^_^
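The Class 1 rule as described is simple enough to sketch directly. How the script decides a thread is "crem" isn't stated, so that is passed in as a flag here, and `is_class_1` is a hypothetical name.

```python
def is_class_1(title, tags, is_crem):
    """Class 1 as described: not crem, and 'theory' appears in the title or a tag."""
    has_theory = "theory" in title.lower() or any("theory" in t.lower() for t in tags)
    return (not is_crem) and has_theory


print(is_class_1("A Realmatic Theory", [], is_crem=False))
print(is_class_1("Favorite character?", ["poll"], is_crem=False))
```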

Edited by Tempus
Link to comment
Share on other sites

Well, here's one more post for you then!

 

It's been years since I've done anything in Python.  I tried writing my own scraping library years ago.  It worked, but was kind of flimsy in my opinion.  I've been in the JavaScript/Angular/C# world for a while now, so I probably couldn't write that same program again!

 

Your post explaining what you did was excellent.  I love that kind of nerdy stuff!   B)

Link to comment
Share on other sites

Thanks Kurk, I've been gathering quite a list. I've noted slightly over 200 theories that are mentioned or touched on, but needed development. Some of those have been scratched off as I found theories about them, some have become obsolete with WoR information and all the questions we've answered, but quite a few are still quite good (I was sorting them earlier). I have about 54 right now that I think might be interesting (55 including yours), and I'm sure I'll get a few more before I'm done. Sometime in the first week of June I plan to start work on presenting those nicely and getting a think tank or a theories wanted page or something of the sort for people to bounce around in.

 

Also, been sorting through my scripts, and they really are pretty good. I've made a few corrections, but overall I'm super impressed. If only they could accurately detect 'Plot' theories, and more accurately determine theories from other posts, I wouldn't have to do anything! =D

 

Lastly, I'll be away from my desktop for the next week, I'm visiting Chicago. Won't make any progress on the Index while I'm gone, but I'll still be around the forums via my mobile.

Link to comment
Share on other sites

Back from my vacation, got a couple hours to work on it. Finished off the whole Cosmere boards, another ~256 theories (excepting theories posted in the last ten days or so). Now up to 747 theories. PLANE POWER.

 

So I've now updated the Index, the Hall of Fame, and the Stats Page. I've seen the rise of Moogle, Isomere, and watched Chaos climb higher. I've seen the flux of bad theories about Hoid spawned after WoR release. Generally all sorts of goodies.

 

The Hall of Fame is getting quite crowded. It's now 33 entries long. I'm considering chopping off the bottom bits, upping the score threshold to 20, and the quantity to 8. This would chop off 9 and 12 people respectively, though. Notably, Nepene and Isomere would lose out on the quantity list.

 

Also of interest may be the changes in the statistics. Topic proliferation increased by about 25%, showing a shift to wider discussions. In terms of quality, Gold increased 2 points, Silver dropped four points, Bronze increased one point, and Normal increased one point. So a marginal shift towards more polarization - good theorists are making better theories, and casual theorists are doing about the same as ever. The theorist middle class is shrinking a bit. 

 

Location wise, Cosmere saw a big growth, as it should. Roshar was also well-represented on the Cosmere boards. A few notables to Sel, Nalthis, Ashyn, Yolen, and Taldain. Scadrial had the lowest proportional representation on the Cosmere boards, possibly because the Mistborn boards are still quite active. Category wise, Cosmere boards showed a shift away from Character and Plot theories, and a shift towards World and Realmatic. Can't imagine why that is! ^_^

 

Generally, a good batch! My script also offers significantly more topic/quality data, so I'm probably going to run it on all previously scraped threads and cross-examine it to improve the results (and possibly allow for a better categorization of topics).

Link to comment
Share on other sites
