Opening the archive: Community Messageboards 2 & 3
Nebhead - 02/01/2014 03:59:21 PM

As part of the fundraising drive last Dec (not last Dec, but last-last Dec... you know what I mean), I promised to import some of the old wotmania messageboards that Tor and I scraped from the site in its last few months.

It's taken me far, faaar longer than anticipated (approx 45 hours over the last two weeks alone), mostly due to the appalling state of the HTML I was trying to extract the data from. I also ended up needing to re-write most of the messageboard functionality to allow the archive posts to show properly.

You can view the archived boards in the new "Archive" section of the menu over on the left.

It's not possible to show the archive boards in the same way as the current ones, mostly due to the fact that there are no "recent" posts which can be displayed. Instead, the core homepage allows you to do some basic filtering of top-level threads, whereas the search page provides much more in-depth functionality to find specific posts.

There are a couple of things to watch out for:

  • The base filter form on the top of the board page won't allow a date range of more than 31 days. This is to stop the server from dying a fiery death as it tries to process a LOT of data. Even a full month's worth of posts takes a good few seconds to load. Keeping date ranges small is a good way to keep the website running quickly.
  • If you have a premium account, you'll be able to use the "favourite post" functionality on the archived boards in the same way as you can on the normal boards.
  • I've added the ability for comments/new posts to be added to these old threads. The comment form is at the bottom of the page when viewing a post. It works in the same way as comments on quickpolls/journals/bugs/etc; just a single unthreaded list of comments for the whole thread. You can use the "order" option on the filter form to order threads by the number of "new" replies they have.
  • The way in which we saved the posts from wotmania means that on posts with over 100 responses, the sub-threading has been lost. The data I have allows me to identify the top-level responses within a particular thread, but not the sub-threading thereafter. These posts are instead shown with an exclamation icon next to them, to show that they should be in a sub-thread.
  • The quality of Mike's HTML was terrible, and, even worse, was different depending on the contents of the post. This means that I needed to extract data in a number of different ways, depending on just how badly messed up the code was. The end result of this is that the formatting of many posts just isn't quite right. In particular, edit links show even though they don't work, post links are shown twice, and there tends to be a lot of extra space at the end of the post. Signatures are also poorly formatted, and tend to be a lot bigger than they should be.
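
For anyone curious about the 31-day cap on the filter form mentioned above, here is a rough sketch of how such a check could look. This is only an illustration in Python, not the actual RAFO code: the function name, the constant name, and the choice to measure the range as the number of days between the two dates are all assumptions.

```python
from datetime import date

# Limit taken from the post; the constant name is hypothetical.
MAX_RANGE_DAYS = 31


def validate_date_range(start: date, end: date, max_days: int = MAX_RANGE_DAYS) -> None:
    """Raise ValueError if the range is reversed or spans more than max_days.

    Assumption: the span is the number of days between start and end,
    so a range of exactly max_days is still accepted.
    """
    if end < start:
        raise ValueError("end date is before start date")
    span = (end - start).days
    if span > max_days:
        raise ValueError(f"date range covers {span} days; the limit is {max_days}")


if __name__ == "__main__":
    # A range inside the limit passes silently.
    validate_date_range(date(2003, 10, 14), date(2003, 11, 14))
    # A longer range is rejected before any posts would be queried.
    try:
        validate_date_range(date(2003, 10, 14), date(2004, 4, 9))
    except ValueError as err:
        print(f"rejected: {err}")
```

Rejecting the range up front, before touching the post data, is what keeps an over-wide filter from ever reaching the expensive part of the request.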

Still, it's good enough for now. If anyone's interested in a more in-depth explanation of how we scraped the data in the first place, and how I went about extracting it and importing it into RAFO, let me know. It's all terribly exciting.



As a final note, you may notice that there is a big gap between the final post on CMB2 (14th Nov 2003) and the first post on CMB3 (9th Apr 2004). This is because, as per Mike's first post, the original CMB3 got corrupted and was wiped out.

It's all my fault...
Vegas Aug 17-18 - A Night to Remember

[Image: signature_images/100.gif]
Spoony made this aaaages ago for me. Never got to use it though... until now!
Opening the archive: Community Messageboards 2 & 3 - 02/01/2014 03:59:21 PM 11326 Views
Awesome! Finally! - 02/01/2014 04:56:19 PM 1362 Views
Great stuff. - 02/01/2014 06:13:10 PM 1183 Views
Wow (not a world of warcraft reference) - 03/01/2014 12:46:16 AM 1213 Views
Wow redux - 03/01/2014 04:41:46 AM 1355 Views
If you can read this, the server is down. *pensive* - 03/01/2014 06:06:35 PM 1403 Views
Seeing a lot of those names is quite nostalgic. - 03/01/2014 09:09:56 PM 1239 Views
Thanks a lot for this! - 06/01/2014 08:03:38 AM 1224 Views
Woohoo - 11/01/2014 02:41:15 AM 1212 Views
Really really glad your message boards don't work at my work computer - 20/01/2014 03:04:37 PM 1124 Views
Man! I wrote some dumb crap when I was in my 20s - 11/01/2014 10:46:44 AM 1115 Views
Re: Opening the archive: Community Messageboards 2 & 3 - 20/01/2014 06:45:34 AM 1141 Views
i was *hilariously* immature in my 20s. love it! *NM* - 21/01/2014 02:34:01 AM 850 Views
Nice. - 21/01/2014 04:18:26 AM 1381 Views
*steals all your wse points* *NM* - 22/01/2014 08:16:32 PM 637 Views
HI TRIGGER *NM* - 24/01/2014 03:55:51 AM 686 Views
Ben, you are awesome. - 24/01/2014 03:57:24 AM 1116 Views
Now we get to relive Gems like this.. - 10/03/2014 11:45:14 AM 1131 Views
WOooOOooOOOooOWwwWwwww.... - 12/05/2014 01:20:00 PM 1124 Views
Before my time. - 21/10/2014 02:22:58 AM 964 Views
Re: Opening the archive: Community Messageboards 2 & 3 - 02/03/2014 05:42:50 PM 1562 Views
Oh Ben! What a wonder you are! ^_^ *NM* - 12/05/2014 01:16:23 PM 621 Views
Way Back Machine lets you see wotmania as it once was... - 02/09/2014 07:07:13 AM 1159 Views
You are amazing, thank you for all your hard work *NM* - 08/11/2014 08:12:04 PM 564 Views
