I've started playing around with the Natural Language Toolkit (NLTK) for language analysis in Python, and I thought it might be fun to check whether there are systematic differences in how Jordan and Sanderson use language. It would be quite cool if a tool like that could identify who wrote which parts.
Now, I'm not a linguist, and based on a little googling it looks like author fingerprinting is far from a solved problem. Still, I'd guess it's easier to tell two known authors apart than to take an unknown text that could be by anyone and work out who wrote it.
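To make the two-author case concrete, here is a rough sketch of the kind of thing I have in mind. It rests on an assumption of mine, not on anything I've verified: that the relative frequency of common function words differs between authors. The word list and the function names are made up for illustration, and I've used a plain regex tokenizer from the standard library where NLTK's tokenizers could be swapped in.

```python
import re
from collections import Counter

# Hypothetical short list of function words, chosen only for illustration.
# A serious attempt would want a much longer, carefully chosen list.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that", "but", "was", "as"]


def function_word_profile(text, words=FUNCTION_WORDS):
    """Return the relative frequency of each function word in the text."""
    # Crude tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return [counts[w] / total for w in words]


def profile_distance(p, q):
    """Mean absolute difference between two profiles of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q)) / len(p)


def closest_author(unknown_text, samples):
    """Pick the author whose known sample has the nearest profile.

    samples maps an author name to a text known to be by that author.
    """
    target = function_word_profile(unknown_text)
    return min(
        samples,
        key=lambda name: profile_distance(
            target, function_word_profile(samples[name])
        ),
    )
```

The idea would be to call something like `closest_author(chapter_text, {"Jordan": jordan_text, "Sanderson": sanderson_text})` with long samples known to be by each author. With only two candidates the function just has to say which profile is nearer, which is why I'd expect this to be easier than the open-ended case. Whether the signal is strong enough to separate these two authors is exactly what I don't know.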
Does anyone have ideas about what one might look for? Or, for that matter, does anyone know of any existing computer analysis of the Wheel of Time?
Forward, comrades!
Natural language processing
18/11/2012 03:47:47 PM