Investigating Words: Forensic Stylometry and its Role for CFEs
GUEST BLOGGER
Christopher Ekimoff, CFE, CPA
Manager, Investigative Accounting & Financial Litigation, Duff & Phelps
Washington, D.C.
Authors throughout history have used pen names for a variety of reasons. In the 19th century, women adopted masculine names to sidestep bias from publishers (see Mary Ann Evans as George Eliot, and the Brontë sisters Charlotte, Emily and Anne as Currer, Ellis and Acton Bell).
More recently, well-known authors have used noms de plume to step out from under their own fame:
- Stephen King doubted the public would buy more than one book a year from the same author, so he wrote four novels as Richard Bachman until similarities in the works caught the attention of critics.
- J.K. Rowling of Harry Potter fame secretly penned a crime novel, The Cuckoo’s Calling, as Robert Galbraith. In an interview with The Telegraph, Rowling admitted “it has been wonderful to publish without hype or expectation and pure pleasure to get feedback under a different name.”
As a forensic accountant, I find it hard to explain to others exactly what I do. With projects and cases varying widely in industry, scope and deliverables, it’s always a stretch to describe the profession accurately in a 30-second elevator speech.
So I jump at the chance when the media covers a story that relates to the work of a forensic accountant. Rowling’s recent public admission wasn’t prompted by conscience or by advice from her agent, but by a team of linguists using computer programs for forensic stylometry.
One of the members of that investigative team, Patrick Juola of Duquesne University, gives a brief description of forensic stylometry on Language Log:
The basic theory is pretty simple: language is a set of choices, and speakers and writers tend to fall into habitual, or at least common, choices. Some choices come from dialect (the reason an Englishman drives a lorry but an American a truck), some from social pressure (if I need to impress someone with my vocabulary, I can utilize a polysyllabic lexicon instead of just using big words), and some just seem to come. An example of the latter category is in the use of many function words. If you ask yourself where the salad fork is relative to the plate, you quickly realize that it's usually to the left of the plate. Or is it? It's just as likely to be "on" the left of the plate, "at" the left of the plate, or perhaps "to" the left SIDE of the plate. Same fork, same position, and at least four different choices for how to describe it, none of which correspond to any sociolinguistic or cognitive variable with which I'm familiar.
Juola goes on to say that by quantifying and comparing those choices in writing, he and Peter Millican of Oxford University were able to draw conclusions about The Cuckoo’s Calling that led them to believe Rowling was most likely the author. (Juola’s complete post is available on Language Log.)
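To make "quantifying and comparing those choices" a little more concrete, here is a minimal Python sketch. It is not the method Juola and Millican used; it only illustrates the general idea of counting function words in two texts and comparing the resulting profiles. The word list, the function names and the two sample sentences are all assumptions made up for this illustration, and a real analysis would use far more text and far more features.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny list of function words.
# Real stylometric studies use much longer lists and other features.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in",
                  "on", "at", "by", "for", "with", "that"]


def function_word_profile(text):
    """Return the relative frequency of each function word in `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine_similarity(a, b):
    """Cosine of the angle between two profiles (1.0 = same proportions)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no function words found in at least one text
    return dot / (norm_a * norm_b)


# Made-up sample sentences echoing the salad fork example.
known = "The fork is to the left of the plate, and the knife is to the right."
questioned = "The fork sits on the left of the plate, with the knife on the right."

score = cosine_similarity(function_word_profile(known),
                          function_word_profile(questioned))
print(f"Similarity of function-word profiles: {score:.3f}")
```

A higher score only says the two texts use these particular words in similar proportions. On short samples like the ones above it says very little, which is one reason stylometry supports a conclusion rather than proving it.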
In forensic accounting investigations, we receive information in many forms: transaction data, financial statements, policy documents and emails – as well as prepared media statements, deposition testimony and legal briefs. It is in these latter three that forensic stylometry can help support (or refute) a forensic accountant’s conclusions about the facts of a case.
- Was this statement written by the speaker, or by the company’s public relations team?
- Do the answers to deposition questions appear to be genuine or coached and rehearsed by the legal counsel advising the witness?
- Does the expert’s report match up to his or her previous reports, or do sections appear to be written by others who are undisclosed?
Like most forensic undertakings, forensic stylometry is not a “smoking gun” solution to questions of veracity and authorship; it can still, however, provide insight into the development of the written information received.
Most forensic accountants already have at least one tool for forensic stylometry. Next time you finish writing a document in Microsoft Word, go to the Proofing section of the options menu, and check “Show Readability Statistics.” After a spelling and grammar review, Word will populate a table with statistics on your work. The picture below shows the statistics for this very blog post:
As you can see, I average 2.1 sentences per paragraph, over 23 words per sentence and wrote this piece at approximately a 12th-grade level.[1]
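If you want to compute similar figures outside of Word, the sketch below approximates three of them in Python: sentences per paragraph, words per sentence and a grade level. The grade level uses the published Flesch-Kincaid formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. Everything else is an assumption on my part: the function names are invented, sentences are split naively on end punctuation, and syllables are estimated with a rough vowel-group heuristic, so the numbers will not match Word's exactly.

```python
import re


def count_syllables(word):
    """Rough estimate: count vowel groups, dropping a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability_stats(text):
    """Approximate a few readability statistics for `text`.

    Paragraphs are separated by blank lines; sentences end in . ! or ?
    """
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not words or not sentences:
        raise ValueError("text must contain at least one sentence")

    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(count_syllables(w) for w in words) / len(words)
    return {
        "sentences_per_paragraph": len(sentences) / len(paragraphs),
        "words_per_sentence": words_per_sentence,
        # Flesch-Kincaid grade level formula
        "grade_level": (0.39 * words_per_sentence
                        + 11.8 * syllables_per_word - 15.59),
    }


# Made-up sample text: two paragraphs separated by a blank line.
sample = "The cat sat. The dog ran.\n\nIt was fun."
for name, value in readability_stats(sample).items():
    print(f"{name}: {value:.2f}")
```

Running the same function over several documents by one author gives a crude baseline to compare a questioned document against.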
Compare some of your own writing over time to see if you can identify consistencies. And the next time your intuition tells you something is off about a document’s author or authenticity, think about reviewing it through the lens of forensic stylometry. If Juola and Millican could outwit a fantasy writer turned crime novelist, what might you discover?
[1] More information on these statistics and their implications can be found in Financial Forensics: Body of Knowledge by Darrell Dorrell & Greg Gadawski, Wiley, 2012.