In April, two researchers at the University of Texas published an article in the academic journal Psychological Science, saying they could attribute a new play to William Shakespeare. The play in question, Double Falsehood, was long thought to have been written a century after Shakespeare’s death by Lewis Theobald, who supposedly based it on a lost play co-written by Shakespeare and another playwright, John Fletcher.
Although scholars have searched for new plays by Shakespeare for centuries, these researchers, Ryan Boyd and James Pennebaker, say they can prove Shakespeare and Fletcher, not Theobald, were the real authors of Double Falsehood.
And they have a novel method to do it. They begin from the premise that you can read an author’s psychological signature buried in the words he or she uses in a play. “The way that people use literally dozens of different categories of language,” Boyd says, “reflect all of these separate psychological processes that are happening in their head.”
Boyd and Pennebaker wrote a series of computer programs, took a group of plays by Shakespeare, Fletcher and Theobald, broke the plays down into words and strings of words, sorted the words, and then gave them numerical codes. After that, according to Boyd, they “hand them over to different algorithms that put them together in such a way that we can figure out which psychological processes best distinguish between the three different authors.”
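The steps Boyd describes (split a text into words, code the words by category, then compare authors) can be illustrated with a minimal sketch. This is not the researchers' software: the two word lists, the function names and the nearest-profile comparison are all assumptions made here for illustration, and the article does not say which categories or algorithms the study actually used.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon. The study's real word categories are not
# listed in the article; these two are invented for illustration.
CATEGORIES = {
    "family": {"mother", "father", "sister", "brother", "daughter", "son"},
    "time": {"hour", "day", "night", "year"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    normalized = text.lower().replace("’", "'")
    return re.findall(r"[a-z']+", normalized)


def category_rates(text):
    """Return each category's share of all word tokens in the text."""
    tokens = tokenize(text)
    counts = Counter()
    for token in tokens:
        for name, words in CATEGORIES.items():
            if token in words:
                counts[name] += 1
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return {name: counts[name] / total for name in CATEGORIES}


def closest_author(profiles, sample):
    """Pick the author whose category profile is nearest to the sample.

    Uses squared Euclidean distance, a stand-in chosen for simplicity,
    not the method reported in the study.
    """
    def distance(profile):
        return sum((profile[c] - sample[c]) ** 2 for c in CATEGORIES)

    return min(profiles, key=lambda author: distance(profiles[author]))
```

Calling `category_rates` on a passage gives the fraction of its words that fall in each category, and `closest_author` compares that result against per-author averages built the same way. Note that the comparison always returns one of the authors it was given, however poor the match.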
One category of language is called “social words.” In the case of Double Falsehood, Boyd offers as an example a passage using family words, like “mother”:
I conjure you,
By all the tender Interests of Nature,
By the chaste Love ’twixt you, and my dear Mother?
(O holy Heav’n, that she were living now!)
Forgive and pity me.— Oh, Sir, remember,
I’ve heard my Mother say a thousand Times,
Her Father would have forced her Virgin Choice;
But when the Conflict was ’twixt Love and Duty,
Which should be first obey’d, my Mother quickly
Paid up her Vows to Love, and married You.
Boyd says these words demonstrate that the author values personal attachments — something you find all over the writing of John Fletcher. This process, Boyd says, has left him “95 to 97 percent sure” that Shakespeare and Fletcher co-wrote Double Falsehood. “The play does have all the hallmarks of a legitimate Shakespeare and Fletcher collaboration,” he says.
But there are lots of ways the printing practices of Shakespeare’s era can trip up even the best of computer programs. For one, unlike books today, two copies of the same edition might not be identical to each other. For example, in the First Folio of Shakespeare (the first collected edition of the Bard’s plays, printed in 1623), there are multiple versions of Hamlet’s first soliloquy. In three different copies of the First Folio housed at the Folger Shakespeare Library in Washington, DC, Hamlet wishes that “this too, too sullied flesh,” “this too, too soiled flesh” and “this too, too solid flesh” would melt, thaw and resolve itself into a dew.
Another problem comes from the classifications that Boyd and Pennebaker gave to the words they came across in the plays. Their algorithm is supposed to be able to pick up on the writer’s fetishes or cravings, but it has a big blind spot: 16th-century puns, which pop up in nearly every line of the plays, but can be hard for present-day English speakers to pick up on. According to David Crystal, author of the Cambridge Encyclopedia of Language, in the 16th century, the word “hour” was pronounced like “oar,” which made it sound the same as “whore.” “Anytime you get the word ‘hour’ turning up in a Shakespeare play,” Crystal says, “there’s always a possibility that there might be a pun there.”
So an algorithm that assumed the word “hour” was only about time would fundamentally misunderstand the word’s meaning. And “hour” is hardly the only word like this. “These puns cover every conceivable word in the canon,” Crystal says. “You are always on the lookout for them.”
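To make the blind spot concrete, here is a small sketch of how a lookup table that allows one category per word handles “hour.” Both tables and both function names are hypothetical, invented for this illustration, and the sense labels are assumptions rather than anything taken from the study.

```python
from collections import Counter

# Hypothetical one-label-per-word table, the kind of coding that
# treats "hour" as being only about time.
SINGLE_LABEL = {"hour": "time", "mother": "family"}

# Hypothetical table that records every sense a period audience
# might have heard in the word.
MULTI_SENSE = {"hour": {"time", "sexual"}, "mother": {"family"}}


def tally_single(tokens):
    """Count categories using exactly one label per word."""
    return Counter(SINGLE_LABEL[t] for t in tokens if t in SINGLE_LABEL)


def ambiguous_tokens(tokens):
    """List the tokens that carry more than one possible sense."""
    return [t for t in tokens if len(MULTI_SENSE.get(t, ())) > 1]


# A line from As You Like It that repeats the word.
line = "and so from hour to hour we ripe and ripe".split()
```

Here `tally_single(line)` records two “time” words and nothing else, while `ambiguous_tokens(line)` flags both occurrences of “hour” as possible puns. A flag like that still needs a human reader to decide which sense is in play.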
This isn’t the first time scholars with computers have claimed to find Shakespearean treasure. In the mid-1990s, a scholar named Don Foster, using a similar program to analyze the frequency of words used in Shakespeare’s works, announced that he had discovered a previously unattributed poem by Shakespeare. Seven years later, after several editions of Shakespeare’s poetry had incorporated the new poem, scholars attributed it to one of Shakespeare’s later imitators, John Ford. Foster, who hadn’t included Ford’s work in his database, admitted he was wrong.
That is not to say Boyd and Pennebaker’s discovery will meet the same fate, but perhaps jumping in as they do may show “a mind impatient, an understanding simple and unschool’d.”
This story first aired as an interview on PRI's Studio 360 with Kurt Andersen.