April 4, 2013

Robots not replacing humankind, just writing haikus for Tumblr


The next Walt Whitman.

Jacob Harris, senior software architect at The New York Times, has developed an algorithm to find accidental haikus in the paper and post them to a new Tumblr. The timing of the release couldn’t be much better: it’s National Poetry Month, and Twitter now allows line breaks.

You may remember Harris as the guy who reverse-engineered @Horse_ebooks’s script to create @nytimes_ebooks. This too runs a bevy of words through a script in hopes of creating art. If you ever did a dramatic reading of Spam Poetry, or followed Haikuleaks, I think it’ll be up your alley.

Harris explains:

A proper haiku should also contain a word that indicates the season, or “kigo,” as well as a juxtaposition of verbal imagery, known as “kireji.” That’s a lot harder to teach an algorithm, though, so we just count syllables like most amateur haiku aficionados do.

The program scans the NYT front page for articles, looks up the syllable count of each word in an open-source pronunciation dictionary, selects sentences that total exactly seventeen syllables, and breaks them into three lines of five, seven, and five syllables. Designer Heena Ko and software engineer Anjali Bhojani decided that all quotes should be posted as images rather than text on the Tumblr.
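For the curious, the syllable-counting step can be sketched in a few lines of Python. This is only an illustration of the idea described above, not Harris’s actual code: the function name `find_haiku` and the tiny hand-written `SYLLABLES` table are assumptions made for this example, with the table standing in for a real pronunciation dictionary.

```python
import re

# Hypothetical stand-in for a pronunciation dictionary: word -> syllable count.
# A real dictionary would cover far more words; this one covers a single sentence.
SYLLABLES = {
    "as": 1, "a": 1, "leftover": 3, "the": 1, "soup": 1, "was": 1,
    "equally": 3, "good": 1, "without": 2, "croutons": 2,
}


def find_haiku(sentence, syllables=SYLLABLES, pattern=(5, 7, 5)):
    """Return the sentence split into haiku lines, or None if it does not fit."""
    lines, current, count = [], [], 0
    for token in sentence.split():
        # Normalize the token for lookup: lowercase, strip punctuation.
        word = re.sub(r"[^a-z']", "", token.lower())
        if word not in syllables:
            return None  # unknown word: we cannot count it, so skip the sentence
        if len(lines) == len(pattern):
            return None  # all three lines are full but words remain
        current.append(token)
        count += syllables[word]
        target = pattern[len(lines)]
        if count > target:
            return None  # a word straddles the line break
        if count == target:
            lines.append(" ".join(current))
            current, count = [], 0
    if current or len(lines) != len(pattern):
        return None  # too few syllables overall
    return lines


if __name__ == "__main__":
    haiku = find_haiku("As a leftover, the soup was equally good without the croutons.")
    if haiku:
        print("\n".join(haiku))
```

Note that a sentence with seventeen syllables is still rejected if a word would have to be split across a line break, and any word missing from the table disqualifies the sentence, which is presumably why new syllable counts keep getting added.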

Articles covering sensitive subjects are not scanned for potential haikus, and the bot knows to skip anything with awkward sentence construction. (More information about the process is detailed in Harris’s blog.)

“Over time we’ve added syllable counts for words like ‘Rihanna’ or ‘terroir,’” the haiku Tumblr reports. Yesterday, Harris posted on Twitter that he was adding syllable counts for “gewürztraminer, vindaloo, sabermetrics, esoterica, mortarboard, defenestrate, koan, nametag, ceramicist…”

The Times created this Tumblr with a sense of fun, but is clearly also hoping to drive traffic to the main site. The underlying code and the use of natural language processing could be valuable to future projects, assistant editor for interactive news Marc Lavallee added in an interview with Nieman Lab.

Some favorites:

As a leftover,

the soup was equally good

without the croutons.


The buzzing of a

thousand bees in the tiny

curled pearl of an ear.


The family mutt

nabs it and reduces it

to a gooey lump.


It’s hard to find your

bearings in the middle of

a cataclysm.


His deadly prose is

so authentic that it has

a life of its own.


“As an engineer,

I’m sort of a student of

how things fall apart.”


Kirsten Reach is an editor at Melville House.