Building a digital home for Arapaho, one sentence at a time
Top image: "Two Riders Leading Horses" by Arapaho artist Frank Henderson, ca. 1882 (Photo: Metropolitan Museum of Art)
CU Boulder linguistics scholar Andrew Cowell helps Arapaho stories find new life online
The Arapaho words beteen, meaning “sacred,” and beteneyooo, “one’s body,” have a special connection for those who speak the language. Their linguistic similarity isn’t a coincidence.
Andrew Cowell, a CU Boulder professor of linguistics and faculty director of the Center for Native American and Indigenous Studies (CNAIS), says the Arapaho see it as a lesson encoded in the language. “It indicates that the body is sacred and therefore we have to protect it,” he says.
Such examples of cultural knowledge don’t always survive translation. That’s why Cowell, who believes in the importance of preserving Indigenous languages, redirected the entire trajectory of his career.

CU Boulder linguist Andrew Cowell, faculty director of the Center for Native American and Indigenous Studies (CNAIS), has partnered with a host of collaborators including CU students, community partners and native speakers to build digital tools to protect and revitalize the Arapaho language.
It’s also why, for the past two decades, he and a host of collaborators, including CU Boulder students, community partners and native speakers, have been building digital tools to protect and revitalize the Arapaho language.
Cowell didn’t originally come to CU Boulder to work on Arapaho, but he has long been curious about Indigenous languages, in part thanks to his personal connection to Native Hawaiian culture through his wife.
“Arapaho was the native language of Boulder, so when I got hired at CU I decided, well, I’ll look into Arapaho,” he recalls. “I started looking into Arapaho more and more and doing more work on the side and eventually decided to switch departments into linguistics so I could focus all my energy on Indigenous languages.”
Two databases, one goal
Today, Cowell’s work on Arapaho takes two forms: one, an online lexical database; the other, an unpublished, in-depth text database of natural language conversation and narratives.
The lexical database functions like a living dictionary. With more than 20,000 entries and a searchable interface, it’s often used by learners across the Arapaho-speaking world in place of print dictionaries, according to Cowell.
But a larger effort has quietly been taking shape behind the scenes.
The text database, which is not publicly released, contains more than 100,000 sentences of spoken Arapaho. Among them are natural conversations and stories recorded over decades.
“At this point, I’ve got over a hundred thousand sentences of natural speaking that I have not only recorded, but also transcribed into written Arapaho, translated into English, and then it has linguistic analysis attached as well,” Cowell explains.
The database is the backbone of several major projects, all with the goal of making learning Arapaho more accessible and preserving it for future generations. One effort is a student grammar dictionary that focuses on the most useful and common words.
“We’ve gotten a list of the frequency of all the nouns in the language and all the verbs,” Cowell says. “We ranked those, and it allowed us to produce a really small student dictionary where we only included words that occurred around 40 times or more.
“It means (students) don’t have to flip through rare and uncommon words they’re unlikely to be really interested in as initial learners.”
A pathway for new learners
Beyond the student dictionary, Cowell and his team are working on developing a scaled curriculum for teaching Arapaho. It guides learners from basics to more complex concepts across sequential levels based on real-world language use patterns.

Young Arapaho dancers (Photo courtesy the Wind River Casino)
“We’ve developed 44 steps of knowledge, and even within that there’s 23a and 23b and so forth,” he says. “It’s all based on looking at the text we’ve collected and looking at the frequency of certain kinds of grammatical features that occur.”
Unlike French or Spanish, Arapaho wasn’t historically taught in a classroom but passed down through families at home. Cowell’s team has had to build an instructional framework from the ground up.
“With Arapaho, no one’s really ever tried to teach it as a second language. Now we’re trying to learn it and teach it, and the databases have allowed us to really produce that scaled curriculum,” Cowell says.
Generations of trust
Ensuring that his work isn’t just academic has been a priority for Cowell since the start. The database project is built on decades of trust between himself and the Arapaho community.
“The one thing Native American communities have often had problems with in the past is someone comes in, does their research, then disappears. Then the community is left wondering what they are getting out of it. In some cases, nothing,” Cowell says. “I worked hard to establish that I really want to learn the language and ensure my work is something that will feed back into the community and help out.”
That commitment has led to rich partnerships, sometimes spanning generations.
“We’re close to having 100 different native speakers represented in our data. At this point we’ve got grandparents and now their kids are working on it,” Cowell says.
A worthy effort
From a linguist’s perspective, Cowell explains, Indigenous languages expand our understanding of what language, and indeed human cognition, can do.
“There are many cases in the history of linguistics where people have made a claim like ‘no language could possibly do this,’ and then someone goes to the Amazon and discovers a language that does it,” he says.
More importantly, the motivating force that has kept Cowell working for more than 20 years comes from the Arapaho speakers themselves.
He says, “In my experience, Native American communities are very invested in their language. They see it as really crucial, central to their identity.”
That’s why the full text database hasn’t been released publicly, especially with growing concerns about how the data might be used or exploited by artificial intelligence. Still, Cowell and his team are taking steps toward broader access.
A grant from the National Science Foundation will support the release of 5,000 carefully selected sentences from the text database for public use. The snippets, which have been approved by native Arapaho speakers, will be available online with additional computational linguistic labeling.
As for Cowell, he says that even after 20 years, he never tires of seeing the work evolve. He hopes it shows CU students what’s possible when you follow your curiosity.
“You never know where you’re going to end up and what results are going to come out of something. You just have to trust that research is going to turn out to be interesting. You can’t necessarily predict when or where.”