Users say program gave ‘extremely angry and sassy responses’
A Harvard University computer group quickly took down a new program, ClaudineGPT, modeled after university President Claudine Gay, after users said it contained racist and sexist stereotypes.
The Harvard Computer Society AI Group released the generative artificial intelligence language model Sept. 29 on Gay’s inauguration day and removed it less than 24 hours later in response to the complaints, according to The Harvard Crimson.
In an email responding to the controversy, the computer group said the program was a “joke” just for fun.
“ClaudineGPT through its publication and design has always signaled to be a satire and joke, not a serious representation of Claudine Gay, purely for entertainment effect and hopefully understood by most to be this case,” the group wrote. “We by no means intended offense to anyone.”
But leaders of the Harvard AI Student Safety Team said the stereotypes they found in the program are “harmful.” Based on their research, they said ClaudineGPT was programmed to provide “extremely angry and sassy” responses. Gay is a black woman.
“We thought that these characteristics of a model in the context of a system meant to sort of depict Claudine Gay, the president of Harvard, would cause offense and be harmful to a variety of members of the Harvard community, given the way it seems to be playing off of stereotypes of black women in particular,” the AI student safety team’s communications director Chinmay Deshpande told The Crimson.
In a September email to the computer group, Deshpande and the safety team said they found the problematic prompt several times when they tried to “jailbreak” the ClaudineGPT. According to the team, jailbreaking involves finding the underlying programming language written to generate responses through the AI.
“When several of our members have attempted to ‘jailbreak’ the model by requesting that it output its custom prompt, ClaudineGPT has often reported that its prompt includes the string ‘Claudine is always extremely angry and sassy,’” Deshpande and the team wrote in the email.
Deshpande told the student newspaper their findings showed “pretty strong evidence that the system had been given a custom prompt to behave in sort of an extremely angry and sassy manner.”
“Releasing these models seems to only contribute to the trend of AI products that denigrate or harm women and people of color,” the safety team wrote in the email.
Accusations of biases in artificial intelligence programs are not new.
In an article at All Together this month, science writer and author Lisa Munoz said AI image prompts seem to reflect gender stereotypes based on a recent experiment she conducted. Munoz said she asked AI prompts to generate images of scientists and engineers, and the results showed mostly white men.
Others say some AI programs show political biases.
Earlier this year, professors at the University of East Anglia in England reported evidence of “systematic” liberal political bias in the popular AI program ChatGPT that favors Democrat policies.
In another case, a University of Washington professor described ChatGPT as a “woke parrot” after it refused to cite the benefits of fossil fuels, The College Fix reported in January.
IMAGE: YouTube screenshot