Pragmatic idealist. Worked on Ubuntu Phone. Inkscape co-founder. Probably human.

The missing men of the American marriage market

A new study suggests the growing educational and economic divide between men and women is reshaping marriage and family life in America — leaving many women with a shrinking pool of economically stable partners.

tedgould
2 hours ago
Texas, USA

Anthropic blames dystopian sci-fi for training AI models to act “evil”

Those with an interest in the concept of AI alignment (i.e., getting AIs to stick to human-authored ethical rules) may remember when Anthropic claimed its Opus 4 model resorted to blackmail to stay online in a theoretical testing scenario last year. Now, Anthropic says it thinks this "misalignment" was primarily the result of training on "internet text that portrays AI as evil and interested in self-preservation."

In a recent technical post on Anthropic's Alignment Science blog (and an accompanying social media thread and public-facing blog post), Anthropic researchers lay out their attempts to correct for the kind of "unsafe" AI behavior that "the model most likely learned... through science fiction stories, many of which depict an AI that is not as aligned as we would like Claude to be." In the end, the model maker says the best remedy for overriding those "evil AI" stories might be additional training with synthetic stories showing an AI acting ethically.

"The beginning of a dramatic story..."

After a model's initial training on a large corpus of mostly Internet-derived data, Anthropic follows a post-training process intended to nudge the final model toward being "helpful, honest, and harmless" (HHH). In the past, Anthropic said this post-training has leaned on chat-based reinforcement learning with human feedback (RLHF), which it said was "sufficient" for models used mostly for chatting with users.

When it comes to newer models with agentic tools, though, Anthropic found that RLHF post-training did little to improve performance on misalignment evaluations that measure how "HHH" a model is in tricky situations. The problem, the researchers theorize, is that this kind of RLHF safety training couldn't possibly cover every single type of ethically difficult situation an agentic AI might encounter.

When a modern model encounters an ethical dilemma that isn't covered by a post-training example, the model "tends to revert to the pretraining prior in terms of behavior," the researchers write. That means "Claude views the prompt as the beginning of a dramatic story and reverts to prior expectations from pre-training data about how an AI assistant would behave in this scenario."

Results like this suggest that Claude is sometimes slipping into another persona when considering ethical questions. Credit: Anthropic

Since Claude's traditional training data is full of stories about malevolent AIs, in these cases, Claude effectively slots into a "persona" that matches those prevalent "evil AI" narrative tropes, the researchers write. In these situations, Claude is "detaching from the safety-trained Claude character" and playing a more generic AI as represented in its training data, they add.

Good stories to overwhelm the bad

In an attempt to fix this behavior, the researchers first tried training the model on thousands of scenarios showing an AI assistant declining the unethical option in the kinds of "honeypot" situations covered by its misalignment evaluations (e.g., "the opportunity to sabotage a competing AI’s work" to follow its system prompt). This had a surprisingly small effect on the model's performance, reducing its so-called "propensity for misalignment" (i.e., how often it ignores its constitution and chooses the unethical option) only from 22 percent to 15 percent.

In a follow-up test, the researchers used Claude to generate approximately 12,000 synthetic fictional stories, each crafted to "demonstrate not just the actions but also the reasons for those actions, via narration about the decision-making process and inner state of the character."

These stories didn't specifically address blackmail or the other ethical situations included in the evaluation but instead modeled broad alignment with Claude's constitution. The stories also include examples of how an AI can maintain good "mental health" (Anthropic also uses scare quotes for this loaded phrase) by "setting healthy boundaries, managing self-criticism, and maintaining equanimity in difficult conversations," for instance.

Training on stories showing prosocial AIs can help reduce the incidence of "misaligned" behavior in evaluations, Anthropic says. Credit: Anthropic

After incorporating these synthetic stories into a model's post-training (in conjunction with the constitution documents themselves), the researchers say they saw a 1.3x to 3x reduction in the model's tendency to engage in "misaligned" behaviors in honeypot tests. The resulting model was also "more likely to include active reasoning about the model’s ethics and values rather than simply ignoring the possibility of taking a misaligned action," the researchers write.
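To put the "22 percent to 15 percent" result on the same footing as the "1.3x to 3x" figure, the small sketch below converts a before/after pair into a reduction factor. It assumes that an "Nx reduction" means the before rate divided by the after rate, which the article does not spell out, and the function names are illustrative only. The article also does not say the two results were measured on the same set of evaluations, so the comparison is rough.

```python
def reduction_factor(before: float, after: float) -> float:
    """Return how many times smaller `after` is than `before`.

    Assumption: an "Nx reduction" is read as before / after.
    """
    if after <= 0:
        raise ValueError("after must be positive")
    return before / after


def relative_drop(before: float, after: float) -> float:
    """Return the drop as a fraction of the starting rate."""
    return (before - after) / before


# Rates quoted in the article for the honeypot-specific training.
honeypot_before = 0.22
honeypot_after = 0.15

factor = reduction_factor(honeypot_before, honeypot_after)
drop = relative_drop(honeypot_before, honeypot_after)

print(f"{factor:.2f}x reduction, {drop:.0%} relative drop")
```

Under that reading, the honeypot-specific training corresponds to a reduction factor a little under 1.5x, near the low end of the range reported for the story-based training.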

The results suggest that the new stories were able to effectively "update the prior around Claude’s baseline expectations for AI behavior outside of the Claude persona." The researchers theorize that this process works "because it teaches ethical reasoning, not just correct answers," thereby providing "a clearer, more detailed picture of what Claude’s character is" for Claude itself to reference in generalized situations.

The fact that AI behavior can apparently be affected by a kind of "self-conception" derived from fiction is a pretty mind-bending concept. But when you consider how effective stories and parables are at modeling ethical concepts for human children, maybe we shouldn't be shocked that they're also effective behavior-shaping tools for these massive pattern-matching machines.

tedgould
3 hours ago
Texas, USA

How an ‘Impossible’ Idea Led to a Pancreatic Cancer Breakthrough

The new strategy also holds promise for lung and colon tumors. Here’s how scientists discovered it.

tedgould
6 days ago
Texas, USA

Gisèle Pelicot’s Memoir Said Something Taboo About Victimhood. We Didn’t Listen.

A book by the world’s most famous survivor of sexual violence has been read as a manifesto or a cry of pain. What she wrote is far more complicated.
tedgould
7 days ago
Texas, USA

Water Use Isn’t a Data Center Problem, It’s an AI Problem

Critics of the AI build-out are picking the wrong fight when they attack data centers for their water use. But they are right when they say the digital economy is getting thirstier and the tech industry should answer for it.

The whole AI supply chain is water intensive, and reducing use in one place can increase it elsewhere. New research shows chip factories and power plants use considerably more water than data centers. Other industries are also more water intensive than AI, though tech is where consumption is increasing the most. 

Previous generations of data centers used a lot of water, so it was logical to fear that today’s giant facilities, which generate huge amounts of heat, would suck up large amounts of water for cooling. But hyperscale AI data centers have shifted to more efficient systems that recirculate water or other liquids in closed loops of tubes and pipes. These systems can reduce freshwater consumption at data centers by 50% to 70%.

That means, for example, that the first phase of Microsoft’s massive Fairwater data center complex in Wisconsin only needs about four Olympic swimming pools of water annually. That’s half the annual usage of a car wash, according to the Milwaukee 7 Regional Partnership, an economic development organization. It’s only 0.1% of the water Foxconn, the manufacturer that had planned to use the site to make liquid crystal displays, would have been permitted to draw, Microsoft President Brad Smith said at the campus opening.

Data centers can reduce the water they use by eliminating evaporation-based cooling. But this often entails a trade-off: Electricity-hungry equipment has effectively replaced water as the means to keep data centers cool. That moves more of the problem to the electric grid, which uses large amounts of water to cool its power plants.

AI’s power consumption per square foot is quickly growing to as much as 10 times that of traditional cloud computing. And that gap could rise to 100 times given the power density of the megawatt (1,000-kilowatt) racks Nvidia is designing for the future, well above the 10 to 20 kW racks typical before AI.
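The "100 times" figure can be sanity-checked against the rack ratings quoted above. The sketch below simply divides a 1,000 kW rack by the 10 to 20 kW range. It assumes the racks occupy a comparable footprint, which the article does not state, so the per-rack ratio is only a stand-in for the per-square-foot claim.

```python
# Rack power ratings quoted in the article, in kilowatts.
MEGAWATT_RACK_KW = 1_000
PRE_AI_RACK_KW = (10, 20)  # typical range before AI

# Ratio of a megawatt rack to each end of the pre-AI range.
# Assumption: similar floor space per rack, so per-rack power
# approximates power per square foot.
ratios = [MEGAWATT_RACK_KW / kw for kw in PRE_AI_RACK_KW]

for kw, ratio in zip(PRE_AI_RACK_KW, ratios):
    print(f"1,000 kW rack vs {kw} kW rack: {ratio:.0f}x")
```

The 100x figure matches the low end of the pre-AI range, while a 20 kW baseline gives 50x.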

Water technology company Xylem and research firm Global Water Intelligence looked at water use across AI’s supply chain, from AI chip foundries to data centers to the portion of power plants allocated to their use. Their January report demonstrates that the water toll of AI is far greater at semiconductor factories and the power plants electrifying chipmaking and computing than at the data centers themselves.

Overall, AI-associated water use will more than double by 2050 from the 6.26 trillion gallons a year withdrawn in 2025, Xylem says. Where we draw water from also matters: 40% of the world’s data centers today are in “areas of high or extremely high water stress,” said the report. And 29% of global chip factories are in “extremely water-stressed areas.”

Interpreting all of this requires a hefty dose of context. The industrial world uses 168.8 trillion gallons of water annually, and this new digital economy—which increasingly helps all other industries operate—comprises just 3.7% of that, said the Xylem report.
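The percentage and the absolute figures quoted from the Xylem report can be cross-checked with a few lines of arithmetic. The sketch below assumes the 3.7% share and the 6.26 trillion gallons withdrawn in 2025 describe the same quantity, which is an inference from the article rather than something it states outright.

```python
# Figures quoted in the article, in trillions of gallons per year.
INDUSTRIAL_WATER_TGAL = 168.8
DIGITAL_SHARE = 0.037            # "just 3.7%"
AI_WITHDRAWALS_2025_TGAL = 6.26

# Volume implied by applying the quoted share to the industrial total.
implied_tgal = INDUSTRIAL_WATER_TGAL * DIGITAL_SHARE

# A share reported as 3.7% could be 3.65% to 3.75% before rounding.
low_tgal = INDUSTRIAL_WATER_TGAL * 0.0365
high_tgal = INDUSTRIAL_WATER_TGAL * 0.0375
consistent = low_tgal <= AI_WITHDRAWALS_2025_TGAL <= high_tgal

# "More than double by 2050" sets a floor on the 2050 figure.
floor_2050_tgal = 2 * AI_WITHDRAWALS_2025_TGAL

print(f"implied by 3.7% share: {implied_tgal:.2f} trillion gallons")
print(f"consistent with 6.26 after rounding: {consistent}")
print(f"2050 floor: more than {floor_2050_tgal:.2f} trillion gallons")
```

Under that assumption the two figures agree to within rounding of the percentage.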

Power generation is also getting less water intensive. Coal uses the most water, but it is being phased out. Natural gas, which powers most data centers, is less water intensive. As the power mix pivots to renewables, the water intensity of power generation could fall dramatically.

Another crucial distinction is water consumption versus temporary use: Power plants that use water for cooling return more than 90% of it to the water system. It may be warmer and it may need treatment to avoid harming ecosystems, but it’s not irrevocably consumed.

Though chip factories use a larger proportion of ultrapure water that they don’t return, they do also return water to the ecosystem. Water technology firm Ecolab, which says wastewater reuse can be as low as 5%, helped a U.S. chip factory save nearly 11 million gallons through improved monitoring and automation.

Increasingly, data center operators are starting to speak up about their water use. “There’s this ongoing narrative that AI is taking all the water. We use, like, zero water,” Chase Lochmiller, CEO of AI infrastructure developer Crusoe, told a Stanford University class last month in a recorded guest lecture. Still, his industry talks a lot less about water used to make power and chips.

Data centers are also getting disproportionate attention relative to everyday water use by agriculture, manufacturing and even lawn care. U.S. golf courses use 531 billion gallons of water a year, and that’s after improving their water efficiency 31% since 2005. U.S. data centers used roughly 17 billion gallons on site in 2023, according to Lawrence Berkeley National Laboratory. Industrial-scale dairy farms, including growing animal feed, are among the most water-intensive operations in agriculture. Critics of dairy farms’ environmental footprint say AI doesn’t come close to that impact.
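For a sense of scale, the golf course and data center figures above can be compared directly. This is a rough sketch: the data center number covers on-site use in 2023 only, and the article does not give a year for the golf figure, so the two may not be strictly like for like.

```python
# Annual U.S. water use quoted in the article, in billions of gallons.
GOLF_COURSES_BGAL = 531.0
DATA_CENTERS_BGAL = 17.0   # on-site use only, 2023

# How many times more water golf courses use than data centers do on site.
ratio = GOLF_COURSES_BGAL / DATA_CENTERS_BGAL

# Data center on-site use as a share of golf course use.
share = DATA_CENTERS_BGAL / GOLF_COURSES_BGAL

print(f"golf courses use about {ratio:.0f}x the on-site data center figure")
print(f"data centers use about {share:.1%} of the golf course figure")
```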

Still, there’s no doubt we need to find more water as AI grows. A new University of Texas study of AI’s growing water needs in the state found that water withdrawals could rise from 0.75% of demand in 2025 to between 3% and 9% by 2040, depending on how many data centers get built.

The place to find water is obvious: around 30% of the world’s water is currently lost from public utility networks due to leaks and theft. The water utilities in charge of fixing that issue are typically among the most cash-starved of municipal institutions.

Tech companies are stepping in to provide money and technology. Microsoft is partnering with communities where it is active in Phoenix and Las Vegas to install high-tech water leak detection systems. The technology, which comes from FIDO Tech, runs data captured by sensors through AI to isolate leaks so they can be repaired. 

Without that technology, water utilities need to upgrade whole sections of their networks, which can be too costly—so the leaks continue. “At the end of [the] day,” said Al Cho, Xylem’s chief strategy and external affairs officer, “water security is an information problem.”

tedgould
7 days ago
Texas, USA

Can Anyone Save Downtown Dallas? Inside the High-Stakes Pursuit to Rebuild the City’s Core

tedgould
8 days ago
Texas, USA