An "AI think tank" is untrustworthy in a different way from a normal think tank because you can't trace responsibility. How do AI think tanks differ from the real thing? and how might/should government react to their increasing popularity?

Part of the reason I started this project was that I kept seeing mentions in the press of AI-activism and AI-disruption and AI-governance, so I wanted to know: how disruptive is it, really? How much time does it take? What can AI do with typical think tank goals? And is it a goldmine for activism? (lol, no). There's a lot to be said for putting a complex argument into the right words, but the right words are still as elusive as ever. If anything, they might be further away than ever: as more people consume the web via AI summarisation, it's conceivable that many (maybe most) never see what was actually written anyway.

What is a think tank, really?

A think tank isn't just a list of reports that turn facts into information; it's a network of people working towards shared goals. Usually (ideally) with some widely understood idea of:

Because these things are generally knowable (even if not known), you can read the incentives in the output and say "this is shaped by these people, probably for these reasons." That makes it criticisable. You can point at the mechanism or the source and pick it apart until you find what makes it tick.

Why AI bias is harder to locate

With an AI "think tank", the mechanism is more foggy. Output is shaped not strictly by reason or sense, but by small random or near-random selections of words, and by even very small decisions in:

It's easy to believe you can make an AI give whatever outputs you like, and that's kinda true, but the more you need it to be 'right' (as opposed to dogmatic), the more work you have to do to build a case within your current context window (a massive limiter for code/codex models, and becoming more apparent in text models).
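To make that concrete, here's a minimal sketch of what "building a case within the context window" can look like in practice: budgeting your source material in tokens before the argument ever reaches the model. The window size, the reserve, the file names and the tiktoken encoding are illustrative assumptions, not a recipe.

```python
from pathlib import Path

import tiktoken

# Rough sketch: budget source documents against a model's context window
# before drafting the argument. Window size, reserve and file names are
# illustrative assumptions.
enc = tiktoken.get_encoding("cl100k_base")
CONTEXT_WINDOW = 128_000          # varies by model
budget = CONTEXT_WINDOW - 8_000   # keep room for instructions and the reply

used, included = 0, []
for path in ["manifesto.txt", "committee_minutes.txt", "press_clippings.txt"]:
    tokens = len(enc.encode(Path(path).read_text(encoding="utf-8")))
    if used + tokens > budget:
        break                     # the case no longer fits; trim or summarise
    included.append(path)
    used += tokens

print(f"Including {included}: {used} of {budget} tokens")
```

The point isn't the arithmetic; it's that every document you drop to fit the window is an editorial decision the reader never sees.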

Bias always emerges where this boundary meets the real world, even for the most insignificant information: a butterfly's-wing-flap of a single word selection can colour the entire output of the model for every user, often invisibly, with thousands of outputs actioned before anyone thinks to check whether some tiny piece was doing what it was supposed to (the Post Office Horizon scandal comes to mind).

The bias isn't just "left/right" or "optimistic/pessimistic", it's much more diffuse. Layers of interpretation mean you can't tell whether you're reading the model, the product choices around the model, or the operator's editorial hand. It feels more slippery than a 'real' think tank, where at least you can understand some of the bias and use experience to infer a lot of the rest.

AI and Judgement

A think tank's value isn't listing pros and cons dispassionately. It's deciding what matters, what trade‑offs are acceptable, what's morally off‑limits, what to prioritise.

AI can generate text that looks like judgement, but it doesn't have commitments or consequences. It can't really be accountable because LLMs have no sense of self, or of self-preservation. If the recommendation goes wrong, nothing happens to the system, and in reality, likely nothing happens to the provider either. Someone else eats the cost and the blame, something that has held true since long before LLMs (the ever-dispassionate 'computer says no' problem).

It is, however, pretty good at parsing an argument or issue into components and constructing logical counter-arguments, with or even without additional information. If you want to make an argument stronger, a few cycles through different models can help stress-test a position and look at issues from distinct perspectives.

The main caveat is that you have to know to ask for them, and how to do it properly... if you just ask for 'diverse opinions', you're going to get back only what Mara and Elena (currently the first two 'statistical average' ChatGPT names) have to say about it. The 'average person' doesn't exist, and while many MPs have their own mind's-eye fantasy constituent, large LLMs end up answering as a person statistically averaged from all of humanity, someone who isn't real.
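As a rough illustration of that kind of cycle, here's a minimal sketch that asks for named, concrete perspectives rather than 'diverse opinions', and repeats the exercise across more than one model. It assumes an OpenAI-compatible API client; the position, personas and model names are all placeholders, not recommendations.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

POSITION = "Local authorities should be able to set their own tourist levies."

# Named, concrete perspectives rather than a vague ask for 'diverse opinions'.
PERSPECTIVES = [
    "a council finance officer worried about collection costs",
    "a seaside B&B owner on thin margins",
    "a Treasury official focused on tax simplicity",
]

MODELS = ["gpt-4o", "gpt-4o-mini"]  # cycle the same exercise through more than one model

for model in MODELS:
    for perspective in PERSPECTIVES:
        reply = client.chat.completions.create(
            model=model,
            messages=[
                {
                    "role": "system",
                    "content": f"Argue against the position below as {perspective}. "
                               "Be specific; no fence-sitting.",
                },
                {"role": "user", "content": POSITION},
            ],
        )
        print(f"--- {model} / {perspective}\n{reply.choices[0].message.content}\n")
```

The value isn't any single reply; it's the pattern of objections that keeps reappearing across models and personas.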

Legitimacy comes from being in the world

Our main report, A Year in Labour, was an interesting experiment in collating all the information and learning how blurred the line between policy and optics really is, but it also highlighted some of the limitations of AI-based information processing.

As the model produces more and more outputs, a pattern usually emerges of it following the same set of rules, even in quite distinct contexts, and over time even overused single words act as tiny flags in the generated text that the same branch of an idea was navigated every time.

Earlier GPT models would tell you if they avoided a complex subject or if your question touched a sensitive topic, but in the last few major models the rules get followed more quietly. You can often ask for a rationale (something like "in your previous output, where might you have dodged sensitive subjects?"), but phrase that even slightly differently and you get a completely different (and equally soupy) answer.

Internal tools like system messages, and even geofencing, severely limit the capabilities of ChatGPT especially, given its relatively mainstream focus, 'safe' default configuration, and the limited scope to customise settings in consumer tools.
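For contrast, here's a minimal sketch of the kind of control that sits outside the consumer product: via the API the caller writes the system message themselves. The model name and the instructions are illustrative assumptions; in the ChatGPT app this layer is largely filled by the provider's own configuration.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[
        # Via the API the caller supplies the system message; in the consumer
        # app this slot is mostly the provider's 'safe' defaults plus the
        # limited 'custom instructions' the product exposes.
        {
            "role": "system",
            "content": "You are a policy analyst. Note explicitly any topic you soften or avoid.",
        },
        {
            "role": "user",
            "content": "Summarise the trade-offs of a four-day week for public sector staff.",
        },
    ],
)
print(response.choices[0].message.content)
```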

It can do deep research, and bring together disparate/complex information to produce analyses and reports, but it can't earn legitimacy because legitimacy comes from being a party in the world: having a track record, relationships, being questioned, being embarrassed, being forced to explain yourself, having something on the line, and coming back to a point you care about.

Networks, not just memos

Real policy influence is mostly networked, not analytic. You can write the best memo in the world and it won't matter if nobody trusts you, knows you, or needs you.

An AI agent doesn't have those ties, so it either stays irrelevant or it borrows legitimacy from someone who does. At that point it's not an independent "think tank"; it's a tool or a brand extension.

What does it look like when the market matures, when full-penetration AI governance solutions are available off the shelf?

The SKU outcome

A pretty likely market shape is companies selling "unbiased policy analysis" the way they sell "secure" or "ethical" tech products to government today: a nice clean shippable solution where the decision-maker just plugs in every spreadsheet they have, along with their demographic interests, and the system gives them whatever actionable intel they want.

There are already lots of AI products for that in business, but a deep dive on what the big platforms are offering to leadership/governments might make an interesting future report.

Some issues worth considering:

How should AI think tanks work?

AI can help with research and drafting. But credibility requires named humans to own the judgement, the editorial choices, and the values. So, some sensible guidelines:

Bottom line

An "AI think tank" isn't credible unless named humans take ownership. Otherwise it's just another anonymous authority. Legislating transparency into AI tells you what some of the tuning knobs are doing, but does very little to eliminate or control real biases.

We expect to see much more from 'AI governance solutions', especially as the current government seem eager to leverage AI to promote growth, and are friendly enough with big tech that there's likely a lot still to come.

Government should probably stop pretending the maths hold up on AI-generated 'thought process explainers' (model transparency and technical data as a supposed solution), and instead (or at least also) take some pretty strong positions on how decisions that involved AI at any step get extra review responsibilities. If any system is touched by an AI decision, even one made outside government (like those that inform decision-making), anyone within that system should be able to raise a red flag and expect it to get very serious attention, pretty quickly.

Resources: organisations using AI to do “think tank work”

These are policy / legislative intelligence outfits (and a few institutional research orgs) that openly market AI-assisted briefings, report-building, drafting, comparison, monitoring, and internal knowledge tools.

AI report builders explicitly positioned as generating policy reports from a curated corpus, plus institutional "internal knowledge / analysis" tools (closer to an in-house AI analyst than a vendor):

A think-tank + AI-analytics hybrid product line (explicitly sold as AI-powered assessment):