<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"

	>

<channel>
	<title>Radar</title>
	<atom:link href="https://www.oreilly.com/radar/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.oreilly.com/radar</link>
	<description>Now, next, and beyond: Tracking need-to-know trends at the intersection of business and technology</description>
	<lastBuildDate>Wed, 14 Feb 2024 17:53:15 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=5.3.17</generator>
	<item>
		<title>The OpenAI Endgame</title>
		<link>https://www.oreilly.com/radar/the-openai-endgame/</link>
				<comments>https://www.oreilly.com/radar/the-openai-endgame/#respond</comments>
				<pubDate>Tue, 13 Feb 2024 11:07:40 +0000</pubDate>
		<dc:creator><![CDATA[Mike Loukides]]></dc:creator>
				<category><![CDATA[AI & ML]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Commentary]]></category>

		<guid isPermaLink="false">https://www.oreilly.com/radar/?p=15401</guid>
				<description><![CDATA[Since the New York Times sued OpenAI for infringing its copyrights by using Times content for training, everyone involved with AI has been wondering about the consequences. How will this lawsuit play out? And, more importantly, how will the outcome affect the way we train and use large language models? There are two components to [&#8230;]]]></description>
								<content:encoded><![CDATA[
<p>Since the<em> New York Times</em> sued OpenAI for infringing its copyrights by using <em>Times</em> content for training, everyone involved with AI has been wondering about the consequences. How will this lawsuit play out? And, more importantly, how will the outcome affect the way we train and use large language models?</p>



<p>There are two components to this suit. First, it was possible to get ChatGPT to reproduce some <em>Times</em> articles very close to verbatim. That’s fairly clearly copyright infringement, though there are still important questions that could influence the outcome of the case. Reproducing the<em> New York Times</em> clearly isn’t the intent of ChatGPT, and OpenAI appears to have modified ChatGPT’s guardrails to make generating infringing content more difficult, though probably not impossible. Is this enough to limit any damages? It’s not clear that anybody has used ChatGPT to avoid paying for a <em>NYT</em> subscription. Second, the examples in a case like this are always cherry-picked. While the <em>Times</em> can clearly show that OpenAI can reproduce some articles, can it reproduce any article from the <em>Times</em>’ archive? Could I get ChatGPT to produce an article from page 37 of the September 18, 1947 issue? Or, for that matter, an article from the<em> Chicago Tribune</em> or the<em> Boston Globe</em>? Is the entire corpus available (I doubt it), or just certain random articles? I don’t know, and given that OpenAI has modified GPT to reduce the possibility of infringement, it’s almost certainly too late to do that experiment. The courts will have to decide whether inadvertent, inconsequential, or unpredictable reproduction meets the legal definition of copyright infringement.</p>



<p>The more important claim is that training a model on copyrighted content is infringement, whether or not the model is capable of reproducing that training data in its output. An inept and clumsy version of this claim was made by Sarah Silverman and others in a suit that was <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.hollywoodreporter.com/business/business-news/sarah-silverman-lawsuit-ai-meta-1235669403/" target="_blank">dismissed</a>. The Authors’ Guild has its own version of this lawsuit, and it is working on a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.hollywoodreporter.com/business/business-news/authors-guild-exploring-blanket-license-artificial-intelligence-companies-1235785941/" target="_blank">licensing</a> model that would allow its members to opt in to a single licensing agreement. The outcome of this case could have many side-effects, since it essentially would allow publishers to charge not just for the texts they produce, but for how those texts are used.</p>



<p>It is difficult to predict what the outcome will be, though easy enough guess. Here’s mine. OpenAI will settle with the<em> New York Times</em> out of court, and we won’t get a ruling. This settlement will have important consequences: it will set a de-facto price on training data. And that price will no doubt be high. Perhaps not as high as the <em>Times</em> would like (there are rumors that OpenAI has offered something in the range of <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.adweek.com/media/open-ai-response-new-york-times-lawsuit/#" target="_blank">$1 million to $5 million</a>), but sufficiently high enough to deter OpenAI’s competitors.</p>



<p>$1M is not, in and of itself, a terribly high price, and the <em>Times</em> reportedly thinks that it’s way too low; but realize that OpenAI will have to pay a similar amount to almost every major newspaper publisher worldwide in addition to organizations like the Authors Guild, technical journal publishers, magazine publishers, and many other content owners. The total bill is likely to be close to $1 billion, if not more, and as models need to be updated, at least some of it will be a recurring cost. I suspect that OpenAI would have difficulty going higher, even given Microsoft’s investments—and, whatever else you may think of this strategy—OpenAI has to think about the total cost. I doubt that they are close to profitable; they appear to be running on an Uber-like business plan, in which they spend heavily to buy the market without regard for running a sustainable business. But even with that business model, billion-dollar expenses have to raise the eyebrows of partners like Microsoft.</p>



<p>The <em>Times</em>, on the other hand, appears to be making a common mistake: overvaluing its data. Yes, it has a large archive—but what is the value of old news? Furthermore, in almost any application but especially in AI, the value of data isn’t the data itself; it’s the correlations between different datasets. The <em>Times</em> doesn’t own those correlations any more than I own the correlations between my browsing data and Tim O’Reilly’s. But those correlations are precisely what’s valuable to OpenAI and others building data-driven products.</p>



<p>Having set the price of copyrighted training data to $1B or thereabouts, other model developers will need to pay similar amounts to license their training data: Google, Microsoft (for whatever independently developed models they have), Facebook, Amazon, and Apple. Those companies can afford it. Smaller startups (including companies like Anthropic and Cohere) will be priced out, along with every open source effort. By settling, OpenAI will eliminate much of their competition. And the good news for OpenAI is that even if they don’t settle, they still might lose the case. They’d probably end up paying more, but the effect on their competition would be the same. Not only that, the <em>Times</em> and other publishers would be responsible for enforcing this “agreement.” They’d be responsible for negotiating with other groups that want to use their content and suing those they can’t agree with. OpenAI keeps its hands clean, and its legal budget unspent. They can win by losing—and if so, do they have any real incentive to win?</p>



<p>Unfortunately, OpenAI is right in claiming that a good model can’t be trained <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://the-decoder.com/openai-says-its-impossible-to-train-state-of-the-art-models-without-copyrighted-data/" target="_blank">without copyrighted data</a> (although Sam Altman, OpenAI’s CEO, has also said the <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.cnbc.com/2024/01/18/openai-ceo-on-nyt-lawsuit-ai-models-dont-need-publishers-data-.html" target="_blank">opposite</a>). Yes, we have substantial libraries of public domain literature, plus Wikipedia, plus papers in ArXiv, but if a language model trained on that data would produce text that sounds like a cross between 19th century novels and scientific papers, that’s not a pleasant thought. The problem isn’t just text generation; will a language model whose training data has been limited to copyright-free sources require prompts to be written in an early-20th or 19th century style? Newspapers and other copyrighted material are an excellent source of well-edited grammatically correct modern language. It is unreasonable to believe that a good model for modern languages can be built from sources that have fallen out of copyright.</p>



<p>Requiring model-building organizations to purchase the rights to their training data would inevitably leave generative AI in the hands of a small number of unassailable monopolies. (We won’t address what can or can’t be done with copyrighted material, but we will say that copyright law says nothing at all about the source of the material: you can buy it legally, borrow it from a friend, steal it, find it in the trash—none of this has any bearing on copyright infringement.) One of the participants at the WEF roundtable <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.weforum.org/events/world-economic-forum-annual-meeting-2024/sessions/the-expanding-universe-of-generative-models/" target="_blank">The Expanding Universe of Generative Models</a> reported that Altman has said that he doesn’t see the need for more than one foundation model. That’s not unexpected, given my guess that his strategy is built around minimizing competition. But this is chilling: if all AI applications go through one of a small group of monopolists, can we trust those monopolists to deal honestly with issues of bias? AI developers have said a lot about “alignment,” but discussions of alignment always seem to sidestep more immediate issues like race and gender-based bias. Will it be possible to develop specialized applications (for example, O’Reilly Answers) that require training on a specific dataset? I’m sure the monopolists would say “of course, those can be built by fine tuning our foundation models”; but do we know whether that’s the best way to build those applications? Or whether smaller companies will be able to afford to build those applications, once the monopolists have succeeded in buying the market? Remember: Uber was once inexpensive.</p>



<p>If model development is limited to a few wealthy companies, its future will be bleak. The outcome of copyright lawsuits won’t just apply to the current generation of Transformer-based models; they will apply to any model that needs training data. Limiting model building to a small number of companies will eliminate most academic research. It would certainly be possible for most research universities to build a training corpus on content they acquired legitimately. Any good library will have the <em>Times</em> and other newspapers on microfilm, which can be converted to text with OCR. But if the law specifies how copyrighted material can be used, research applications based on material a university has legitimately purchased may not be possible. It won’t be possible to develop open source models like <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://huggingface.co/mistralai/Mistral-7B-v0.1" target="_blank">Mistral</a> and <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://huggingface.co/docs/transformers/model_doc/mixtral" target="_blank">Mixtral</a>—the funding to acquire training data won’t be there—which means that the smaller models that don’t require a massive server farm with power-hungry GPUs won’t exist. Many of these smaller models can run on a modern laptop, which makes them ideal platforms for developing AI-powered applications. Will that be possible in the future? Or will innovation only be possible through the entrenched monopolies?</p>



<p>Open source AI has been the victim of a lot of <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://spectrum.ieee.org/open-source-ai-2666932122" target="_blank">fear-mongering</a> lately. However, the idea that open source AI will be used irresponsibly to develop hostile applications that are inimical to human well-being gets the problem precisely wrong. Yes, open source will be used irresponsibly—as has every tool that has ever been invented. However, we know that hostile applications will be developed, and are already being developed: in military laboratories, in government laboratories, and at any number of companies. Open source gives us a chance to see what is going on behind those locked doors: to understand AI’s capabilities and possibly even to anticipate abuse of AI and prepare defenses. Handicapping open source AI doesn’t “protect” us from anything; it prevents us from becoming aware of threats and developing countermeasures.</p>



<p>Transparency is important, and proprietary models will always lag open source models in transparency. Open source has always been about source code, rather than data; but that is changing. OpenAI’s GPT-4 scores surprisingly well on Stanford’s <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://crfm.stanford.edu/fmti/" target="_blank">Foundation Model Transparency Index</a>, but still lags behind the leading open source models (Meta’s LLaMA and BigScience’s BLOOM). However, it isn’t the total score that’s important; it’s the “upstream” score, which includes sources of training data, and on this the proprietary models aren’t close. Without data transparency, how will it be possible to understand biases that are built in to any model? Understanding those biases will be important to addressing the harms that models are doing now, not hypothetical harms that might arise from sci-fi superintelligence. Limiting AI development to a few wealthy players who make private agreements with publishers ensures that training data will never be open.</p>



<p>What will AI be in the future? Will there be a proliferation of models? Will AI users, both corporate and individuals, be able to build tools that serve them? Or will we be stuck with a small number of AI models running in the cloud and being billed by the transaction, where we never really understand what the model is doing or what its capabilities are? That’s what the endgame to the legal battle between OpenAI and the <em>Times</em> is all about.</p>
]]></content:encoded>
							<wfw:commentRss>https://www.oreilly.com/radar/the-openai-endgame/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
							</item>
		<item>
		<title>Radar Trends to Watch: February 2024</title>
		<link>https://www.oreilly.com/radar/radar-trends-to-watch-february-2024/</link>
				<comments>https://www.oreilly.com/radar/radar-trends-to-watch-february-2024/#respond</comments>
				<pubDate>Tue, 06 Feb 2024 11:01:51 +0000</pubDate>
		<dc:creator><![CDATA[Mike Loukides]]></dc:creator>
				<category><![CDATA[Radar Trends]]></category>
		<category><![CDATA[Signals]]></category>

		<guid isPermaLink="false">https://www.oreilly.com/radar/?p=15392</guid>
				<description><![CDATA[2024 started with yet more AI: a small language model from Microsoft, a new (but unnamed) model from Meta that competes with GPT-4, and a text-to-video model from Google that claims to be more realistic than anything yet. Research into security issues has also progressed—unfortunately, discovering more problems than solutions. A common thread in several [&#8230;]]]></description>
								<content:encoded><![CDATA[
<p>2024 started with yet more AI: a small language model from Microsoft, a new (but unnamed) model from Meta that competes with GPT-4, and a text-to-video model from Google that claims to be more realistic than anything yet. Research into security issues has also progressed—unfortunately, discovering more problems than solutions. A common thread in several recent attacks has been to use embeddings: an attacker discovers innocuous text or images that happen to have an embedding similar to words describing actions that aren’t allowed. These innocuous inputs easily get by filters designed to prevent hostile prompts.</p>



<h2>AI</h2>



<ul><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://slgero.medium.com/merge-large-language-models-29897aeb1d1a" target="_blank">Merging large language models</a> gets developers the best of many worlds: use different models to solve different kinds of problems. It’s essentially <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/Mixture_of_experts" target="_blank">mixture of experts</a> but applied at the application level of the stack rather than the model level.</li><li>Researchers have developed a method for <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2401.12070" target="_blank">detecting AI-generated text</a> that is 90% accurate and has a false positive rate of only 0.01%.</li><li>Google has announced <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://lumiere-video.github.io/" target="_blank">Lumiere</a>, a text-to-video model that generates “realistic, diverse, and coherent” motion. Lumiere generates the entire video in one pass rather than generating distinct keyframes that are then merged.</li><li>Is JavaScript a useful language for developing artificial intelligence applications? <em>The New Stack</em> <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/top-5-javascript-tools-for-ai-engineering/" target="_blank">lists</a> five tools for building AI applications in JavaScript, starting with TensorFlow.js.</li><li>Meta has released a new language model that <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://digialps.com/meta-research-introduces-revolutionary-self-rewarding-language-models-capable-of-gpt-4-level-performance/" target="_blank">claims performance similar to GPT-4</a>. It is a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/pdf/2401.10020.pdf" target="_blank">self-rewarding language model</a>; it continually evaluates its responses to prompts and adjusts its parameters in response. An independent open source <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/lucidrains/self-rewarding-lm-pytorch" target="_blank">implementation</a> is already on GitHub.</li><li>Hospitals are <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/health/2024/01/what-do-threads-mastodon-and-hospital-records-have-in-common/" target="_blank">using federated learning</a> techniques to collect and share patient data without compromising privacy. With federated learning, the hospitals aren’t sharing actual patient data but machine learning models built on local data.</li><li>Researchers have discovered “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2307.14539" target="_blank">compositional attacks</a>” against multimodal language models. In these attacks, prompts that combine text and images are used to “jailbreak” the model. A hostile but benign-looking image establishes a context in which the model ignores its guardrails.</li><li>Researchers have used tests for psychologically profiling humans to profile AI models and <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://techxplore.com/news/2024-01-psychological-profiling-language-based-ai.html" target="_blank">research their built-in biases and prejudices</a>.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2305.18290" target="_blank">Direct Preference Optimization</a> (DPO) is an algorithm for training language models to operate in agreement with human preferences. It is simpler and more efficient than <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback" target="_blank">RLHF</a>.</li><li>Mistral has published a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2401.04088" target="_blank">paper</a> describing its Mixtral 8x7B model, a mixture of experts model with very impressive performance.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/cars/2024/01/volkswagen-is-adding-chatgpt-to-its-infotainment-system/" target="_blank">Volkswagen has added ChatGPT</a> to the infotainment system on its cars. ChatGPT will not have access to any of the car’s data.</li><li>Language models rely on converting input tokens to embeddings (long sequences of numbers). Can the original text be recovered from the embeddings used with language models? The answer may be <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2310.06816" target="_blank">yes</a>.</li><li>AWS’s AI product, Q, now has tools to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/aws-gifts-java-rust-developers-with-useful-tools/" target="_blank">automate updating Java programs</a> to new versions. That includes finding and replacing deprecated dependencies.</li><li>Microsoft’s Phi-2 model is now open source; it has been <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://huggingface.co/microsoft/phi-2/commit/7e10f3ea09c0ebd373aebc73bc6e6ca58204628d" target="_blank">relicensed</a> with the MIT license. Phi-2 is a small model (2.7B parameters) with performance comparable to much larger models.</li><li>Simon Willison’s summary of <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://simonwillison.net/2023/Dec/31/ai-in-2023/" target="_blank">AI in 2023</a> is the best we’ve seen. In the coming year, Simon would love to see us get beyond “vibes-based development.” Unlike traditional programming, AI doesn’t do what you tell it to do, and we’re frequently forced to evaluate AI output on the basis of whether it “feels right.”</li><li>The US FTC has <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.bleepingcomputer.com/news/security/ftc-offers-25-000-prize-for-detecting-ai-enabled-voice-cloning/" target="_blank">issued</a> a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.ftc.gov/news-events/contests/ftc-voice-cloning-challenge" target="_blank">challenge</a> to developers: develop software that can detect AI-generated clones of human voices. The winner will receive a $25,000 prize.</li><li>DeepMind has built a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/ai/2024/01/deepmind-ai-rivals-the-worlds-smartest-high-schoolers-at-geometry/" target="_blank">model that can solve geometry problems</a>. The new model combines a language model with symbolic AI, giving it the ability to reason logically about problems in addition to matching patterns.</li></ul>



<h2>Programming</h2>



<ul><li>Any app can become extensible. <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://extism.org/" target="_blank">Extism</a> is a WebAssembly library that can be added to almost any app that allows app users to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/extism-v1-run-webassembly-in-your-app/" target="_blank">write plug-ins</a> in most major programming languages.</li><li>Zed, a collaborative code editor, is now <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://zed.dev/blog/zed-is-now-open-source" target="_blank">open source</a> and available on <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/zed-industries/zed" target="_blank">GitHub</a>.</li><li>A <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.blog/2024-01-23-good-devex-increases-productivity/" target="_blank">study</a> by GitHub shows that creating a good developer experience (DevEx or DX) improves productivity by reducing cognitive load, shortening feedback loops, and helping developers to remain in “flow state.”</li><li>Julia Evans (@<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="mailto:b0rk@jvns.ca" target="_blank">b0rk@jvns.ca</a>) has compiled a list of <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://gist.github.com/jvns/f7d2db163298423751a9d1a823d7c7c1" target="_blank">common Git mistakes</a>.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://ruffle.rs/" target="_blank">Ruffle</a> is a Flash emulator built with Rust and Wasm. While you may not remember Macromedia Flash, and you probably don’t want to use it for new content, the New York Times is using Ruffle to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://flowingdata.com/2024/01/10/nyt-flash-based-visualizations-work-again/" target="_blank">resurrect</a> archival content that used Flash for visualizations.</li><li>JavaScript as a shell language? <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://bun.sh/blog/the-bun-shell" target="_blank">Bun</a> is an open source JavaScript shell that can run on Linux, macOS, and Windows. It’s the only shell that is truly platform-independent.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://shadeup.dev/" target="_blank">Shadeup</a> is a new programming language that extends TypeScript. It is designed to simplify working with WebGPU.</li><li>“<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/rethinking-observability/" target="_blank">Rethinking Observability</a>” argues for thinking about how users experience a service, rather than details of the service’s implementation. What are the critical user journeys (CUJs), and what are service level objectives (SLOs) for those paths through the system?</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://marimo.io/" target="_blank">Marimo</a> is a new Python notebook with some important features. When you edit any cell, it automatically updates all affected cells; the notebooks themselves are pure Python and can be managed with Git and other tools; GitHub Copilot is integrated into the Marimo editor.</li><li>LinkedIn has <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/linkedin-shares-its-developer-productivity-framework/" target="_blank">released</a> its <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://linkedin.github.io/dph-framework/" target="_blank">Developer Productivity and Happiness Framework</a>, a set of metrics for processes that affect developer experience. The metrics include things like code review response time, but LinkedIn points out that the framework is most useful in helping teams build their own metrics.</li><li>The Node package registry, NPM, recently <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.bleepingcomputer.com/news/security/everything-blocks-devs-from-removing-their-own-npm-packages/" target="_blank">accepted</a> a package named “everything” that links to everything in the registry. Whether this was a joke or a hostile attack remains to be seen, but an important side effect is that it became impossible to remove a package from NPM.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/ktock/container2wasm" target="_blank">container2wasm</a> takes a container image and converts it to WebAssembly, The Wasm executable can be run with <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://wasi.dev/" target="_blank">WASI</a> or even in a browser. This project is still in its early stages, but it is very impressive.</li><li>The <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://ahastack.dev/aha/1-stack-overview/" target="_blank">AHA Stack</a> provides a way to build web applications that minimizes browser-side JavaScript. It is based on the Astro framework, htmx, and Alpine.js.</li><li>Last year ended with Brainfuck implemented in PostScript. To start 2024, someone has found a working <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/kspalaiologos/malbolge-lisp" target="_blank">Lisp interpreter written in Malbolge</a>, a language that competes with Brainfuck for being the most difficult, frustrating, and obtuse programming language in existence.</li><li>The year starts with a new Python web framework, <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://blog.miguelgrinberg.com/post/microdot-yet-another-python-web-framework" target="_blank">Microdot</a>. How long has it been since we’ve had a new Python framework? It’s very similar to Flask, but it’s small; it was designed to run on MicroPython, which runs on microcontrollers like ESP8266.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://odin-lang.org/" target="_blank">Odin</a> is yet another new programming language. It supports data-oriented programming and promises high performance with explicit (though safe) control of memory management and layout. It claims simplicity, clarity, and readability.</li></ul>



<h2>Security</h2>



<ul><li>The UK’s National Cyber Security Center has <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.bleepingcomputer.com/news/security/uk-says-ai-will-empower-ransomware-over-the-next-two-years/" target="_blank">warned</a> that generative AI will be used in ransomware and other attacks. Generative AI will make social engineering and phishing more convincing; it will enable inexperienced actors to create much more dangerous attacks.</li><li>A <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.schneier.com/blog/archives/2024/01/side-channels-are-common.html" target="_blank">presentation at USENIX’s security symposium</a> argues that side channels leak information in almost all commodity PCs: microphones, cameras, and other sensors pick up electromagnetic signals from the processor. These signals can be captured and decoded.</li><li>Like everyone else, malware groups are <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.theregister.com/2023/12/11/lazarus_group_edang/" target="_blank">moving to memory-safe languages</a> like Rust and <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://dlang.org/" target="_blank">DLang</a> to develop their payloads.</li><li>Researchers have <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2401.05566" target="_blank">discovered</a> that poisoned training data can be used to insert backdoors into large language models. These backdoors can be triggered by special prompts and cannot be discovered or removed by current safety techniques.</li><li>Programmers who use AI assistants are likely to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2211.03622" target="_blank">write code that is less secure</a> while believing that their code is more secure. However, users of AI assistants who don’t “trust” the AI engage more with the code produced and are likely to produce code that is more secure.</li><li>A <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/security/2024/01/a-previously-unknown-worm-has-been-stealthily-targeting-linux-devices-for-a-year/" target="_blank">variant of the Mirai malware is attacking Linux systems</a>. This variant finds weak SSH passwords and installs cryptocurrency mining software to create a mining botnet.</li><li>Many groups offer “bug bounties” that pay rewards to those who discover bugs (particularly security vulnerabilities) in their code. One open source maintainer <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://daniel.haxx.se/blog/2024/01/02/the-i-in-llm-stands-for-intelligence/" target="_blank">argues</a> that this process is being distorted by incorrect bug reports that are generated by AI, wasting maintainers’ time.</li><li>The US National Institute of Standards and Technology <a href="https://venturebeat.com/security/new-nist-report-sounds-the-alarm-on-growing-threat-of-ai-attacks/">has</a> <a href="https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-2e2023.pdf">published</a> a taxonomy and standard terminology for attacks against machine learning and AI systems.</li></ul>



<h2>Web</h2>



<ul><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://nimbo.earth/products/earth-online/" target="_blank">Nimbo Earth Online</a> aims to be a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenextweb.com/news/digital-twin-rival-google-earth-nimbo" target="_blank">“digital twin” of the Earth</a>. It’s superficially similar to Google Earth but has fascinating features like the ability to see historical progressions: for example, how a landscape changed after a fire or how a river’s course wandered over the years.</li><li>A <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://downloads.webis.de/publications/papers/bevendorff_2024a.pdf" target="_blank">study</a> shows that search results are getting worse as a result of SEO spam. The problem affects all major search engines. If you read the paper and ignore click-bait summaries, Google is doing a somewhat better job of maintaining search integrity than its competitors.</li><li><em>The Verge</em> has an excellent <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.theverge.com/c/23998379/google-search-seo-algorithm-webpage-optimization" target="_blank">article</a> about how optimizing sites for Google search have affected web design, making sites much more homogeneous.</li><li>Facebook’s app includes a new <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://gizmodo.com/meet-link-history-facebook-s-new-way-to-track-the-we-1851134018" target="_blank">Link History</a> setting (on by default) that encourages use of the app’s built-in browser. Link History saves all links, and the browser is known to include a keylogger; the data from both is used for targeted advertising.</li></ul>



<h2>Quantum Computing</h2>



<ul><li>While we don’t yet have usable quantum computers, an <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.schneier.com/blog/archives/2024/01/improving-shors-algorithm.html" target="_blank">improvement</a> to Shor’s algorithm for factoring numbers has been <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2308.06572" target="_blank">published</a>. While it reduces the computational time from O(N^2) to O(N^1.5), it increases the number of qubits required, which may be an important limitation.</li></ul>
]]></content:encoded>
							<wfw:commentRss>https://www.oreilly.com/radar/radar-trends-to-watch-february-2024/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
							</item>
		<item>
		<title>Technology Trends for 2024</title>
		<link>https://www.oreilly.com/radar/technology-trends-for-2024/</link>
				<comments>https://www.oreilly.com/radar/technology-trends-for-2024/#respond</comments>
				<pubDate>Thu, 25 Jan 2024 11:04:43 +0000</pubDate>
		<dc:creator><![CDATA[Mike Loukides]]></dc:creator>
				<category><![CDATA[Radar Column]]></category>
		<category><![CDATA[Research]]></category>

		<guid isPermaLink="false">https://www.oreilly.com/radar/?p=15363</guid>
				<description><![CDATA[This has been a strange year. While we like to talk about how fast technology moves, internet time, and all that, in reality the last major new idea in software architecture was microservices, which dates to roughly 2015. Before that, cloud computing itself took off in roughly 2010 (AWS was founded in 2006); and Agile [&#8230;]]]></description>
								<content:encoded><![CDATA[
<p>This has been a strange year. While we like to talk about how fast technology moves, internet time, and all that, in reality the last major new idea in software architecture was microservices, which dates to roughly 2015. Before that, cloud computing itself took off in roughly 2010 (AWS was founded in 2006); and Agile goes back to 2000 (the&nbsp;<em>Agile Manifesto</em>&nbsp;dates back to 2001, Extreme Programming to 1999). The web is over 30 years old; the Netscape browser appeared in 1994, and it wasn’t the first. We think the industry has been in constant upheaval, but there have been relatively few disruptions: one every five years, if that.</p>



<p>2023 was one of those rare disruptive years. ChatGPT changed the industry, if not the world. We’re skeptical about things like job displacement, at least in technology. But AI is going to bring changes to almost every aspect of the software industry. What will those changes be? We don’t know yet; we’re still at the beginning of the story. In this report about how people are using O’Reilly’s learning platform, we’ll see how patterns are beginning to shift.</p>



<p>Just a few notes on methodology: This report is based on O’Reilly’s internal “Units Viewed” metric. Units Viewed measures the actual usage of content on our platform. The data used in this report covers January through November in 2022 and 2023. Each graph is scaled so that the topic with the greatest usage is 1. Therefore, the graphs can’t be compared directly to each other.</p>



<p>Remember that these “units” are “viewed” by our users, who are largely professional software developers and programmers. They aren’t necessarily following the latest trends. They’re solving real-world problems for their employers. And they’re picking up the skills they need to advance in their current positions or to get new ones. We don’t want to discount those who use our platform to get up to speed on the latest hot technology: that’s how the industry moves forward. But to understand usage patterns, it’s important to realize that every company has its own technology stacks, and that those stacks change slowly. Companies aren’t going to throw out 20 years’ investment in PHP so they can adopt the latest popular React framework, which will probably be displaced by another popular framework next year.</p>



<h2>Software Development</h2>



<p>Most of the topics that fall under software development declined in 2023. What does this mean? Programmers are still writing software; our lives are increasingly mediated by software, and that isn’t going to change.</p>



<p>Software developers are responsible for designing and building bigger and more complex projects than ever. That’s one trend that won’t change: complexity is always “up and to the right.” Generative AI is the wild card: Will it help developers to manage complexity? Or will it add complexity all its own? It’s tempting to look at AI as a quick fix. Who wants to learn about coding practices when you’re letting GitHub Copilot write your code for you? Who wants to learn about design patterns or software architecture when some AI application may eventually do your high-level design? AI is writing low-level code now; as many as&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.blog/2023-06-13-survey-reveals-ais-impact-on-the-developer-experience/#:~:text=on%20developing%20solutions.-,The%20bottom%20line,but%20enable%20upskilling%20opportunities%2C%20too." target="_blank">92% of software developers are using it</a>. Whether it will be able to do high-level design is an open question—but as always, that question has two sides: “Will AI do our design work?” is less interesting than “How will AI change the things we want to design?” And the real question that will change our industry is “How do we design systems in which generative AI and humans collaborate effectively?”</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig1-678x1048.png" alt="" class="wp-image-15364" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig1-678x1048.png 678w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig1-194x300.png 194w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig1-768x1187.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig1-994x1536.png 994w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig1.png 1209w" sizes="(max-width: 678px) 100vw, 678px" /><figcaption>Figure 1. Software architecture</figcaption></figure>



<p>Regardless of the answers to these questions, humans will need to understand and specify what needs to be designed. Our data shows that most topics in software architecture and design are down year-over-year. But there are exceptions. While software architecture is down 3.9% (a relatively small decline), enterprise architecture is up 8.9%. Domain-driven design is particularly useful for understanding the behavior of complex enterprise systems; it’s down, but only 2.0%. Use of content about event-driven architecture is relatively small, but it’s up 40%. That change is important because event-driven architecture is a tool for designing large systems that have to ingest data from many different streams in real time. Functional programming, which many developers see as a design paradigm that will help solve the problems of distributed systems, is up 9.8%. So the software development world is changing. It’s shifting toward distributed systems that manage large flows of data in real time. Use of content on topics relevant to that shift is holding its own or growing.</p>



<p>Microservices saw a 20% drop. Many developers expressed frustration with microservices during the year and argued for a return to monoliths. That accounts for the sharp decline—and it’s fair to say that many organizations are paying the price for moving to microservices because it was “the thing to do,” not because they needed the scale or flexibility that microservices can offer. From the start, microservice proponents have argued that the best way to develop microservices is to start with a monolith, then break the monolith into services as it becomes necessary. If implemented poorly, microservices deliver neither scale nor flexibility. Microservices aren’t ideal for new greenfield projects, unless you’re absolutely sure that you need them from the start—and even then, you should think twice. It’s definitely not a technology to implement just to follow the latest fad.</p>



<p>Software developers run hot and cold on design patterns, which declined 16%. Why? It probably depends on the wind or the phase of the moon. Content usage about design patterns increased 13% from 2021 to 2022, so this year’s decline just undoes last year’s gain. It’s possible that understanding patterns seems less important when AI is writing a lot of the code for you. It’s also possible that design patterns seem less relevant when code is already largely written; most programmers maintain existing applications rather than develop new greenfield apps, and few texts about design patterns discuss the patterns that are embedded in legacy applications. But both ways of thinking miss the point. Design patterns are common solutions to common problems that have been observed in practice. Understanding design patterns keeps you from reinventing wheels. Frameworks like React and Spring are important because they implement design patterns. Legacy applications won’t be improved by refactoring existing code just to use some pattern, but design patterns are useful for extending existing software and making it more flexible. And, of course, design patterns are used in legacy code—even code that was written before the term was coined! Patterns are discovered, not “invented”; again, they’re common solutions to problems programmers have been solving since the beginning of programming.</p>



<p>At the same time, whenever there’s a surge of interest in design patterns, there’s a corresponding surge in pattern abuse: managers asking developers how many patterns they used (as if pattern count were a metric for good code), developers implementing&nbsp;FactoryFactoryFactory&nbsp;Factories, and the like. What goes around comes around, and the abuse of design patterns is part of a feedback loop that regulates the use of design patterns.</p>



<h2>Programming and Programming Languages</h2>



<p>Most of the programming languages we track showed declines in content usage. Before discussing specifics, though, we need to look at general trends. If 92% of programmers are using generative AI to write code and answer questions, then we’d certainly expect a drop in content use. That may or may not be advisable for career development, but it’s a reality that businesses built on training and learning have to acknowledge. But that isn’t the whole story either—and the bigger story leaves us with more questions than answers.</p>



<p>Rachel Stephens provides two fascinating pieces of the puzzle in a&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://redmonk.com/rstephens/2023/12/14/language-rankings-update/" target="_blank">recent article on the RedMonk blog</a>, but those pieces don’t fit together exactly. First, she notes the decline in questions asked on Stack Overflow and states (reasonably) that asking a nonjudgmental AI assistant might be a preferable way for beginners to get their questions answered. We agree; we at O’Reilly have built&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.oreilly.com/online-learning/feature-answers.html" target="_blank">O’Reilly Answers</a>&nbsp;to provide that kind of assistance (and are in the process of a major upgrade that will make it even more useful). But Stack Overflow shows a broad peak in questions from 2014 to 2017, with a sharp decline afterward; the number of questions in 2023 is barely 50% of the peak, and the 20% decline from the January 2023 report to the July report is only somewhat sharper than the previous drops. And there was no generative AI, no ChatGPT, back in 2017 when the decline began. Did generative AI play a role? It would be foolish to say that it didn’t, but it can’t be the whole story.</p>



<p>Stephens points to another anomaly: GitHub pull requests declined roughly 25% from the second half of 2022 to the first half of 2023. Why? Stephens guesses that there was increased GitHub activity during the pandemic and that activity has returned to normal now that we’ve (incorrectly) decided the pandemic is over. Our own theory is that it’s a reaction to GPT models leaking proprietary code and abusing open source licenses; that could cause programmers to be wary of public code repositories. But those are only guesses. This change is apparently not an error in the data. It might be a one-time anomaly, but no one really knows the cause.&nbsp;<em>Something</em>&nbsp;drove down programmer activity on GitHub, and that’s inevitably a part of the background to this year’s data.</p>



<p>So, what does O’Reilly’s data say? As it has been for many years, Python is the most widely used programming language on our platform. This year, we didn’t see an increase; we saw a very small (0.14%) decline. That’s noise; we won’t insult your intelligence by claiming that “flat in a down market” is really a gain. It’s certainly fair to ask whether a language as popular as Python has gathered all the market share that it will get. When you’re at the top of the adoption curve, it’s difficult to go any higher and much easier to drop back. There are always new languages ready to take some of Python’s market share. The most significant change in the Python ecosystem is Microsoft’s integration of Python into Excel spreadsheets, but it’s too early to expect that to have had an effect.</p>



<p>Use of content about Java declined 14%, a significant drop but not out of line with the drop in GitHub activity. Like Python, Java is a mature language and may have nowhere to go but down. It has never been “well loved”; when Java was first announced, people walked out of the doors of the conference room claiming that Java was dead before you could even download the beta. (I was there.) Is it time to dance on Java’s grave? That dance has been going on since 1995, and it hasn’t been right yet.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig2-975x1048.png" alt="" class="wp-image-15365" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig2-975x1048.png 975w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig2-279x300.png 279w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig2-768x825.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig2.png 1209w" sizes="(max-width: 975px) 100vw, 975px" /><figcaption>Figure 2. Programming languages</figcaption></figure>



<p>JavaScript also declined by 3.9%. It’s a small decline and probably not meaningful. TypeScript, a version of JavaScript that adds static typing and type annotations, gained 5.6%. It’s tempting to say that these cancel each other out, but that’s not correct. Usage of TypeScript content is roughly one-tenth the usage of JavaScript content. But it is correct to say that interest in type systems is growing among web developers. It’s also true that an increasing number of junior developers use JavaScript only through a framework like React or Vue. Boot camps and other crash programs often train students in “React,” with little attention on the bigger picture. Developers trained in programs like these may be aware of JavaScript but may not think of themselves as JavaScript developers, and may not be looking to learn more about the language outside of a narrow, framework-defined context.</p>



<p>We see growth in C++ (10%), which is surprising for an old, well-established language. (C++ first appeared in 1985.) At this point in C++’s history, we’d expect it to be a headache for people maintaining legacy code, not a language for starting new projects. Why is it growing? While C++ has long been an important language for game development, there are signs that it’s breaking out into other areas. C++ is an ideal language for embedded systems, which often require software that runs directly on the processor (for example, the software that runs in a smart lightbulb or in the braking system of any modern car). You aren’t going to use Python, Java, or JavaScript for those applications. C++ is also an excellent language for number crunching (Python’s numeric libraries are written in C++), which is increasingly important as artificial intelligence goes mainstream. It has also become the new “must have” language on résumés: knowing C++ proves that you’re tough, that you’re a “serious” programmer. Job anxiety exists—whether or not it’s merited is a different question—and in an environment where programmers are nervous about keeping their current jobs or looking forward to finding a new one, knowing a difficult but widely used language can only be an asset.</p>



<p>Use of content about Rust also increased from 2022 to 2023 (7.8%). Rust is a relatively young language that stresses memory safety and performance. While Rust is considered difficult to learn, the idea that memory safety is baked in makes it an important alternative to languages like C++. Bugs in memory management are a significant source of vulnerabilities, as noted in NIST’s page on “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.nist.gov/itl/ssd/software-quality-group/safer-languages" target="_blank">Safer Languages</a>,” and Rust does a good job of enforcing safe memory usage. It’s now used in operating systems (Linux kernel components), tool development, and even enterprise software.</p>



<p>We also saw 9.8% growth in content about functional programming. We didn’t see gains for any of the historical functional programming languages (Haskell, Erlang, Lisp, and Elixir) though; most saw steep declines. In the past decade, most programming languages have added functional features. Newer languages like Rust and Go have had them from the start. And Java has gradually added features like closures in a series of updates. Now programmers can be as functional as they want to be without switching to a new language.</p>



<p>Finally, there are some programming languages that we don’t yet track but that we’re watching with interest. Zig is a simple imperative language that’s designed to be memory safe, like Rust, but relatively easy to learn. Mojo is a superset of Python that’s compiled, not interpreted. It’s designed for high performance, especially for numerical operations. Mojo’s goal is to facilitate AI programming in a single language rather than a combination of Python and some other language (typically C++) that’s used for performance-critical numerical code. Where are these languages going? It will be some years before they reach the level of Rust or Go, but they’re off to a good start.</p>



<p>So what does all this tell us about training and skill development? It’s easy to think that, with Copilot and other tools to answer all your questions, you don’t need to put as much effort into learning new technologies. We all ask questions on Google or Stack Overflow, and now we have other places to get answers. Necessary as that is, the idea that asking questions can replace training is naive. Unlike many who are observing the influence of generative AI on programming, we believe that it will increase the gap between entry-level skills and senior developer skills. Being a senior developer—being a senior anything—requires a kind of fluency that you can’t get just from asking questions. I may never be a fluent user of Python’s pandas library (which I used extensively to write this report); I asked lots of questions, and that has undoubtedly saved me time. But what happens when I need to solve the next problem? The kind of fluency that you need to look at a problem and understand how to solve it doesn’t come from asking simple “How do I do this?” questions. Nor does it preclude asking lots of “I forgot how this function works” questions. That’s why we’ve built&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://learning.oreilly.com/answers/search/" target="_blank">O’Reilly Answers</a>, an AI-driven service that finds solutions to questions using content from our platform. But expertise does require developing the intellectual muscle that comes from grappling with problems and solving them yourself rather than letting something else solve them for you. (And that includes forcing yourself to remember all the messy syntax details.) People who think generative AI is a shortcut to expertise (and the job title and salary that expertise merits) are shortchanging themselves.</p>



<h2>Artificial Intelligence</h2>



<p>In AI, there’s one story and only one story, and that’s the GPT family of models. Usage of content on these models exploded 3,600% in the past year. That explosion is tied to the appearance of ChatGPT in November 2022. But don’t make the mistake of thinking that ChatGPT came out of nowhere. GPT-3 created a big splash when it was released in 2020 (complete with a clumsy web-based interface). GPT-2 appeared in 2019, and the original unnumbered GPT was even earlier. The real innovation in ChatGPT wasn’t the technology itself (though the models behind it represent a significant breakthrough in AI performance); it was packaging the model as a chatbot. That doesn’t mean that the GPT explosion wasn’t real. While our analysis of search trends shows that interest in ChatGPT has&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.oreilly.com/radar/the-chatgpt-surge/" target="_blank">peaked</a>&nbsp;among our platform’s users, interest in natural language processing (NLP) showed a 195% increase—and from a much higher starting point.<sup>1</sup>&nbsp;That makes sense, given the more technical nature of our audience. Software developers will be building on top of the APIs for GPT and other language models and are likely less interested in ChatGPT, the web-based chat service. Related topics generative models (900%) and Transformers (325%) also showed huge gains. Prompt engineering, which didn’t exist in 2022, became a significant topic, with roughly the same usage as Transformers. As far as total use, NLP is almost twice GPT. However you want to read the data, this is AI’s big year, largely due to the GPT models and the idea of generative AI.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig3-680x1048.png" alt="" class="wp-image-15366" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig3-680x1048.png 680w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig3-195x300.png 195w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig3-768x1183.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig3-997x1536.png 997w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig3.png 1209w" sizes="(max-width: 680px) 100vw, 680px" /><figcaption>Figure 3. Artificial intelligence</figcaption></figure>



<p>But don’t assume that the explosion of interest in generative AI meant that other aspects of AI were standing still. Deep learning, the creation and application of neural networks with many layers, is fundamental to every aspect of modern AI. Usage in deep learning content grew 19% in the past year. Reinforcement learning, in which models are trained by giving “rewards” for solving problems, grew 15%. Those gains only look small in comparison to the triple- and quadruple-digit gains we’re seeing in natural language processing. PyTorch, the Python library that has come to dominate programming in machine learning and AI, grew 25%. In recent years, interest in PyTorch has been growing at the expense of TensorFlow, but TensorFlow showed a small gain (1.4%), reversing (or at least pausing) its decline. Interest in two older libraries, scikit-learn and Keras, declined: 25% for scikit-learn and 4.8% for Keras. Keras has largely been subsumed by TensorFlow, while scikit-learn hasn’t yet incorporated the capabilities that would make it a good platform for building generative AI. (An attempt to implement Transformers in scikit-learn appears to be underway at&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://huggingface.co/scikit-learn/sklearn-transformers" target="_blank">Hugging Face</a>.)</p>



<p>We’ve long said that operations is the elephant in the room for machine learning and artificial intelligence. Building models and developing applications is challenging and fun, but no technology can mature if IT teams can’t deploy, monitor, and manage it. Interest in operations for machine learning (MLOps) grew 14% over the past year. This is solid, substantial growth that only looks small in comparison with topics like generative AI. Again, we’re still in the early stages—generative AI and large language models are only starting to reach production. If anything, this increase probably reflects older applications of AI. There’s a growing ecosystem of startups building tools for deploying and monitoring language models, which are fundamentally different from traditional applications. As companies deploy the applications they’ve been building, MLOps will continue to see solid growth. (More on MLOps when we discuss operations below.)</p>



<p><a href="https://www.langchain.com/">LangChain</a>&nbsp;is a framework for building generative AI applications around groups of models and databases. It’s often used to implement the&nbsp;<a href="https://thenewstack.io/retrieval-augmented-generation-for-llms/">retrieval-augmented generation (RAG) pattern</a>, where a user’s prompt is used to look up relevant items in a vector database; those items are then combined with the prompt, generating a new prompt that is sent to the language model. There isn’t much content about LangChain available yet, and it didn’t exist in 2022, but it’s clearly going to become a foundational technology. Likewise, vector databases aren’t yet in our data. We expect that to change next year. They are rather specialized, so we expect usage to be relatively small, unlike products like MySQL—but they will be very important.</p>



<p>AI wasn’t dominated entirely by the work of OpenAI; Meta’s LLaMA and Llama 2 also attracted a lot of attention. The source code for LLaMA was open source, and its weights (parameters) were easily available to researchers. Those weights quickly leaked from “researchers” to the general public, where they jump-started the creation of smaller open source models. These models are much smaller than behemoths like GPT-4. Many of them can run on laptops, and they’re proving ideal for smaller companies that don’t want to rely on Microsoft, OpenAI, or Google to provide AI services. (If you want to run an open source language model on your laptop, try&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/Mozilla-Ocho/llamafile" target="_blank">llamafile</a>.) While huge “foundation models” like the GPT family won’t disappear, in the long run open source models like Alpaca and Mistral may prove to be more important to software developers.</p>



<p>It’s easy to think that generative AI is just about software development. It isn’t; its influence extends to just about every field. Our&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://learning.oreilly.com/videos/chatgpt-possibilities-and/0636920908753/" target="_blank">ChatGPT: Possibilities and Pitfalls</a>&nbsp;Superstream was the most widely attended event we’ve ever run. There were over 28,000 registrations, with attendees and sponsors from industries as diverse as pharmaceuticals, logistics, and manufacturing. Attendees included small business owners, sales and marketing personnel, and C-suite executives, along with many programmers and engineers from different disciplines. We’ve also been running courses focused on specific industries:&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://learning.oreilly.com/videos/generative-ai-for/0636920962335/" target="_blank">Generative AI for Finance</a>&nbsp;had over 2,000 registrations, and&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://learning.oreilly.com/videos/generative-ai-for/0636920964384/" target="_blank">Generative AI for Government</a>&nbsp;over 1,000. And more than 1,000 people signed up for our&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://learning.oreilly.com/live-events/generative-ai-for-healthcare/0636920098725/" target="_blank">Generative AI for Healthcare</a>&nbsp;event.</p>



<h2>Data</h2>



<p>In previous years, we would have told the story of AI as part of the story of data. That’s still correct; with its heavy emphasis on mathematics and statistics, AI is a natural outgrowth of data science. But this year, AI has become the superstar that gets top billing, while data is a supporting actor.</p>



<p>That doesn’t mean that data is unimportant. Far from it. Every company uses data: for planning, for making projections, for analyzing what’s happening within the business and the markets they serve. So it’s not surprising that the second biggest topic in data is Microsoft Power BI, with a 36% increase since 2022. SQL Server also showed a 5.3% increase, and statistics toolbox R increased by 4.8%.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig4-692x1048.png" alt="" class="wp-image-15367" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig4-692x1048.png 692w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig4-198x300.png 198w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig4-768x1162.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig4-1015x1536.png 1015w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig4.png 1209w" sizes="(max-width: 692px) 100vw, 692px" /><figcaption>Figure 4. Data analysis and databases</figcaption></figure>



<p>Data engineering was by far the most heavily used topic in this category; it showed a 3.6% decline, stabilizing after a huge gain from 2021 to 2022. Data engineering deals with the problem of storing data at scale and delivering that data to applications. It includes moving data to the cloud, building pipelines for acquiring data and getting data to application software (often in near real time), resolving the issues that are caused by data siloed in different organizations, and more. Two of the most important platforms for data engineering, Kafka and Spark, showed significant declines in 2023 (21% and 20%, respectively). Kafka and Spark have been workhorses for many years, but they are starting to show their age as they become “legacy technology.” (Hadoop, down 26%, is clearly legacy software in 2023.) Interest in Kafka is likely to rise as AI teams start implementing real-time models that have up-to-the-minute knowledge of external data. But we also have to point out that there are newer streaming platforms (like Pulsar) and newer data platforms (like Ray).</p>



<p>Designing enterprise-scale data storage systems is a core part of data engineering. Interest in data warehouses saw an 18% drop from 2022 to 2023. That’s not surprising; data warehouses also qualify as legacy technology. Two other patterns for enterprise-scale storage show significant increases: Usage of content about data lakes is up 37% and, in absolute terms, significantly higher than that of data warehouses. Usage for data mesh content is up 5.6%. Both lakes and meshes solve a basic problem: How do you store data so that it’s easy to access across an organization without building silos that are only relevant to specific groups?&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/Data_lake" target="_blank">Data lakes</a>&nbsp;can include data in many different formats, and it’s up to users to supply structure when data is utilized. A&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.montecarlodata.com/blog-what-is-a-data-mesh-and-how-not-to-mesh-it-up/" target="_blank">data mesh</a>&nbsp;is a truly distributed solution: each group is responsible for its own data but makes that data available throughout the enterprise through an interoperability layer. Those newer technologies are where we see growth.</p>



<p>The two open source data analysis platforms were virtually unchanged in 2023. Usage of content about R increased by 3.6%; we’ve already seen that Python was unchanged, and pandas grew by 1.4%. Neither of these is going anywhere, but alternatives, particularly to pandas, are appearing.</p>



<h2>Operations</h2>



<p>Whether you call it operations, DevOps, or something else, this field has seen some important changes in the past year. We’ve witnessed the rise of developer platforms, along with the related topic, platform engineering. Both of those are too new to be reflected in our data: you can’t report content use before content exists. But they are influencing other topics.</p>



<p>We’ve said in the past that Linux is table stakes for a job in IT. That’s still true. But the more the deployment process is automated—and platform engineering is just the next step in “Automate All the Things”—the less developers and IT staff need to know about Linux. Software is packaged in containers, and the containers themselves run as virtual Linux instances, but developers don’t need to know how to find and kill out-of-control processes, do a backup, install device drivers, or perform any of the other tasks that are the core of system administration. Usage of content about Linux is down 6.9%: not a major change but possibly a reflection of the fact that the latest steps forward in deploying and managing software shield people from direct contact with the operating system.</p>



<p>Similar trends reduce what developers and IT staff need to know about Kubernetes, the near-ubiquitous container orchestrator (down 6.9%). Anyone who uses Kubernetes knows that it’s complex. We’ve long expected “something simpler” to come along and replace it. It hasn’t—but again, developer platforms put users a step further away from engaging with Kubernetes itself. Knowledge of the details is encapsulated either in a developer platform or, perhaps more often, in a Kubernetes service administered by a cloud provider. Kubernetes can’t be ignored, but it’s more important to understand high-level principles than low-level commands.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig5-702x1048.png" alt="" class="wp-image-15368" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig5-702x1048.png 702w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig5-201x300.png 201w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig5-768x1147.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig5-1028x1536.png 1028w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig5.png 1209w" sizes="(max-width: 702px) 100vw, 702px" /><figcaption>Figure 5. Infrastructure and operations</figcaption></figure>



<p>DevOps (9.0%) and SRE (13%) are also down, though we don’t think that’s significant. Terms come and go, and these are going. While operations is constantly evolving, we don’t believe we’ll ever get to the mythical state of “NoOps,” nor should we. Instead, we’ll see constant evolution as the ratio of systems managed to operations staff grows ever higher. But we&nbsp;<em>do</em>&nbsp;believe that sooner rather than later, someone will put a new name on the disciplines of DevOps and its close relative, SRE. That new name might be “platform engineering,” though that term says more about designing deployment pipelines than about carrying the pager and keeping the systems running; platform engineering is about treating developers as customers and designing internal developer platforms that make it easy to test and deploy software systems with minimal ceremony. We don’t believe that platform engineering subsumes or replaces DevOps. Both are partners in improving experience for developers and operations staff (and ratcheting up the ratio of systems managed to staff even higher).</p>



<p>That’s a lot of red ink. What’s in the black? Supply chain management is up 5.9%. That’s not a huge increase, but in the past few years we’ve been forced to think about how we manage the software supply chain. Any significant application easily has dozens of dependencies, and each of those dependencies has its own dependencies. The total number of dependencies, including both direct and inherited dependencies, can easily be hundreds or even thousands. Malicious operators have discovered that they can corrupt software archives, getting programmers to inadvertently incorporate malware into their software. Unfortunately, security problems never really go away; we expect software supply chain security to remain an important issue for the foreseeable (and unforeseeable) future.</p>



<p>We’ve already mentioned that MLOps, the discipline of deploying and managing models for machine learning and artificial intelligence, is up 14%. Machine learning and AI represent a new kind of software that doesn’t follow traditional rules, so traditional approaches to operations don’t work. The list of differences is long:</p>



<ul><li>While most approaches to deployment are based on the idea that an application can be reproduced from a source archive, that isn’t true for AI. An AI system depends as much on the training data as it does on the source code, and we don’t yet have good tools for archiving training data.</li><li>While we’ve said that open source models such as Alpaca are much smaller than models like GPT-4 or Google’s Gemini, even the smallest of those models is very large by any reasonable standard.</li><li>While we’ve gotten used to automated testing as part of a deployment pipeline, AI models aren’t deterministic. A test doesn’t necessarily give the same result every time it runs. Testing is no less important for AI than it is for traditional software (arguably it’s more important), and we’re starting to see startups built around AI testing, but we’re still at the beginning.</li></ul>



<p>That’s just a start. MLOps is a badly needed specialty. It’s good to see growing interest.</p>



<h2>Security</h2>



<p>Almost all branches of security showed growth from 2022 to 2023. That’s a welcome change: in the recent past, many companies talked about security but never made the investment needed to secure their systems. That’s changing, for reasons that are obvious to anyone who reads the news. Nobody wants to be a victim of data theft or ransomware, particularly now that ransomware has evolved into blackmail.</p>



<p>The challenges are really very simple. Network security, keeping intruders off of your network, was the most widely used topic and grew 5%. Firewalls, which are an important component of network security, grew 16%. Hardening, a much smaller topic that addresses making systems less vulnerable to attack, grew 110%. Penetration testing remained one of the most widely used topics. Usage dropped 5%, although a 10% increase for Kali Linux (an important tool for penetration testers) largely offsets that decline.</p>



<p>The 22% growth in security governance is another indicator of changed attitudes: security is no longer an ad hoc exercise that waits for something to happen and then fights fires. Security requires planning, training, testing, and auditing to ensure that policies are effective.</p>



<p>One key to security is knowing who your users are and which parts of the system each user can access. Identity and access management (IAM) has often been identified as a weakness, particularly for cloud security. As systems grow more complex, and as our concept of “identity” evolves from individuals to roles assigned to software services, IAM becomes much more than usernames and passwords. It requires a thorough understanding of who the actors are on your systems and what they’re allowed to do. This extends the old idea of “least privilege”: each actor needs the ability to do exactly what they need, no more and no less. The use of content about IAM grew 8.0% in the past year. It’s a smaller gain than we would have liked to see but not insignificant.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig6-710x1048.png" alt="" class="wp-image-15369" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig6-710x1048.png 710w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig6-203x300.png 203w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig6-768x1134.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig6-1040x1536.png 1040w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig6.png 1220w" sizes="(max-width: 710px) 100vw, 710px" /><figcaption>Figure 6. Security</figcaption></figure>



<p>Application security grew 42%, showing that software developers and operations staff are getting the message. The DevSecOps “shift left” movement, which focuses on software security early in the development process, appears to be winning; use of content about DevSecOps was up 30%. Similarly, those who deploy and maintain applications have become even more aware of their responsibilities. Developers may design identity and access management into the code, but operations is responsible for configuring these correctly and ensuring that access to applications is only granted appropriately. Security can’t be added after the fact; it has to be part of the software process from beginning to the end.</p>



<p>Advanced persistent threats (APTs) were all over the news a few years ago. We don’t see the term APT anywhere near as much as we used to, so we’re not surprised that usage has dropped by 35%. Nevertheless, nation-states with sophisticated offensive capabilities are very real, and cyber warfare is an important component of several international conflicts, including the war in Ukraine.</p>



<p>It’s disappointing to see that usage of content about zero trust has declined by 20%. That decrease is more than offset by the increase in IAM, which is an essential tool for zero trust. But don’t forget that IAM is just a tool and that the goal is to build systems that don’t rely on trust, that always verify that every actor is appropriately identified and authorized. How can you defend your IT infrastructure if you assume that attackers already have access? That’s the question zero trust answers. Trust nothing; verify everything.</p>



<p>Finally, compliance is down 27%. That’s more than offset by the substantial increase of interest in governance. Auditing for compliance is certainly a part of governance. Focusing on compliance itself, without taking into account the larger picture, is a problem rather than a solution. We’ve seen many companies that focus on compliance with existing standards and regulations while avoiding the hard work of analyzing risk and developing effective policies for security. “It isn’t our fault that something bad happened; we followed all the rules” is, at best, a poor way to explain systemic failure. If that compliance-oriented mindset is fading, good riddance. Compliance, understood properly, is an important component of IT governance. Understood badly, compliance is an unacceptable excuse.</p>



<p>Finally, a word about a topic that doesn’t yet appear in our data. There has, of course, been a lot of chatter about the use of AI in security applications. AI will be a great asset for log file analysis, intrusion detection, incident response, digital forensics, and other aspects of cybersecurity. But, as we’ve already said, there are always two sides to AI. How does AI change security itself? Any organization with AI applications will have to protect them from exploitation. What vulnerabilities does AI introduce that didn’t exist a few years ago? There are many articles about prompt injection, sneaky prompts designed to “jailbreak” AI systems, data leakage, and other vulnerabilities—and we believe that’s only the beginning. Securing AI systems will be a critical topic in the coming years.</p>



<h2>Cloud Computing</h2>



<p>Looking at platform usage for cloud-related topics, one thing stands out: cloud native. Not only is it the most widely used topic in 2023, but it grew 175% from 2022 to 2023. This marks a real transition. In the past, companies built software to run on-premises and then moved it to the cloud as necessary. Despite reports (<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://learning.oreilly.com/library/view/the-cloud-in/9781492096733/ch01.html#executive_summary" target="_blank">including ours</a>) that showed 90% or more “cloud adoption,” we always felt that was very optimistic. Sure, 90% of all companies may have one or two experiments&nbsp;<em>in</em>&nbsp;the cloud—but are they really building&nbsp;<em>for</em>&nbsp;the cloud? This huge surge in cloud native development shows that we’ve now crossed that chasm and that companies have stopped kicking the tires. They’re building for the cloud as their primary deployment platform.</p>



<p>You could, of course, draw the opposite conclusion by looking at cloud deployment, which is down 27%. If companies are developing for the cloud, how are those applications being deployed? That’s a fair question. However, as cloud usage grows, so does organizational knowledge of cloud-related topics, particularly deployment. Once an IT group has deployed its first application, the second isn’t necessarily “easy” or “the same,” but it is familiar. At this point in the history of cloud computing, we’re seeing few complete newcomers. Instead we’re seeing existing cloud users deploying more and more applications. We’re also seeing a rise in tools that streamline cloud deployment. Indeed, any provider worth thinking about has a tremendous interest in making deployment as simple as possible.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig7-1048x996.png" alt="" class="wp-image-15370" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig7-1048x996.png 1048w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig7-300x285.png 300w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig7-768x730.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig7.png 1209w" sizes="(max-width: 1048px) 100vw, 1048px" /><figcaption>Figure 7. Cloud architecture</figcaption></figure>



<p>Use of content about cloud security grew 25%, and identity and access management (IAM) grew 8%. An epidemic of data theft and ransomware that continues to this day put security on the corporate map as a priority, not just an expense with annual budget requests that sounded like an extortion scam: “Nothing bad happened this year; give us more money and maybe nothing bad will happen next year.” And while the foundation of any security policy is good local security hygiene, it’s also true that the cloud presents its own issues. Identity and access management: locally, that means passwords, key cards, and (probably) two-factor authentication. In the cloud, that means IAM, along with zero trust. Same idea, but it would be irresponsible to think that these aren’t more difficult in the cloud.</p>



<p>Hybrid cloud is a smaller topic area that has grown significantly in the past year (145%). This growth points partly to the cloud becoming the de facto deployment platform for enterprise applications. It also acknowledges the reality of how cloud computing is adopted. Years ago, when “the cloud” was getting started, it was easy for a few developers in R&amp;D to expense a few hours of time on AWS rather than requisitioning new hardware. The same was true for data-aware marketers who wanted to analyze what was happening with their potential customers—and they might choose Azure. When senior management finally awoke to the need for a “cloud strategy,” they were already in a hybrid situation, with multiple wildcat projects in multiple clouds. Mergers and buyouts complicated the situation more. If company A is primarily using AWS and company B has invested heavily in Google Cloud, what happens when they merge? Unifying behind a single cloud provider isn’t going to be worth it, even though cloud providers are providing tools to simplify migration (at the same time as they make their own clouds difficult to leave). The cloud is naturally hybrid. “Private cloud” and “public cloud,” when positioned as alternatives to each other and to a hybrid cloud, smell like “last year’s news.” It’s not surprising that usage has dropped 46% and 10%, respectively.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig8-1048x759.png" alt="" class="wp-image-15371" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig8-1048x759.png 1048w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig8-300x217.png 300w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig8-768x556.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig8.png 1209w" sizes="(max-width: 1048px) 100vw, 1048px" /><figcaption>Figure 8. Cloud providers</figcaption></figure>



<p>What about the perennial horse race between Amazon Web Services, Microsoft Azure, and Google Cloud? Is anyone still interested, except perhaps investors and analysts? AWS showed a very, very small gain (0.65%), but Azure and Google Cloud showed significant losses (16% and 22%, respectively). We expected to see Azure catch up to AWS because of its lead in AI as a service, but it didn’t. As far as our platform is concerned, that’s still in the future.</p>



<h2>Web Development</h2>



<p>React and Angular continue to dominate web development. JavaScript is still the lingua franca of web development, and that isn’t likely to change any time soon.</p>



<p>But the usage pattern has changed slightly. Last year, React was up, and Angular was sharply down. This year, usage of React content hasn’t changed substantially (down 0.33%). Angular is down 12%, a smaller decline than last year but still significant. When a platform is as dominant as React, it may have nowhere to go but down. Is momentum shifting?</p>



<p>We see some interesting changes among the less popular frameworks, both old and new. First, Vue isn’t a large part of the overall picture, and it isn’t new—it’s been around since 2014—but if its 28% annual growth continues, it will soon become a dominant framework. That increase represents a solid turnaround after losing 17% from 2021 to 2022. Django is even older (created in 2005), but it’s still widely used—and with an 8% increase this year, it’s not going away. FastAPI is the newest of this group (2018). Even though it accounts for a very small percentage of platform use, it’s easy for a small change in usage to have a big effect. An 80% increase is hard to ignore.</p>



<p>It’s worth looking at these frameworks in a little more detail. Django and FastAPI are both Python-based, and FastAPI takes full advantage of Python’s type hinting feature. Python has long been an also-ran in web development, which has been dominated by JavaScript, React, and Angular. Could that be changing? It’s hard to say, and it’s worth noting that Flask, another Python framework, showed a 12% decrease. As a whole, Python frameworks probably declined from 2022 to 2023, but that may not be the end of the story. Given the number of boot camps training new web programmers in React, the JavaScript hegemony will be hard to overcome.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig9-702x1048.png" alt="" class="wp-image-15372" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig9-702x1048.png 702w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig9-201x300.png 201w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig9-768x1147.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig9-1028x1536.png 1028w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig9.png 1209w" sizes="(max-width: 702px) 100vw, 702px" /><figcaption>Figure 9. Web development</figcaption></figure>



<p>What about PHP, another long-standing framework that dates back to 1995, when the web was indeed young? PHP grew 5.9% in the past year. The use of content about PHP is small compared to frameworks like React and Angular or even Django. PHP certainly doesn’t inspire the excitement that it did in the 1990s. But remember that over 80% of the web is built on PHP. It’s certainly not trendy, it’s not capable of building the feature-rich sites that many users expect—but it’s everywhere. WordPress (down 4.8%), a content management system used for millions of websites, is based on PHP. But regardless of the number of sites that are built on PHP or WordPress, Indeed shows roughly&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="http://indeed.com/" target="_blank">three times as many job openings for React developers</a>&nbsp;as for PHP and WordPress combined. PHP certainly isn’t going away, and it may even be growing slightly. But we suspect that PHP programmers spend most of their time maintaining older sites. They already know what they need to do that, and neither of those factors drives content usage.</p>



<p>What about some other highly buzzworthy technologies? After showing 74% growth from 2021 to 2022, WebAssembly (Wasm) declined by 41% in 2023. Blazor, a web framework for C# that generates code for Wasm, declined by 11%. Does that mean that Wasm is dying? We still believe Wasm is a very important technology, and we frequently read about amazing projects that are built with it. It isn’t yet a mature technology—and there are plenty of developers willing to argue that there’s no need for it. We may disagree, but that misses the point. Usage of Wasm content will probably decline gradually&#8230;until someone creates a killer application with it. Will that happen? Probably, but we can’t guess when.</p>



<p>What does this mean for someone who’s trying to develop their skills as a web developer? First, you still can’t go wrong with React, or even with Angular. The other JavaScript frameworks, such as Next.js, are also good options. Many of these are metaframeworks built on React, so knowing them makes you more versatile while leveraging knowledge you already have. If you’re looking to broaden your skills, Django would be a worthwhile addition. It’s a very capable framework, and knowing Python will open up other possibilities in software development that may be helpful in the future, even if not now.</p>



<h2>Certification</h2>



<p>This year, we took a different approach to certification. Rather than discussing certification for different subject areas separately (that is, cloud certification, security certification, etc.), we used data from the platform to build a list of the top 20 certifications and grouped them together. That process gives a slightly different picture of which certifications are important and why. We also took a brief look at O’Reilly’s new badges program, which gives another perspective on what our customers want to learn.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig10-658x1048.png" alt="" class="wp-image-15373" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig10-658x1048.png 658w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig10-188x300.png 188w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig10-768x1223.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig10-965x1536.png 965w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig10.png 1213w" sizes="(max-width: 658px) 100vw, 658px" /><figcaption>Figure 10. Certification</figcaption></figure>



<p>Based on the usage of content in our platform (including practice tests), the most popular certifications are security certifications: CISSP (which declined 4.8%) and CompTIA Security+ (which grew 6.0%). CISSP is an in-depth exam for security professionals, requiring at least five years’ experience before taking the exam. Security+ is more of an entry-level exam, and its growth shows that security staff are still in demand. ISACA’s Certified Information Security Manager (CISM) exam, which focuses on risk assessment, governance, and incident response, isn’t as popular but showed a 54% increase. CompTIA’s Certified Advanced Security Practitioner (CASP+) showed a 10% increase—not as large but part of the same trend. The Certified Ethical Hacker (CEH) exam, which focuses on techniques useful for penetration testing or red-teaming, is up 4.1%, after a decline last year. Those increases reflect where management is investing. Hoping that there won’t be an incident has been replaced by understanding exposure, putting in place governance mechanisms to minimize risk, and being able to respond to incidents when they occur.</p>



<p>What really stands out, however, isn’t security: it’s the increased use of content about CompTIA A+, which is up 58%. A+ isn’t a security exam; it’s advertised as an entry-level exam for IT support, stressing topics like operating systems, managing SaaS for remote work, troubleshooting software, hardware, and networking problems, and the like. It’s testimony to the large number of people who want to get into IT. Usage of content about the CompTIA Linux+ exam was much lower but also grew sharply (23%)—and, as we’ve said in the past, Linux is “table stakes” for almost any job in computing. It’s more likely that you’ll encounter Linux indirectly via containers or cloud providers rather than managing racks of computers running Linux; but you will be expected to know it. The Certified Kubernetes Administrator (CKAD) exam also showed significant growth (32%). Since it was first released in 2014, Kubernetes has become an inescapable part of IT operations. The biggest trend in IT, going back 70 years or so, has been the increase in the ratio of machines to operators: from multiple operators per machine in the ’60s to one operator per machine in the era of minicomputers to dozens and now, in the cloud, to hundreds and thousands. Complex as Kubernetes is—and we admit, we keep looking for a simpler alternative—it’s what lets IT groups manage large applications that are implemented as dozens of microservices and that run in thousands of containers on an uncountable number of virtual machines. Kubernetes has become an essential skill for IT. And certification is becoming increasingly attractive to people working in the field; there’s no other area in which we see so much growth.</p>



<p>Cloud certifications also show prominently. Although “the cloud” has been around for almost 20 years, and almost every company will say that they are “in the cloud,” in reality many companies are still making that transition. Furthermore, cloud providers are constantly adding new services; it’s a field where keeping up with change is difficult. Content about Amazon Web Services was most widely used. AWS Cloud Practitioner increased by 35%, followed by AWS Solutions Architect (Associate), which increased 15%. Microsoft Azure certification content followed, though the two most prominent exams showed a decline: Azure Fundamentals (AZ-900) was down 37%, and Azure Administration (AZ-104) was down 28%. Google Cloud certifications trailed the rest: Google’s Cloud Engineer showed solid growth (14%), while its Data Engineer showed a significant decline (40%).</p>



<p>Content about Microsoft’s AI-900 exam (Azure AI Fundamentals) was the least-used among the certifications that we tracked. However, it gained 121%—it more than doubled—from 2022 to 2023. While we can’t predict next year, this is the sort of change that trends are made of. Why did this exam suddenly get so hot? It’s easy, really: Microsoft’s investment in OpenAI, its integration of the GPT models into Bing and other products, and its AI-as-a-service offerings through Azure have suddenly made the company a leader in cloud-based AI. While we normally hedge our bets on smaller topics with big annual growth—it’s easy for a single new course or book to cause a large swing—AI isn’t going away, nor is Microsoft’s leadership in cloud services for AI developers.</p>



<p>Late in 2023, O’Reilly began to offer&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.oreilly.com/online-learning/badges.html" target="_blank">badges tied to course completion</a>&nbsp;on the O’Reilly learning platform. Badges aren’t certifications, but looking at the top badges gives another take on what our customers are interested in learning. The results aren’t surprising: Python, GPT (not just ChatGPT), Kubernetes, software architecture, and Java are the most popular badges.</p>



<p>However, it’s interesting to look at the difference between our B2C customers (customers who have bought platform subscriptions as individuals) and B2B customers (who use the platform via a corporate subscription). For most topics, including those listed above, the ratio of B2B to B2C customers is in the range of 2:1 or 3:1 (two or three times as many corporate customers as individuals). The outliers are for topics like communications skills, Agile, Scrum, personal productivity, Excel, and presentation skills: users from B2B accounts obtained these badges four (or more) times as often as users with personal accounts. This makes sense: these topics are about teamwork and other skills that are valuable in a corporate environment.</p>



<p>There are few (if any) badge topics for which individual (B2C) users outnumbered corporate customers; that’s just a reflection of our customer base. However, there were some topics where the ratio of B2B to B2C customers was closer to one. The most interesting of these concerned artificial intelligence: large language models (LLMs), TensorFlow, natural language processing, LangChain, and MLOps. Why is there more interest among individuals than among corporate customers? Perhaps by next year we’ll know.</p>



<h2>Design</h2>



<p>The important story in design is about tools. Topics like user experience and web design are stable or slightly down (down 0.62% and 3.5%, respectively). But usage about design tools is up 105%, and the VC unicorn Figma is up 145%. Triple-digit growth probably won’t continue, but it’s certainly worth noticing. It highlights two important trends that go beyond typical design topics, like UX.</p>



<p>First, low-code and no-code tools aren’t new, but many new ones have appeared in the past year. Their success has been aided by artificial intelligence. We already have AI tools that can generate text, whether for a production site or for a mockup. Soon we’ll have no-code tools that don’t just spit out a wireframe but will be able to implement the design itself. They will be smart about what the user wants them to do. But to understand the importance of low-code to design, you have to look beyond the use designers will make of these tools. Designers will also be designing these tools, along with other AI-powered applications. Tools for designers have to be well-designed, of course: that’s trivial. But what many discussions about AI ignore is that designing applications that use AI well is far from trivial. We’ve all been blindsided by the success of ChatGPT, which made the GPT models instantly accessible to everyone. But once you start thinking about the possibilities, you realize that a chat is hardly an ideal interface for an AI system.<sup>2</sup> What will the users of these systems really need? We’ve only just started down that path. It will be an exciting journey—particularly for designers.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig11-838x1048.png" alt="" class="wp-image-15374" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig11-838x1048.png 838w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig11-240x300.png 240w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig11-768x960.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig11.png 1209w" sizes="(max-width: 838px) 100vw, 838px" /><figcaption>Figure 11. Design</figcaption></figure>



<p>Second, Figma is important because it’s a breakthrough in tools for collaboration. Tools that allow remote employees to collaborate productively are crucial when coworkers can be anywhere: in an office, at home, or on another continent. The last year and a half has been full of talk about virtual reality, metaverses, and the like. But what few have realized is that the metaverse isn’t about wearing goggles—it’s about seamless collaboration with friends and coworkers. Use of content about AR and VR dropped 25% because people have missed the real story: we don’t need 3D goggles; we need tools for collaboration. And, as with low-code, collaboration tools are both something to design with and something that needs to be designed. We’re on the edge of a new way to look at the world.</p>



<p>Use of content about information architecture was up 16%, recovering from its decline from 2021 to 2022. The need to present information well, to design the environments in which we consume information online, has never been more important. Every day, there’s more information to absorb and to navigate—and while artificial intelligence will no doubt help with that navigation, AI is as much a design problem as a design solution. (Though it’s a “good problem” to have.) Designing and building for accessibility is clearly related to information architecture, and it’s good to see more engagement with that content (up 47%). It’s been a long time coming, and while there’s still a long way to go, accessibility is being taken more seriously now than in the past. Websites that are designed to be usable by people with impairments aren’t yet the rule, but they’re no longer exceptions.</p>



<h2>Professional Development</h2>



<p>Almost everyone involved with software starts as a programmer. But that’s rarely where they end. At some point in their career, they are asked to write a specification, lead a team, manage a group, or maybe even found a company or serve as an executive in an existing company.</p>



<p>O’Reilly is the last company to believe that software developers are neck-bearded geeks who want nothing more than to live in a cave and type on their terminals. We’ve spent most of our history fighting against that stereotype. Nevertheless, going beyond software development is a frequent source of anxiety. That’s no doubt true for anyone stepping outside their comfort zone in just about any field, whether it’s accounting, law, medicine, or something else. But at some point in your career, you have to do something that you aren’t prepared to do. And, honestly, the best leaders are usually the ones who have some anxiety, not the ones whose reaction is “I was born to be a leader.”</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig12-1048x762.png" alt="" class="wp-image-15375" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig12-1048x762.png 1048w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig12-300x218.png 300w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig12-768x558.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/01/Fig12.png 1209w" sizes="(max-width: 1048px) 100vw, 1048px" /><figcaption>Figure 12. Professional development</figcaption></figure>



<p>For the past few years, our audience has been interested in professional growth that goes beyond just writing software or building models for AI and ML. Project management is up 13%; the ability to manage large projects is clearly seen as an asset for employees who are looking for their next promotion (or, in some cases, their next job). Whatever their goals might be, anyone looking for a promotion or a new job—or even just solidifying their hold on their current job—would be well served by improving their communications skills (up 23%). Professional development (up 22%) is a catch-all topic that appears to be responding to the same needs. What’s driving this? 2023 began and ended with a lot of news about layoffs. But despite well-publicized layoffs from huge companies that overhired during the pandemic, there’s little evidence that the industry as a whole has suffered. People who are laid off seem to be snapped up quickly by new employers. Nevertheless, anxiety is real, and the emphasis we’re seeing on professional development (and specifically, communications and project management skills) is partially a result of that anxiety. Another part of the story is no doubt&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.oreilly.com/radar/fearing-the-wrong-thing/" target="_blank">the way AI is changing the workplace</a>. If generative AI makes people more efficient, it frees up time for them to do other things, including strategic thinking about product development and leadership. It may finally be time to value “individuals and interactions over processes and tools,” and “customer collaboration over contract negotiation,” as the&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://agilemanifesto.org/" target="_blank"><em>Agile Manifesto</em></a>&nbsp;claims. Doing so will require a certain amount of reeducation, focusing on areas like communications, interpersonal skills, and strategic thinking.</p>



<p>Product management, the discipline of managing a product’s lifecycle from the initial idea through development and release to the market, is also a desirable skill. So why is it only up 2.8% and not 20% like project management? Product management is a newer position in most companies; it has strong ties to marketing and sales, and as far as fear of layoffs is concerned (whether real or media driven), product management positions may be perceived as more vulnerable.</p>



<p>A look at the bottom of the chart shows that usage of content that teaches critical thinking grew 39%. That could be in part a consequence of ChatGPT and the explosion in artificial intelligence. Everyone knows that AI systems make mistakes, and almost every article that discusses these mistakes talks about the need for critical thinking to analyze AI’s output and find errors. Is that the cause? Or is the desire for better critical thinking skills just another aspect of professional growth?</p>



<h2>A Strange Year?</h2>



<p>Back at the start, I said this was a strange year. As much as we like to talk about the speed at which technology moves, reality usually doesn’t move that fast. When did we first start talking about data? Tim O’Reilly said “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.oreilly.com/pub/a/web2/archive/what-is-web-20.html?page=3" target="_blank">Data is the next Intel Inside</a>” in 2005, almost 20 years ago. Kubernetes has been around for a decade, and that’s not counting its prehistory as Google’s Borg. Java was introduced in 1995, almost 30 years ago, and that’s not counting its set-top box prehistory as Oak and Green. C++ first appeared in 1985. Artificial intelligence has a prehistory as long as computing itself. When did AI emerge from its wintry cave to dominate the data science landscape? 2016 or 2017, when we were amazed by programs that could sort images into dogs and cats? Sure, Java has changed a lot; so has what we do with data. Still, there’s more continuity than disruption.</p>



<p>This year was one of the few years that could genuinely be called disruptive. Generative AI will change this industry in important ways. Programmers won’t become obsolete, but programming as we know it might. Programming will have more to do with understanding problems and designing good solutions than specifying, step-by-step, what a computer needs to do. We’re not there yet, but we can certainly imagine a day when a human language description leads reliably to working code, when “Do what I meant, not what I said” ceases to be the programmer’s curse. That change has already begun, with tools like GitHub Copilot. But to thrive in that new industry, programmers will need to know more about architecture, more about design, more about human relations—and we’re only starting to see that in our data, primarily for topics like product management and communications skills. And perhaps that’s the definition of “disruptive”: when our systems and our expectations change faster than our ability to keep up. I’m not worried about programmers “losing their jobs to an AI,” and I really don’t see that concern among the many programmers I talk to. But whatever profession you’re in, you will lose out if you don’t keep up. That isn’t kind or humane; that’s capitalism. And perhaps I should have used ChatGPT to write this report.<sup>3</sup></p>



<p>Jerry Lee Lewis might have said “There’s a whole lotta disruption goin’ on.” But despite all this disruption, much of the industry remains unchanged. People seem to have tired of the terms DevOps and SRE, but so it goes: the half-life of a buzzword is inevitably short, and these have been extraordinarily long-lived. The problems these buzzwords represent haven’t gone away. Although we aren’t yet collecting the data (and don’t yet have enough content for which to collect data), developer platforms, self-service deployment, and platform engineering look like the next step in the evolution of IT operations. Will AI play a role in platform engineering? We’d be surprised if it didn’t.</p>



<p>Movement to the cloud continues. While we’ve heard talk of cloud “repatriation,” we see no evidence that it’s happening. We do see evidence that organizations realize that the cloud is naturally hybrid and that focusing on a single cloud provider is short-sighted. There’s also evidence that organizations are now paying more than lip service to security, particularly cloud security. That’s a very good sign, especially after many years in which companies approached security by hoping nothing bad would happen. As many chess grandmasters have said, “Hope is never a good strategy.”</p>



<p>In the coming year, AI’s disruption will continue to play out. What consequences will it have for programming? How will jobs (and job prospects) change? How will IT adapt to the challenge of managing AI applications? Will they rely on AI-as-a-service providers like OpenAI, Azure, and Google, or will they build on open source models, which will probably run in the cloud? What new vulnerabilities will AI applications introduce into the security landscape? Will we see new architectural patterns and styles? Will AI tools for software architecture and design help developers grapple with the difficulties of microservices, or will it just create confusion?</p>



<p>In 2024, we’ll face all of these questions. Perhaps we’ll start to see answers. One thing is clear: it’s going to be an exciting year.</p>



<hr class="wp-block-separator" />



<h3>Footnotes</h3>



<ol><li>Google Trends&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://trends.google.com/trends/explore?geo=US&amp;q=chatgpt&amp;hl=en" target="_blank">suggests</a>&nbsp;that we may be seeing a resurgence in ChatGPT searches. Meanwhile, searches for ChatGPT on our platform appear to have bottomed out in October, with a very slight increase in November. This discrepancy aligns well with the difference between our platform and Google’s. If you want to use ChatGPT to write a term paper, are you going to search Google or O’Reilly?</li><li>Phillip Carter’s article, “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.honeycomb.io/blog/hard-stuff-nobody-talks-about-llm" target="_blank">All the Hard Stuff Nobody Talks About when Building Products with LLMs</a>,” is worth reading. While it isn’t specifically about design, almost everything he discusses is something designers should think about.</li><li>I didn’t. Not even for data analysis. </li></ol>
]]></content:encoded>
							<wfw:commentRss>https://www.oreilly.com/radar/technology-trends-for-2024/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
							</item>
		<item>
		<title>I Actually Chatted with ChatGPT</title>
		<link>https://www.oreilly.com/radar/i-actually-chatted-with-chatgpt/</link>
				<comments>https://www.oreilly.com/radar/i-actually-chatted-with-chatgpt/#respond</comments>
				<pubDate>Tue, 16 Jan 2024 10:52:10 +0000</pubDate>
		<dc:creator><![CDATA[Philip Guo]]></dc:creator>
				<category><![CDATA[AI & ML]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Deep Dive]]></category>

		<guid isPermaLink="false">https://www.oreilly.com/radar/?p=15329</guid>
				<description><![CDATA[ChatGPT was released just over a year ago (at the end of November 2022), and countless people have already written about their experiences using it in all sorts of settings. (I even contributed my own hot take last year with my O’Reilly Radar article Real-Real-World Programming with ChatGPT.) What more is left to say by [&#8230;]]]></description>
								<content:encoded><![CDATA[
<p>ChatGPT was released just over a year ago (at the end of November 2022), and countless people have already written about their experiences using it in all sorts of settings. (I even contributed my own hot take last year with my O’Reilly Radar article <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.oreilly.com/radar/real-real-world-programming-with-chatgpt/" target="_blank"><em>Real-Real-World Programming with ChatGPT</em></a>.) What more is left to say by now? Well, I bet very few of those people have actually <em>chatted</em> with ChatGPT. And by “chat” I mean the original sense of the word—to hold a back-and-forth verbal conversation with it just like how you would chat with a fellow human being. I recently chatted with ChatGPT, and I want to use that experience to reflect on the usability of voice interfaces for AI tools based on Large Language Models. I’m personally interested in this topic since I am a professor who <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://pg.ucsd.edu/" target="_blank">researches human-computer interaction, user experience design, and cognitive science</a>, so AI voice interfaces are fascinating to me.</p>



<p>Here’s what I did: In December 2023 I installed the <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://openai.com/blog/introducing-the-chatgpt-app-for-ios" target="_blank">official ChatGPT iOS app from OpenAI</a> on my iPhone and used its voice input mode to hold several hour-long conversations with it while driving long-distance on California highways. I wore standard Apple earbuds with a built-in mic and talked with ChatGPT just like how I would be talking to someone on the phone while driving. These long solo drives were the perfect opportunity to test out ChatGPT’s voice feature because I couldn’t interact with the app using my hands for safety reasons.</p>



<p>I had a very clear use case in mind: <strong>I wanted a conversation partner to keep me awake and alert while driving long-distance by myself.</strong> I’ve found that listening to music or podcasts doesn’t keep me alert when I’m tired because it’s such a passive experience—but what does keep me awake is having someone to talk to, either in the car or remotely on the phone. Could ChatGPT replace a human conversation partner in this role?</p>



<h3><strong>The Good: ChatGPT Made Personalized Podcasts to Keep Me Engaged While Driving</strong></h3>



<p>To not bury the lede, it turns out that it did a remarkable job! As I was driving I was able to engage in several hour-long conversations with ChatGPT that ended only because I had to take a rest stop or hit the usage limit for GPT-4. (I pay for a ChatGPT Plus subscription so I can use the most advanced GPT-4 model, but that comes with a usage limit that I usually hit after about an hour.)</p>



<p>The best way to describe my experience is (borrowing a wonderful term my friend coined) that it felt like listening to a <em>personalized podcast</em>. Since ChatGPT did most of the talking, it was a mostly passive listening experience on my part except for times when I wanted to ask follow-up questions or direct it to change topics. Critically, this meant I could still focus most of my attention on driving safely with a level of distraction on par with listening to a podcast. But it kept me more alert than a regular podcast since I could actively direct the flow of the conversation.</p>



<p>For a concrete example of what such a personalized podcast felt like, I started one conversation by straight-up asking ChatGPT to keep me awake while I was driving in Southern California from Los Angeles to San Diego. So it started by making small talk about road trips in general and asking me about various California landmarks that I’ve visited, culminating in asking me more about San Diego (where I live). When it asked me what places I liked visiting the most here, I mentioned the San Diego Zoo and it started telling me a bit about what makes this particular zoo notable. It mentioned the concept of “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.avma.org/javma-news/2002-12-01/designing-zoo-habitats-promote-animal-well-being" target="_blank">naturalistic enclosures</a>”—a term I had not heard before—so I asked it to elaborate on what this meant. ChatGPT’s explanation of this concept got me interested in the history of zoos, especially the progression from keeping animals in cages to today’s cageless naturalistic enclosures, which aim to be better for animal welfare. During that segment it mentioned the term “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/Menagerie" target="_blank">menagerie</a>” in passing, which I had not heard of in that context before, so I asked it to elaborate more. It then went back farther in history to describe how a menagerie refers to the phenomenon of ancient rulers keeping exotic animals for display without as much regard for the animals’ well-being. Listening to that made me realize that I had actually heard the term menagerie in reference to a <em>Star Trek</em> episode of some sort, but I forgot which one, so I asked ChatGPT to jog my memory. It turns out that “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/The_Menagerie_(Star_Trek:_The_Original_Series)" target="_blank">The Menagerie</a>” was a very famous episode of the original <em>Star Trek</em> television series, so after chatting about that episode and other famous <em>Star Trek</em> episodes for a bit, we got onto the topic of why that show was canceled after only three seasons but later found a much larger audience in syndication (i.e., reruns). That in turn got me curious about the concept of syndication in the television business, so ChatGPT dived more into this topic. A few more conversational twists and turns later, then I suddenly realized that the hour had flown by and it was time to pull over for a bathroom break. Success!</p>



<p>Now, I don’t expect you to care at all about the details of the conversation I just described since it wasn’t your conversation—it was mine! But I certainly cared about it at the time since I was genuinely curious to learn more about the topics that ChatGPT mentioned, often offhand in the midst of telling me about something else. It felt a bit like diving down a Wikipedia rabbit hole of following related links, where each follow-up question I asked led it down another meandering path. It was perfect for keeping me from getting bored and sleepy during my long drive.</p>



<p>ChatGPT isn’t just good at this sort of superficial “personalized podcast about Wikipedia-level trivia” … it could also engage me in a more substantive conversation about a task I actually needed help with at the moment. In another hour-long car chat, I prompted ChatGPT to help me design a method to organize my huge collection of almost 30 years’ worth of personal and work-related files for backup. I’ve been diligent about data backup throughout my life, but my files are fragmented amongst different media over the years—burning CDs and DVDs back in the day, several generations of external hard drives (that are in various states of decay), university servers, Dropbox, and other cloud services. For years I had an aspirational goal of unifying all of my backups into one central directory tree, akin to the concept of a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/Monorepo" target="_blank">monorepo</a> in software development. I’ve recently been brainstorming ideas for how to design such a system and how to deal with the practical challenges of scaling and maintenance. So I figured that ChatGPT could help me brainstorm during one of my long drives. Again it did a good job at engaging me in this bespoke conversation, and the hour flew by before I had to take a rest stop. I won’t bore you with details of what we discussed, but it felt like talking with an expert in data management who was giving me advice about how to deal with my particular challenge.</p>



<h3><strong>Intermission: Why It Feels Kind of Magical</strong></h3>



<p>Skeptical readers may be thinking at this point, “What’s the big deal, it’s just ChatGPT under the hood. I can already do all this from my computer by typing into the ChatGPT text box!” Although that’s technically true, there’s something magical about being able to do this all hands-free via voice. If you don’t believe me, just try it for an hour. My folk theory is that speaking and listening are hardwired into our brain’s innate language circuitry, but writing and reading are learned skills (i.e., “software” rather than “hardware” in our brains). And that’s why it feels more magical to hold a verbal conversation with an AI versus having the exact same conversation in a text box on a screen. If the AI is good enough, then it almost feels like you’re talking to a real person … at certain times when I was getting deep into a back-and-forth conversation I nearly forgot I was talking to a machine. However, that illusion broke in several ways …</p>



<h3><strong>The Not-So-Good: Usability Limitations of the ChatGPT Voice Interface</strong></h3>



<p>Despite my positive experiences with ChatGPT’s voice mode, it still didn’t live up to the gold standard of feeling like I was talking with a fellow human being. That’s okay, though, since this is an incredibly high bar! Here are some of the ways it fell short.</p>



<ul><li><strong>Must speak entire request all at once</strong>: Most notably, it felt unnatural to have to speak my entire request all at once without pausing. Whenever I paused for too long, ChatGPT would interpret what I said so far as my request and start processing it. As an analogy, when typing a request in a text chat, you can hit the Enter or Send buttons … imagine how weird it would be if ChatGPT started answering you the very moment you stopped typing for one second! Note that in human conversations, especially face-to-face, we use visual cues to tell whether our conversation partner is done talking or whether they are pausing a bit to think about the next thing to say. Even over the phone, we can tell by vocal inflections whether they are temporarily paused and want to keep talking, or whether they are done with their turn and ready for us to respond. Since ChatGPT can’t do any of that (yet!) I often had to think hard about what I wanted to say and then say it all at once without pausing. This was fine for simple requests like “Tell me more about naturalistic enclosures in zoos,” but for more complex requests like describing some facet of my data backup setup, it was painful to have to blurt out as much as I could without pausing. Even more annoyingly, I would sometimes make mistakes when talking so much all at once without pausing. Ideally the app would do a better job at detecting pauses in human speech, taking both context and vocal intonations into account. An easier hack would be to have a voice command like “DONE” or “OVER” (like when people use walkie-talkies) to signal that I am done talking; however, this would also feel unnatural for casual users.</li><li><strong>Unpredictable wait times</strong>: Wait times (latency) for ChatGPT’s responses are unpredictable, and there aren’t audio cues to help me establish an expectation for how long I need to wait before it responds. There’s a click sound when it starts processing my request, but then I may need to wait a few seconds in silence before hearing a response … maybe it’s only one second or maybe it’s five seconds. That said, if I ask it to browse the web, then it plays a continuous waiting sound; web browsing takes longer, maybe 10 to 20 seconds, but at least I get to hear a “waiting” sound. (I don’t mind ChatGPT taking longer here since a human would also take more time to browse the web. However, web browsing is annoying when I don’t explicitly ask it to browse. Oftentimes I want a fast answer but something I say triggers a browse without me intending to.) In contrast, when speaking with a human face-to-face, I can use visual cues to tell whether the other person is deep in thought or when they will likely respond; and even over the phone the other person may say “ummm” or “hold on one sec, lemme think” or “ok let me look this up on the web, hang tight for a while …” if they need more time to think through their response. However, since I don’t get any of these verbal cues from ChatGPT, unpredictable wait times break the illusion of talking to a person.</li><li><strong>Cannot interrupt while it is speaking</strong>: I always had to wait for ChatGPT to completely finish talking before it would listen to my next request. And since I never know ahead of time how long it planned to talk for during a particular turn (i.e., how many words its LLM-generated response is), when I wanted to say something midway it was aggravating to have to wait. I later saw that I could actually interrupt it by tapping on the app on my phone screen, but since I was driving and hands-free, I couldn’t safely do that. Also, that seems like a cumbersome interaction; I should be able to just talk when I want to, even when it is talking. This limitation made the conversation feel like we were using a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/Walkie-talkie" target="_blank">walkie-talkie</a> where only one party can talk at once. And it’s not just me—this concept of <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://scholar.google.com/scholar?hl=en&amp;as_sdt=0%2C5&amp;q=overlapping+speech&amp;btnG=" target="_blank">overlapping speech</a> is widely studied in linguistics and communication research. Humans naturally talk over one another for various reasons, so not being able to do this with ChatGPT made our conversation feel less fluid. Even implementing a feature like a voice command for interruption would be great, like maybe if I say “pause” or “wait” then it could stop and await my request.</li><li><strong>Speech recognition errors</strong>: ChatGPT’s speech recognition system (presumably based on <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/openai/whisper" target="_blank">OpenAI’s open source Whisper model</a>) is very good, but it does at times misinterpret what I’m saying. What’s stranger is that sometimes it thinks I said something when I didn’t, maybe because it picked up on background rumbles in my car. Several times I wouldn’t be saying anything and suddenly it responds out of the blue; and when I check the written transcript later, it thinks that I said something like “Thank you for watching!” (which I never said). At other times it tries to prematurely end the conversation even though I’m not done, maybe because it mistakenly detected that I said something along the lines of “Thanks …” without any follow-up. Misrecognizing words is forgivable, but I feel that it shouldn’t ever interpret background sounds as words. Of course, if there were other people in the car with me and either they talked or I was talking to them, then I could also understand how ChatGPT would mistakenly interpret that as being a request for it; always-listening home assistants like Alexa have had this issue for years. A more advanced AI would learn to filter out both other people’s voices and also infer when I was speaking with someone else and not it. For instance, when it detects that my sentence is way off topic, maybe that means I’m speaking with someone else in the car; it could at least ask me “Were you talking to me just now?” when it is uncertain. More generally, the idea of explicitly asking me for clarification when it is uncertain would go a long way toward making these interactions feel more human; that’s what I (a representative human!) would do if I were on a noisy phone connection with someone and didn’t hear them clearly.</li><li><strong>Overly agreeable artificial tone</strong>: Lastly, it’s still ChatGPT under the hood, so all the regular limitations of ChatGPT apply here. Most notably, ChatGPT is tuned to be overly friendly and overly agreeable (sounding like a customer service agent) so it will simply go along with whatever you assert. Thus, by default it will not be good at pushing back on you or challenging your thinking in any meaningful ways, just like how you wouldn’t expect a customer service agent to challenge what you say. Moreover, the overly friendly tone of its responses could come off as insincere and almost sarcastic at times, even though that wasn’t the designers’ intent. Relatedly, it had a tendency to ask me superficial questions after it responds, which sound mildly condescending and break the flow of our chat, like, “Sooo, what do YOU think about the San Diego Zoo? What’s YOUR favorite part of the zoo?!?” … when a normal human wouldn’t break the conversational flow so awkwardly like that. Lastly, ChatGPT is trained on data on the public internet (and can also browse the web to get more updated web contents), so it won’t do as well if you’re asking about things that haven&#8217;t been discussed much online.</li></ul>



<p>To summarize the above limitations, <em>chatting with ChatGPT on my phone felt like using a walkie-talkie over a noisy channel to talk to an overly agreeable but socially unaware customer service agent who has extensive knowledge about the contents of the public internet.</em></p>



<h3><strong>Parting Thoughts: Cautiously Optimistic About the Future</strong></h3>



<p>Despite these limitations, I’m excited to see what’s in store for future voice interfaces to LLM-based AI tools like ChatGPT. My early experiences of talking with ChatGPT while driving gave me a glimpse into what many of us have seen growing up in sci-fi shows such as <em>Star Trek</em>, where people can talk to an omnipresent computer to ask questions, hold conversations, or issue commands. Hands-free operation isn’t useful only while driving—it can <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://dl.acm.org/doi/10.1145/329124.329126" target="_blank">make computing truly ubiquitous</a> by letting us seamlessly interact with computation while we are in the midst of doing housework, cooking, or childcare; and it can make computing more accessible to broader groups of people, such as those with mobility impairments.</p>



<p>We still have a long way to go, though. Right now the ChatGPT iPhone app isn’t hooked up to external tools beside a basic web browser, but with the <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://openai.com/blog/introducing-gpts" target="_blank">recently announced GPT store</a> (and likely upcoming LLM app stores from other companies) it will soon be possible to hook up LLMs to a variety of tools that can manage our emails, shopping lists, personal finances, home automation, and more. Recent research has started exploring these ideas by connecting ChatGPT to home assistants such as Amazon Alexa (<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2309.13879" target="_blank">2023 arXiv paper PDF</a>). Another promising line of work is better context awareness: for instance, Meta and Ray-Ban recently announced <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.meta.com/smart-glasses/" target="_blank">new Smart Glasses</a> which allow users to chat with an AI assistant that can see what they are seeing (<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.youtube.com/watch?v=pgiWqkvIclk&amp;ab_channel=TheVerge" target="_blank">review from <em>The Verge</em></a>). In my driving scenario, you could imagine wearing these glasses and having the AI act more like a passenger sitting alongside you in the car seeing what you see rather than someone on the other end of a phone call. Critically, a passenger can pause the conversation and tell you to watch the road more carefully if they see a possible danger ahead; a future AI powered by such smart glasses may be able to do the same thing. Alternatively, cars are now starting to directly embed AI into entertainment systems (e.g., <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.volkswagen-newsroom.com/en/press-releases/world-premiere-at-ces-volkswagen-integrates-chatgpt-into-its-vehicles-18048" target="_blank">Volkswagen announcement at CES 2024</a>), so future iterations could integrate cameras and 3D tracking to complement LLMs. One could also imagine smartglasses-based multimodal interactions where you point to objects in any physical environment and start conversations with the AI assistant about your surroundings (check out this <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.youtube.com/shorts/XKGJTMJVRBs" target="_blank">MKBHD YouTube Short showing AI chat with smart glasses</a>).</p>



<p>Of course, these increasingly intense levels of AI interaction and automation come with risks, such as user overreliance, unintended command execution, mental or physical health hazards, and security/privacy violations. Thus, it will be important to design ways to both manage those risks and educate users about how to safely operate these increasingly powerful systems. Thank you very much for reading. Sooo, what do YOU think about ChatGPT’s voice mode?!? What are YOUR favorite and least favorite parts?</p>
]]></content:encoded>
							<wfw:commentRss>https://www.oreilly.com/radar/i-actually-chatted-with-chatgpt/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
							</item>
		<item>
		<title>Can Language Models Replace Compilers?</title>
		<link>https://www.oreilly.com/radar/can-language-models-replace-compilers/</link>
				<comments>https://www.oreilly.com/radar/can-language-models-replace-compilers/#respond</comments>
				<pubDate>Tue, 09 Jan 2024 13:14:10 +0000</pubDate>
		<dc:creator><![CDATA[Mike Loukides]]></dc:creator>
				<category><![CDATA[AI & ML]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Commentary]]></category>

		<guid isPermaLink="false">https://www.oreilly.com/radar/?p=15317</guid>
				<description><![CDATA[Kevlin Henney and I recently discussed whether automated code generation, using some future version of GitHub Copilot or the like, could ever replace higher-level languages. Specifically, could ChatGPT N (for large N) quit the game of generating code in a high-level language like Python and produce executable machine code directly, like compilers do today? It’s [&#8230;]]]></description>
								<content:encoded><![CDATA[
<p>Kevlin Henney and I recently discussed whether automated code generation, using some future version of GitHub Copilot or the like, could ever replace higher-level languages. Specifically, could ChatGPT N (for large N) quit the game of generating code in a high-level language like Python and produce executable machine code directly, like compilers do today?</p>



<p>It’s not really an academic question. As coding assistants become more accurate, it seems likely to assume that they will eventually stop being “assistants” and take over the job of writing code. That will be a big change for professional programmers—though writing code is a small part of what programmers actually do. To some extent, it’s happening now: ChatGPT 4’s “Advanced Data Analysis” can generate code in Python, run it in a sandbox, collect error messages, and try to debug it. Google’s Bard has similar capabilities. Python is an interpreted language, so there’s no machine code, but there’s no reason this loop couldn’t incorporate a C or C++ compiler.</p>



<p>This kind of change has happened before: in the early days of computing, programmers “wrote” programs by plugging in wires, then by toggling in binary numbers, then by writing assembly language code, and finally (in the late 1950s) using early programming languages like COBOL (1959) and FORTRAN (1957). To people who programmed using circuit diagrams and switches, these early languages looked as radical as programming with generative AI looks today. COBOL was—literally—an early attempt to make programming as simple as writing English.</p>



<p>Kevlin made the point that higher-level languages are a “repository of determinism” that we can’t do without—at least, not yet. While a “repository of determinism” sounds a bit evil (feel free to come up with your own name), it’s important to understand why it is needed. At almost every stage of programming history, there has been a repository of determinism. When programmers wrote in assembly language, they had to look at the binary 1s and 0s to see exactly what the computer was doing. When programmers wrote in FORTRAN (or, for that matter, C), the repository of determinism moved higher: the source code expressed what programmers wanted and it was up to the compiler to deliver the correct machine instructions. However, the status of this repository was still shaky. Early compilers were not as reliable as we’ve come to expect. They had bugs, particularly if they were optimizing your code (were optimizing compilers a forerunner of AI?). Portability was problematic at best: every vendor had its own compiler, with its own quirks and its own extensions. Assembly was still the “court of last resort” for determining why your program didn’t work. The repository of determinism was only effective for a single vendor, computer, and operating system.<sup>1</sup> The need to make higher-level languages deterministic across computing platforms drove the development of language standards and specifications.</p>



<p>These days, very few people need to know assembler. You need to know assembler for a few tricky situations when writing device drivers or to work with some dark corners of the operating system kernel, and that’s about it. But while the way we program has changed, the structure of programming hasn’t. Especially with tools like ChatGPT and Bard, we still need a repository of determinism, but that repository is no longer assembly language. With C or Python, you can read a program and understand exactly what it does. If the program behaves in unexpected ways, it’s much more likely that you’ve misunderstood some corner of the language’s specification than that the C compiler or Python interpreter got it wrong. And that’s important: that’s what allows us to debug successfully. The source code tells us exactly what the computer is doing, at a reasonable layer of abstraction. If it’s not doing what we want, we can analyze the code and correct it. That may require rereading Kernighan and Ritchie, but it’s a tractable, well-understood problem. We no longer have to look at the machine language—and that’s a very good thing, because with instruction reordering, speculative execution, and long pipelines, understanding a program at the machine level is a lot more difficult than it was in the 1960s and 1970s. We need that layer of abstraction. But that abstraction layer must also be deterministic. It must be completely predictable. It must behave the same way every time you compile and run the program.</p>



<p>Why do we need the abstraction layer to be deterministic? Because we need a reliable statement of exactly what the software does. All of computing, including AI, rests on the ability of computers to do something reliably and repeatedly, millions, billions, or even trillions of times.&nbsp;If you don’t know exactly what the software does—or if it might do something different the next time you compile it—you can’t build a business around it. You certainly can’t maintain it, extend it, or add new features if it changes whenever you touch it, nor can you debug it.</p>



<p>Automated code generation doesn’t yet have the kind of reliability we expect from traditional programming; Simon Willison calls this “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://simonwillison.net/2023/Dec/31/ai-in-2023/" target="_blank">vibes-based development</a>.” We still rely on humans to test and fix the errors. More to the point: you’re likely to generate code many times en route to a solution; you’re not likely to take the results of your first prompt and jump directly into debugging any more than you’re likely to write a complex program in Python and get it right the first time. Writing prompts for any significant software system isn’t trivial; the prompts can be very lengthy, and it takes several tries to get them right. With the current models, every time you generate code, you’re likely to get something different. (Bard even gives you several alternatives to choose from.) The process isn’t repeatable. How do you understand what the program is doing if it’s a different program each time you generate and test it? How do you know whether you’re progressing towards a solution if the next version of the program may be completely different from the previous?</p>



<p>It’s tempting to think that this variation is controllable by setting a variable like GPT-4’s “temperature” to 0; “temperature” controls the amount of variation (or originality, or unpredictability) between responses. But that doesn’t solve the problem. Temperature only works within limits, and one of those limits is that the prompt must remain constant. Change the prompt to help the AI generate correct or well-designed code, and you’re outside of those limits. Another limit is that the model itself can’t change—but models change all the time, and those changes aren’t under the programmer’s control. All models are eventually updated, and there’s no guarantee that the code produced will stay the same across updates to the model. An updated model is likely to produce completely different source code. That source code will need to be understood (and debugged) on its own terms.</p>



<p>So the natural language prompt can’t be the repository of determinism. This doesn’t mean that AI-generated code isn’t useful; it can provide a good starting point to work from. But at some point, programmers need to be able to reproduce and reason about bugs: that’s the point at which you need repeatability and can’t tolerate surprises. Also at that point, programmers will have to refrain from regenerating the high-level code from the natural language prompt. The AI is effectively creating a first draft, and that may (or may not) save you effort compared to starting from a blank screen. Adding features to go from version 1.0 to 2.0 raises a similar problem. Even the largest context windows can’t hold an entire software system, so it’s necessary to work one source file at a time—exactly the way we work now, but again, with the source code as the repository of determinism. Furthermore, it’s difficult to tell a language model what it’s allowed to change and what should remain untouched: “modify this loop only, but not the rest of the file” may or may not work.</p>



<p>This argument doesn’t apply to coding assistants like GitHub Copilot. Copilot is aptly named: it’s an assistant to the pilot, not the pilot. You can tell it precisely what you want done, and where. When you use ChatGPT or Bard to write code, you’re not the pilot or the copilot; you’re the passenger. You can tell a pilot to fly you to New York, but from then on, the pilot is in control.</p>



<p>Will generative AI ever be good enough to skip the high-level languages and generate machine code? Can a prompt replace code in a high-level language? After all, we’re already seeing a tools ecosystem that has prompt repositories, no doubt with version control. It’s possible that generative AI will eventually be able to replace programming languages for day-to-day scripting (“Generate a graph from two columns of this spreadsheet”). But for larger programming projects, keep in mind that part of human language’s value is its ambiguity, and a programming language is valuable precisely because it isn’t ambiguous. As generative AI penetrates further into programming, we will undoubtedly see stylized dialects of human languages that have less ambiguous semantics; those dialects may even become standardized and documented. But “stylized dialects with less ambiguous semantics” is really just a fancy name for prompt engineering, and if you want precise control over the results, prompt engineering isn’t as simple as it seems. We still need a repository of determinism, a layer in the programming stack where there are no surprises, a layer that provides the definitive word on what the computer will do when the code executes. Generative AI isn’t up to that task. At least, not yet. </p>



<hr class="wp-block-separator" />



<h3>Footnote</h3>



<ol><li>If you were in the computing industry in the 1980s, you may remember the need to “reproduce the behavior of VAX/VMS FORTRAN bug for bug.”</li></ol>
]]></content:encoded>
							<wfw:commentRss>https://www.oreilly.com/radar/can-language-models-replace-compilers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
							</item>
		<item>
		<title>Radar Trends to Watch: January 2024</title>
		<link>https://www.oreilly.com/radar/radar-trends-to-watch-january-2024/</link>
				<comments>https://www.oreilly.com/radar/radar-trends-to-watch-january-2024/#respond</comments>
				<pubDate>Thu, 04 Jan 2024 11:08:07 +0000</pubDate>
		<dc:creator><![CDATA[Mike Loukides]]></dc:creator>
				<category><![CDATA[Radar Trends]]></category>
		<category><![CDATA[Signals]]></category>

		<guid isPermaLink="false">https://www.oreilly.com/radar/?p=15310</guid>
				<description><![CDATA[More large language models. Always more large language models. Will the new year be any different? But there is a difference in this month’s AI news: there’s an emphasis on tools that make it easy for users to use models. Whether it’s just tweaking a URL so you can ask questions of a paper on [&#8230;]]]></description>
								<content:encoded><![CDATA[
<p>More large language models. Always more large language models. Will the new year be any different? But there is a difference in this month’s AI news: there’s an emphasis on tools that make it easy for users to use models. Whether it’s just tweaking a URL so you can ask questions of a paper on arXiv or using LLamafile to run a model on your laptop (make sure you have a lot of memory!) or using the Notebook Language Model to query your own documents, AI is becoming widely accessible—and not just a toy with a web interface.</p>



<h2>Artificial Intelligence</h2>



<ul><li>Adding talk2 to the start of any arXiv URL (e.g., talk2arxiv.org) loads the paper into an AI chat application so you can talk to it. This is a very clever <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/evanhu1/talk2arxiv" target="_blank">application of the RAG pattern</a>.</li><li>Google’s Autonomous Vehicle startup, Waymo, has <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/cars/2023/12/human-drivers-crash-a-lot-more-than-waymos-software-data-shows/" target="_blank">reported</a> a total of three minor injuries to humans in over 7 million miles of driving. This is clearly not Tesla, not Uber, not Cruise.</li><li>Google’s DeepMind has used a large language model to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.technologyreview.com/2023/12/14/1085318/google-deepmind-large-language-model-solve-unsolvable-math-problem-cap-set/" target="_blank">solve</a> a previously unsolved <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.nature.com/articles/s41586-023-06924-6" target="_blank">problem</a> in mathematics. This is arguably the first time a language model has created information that didn’t previously exist.</li><li>The creator of <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/Mozilla-Ocho/llamafile" target="_blank">llamafile</a> has offered a set of <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://justine.lol/oneliners/" target="_blank">one-line bash scripts</a> for laptop-powered AI. </li><li>Microsoft has <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/" target="_blank">released</a> a small language model named Phi-2. Phi-2 is a 2.7B parameter model that has been trained extensively on “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.microsoft.com/en-us/research/publication/textbooks-are-all-you-need/" target="_blank">textbook-quality data</a>.” Without naming names, they claim performance superior to Llama 2.</li><li>Claude, Anthropic’s large language model, <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://docs.anthropic.com/claude/docs/using-claude-for-sheets" target="_blank">can be used in Google Sheets</a> via a browser extension.</li><li>The <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://notebooklm.google.com/?pli=1" target="_blank">Notebook Language Model</a> is a RAG implementation designed for individuals. It is a Google notebook (similar to Colab or Jupyter) that allows you to upload documents and then ask questions about those documents.</li><li>The European Union is about to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.technologyreview.com/2023/12/11/1084942/five-things-you-need-to-know-about-the-eus-new-ai-act/" target="_blank">pass its AI Act</a>, which will be the world’s most significant attempt to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.europarl.europa.eu/news/en/press-room/20231206IPR15699/artificial-intelligence-act-deal-on-comprehensive-rules-for-trustworthy-ai" target="_blank">regulate</a> artificial intelligence.</li><li>Mistral has <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://mistral.ai/news/mixtral-of-experts/" target="_blank">released</a> <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://huggingface.co/docs/transformers/model_doc/mixtral" target="_blank">Mixtral</a> 8x7B, a mixture-of-experts model in which the model first determines which of eight sets of 7 billion parameters will generate the best response to a prompt. The results compare well to Llama 2. Mistral 7B and Mixtral can be run with <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/Mozilla-Ocho/llamafile" target="_blank">Llamafile</a>.</li><li>Meta has <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://ai.meta.com/blog/purple-llama-open-trust-safety-generative-ai/" target="_blank">announced</a> Purple Llama, a project around trust and safety for large language models. They have released a set of benchmarks for evaluating model safety, along with a classifier for filtering unsafe input (prompts) and model output.</li><li>The <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://postgresml.org/blog/introducing-the-openai-switch-kit-move-from-closed-to-open-source-ai-in-minutes" target="_blank">Switch Kit</a> is an open source software development kit that allows you to replace OpenAI with an open source language model easily.</li><li>Google has <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://deepmind.google/technologies/gemini/#build-with-gemini" target="_blank">announced</a> that its multimodal Gemini AI model is available to software developers via their AI Studio and Vertex AI.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenextweb.com/news/new-ai-tool-democratised-image-generation" target="_blank">Progressive upscaling</a> is a technique for starting with a low-resolution image and using AI to increase the resolution. It reduces the computational power needed to generate high-resolution images. It has been implemented as a plug-in to Stable Diffusion called <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2311.16973" target="_blank">DemoFusion</a>.</li><li>The internet enabled mass surveillance, but that still leaves you with exabytes of data to analyze. According to Bruce Schneier, AI’s ability to analyze and draw conclusions from that data enables “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.schneier.com/blog/archives/2023/12/the-internet-enabled-mass-surveillance-ai-will-enable-mass-spying.html" target="_blank">mass spying</a>.”</li><li>A group of over 50 organizations, including Meta, IBM, and Hugging Face, has <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://9to5mac.com/2023/12/05/ai-alliance/" target="_blank">formed the AI Alliance</a> to focus on the development of open source models.</li><li>DeepMind has built an AI system that demonstrates <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://techxplore.com/news/2023-12-deepmind-ai-social-capabilities.html" target="_blank">social learning</a>: the ability to learn how to solve a problem by observing an expert.</li><li>Are neural networks the only way to build artificial intelligence?&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://hivekit.io/blog/building-ai-without-a-neural-network/" target="_blank">Hivekit</a> is building tools for a distributed spatial rules engine that can provide the communications layer for hives, swarms, and colonies.</li><li>The proliferation of AI testing tools continues with <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://techxplore.com/news/2023-12-ai-gaia-benchmark-tool-general.html" target="_blank">Gaia</a>, a benchmark suite intended to determine whether AI systems are, indeed, intelligent. The benchmark consists of a set of questions that are easy for humans to answer but difficult for computers.</li><li>Meta has just published a suite of multilingual spoken language models called <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://ai.meta.com/research/seamless-communication/" target="_blank">Seamless</a>. The models are capable of near real-time translation and claim to be more faithful to natural human expression.</li><li>In an experiment simulating a stock market, a stock-trading <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.schneier.com/blog/archives/2023/12/ai-decides-to-engage-in-insider-trading.html" target="_blank">AI system engaged in “insider trading”</a> after being put under pressure to show greater returns and receiving “tips” from company “employees.”</li><li>What’s the best way to run a large language model on your laptop?&nbsp; Simon Willison <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://simonwillison.net/2023/Nov/29/llamafile/" target="_blank">recommends</a> <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://huggingface.co/jartine/llava-v1.5-7B-GGUF/blob/main/llamafile-server-0.1-llava-v1.5-7b-q4" target="_blank">llamafile</a>, which packages a model together with the weights as a single (large) executable that works on multiple operating systems.</li><li>Further work on <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://not-just-memorization.github.io/extracting-training-data-from-chatgpt.html" target="_blank">extracting training data from ChatGPT</a>, this time against the production model, shows that these systems may be opaque, but they aren’t quite “black boxes.”</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://press.aboutamazon.com/2023/11/aws-announces-amazon-q-to-reimagine-the-future-of-work" target="_blank">Amazon Q</a> is a new large language model that includes a chatbot and other tools to aid office workers. It can be customized by individual businesses that subscribe to the service so that it has access to their proprietary data.</li></ul>



<h2>Programming</h2>



<ul><li>A new language superset: <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://pluto-lang.org/docs/Introduction" target="_blank">Pluto</a> is a superset of Lua. Supersetting may be the “new thing” in language design: TypeScript, Mojo, and a few others (including the first versions of C++) come to mind.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/the-new-age-of-virtualization/" target="_blank">Virtualization within containers orchestrated by Kubernetes</a>: Can you imagine a Kubernetes cluster running within a Docker container? Is that a good thing or evidence of how a stack’s complexity can grow without bounds?</li><li>Google engineers <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://dl.acm.org/doi/10.1145/3593856.3595909?utm_source=thenewstack&amp;utm_medium=website&amp;utm_content=inline-mention&amp;utm_campaign=platform" target="_blank">propose</a> an <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/year-in-review-was-2023-a-turning-point-for-microservices/" target="_blank">alternative to microservices</a>: limited monoliths that are deployed by an automated runtime that determines where and when to instantiate them. As Kelsey Hightower said, deployment architecture becomes an implementation detail.</li><li>The OpenBao project is intended to be an open source fork of HashiCorp’s Vault, analogous to the OpenTofu fork of Terraform. There is <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/meet-openbao-an-open-source-fork-of-hashicorp-vagrant/" target="_blank">speculation</a> that IBM will back both projects.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.biscuitsec.org/" target="_blank">Biscuit authorization</a> is a distributed authorization protocol that is relatively small, flexible, and is designed for use in distributed systems. Any node can validate a Biscuit token using only public information.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://gokrazy.org/" target="_blank">gokrazy</a> is a minimal Go runtime environment for the Raspberry Pi and (some) PCs. It minimizes maintenance by eliminating everything that isn’t needed to compile and run Go programs.</li><li>You very clearly don’t need this: A <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/nst/bfps" target="_blank">Brainfuck interpreter written in PostScript</a>. (If you really must know, Brainfuck is arguably the world’s most uncomfortable programming language, and PostScript is the language your computer sends to a printer.)</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://baserow.io/" target="_blank">Baserow</a> is a no-code, open source tool that combines a spreadsheet with a database. It’s similar to Airtable.</li><li>New programming language of the month: <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://wasmer.io/posts/onyxlang-powered-by-wasmer" target="_blank">Onyx</a> is a new programming language designed to generate WebAssembly (Wasm), using <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://wasmer.io/" target="_blank">Wasmer</a> as the underlying runtime.</li></ul>



<h2>Web</h2>



<ul><li>Anil Dash predicts that <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.rollingstone.com/culture/culture-commentary/internet-future-about-to-get-weird-1234938403/" target="_blank">the internet is about to get weird again</a>—the way it should be. Power is shifting from the entrenched, heavily funded “walled gardens” and back to people who just want to be creative.</li><li>Meta’s Threads has begun to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.theverge.com/2023/12/13/24000120/threads-meta-activitypub-test-mastodon" target="_blank">test integration with ActivityPub</a>, which will make it accessible to Mastodon servers. </li><li>The <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.technologyreview.com/2023/12/21/1084525/internet-whimsy-html-energy/" target="_blank">HTML Energy</a> movement attempts to reclaim the creativity of the early web by building sites from scratch with HTML and abandoning high-powered web frameworks.</li><li>The best WebAssembly runtime might be <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://00f.net/2023/12/11/webassembly-compilation-to-c/" target="_blank">no runtime</a> at all: just transpile it to C. </li></ul>



<h2>Security</h2>



<ul><li>Researchers have <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/security/2023/12/hackers-can-break-ssh-channel-integrity-using-novel-data-corruption-attack/" target="_blank">discovered</a> a man-in-the-middle attack against SSH, one of the foundations of cybersecurity.</li><li>A new version of SSH (<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/francoismichel/ssh3" target="_blank">SSH3</a>) promises to be faster and more feature-rich. It is based on HTTP/3 and written in Go.</li><li>Security researchers have <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://techxplore.com/news/2023-12-customized-gpt-vulnerability.html" target="_blank">demonstrated</a> two important vulnerabilities in OpenAI’s custom GPTs. Malicious actors can extract system prompts, and they can force it to leak uploaded files and other data.</li><li>Meta has made end-to-end encryption (E2EE) the default for all users of Messenger and Facebook messaging. Their E2EE implementation is based on Signal’s. They have built a new <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.bleepingcomputer.com/news/security/meta-rolls-out-default-end-to-end-encryption-on-messenger-facebook/" target="_blank">storage and retrieval service</a> for encrypted messages.</li><li>A chatbot driven by a jailbroken language model can be used to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.scientificamerican.com/article/jailbroken-ai-chatbots-can-jailbreak-other-chatbots/" target="_blank">jailbreak other chatbots</a>. Language models are very good at coming up with prompts that get other models to go outside their boundaries, with success rates of 40% to 60%. AI security will be a key topic this year.</li></ul>



<h2>Quantum Computing</h2>



<ul><li>IBM has developed a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/science/2023/12/ibm-adds-error-correction-to-updated-quantum-computing-roadmap/" target="_blank">1121 qubit quantum processor</a>, along with a system built from three 133 qubit processor chips that greatly improves the accuracy of quantum gates. Working quantum computers will probably require over a million qubits, but this is a big step forward.</li><li>A research group has <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/science/2023/12/quantum-computer-performs-error-resistant-operations-with-logical-qubits/" target="_blank">announced</a> that it can perform computations on 48 logical (i.e., error-corrected) qubits. While there are a number of limitations to their work, it’s an important step toward practical quantum computing.</li><li>Two posts about post-<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://blog.cryptographyengineering.com/2023/10/06/to-schnorr-and-beyond-part-1/" target="_blank">quantum</a> <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://blog.cryptographyengineering.com/2023/11/30/to-schnorr-and-beyond-part-2/" target="_blank">cryptography</a> explain what it’s about.</li></ul>



<h2>Brains</h2>



<ul><li>Researchers have developed a noninvasive system that can <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://techxplore.com/news/2023-12-portable-non-invasive-mind-reading-ai-thoughts.html" target="_blank">turn human thought into text</a>. Users wear a cap with sensors that generates EEG data. Accuracy isn’t very high yet, but it is already superior to other thought-to-speech technologies.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.technologyreview.com/2023/12/11/1084926/human-brain-cells-chip-organoid-speech-recognition/" target="_blank">Artificial neural networks with brains</a>: Researchers connected cultured human brain cells (organoids) to an interface that allowed them to give the organoids audio data. They found that it was able to recognize vowel sounds.</li></ul>



<h2>Virtual and Augmented Reality</h2>



<ul><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/PixarAnimationStudios/OpenUSD" target="_blank">OpenUSD</a> is an open source <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/openusd-could-enable-a-real-metaverse/" target="_blank">standard for scene representation</a> that could enable a real metaverse, not the proprietary walled garden imagined by last year’s metaverse advocates.</li></ul>
]]></content:encoded>
							<wfw:commentRss>https://www.oreilly.com/radar/radar-trends-to-watch-january-2024/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
							</item>
		<item>
		<title>Copyright, AI, and Provenance</title>
		<link>https://www.oreilly.com/radar/copyright-ai-and-provenance/</link>
				<comments>https://www.oreilly.com/radar/copyright-ai-and-provenance/#respond</comments>
				<pubDate>Tue, 12 Dec 2023 10:54:00 +0000</pubDate>
		<dc:creator><![CDATA[Mike Loukides and Tim O’Reilly]]></dc:creator>
				<category><![CDATA[AI & ML]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Deep Dive]]></category>

		<guid isPermaLink="false">https://www.oreilly.com/radar/?p=15292</guid>
				<description><![CDATA[Generative AI stretches our current copyright law in unforeseen and uncomfortable ways. In the US, the Copyright Office has issued guidance stating that the output of image-generating AI isn’t copyrightable unless human creativity has gone into the prompts that generated the output. This ruling in itself raises many questions: How much creativity is needed, and [&#8230;]]]></description>
								<content:encoded><![CDATA[
<p>Generative AI stretches our current copyright law in unforeseen and uncomfortable ways. In the US, the Copyright Office has issued <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.federalregister.gov/documents/2023/03/16/2023-05321/copyright-registration-guidance-works-containing-material-generated-by-artificial-intelligence" target="_blank">guidance</a> stating that the output of image-generating AI isn’t copyrightable unless human creativity has gone into the prompts that generated the output. This ruling in itself raises many questions: How much creativity is needed, and is that the same kind of creativity that an artist exercises with a paintbrush? If a human writes software to generate prompts that in turn generate an image, is that copyrightable? If the output of a model can’t be owned by a human, who (or what) is responsible if that output infringes existing copyright? Is an artist’s style copyrightable, and if so, what does that mean?</p>



<p>Another group <a rel="noreferrer noopener" aria-label="of (opens in a new tab)" href="https://www.reuters.com/technology/more-writers-sue-openai-copyright-infringement-over-ai-training-2023-09-11/" target="_blank">of</a> <a rel="noreferrer noopener" aria-label="cases (opens in a new tab)" href="https://authorsguild.org/news/ag-and-authors-file-class-action-suit-against-openai/" target="_blank">cases</a> involving text (typically novels and novelists) argue that using copyrighted texts as part of the training data for a large language model (LLM) is itself copyright infringement,<sup><a href="#foot1">1</a></sup> even if the model never reproduces those texts as part of its output. But reading texts has been part of the human learning process as long as reading has existed, and while we pay to buy books, we don’t pay to learn from them. These cases often point out that the texts used in training were acquired from pirated sources—which makes for good press, although that claim has no legal value. Copyright law says nothing about whether texts are acquired legally or illegally.</p>



<p>How do we make sense of this? What should copyright law mean in the age of artificial intelligence?</p>



<p>In an article in <em>The New Yorker</em>, Jaron Lanier introduces the idea of <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.newyorker.com/science/annals-of-artificial-intelligence/there-is-no-ai" target="_blank">data dignity,</a> which implicitly distinguishes between training a model and generating output using a model. Training an LLM means teaching it how to understand and reproduce human language. (The word “teaching” arguably invests too much humanity into what is still software and silicon.) Generating output means what it says: providing the model instructions that cause it to produce something. Lanier argues that training a model should be a protected activity but that the output generated by a model can infringe on someone’s copyright.</p>



<p>This distinction is attractive for several reasons. First, current copyright law protects “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.copyright.gov/fair-use/#:~:text=Transformative%20uses%20are%20those%20that,purpose%20of%20encouraging%20creative%20expression." target="_blank">transformative use</a>.” You don’t have to understand much about AI to realize that a model is transformative. Reading about the lawsuits reaching the courts, we sometimes have the feeling that authors believe that their works are somehow hidden inside the model, that George R. R. Martin thinks that if he searched through the trillion or so parameters of GPT-4, he’d find the text to his novels. He’s welcome to try, and he won’t succeed. (OpenAI won’t give him the GPT models, but he can download the model for Meta’s Llama 2 and have at it.) This fallacy was probably encouraged by another <em>New Yorker</em> <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.newyorker.com/tech/annals-of-technology/chatgpt-is-a-blurry-jpeg-of-the-web" target="_blank">article</a> arguing that an LLM is like a compressed version of the web. That’s a nice image, but it is fundamentally wrong. What is contained in the model is an enormous set of parameters based on all the content that has been ingested during training, that represents the probability that one word is likely to follow another. A model isn’t a copy or a reproduction, in whole or in part, lossy or lossless, of the data it’s trained on; it is the potential for creating new and different content. AI models are probability engines; an LLM computes the next word that’s most likely to follow the prompt, then the next word most likely to follow that, and so on. The ability to emit a sonnet that Shakespeare never wrote: that’s transformative, even if the new sonnet isn’t very good.</p>



<p>Lanier’s argument is that building a better model is a public good, that the world will be a better place if we have computers that can work directly with human language, and that better models serve us all—even the authors whose works are used to train the model. I can ask a vague, poorly formed question like “In which 21st century novel do two women travel to Parchman prison to pick up one of their husbands who is being released,” and get the answer “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.amazon.com/Sing-Unburied-Novel-Jesmyn-Ward/dp/1501126067" target="_blank"><em>Sing, Unburied, Sing</em></a> by Jesmyn Ward.” (Highly recommended, BTW, and I hope this mention generates a few sales for her.) I can also ask for a reading list about plagues in 16th century England, algorithms for testing prime numbers, or anything else. Any of these prompts might generate book sales—but whether or not sales result, they will have expanded my knowledge. Models that are trained on a wide variety of sources are a good; that good is transformative and should be protected.</p>



<p>The problem with Lanier’s concept of data dignity is that, given the current state of the art in AI models, it is impossible to distinguish meaningfully between “training” and “generating output.” Lanier recognizes that problem in his criticism of the current generation of “black box” AI, in which it’s impossible to connect the output to the training inputs on which the output was based. He asks, “Why don’t bits come attached to the stories of their origins?,” pointing out that this problem has been with us since the beginning of the web. Models are trained by giving them smaller bits of input and asking them to predict the next word billions of times; tweaking the model’s parameters slightly to improve the predictions; and repeating that process thousands, if not millions, of times. The same process is used to generate output, and it’s important to understand why that process makes copyright problematic. If you give a model a prompt about Shakespeare, it might determine that the output should start with the word “To.” Given that it has already chosen “To,” there’s a slightly higher probability that the next word in the output will be “be.” Given that, there’s an even slightly higher probability that the next word will be “or.” And so on. From this standpoint, it’s hard to say that the model is copying the text. It’s just following probabilities—a “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://dl.acm.org/doi/10.1145/3442188.3445922" target="_blank">stochastic parrot</a>.” It’s more like monkeys typing randomly at keyboards than a human plagiarizing a literary text—but these are highly trained, probabilistic monkeys that actually have a chance at reproducing the works of Shakespeare.</p>



<p>An important consequence of this process is that it’s not possible to connect the output back to the training data. Where did the word “or” come from? Yes, it happens to be the next word in Hamlet’s famous soliloquy; but the model wasn’t copying Hamlet, it just picked “or” out of the hundreds of thousands of words it could have chosen, on the basis of statistics. It isn’t being creative in any way we as humans would recognize. It’s maximizing the probability that we (humans) will perceive the output it generates as a valid response to the prompt.</p>



<p>We believe that authors should be compensated for the use of their work—not in the creation of the model, but when the model produces their work as output. Is it possible? For a company like O’Reilly Media, a related question comes into play. Is it possible to distinguish between creative output (“Write in the style of Jesmyn Ward”) and actionable output (“Write a program that converts between current prices of currencies and altcoins”)? The response to the first question might be the start of a new novel—which might be substantially different from anything Ward wrote, and which doesn’t devalue her work any more than her second, third, or fourth novels devalue her first novel. Humans copy each other’s style all the time! That’s why English style post-Hemingway is so distinctive from the style of 19th century authors, and an AI-generated homage to an author might actually increase the value of the original work, much as human “fan-fic” encourages rather than detracts from the popularity of the original.</p>



<p>The response to the second question is a piece of software that could take the place of something a previous author has written and published on GitHub. It could substitute for that software, possibly cutting into the programmer’s revenue. But even these two cases aren’t as different as they first appear. Authors of “literary” fiction are safe, but what about actors or screenwriters whose work could be ingested by a model and transformed into new roles or scripts? There are 175 <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/Nancy_Drew" target="_blank">Nancy Drew</a> books, all “authored” by the nonexistent Carolyn Keene but written by a long chain of ghostwriters. In the future, AIs may be included among those ghostwriters. How do we account for the work of authors—of novels, screenplays, or software—so they can be compensated for their contributions? What about the authors who teach their readers how to master a complicated technology topic? The output of a model that reproduces their work provides a direct substitute rather than a transformative use that may be complementary to the original.</p>



<p>It may not be possible if you use a generative model configured as a chat server by itself. But that isn’t the end of the story. In the year or so since ChatGPT’s release, developers have been building applications on top of the state-of-the-art foundation models. There are many different ways to build applications, but one pattern has become prominent: retrieval-augmented generation, or RAG. RAG is used to build applications that “know about” content that isn’t in the model’s training data. For example, you might want to write a stockholders’ report or generate text for a product catalog. Your company has all the data you need—but your company’s financials obviously weren’t in ChatGPT’s training data. RAG takes your prompt, loads documents in your company’s archive that are relevant, packages everything together, and sends the prompt to the model. It can include instructions like “Only use the data included with this prompt in the response.” (This may be too much information, but this process generally works by generating “embeddings” for the company’s documentation, storing those embeddings in a vector database, and retrieving the documents that have embeddings similar to the user’s original question. Embeddings have the important property that they reflect relationships between words and texts. They make it possible to search for relevant or similar documents.)</p>



<p>While RAG was originally conceived as a way to give a model proprietary information without going through the labor- and compute-intensive process of training, in doing so it creates a connection between the model’s response and the documents from which the response was created. The response is no longer constructed from random words and phrases that are detached from their sources. We have provenance. While it still may be difficult to evaluate the contribution of the different sources (23% from A, 42% from B, 35% from C), and while we can expect a lot of natural language “glue” to have come from the model itself, we’ve taken a big step forward toward Lanier’s data dignity. We’ve created traceability where we previously had only a black box. If we published someone’s currency conversion software in a book or training course and our language model reproduces it in response to a question, we can attribute that to the original source and allocate royalties appropriately. The same would apply to new novels in the style of Jesmyn Ward or, perhaps more appropriately, to the never-named creators of pulp fiction and screenplays.</p>



<p>Google’s “AI-powered overview” feature<sup><a href="#foot2">2</a></sup> is a good example of what we can expect with RAG. We can’t say for certain that it was implemented with RAG, but it clearly follows the pattern. Google, which invented Transformers, knows better than anyone that Transformer-based models destroy metadata unless you do a lot of special engineering. But Google has the best search engine in the world. Given a search string, it’s simple for Google to perform the search, take the top few results, and then send them to a language model for summarization. It relies on the model for language and grammar but derives the content from the documents included in the prompt. That process could give exactly the results shown below: a summary of the search results, with down arrows that you can open to see the sources from which the summary was generated. Whether this feature improves the search experience is a good question: while an interested user can trace the summary back to its source, it places the source two steps away from the summary. You have to click the down arrow, then click on the source to get to the original document. However, that design issue isn’t germane to this discussion. What’s important is that RAG (or something like RAG) has enabled something that wasn’t possible before: we can now trace the sources of an AI system’s output.</p>



<p>Now that we know that it’s possible to produce output that respects copyright and, if appropriate, compensates the author, it’s up to regulators to hold companies accountable for failing to do so, just as they are held accountable for hate speech and other forms of inappropriate content. We should not buy into the assertion of the large LLM providers that this is an impossible task. It is one more of the many business models and ethical challenges that they must overcome.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/12/Fig1.png" alt="" class="wp-image-15293" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/12/Fig1.png 292w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/12/Fig1-135x300.png 135w" sizes="(max-width: 292px) 100vw, 292px" /></figure>



<p>The RAG pattern has other advantages. We’re all familiar with the ability of language models to “hallucinate,” to make up facts that often sound very convincing. We constantly have to remind ourselves that AI is only playing a statistical game, and that its prediction of the most likely response to any prompt is often wrong. It doesn’t know that it is answering a question, nor does it understand the difference between facts and fiction. However, when your application supplies the model with the data needed to construct a response, the probability of hallucination goes down. It doesn’t go to zero, but it is significantly lower than when a model creates a response based purely on its training data. Limiting an AI to sources that are known to be accurate makes the AI’s output more accurate.</p>



<p>We’ve only seen the beginnings of what’s possible. The simple RAG pattern, with one prompt orchestrator, one content database, and one language model, will no doubt become more complex. We will soon see (if we haven’t already) systems that take input from a user, generate a series of prompts (possibly for different models), combine the results into a new prompt, which is then sent to a different model. You can already see this happening in the latest iteration of GPT-4: when you send a prompt asking GPT-4 to generate a picture, it processes that prompt, then sends the results (probably including other instructions) to DALL-E for image generation. Simon Willison has <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://simonwillison.net/2023/Nov/15/gpts/" target="_blank">noted</a> that if the prompt includes an image, GPT-4 never sends that image to DALL-E; it converts the image into a prompt, which is then sent to DALL-E with a modified version of your original prompt. Tracing provenance with these more complex systems will be difficult—but with RAG, we now have the tools to do it.</p>



<p></p>



<hr class="wp-block-separator" />



<h2>AI at O&#8217;Reilly Media</h2>



<p>We’re experimenting with a variety of RAG-inspired ideas on <a rel="noreferrer noopener" aria-label="the O’Reilly learning platform (opens in a new tab)" href="https://learning.oreilly.com/" target="_blank">the O’Reilly learning platform</a>. The first extends <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.oreilly.com/pub/pr/3308" target="_blank">Answers</a>, our AI-based search tool that uses natural language queries to find specific answers in our vast corpus of courses, books, and videos. In this next version, we’re placing Answers directly within the reading context and using an LLM to generate content-specific questions about the material to enhance your understanding of the topic.</p>



<p>For example, if you’re reading about gradient descent, the new version of Answers will generate a set of related questions, such as how to compute a derivative or use a vector library to increase performance. In this instance, RAG is used to identify key concepts and provide links to other resources in the corpus that will deepen the learning experience.</p>



<figure class="wp-block-image size-large is-resized"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/12/Fig2.png" alt="" class="wp-image-15294" width="544" height="389" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/12/Fig2.png 468w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/12/Fig2-300x215.png 300w" sizes="(max-width: 544px) 100vw, 544px" /><figcaption><br><em>Answers 2.0, expected to go into beta in the first half of 2024</em></figcaption></figure>



<p>Our second project is geared toward making our long-form video courses simpler to browse. Working with our friends at <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://designsystemsinternational.com/" target="_blank">Design Systems International</a>, we’re developing a feature called “Ask this course,” which will allow you to “distill” a course into just the question you’ve asked. While conceptually similar to Answers, the idea of “Ask this course” is to create a new experience within the content itself rather than just linking out to related sources. We use a LLM to provide section titles and a summary to stitch together disparate snippets of content into a more cohesive narrative.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/12/Fig3.png" alt="" class="wp-image-15295" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/12/Fig3.png 468w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/12/Fig3-300x185.png 300w" sizes="(max-width: 468px) 100vw, 468px" /><figcaption><br><em>Ask this course, expected to go into beta in the first half of 2024</em></figcaption></figure>



<p></p>



<hr class="wp-block-separator" />



<h3>Footnotes</h3>



<div>
<a name="foot1">1.</a> The first case to reach the courts involving novels and other prose works has been dismissed; the judge said that the claim that the model itself infringed upon the authors’ copyrights was “nonsensical,” and the plaintiffs did not present any evidence that the model actually produced infringing works.
</div>
<p></p>
<div>
<a name="foot2">2.</a> As of November 16, 2023, it’s unclear who has access to this feature; it appears to be in some kind of gradual rollout, A/B test, or beta test, and may be limited to specific browsers, devices, operating systems, or account types.
</div>
]]></content:encoded>
							<wfw:commentRss>https://www.oreilly.com/radar/copyright-ai-and-provenance/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
							</item>
		<item>
		<title>Radar Trends to Watch: December 2023</title>
		<link>https://www.oreilly.com/radar/radar-trends-to-watch-december-2023/</link>
				<comments>https://www.oreilly.com/radar/radar-trends-to-watch-december-2023/#respond</comments>
				<pubDate>Tue, 05 Dec 2023 11:02:29 +0000</pubDate>
		<dc:creator><![CDATA[Mike Loukides]]></dc:creator>
				<category><![CDATA[Radar Trends]]></category>
		<category><![CDATA[Signals]]></category>

		<guid isPermaLink="false">https://www.oreilly.com/radar/?p=15282</guid>
				<description><![CDATA[We’re continuing to push AI content into other areas, as appropriate. AI is influencing everything, including biology. Perhaps the biggest new trend, though, is the interest that security researchers are taking in AI. Language models present a whole new class of vulnerabilities, and we don’t yet know how to defend against most of them. We’ve [&#8230;]]]></description>
								<content:encoded><![CDATA[
<p>We’re continuing to push AI content into other areas, as appropriate. AI is influencing everything, including biology. Perhaps the biggest new trend, though, is the interest that security researchers are taking in AI. Language models present a whole new class of vulnerabilities, and we don’t yet know how to defend against most of them. We’ve known about prompt injection for a time, but SneakyPrompt is a way of tricking language models by composing nonsense words from fragments that are still meaningful to the model. And cross-site prompt injection means putting a hostile prompt into a document and then sharing that document with a victim who is using an AI-augmented editor; the hostile prompt is executed by the victim when they open the document. Those two have already been fixed, but if I know anything about security, that is only the beginning.</p>



<h2>Artificial Intelligence</h2>



<ul><li>We have seen several automated testing tools for evaluating and testing AI system, including <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.giskard.ai/" target="_blank">Giskard</a> and <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://talc.ai/" target="_blank">Talc</a>.</li><li>Amazon has announced <a rel="noreferrer noopener" aria-label="Q (opens in a new tab)" href="https://aws.amazon.com/q/" target="_blank">Q</a>, an AI chatbot that is designed for business. They claim that it can use information in your company’s private data, suggesting that it is using the RAG pattern to supplement the model itself.</li><li>Let the context wars begin. Anthropic <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.anthropic.com/index/claude-2-1" target="_blank">announces</a> a 200K context window for Claude 2.1, along with a 50% decline in the percentage of false statements (hallucinations). Unlike most AI systems, Claude 2.1 is able to say “I don’t know” when it doesn’t have the answer to a question.</li><li>There’s a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/Acly/krita-ai-diffusion" target="_blank">tool</a> for integrating generative art AI with the Krita open source drawing tool. It preserves a human-centered artist’s workflow while integrating AI. It uses Stable Diffusion and can run locally, with sufficient processing power; it might be capable of using other models.</li><li>Simon Willison has published an excellent <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://simonwillison.net/2023/Nov/15/gpts/" target="_blank">exploration</a> of OpenAI’s GPTs. They’re more than they seem: not just a simple way of storing useful prompts.</li><li>Google has <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://deepmind.google/discover/blog/transforming-the-future-of-music-creation/" target="_blank">announced</a> some new models for AI-generated music. One model can provide an orchestration for a simple melody line, and represents an interesting connection between human creativity and AI. Audio output is <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.theverge.com/2023/11/16/23963607/google-deepmind-synthid-audio-watermarks" target="_blank">watermarked</a> with SynthID.</li><li>Warner Bros. is using AI to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://techxplore.com/news/2023-11-beatles-ai-edith-piaf-voice.html" target="_blank">simulate</a> the voice and image of <a rel="noreferrer noopener" aria-label="É (opens in a new tab)" href="https://en.wikipedia.org/wiki/%C3%89dith_Piaf" target="_blank">Édith Piaf</a> for an upcoming biopic. Unlike the Beatles’ “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.youtube.com/watch?v=Opxhh9Oh3rg" target="_blank">Now and Then</a>,” which used AI to restore John Lennon’s voice from earlier tapes, AI will synthesize Piaf’s voice and image to use in narration and video.</li><li>An AI system from Google’s Deep Mind has been shown to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/science/2023/11/ai-outperforms-conventional-weather-forecasting-for-the-first-time-google-study/" target="_blank">outperform</a> traditional weather forecasting. This is the first time AI has outperformed human weather prediction.</li><li>A researcher has <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2305.13873" target="_blank">proposed</a> a method for <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://techxplore.com/news/2023-11-filter-tackle-unsafe-ai-generated-images.html" target="_blank">detecting and filtering</a> unsafe and hateful images that are generated by AI.</li><li>AI-generated facial images of White people can now appear “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://techxplore.com/news/2023-11-ai-images-white-hyper-real.html" target="_blank">more real</a>” than actual photographs. The same is not true of images of racial or ethnic minorities. What are the consequences of White faces being perceived as “more realistic”?</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://jxnl.github.io/instructor/blog/2023/11/05/chain-of-density/#original-prompt" target="_blank">Chain of Density</a> is a relatively new prompting technique. You ask a language model to summarize something. The initial response will probably be verbose. Then you ask it to improve the summary by adding new facts without increasing the summary’s length.</li><li>The <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://huggingface.co/HuggingFaceH4/zephyr-7b-beta" target="_blank">Zephyr-7B</a> model, a fine-tuned descendant of <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://huggingface.co/mistralai/Mistral-7B-v0.1" target="_blank">Mistral-7B</a>, outperforms other 7B models on benchmarks. It was trained using a technique called knowledge distillation. It has not been trained to reject hate speech and other inappropriate output.</li><li>Can a large language model be the <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://medium.com/@ronaldmannak/goodbye-windows-hello-llms-the-future-of-operating-systems-7ba61ea03e8d" target="_blank">operating system of the future</a>? And if so, what would that look like?</li><li>Quantization is a technique for reducing the size of large language models by storing parameters in as few as 4 bits. <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2210.17323" target="_blank">GPTQ</a> is an open source tool for quantizing models. <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/PanQiWei/AutoGPTQ" target="_blank">AutoGPTQ</a> is another implementation that’s compatible with the Hugging Face Transformers library.</li><li>Researchers use machine learning to enable users to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://techxplore.com/news/2023-11-machine-users-superhuman-ability-tools.html" target="_blank">create objects in virtual reality</a> without touching a keyboard or a mouse. Gestural interfaces haven’t worked well in the past. Is this their time?</li><li>Google’s PaLl-3 is a vision model with 5 billion parameters that <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://the-decoder.com/googles-new-pali-3-vision-language-model-achieves-performance-of-10x-larger-models/" target="_blank">consistently outperforms</a> much larger models.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://huggingface.co/vectara/hallucination_evaluation_model" target="_blank">Hem</a> is an open source model for <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://vectara.com/measuring-hallucinations-in-rag-systems/" target="_blank">measuring generative AI hallucinations</a>. It’s an interesting idea, though given a first glance at the <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/vectara/hallucination-leaderboard" target="_blank">leaderboard</a>, it seems overly generous.</li><li>OpenAI has announced the <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.theverge.com/2023/11/6/23948957/openai-chatgpt-gpt-custom-developer-platform" target="_blank">GPT store</a>, an app store that is essentially a mechanism for sharing prompts. They also announced a no-code development platform for GPT “agents,” lower pricing for GPT-4, and indemnification against copyright lawsuits for users of GPT products.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.langchain.com/langsmith" target="_blank">LangSmith</a> looks like a good <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://cobusgreyling.medium.com/using-%EF%B8%8Flangsmith-to-inspect-langchain-agents-7e621e20c9c8" target="_blank">platform for developing and debugging</a> LangChain-based AI agents.</li><li>Tim Bray <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.tbray.org/ongoing/When/202x/2023/10/28/C2PA-Workflows" target="_blank">explains</a> Leica’s use of C2PA to watermark photographs. C2PA is a standard that uses public key cryptography to trace image provenance. Photoshop implements C2PA, allowing both the image creator and its (Photoshop) editors to be traced.</li></ul>



<h2>Security</h2>



<ul><li>An important new group of attacks against Bluetooth, called <a href="https://www.bleepingcomputer.com/news/security/new-bluffs-attack-lets-attackers-hijack-bluetooth-connections/">BLUFFS</a>, allows attackers to impersonate others’ devices and to execute man-in-the-middle attacks. All Bluetooth devices since roughly 2014 are vulnerable.</li><li>If you aren’t already careful about what you plug in to your USB ports, you should be. <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/security/2023/11/normally-targeting-ukraine-russian-state-hackers-spread-usb-worm-worldwide/#p3" target="_blank">LitterDrifter</a> is a worm that propagates via USB drives. It is oriented towards data collection (i.e., espionage), and was developed by a group with close ties to the Russian state.</li><li>The AlphV ransomware group wins the irony award. They <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/security/2023/11/ransomware-group-reports-victim-it-breached-to-sec-regulators/" target="_blank">reported</a> one of their victims to the SEC for not disclosing the attack. Other groups are following the same strategy. The law requiring disclosure is not yet in effect, so aside from PR damage, consequences will be minor.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.technologyreview.com/2023/11/17/1083593/text-to-image-ai-models-can-be-tricked-into-generating-disturbing-images/" target="_blank">SneakyPrompt</a> is a new technique for creating hostile prompts that can “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2305.12082" target="_blank">jailbreak</a>” image generators, causing them to generate images that violate policies. It works by substituting tokens from words that aren’t allowed with tokens from other words that are semantically similar, creating a “word” that is nonsensical to humans but still meaningful to the model.</li><li>Security researchers <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://embracethered.com/blog/posts/2023/google-bard-data-exfiltration/" target="_blank">showed</a> that Google’s Bard was vulnerable to prompt injection via Gmail, Google Docs, and other documents that were shared with unsuspecting victims. The hostile prompt was executed when the user opened the document. The vulnerability was promptly fixed, but it shows what will happen as language models become part of our lives.</li><li>Researchers have <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/security/2023/11/hackers-can-steal-ssh-cryptographic-keys-in-new-cutting-edge-attack/" target="_blank">demonstrated</a> that an error during signature generation can <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://eprint.iacr.org/2023/1711.pdf" target="_blank">expose</a> private SSH keys to attack. Open source SSH implementations have countermeasures that protect them from this attack, but some proprietary implementations don’t.</li><li>If you’re concerned about privacy, worry about the data broker industry, not Google and Facebook. A <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://techpolicy.sanford.duke.edu/data-brokers-and-the-sale-of-data-on-us-military-personnel/" target="_blank">report</a> shows that it’s easy to obtain information (including net worth and home ownership) about US military service members with minimal vetting.</li><li>Proposed EU legislation called <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.theregister.com/2023/11/08/europe_eidas_browser/" target="_blank">eIDAS</a> 2.0 (electronic ID, Authentication and Services) gives European governments the ability to conduct man-in-the-middle attacks against secured web communications (TLS and https). It would be illegal for browser makers to reject certificates compromised by governments.</li><li>Developer backlash against <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/the-limits-of-shift-left-whats-next-for-developer-security/" target="_blank">the Shift-Left approach to security</a> isn’t unexpected, but it may be reaching its limits in other ways: attackers are focusing less on vulnerabilities in code and more on flaws in business logic—in addition to targeting users themselves.</li><li>History is important. Gene Spafford has posted an excellent 35th anniversary <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.cerias.purdue.edu/site/blog/post/reflecting_on_the_internet_worm_at_35/" target="_blank">essay</a> about the Morris Worm, and lessons drawn from it that are still applicable today.</li><li>In a simulated financial system, a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.bbc.com/news/technology-67302788" target="_blank">trading bot based on GPT-4</a> not only used information that was declared as “insider information”; it stated that it had not used any insider information. The benefit of using the information outweighed the risk of being discovered. (Or perhaps it was behaving the same way as human traders.)</li></ul>



<h2>Programming</h2>



<ul><li>If you write shell scripts, you will find this useful: <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.shellcheck.net/" target="_blank">ShellCheck</a>, a program to find bugs in shell scripts.</li><li>India has been experimenting successfully with digital public goods—publishing open source software with open standards and data—for creating a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://marginalrevolution.com/marginalrevolution/2023/11/the-indian-challenge-to-blockchains-digital-public-goods.html?utm_source=feedly&amp;utm_medium=rss&amp;utm_campaign=the-indian-challenge-to-blockchains-digital-public-goods" target="_blank">digital commons</a>. Such a commons might be a practical alternative to blockchains.</li><li>The Python Software Foundation has hired a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/pythons-new-security-developer-has-plans-to-secure-the-language/" target="_blank">security</a> developer, with the intention of improving Python’s security features.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://zknill.io/posts/collaboration-no-crdts/" target="_blank">Collaboration without CRDTs</a>: CRDTs are important—but for many kinds of applications, it’s possible to build collaborative software without them.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://shadowtraffic.io/" target="_blank">ShadowTraffic</a> is a service for simulating traffic to backend systems. It is packaged as a Docker container, so it can easily run locally or in a cloud. It can currently simulate traffic for Kafka and Postgres, and webhooks, but its developer plans to expand to other backends quickly.</li><li>The Rust + Wasm stack is a good choice for <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.secondstate.io/articles/fast-llm-inference/" target="_blank">running Llama 2 models efficiently on an M2 MacBook</a>. Memory requirements, disk requirements, and performance are much better than with Python.</li><li>GitHub’s <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://githubnext.com/projects/copilot-for-docs/" target="_blank">Copilot for Docs</a> lets users ask questions that are answered by a chatbot trained on documentation in GitHub’s repositories. They plan to integrate other documentation, along with other GitHub content.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/KillianLucas/open-interpreter/" target="_blank">OpenInterpreter</a> sends prompts to a language model, and then runs the code generated by those prompts locally. You can inspect the code before it runs. It defaults to GPT-4, but can use other models, including models running locally. Automatically executing generated code is a bad idea, but it’s a step towards automating everything.</li><li>Microsoft’s Radius is a<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/exploring-microsoft-radius-application-platform/" target="_blank"> cloud native application platform</a> that provides a unified model for developing and deploying applications on all the major cloud providers.</li><li>Doug Crockford, author of <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://learning.oreilly.com/library/view/javascript-the-good/9780596517748/" target="_blank"><em>JavaScript: The Good Parts</em></a>, has created a new programming language called <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.crockford.com/misty/" target="_blank">Misty</a>. It is designed to be used both by students and professional programmers. Reactions are mixed, but anything Doug does is worth following.</li><li>Knowing how to use the terminal is a superpower. But terminals make one thing difficult: recording terminal sessions. <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://asciinema.org/" target="_blank">Asciinema</a> is an open source project that solves the problem.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://spin.atomicobject.com/2023/11/04/bug-triage/?utm_source=feedblitz&amp;utm_medium=FeedBlitzRss&amp;utm_campaign=atomicspin" target="_blank">Bug triage</a>: You can’t fix all the bugs. But you can prioritize what to fix, and when.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/ohmjs/ohm" target="_blank">Ohm</a> is a toolkit for creating parsers, using the Ohm language to define grammars. It has a JavaScript API and an <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://ohmjs.org/editor/" target="_blank">interactive editor</a>. The editor includes a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://dubroy.com/blog/visualizing-packrat-parsing/" target="_blank">visualiser</a> for exploring how a parser works.</li><li>Bjarne Stroustrup <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/bjarne-stroustrups-plan-for-bringing-safety-to-c/" target="_blank">proposes</a> memory safety for C++.</li></ul>



<h2>Web</h2>



<ul><li>We don’t know why you’d want to run Windows 98 in the browser, <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://copy.sh/v86/?profile=windows98" target="_blank">but you can</a>. There’s no hint about how this is implemented; I assume it is some sort of Wasm wizardry.</li><li>Opt for <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://blog.jim-nielsen.com/2023/html-web-components/" target="_blank">enhancement over replacement</a>: that’s the argument for using HTML Web Components rather than React components.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/tldraw/draw-a-ui" target="_blank">tldraw</a> is a simple application that lets you draw a wireframe for a website on a screen, specify the components you want to implement it, and send it to GPT-4, which generates code for a mockup. The mockup can then be edited, and the code regenerated.</li><li>Google is <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/tech-policy/2023/11/google-sues-people-who-weaponized-dmca-to-remove-rivals-search-results/" target="_blank">suing</a> two people who have “weaponized” the DMCA by issuing false takedown notices against the websites of products (apparently T-shirts) that compete with them.</li><li>WebRTC was designed to support videoconferencing. It has been used for many other real time applications, but there should be alternatives available. <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://quic.video/blog/replacing-webrtc/" target="_blank">Replacing it</a> will take years, but that’s the goal of the <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://datatracker.ietf.org/wg/moq/about/" target="_blank">Media over Quic</a> project.</li></ul>



<h2>Biology</h2>



<ul><li>The UK has approved a CRISPR-based genetic <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/health/2023/11/uk-becomes-first-country-to-approve-crispr-gene-editing-therapy/" target="_blank">therapy for sickle cell anemia</a> and beta thalassemia.</li><li>A European startup named Cradle has created a generative AI model to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenextweb.com/news/dutch-biotech-startup-bags-e22m-for-proprietary-generative-ai-model" target="_blank">design new proteins</a>.</li><li>In a small test involving patients with a genetic predisposition to high cholesterol, a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/health/2023/11/crispr-gene-editing-shown-to-permanently-lower-high-cholesterol/" target="_blank">CRISPR</a> treatment that modified a gene in the liver appeared to reduce cholesterol levels permanently. Larger and more comprehensive testing will follow.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.technologyreview.com/2023/11/10/1083222/covid-moonshot-drug-discovery-open-source/" target="_blank">Open source drug discovery</a> might be an approach for developing antivirals for many common diseases for which there are no treatments, including diseases as common as measles and West Nile.</li></ul>



<h2>Hardware</h2>



<ul><li>AI is coming to the Internet of Things. ARM’s latest CPU design, the <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.arm.com/products/silicon-ip-cpu/cortex-m/cortex-m52" target="_blank">Cortex-M52</a>, is a processor designed for <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/arm-pushes-ai-into-the-smallest-iot-devices-with-cortex-m52-chip/" target="_blank">AI in low-power, low-cost devices</a>.</li><li>Microsoft has developed its own AI chip, <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.theverge.com/2023/11/15/23960345/microsoft-cpu-gpu-ai-chips-azure-maia-cobalt-specifications-cloud-infrastructure" target="_blank">Maia</a>, which will be available on Azure in 2024.</li><li>H100 GPUs are yesterday’s technology. NVIDIA has announced the <a href="https://www.nvidia.com/en-gb/data-center/h200/" target="_blank" rel="noreferrer noopener" aria-label=" (opens in a new tab)">H200</a>, with more and faster memory. NVIDIA claims almost double the performance of the H100 in LLM inference, and up to 100X performance for “data science” applications.</li></ul>
]]></content:encoded>
							<wfw:commentRss>https://www.oreilly.com/radar/radar-trends-to-watch-december-2023/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
							</item>
		<item>
		<title>Strawberry Fields Forever</title>
		<link>https://www.oreilly.com/radar/strawberry-fields-forever/</link>
				<comments>https://www.oreilly.com/radar/strawberry-fields-forever/#respond</comments>
				<pubDate>Thu, 30 Nov 2023 11:35:20 +0000</pubDate>
		<dc:creator><![CDATA[Mike Loukides]]></dc:creator>
				<category><![CDATA[AI & ML]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Commentary]]></category>

		<guid isPermaLink="false">https://www.oreilly.com/radar/?p=15271</guid>
				<description><![CDATA[Tim O’Reilly forwarded an excellent article about the OpenAI soap opera to me: Matt Levine’s “Money Stuff: Who Controls Open AI.” I’ll skip most of it, but something caught my eye. Toward the end, Levine writes about Elon Musk’s version of Nick Bostrom’s AI that decides to turn the world to paperclips: [Elon] Musk gave [&#8230;]]]></description>
								<content:encoded><![CDATA[
<p>Tim O’Reilly forwarded an excellent article about the OpenAI soap opera to me: Matt Levine’s “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://news.bloomberglaw.com/mergers-and-acquisitions/matt-levines-money-stuff-who-controls-openai" target="_blank">Money Stuff: Who Controls Open AI</a>.” I’ll skip most of it, but something caught my eye. Toward the end, Levine writes about Elon Musk’s version of Nick Bostrom’s AI that decides to turn the world to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/Instrumental_convergence" target="_blank">paperclips</a>:</p>



<blockquote class="wp-block-quote"><p>[Elon] Musk gave an example of an artificial intelligence that’s given the task of picking strawberries. It seems harmless enough, but as the AI redesigns itself to be more effective, it might decide that the best way to maximize its output would be to destroy civilization and convert the entire surface of the Earth into strawberry fields.</p></blockquote>



<p>That gets me, but not in the way you think. It’s personally poignant, for reasons entirely different from the <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://buzzmachine.com/2023/11/19/artificial-general-bullshit/" target="_blank">AI-doomerism cults</a> that Musk, Bostrom, and others are propagating.</p>



<p>When I was a graduate student at Stanford, I was driving around with a friend through the endless maze of parking lots and strip malls in that nondescript part of Silicon Valley where Sunnyvale, Santa Clara, and Cupertino come together. My friend pointed out the window and said, “That&#8217;s where my father&#8217;s farm was.” I asked what his father grew; it was very difficult to imagine a farm at that location. He grew strawberries. And what happened to the farm? His father lost it when he was put into a World War II internment camp for Japanese. A real estate investor ended up with it. My friend’s father eventually committed suicide. The farm became a parking lot.</p>



<p>This gets me back to an argument that I&#8217;ve made in <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.oreilly.com/radar/the-ethics-of-artificial-intelligence/" target="_blank">older Radar articles</a>: Our fears of AI are really fears of ourselves, fears that AI will act as badly as humans have repeatedly acted. We don&#8217;t need AI to turn the world into strawberries any more than we need it to turn the world into parking lots. We&#8217;re already turning the world into parking lots, and doing so without regard to the human cost. We’re already spewing CO<sub>2</sub> at a rate that will soon make the world uninhabitable for all but the few who can insulate themselves from the consequences. If we&#8217;re going to solve these problems, it won’t be through technology. It&#8217;s through finding better humans than Elon and, I fear, Sam Altman. We don&#8217;t have a chance to solve the AI problem if we can&#8217;t solve the human problem. And if we don’t solve the human problem, the AI problem is irrelevant.</p>
]]></content:encoded>
							<wfw:commentRss>https://www.oreilly.com/radar/strawberry-fields-forever/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
							</item>
		<item>
		<title>Generative AI in the Enterprise</title>
		<link>https://www.oreilly.com/radar/generative-ai-in-the-enterprise/</link>
				<comments>https://www.oreilly.com/radar/generative-ai-in-the-enterprise/#respond</comments>
				<pubDate>Tue, 28 Nov 2023 18:04:08 +0000</pubDate>
		<dc:creator><![CDATA[Mike Loukides]]></dc:creator>
				<category><![CDATA[AI & ML]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Research]]></category>

		<guid isPermaLink="false">https://www.oreilly.com/radar/?p=15261</guid>
				<description><![CDATA[Generative AI has been the biggest technology story of 2023. Almost everybody’s played with ChatGPT, Stable Diffusion, GitHub Copilot, or Midjourney. A few have even tried out Bard or Claude, or run LLaMA1 on their laptop. And everyone has opinions about how these language models and art generation programs are going to change the nature [&#8230;]]]></description>
								<content:encoded><![CDATA[
<p>Generative AI has been the biggest technology story of 2023. Almost everybody’s played with ChatGPT, Stable Diffusion, GitHub Copilot, or Midjourney. A few have even tried out Bard or Claude, or run LLaMA<sup data-type="footnote">1</sup> on their laptop. And everyone has opinions about how these language models and art generation programs are going to change the nature of work, usher in the singularity, or perhaps even doom the human race. In enterprises, we’ve seen everything from wholesale adoption to policies that severely restrict or even forbid the use of generative AI.</p>



<p>What’s the reality? We wanted to find out what people are actually doing, so in September we surveyed O’Reilly’s users. Our survey focused on how companies use generative AI, what bottlenecks they see in adoption, and what skills gaps need to be addressed.</p>



<h2>Executive Summary</h2>



<p>We’ve never seen a technology adopted as fast as generative AI—it’s hard to believe that ChatGPT is barely a year old. As of November 2023:</p>



<ul><li>Two-thirds (67%) of our survey respondents report that their companies are using generative AI.</li><li>AI users say that AI programming (66%) and data analysis (59%) are the most needed skills.</li><li>Many AI adopters are still in the early stages. 26% have been working with AI for under a year. But 18% already have applications in production.</li><li>Difficulty finding appropriate use cases is the biggest bar to adoption for both users and nonusers.</li><li>16% of respondents working with AI are using open source models.</li><li>Unexpected outcomes, security, safety, fairness and bias, and privacy are the biggest risks for which adopters are testing.</li><li>54% of AI users expect AI’s biggest benefit will be greater productivity. Only 4% pointed to lower head counts.</li></ul>



<p>Is generative AI at the top of the hype curve? We see plenty of room for growth, particularly as adopters discover new use cases and reimagine how they do business.</p>



<h2>Users and Nonusers</h2>



<p>AI adoption is in the process of becoming widespread, but it’s still not universal. Two-thirds of our survey’s respondents (67%) report that their companies are using generative AI. 41% say their companies have been using AI for a year or more; 26% say their companies have been using AI for less than a year. And only 33% report that their companies aren&#8217;t using AI at all.</p>



<p>Generative AI users represent a two-to-one majority over nonusers, but what does that mean? If we asked whether their companies were using databases or web servers, no doubt 100% of the respondents would have said “yes.” Until AI reaches 100%, it’s still in the process of adoption. ChatGPT was opened to the public on November 30, 2022, roughly a year ago; the art generators, such as Stable Diffusion and DALL-E, are somewhat older. A year after the first web servers became available, how many companies had websites or were experimenting with building them? Certainly not two-thirds of them. Looking only at AI users, over a third (38%) report that their companies have been working with AI for less than a year and are almost certainly still in the early stages: they’re experimenting and working on proof-of-concept projects. (We’ll say more about this later.) Even with cloud-based <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://hai.stanford.edu/news/what-foundation-model-explainer-non-experts" target="_blank">foundation models</a> like GPT-4, which eliminate the need to develop your own model or provide your own infrastructure, fine-tuning a model for any particular use case is still a major undertaking. We’ve never seen adoption proceed so quickly.</p>



<p>When 26% of a survey’s respondents have been working with a technology for under a year, that’s an important sign of momentum. Yes, it’s conceivable that AI—and specifically generative AI—could be at the peak of the hype cycle, as <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.gartner.com/en/newsroom/press-releases/2023-08-16-gartner-places-generative-ai-on-the-peak-of-inflated-expectations-on-the-2023-hype-cycle-for-emerging-technologies" target="_blank">Gartner has argued</a>. We don’t believe that, even though the failure rate for many of these new projects is undoubtedly high. But while the rush to adopt AI has plenty of momentum, AI will still have to prove its value to those new adopters, and soon. Its adopters expect returns, and if not, well, AI has experienced many <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.techtarget.com/searchenterpriseai/definition/AI-winter" target="_blank">“winters”</a> in the past. Are we at the top of the adoption curve, with nowhere to go but down? Or is there still room for growth?</p>



<p>We believe there’s a lot of headroom. Training models and developing complex applications on top of those models is becoming easier. Many of the new open source models are much smaller and not as resource intensive but still deliver good results (especially when trained for a specific application). Some can easily be run on a laptop or even in a web browser. A healthy tools ecosystem has grown up around generative AI—and, as was said about the California Gold Rush, if you want to see who’s making money, don’t look at the miners; look at the people selling shovels. Automating the process of building complex prompts has become common, with patterns like retrieval-augmented generation (RAG) and tools like LangChain. And there are tools for archiving and indexing prompts for reuse, vector databases for retrieving documents that an AI can use to answer a question, and much more. We’re already moving into the second (if not the third) generation of tooling. A roller-coaster ride into Gartner’s “trough of disillusionment” is unlikely.</p>



<h2>What’s Holding AI Back?</h2>



<p>It was important for us to learn why companies aren’t using AI, so we asked respondents whose companies aren’t using AI a single obvious question: “Why isn’t your company using AI?” We asked a similar question to users who said their companies are using AI: “What’s the main bottleneck holding back further AI adoption?” Both groups were asked to select from the same group of answers. The most common reason, by a significant margin, was difficulty finding appropriate business use cases (31% for nonusers, 22% for users). We could argue that this reflects a lack of imagination—but that’s not only ungracious, it also presumes that applying AI everywhere without careful thought is a good idea. The consequences of “Move fast and break things” are still playing out across the world, and it isn’t pretty. Badly thought-out and poorly implemented AI solutions can be damaging, so most companies should think carefully about how to use AI appropriately. We’re not encouraging skepticism or fear, but companies should start AI products with a clear understanding of the risks, especially those risks that are specific to AI. What use cases are appropriate, and what aren’t? The ability to distinguish between the two is important, and it’s an issue for both companies that use AI and companies that don’t. We also have to recognize that many of these use cases will challenge traditional ways of thinking about businesses. Recognizing use cases for AI and understanding how AI allows you to reimagine the business itself will go hand in hand.</p>



<p>The second most common reason was concern about legal issues, risk, and compliance (18% for nonusers, 20% for users). This worry certainly belongs to the same story: risk has to be considered when thinking about appropriate use cases. The legal consequences of using generative AI are still unknown. Who owns the copyright for AI-generated output? Can the creation of a model violate copyright, or is it a “transformative” use that’s protected under US copyright law? We don’t know right now; the answers will be worked out in the courts in the years to come. There are other risks too, including reputational damage when a model generates inappropriate output, new security vulnerabilities, and many more.</p>



<p>Another piece of the same puzzle is the lack of a policy for AI use. Such policies would be designed to mitigate legal problems and require regulatory compliance. This isn’t as significant an issue; it was cited by 6.3% of users and 3.9% of nonusers. Corporate policies on AI use will be appearing and evolving over the next year. (At O’Reilly, we have just put our policy for workplace use into place.) Late in 2023, we suspect that relatively few companies have a policy. And of course, companies that don’t use AI don’t need an AI use policy. But it’s important to think about which is the cart and which is the horse. Does the lack of a policy prevent the adoption of AI? Or are individuals adopting AI on their own, exposing the company to unknown risks and liabilities? Among AI users, the absence of company-wide policies isn’t holding back AI use; that’s self-evident. But this probably isn’t a good thing. Again, AI brings with it risks and liabilities that should be addressed rather than ignored. Willful ignorance can only lead to unfortunate consequences.</p>



<p>Another factor holding back the use of AI is a company culture that doesn’t recognize the need (9.8% for nonusers, 6.7% for users). In some respects, not recognizing the need is similar to not finding appropriate business use cases. But there’s also an important difference: the word “appropriate.” AI entails risks, and finding use cases that are appropriate is a legitimate concern. A culture that doesn’t recognize the need is dismissive and could indicate a lack of imagination or forethought: “AI is just a fad, so we’ll just continue doing what has always worked for us.” Is that the issue? It’s hard to imagine a business where AI couldn’t be put to use, and it can’t be healthy to a company’s long-term success to ignore that promise.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig01-1048x710.png" alt="" class="wp-image-15262" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig01-1048x710.png 1048w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig01-300x203.png 300w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig01-768x520.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig01-1536x1041.png 1536w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig01-2048x1387.png 2048w" sizes="(max-width: 1048px) 100vw, 1048px" /></figure>



<p>We’re sympathetic to companies that worry about the lack of skilled people, an issue that was reported by 9.4% of nonusers and 13% of users. People with AI skills have always been hard to find and are often expensive. We don’t expect that situation to change much in the near future. While experienced AI developers are starting to leave powerhouses like Google, OpenAI, Meta, and Microsoft, not enough are leaving to meet demand—and most of them will probably gravitate to startups rather than adding to the AI talent within established companies. However, we’re also surprised that this issue doesn’t figure more prominently. Companies that are adopting AI are clearly finding staff somewhere, whether through hiring or training their existing staff.</p>



<p>A small percentage (3.7% of nonusers, 5.4% of users) report that “infrastructure issues” are an issue. Yes, building AI infrastructure is difficult and expensive, and it isn’t surprising that the AI users feel this problem more keenly. We’ve all read about the shortage of the high-end GPUs that power models like ChatGPT. This is an area where cloud providers already bear much of the burden, and will continue to bear it in the future. Right now, very few AI adopters maintain their own infrastructure and are shielded from infrastructure issues by their providers. In the long term, these issues may slow AI adoption. We suspect that many API services are being offered as loss leaders—that the major providers have intentionally set prices low to buy market share. That pricing won’t be sustainable, particularly as hardware shortages drive up the cost of building infrastructure. How will AI adopters react when the cost of renting infrastructure from AWS, Microsoft, or Google rises? Given the cost of equipping a data center with high-end GPUs, they probably won’t attempt to build their own infrastructure. But they may back off on AI development.</p>



<p>Few nonusers (2%) report that lack of data or data quality is an issue, and only 1.3% report that the difficulty of training a model is a problem. In hindsight, this was predictable: these are problems that only appear after you’ve started down the road to generative AI. AI users are definitely facing these problems: 7% report that data quality has hindered further adoption, and 4% cite the difficulty of training a model on their data. But while data quality and the difficulty of training a model are clearly important issues, they don’t appear to be the biggest barriers to building with AI. Developers are learning how to find quality data and build models that work.</p>



<h2>How Companies Are Using AI</h2>



<p>We asked several specific questions about how respondents are working with AI, and whether they’re “using” it or just “experimenting.”</p>



<p>We aren’t surprised that the most common application of generative AI is in programming, using tools like GitHub Copilot or ChatGPT. However, we <em>are</em> surprised at the level of adoption: 77% of respondents report using AI as an aid in programming; 34% are experimenting with it, and 44% are already using it in their work. Data analysis showed a similar pattern: 70% total; 32% using AI, 38% experimenting with it. The higher percentage of users that are experimenting may reflect OpenAI’s addition of Advanced Data Analysis (formerly Code Interpreter) to ChatGPT’s repertoire of beta features. Advanced Data Analysis does a decent job of exploring and analyzing datasets—though we expect data analysts to be careful about checking AI’s output and to distrust software that’s labeled as “beta.”</p>



<p>Using generative AI tools for tasks related to programming (including data analysis) is nearly universal. It will certainly become universal for organizations that don’t explicitly prohibit its use. And we expect that programmers will use AI even in organizations that prohibit its use. Programmers have always developed tools that would help them do their jobs, from test frameworks to source control to integrated development environments. And they’ve always adopted these tools whether or not they had management’s permission. From a programmer’s perspective, code generation is just another labor-saving tool that keeps them productive in a job that is constantly becoming more complex. In the early 2000s, some studies of open source adoption found that a large majority of staff said that they were using open source, even though a large majority of CIOs said their companies weren’t. Clearly those CIOs either didn’t know what their employees were doing or were willing to look the other way. We’ll see that pattern repeat itself: programmers will do what’s necessary to get the job done, and managers will be blissfully unaware as long as their teams are more productive and goals are being met.</p>



<p>After programming and data analysis, the next most common use for generative AI was applications that interact with customers, including customer support: 65% of all respondents report that their companies are experimenting with (43%) or using AI (22%) for this purpose. While companies have long been talking about AI’s potential to improve customer support, we didn’t expect to see customer service rank so high. Customer-facing interactions are very risky: incorrect answers, bigoted or sexist behavior, and many other well-documented problems with generative AI quickly lead to damage that is hard to undo. Perhaps that’s why such a large percentage of respondents are experimenting with this technology rather than using it (more than for any other kind of application). Any attempt at automating customer service needs to be very carefully tested and debugged. We interpret our survey results as “cautious but excited adoption.” It’s clear that automating customer service could go a long way to cut costs and even, if done well, make customers happier. No one wants to be left behind, but at the same time, no one wants a highly visible PR disaster or a lawsuit on their hands.</p>



<p>A moderate number of respondents report that their companies are using generative AI to generate copy (written text). 47% are using it specifically to generate marketing copy, and 56% are using it for other kinds of copy (internal memos and reports, for example). While rumors abound, we’ve seen few reports of people who have actually lost their jobs to AI—but those reports have been almost entirely from copywriters. AI isn’t yet at the point where it can write as well as an experienced human, but if your company needs catalog descriptions for hundreds of items, speed may be more important than brilliant prose. And there are many other applications for machine-generated text: AI is good at summarizing documents. When coupled with a speech-to-text service, it can do a passable job of creating meeting notes or even podcast transcripts. It’s also well suited to writing a quick email.</p>



<p>The applications of generative AI with the fewest users were web design (42% total; 28% experimenting, 14% using) and art (36% total; 25% experimenting, 11% using). This no doubt reflects O’Reilly’s developer-centric audience. However, several other factors are in play. First, there are already a lot of low-code and no-code web design tools, many of which feature AI but aren’t yet using generative AI. Generative AI will face significant entrenched competition in this crowded market. Second, while OpenAI’s GPT-4 announcement last March demoed generating website code from a hand-drawn sketch, that capability wasn’t available until after the survey closed. Third, while roughing out the HTML and JavaScript for a simple website makes a great demo, that isn’t really the problem web designers need to solve. They want a drag-and-drop interface that can be edited on-screen, something that generative AI models don’t yet have. Those applications will be built soon; <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/tldraw/draw-a-ui" target="_blank">tldraw</a> is a very early example of what they might be. Design tools suitable for professional use don&#8217;t exist yet, but they will appear very soon.</p>



<p>An even smaller percentage of respondents say that their companies are using generative AI to create art. While we’ve read about startup founders using Stable Diffusion and Midjourney to create company or product logos on the cheap, that’s still a specialized application and something you don’t do frequently. But that isn’t all the art that a company needs: “hero images” for blog posts, designs for reports and whitepapers, edits to publicity photos, and more are all necessary. Is generative AI the answer? Perhaps not yet. Take Midjourneyfor example: while its capabilities are impressive, the tool can also make silly mistakes, like getting the number of fingers (or arms) on subjects incorrect. While the latest version of Midjourney is much better, it hasn’t been out for long, and many artists and designers would prefer not to deal with the errors. They’d also prefer to avoid legal liability. Among generative art vendors, Shutterstock, Adobe, and Getty Images indemnify users of their tools against copyright claims. Microsoft, Google, IBM, and OpenAI have offered more general indemnification.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig02-1048x782.png" alt="" class="wp-image-15263" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig02-1048x782.png 1048w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig02-300x224.png 300w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig02-768x573.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig02-1536x1147.png 1536w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig02.png 1871w" sizes="(max-width: 1048px) 100vw, 1048px" /></figure>



<p>We also asked whether the respondents’ companies are using AI to create some other kind of application, and if so, what. While many of these write-in applications duplicated features already available from big AI providers like Microsoft, OpenAI, and Google, others covered a very impressive range. Many of the applications involved summarization: news, legal documents and contracts, veterinary medicine, and financial information stand out. Several respondents also mentioned working with video: analyzing video data streams, video analytics, and generating or editing videos.</p>



<p>Other applications that respondents listed included fraud detection, teaching, customer relations management, human resources, and compliance, along with more predictable applications like chat, code generation, and writing. We can’t tally and tabulate all the responses, but it’s clear that there’s no shortage of creativity and innovation. It’s also clear that there are few industries that won’t be touched—AI will become an integral part of almost every profession.</p>



<p>Generative AI will take its place as the ultimate office productivity tool. When this happens, it may no longer be recognized as AI; it will just be a feature of Microsoft Office or Google Docs or Adobe Photoshop, all of which are integrating generative AI models. GitHub Copilot and Google’s Codey have both been integrated into Microsoft and Google’s respective programming environments. They will simply be part of the environment in which software developers work. The same thing happened to networking 20 or 25 years ago: wiring an office or a house for ethernet used to be a big deal. Now we expect wireless everywhere, and even that’s not correct. We don’t “expect” it—we assume it, and if it’s not there, it’s a problem. We expect mobile to be everywhere, including map services, and it’s a problem if you get lost in a location where the cell signals don’t reach. We expect search to be everywhere. AI will be the same. It won’t be expected; it will be assumed, and an important part of the transition to AI everywhere will be understanding how to work when it isn’t available.</p>



<h2>The Builders and Their Tools</h2>



<p>To get a different take on what our customers are doing with AI, we asked what models they’re using to build custom applications. 36% indicated that they aren’t building a custom application. Instead, they’re working with a prepackaged application like ChatGPT, GitHub Copilot, the AI features integrated into Microsoft Office and Google Docs, or something similar. The remaining 64% have shifted from using AI to developing AI applications. This transition represents a big leap forward: it requires investment in people, in infrastructure, and in education.</p>



<h3>Which Model?</h3>



<p>While the GPT models dominate most of the online chatter, the number of models available for building applications is increasing rapidly. We read about a new model almost every day—certainly every week—and a quick look at <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://huggingface.co/models" target="_blank">Hugging Face</a> will show you more models than you can count. (As of November, the number of models in its repository is approaching 400,000.) Developers clearly have choices. But what choices are they making? Which models are they using?</p>



<p>It’s no surprise that 23% of respondents report that their companies are using one of the GPT models (2, 3.5, 4, and 4V), more than any other model. It’s a bigger surprise that 21% of respondents are developing their own model; that task requires substantial resources in staff and infrastructure. It will be worth watching how this evolves: will companies continue to develop their own models, or will they use AI services that allow a foundation model (like GPT-4) to be customized?</p>



<p>16% of the respondents report that their companies are building on top of open source models. Open source models are a large and diverse group. One important subsection consists of models derived from Meta’s LLaMA: llama.cpp, Alpaca, Vicuna, and many others. These models are typically smaller (7 to 14 billion parameters) and easier to fine-tune, and they can run on very limited hardware; many can run on laptops, cell phones, or nanocomputers such as the Raspberry Pi. Training requires much more hardware, but the ability to run in a limited environment means that a finished model can be embedded within a hardware or software product. Another subsection of models has no relationship to LLaMA: RedPajama, Falcon, MPT, Bloom, and many others, most of which are available on Hugging Face. The number of developers using any specific model is relatively small, but the total is impressive and demonstrates a vital and active world beyond GPT. These “other” models have attracted a significant following. Be careful, though: while this group of models is frequently called “open source,” many of them restrict what developers can build from them. Before working with any so-called open source model, look carefully at the license. Some limit the model to research work and prohibit commercial applications; some prohibit competing with the model’s developers; and more. We’re stuck with the term “open source” for now, but where AI is concerned, open source often isn’t what it seems to be.</p>



<p>Only 2.4% of the respondents are building with LLaMA and Llama 2. While the <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://huggingface.co/models?search=llama2" target="_blank">source code and weights</a> for the LLaMA models are available online, the LLaMA models don’t yet have a public API backed by Meta—although there appear to be several APIs developed by third parties, and both <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://console.cloud.google.com/marketplace/product/meta/llama-2?project=strong-keyword-184513" target="_blank">Google Cloud</a> and <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/introducing-llama-2-on-azure/ba-p/3881233" target="_blank">Microsoft Azure</a> offer Llama 2  as a service. The LLaMA-family models also fall into the “so-called open source” category that restricts what you can build.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig03-1048x785.png" alt="" class="wp-image-15264" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig03-1048x785.png 1048w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig03-300x225.png 300w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig03-768x575.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig03-1536x1151.png 1536w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig03.png 1853w" sizes="(max-width: 1048px) 100vw, 1048px" /></figure>



<p>Only 1% are building with Google’s Bard, which perhaps has less exposure than the others. A number of writers have claimed that Bard gives worse results than the LLaMA and GPT models; that may be true for chat, but I’ve found that Bard is often correct when GPT-4 fails. For app developers, the biggest problem with Bard probably isn’t accuracy or correctness; it’s availability. In March 2023, Google announced a public beta program for the Bard API. However, as of November, questions about API availability are still answered by links to the <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.googlecloudcommunity.com/gc/AI-ML/Google-Bard-API/m-p/538517#:~:text=Yes%2C%20there%20is%20an%20API,on%20the%20Google%20AI%20website." target="_blank">beta announcement</a>. Use of the Bard API is undoubtedly hampered by the relatively small number of developers who have access to it. Even fewer are using <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://claude.ai/login?returnTo=%2F" target="_blank">Claude</a>, a very capable model developed by Anthropic. Claude doesn’t get as much news coverage as the models from Meta, OpenAI, and Google, which is unfortunate: Anthropic’s <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.anthropic.com/index/constitutional-ai-harmlessness-from-ai-feedback" target="_blank">Constitutional AI</a> approach to AI safety is a unique and promising attempt to solve the biggest problems troubling the AI industry.</p>



<h3>What Stage?</h3>



<p>When asked what stage companies are at in their work, most respondents shared that they’re still in the early stages. Given that generative AI is relatively new, that isn’t news. If anything, we should be surprised that generative AI has penetrated so deeply and so quickly. 34% of respondents are working on an initial proof of concept. 14% are in product development, presumably after developing a PoC; 10% are building a model, also an early stage activity; and 8% are testing, which presumes that they’ve already built a proof of concept and are moving toward deployment—they have a model that at least appears to work.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig04-1048x750.png" alt="" class="wp-image-15265" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig04-1048x750.png 1048w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig04-300x215.png 300w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig04-768x549.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig04-1536x1099.png 1536w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig04.png 1935w" sizes="(max-width: 1048px) 100vw, 1048px" /></figure>



<p>What stands out is that 18% of the respondents work for companies that have AI applications in production. Given that the technology is new and that many AI projects fail,<sup data-type="footnote">2</sup> it’s surprising that 18% report that their companies already have generative AI applications in production. We’re not being skeptics; this is evidence that while most respondents report companies that are working on proofs of concept or in other early stages, generative AI is being adopted and is doing real work. We’ve already seen some significant&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.honeycomb.io/blog/honeycomb-natural-language-querying-query-assistant" target="_blank">integrations</a>&nbsp;of AI into existing products, including&nbsp;<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.oreilly.com/online-learning/article-answers.html" target="_blank">our own</a>. We expect others to follow.</p>



<h3>Risks and Tests</h3>



<p>We asked the respondents whose companies are working with AI what risks they’re testing for. The top five responses clustered between 45 and 50%: unexpected outcomes (49%), security vulnerabilities (48%), safety and reliability (46%), fairness, bias, and ethics (46%), and privacy (46%).</p>



<p>It’s important that almost half of respondents selected “unexpected outcomes,” more than any other answer: anyone working with generative AI needs to know that incorrect results (often called hallucinations) are common. If there’s a surprise here, it’s that this answer wasn’t selected by 100% of the participants. Unexpected, incorrect, or inappropriate results are almost certainly the biggest single risk associated with generative AI.</p>



<p>We’d like to see more companies test for fairness. There are many applications (for example,<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.himss.org/resources/uncovering-and-removing-data-bias-healthcare" target="_blank"> medical</a> <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://techxplore.com/news/2023-10-ai-chatbots-health-perpetuating-racism.html" target="_blank">applications</a>) where bias is among the most important problems to test for and where getting rid of historical biases in the training data is very difficult and of utmost importance. It’s important to realize that unfair or biased output can be very subtle, particularly if application developers don’t belong to groups that experience bias—and what’s “subtle” to a developer is often very unsubtle to a user. A chat application that doesn’t understand a user’s accent is an obvious problem (search for “Amazon Alexa doesn’t understand Scottish accent”). It’s also important to look for applications where bias isn’t an issue. ChatGPT has driven a focus on personal use cases, but there are many applications where problems of bias and fairness aren’t major issues: for example, examining images to tell whether crops are diseased or optimizing a building’s heating and air conditioning for maximum efficiency while maintaining comfort.</p>



<p>It’s good to see issues like safety and security near the top of the list. Companies are gradually waking up to the idea that security is a serious issue, not just a cost center. In many applications (for example, customer service), generative AI is in a position to do significant reputational damage, in addition to creating legal liability. Furthermore, generative AI has its own vulnerabilities, such as <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://simonwillison.net/2023/Apr/14/worst-that-can-happen/" target="_blank">prompt injection</a>, for which there is still no known solution. <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://techxplore.com/news/2023-10-ai-expose-critical-vulnerabilities-major.html" target="_blank">Model leeching</a>, in which an attacker uses specially designed prompts to reconstruct the data on which the model was trained, is another attack that’s unique to AI. While 48% isn’t bad, we would like to see even greater awareness of the need to test AI applications for security.</p>



<p>Model interpretability (35%) and model degradation (31%) aren’t as big concerns. Unfortunately, interpretability remains a research problem for generative AI. At least with the current language models, it’s very difficult to explain why a generative model gave a specific answer to any question. Interpretability might not be a requirement for most current applications. If ChatGPT writes a Python script for you, you may not care why it wrote that particular script rather than something else. (It’s also worth remembering that if you ask ChatGPT why it produced any response, its answer will not be the reason for the previous response, but, as always, the most likely response to your question.) But interpretability is critical for diagnosing problems of bias and will be extremely important when cases involving generative AI end up in court.</p>



<p>Model degradation is a different concern. The performance of any AI model degrades over time, and as far as we know, large language models are no exception. <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/pdf/2307.09009.pdf" target="_blank">One hotly debated study</a> argues that the quality of GPT-4’s responses has dropped over time. Language changes in subtle ways; the questions users ask shift and may not be answerable with older training data. Even the existence of an AI answering questions might cause a change in what questions are asked. Another fascinating issue is what happens when generative models are trained on data generated by other generative models. Is <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2305.17493v2" target="_blank">“model collapse”</a> real, and what impact will it have as models are retrained?</p>



<p>If you’re simply building an application on top of an existing model, you may not be able to do anything about model degradation. Model degradation is a much bigger issue for developers who are building their own model or doing additional training to fine-tune an existing model. Training a model is expensive, and it’s likely to be an ongoing process.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig05-1048x732.png" alt="" class="wp-image-15266" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig05-1048x732.png 1048w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig05-300x210.png 300w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig05-768x537.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig05-1536x1073.png 1536w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig05.png 1969w" sizes="(max-width: 1048px) 100vw, 1048px" /></figure>



<h2>Missing Skills</h2>



<p>One of the biggest challenges facing companies developing with AI is expertise. Do they have staff with the necessary skills to build, deploy, and manage these applications? To find out where the skills deficits are, we asked our respondents what skills their organizations need to acquire for AI projects. We weren’t surprised that AI programming (66%) and data analysis (59%) are the two most needed. AI is the next generation of what we called “data science” a few years back, and data science represented a merger between statistical modeling and software development. The field may have evolved from traditional statistical analysis to artificial intelligence, but its overall shape hasn’t changed much.</p>



<p>The next most needed skill is operations for AI and ML (54%). We’re glad to see people recognize this; we’ve long thought that operations was the “elephant in the room” for AI and ML. Deploying and managing AI products isn’t simple. These products differ in many ways from more traditional applications, and while practices like continuous integration and deployment have been very effective for traditional software applications, AI requires a rethinking of these code-centric methodologies. The model, not the source code, is the most important part of any AI application, and models are large binary files that aren’t amenable to source control tools like Git. And unlike source code, models grow stale over time and require constant monitoring and testing. The statistical behavior of most models means that simple, deterministic testing won’t work; you can’t guarantee that, given the same input, a model will generate the same output. The result is that AI operations is a specialty of its own, one that requires a deep understanding of AI and its requirements in addition to more traditional operations. What kinds of deployment pipelines, repositories, and test frameworks do we need to put AI applications into production? We don’t know; we’re still developing the tools and practices needed to deploy and manage AI successfully.</p>



<p>Infrastructure engineering, a choice selected by 45% of respondents, doesn’t rank as high. This is a bit of a puzzle: running AI applications in production can require huge resources, as companies as large as <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.theregister.com/2023/10/11/github_ai_copilot_microsoft/" target="_blank">Microsoft</a> are finding out. However, most organizations aren’t yet running AI on their own infrastructure. They’re either using APIs from an AI provider like OpenAI, Microsoft, Amazon, or Google or they’re using a cloud provider to run a homegrown application. But in both cases, some other provider builds and manages the infrastructure. OpenAI in particular offers enterprise services, which includes APIs for training custom models along with stronger guarantees about keeping corporate data private. However, with <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenextweb.com/news/ai-progress-hitting-brakes-more-likely-than-world-domination" target="_blank">cloud providers operating near full capacity</a>, it makes sense for companies investing in AI to start thinking about their own infrastructure and acquiring the capacity to build it.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig06-1048x726.png" alt="" class="wp-image-15267" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig06-1048x726.png 1048w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig06-300x208.png 300w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig06-768x532.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig06-1536x1064.png 1536w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig06.png 1968w" sizes="(max-width: 1048px) 100vw, 1048px" /></figure>



<p>Over half of the respondents (52%) included general AI literacy as a needed skill. While the number could be higher, we’re glad that our users recognize that familiarity with AI and the way AI systems behave (or misbehave) is essential. Generative AI has a great wow factor: with a simple prompt, you can get ChatGPT to tell you about Maxwell’s equations or the Peloponnesian War. But simple prompts don’t get you very far in business. AI users soon learn that good prompts are often very complex, describing in detail the result they want and how to get it. Prompts can be very long, and they can include all the resources needed to answer the user’s question. Researchers debate whether this level of prompt engineering will be necessary in the future, but it will clearly be with us for the next few years. AI users also need to expect incorrect answers and to be equipped to check virtually all the output that an AI produces. This is often called critical thinking, but it’s much more like the <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/Discovery_(law)" target="_blank">process of discovery in law</a>: an exhaustive search of all possible evidence. Users also need to know how to create a prompt for an AI system that will generate a useful answer.</p>



<h2>Finally, the Business</h2>



<p>So what’s the bottom line? How do businesses benefit from AI? Over half (54%) of the respondents expect their businesses to benefit from increased productivity. 21% expect increased revenue, which might indeed be the result of increased productivity. Together, that’s three-quarters of the respondents. Another 9% say that their companies would benefit from better planning and forecasting.</p>



<p>Only 4% believe that the primary benefit will be lower personnel counts. We’ve long thought that the fear of losing your job to AI was exaggerated. While there will be some short-term dislocation as a few jobs become obsolete, AI will also create new jobs—as has almost every significant new technology, including computing itself. Most jobs rely on a multitude of individual skills, and generative AI can only substitute for a few of them. Most employees are also willing to use tools that will make their jobs easier, boosting productivity in the process. We don’t believe that AI will replace people, and neither do our respondents. On the other hand, employees will need training to use AI-driven tools effectively, and it’s the responsibility of the employer to provide that training.</p>



<figure class="wp-block-image size-large"><img src="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig07-1048x740.png" alt="" class="wp-image-15268" srcset="https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig07-1048x740.png 1048w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig07-300x212.png 300w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig07-768x542.png 768w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig07-1536x1084.png 1536w, https://www.oreilly.com/radar/wp-content/uploads/sites/3/2023/11/fig07.png 1971w" sizes="(max-width: 1048px) 100vw, 1048px" /></figure>



<p>We’re optimistic about generative AI’s future. It’s hard to realize that ChatGPT has only been around for a year; the technology world has changed so much in that short period. We’ve never seen a new technology command so much attention so quickly: not personal computers, not the internet, not the web. It’s certainly possible that we’ll slide into another AI winter if the investments being made in generative AI don’t pan out. There are definitely problems that need to be solved—correctness, fairness, bias, and security are among the biggest—and some early adopters will ignore these hazards and suffer the consequences. On the other hand, we believe that worrying about a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/Artificial_general_intelligence" target="_blank">general AI</a> deciding that humans are unnecessary is either an affliction of those who read too much science fiction or a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://stratechery.com/2023/attenuating-innovation-ai/" target="_blank">strategy to encourage regulation</a> that gives the current incumbents an advantage over startups.</p>



<p>It’s time to start learning about generative AI, thinking about how it can improve your company’s business, and planning a strategy. We can’t tell you what to do; developers are pushing AI into almost every aspect of business. But companies will need to invest in training, both for software developers and for AI users; they’ll need to invest in the resources required to develop and run applications, whether in the cloud or in their own data centers; and they’ll need to think creatively about how they can put AI to work, realizing that the answers may not be what they expect.</p>



<p>AI won’t replace humans, but companies that take advantage of AI will replace companies that don’t.</p>



<hr class="wp-block-separator" />



<h2>Footnotes</h2>



<ol><li>Meta has dropped the odd capitalization for Llama 2. In this report, we use LLaMA to refer to the LLaMA models generically: LLaMA, Llama 2, and Llama n, when future versions exist. Although capitalization changes, we use Claude to refer both to the original Claude and to Claude 2, and Bard to Google’s Bard model and its successors.</li><li>Many articles quote Gartner as saying that the failure rate for AI projects is 85%. We haven’t found the source, though in 2018, <a href="https://www.gartner.com/en/newsroom/press-releases/2018-02-13-gartner-says-nearly-half-of-cios-are-planning-to-deploy-artificial-intelligence" target="_blank" rel="noreferrer noopener" aria-label=" (opens in a new tab)">Gartner wrote</a> that 85% of AI projects “deliver erroneous outcomes.” That’s not the same as failure, and 2018 significantly predates generative AI. Generative AI is certainly prone to “erroneous outcomes,” and we suspect the failure rate is high. 85% might be a reasonable estimate.</li></ol>



<hr class="wp-block-separator" />



<h2>Appendix</h2>



<h3>Methodology and Demographics</h3>



<p>This survey ran from September 14, 2023, to September 27, 2023. It was publicized through <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://learning.oreilly.com/home-new/" target="_blank">O’Reilly’s learning platform</a> to all our users, both corporate and individuals. We received 4,782 responses, of which 2,857 answered all the questions. As we usually do, we eliminated incomplete responses (users who dropped out part way through the questions). Respondents who indicated they weren’t using generative AI were asked a final question about why they weren’t using it, and considered complete.</p>



<p>Any survey only gives a partial picture, and it’s very important to think about biases. The biggest bias by far is the nature of O’Reilly’s audience, which is predominantly North American and European. 42% of the respondents were from North America, 32% were from Europe, and 21% percent were from the Asia-Pacific region. Relatively few respondents were from South America or Africa, although we are aware of very interesting applications of AI on these continents.</p>



<p>The responses are also skewed by the industries that use our platform most heavily. 34% of all respondents who completed the survey were from the software industry, and another 11% worked on computer hardware, together making up almost half of the respondents. 14% were in financial services, which is another area where our platform has many users. 5% of the respondents were from telecommunications, 5% from the public sector and the government, 4.4% from the healthcare industry, and 3.7% from education. These are still healthy numbers: there were over 100 respondents in each group. The remaining 22% represented other industries, ranging from mining (0.1%) and construction (0.2%) to manufacturing (2.6%).</p>



<p>These percentages change very little if you look only at respondents whose employers use AI rather than all respondents who completed the survey. This suggests that AI usage doesn’t depend a lot on the specific industry; the differences between industries reflects the population of O’Reilly’s user base.</p>
]]></content:encoded>
							<wfw:commentRss>https://www.oreilly.com/radar/generative-ai-in-the-enterprise/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
							</item>
		<item>
		<title>Creativity Isn’t Just Remixing</title>
		<link>https://www.oreilly.com/radar/creativity-isnt-just-remixing/</link>
				<comments>https://www.oreilly.com/radar/creativity-isnt-just-remixing/#respond</comments>
				<pubDate>Tue, 14 Nov 2023 13:09:52 +0000</pubDate>
		<dc:creator><![CDATA[Mike Loukides]]></dc:creator>
				<category><![CDATA[AI & ML]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Commentary]]></category>

		<guid isPermaLink="false">https://www.oreilly.com/radar/?p=15251</guid>
				<description><![CDATA[This is not the first time that I’ve written about AI creativity, and I doubt that it will be the last. It’s a question that comes up repeatedly, and that is very much in the current mind, with events like the strikes by the Writers Guild of America and the Screen Actors Guild, in which [&#8230;]]]></description>
								<content:encoded><![CDATA[
<p>This is not the first time that I’ve written about AI creativity, and I doubt that it will be the last. It’s a question that comes up repeatedly, and that is very much in the current mind, with events like the strikes by the <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.theguardian.com/culture/2023/sep/26/hollywood-writers-strike-ends-studio-deal" target="_blank">Writers Guild of America</a> and the <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.cnbc.com/2023/07/13/sag-actors-union-goes-on-strike-joining-hollywood-writers.html" target="_blank">Screen Actors Guild</a>, in which the use of AI to create scripts and to generate images of actors was an issue.&nbsp;Can an AI system be creative and, if so, what would that creativity look like?</p>



<p>I’m skeptical about AI creativity, though recently I <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.oreilly.com/radar/ai-hallucinations-a-provocation/" target="_blank">hypothesized</a> that an AI system optimized for “hallucinations” might be the start of “artificial creativity.” That’s a path that’s well worth investigating. But let’s take a step back and think more carefully about what creativity means.</p>



<p>It’s all too easy to say that creativity is, at its heart, combinatory. Ethan Mollick (with whom I rarely disagree) writes, “In the real world, most new ideas do not come from the ether; they are based on combinations of existing concepts, which is why innovation scholars have long pointed to the importance of recombination in generating ideas.” He’s partially right, but that statement misses the point—in part because Mollick studies business innovation, which, despite the name, is all too often nothing more than recombination. Remember all the VC dollars thrown at new “social media” companies that were ultimately just reinventions of Twitter, Facebook, or one of their predecessors? Remember all the “Uber for X” startups? The thousands of altcoins that (used to) attract lots of capital? The current wave of AI startups is no different. There’s a lot of posturing here, but very little creativity.</p>



<p>No, to find creativity, we’ll have to look more closely. It’s naive to say that creativity isn’t partly based on the work of predecessors. You wouldn’t get Beethoven without the works of Haydn and Mozart. At the same time, you don’t get Beethoven out of the works of Haydn and Mozart. An AI trained on the works on Haydn and Mozart wouldn’t give you Beethoven; it would give you some (probably rather dull) amalgam, lacking the creativity of either Haydn or Mozart. Nor can you derive the Beatles by mixing together Chuck Berry and Little Richard, though (again) there are obvious relationships. </p>



<p>At this point, we have to make some distinctions about what we mean by “creativity.” AI can write poems—not terribly well, but they certainly rhyme, and they can be prompted to convey certain sentiments. I wouldn’t mistake anything I’ve seen for the work of a great (or even good) poet, but companies like Hallmark provide a market for millions of lines of verse, and that market is probably more lucrative than the market for poets who publish in “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/Little_magazine" target="_blank">little magazines</a>.” And it’s been a long time since I’ve expected anything worthwhile from the music industry, which is much more about industry than music.&nbsp;There’s an almost unending appetite for “industrial” music.</p>



<p>So, what is creativity? Creativity certainly depends on the past: “shoulders of giants” and all of that. There are few great artists or technical innovators who don’t understand their relationship to the past. That relationship is often uncomfortable, but it’s essential. At the same time, great artists add something new, create new possibilities. Arne Eigenfeldt, <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://theconversation.com/why-the-growth-of-ai-in-making-art-wont-eliminate-artists-210187" target="_blank">writing</a> about music, says that “it takes true creativity to produce something outside the existing paradigm,” and that the “music industry has been driven by style-replicating processes for decades.” AI that merely mixes and matches style is uninteresting. But Eigenfeldt would be the last person to say that AI has nothing to do with creativity. It’s another tool; prompting AI, and curating its output is itself a creative act. Artists working with AI can do more experiments, and potentially create more art that breaks paradigms, art that indeed makes something new.</p>



<p>Of all the arts, music has historically been the most amenable to borrowing, stealing, or whatever you want to call it. The <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.youtube.com/watch?v=MAFUdIZnI5o" target="_blank">history</a> of Thelonious Monk’s “Rhythm-a-Ning” stretches back to George Gershwin’s “I Got Rhythm” and Duke Ellington’s “Ducky Wucky,” and forward (or is it sideways) to songs as unlikely as the theme song for <em>The Flintstones</em>. There is no question about creativity, but it’s creativity that’s based on a vocabulary that has a long history. And there’s no question that all of these expressions of creativity include elements that go beyond a simple “remixing” of that vocabulary.</p>



<p>What about other arts? While borrowing in literature is usually more covert than overt, T. S. Eliot famously <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.gutenberg.org/cache/epub/57795/pg57795-images.html#massinger" target="_blank">said</a>, “Immature poets imitate; mature poets steal; bad poets deface what they take, and good poets make it into something better, or at least something different. The good poet welds his theft into a whole of feeling which is unique, utterly different from that from which it was torn.” This is often quoted incorrectly as “Good writers borrow, great writers steal,” a quote that’s also attributed to Oscar Wilde (“Talent borrows, genius steals”) and many others. While the history of copying this quote about copying is interesting in its own right, Eliot’s version shows how “theft” becomes something new, something that wasn’t couldn’t have been predicted or anticipated. It’s worth thinking of William Blake’s reinterpretation of Milton’s <em>Paradise Lost</em>, in which Satan is the hero; “The reason Milton wrote in fetters when he wrote of Angels and God, and at liberty when of Devils and Hell, is that he was a true Poet and of the Devil’s party without knowing it” (<em><a href="https://ia803405.us.archive.org/0/items/marriageofheaven00blak/marriageofheaven00blak.pdf">The Marriage of Heaven and Hell</a></em>, page 6).&nbsp; But Blake’s works are far from a remixing; they’re radically different. Blake certainly understood his connection to Milton, but more than any other poet created works that are completely unlike anything that came before. (Follow the link to see images of Blake’s work.) While Blake may represent creation at its most radical, literature that is worth reading is never just a remixing; it always adds something new, if it is not to be entirely in “fetters.” </p>



<p>I’ve <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.oreilly.com/radar/ai-and-creativity/" target="_blank">argued</a> that what matters to us in a literary work is the fact that a human wrote it. We value a poem like Wordsworth’s “Lines Composed a Few Miles Above Tintern Abbey, on Revisiting the Banks of the Wye During a Tour” because of the texture of Wordsworth’s thought, and his thought reflecting on itself. I’ve used the long and prosaic title rather than the shorter “Tintern Abbey” to emphasize that. Whether it’s Wordsworth or Ginsburg’s “Howl,” what matters is that someone has thought these thoughts. But that’s certainly a post-Romantic take on creativity—one that Wordsworth would have agreed with, but that would have been very strange to Shakespeare or Chaucer. Chaucer would have thought that literature was about retelling good stories, and not necessarily original ones; <em>The Canterbury Tales</em> steals from many models, ranging from classical literature to Dante. So do Shakespeare’s plays. But in both cases, thinking that these works could come from recombining the original works misses the point.&nbsp;What makes them worth reading isn’t that they’re retellings of old material, it’s what isn’t in the original. Macbeth may be based on <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/Holinshed's_Chronicles" target="_blank"><em>Holinshed’s Chronicles</em></a>, but <em>Holinshed</em> (should you ever read it) is dull. <em>Hamlet </em>was almost certainly based on an earlier play (called <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/Ur-Hamlet" target="_blank"><em>Ur-Hamlet</em></a>), probably written by one of Shakespeare’s contemporaries, about which very little is known. There’s something great imaginatively happening in all of these works: characters that we can think about and care about, something we might even call the “invention of the human.”<sup>1</sup></p>



<p>As in literature, copying in painting is usually covert rather than overt. Pablo Picasso also may have said “good artists copy, great artists steal,” joining Eliot, Wilde, and others. Copying paintings by great artists is still an exercise for aspiring artists—although most of us recognize that more paintings in the style of <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.news24.com/life/arts-and-entertainment/arts/ai-recreation-of-johannes-vermeers-masterpiece-sparks-dutch-art-controversy-20230311" target="_blank">Vermeer</a> aren’t interesting as works of art. They’re perhaps valuable as stand-ins when the original is on tour, and the technology used to create them is certainly of interest; I’m particularly interested in an AI-created <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.npr.org/sections/alltechconsidered/2016/04/06/473265273/a-new-rembrandt-from-the-frontiers-of-ai-and-not-the-artists-atelier" target="_blank">Rembrandt</a> that used a 3D printer to mimic his brushstrokes. This technology may be useful for repairing damaged works of art.&nbsp;But as far as new paintings—in a very real sense, much as we may wish we had more, we have enough. Hanging a picture of your company’s founder in the style of Vermeer on your wall would be a joke—either on the institution of Art, or on you, depending on whether you understand what you’re doing.</p>



<p>The question of remixing becomes more important if we turn to recent and more commercial art. While I wouldn’t want a painting of Tim O’Reilly in the style of Vermeer on my wall, many people are using tools like Midjourney and Stable Diffusion to create their own images in the style of living, working artists; images in the style of Greg Rutkowski have been requested <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://finance.yahoo.com/news/greg-rutkowski-removed-stable-diffusion-223226680.html" target="_blank">over 400,000 times</a>. After his images were removed from Stable Diffusion’s training data, fans developed an alternate model that was tuned to produce images in Rutkowski’s style. While that’s certainly a strong sign of ongoing popularity, it is important to think about the consequences. Does ease of creating faux-Rutkowski compromise his ability to make a living? Fans are clearly putting faux-Rutkowski as wallpaper on their laptops, if not ordering high-resolution prints and putting them on their walls. If this is a joke, who is the butt? Would a publisher generate a faux image as a book cover? Is Rutkowski’s style (as opposed to a specific work) protected by copyright laws?&nbsp;We don’t know; a number of <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.thefashionlaw.com/from-chatgpt-to-deepfake-creating-apps-a-running-list-of-key-ai-lawsuits/" target="_blank">cases</a> are in the legal system now. Most of these cases involve the terra incognita of training data, though most of these cases involve the use of copyrighted material as training data, not the recreation of a specific style, let alone a specific work.</p>



<p>What about creativity? Creativity sets a high bar, and I don’t think AI meets it yet. At least one artist thinks that tools like Midjourney are being trained to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://uxdesign.cc/the-case-for-ai-hallucination-a79688338a14" target="_blank">favor photorealism</a>, rather than originality. In “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2305.17493" target="_blank">The Curse of Recursion</a>,” a research group shows that generative AI that is trained on the output of generative AI will produce less surprising, original output. Its output will become pedestrian, expected, and mediocre, and that might be fine for many applications. With human artists such as Rutkowski or <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://waxy.org/2022/11/invasive-diffusion-how-one-unwilling-illustrator-found-herself-turned-into-an-ai-model/" target="_blank">Hollie Mengert</a> (whose story is eerily similar to Rutkowski’s), creativity lies in what they put into their art, not the possibility of imitating their style. We see that clearly when we’re not blinded by AI’s presence: if a human imitated their styles, would we call that creative? Or just derivative? It’s amazing that an AI system can produce derivative works, but we have to remember that they are derivative works. And we have to recognize that AI, as a tool for artists, makes perfect sense. Just as we don’t confuse the artist’s creativity with the paintbrush, we shouldn’t confuse their creativity with the AI.</p>



<hr class="wp-block-separator" />



<h3>Footnotes</h3>



<ol><li>The title of Harold Bloom’s <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.goodreads.com/en/book/show/20942" target="_blank">book</a> on Shakespeare. Bloom is also one of a minority of scholars who believes that Shakespeare wrote the <em>Ur-Hamlet</em>, which was an early version of Hamlet. Given that we know next to nothing about the original play, this is at best an interesting conjecture.</li></ol>
]]></content:encoded>
							<wfw:commentRss>https://www.oreilly.com/radar/creativity-isnt-just-remixing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
							</item>
		<item>
		<title>Radar Trends to Watch: November 2023</title>
		<link>https://www.oreilly.com/radar/radar-trends-to-watch-november-2023/</link>
				<comments>https://www.oreilly.com/radar/radar-trends-to-watch-november-2023/#respond</comments>
				<pubDate>Tue, 07 Nov 2023 10:58:06 +0000</pubDate>
		<dc:creator><![CDATA[Mike Loukides]]></dc:creator>
				<category><![CDATA[Radar Trends]]></category>
		<category><![CDATA[Signals]]></category>

		<guid isPermaLink="false">https://www.oreilly.com/radar/?p=15244</guid>
				<description><![CDATA[Our Security section has grown almost as large as AI (and longer than Programming)—and that’s not including some security issues specific to AI, like model leeching. Does that mean that AI is cooling down? Or that security is heating up? It’s really impossible for security issues to get too much attention. The biggest news in [&#8230;]]]></description>
								<content:encoded><![CDATA[
<p>Our Security section has grown almost as large as AI (and longer than Programming)—and that’s not including some security issues specific to AI, like model leeching. Does that mean that AI is cooling down? Or that security is heating up? It’s really impossible for security issues to get too much attention. The biggest news in AI arrived on the last day of October, and it wasn’t technical at all: the Biden administration’s executive order on AI. It will take some time to digest this, and even longer to see whether vendors follow the order’s recommendations. In itself, it’s evidence of an important ongoing trend: in the next year, many of the most important developments in AI will be legal rather than technical.</p>



<h2>Artificial Intelligence</h2>



<ul><li>In an <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.whitehouse.gov/briefing-room/statements-releases/2023/10/30/fact-sheet-president-biden-issues-executive-order-on-safe-secure-and-trustworthy-artificial-intelligence/" target="_blank">executive order</a>, the US has issued a set of rules covering the development of advanced AI systems. The regulations <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.technologyreview.com/2023/10/30/1082678/three-things-to-know-about-the-white-houses-executive-order-on-ai/" target="_blank">encourage</a> the development of watermarks (specifically the C2PA initiative) to authenticate communication; they attempt to set standards for testing; and they call for agencies to develop rules to protect consumers and workers.</li><li>Nightshade is another tool that artists can use to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/information-technology/2023/10/university-of-chicago-researchers-seek-to-poison-ai-art-generators-with-nightshade/" target="_blank">prevent generative AI</a> systems from using their work. It makes unnoticeable modifications to the image that cause the AI model to misinterpret it and create incorrect output.</li><li>Stanford’s Institute for Human-Centered Artificial Intelligence has issued a report on <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://hai.stanford.edu/news/introducing-foundation-model-transparency-index" target="_blank">transparency for large language models</a>: whether the creators of LLMs are disclosing essential data about their models. No model scores well, and transparency appears to be declining as the field grows more competitive.</li><li>Chatbots <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://techxplore.com/news/2023-10-ai-chatbots-health-perpetuating-racism.html" target="_blank">perpetuate false and racially biased information</a> in medical care. Debunked ideas about pain tolerance, kidney function, and other factors are included in training data, causing models to repeat those ideas.</li><li>An <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://redmonk.com/jgovernor/2023/10/18/introducing-the-ai-bill-of-materials/" target="_blank">AI Bill of Materials</a> (AIBOM) would <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/jasebell/ai-bill-of-materials" target="_blank">document</a> all of the materials that go into the creation of an AI system. This documentation would be essential to building AI that is capable of complying with regulation.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.andrealyip.com/a-young-ladys-illustrated-primer" target="_blank">GPT-4 does Stephenson</a>: GPT simulates the <em>Young Lady’s Illustrated Primer</em> (from <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/The_Diamond_Age" target="_blank"><em>The Diamond Age</em></a>). With illustrations from DALL-E.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://cobusgreyling.medium.com/a-new-prompt-engineering-technique-has-been-introduced-called-step-back-prompting-b00e8954cacb" target="_blank">Step-Back Prompting</a> is another prompting technique in which you ask a question, but before getting an answer, you ask the LLM to provide background information that will help it answer the question.</li><li>Prompt injection just got scarier. GPT-4V, which allows users to include images in conversations, is <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://simonwillison.net/2023/Oct/14/multi-modal-prompt-injection/" target="_blank">vulnerable to prompt injection through the images themselves</a>; text in the images can be interpreted as prompts. Malicious prompts can even be <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://twitter.com/goodside/status/1713000581587976372" target="_blank">hidden</a> in images.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://cloud.google.com/blog/products/ai-machine-learning/protecting-customers-with-generative-ai-indemnification" target="_blank">Google joins Microsoft, Adobe</a>, and others in indemnifying users of their AI against copyright lawsuits.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://techxplore.com/news/2023-10-ai-expose-critical-vulnerabilities-major.html" target="_blank">Model leeching</a> is a new attack against large language models. In model leeching, a carefully constructed set of prompts allows attackers to generate a smaller model that behaves similarly. The smaller model can then be used to construct other attacks against the original model.</li><li>Open source language models are proliferating. <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://blog.replit.com/replit-code-v1_5" target="_blank">Replit Code v1.5 3B</a> is now <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://huggingface.co/replit/replit-code-v1_5-3b" target="_blank">available on Hugging Face</a>. This model is designed for code completion, and has been trained on permissively licensed code so there should be minimal legal issues.</li><li>Anthropic <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.anthropic.com/index/decomposing-language-models-into-understandable-components" target="_blank">appears</a> to have made <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://transformer-circuits.pub/2023/monosemantic-features/index.html" target="_blank">significant progress</a> in making large language models interpretable. The key is understanding the behavior of groups of neurons, which they call “features,” rather than individual neurons.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://mistral.ai/news/announcing-mistral-7b/" target="_blank">Mistral 7B</a> is an open source large language model with impressive performance. It was developed independently. (It is not related to LLaMA.) Its performance is claimed to be better than equivalently sized models.</li><li>AMD may be able to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.hpcwire.com/2023/10/05/how-amd-may-get-across-the-cuda-moat/" target="_blank">challenge</a> NVIDIA’s dominance of the GPU market. NVIDIA’s dominance relies on the widely used CUDA language for programming GPUs. AMD has developed a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://rocm.docs.amd.com/en/latest/how_to/pytorch_install/pytorch_install.html" target="_blank">version of PyTorch</a> that has been tuned for use on AMD GPUs, eliminating the need for low-level GPU programming.</li><li>Larger training datasets leads to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2306.13141" target="_blank">more biased and hateful</a> <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://info.deeplearning.ai/ais-new-power-couple-movie-industry-limits-ai-youtube-goes-generative-more-web-data-more-bias" target="_blank">output</a>, not less.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://langstream.ai/" target="_blank">LangStream</a> (unrelated to LangChain) is an open source platform for building streaming applications that use generative AI.</li><li>GPT-4 and Claude have proven useful in <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://resobscura.substack.com/p/translating-latin-demonology-manuals" target="_blank">translating 16th century demonology texts</a> written in Medieval Latin. Claude’s 100K context window appears to be a big help. (And Medieval Latin is much different from the Latin you probably didn’t learn in school.)</li><li>A vulnerability called <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.bleepingcomputer.com/news/security/shelltorch-flaws-expose-ai-servers-to-code-execution-attacks/" target="_blank">ShellTorch</a> allows attackers to gain access to AI servers using TorchServe, a tool for deploying and scaling AI models using PyTorch.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://phys.org/news/2023-09-catch-22s-reservoir-overlooked-weakness-powerful.html" target="_blank">Reservoir computing</a> is another kind of neural network that has promise for understanding chaotic systems.</li><li>Perhaps not surprisingly, <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2309.10668" target="_blank">language models can do an excellent job of lossless compression</a> better than standards like FLAC. (This doesn’t mean that language models store a compressed copy of the web.)</li><li>An artist <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://uxdesign.cc/the-case-for-ai-hallucination-a79688338a14" target="_blank">makes the case</a> that training generative models not to “hallucinate” has made them less interesting and less useful for creative applications.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/information-technology/2023/09/can-you-melt-eggs-quoras-ai-says-yes-and-google-is-sharing-the-result/" target="_blank">Can you melt eggs?</a> Quora has included a feature that generates answers using an older GPT model. This model answered “yes,” and aggressive SEO managed to get that “yes” to the top of a Google search.</li></ul>



<h2>Programming</h2>



<ul><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/harpoon-no-code-deployment-for-kubernetes/" target="_blank">Harpoon</a> is a no-code, drag and drop tool for Kubernetes deployment. </li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://davidlattimore.github.io/making-supply-chain-attacks-harder.html#introducing-cackle-aka-cargo-acl" target="_blank">Cackle</a> is a new tool for the Rust tool chain. It checks access control lists and is used to make software supply chain attacks more difficult.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/make-sure-your-application-comes-correct-with-correctness-slos/" target="_blank">Correctness SLOs</a> (Service-Level Objectives) are a way to specify the statistical properties of a program’s output if it is running properly. They could become important as AI is integrated into more applications.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://cilium.io/" target="_blank">Cilium</a> is a tool for cloud native network observability. It provides a layer on top of eBPF that solves security and observability problems for Docker and Kubernetes workloads.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/the-6-pillars-of-platform-engineering-part-1-security/" target="_blank">The Six Pillars of Platform Engineering</a> is a great start for any organization that is serious about developer experience. The pillars are Security, Pipelines, Provisioning, Connectivity, Orchestration, and Observability. One article in this series is devoted to each.</li><li>Adam Jacob, creator of Chef Software, is out to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/adam-jacob-rebuilding-devops-with-system-initiative/" target="_blank">reimagine</a> DevOps. <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.systeminit.com/" target="_blank">System Initiative</a> is an <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/systeminit/si" target="_blank">open source</a> tool for managing infrastructure that stresses collaboration between engineers and operations staff—something that was always the goal of DevOps but rarely achieved.</li><li>Unreal engine, a game development platform that had been free for users outside of the gaming industry, <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.creativebloq.com/news/epic-games-unreal-engine-charge" target="_blank">will now have a subscription fee</a>. It will remain free for students and educators.</li><li>CRDTs (conflict-free replicated data types) are a data structure that is designed for resolving concurrent changes in collaborative applications (like Google Docs). Here’s a good interactive <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://jakelazaroff.com/words/an-interactive-intro-to-crdts/" target="_blank">tutorial</a> and a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://jakelazaroff.com/words/building-a-collaborative-pixel-art-editor-with-crdts/" target="_blank">project</a>: building a collaborative pixel editor.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://ambient.run/" target="_blank">Ambient</a> is a purely web-based platform for multiplayer games, built with Wasm, WebGPU, and Rust. Instant deployment, no servers.</li><li>Google has open sourced its <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/google/graph-mining#the-graph-mining-library" target="_blank">graph mining library</a>. Graphs are becoming increasingly important in data mining and machine learning.</li><li>Microsoft has released a binary build of <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://learn.microsoft.com/en-ca/java/openjdk/download?utm_source=thenewstack&amp;utm_medium=website&amp;utm_content=inline-mention&amp;utm_campaign=platform#openjdk-21" target="_blank">OpenJDK</a> 21, presumably optimized for Azure. Shades of Embrace and Extend? That doesn’t appear to be happening.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/polystores-the-data-management-game-changer/" target="_blank">Polystores</a> can store many different kinds of data—relational data, vector data, unstructured data, graph data—in a single data management system.</li></ul>



<h2>Security</h2>



<ul><li>The EFF has posted an excellent introduction to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.eff.org/what-is-a-passkey" target="_blank">passkeys</a>, which are the next step past passwords in user authentication.</li><li>Microsoft has started an early access program for <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.bleepingcomputer.com/news/microsoft/microsoft-announces-security-copilot-early-access-program/" target="_blank">Security Copilot</a>, a chatbot based on GPT-4 that has been tuned to answer questions about computer security. It can also summarize data from security incidents, analyze data from new attacks, and suggest responses.</li><li>Google is planning to test <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.bleepingcomputer.com/news/google/google-chromes-new-ip-protection-will-hide-users-ip-addresses/" target="_blank">IP protection</a> in Chrome. IP protection hides users’ IP addresses by routing traffic to or from specific domains through proxies.&nbsp;Address hiding prevents a number of common attacks, including cross-site scripting.</li><li>While the European Cyber Resilience Act (CRA) has many good ideas about making software more secure, it puts <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/open-source-development-threatened-in-europe/" target="_blank">liability for software flaws</a> on open source developers and companies funding open source development.</li><li>A new attack against memory, called <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/security/2023/10/theres-a-new-way-to-flip-bits-in-dram-and-it-works-against-the-latest-defenses/#p3" target="_blank">RowPress</a>, can cause bitflips even in DDR4 memory, which already incorporates protections against the RowHammer attack.</li><li>August and September’s distributed denial of service attacks (DDOS) against Cloudflare and Google took advantage of a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/security/2023/10/how-ddosers-used-the-http-2-protocol-to-deliver-attacks-of-unprecedented-size/" target="_blank">newly discovered vulnerability</a> in HTTP/2. Attackers open many streams per request, creating extremely high utilization with relatively few connections.</li><li>Mandiant has provided a fascinating <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.mandiant.com/resources/blog/gru-disruptive-playbook" target="_blank">analysis</a> of the Russian military intelligence’s (GRU’s) playbook in Ukraine.</li><li>Mozilla and Fastly are developing OHTTP (<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.fastly.com/blog/firefox-fastly-take-another-step-toward-security-upgrade" target="_blank">Oblivious HTTP</a>), a successor to HTTP that has been designed for privacy. OHTTP separates information about the requestor from the request itself, so no single party ever has both pieces of information.</li><li>A newly discovered <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.bleepingcomputer.com/news/security/new-wordpress-backdoor-creates-rogue-admin-to-hijack-websites/" target="_blank">backdoor to WordPress</a> allows attackers to take over websites. The malware is disguised as a WordPress plug-in that appears legitimate.</li><li>While standards are still developing, <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://thenewstack.io/i-have-a-verifiable-credential-now-what/" target="_blank">decentralized identity and verifiable credentials</a> are starting to appear outside of the cryptocurrency world. When adopted, these technologies will significantly enhance both privacy and security.</li><li>To improve its ability to detect unwanted and harmful email, Gmail will be <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.bleepingcomputer.com/news/security/microsoft-365-admins-warned-of-new-google-anti-spam-rules/" target="_blank">requiring</a> bulk email senders (over 5,000 messages per day) to implement SPF, DKIM, and DMARC authentication records in DNS or risk having their messages marked as spam.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.bleepingcomputer.com/news/security/genetics-firm-23andme-says-user-data-stolen-in-credential-stuffing-attack/" target="_blank">Genetic data has been stolen</a> from 23andMe. The attack was quite simple: the attackers just used usernames and passwords that were in circulation and had been reused.</li><li>The time required to execute a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.bleepingcomputer.com/news/security/fbi-dual-ransomware-attack-victims-now-get-hit-within-48-hours/" target="_blank">ransomware</a> attack has reduced from 10 days to 2 days, and it’s increasingly common for victims to be hit with a second attack against systems that have already been compromised.</li></ul>



<h2>Networks</h2>



<ul><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.com/Shopify/toxiproxy" target="_blank">Toxiproxy</a> is a tool for chaos network engineering. It is a proxy server that simulates many kinds of network misbehavior.</li><li><a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/tech-policy/2023/09/fcc-details-plan-to-restore-the-net-neutrality-rules-repealed-by-ajit-pai/" target="_blank">Network neutrality rises again</a>: The chair of the FCC has proposed returning to Obama-era network neutrality rules, in which carriers couldn’t prioritize traffic from some users in exchange for payment. Laws in some states, such as California, have largely prevented traffic prioritization, but a return of network neutrality would provide a uniform regulatory framework.</li><li>Most VPNs (even VPNs that don’t log traffic) track user activity. <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://obscuravpn.io/" target="_blank">Obscura</a> is a new VPN that was designed for privacy, and that cannot track activity.</li></ul>



<h2>Biology</h2>



<ul><li>The US Fish &amp; Wildlife Service is creating a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/science/2023/10/biodiversity-library-will-help-preserve-genetic-diversity-in-endangered-species/" target="_blank">biodiversity library</a>. The library’s goal is to preserve tissue samples from all endangered species in the US. The animals’ DNA will be sequenced and uploaded to <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.ncbi.nlm.nih.gov/genbank/" target="_blank">GenBank</a>, a collection of all publicly available DNA sequences.</li></ul>



<h2>Quantum Computing</h2>



<ul><li>Atom Computing claims to have built a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arstechnica.com/science/2023/10/atom-computing-is-the-first-to-announce-a-1000-qubit-quantum-computer/" target="_blank">1,000 qubit quantum computer</a>. While this is still too small to do real work, it’s the largest quantum computer we know about; it looks like it can scale to (somewhat) larger sizes; and it doesn’t require extreme cold.</li><li>Two research teams have <a href="https://phys.org/news/2023-10-erase-quantum-errors.html" target="_blank" rel="noreferrer noopener" aria-label=" (opens in a new tab)">made</a> <a href="https://phys.org/news/2023-10-self-correcting-quantum.html" target="_blank" rel="noreferrer noopener" aria-label=" (opens in a new tab)">progress</a> in quantum error correction. Lately, we’ve seen several groups reporting progress in QEC, which is key to making quantum computing practical. Will this soon be a solved problem?</li></ul>



<h2>Robotics</h2>



<ul><li>This article’s title is all you need: <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.theverge.com/2023/10/26/23933213/boston-dynamics-robot-dog-spot-top-hat" target="_blank">Boston Dynamics turned its robotic dog into a walking tour guide using ChatGPT</a>. It can give a tour of Boston Dynamics’ facilities in which it answers questions, using data from its cameras to provide added context. And it has a British accent.</li><li>Another <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://techxplore.com/news/2023-10-ai-approach-yields-athletically-intelligent.html" target="_blank">autonomous robotic dog</a> can plan and execute actions in complex environments. While its agility is impressive, what sets it apart is the ability to plan actions to achieve a goal, taking into account the objects that it sees.</li><li>A <a href="https://techxplore.com/news/2023-09-multi-purpose-robot.html" target="_blank" rel="noreferrer noopener" aria-label=" (opens in a new tab)">tetrahedral robot</a> is able to change its shape and size, use several different styles of walking, and adapt itself to different tasks.</li></ul>
]]></content:encoded>
							<wfw:commentRss>https://www.oreilly.com/radar/radar-trends-to-watch-november-2023/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
							</item>
		<item>
		<title>Questions for 2024</title>
		<link>https://www.oreilly.com/radar/questions-for-2024/</link>
				<comments>https://www.oreilly.com/radar/questions-for-2024/#respond</comments>
				<pubDate>Tue, 31 Oct 2023 10:11:05 +0000</pubDate>
		<dc:creator><![CDATA[Mike Loukides]]></dc:creator>
				<category><![CDATA[AI & ML]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Radar Column]]></category>
		<category><![CDATA[Signals]]></category>

		<guid isPermaLink="false">https://www.oreilly.com/radar/?p=15230</guid>
				<description><![CDATA[This time of year, everyone publishes predictions. They’re fun, but I don’t find them a good source of insight into what’s happening in technology. Instead of predictions, I’d prefer to look at questions: What are the questions to which I’d like answers as 2023 draws to a close? What are the unknowns that will shape [&#8230;]]]></description>
								<content:encoded><![CDATA[
<p>This time of year, everyone publishes predictions. They’re fun, but I don’t find them a good source of insight into what’s happening in technology.</p>



<p>Instead of predictions, I’d prefer to look at questions: What are the questions to which I’d like answers as 2023 draws to a close? What are the unknowns that will shape 2024? That’s what I’d really like to know. Yes, I could flip a coin or two and turn these into predictions, but I&#8217;d rather leave them open-ended. Questions don’t give us the security of an answer. They force us to think, and to continue thinking. And they let us pose problems that we really can’t think about if we limit ourselves to predictions like “While individual users are getting bored with ChatGPT, enterprise use of Generative AI will continue to grow.” (Which, as predictions go, is pretty good.)</p>



<h2>The Lawyers Are Coming</h2>



<p><strong>The year of tech regulation</strong>: Outside of the EU, we may be underwhelmed by the amount of proposed regulation that becomes law. However, discussion of regulation will be a major pastime of the chattering classes, and major technology companies (and venture capital firms) will be maneuvering to ensure that regulation benefits them. Regulation is a double-edged sword: while it may limit what you can do, if compliance is difficult, it gives established companies an advantage over smaller competition.</p>



<p>Three specific areas need watching:</p>



<ul><li>What regulations will be proposed for AI? Many ideas are in the air; watch for changes in copyright law, privacy, and harmful use.</li><li>What regulations will be proposed for “online safety”? Many of the proposals we’ve seen are little more than <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.lawfaremedia.org/article/the-shapeshifting-crypto-wars" target="_blank">hidden attacks against cryptographically secure communications</a>.</li><li>Will we see more countries and states develop privacy regulations? The EU has led with GDPR. However, effective privacy regulation comes into direct conflict with online safety, as those ideas are often formulated. Which will win out?</li></ul>



<p><strong>Organized labor:</strong> Unions are back. How will this affect technology? I doubt that we’ll see strikes at major technology companies like Google and Amazon—but we’ve already seen a union at Bandcamp. Could this become a trend? X (Twitter) employees have plenty to be unhappy about, though many of them have immigration complications that would make unionization difficult.</p>



<p><strong>The backlash against the backlash against open source</strong>: Over the past decade, a number of corporate software projects have changed from an open source license, such as Apache, to one of a number of “business source” licenses. These licenses vary, but typically restrict users from competing with the project’s vendor. When HashiCorp relicensed their widely used Terraform product as business source, their community’s reaction was strong and immediate. They formed an OpenTF consortion and forked the last open source version of Terraform, renaming it OpenTofu; OpenTofu was quickly adopted under the Linux Foundation’s mantle and appears to have significant traction among developers. In response, HashiCorp’s CEO has predicted that the rejection of business source licenses will be the end of open source.</p>



<ul><li>As more corporate sponsors adopt business sources licenses, will we see more forks?</li><li>Will OpenTofu survive in competition with Terraform?</li></ul>



<p>A decade ago, we said that open source has won. More recently, developers questioned open source’s relevance in an era of web giants. In 2023, the struggle resumed. By the end of 2024, we’ll know a lot more about the answers to these questions.</p>



<h2>Simpler, Please</h2>



<p><strong>Kubernetes</strong>: Everyone (well, almost everyone) is using Kubernetes to orchestrate large applications that are running in the cloud. And everyone (well, almost everyone) thinks Kubernetes is too complex. That’s no doubt true; prior to its release as an open source project, Kubernetes was Google’s Borg, the almost legendary software that ran their core applications. Kubernetes was designed for Google-scale deployments, but very few organizations need that.</p>



<p>We’ve long thought that a simpler alternative to Kubernetes would arrive. We haven’t seen it. We have seen some simplifications built on top of Kubernetes: K3s is one; Harpoon is a no-code drag-and-drop tool for managing Kubernetes. And all the major cloud providers offer “managed Kubernetes” services that take care of Kubernetes for you.</p>



<p>So our questions about container orchestration are:</p>



<ul><li>Will we see a simpler alternative that succeeds in the marketplace? There are some alternatives out there now, but they haven’t gained traction.</li><li>Are simplification layers on top of Kubernetes enough? Simplification usually comes with limitations: users find most of what they want but frequently miss one feature they need.</li></ul>



<p><strong>From microservices to monolith</strong>: While microservices have dominated the discussion of software architecture, there have always been other voices arguing that microservices are too complex, and that monolithic applications are the way to go. Those voices are becoming more vocal. We’ve heard lots about organizations decomposing their monoliths to build collections of microservices—but in the past year we’ve heard more about organizations going the other way. So we need to ask:</p>



<ul><li>Is this the year of the monolith?</li><li>Will the “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.thoughtworks.com/en-us/insights/blog/microservices/modular-monolith-better-way-build-software" target="_blank">modular monolith</a>” gain traction?</li><li>When do companies need microservices?</li></ul>



<h2>Securing Your AI</h2>



<p><strong>AI systems are not secure</strong>: Large language models are vulnerable to new attacks like prompt injection, in which adversarial input directs the model to ignore its instructions and produce hostile output. Multimodal models share this vulnerability: it’s possible to submit an image with an invisible prompt to ChatGPT and corrupt its behavior. There is no known solution to this problem; there may never be one.</p>



<p>With that in mind, we have to ask: </p>



<ul><li>When will we see a major, successful hostile attack against generative AI? (I’d bet it will happen before the end of 2024. That’s a prediction. The clock is ticking.)</li><li>Will we see a solution to prompt injection, data poisoning, model leakage, and other attacks?</li></ul>



<h2>Not Dead Yet</h2>



<p><strong>The metaverse:</strong> It isn’t dead, but it’s not what Zuckerberg or Tim Cook thought. We’ll discover that the metaverse isn’t about wearing goggles, and it certainly isn’t about walled-off gardens. It’s about better tools for collaboration and presence. While this isn’t a big trend, we’ve seen an upswing in developers working with CRDTs and other tools for decentralized frictionless collaboration.</p>



<p><strong>NFTs</strong>: NFTs are a solution looking for a problem. Enabling people with money to prove they can spend their money on bad art wasn’t a problem many people wanted to solve. But there are problems out there that they could solve, such as maintaining public records in an open immutable database. Will NFTs actually be used to solve any of these problems?</p>
]]></content:encoded>
							<wfw:commentRss>https://www.oreilly.com/radar/questions-for-2024/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
							</item>
		<item>
		<title>Preliminary Thoughts on the White House Executive Order on AI</title>
		<link>https://www.oreilly.com/radar/preliminary-thoughts-on-the-white-house-executive-order-on-ai/</link>
				<comments>https://www.oreilly.com/radar/preliminary-thoughts-on-the-white-house-executive-order-on-ai/#respond</comments>
				<pubDate>Mon, 30 Oct 2023 20:36:22 +0000</pubDate>
		<dc:creator><![CDATA[Tim O’Reilly]]></dc:creator>
				<category><![CDATA[AI & ML]]></category>
		<category><![CDATA[Commentary]]></category>

		<guid isPermaLink="false">https://www.oreilly.com/radar/?p=15232</guid>
				<description><![CDATA[Disclaimer: Based on the announcement of the EO, without having seen the full text. Overall, the Executive Order is a great piece of work, displaying a great deal of both expertise and thoughtfulness. It balances optimism about the potential of AI with reasonable consideration&#160;of the risks. And it doesn&#8217;t rush headlong into new regulations or&#160;the [&#8230;]]]></description>
								<content:encoded><![CDATA[
<p><strong>Disclaimer:</strong> Based on <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.whitehouse.gov/briefing-room/statements-releases/2023/10/30/fact-sheet-president-biden-issues-executive-order-on-safe-secure-and-trustworthy-artificial-intelligence/" target="_blank">the announcement of the EO</a>, without having seen the full text.</p>



<p>Overall, the Executive Order is a great piece of work, displaying a great deal of both expertise and thoughtfulness. It balances optimism about the potential of AI with reasonable consideration&nbsp;of the risks. And it doesn&#8217;t rush headlong into new regulations or&nbsp;<a rel="noreferrer noopener" href="https://www.linkedin.com/pulse/shiny-ai-moment-needs-whats-much-less-jennifer-pahlka-bqysc/" target="_blank">the creation of new agencies</a>, but instead directs existing agencies and organizations to understand and apply AI to their mission and areas of oversight. The EO also does an impressive job of highlighting the need to bring more AI talent into government. That&#8217;s a huge win.</p>



<p>Given my own research focus on&nbsp;<a rel="noreferrer noopener" href="https://www.oreilly.com/content/you-cant-regulate-what-you-dont-understand-2/" target="_blank">enhanced disclosures as the starting point for better AI regulation</a>, I was heartened to hear that the Executive Order on AI uses the Defense Production Act to compel disclosure of various data from the development of large AI models. Unfortunately,&nbsp;these disclosures do not go far enough. The EO seems to be requiring only data on the procedures and results of &#8220;Red Teaming&#8221; (i.e. adversarial testing to determine a model&#8217;s flaws and weak points), and not a wider range of information that would help to address many of the other concerns outlined in the EO. These include:</p>



<ul><li><strong>What data sources the model is trained on.</strong> Availability of this information would assist in many of the other goals outlined in the EO, including addressing algorithmic discrimination and increasing competition in the AI market, as well as other important issues that the EO does not address, such as copyright. The recent discovery (documented by <a rel="noreferrer noopener" aria-label="an exposé in The Atlantic (opens in a new tab)" href="https://www.theatlantic.com/technology/archive/2023/09/books3-database-generative-ai-training-copyright-infringement/675363/" target="_blank">an exposé in The Atlantic</a>) that OpenAI, Meta, and others used databases of pirated books, for example, highlights the need for transparency in training data. Given the importance of intellectual property to the modern economy, copyright ought to be an important part of this executive order. Transparency on this issue will not only allow for debate and discussion of the intellectual property issues raised by AI, it will increase competition between developers of AI models to license high-quality data sources and to differentiate their models based on that quality. To take one example, would we be better off with the medical or legal advice from an AI that was trained only with the hodgepodge of knowledge to be found on the internet, or one trained on the full body of professional information on the topic?</li><li><strong>Operational Metrics. </strong>Like other internet-available services, AI models are not static artifacts, but dynamic systems that interact with their users. AI companies deploying these models manage and control them by measuring and responding to various factors, such as permitted, restricted, and forbidden uses; restricted and forbidden users; methods by which its policies are enforced; detection of machine-generated content, prompt-injection, and other cyber-security risks; usage by geography, and if measured, by demographics and psychographics; new risks and vulnerabilities identified during operation that go beyond those detected in the training phase; and much more. These should not be a random grab-bag of measures thought up by outside regulators or advocates, but <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.oreilly.com/content/you-cant-regulate-what-you-dont-understand-2/" target="_blank">disclosures of the actual measurements and methods that the companies use to manage their AI systems</a>.</li><li><strong>Policy on use of user data for further training.</strong> AI companies typically treat input from their users as additional data available for training. This has both privacy and intellectual property implications.</li><li><strong>Procedures by which the AI provider will respond to user feedback and complaints. </strong>This should include its proposed redress mechanisms.</li><li><strong>Methods by which the AI provider manages and mitigates risks identified via Red Teaming, including their effectiveness. </strong>This reporting should not just be &#8220;once and done,&#8221; but an ongoing process that allows the researchers, regulators, and the public to understand whether the models are improving or declining in their ability to manage the identified new risks.</li><li><strong>Energy usage and other environmental impacts.</strong> There has been a lot of fear-mongering about the energy costs of AI and its potential impact in a warming world. Disclosure of the actual amount of energy used for training and operating AI models would allow for a much more reasoned discussion of the issue.</li></ul>



<p>These are only a few off-the-cuff suggestions. Ideally, once a full range of required disclosures has been identified, they should be overseen by either an existing governmental standards body, or a non-profit akin to the Financial Accounting Standards Board (FASB) that oversees accounting standards. This is a rapidly-evolving field, and so disclosure is not going to be a &#8220;once-and-done&#8221; kind of activity.&nbsp;We are still in the early stages of the AI era, and innovation should be allowed to flourish. But this places an even greater emphasis on the need for transparency, and the establishment of baseline reporting frameworks that will allow regulators, investors, and the public to measure how successfully AI developers are managing the risks, and whether AI systems are getting better or worse over time.</p>



<p></p>



<h2><strong>Update</strong></h2>



<p>After reading the details found in&nbsp;<a rel="noreferrer noopener" href="https://www.whitehouse.gov/briefing-room/presidential-actions/2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/" target="_blank">the full Executive Order on AI</a>, rather than just the White House summary, I am far less positive about the impact of this order, and what appeared to be the first steps towards a robust disclosure regime, which is a necessary precursor to effective regulation. The EO will have no impact on the operations of current AI services like ChatGPT, Bard, and others under current development, since its requirements that model developers disclose the results of their &#8220;red teaming&#8221; of model behaviors and risks only apply to future models trained with orders of magnitude more compute power than any current model.&nbsp;<em>In short, the AI companies have convinced the Biden Administration that the only risks worth regulating are the science-fiction existential risks of far future AI rather than the clear and present risks in current models.</em></p>



<p>It is true that various agencies have been tasked with considering present risks such as discrimination in hiring, criminal justice applications, and housing, as well as impacts on the job market, healthcare, education, and competition in the AI market, but those efforts are in their infancy and years off. The most important effects of the EO, in the end, turn out to be the call to increase hiring of AI talent into those agencies, and to increase their capabilities to deal with the issues raised by AI. Those effects may be quite significant over the long run,&nbsp;but they will have little short-term impact.</p>



<p>In short, the big AI companies have hit a home run in heading off any effective regulation for some years to come.</p>
]]></content:encoded>
							<wfw:commentRss>https://www.oreilly.com/radar/preliminary-thoughts-on-the-white-house-executive-order-on-ai/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
							</item>
		<item>
		<title>Model Collapse: An Experiment</title>
		<link>https://www.oreilly.com/radar/model-collapse-an-experiment/</link>
				<comments>https://www.oreilly.com/radar/model-collapse-an-experiment/#respond</comments>
				<pubDate>Tue, 24 Oct 2023 10:07:56 +0000</pubDate>
		<dc:creator><![CDATA[Mike Loukides]]></dc:creator>
				<category><![CDATA[AI & ML]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Research]]></category>

		<guid isPermaLink="false">https://www.oreilly.com/radar/?p=15219</guid>
				<description><![CDATA[Ever since the current craze for AI-generated everything took hold, I’ve wondered: what will happen when the world is so full of AI-generated stuff (text, software, pictures, music) that our training sets for AI are dominated by content created by AI. We already see hints of that on GitHub: in February 2023, GitHub said that [&#8230;]]]></description>
								<content:encoded><![CDATA[
<p>Ever since the current craze for AI-generated everything took hold, I’ve wondered: what will happen when the world is so full of AI-generated stuff (text, software, pictures, music) that our training sets for AI are dominated by content created by AI. We already see hints of that on GitHub: in February 2023, GitHub <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://github.blog/2023-02-14-github-copilot-now-has-a-better-ai-model-and-new-capabilities/" target="_blank">said</a> that 46% of all the code checked in was written by Copilot. That’s good for the business, but what does that mean for future generations of Copilot? At some point in the near future, new models will be trained on code that they have written. The same is true for every other generative AI application: DALL-E 4 will be trained on data that includes images generated by DALL-E 3, Stable Diffusion, Midjourney, and others; GPT-5 will be trained on a set of texts that includes text generated by GPT-4; and so on. This is unavoidable. What does this mean for the quality of the output they generate? Will that quality improve or will it suffer?</p>



<p>I’m not the only person wondering about this. At least one research group has experimented with training a generative model on content generated by generative AI, and has found that the output, over successive generations, was more tightly constrained, and less likely to be original or unique. Generative AI output became more like itself over time, with less variation. They reported their results in “<a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/abs/2305.17493" target="_blank">The Curse of Recursion</a>,” a paper that’s well worth reading. (Andrew Ng’s <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://info.deeplearning.ai/gpt-4-opens-its-eyes-metas-generative-facelift-newsrooms-respond-to-ai-beware-training-on-generated-data" target="_blank">newsletter</a> has an excellent summary of this result.)</p>



<p>I don’t have the resources to recursively train large models, but I thought of a simple experiment that might be analogous. What would happen if you took a list of numbers, computed their mean and standard deviation, used those to generate a new list, and did that repeatedly? This experiment only requires simple statistics—no AI.</p>



<p>Although it doesn’t use AI, this experiment might still demonstrate how a model could collapse when trained on data it produced. In many respects, a generative model is a correlation engine.&nbsp;Given a prompt, it generates the word most likely to come next, then the word mostly to come after that, and so on. If the words “To be” pop out, the next word is reasonably likely to be “or”; the next word after that is even more likely to be “not”; and so on. The model’s predictions are, more or less, correlations: what word is most strongly correlated with what came before? If we train a new AI on its output, and repeat the process, what is the result? Do we end up with more variation, or less?</p>



<p>To answer these questions, I wrote a Python program that generated a long list of random numbers (1,000 elements) according to the Gaussian distribution with mean 0 and standard deviation 1. I took the mean and standard deviation of that list, and use those to generate another list of random numbers. I iterated 1,000 times, then recorded the final mean and standard deviation. This result was suggestive—the standard deviation of the final vector was almost always much smaller than the initial value of 1. But it varied widely, so I decided to perform the experiment (1,000 iterations) 1,000 times, and average the final standard deviation from each experiment. (1,000 experiments is overkill; 100 or even 10 will show similar results.)</p>



<p>When I did this, the standard deviation of the list gravitated (I won’t say “converged”) to roughly 0.45; although it still varied, it was almost always between 0.4 and 0.5. (I also computed the standard deviation of the standard deviations, though this wasn’t as interesting or suggestive.) This result was remarkable; my intuition told me that the standard deviation wouldn’t collapse. I expected it to stay close to 1, and the experiment would serve no purpose other than exercising my laptop’s fan. But with this initial result in hand, I couldn’t help going further. I increased the number of iterations again and again. As the number of iterations increased, the standard deviation of the final list got smaller and smaller, dropping to .0004 at 10,000 iterations.</p>



<figure class="wp-block-image"><img src="https://lh4.googleusercontent.com/MqQJnDFmvb6R_lvS2XRxm_FBcqMCkimPKOkvO42CCMU1pDJv881hxGYzfOPF4Xs1TSzZv9VaWTufGSTjDeebzciwdmmYeim2BAFLTUfDEd6V10c_E-_SZRJx0FCICZZQVqWTR2WYGKSgsupiJvEOMcA" alt="" /></figure>



<p>I think I know why. (It’s very likely that a real statistician would look at this problem and say “It’s an obvious consequence of the <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/Law_of_large_numbers" target="_blank">law of large numbers</a>.”) If you look at the standard deviations one iteration at a time, there’s a lot a variance. We generate the first list with a standard deviation of one, but when computing the standard deviation of that data, we’re likely to get a standard deviation of 1.1 or .9 or almost anything else. When you repeat the process many times, the standard deviations less than one, although they aren’t more likely, dominate. They shrink the “tail” of the distribution. When you generate a list of numbers with a standard deviation of 0.9, you’re much less likely to get a list with a standard deviation of 1.1—and more likely to get a standard deviation of 0.8. Once the tail of the distribution starts to disappear, it’s very unlikely to grow back.</p>



<p>What does this mean, if anything?</p>



<p>My experiment shows that if you feed the output of a random process back into its input, standard deviation collapses. This is exactly what the authors of “The Curse of Recursion” described when working directly with generative AI: “the tails of the distribution disappeared,” almost completely. My experiment provides a simplified way of thinking about collapse, and demonstrates that model collapse is something we should expect.</p>



<p>Model collapse presents AI development with a serious problem. On the surface, preventing it is easy: just exclude AI-generated data from training sets. But that’s not possible, at least now because tools for detecting AI-generated content have <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://arxiv.org/pdf/2306.15666.pdf" target="_blank">proven inaccurate</a>. Watermarking might help, although watermarking brings its <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://www.technologyreview.com/2023/08/09/1077516/watermarking-ai-trust-online/" target="_blank">own set of problems</a>, including whether developers of generative AI will implement it. Difficult as eliminating AI-generated content might be, collecting human-generated content could become an equally significant problem. If AI-generated content displaces human-generated content, quality human-generated content could be hard to find.</p>



<p>If that’s so, then the future of generative AI may be bleak. As the training data becomes ever more dominated by AI-generated output, its ability to surprise and delight will diminish. It will become predictable, dull, boring, and probably no less likely to “hallucinate” than it is now. To be unpredictable, interesting, and creative, we still need ourselves.</p>
]]></content:encoded>
							<wfw:commentRss>https://www.oreilly.com/radar/model-collapse-an-experiment/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
							</item>
	</channel>
</rss>
