The race among companies to adopt AI has evolved: Instead of simply striving to be first, firms have turned their attention to learning how to deploy these powerful tools effectively. The shift comes as companies discover that poorly crafted prompts (the instructions used to tell an AI model what to do) and the use of unspecialized models are producing inaccuracies and inefficiencies.
There are many examples of this evolution. Firms like Johnson & Johnson are creating libraries of prompts to share among staff to improve the quality of AI output. Other companies, including Starbucks, are taking things further by creating in-house models.
For context, it’s helpful to know that using a so-called large language model, like ChatGPT, requires entering a prompt for the AI that can be as simple as "summarize this story."
"There are two parts of prompting. One part is just to give a good description of what you want done," Christopher Manning, professor of computer science and linguistics at Stanford University and director of the Stanford Artificial Intelligence Laboratory, told Fortune. "After that, there's a lot of fiddling that goes on because people quickly find that some prompts work better than others. It turns out that giving grandmotherly instructions like 'make sure you think carefully about it' actually tend to do good."
Prompt libraries can be as simple as a collection of conversation starters, like the one Johnson & Johnson uses to reduce friction for employees working with its internal generative AI chatbot.
"We’re using [our chatbot] to upload internal documents and create summaries or ask ad hoc questions," a spokesperson for Johnson & Johnson said. "We created a prompt library with thought-starters to help employees explore potential use cases relevant to different areas of the business."
Meanwhile, other prompts aim to minimize the risk of hallucination (the term for AI producing plausible-sounding facts that aren't true) or to format answers in the most efficient way possible.
"We have different prompting libraries depending on the use case or expected output," Christian Pierre, Chief Intelligence Officer & Partner at creative agency GUT, told Fortune. "[Our] strategists and data analysts share a library and a 'keyword cheat sheet' with specific keywords that can drastically change the output. For instance, we know that just by adding 'Provide it as a boolean query,' ChatGPT will write boolean queries that we can use in our social listening tools."
Undesired outputs often result from gaps in the underlying data. For example, a language model will likely supply an answer to a prompt asking why John got hit by a car, even if it has no information about John or the accident in the first place.
"The tendency of all of these models is that if there are facts, they will use them." Professor Manning said, "And if there aren't facts, [it will] write plausible ones with no basis in truth."
Crafting the perfect prompt thus requires providing extensive context, tweaking keywords, and precisely describing the desired output. Those who pride themselves on doing so call themselves prompt engineers.
Taking it even further
Unfortunately, even the most optimized prompt can fall short of what big companies are looking for.
"These large language models are very generically trained," Beatriz Sanz Sáiz, global consulting data and AI leader at Ernst & Young, told Fortune. "What we are trying to achieve is really bring in the best, let's say tax professionals, to really fine-tune, retain, and retrain."
Ernst & Young has created an in-house AI platform called EY.ai. Microsoft gave the firm early access to Azure OpenAI so it could build a secure, specialized system. That access has helped speed up the platform and protect sensitive data, and, most importantly, it has given EY the ability to adjust the model to fit its desired outcomes.
"If there's one task that you want to do—like reading insurance claims, writing out what we're doing with them, and what the reason for it was—and if you've got a fair number of examples of that from your past business," Professor Manning explained, "you can then fine-tune the model to be especially good at that."
Fine-tuning is done by someone with machine learning experience rather than a prompt engineer. At this stage, the company may also decide to shrink the training data, removing whatever is unnecessary (the ability to write haiku, say) to focus the model on a specific function.
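Manning's insurance-claims example maps onto the fine-tuning workflow hosted providers expose: gather past claims and the write-ups humans produced for them, upload the pairs, and start a training job. Here is a rough sketch against OpenAI's fine-tuning API; the claims.jsonl file and the base-model choice are hypothetical stand-ins for a company's own data and preferences:

    from openai import OpenAI

    client = OpenAI()

    # claims.jsonl is hypothetical: one JSON object per line, each holding a
    # "messages" list that pairs a raw claim with the write-up a human produced.
    training_file = client.files.create(
        file=open("claims.jsonl", "rb"),
        purpose="fine-tune",
    )

    # Kick off the fine-tuning job; the base model here is illustrative.
    job = client.fine_tuning.jobs.create(
        training_file=training_file.id,
        model="gpt-4o-mini-2024-07-18",
    )
    print(job.id, job.status)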
Ernst & Young has specialized its system further by creating a library of embeddings.
"Think of [embeddings] as additional data sets that you put into the model," Sáiz said. "We can connect all the dots by bringing together the tax knowledge, the country regulation, maybe also the sector knowledge."
Plugging in these additional datasets makes the model hyper-specific to its purpose. Companies are finding the best AI recipe entails building on a controlled dataset, injecting it with a library of embeddings, and querying it with customized prompts.
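One common way of plugging those datasets in is retrieval: embed the internal documents once, embed each question as it arrives, and prepend the closest matches to the prompt. The sketch below shows that general pattern, not EY's platform, using OpenAI's embeddings endpoint and a cosine-similarity lookup; the documents and model name are made up for illustration:

    import numpy as np
    from openai import OpenAI

    client = OpenAI()

    # Hypothetical internal documents: tax guidance, country regulation, sector notes.
    documents = [
        "Country A: VAT registration is required above a 50,000 EUR turnover threshold.",
        "Sector note: cross-border digital services are taxed where the customer resides.",
    ]

    def embed(texts):
        # text-embedding-3-small is an illustrative model choice.
        result = client.embeddings.create(model="text-embedding-3-small", input=texts)
        return np.array([item.embedding for item in result.data])

    doc_vectors = embed(documents)

    def top_match(question):
        q = embed([question])[0]
        # Cosine similarity between the question and each document vector.
        scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
        return documents[int(np.argmax(scores))]

    context = top_match("When does a company need to register for VAT in Country A?")
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: ..."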
"Typically now what we'll be able to do is assess clients, not on the expertise of an individual tax team but on the collective knowledge that EY has created for years," Sáiz explained. "And not just in one jurisdiction, but globally across multiple jurisdictions."
Sáiz believes that the personalization of AI models through finely tuned in-house models and embedding libraries will be crucial to the future of companies using AI. She also predicts that the importance of prompts will decrease as AI gets more intelligent.
However, Professor Manning believes the future will be mixed: While specialized systems will handle high-volume tasks, there is also room for generalized models that rely on well-engineered prompts for irregular tasks, such as writing a job advertisement.
"Those are great tasks you can give to ChatGPT these days," Professor Manning told Fortune. "I think a huge space of companies can very successfully have someone who learns a bit and gets perfectly decent at writing prompts and getting great results out of ChatGPT."