The flurry of emerging generative AI offerings, including large language models, continues to pick up velocity, but that does not necessarily mean every enterprise can or wants to bolt such resources onto its technology strategy. For regulated institutions such as banks, a variety of considerations must come into play.
That was part of the debate at this week’s New York Enterprise Technology Meetup, where a panel discussed “Large Language Models in the Enterprise.” Daniel Chesley, an associate with venture capital firm Work-Bench, who hosts the meetup, moderated.
The lineup included execs from financial institutions Morgan Stanley and JPMorgan Chase as well as leadership from startups MosaicML, a builder of machine learning training systems, and Arthur, a developer of a firewall for LLMs.
For startups, the upswell in activity in this space has meant changing tactics and strategy quickly to find the right angle of attack. “When the LLM thing first exploded, really November 30th when they launched ChatGPT,” said Adam Wenchel, CEO and co-founder of Arthur. “We immediately jumped into action and expanded our observability into LLMs.”
He said the rush by some companies to get LLMs into production brought some concerning discoveries. “They quickly ran into a whole new set of challenges that are new to generative models,” Wenchel said. “That includes a couple of big ones — hallucination, toxicity, and then leaking sensitive data.”
Those and other issues led to some organizations hitting a wall that stalled production, he said, making it hard for his company to help them observe production models. Wenchel said Arthur went to work on this problem to get models into production more quickly.
Though Jonathan Frankle, co-founder and chief scientist at MosaicML, said he has seen a variety of LLM use cases take hold in the enterprise, some areas have not been as deeply explored as might be expected. “Not a lot of people are using these as chatbots,” he said, “which probably isn’t going to surprise anybody who’s actually trying to use this to extract useful information from data. A lot of the use cases I see are really what we would call extraction or summarization.”
This can include gleaning useful information from a long document, such as a legal opinion or a 10-K filing, Frankle said. “A lot of these use cases — they don’t look like chat at the end of the day,” he said. “Chat is in some sense a nice, fancy user interface on top of a model that you really want to do some productive stuff with.”
At least for now, there are limitations and concerns to consider in deciding what such resources are used for and how. “Putting these models into production without human supervision is a little scary,” Frankle said. “That’s not something I would recommend anyone do today in any kind of real serious use case where you need to be right.” By comparison, he said, extraction and summarization via LLM do not necessarily require extreme accuracy. “You just need to be useful,” Frankle said. “You need to get some information that would have taken you a long time to get that first sketch up.”
H. David Wu, of Morgan Stanley, speaks as Jonathan Frankle, with MosaicML, looks on. / Photo by Joao-Pierre S. Ruth
Banking and other institutions seem to have been looking for ways to leverage LLMs and other forms of generative AI even before ChatGPT took off, but the chatbot’s popularity might accelerate that exploration.
There has been an upsurge of AI/ML over the past five years, said Sage Lee, executive director of global tech, AI and ML at JPMorgan Chase. “We do think about how we govern AI/ML versus traditional models,” she said. “Typically, we would consider models versus nonmodels.”
Along with the overall growth of AI/ML in recent months, Lee said she also keeps abreast of operational and reputational risks the technology may pose. “Our focus right now is really establishing the right infrastructure,” she said, “and then also making sure that the dataset that we bring into that infrastructure is safely guarded.”
Those considerations include the handling of data, data persistence, and ensuring that data is wiped when necessary. Much like others in the financial services industry, JPMorgan Chase cannot escape its regulatory compliance requirements when it comes to data, which might become more intensive in a world growing more concerned about cybersecurity and privacy. “I’m not going to say it’s going to get worse,” Lee said. “It’s going to get more and more and more interesting.”
Vetting potential vendors of this technology, and ways to apply it, can require assessing everything, according to H. David Wu, managing director at Morgan Stanley. He said his organization was introduced to OpenAI in late 2021 – early 2022 and has been working with the company for more than a year. “The first use case was really around, ‘How do we take the intellectual capital that we think is differentiating? How do we hook up the GPT-4 capabilities to serve that up to … our financial advisers?’”
The release of ChatGPT last November changed the landscape drastically with a growing flood of comparable technology being announced. “This is absolutely overwhelming,” Wu said. “It’s very hard to keep up with the ecosystem. You’ve got existing technology companies that are rebranding themselves as AI companies. If everything in the 90s was a dot com, it feels like things are GPT, or dot AI.”
Time spent with OpenAI, he said, helped Morgan Stanley get a grasp of what is really needed to make the technology work and winnow out the chaff. “From a buyer perspective, we look at every aspect. We’ve looked at all these foundational models,” Wu said. “When we started with OpenAI, it felt like the only game in town. Now you’ve got the Anthropics, the Inflections, the Coheres, the Googles, the Microsofts, and it just goes on and on.” There are also apps trying to specialize in very specific use cases, he said.
There may be a desire to move fast with the spread of this technology, but heavily regulated institutions must still act in compliance, and that can mean asking many more questions before acting. “As a bank, we historically buy and integrate versus build,” Wu said. “It’s really our responsibility to see what’s out there, what’s changing, and figure out, ‘Are there people that harness these capabilities and solve the problem that we’ve not yet done, but also do in a way that we can get our regulators comfortable?’”