Data is the lifeblood of our modern, technology-driven world. It powers AI and the vast majority of technology-based services, delivered not only by the businesses and people we interact with, but also by the government and the agencies that support its function.
The ability to share and utilize data seamlessly is critical for efficient governance. However, a significant roadblock hindering data-driven decision-making lies in the difficulty of data sharing between federal agencies. Often referred to as “data silos” or “stovepipes”, the challenge of data sharing results from a combination of bureaucratic obstacles, security and privacy concerns, data format differences, and divergent data management practices across agencies. Consequently, realizing the mission of government agencies is hindered due to the limited exchange of necessary data.
On a recent GovFuture podcast, Taka Ariga, Chief Data Scientist and Director of the Innovation Lab at the Government Accountability Office (GAO), shares insights into the need for data collection and greater data sharing in the US federal government, as well as into the use and adoption of AI and advanced technologies.
Taka Ariga, Government Accountability Office (DuHonPhotography.com)
Expanding Innovation at the GAO and the Needs for Data Collection
“I have a dual role at the Government Accountability Office as GAO’s first Chief Data Scientist, but I also lead the work of our Innovation Lab. That’s really a recognition by GAO that for us to do our oversight effectively, we need to have a fairly prescriptive view on emerging technology, such as artificial intelligence, blockchain, cloud services, etc. GAO has a very important mandate to conduct federal government oversight of these programs. So we need to understand the inner mechanics of these technologies to address not only performance issues, but certainly any sort of a societal impact that, for example, AI may have. But the greedy strategy here is while we’re unpacking these inner mechanics of technology, why can’t we also think about how these capabilities might also enhance GAO’s oversight capacity?
Using data is nothing new to GAO. GAO has existed for a little over 100 years or so. And we really pride ourselves in that objective, nonpartisan quality work. That underpins the data-centric approach that we take. And so, but in the digital era, what that really changes, how can we think about alternative ways of collecting data? Typically, when GAO conducts an audit, we will issue data requests to the agency and that data gets extracted from the systems and we will do our analysis accordingly. But more and more, we now have opportunity to sort of generate data in a way that, for example, using drone, telemetry, and other sensory information, using extended reality where we might be able to embed machine learning models to, for example, auto count inventories and do some computer vision type of work.
And so those kinds of novel data sets are becoming increasingly part of GAO’s way of conducting oversight. So developing capacity in terms of how we collect them, how we process them, how do we apply data science to that kind of information, but also more importantly, how do we develop the kind of data-centric narrative so that we can support absorption of these complex topics in a way that enhances sort of congressional oversight on various topics.”
The GAO’s “Yellow Book Standard” For Data Privacy and Security
When data is shared or exchanged between organizations, often the topic of data privacy and security comes up, especially when dealing with taxpayer, citizen, and/or confidential data. Taka shares, “One principle that GAO lives by is what we call the yellow book standard. And that is a shorthand for the generally accepted government auditing standard. And within that standard, it requires all federal programs to be effective, efficient, economical, ethical and equitable. And as an auditor, we follow those same principles as well.
When we request information from an agency, we guarantee that we will protect that information at or above how that agency is protecting that information. We’re making sure that there’s a need to know basis. We make sure that we only request what we need in order to conduct our oversight work effectively. And so there’s a lot that we go through to make sure that while we are carrying out the important oversight function, we’re not applying undue burdens on the agency, but we also are not collecting information that we don’t need. And even if we do, we’re protecting them accordingly, so that it’s not a broad access even within GAO, for example, PII information or other sensitive data.”
In addition to the above, Taka adds that “part of the way that we ensure that we’re using data appropriately is through a multidisciplinary process. These are not just data scientists getting together and saying, ‘how might we analyze information?’ We will have conversations with our lawyers, with other subject matter experts to say: while there are benefits in sort of augmenting data in different ways, are there unintended consequences that we’re creating? And does that risk outweigh the benefit that we’re looking to generate? So it’s a very deliberate process that we take to make sure that we are safeguarding the information that we’re entrusted with.”
Challenges in IT Modernization
On the topic of modernizing IT systems and increasing a culture of data sharing, Taka shares that “Technology usually is not the challenge. You know, we have a lot of smart people here. We understand what the technology is, whether it’s cloud, whether it’s AI, whether it’s blockchain, and how they generally function. A lot of our focus really is around the cultural mindset.”
Taka goes on to provide a relevant example. “Even in infrastructure modernization, there may be a tendency to think about how you lift and shift from an on-premise environment into a cloud environment. Whereas from an innovation lab perspective, we typically focus on how we are cloud optimized, so not just lift and shift into a new environment. And part of that sort of process is also making sure that we’re reinventing our business processes to accommodate that modernization. Otherwise, just putting a faster, shinier processing capability on existing processes doesn’t always solve the challenges either. So we do put a lot of sort of focus around not only the cultural transformation, but also the change management process. Technology usually takes care of itself pretty straightforwardly.”
Data Sharing Struggles
When interacting with other agencies to accomplish their mission, the GAO faces challenges in data access and sharing. Taka explains, “Within the Innovation Lab, we aim to tackle really systemic challenges facing the federal government, whether they’re improper payments, whether they’re fraud related with identity verification. These are the kind of really challenging issues that I don’t think GAO has all of the answers to. We certainly have done work in all of these areas that we can draw from, but we want perspective, we want experiences of other agencies.
But statutorily, it’s very difficult for Article 1 agency to collaborate with Article 2 agency on top of auditors involving auditees. We have been able to sort of make progress in the area. For example, in the Joint Financial Management Improvement Program, JFMIP, it allows GAO to come together with OPM, OMB, and Treasury to collaborate on topics of mutual interest. In this case, specifically financial management, and that is a wide area for us to focus around, for example, use of AI, use of identity verification, how do we mitigate improper payments, use of blockchain technology, for example. So we’re trying to find the right forum for us to collaborate while individually maintaining our agency guardrails around independence, around sort of management duties. We always look for opportunities to collaborate because some of these systemic issues do require a sort of collective effort to help address.”
On the topic of data sharing, Taka relates, “I think that puts an additional emphasis around collaboration, knowledge sharing, and more importantly, the ability to speak freely around what they’re seeing so that we can be more preventative as opposed to reactive.
I would love for the public sector to continue to be forward thinking and adopting capabilities [such as AI]. Not reactive to it. And part of that at GAO, so we were doing at the cadence that we can be helpful to Congress thinking through some of these regulatory conversations. Part of that is in making sure that we’re accounting for change management, and cultural transformation is part of that conversation. So we’re not just doing technology for the sake of technology.”
One of the primary reasons behind the challenges of data sharing is the historically entrenched bureaucratic culture that fosters compartmentalization rather than collaboration. Because each agency operates with distinct missions, goals, and priorities, sharing sensitive data can be perceived as a risk, leading to a reluctance to cooperate. Moreover, the lack of standardized data formats and the prevalence of incompatible systems further exacerbate the problem, impeding seamless data interchange and analysis. As a result, essential insights that could inform and enhance policy-making, public services, and national security efforts remain locked within the confines of individual agencies.
As is often the case with technology, the most difficult parts of change and adoption are related to people and processes, not the technology itself. You can hear more on this topic, along with further details and insights from Taka Ariga of the GAO, on the GovFuture podcast.
Disclosure: Ronald Schmelzer is an Executive Director at GovFuture.