Structured Data Management for Discovery and Insight

Structured data is the lifeblood of enterprise decision-making, and it’s essential to building models that help identify root causes, predict outcomes, and prescribe actions.

To do all that, structured data must be accurate, trusted, protected, and accessible to users across the enterprise. Data management is also the fastest-growing category of infrastructure spending in 2022, according to a recent report from IDC.

Unstructured data, which is usually in forms like text, images, video, sound, and voice, was incredibly difficult to work with only a few years ago.

However, with the advancements in artificial intelligence and machine learning (AI/ML), it can now be turned into a structured form that provides insights, which help people and applications to act on the information that was hidden in the unstructured data.

“Data is managed differently at different parts of its lifecycle, and can have varying value as well,” explains Bret Greenstein, partner, data and analytics at PwC. “Some data is very time sensitive — milliseconds when validating a transaction, to days for supply and demand forecasting, to months for pricing — depending on the industry and use case.”

Additionally, data is often transformed as it moves through an enterprise, so its meaning and impact can change.

“It is important to understand the value of data at every step to understand if it can be discarded, used for historical reference, or aggregated with other data to generate a new insight,” Greenstein says.

He explains that current best practices for structured data management include using cloud-native data warehouses to integrate data from multiple sources across the enterprise (ERP, HR, and CRM systems) into an enterprise data model.

Such a model includes tables and views designed to make it easier for consumers of the data to build insights, reports, dashboards, and models.
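As a rough illustration of what such a model can look like, the sketch below uses Python's built-in sqlite3 module as a stand-in for a cloud data warehouse. The table names (erp_orders, crm_customers, hr_employees), the view name, and the sample rows are all hypothetical and are not drawn from Greenstein's comments; the point is only that a view joining the source systems gives data consumers one place to query.

```python
import sqlite3

# In-memory database standing in for a warehouse; all names and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE erp_orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, rep_id INTEGER, amount REAL);
CREATE TABLE crm_customers (customer_id INTEGER PRIMARY KEY, name TEXT, segment TEXT);
CREATE TABLE hr_employees (employee_id INTEGER PRIMARY KEY, name TEXT, region TEXT);

INSERT INTO erp_orders VALUES (1, 10, 100, 250.0), (2, 10, 101, 100.0), (3, 11, 100, 75.5);
INSERT INTO crm_customers VALUES (10, 'Acme', 'Enterprise'), (11, 'Birch', 'SMB');
INSERT INTO hr_employees VALUES (100, 'Dana', 'East'), (101, 'Lee', 'West');

-- A consumer-facing view that joins the three source systems.
CREATE VIEW sales_by_segment_region AS
SELECT c.segment, e.region, SUM(o.amount) AS total_amount, COUNT(*) AS order_count
FROM erp_orders o
JOIN crm_customers c ON c.customer_id = o.customer_id
JOIN hr_employees e ON e.employee_id = o.rep_id
GROUP BY c.segment, e.region;
""")

# Report and dashboard builders query the view instead of the raw source tables.
rows = conn.execute(
    "SELECT segment, region, total_amount, order_count "
    "FROM sales_by_segment_region ORDER BY segment, region"
).fetchall()
for row in rows:
    print(row)
```

In a real warehouse the source tables would be loaded by integration pipelines rather than inline inserts, but the consumer-facing view plays the same role.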

“Each major cloud provider has their own data warehouse technologies, and several companies have cloud data warehouses that run on any cloud as well,” he notes. “You will often hear data lakes mentioned as well.”

Greenstein says data lakes are good places for companies to manage unstructured data alongside structured data, which makes it easier to apply advanced analytics and AI/ML.

Defining a Structured Data Strategy

Ed Macosky, chief innovation officer at platform-as-a-service provider Boomi, says that before managing structured data, leaders first need to devise a comprehensive and nuanced data strategy. “In a time where organizations are dealing with massive amounts of data, it is imperative for leaders to understand that not all data is equal,” he says. “Not all data requires data management or governance.”

For example, personally identifiable information (PII) is one type of data that must be secured, managed, and governed to corporate standards. Data that is neither PII nor critical to the business or its decisions, however, can remain in its natural state for analytical needs.

“Once data is sorted and classified based on governance parameters, leaders can then begin to understand the value of the data, when and where to place protection, and how to make it accessible and understandable to those who need it,” Macosky adds.
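The sorting step Macosky describes could be sketched as a simple rule set. The example below is a minimal sketch under stated assumptions: the Dataset fields and the three tier names are hypothetical, and a real classification scheme would follow an organization's own governance parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical catalog entry; the flags are illustrative governance parameters."""
    name: str
    contains_pii: bool
    business_critical: bool
    drives_decisions: bool


def governance_tier(ds: Dataset) -> str:
    """Map a dataset to an assumed governance tier."""
    if ds.contains_pii:
        # PII must be secured, managed, and governed to corporate standards.
        return "governed-restricted"
    if ds.business_critical or ds.drives_decisions:
        return "governed"
    # Everything else can stay in its natural state for analytical use.
    return "ungoverned"


catalog = [
    Dataset("customer_contacts", contains_pii=True, business_critical=True, drives_decisions=False),
    Dataset("quarterly_revenue", contains_pii=False, business_critical=True, drives_decisions=True),
    Dataset("web_clickstream_raw", contains_pii=False, business_critical=False, drives_decisions=False),
]
tiers = {ds.name: governance_tier(ds) for ds in catalog}
print(tiers)
```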

Eliud Polanco, president of Fluree, an open-source semantic graph database company, says another best practice is to manage full-cycle data lineage. Lineage tracks and traces the provenance of data across its lifecycle so that data consumers can verify the integrity of the information they are using.

“The ability to manage and maintain a record of a given dataset’s lifetime is critical to enhancing the credibility of that data,” he explains.
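One way to picture a verifiable lifetime record is an append-only log in which each entry carries a hash of the entry before it, so later tampering can be detected. This is only an illustrative sketch, not a description of how Fluree or any particular product implements lineage; the LineageLog class, its fields, and the sample events are assumptions.

```python
import hashlib
import json
from dataclasses import dataclass, field


def _digest(body: dict) -> str:
    """Stable SHA-256 hash of an event body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


@dataclass
class LineageLog:
    """Hypothetical append-only lineage record for one dataset."""
    dataset: str
    events: list = field(default_factory=list)

    def record(self, step: str, source: str, details: str) -> dict:
        # Each event stores the previous event's hash, forming a chain.
        prev_hash = self.events[-1]["hash"] if self.events else ""
        body = {"dataset": self.dataset, "step": step, "source": source,
                "details": details, "prev_hash": prev_hash}
        body["hash"] = _digest(body)
        self.events.append(body)
        return body

    def verify(self) -> bool:
        # Recompute every hash; an edited event breaks the chain.
        prev_hash = ""
        for event in self.events:
            body = {k: v for k, v in event.items() if k != "hash"}
            if event["prev_hash"] != prev_hash or _digest(body) != event["hash"]:
                return False
            prev_hash = event["hash"]
        return True


log = LineageLog("orders")
log.record("ingest", "erp_export", "loaded nightly extract")
log.record("transform", "orders_raw", "converted currency to USD")
print(log.verify())
```

A data consumer who can run verify() has some assurance that the recorded history has not been altered since it was written.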

C-Suite Data Stakeholders Should Collaborate

Polanco says the chief data officer, chief compliance officer, and CISO should collaborate on an effective structured data management practice, one that provides a well-governed, fully compliant data architecture connecting data sources to data consumers.

“Data must be findable, accessible, interoperable, and re-usable for [data] consumers, while also ensuring compliance with data quality standards and data security and privacy measures,” he adds.

Anyone in a managerial position who encounters data will likely have considered best practices for data management already. “While those managers may be responsible for implementing data management resources for their respective teams, the initial solution can come from technology companies that weld together the manual knowledge of what the data needs to look like and the efficiency of a more automated sorting process,” Polanco says.

Macosky adds that while the chief data officer position is fairly new across industries, he expects the role to become more important as organizations prioritize and value data management.

“It is up to the data governance council to be responsible for executing the strategy and building curated, secure, and well-managed data to support C-level strategy and decision-making,” he says.

Reaching the Full Potential of Structured Data

From Polanco’s perspective, to leverage the true potential of enterprise data, an organization must build semantic frameworks for universal interoperability, as well as put data lineage management in place.

“By future-proofing structured data for re-use along the data value chain, data can become a trustworthy, secure, and accessible strategic asset across enterprise functions,” he says.

Macosky says both structured and unstructured data management should be part of an organization’s data governance strategy, and the company must be intentional and selective when identifying which data is critical.

“To help with these complexities, IT teams can use data tools to scan unstructured data to identify data patterns and attributes that fit into PII or mission-critical data categories,” he says. “This will let leaders know which data sets should be managed, secured, and governed accordingly.”
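A very small version of such a scan can be written with regular expressions. The sketch below is an assumption-laden example rather than a production detector: the three patterns (email address, US Social Security number, US phone number) and the scan_for_pii helper are hypothetical, and commercial tools typically rely on far richer rules and models.

```python
import re

# Illustrative patterns only; real PII detection needs broader, validated rules.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "us_phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def scan_for_pii(text: str) -> dict:
    """Return the matches found for each assumed PII category."""
    found = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[label] = matches
    return found


note = "Contact jane.doe@example.com or 555-123-4567; SSN 123-45-6789."
hits = scan_for_pii(note)
print(hits)
```

Documents that return any hits could then be routed to the managed, secured, and governed category, while documents with none stay in their natural state.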

With data becoming a critical part of business strategy, structured data management lets organizations operate with more agility and build competitive advantage in their markets.

“To outperform competitors, build faster time-to-analytics, and intelligently respond to emerging data privacy concerns, building a full-cycle structured data management practice is critical,” Polanco says.

https://www.informationweek.com/big-data/structured-data-management-for-discovery-and-insight
