
Metadata knowledge graphs and data integration future - SiliconANGLE

Unpacking the next data platform is a crucial process in the constantly changing world of data and artificial intelligence. It involves understanding metadata knowledge graphs and how different layers of the modern data stack come together.

If one wants to do anything with data, they need a stack of tools to get it done. The stack has not changed even with all of the innovation that’s been happening in the data industry, according to Gaurav Pathak (pictured, right), vice president of product management AI and metadata at Informatica Inc.


George Fraser of Fivetran and Gaurav Pathak of Informatica talk with theCUBE about metadata knowledge graphs.

“Players have changed. But that stack, moving the data from raw data to really processed insight, has remained quite similar with [metadata knowledge graphs],” Pathak said. “We are looking at triples, we are looking at relationships between individual metadata objects.”

Pathak and George Fraser (left), chief executive officer of Fivetran Inc., spoke with theCUBE Research’s Rob Strechay and George Gilbert at the Supercloud 7: Get Ready for the Next Data Platform event, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed the evolving data stack and the role of metadata knowledge graphs.

Metadata knowledge graphs enable more action

Informatica collects technical, business, operational and usage metadata about data assets, according to Pathak. That includes schema and structure information describing what the data looks like in platforms such as Snowflake and Databricks.

“We look at how is that pipeline created, what are the transformations, how [are] all of these things related to each other?” he said. “Having that triple, having that metadata knowledge graph, then allows you to now start doing, both human-wise and AI-wise, intelligence queries to the data ecosystem itself.”

With that graph in place, companies can ask questions about their own data estate, according to Pathak. One example is how many Iceberg tables a company has.

“How many of them are used by people in [the] marketing department? And how many of them are compliant with GDPR?” Pathak said. “Their data is not moving from one jurisdiction to another. These kind of questions are really, really hard to get early on. But with metadata knowledge graphs, with catalogs like these, these are now possible.”
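The kind of question Pathak describes can be illustrated with a minimal sketch of a triple store. This is not Informatica's implementation: the table names, predicate names (`has_format`, `used_by`, `compliant_with`) and the `subjects_matching` helper are all hypothetical, chosen only to show how metadata stored as subject-predicate-object triples can answer his three questions in sequence.

```python
from typing import Optional

Triple = tuple[str, str, str]

# Hypothetical metadata facts stored as (subject, predicate, object) triples.
TRIPLES: set[Triple] = {
    ("orders", "has_format", "iceberg"),
    ("customers", "has_format", "iceberg"),
    ("clickstream", "has_format", "iceberg"),
    ("invoices", "has_format", "parquet"),
    ("orders", "used_by", "marketing"),
    ("customers", "used_by", "marketing"),
    ("clickstream", "used_by", "engineering"),
    ("invoices", "used_by", "marketing"),
    ("orders", "compliant_with", "gdpr"),
    ("invoices", "compliant_with", "gdpr"),
}


def subjects_matching(
    triples: set[Triple],
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> set[str]:
    """Return the subjects of all triples matching the pattern (None is a wildcard)."""
    return {
        s
        for (s, p, o) in triples
        if (predicate is None or p == predicate) and (obj is None or o == obj)
    }


# How many Iceberg tables are there?
iceberg_tables = subjects_matching(TRIPLES, "has_format", "iceberg")

# How many of them are used by the marketing department?
marketing_iceberg = iceberg_tables & subjects_matching(TRIPLES, "used_by", "marketing")

# How many of those are GDPR compliant?
gdpr_marketing_iceberg = marketing_iceberg & subjects_matching(
    TRIPLES, "compliant_with", "gdpr"
)

print(len(iceberg_tables), len(marketing_iceberg), len(gdpr_marketing_iceberg))
```

Each question narrows the previous answer with a set intersection, which is the basic appeal of keeping the relationships in one graph rather than scattered across separate tools. A production catalog would presumably use a graph database or query language rather than in-memory sets.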

Now that metadata has been centralized, things such as usage, consumption, financial management and governance are much easier to manage. At the same time, a lot of new workloads are emerging, according to Fraser.

“That’s one of the big phenomenon that we’re seeing, is customers are doing more new workloads with their data. From a Fivetran perspective, that means new data types,” he said. “It means there are things that previously didn’t belong in the central data estate, now belong there. Mostly freeform text stuff. Fivetran has had connectors to systems like Zendesk and Slack for many years that have freeform text, but there’s a whole new emphasis on those systems.”

AI demands diverse, fresh data sources

Beyond freeform text, AI is also driving demand for more diverse sources of data, according to Fraser. The other pressure has to do with latency.

“Some of these more operational type of workflows that people want to do with AI agents and things like that, they require fresher data,” Fraser said. “The first Fivetran pipeline 10 years ago ran once a day. And now, the milestone we’re trying to get to is where we can reliably do one-minute latency for all data sources.”

Fivetran views all of these as workloads that run on the data it delivers. That evolution means more sources and new entities within existing sources, according to Fraser.

“It maybe means more adoption of data lakes as the compute engine people want to use to power some of these new workloads is maybe one that doesn’t even exist yet,” he said. “Those are the main, I think, evolutionary pressures that we are feeling from the data pipeline perspective.”

These technologies are extremely powerful, according to Pathak. More changes are on the horizon, too.

“What will change is that people have thought about code as something that needs to be maintained pristine. It has to be taken in for a long time. There was a whole ecosystem around it,” he said. “But if you have gen AI systems that can convert English into natural language statements and then take decisions on what’s the right formats, what are the right models to store that data in, I think that will be a very different world that we will live in.”

Stay tuned for the complete video interview, part of SiliconANGLE’s and theCUBE Research’s coverage of the Supercloud 7: Get Ready for the Next Data Platform event.

Photo: SiliconANGLE

Source: siliconangle.com
