We ❤️ Open Source
A community education resource
Building data adoption: Why organizations embrace or resist data systems
How to drive organization-wide adoption of your carefully built data systems.
In our previous article, we explored the “plumbing” of data infrastructure using a backyard drainage project analogy. As I was recovering from that weekend project, the whole world, myself included, heard about “Liberation Day,” the unprecedented tariff regime being proposed by the United States government. Politics aside, I’m intrigued by how businesses will respond to this daily influx of unpredictability. Naturally, I wonder how robust data infrastructure could help insulate organizations from drastic impacts, even when they have little control over the input decisions.
The tariffs are set to land unevenly, hitting some parts of the economy harder than others. Should companies immediately add the tariff to end prices? Which rate should they use? How quickly could these changes be reflected throughout their ecosystem: from point-of-sale systems to purchasing, from middle office to back office, from accounts payable to customer billing? We’ll revisit these questions throughout this article as we examine the second key concept of our data infrastructure series: How to drive organization-wide adoption of your carefully constructed data capabilities.
The adoption paradox
Throughout my career modernizing data systems, I’ve observed a consistent pattern that has become something of a personal axiom: It’s incredibly difficult to get organizations to adopt new systems, but once adoption begins, it’s nearly impossible to take those systems away.
This paradox explains why legacy systems persist for decades and why expensive modernization projects often struggle to deliver their promised ROI. The organizational resistance isn’t irrational—it’s deeply human.
Consider how this paradox might affect tariff implementation: Companies with modern, flexible data infrastructure can potentially implement the new tariff structure within days once finalized. Meanwhile, organizations reliant on legacy systems might need to resort to manual workarounds—Excel spreadsheets calculating surcharges that employees must manually apply—creating inconsistencies and errors that could persist long after the initial implementation.
The comfort of familiar complexity
People develop intimate relationships with their data systems, including all their quirks and inconsistencies. They become fluent in local data dialects: Is it “NY,” “N.Y.,” “New York,” or “NewYork”? These variations become second nature to experienced users.
These localized representations extend to every aspect of business data: customer identifiers, product codes, country codes, percentage and amount notations, transaction types, and status indicators. The result? A mosaic of data representations in which every additional system adds another set of translations, making enterprise-wide data utilization far more complex.
Consider this real-world example: In one system, an active customer might be represented by status code “A,” while in another system, the same status is indicated by a blank expiration date in the customer master record. To a human familiar with both systems, these different representations signify the same thing. But for advanced analytics or AI applications like large language models, these inconsistencies create significant barriers to accurate insights.
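To make the active-customer example concrete, here is a minimal Python sketch of the two conventions expressed as explicit rules. The system names, field names, and sample records are hypothetical and exist only for illustration; the point is that knowledge living in experienced users' heads can be written down as small, testable functions.

```python
from datetime import date
from typing import Optional


def is_active_system_a(status_code: str) -> bool:
    """Hypothetical System A: status code "A" marks an active customer."""
    return status_code.strip().upper() == "A"


def is_active_system_b(expiration_date: Optional[date]) -> bool:
    """Hypothetical System B: a blank expiration date marks an active customer."""
    return expiration_date is None


# Illustrative records for the same customer as each system might export them.
record_a = {"customer_id": "C-100", "status": "A"}
record_b = {"customer_id": "C-100", "expires": None}

# Both conventions reduce to the same boolean once the rule is explicit.
unified = {
    "customer_id": record_a["customer_id"],
    "active_in_a": is_active_system_a(record_a["status"]),
    "active_in_b": is_active_system_b(record_b["expires"]),
}
```

Once both rules produce the same boolean, downstream analytics or an AI application no longer needs to know which system a record came from.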
Now imagine the upcoming tariff implementation challenge: If country codes and percentage notations differ across internal systems, calculating the correct tariff will become nearly impossible at scale. When Vietnam appears as “VN” in one system, “Vietnam” in another, and perhaps “VNM” in a third, how can you uniformly apply country-specific tariff rates? Some products might get charged the wrong rate, others might escape the tariff entirely, and manual reconciliation could become a nightmare.
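One way to picture the fix is a small normalization step that maps every local spelling to a single canonical code before any rate lookup happens. The sketch below is an assumption-laden illustration: the alias table is hand-written (a real one would be derived from the ISO 3166-1 catalog), the function names are hypothetical, and the rate is a placeholder rather than an actual tariff figure.

```python
# Hypothetical alias table mapping local spellings to ISO 3166-1 alpha-2 codes.
COUNTRY_ALIASES = {
    "VN": "VN",
    "VNM": "VN",
    "VIETNAM": "VN",
    "VIET NAM": "VN",
}

# Placeholder rates as fractions; these are not actual tariff figures.
TARIFF_RATES = {"VN": 0.10}


def normalize_country(raw: str) -> str:
    """Map a local country spelling to its canonical alpha-2 code."""
    key = " ".join(raw.strip().upper().split())
    try:
        return COUNTRY_ALIASES[key]
    except KeyError:
        # Fail loudly so unmapped values are fixed instead of silently skipped.
        raise ValueError(f"Unmapped country value: {raw!r}") from None


def tariff_rate(raw_country: str) -> float:
    """Look up the rate only after the country value has been normalized."""
    return TARIFF_RATES[normalize_country(raw_country)]
```

Raising an error on an unmapped value is a deliberate choice here: a product that silently escapes the tariff is harder to detect later than a record that fails validation up front.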
Building a common data language
How do we overcome these challenges? The foundation begins with establishing a unified data taxonomy for your organization:
- Data taxonomy: Define what fundamental concepts mean across your organization. What exactly is a “customer”? How do you represent customer status consistently? What constitutes an “outstanding balance”? In the tariff example, do you use standard catalogs, such as ISO 3166 for country codes and UN/LOCODE for city and port codes? Organizations with standardized country codes will find themselves with a considerable head start when the tariffs are officially implemented.
- Data ontology: Map the relationships between different concepts in your business domain. This helps translate between different representations while maintaining semantic consistency. For international trade, adopting a global geolocation ontology provides flexibility when sourcing decisions need to change rapidly in response to tariff fluctuations.
- Data modeling: Establish standard terminology and valid value lists for each data element. If each data attribute has exactly one owner (as we discussed in our previous article), that owner should define the logical list of values to be adopted consistently. For tariffs, are percentage conventions uniformly implemented (e.g., is a ten percent rate stored as 0.1 or as 10%)? Companies that standardize these representations will be able to simply update a single parameter rather than rewriting dozens of reports and calculations.
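The percentage question in the last bullet can be sketched as a single conversion function that every report and calculation shares. The convention below is an assumption chosen for illustration, not a standard: a trailing "%" means percent, a bare number is already a fraction, and a bare number above 1 is rejected as ambiguous. The function name `to_fraction` is hypothetical.

```python
from decimal import Decimal


def to_fraction(value: str) -> Decimal:
    """Convert "10%" or "0.1" to one canonical fraction.

    Assumed convention: a trailing "%" means percent; a bare number is
    already a fraction. Bare numbers above 1 are rejected as ambiguous.
    """
    text = value.strip()
    if text.endswith("%"):
        return Decimal(text[:-1].strip()) / Decimal(100)
    fraction = Decimal(text)
    if fraction > 1:
        raise ValueError(f"Ambiguous rate {value!r}: percent or fraction?")
    return fraction
```

With this in place, `to_fraction("10%")` and `to_fraction("0.1")` compare equal, so the owner of the attribute can store one representation and let every consumer convert at the boundary.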
Yet even with these foundations, challenges remain. When different representations logically equate to the same outcome (like our active customer example), you need a semantic layer to translate between systems.
The power of semantic layers
Think of semantic layers as custom interfaces designed for specific user groups. Like a well-designed mobile app that presents only the most relevant features for its context, semantic layers filter and transform data to match the needs of particular business units.
For instance:
- Finance and accounting might need transaction coding at an extremely granular level but care little about customer credit score variations. In the tariff scenario, they will need precise country-of-origin data to calculate duties accurately, but might not care about the specific shipping route or carrier.
- Supply chain, conversely, might need detailed shipping logistics but require only basic accounting categorization. With the new tariffs approaching, they suddenly need to evaluate alternative sourcing options and understand the cost differential between countries with different tariff rates.
Building semantic layers tailored to departmental needs ensures that data consumers interact with relevant information without wasting time on translations or hunting for what matters to them. When “Liberation Day” arrives, organizations with well-developed semantic layers will be able to quickly create new views incorporating tariff information without disrupting existing processes.
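As a rough illustration of the idea, a semantic layer can be thought of as a set of named projections over one canonical record. The sketch below is deliberately simplified and every field name, view name, and value in it is hypothetical; production semantic layers usually live in a database or BI tool rather than in application code.

```python
# One canonical shipment record (all field names and values are made up).
shipment = {
    "shipment_id": "S-1",
    "country_of_origin": "VN",
    "customs_value": 1000.0,
    "carrier": "ExampleLine",
    "route": "Haiphong to Los Angeles",
    "gl_account": "5010",
    "cost_category": "freight",
}

# Each view lists the canonical fields that one department works with.
VIEWS = {
    # Finance: granular accounting codes and origin data, no routing detail.
    "finance": ["shipment_id", "country_of_origin", "customs_value", "gl_account"],
    # Supply chain: logistics detail plus only a coarse cost category.
    "supply_chain": ["shipment_id", "country_of_origin", "carrier", "route", "cost_category"],
}


def project(record: dict, view: str) -> dict:
    """Return only the fields that the named view exposes."""
    return {field: record[field] for field in VIEWS[view]}


finance_view = project(shipment, "finance")
supply_chain_view = project(shipment, "supply_chain")
```

Adding tariff information then means adding a field to the canonical record and to the views that need it, while every other view keeps working unchanged.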
The infrastructure of access
With ontology established, data models defined, and semantic layers built, we face another crucial question: Where does the data live, and how do people access it?
Most organizations—especially those with legacy systems—have developed multiple access paths to critical data. Users find increasingly creative ways to build dedicated pipelines optimized for their specific needs, often resulting in a tangled web of connections.
A reference architecture can bring order to this complexity by classifying data users into functional groups:
1. Data producers
These are your systems of record—enterprise platforms like SAP, Workday, Oracle ERP, or Salesforce. They generate the authoritative data that drives your business. In the tariff case, ideally all tariff impositions will be implemented in these systems and flow through to downstream applications. Companies with inflexible legacy systems will likely find themselves creating workarounds that could persist for years.
2. Data operators
This critical and diverse group includes operations teams, legal, risk, compliance, and various back-office functions. These users don’t just consume data; they actively augment it by qualifying transactions, approving credit, updating customer information, and performing countless other business processes.
For these users, Operational Data Stores (ODS) have long been the architectural solution of choice. Modern ODS implementations have evolved from monolithic databases to microservices, multi-modal platforms, and edge computing nodes—but their fundamental purpose remains: Unifying and normalizing data from multiple sources so users can work with the data rather than on the data.
In our tariff example, the impact will need to be factored into multiple systems (invoices to customers, updates to terms and conditions, etc.). Configurable metadata goes a long way toward incorporating these changes without building offline manual processes. Organizations with parameter-driven price calculation engines will be able to implement new tariff structures much more rapidly than those with hard-coded pricing logic.
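The difference between parameter-driven and hard-coded pricing can be shown in a few lines. This is a minimal sketch under stated assumptions: the `PricingParams` class, the `pass_through` parameter, and the rates are all hypothetical, and a real engine would load its parameters from configuration or a reference-data service.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class PricingParams:
    """Hypothetical pricing configuration, kept separate from the logic."""
    tariff_rates: dict    # fraction per ISO alpha-2 origin code
    pass_through: Decimal  # share of the tariff passed on to the customer


def landed_price(base_price: Decimal, origin: str, params: PricingParams) -> Decimal:
    """Apply the configured tariff for the origin; unknown origins get no surcharge."""
    rate = params.tariff_rates.get(origin, Decimal("0"))
    surcharge = base_price * rate * params.pass_through
    return (base_price + surcharge).quantize(Decimal("0.01"))


# Placeholder rate, not an actual tariff figure.
params = PricingParams(tariff_rates={"VN": Decimal("0.10")}, pass_through=Decimal("1"))
price = landed_price(Decimal("100"), "VN", params)
```

When a rate changes, only the `PricingParams` values change; `landed_price` and everything that calls it stay untouched, which is the property that lets a new tariff structure roll out as configuration instead of as a code release.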
3. Decision support users
These groups—including marketing, strategy teams, and senior leadership—need historical perspective for longer-term decisions. They require extensive historical data, cleanly curated to answer trend questions like: “How has our sourcing mix changed over the past decade?” In our tariff example, they might need to analyze the possibility of shifting production to locations with lower tariff impacts or adjusting product mix to emphasize domestically sourced items.
Data warehouses, data lakes, and newer data lakehouses serve these analytical needs, supporting exploratory, often non-repeatable questions that are difficult to predict in advance. Not all information may be available in-house when needed, so this group also needs to be able to quickly incorporate new external sources for side-by-side analysis with existing data. With tariffs poised to change the competitive landscape overnight, the ability to rapidly integrate external market data will become a critical advantage.
The data science challenge
Data scientists often lament that 80% of their time gets consumed by “data wrangling” as they navigate the quirks of source systems. The same disciplined approach to taxonomy and semantic layers can reduce this burden, though many data scientists still prefer access to raw, unprocessed data.
This tension reflects an element of cultural maturity. Traditionally, data specializations have been segmented into distinct roles:
- Requirements and analysis = Product managers
- Modeling and lineage = Data modelers
- Pipelines = Data engineers
- Analytics = Data scientists
In reality, individuals often wear multiple hats, and with generative AI, many of these functions can now be partially automated. What remains non-negotiable is aligning data representation with intended usage—this creates the flywheel effect that accelerates adoption.
The upcoming tariff implementation presents a textbook case for organizational agility. Companies that have invested in cross-functional data teams will find themselves able to model tariff impacts within days, while siloed organizations will struggle to coordinate an effective response. Practicing these organizational routines isn’t optional; it’s essential preparation for inevitable disruptions.
The economics of modern data infrastructure
Thanks to evolving technology, what might have been a two-year project costing tens of millions just a few years ago can now be accomplished in a fraction of the time and cost. Cloud-native tools, containerization, and AI-assisted development have dramatically changed the economics of data infrastructure.
This technological evolution means that even the absolute shock of “Liberation Day” can be absorbed without an all-hands-on-deck crisis approach—if the underlying groundwork is already in place. Organizations that invest in modern data platforms will find they can implement rapid changes through configuration rather than coding, avoiding both immediate disruption and long-term technical debt from hasty implementations.
The competitive advantage of data mastery
A carefully designed information architecture—one that classifies organizational data using a reference framework and incorporates the foundational “plumbing” we’ve discussed—enables organizations to fully exploit their information assets when it matters most.
The tariff scenario demonstrates this vividly: Companies with mature data capabilities will be able to rapidly:
- Assess the financial impact across their product portfolio
- Model various pricing scenarios to maintain margins
- Identify alternative sourcing options
- Communicate changes coherently to customers and suppliers
In today’s AI-driven landscape, competitive advantage doesn’t necessarily go to those with the latest models or most powerful GPUs. Rather, it flows to organizations with the best understanding of their own data and the infrastructure to leverage it quickly when market conditions change.
After all, the most sophisticated AI is only as good as the data it’s built upon. When your organization speaks a common data language and provides intuitive access paths to that data, you create the conditions for both human and artificial intelligence to flourish—especially when facing unexpected challenges like “Liberation Day.”
Additional resources
Overview topics
- What is Data Management? (CIO wiki)
- A brief history of data management (Dataversity)
- What is data management? (IBM)
Data storage
- Introduction to data lakes (Databricks)
- What is a data lake? A super-simple explanation for anyone (Forbes)
Ontology
- Open Knowledge Graph Lab (EDM Council)
The opinions expressed on this website are those of each author, not of the author's employer or All Things Open/We Love Open Source.
