Data Management Strategy – Evaluation

Managing data within an organization begins with an honest evaluation of the products and services your company offers. For simplicity, I’ll use the term “Product” to encompass both products and services. Different offerings will inevitably have distinct data requirements, which necessitate tailored management strategies.

In today’s landscape, virtually all products utilize data to some degree. However, the specifics of how data is used — and therefore managed — depend on the nature of the product itself. Let’s break this down into two primary categories:

  1. Data as the Product: These are offerings where data itself is the central value proposition. Examples include data aggregators, advanced statistical models, analytics dashboards, or any tool that extracts actionable insights from data. We’ll refer to these as Data Products.
  2. Products Utilizing Data: These are products where data plays a supporting role. A typical SaaS (Software as a Service) business might fall into this category, leveraging data to enhance decision-making, provide insights, or improve the overall user experience. We’ll call these Digital Products.

Key Dimensions of Data Management

Once you’ve categorized your product, the next step is to assess your data needs along two critical dimensions:

1. Data Volume:

Data can be broadly classified as either large or not large. While every organization dreams of scaling to the point where “large” data becomes a problem, the reality is that most companies deal with manageable datasets. My “hot take” here: default to assuming a “not large” dataset unless your use case proves otherwise.

2. Data Freshness:

Data updates can range from real-time to batch processing. While there’s a tendency to overestimate the need for real-time data, most use cases can thrive on daily updates. In fact, many scenarios requiring updates within minutes can still be considered “batch” processing. My second “hot take”: daily updates suffice for most applications, with real-time capabilities reserved for edge cases.
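To make the two dimensions concrete, here is a rough sketch of how you might record an assessment. The thresholds and the names `DataNeeds` and `classify` are my own assumptions for illustration, not industry standards; pick cut-offs that match your own infrastructure.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical cut-offs, chosen only for illustration.
LARGE_ROWS_THRESHOLD = 1_000_000_000   # at or above this, treat the dataset as "large"
REAL_TIME_SECONDS = 60                 # below this tolerated staleness, treat as "real-time"


@dataclass
class DataNeeds:
    rows: int                    # approximate number of records
    max_staleness_seconds: int   # how old the data is allowed to be


def classify(needs: DataNeeds) -> Tuple[str, str]:
    """Return (volume, freshness), defaulting to "not large" and "batch"."""
    volume = "large" if needs.rows >= LARGE_ROWS_THRESHOLD else "not large"
    freshness = "real-time" if needs.max_staleness_seconds < REAL_TIME_SECONDS else "batch"
    return volume, freshness


# A few million rows refreshed once a day lands in the default quadrant.
print(classify(DataNeeds(rows=5_000_000, max_staleness_seconds=86_400)))
```

Note that a five-minute refresh still comes out as "batch" under these assumed thresholds, which matches the point above about updates within minutes.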

Start with “Not Large & Batch”

For most organizations, the simplest and most scalable starting point is to manage data with a “Not Large & Batch” approach. This baseline strategy minimizes complexity while meeting the majority of typical data needs. Let’s explore how this applies to both Digital Products and Data Products:

Digital Products: Separate Systems for Analytics and Operations

For a Digital Product, the key is to establish a basic ETL (Extract, Transform, Load) pipeline that supports distinct use cases for analytics and operational data. Operational data—the data directly used by the product—should remain optimized for performance and reliability, while analytics data—used for reporting and decision-making—can be handled separately to avoid conflicts or performance bottlenecks. Starting with clear boundaries between these systems will help maintain clarity and scalability as data needs grow. Some more detailed thoughts on this can be found in my article Scaling ETL.
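A minimal sketch of such a pipeline, using Python's built-in `sqlite3` module, might look like the following. The `orders` and `daily_revenue` tables, their columns, and the function names are hypothetical; the only point is that the operational store is read once per batch and the analytics store lives in a separate database.

```python
import sqlite3


def extract(op_conn, day):
    """Read one day's rows from the operational store (hypothetical `orders` table)."""
    return op_conn.execute(
        "SELECT customer_id, amount_cents FROM orders WHERE order_date = ?", (day,)
    ).fetchall()


def transform(rows):
    """Aggregate revenue per customer."""
    totals = {}
    for customer_id, amount_cents in rows:
        totals[customer_id] = totals.get(customer_id, 0) + amount_cents
    return totals


def load(an_conn, day, totals):
    """Write aggregates to the analytics store; re-running a day overwrites it."""
    an_conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_revenue ("
        "day TEXT, customer_id INTEGER, revenue_cents INTEGER, "
        "PRIMARY KEY (day, customer_id))"
    )
    an_conn.executemany(
        "INSERT OR REPLACE INTO daily_revenue VALUES (?, ?, ?)",
        [(day, customer_id, total) for customer_id, total in totals.items()],
    )
    an_conn.commit()


def run_daily_batch(op_conn, an_conn, day):
    """One batch run: extract from operations, transform, load into analytics."""
    load(an_conn, day, transform(extract(op_conn, day)))
```

Keying the analytics table on `(day, customer_id)` is a design choice in this sketch: it lets a failed or repeated batch be re-run safely without duplicating rows.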

Data Products: Aligning on a Source of Truth

For a Data Product, the approach differs slightly. Here, it’s critical to establish a single “source of truth” that serves both operational and analytical use cases. This means that the same datasets used to drive product functionality (e.g., dashboards or reports) are also used to generate insights or models that can feed back into the product.

The reasoning is that a Digital Product only needs insights drawn from its data, whereas a Data Product is the data. When the insights and models are built on the same datasets the product serves, it is easier to incorporate those changes back into the main product.
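One way to picture a single source of truth is a shared access path that both the product and the analysis go through. The sketch below assumes a hypothetical `readings` table and uses Python's built-in `sqlite3`; the function names are mine and purely illustrative.

```python
import sqlite3
from statistics import mean


def read_readings(conn):
    """The single access path to the shared dataset (hypothetical `readings` table)."""
    return conn.execute("SELECT sensor, value FROM readings").fetchall()


def dashboard_rows(conn):
    """Product use: the highest value per sensor, as a dashboard might show it."""
    peaks = {}
    for sensor, value in read_readings(conn):
        peaks[sensor] = max(value, peaks.get(sensor, value))
    return peaks


def overall_average(conn):
    """Analytical use: a summary computed from the very same rows."""
    return mean(value for _, value in read_readings(conn))
```

Because both functions call `read_readings`, a correction to the underlying data shows up in the product and in the analysis at the same time, rather than drifting apart across two copies.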

Next Steps: Evolving Your Data Strategy

Once you’ve mastered the basics of “Not Large & Batch,” you can expand your strategy to accommodate increased complexity. This might include supporting larger datasets, introducing more frequent updates, or optimizing for real-time use cases as they arise. The goal is to build iteratively, ensuring your data management strategy evolves in tandem with your organization’s needs.

In conclusion, the first step in managing data is understanding your product’s relationship to data and aligning that understanding with your organizational goals. By categorizing your product, assessing data volume and freshness needs, and starting from a “Not Large & Batch” baseline, you can set a clear path for a sustainable and effective data strategy.


This article is part of my series on experimenting with AI to accelerate writing. Approximately 70% of this article reflects my own thoughts and writing, with the remaining 30% generated by ChatGPT.