Treating Data Products as solutions is becoming a competitive differentiator that some companies are beginning to experience. Lower costs per use and a higher return on investment are helping to popularize the concept.

In this article we will review the most relevant factors to consider when defining a Data Product, but first let’s recall what the concept means.

We will consider a Data Product to be a solution whose main objective is to use the available data to facilitate the achievement of an end goal. This type of solution must contain sufficient information to understand the data it handles and how that data can be used to solve a use case in a specific domain.

For this reason, we will not consider Netflix as a whole to be a Data Product, since its objective is not purely analytical, but we will consider its movie and series recommender to be one.

Before embarking on defining a new Data Product, we should ask ourselves some questions: whether anyone will need or use our product, what decisions may be made from its use, what actions may follow, and how the results of those actions will be measured.

Statistics show that many Data Products are left on standby in their final phases or once deployed. Others are completed but provide little or no analytical value, or fail outright.

These are some of the reasons that lead us to consider the relevance of the definition phase of a Data Product and to follow some good practices that allow us to build successful Data Products.

Where to start building a Data Product?

As with most things, start by proposing the simplest option and try to keep it that way for as long as possible. If we overcomplicate the product at definition time, it will end up with little room for development or evolution, and it will be hard to use.

In these first moments we consider what we want the user of our Data Product to obtain when using it, what actions can be carried out with the information received and what experience the user will have during and after using the Product.

If we manage to build a successful Product, we will be able to focus our efforts later on evolving it and adding more sophisticated functionality.

What will the process be like?

Initially, the development team should work with the business to formulate the problem in terms of analytical value and of how this value can be effectively measured in the domain in which the product will be implemented. This factor will be key in determining whether the product will ultimately be a success.

With this starting point, it is advisable to carry out a feasibility test that validates the proposal against business needs.

Next, the different types of users and how they can interact with the product should be taken into account for the implementation. For this purpose, it is advisable to define a conceptual model that considers the use of data in the different interfaces such as APIs, Dashboards or others.

A later refinement will produce a more detailed physical model that also includes enough metadata for the data to be understood.
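As a rough illustration of what "sufficient metadata" could look like, the following minimal sketch describes a Data Product and its fields in code. All class names, attributes and example values (DataProductSpec, FieldSpec, "churn_scores" and so on) are hypothetical and chosen only for this example; they are not a standard or a specific tool's format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldSpec:
    """One field of the dataset, with enough metadata to interpret it."""
    name: str
    dtype: str
    description: str = ""


@dataclass
class DataProductSpec:
    """Hypothetical descriptor of a Data Product (illustrative names only)."""
    name: str
    domain: str
    use_case: str
    interfaces: List[str] = field(default_factory=list)  # e.g. "api", "dashboard"
    fields: List[FieldSpec] = field(default_factory=list)

    def undocumented_fields(self) -> List[str]:
        """Return the names of fields that have no description."""
        return [f.name for f in self.fields if not f.description.strip()]


# Example values are invented for illustration.
spec = DataProductSpec(
    name="churn_scores",
    domain="customer retention",
    use_case="prioritize retention calls",
    interfaces=["api", "dashboard"],
    fields=[
        FieldSpec("customer_id", "string", "Unique customer identifier"),
        FieldSpec("churn_probability", "float", "Estimated probability of churn"),
        FieldSpec("tenure", "int"),  # description missing on purpose
    ],
)

missing = spec.undocumented_fields()
print(missing)
```

A check like undocumented_fields could be run before each release so that no field reaches users without an explanation.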

The Data Product should be designed so that, once deployed, it can be evaluated and evolved throughout its life cycle through interaction with users.

What features should a Data Product have?

Among the most important characteristics that we can consider in the definition of a Data Product are the following:

  • Utility: Do not lose sight of the use that a user can make of the Data Product, regardless of the effort required to keep the data updated and preprocessed. A product that is not used does not add value.
  • Scalability: Successful products tend to be heavily used, and their data sources tend to grow in size and number, so taking this into account is key to their future performance.
  • Potential: To measure its effectiveness it will be necessary to define metrics that evaluate its performance, such as “precision” and “recall” for a recommendation engine. It is also worth assuming that the Data Product will sometimes fail and considering how to preserve the user experience in such cases.
  • Passive data collection: Collecting user data without interfering with the user experience is important to provide added value. Accelerometers and GPS in smartphones are examples of this.
  • Constant performance validation: Accumulating historical metrics makes it possible to run A/B tests that evaluate the variation between different versions of a product and to make more informed decisions. A model that works today may not continue to do so in the future, so monitor the evolution of these metrics and review their degradation or improvement.
  • Transparency builds trust: The user may be skeptical of product results, so accompanying each result with additional information, such as probabilities, confidence scores or the attributes that contribute most to an inference, can mean the difference between a reliable product and a questionable one.
  • Agile in its update: Some Data Products lend themselves to continuous upgrades, and this quality can make a competitive difference. In such cases, regularly deploying versions with new features in a safe and diligent manner will have a positive impact on the user’s experience.
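To make the performance metrics mentioned in the list more concrete, here is a minimal sketch of how precision and recall could be computed for the top k results of a recommendation engine. The function name, the item identifiers and the choice of dividing by k are assumptions made for this example; it also assumes the recommended list contains no duplicates.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Compute precision@k and recall@k for one user.

    recommended: ranked list of item ids returned by the recommender
    relevant:    set of item ids the user actually found relevant
    k:           number of top recommendations to evaluate
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k  # share of the k slots that were relevant
    recall = hits / len(relevant) if relevant else 0.0  # share of relevant items found
    return precision, recall


# Invented example data.
recommended = ["a", "b", "c", "d", "e"]
relevant = {"b", "e", "f"}

precision, recall = precision_recall_at_k(recommended, relevant, k=5)
print(f"precision@5={precision:.2f} recall@5={recall:.2f}")
```

Storing these values per product version over time is one simple way to obtain the historical metrics needed to compare versions and to detect degradation.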


But there are also undesirable attributes in a Data Product that can generate mistrust in the user or a bad experience. This is the case, for example, of a product that provides erroneous or poor-quality information. Failing to implement sufficient data quality validations can lead to confusing or erroneous results that could potentially generate large losses. Therefore, the quality of the datasets managed by these Data Products becomes a priority for those responsible for them.
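A data quality validation does not need to be elaborate to be useful. The following minimal sketch checks records for missing required fields and out-of-range values before they are served to users. The function name, the field names and the thresholds are hypothetical and serve only as an illustration.

```python
def validate_records(records, required, ranges):
    """Return a list of human-readable data quality issues.

    records:  list of dicts, one per record
    required: field names that must be present and not None
    ranges:   dict mapping a field name to its (low, high) allowed range
    """
    issues = []
    for i, record in enumerate(records):
        for key in required:
            if record.get(key) is None:
                issues.append(f"record {i}: missing '{key}'")
        for key, (low, high) in ranges.items():
            value = record.get(key)
            if value is not None and not (low <= value <= high):
                issues.append(f"record {i}: '{key}'={value} outside [{low}, {high}]")
    return issues


# Invented example data: the second record has two problems.
records = [
    {"user_id": 1, "score": 0.8},
    {"user_id": None, "score": 1.4},
]

issues = validate_records(
    records,
    required=["user_id", "score"],
    ranges={"score": (0.0, 1.0)},
)
for issue in issues:
    print(issue)
```

Whether a failed check should block publication or only raise an alert depends on the use case and on how costly an erroneous result would be.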

Another situation to avoid is a product that generates a huge amount of data (data vomit). Depending on the user profile, this may not be the best way to present the information of interest, and it can leave the user paralyzed.

In short, defining and developing a Data Product with guarantees of success requires careful planning, an implementation driven by usefulness and user experience, and continuous testing and monitoring to evaluate its performance.


  • Keepler

    Software company specialized in the design, construction and operation of digital data products based on cloud computing platforms.