A Framework for Interpreting Realized Economic Value From Data Science Projects

Written with Daniel Fleck and David Lubert

The first post in our series focuses on using Net Present Value (NPV), Internal Rate of Return (IRR), and the Profitability Index (PI) to evaluate a data science project. We also discuss how a data science methodology can refine the inputs to these economic metrics and so produce higher-quality evaluations.

NPV, IRR, and PI are evaluation metrics that organizations across many industries use to determine the economic potential of an investment. Though the metrics are most notably used in capital budgeting (the process by which a business evaluates potential projects and long-term investments for feasibility and value creation potential), we will demonstrate how data science projects can use them to better understand the return on invested resources. But first, let’s get into the definitions and uses of the different calculations:

Net Present Value (NPV) is the dollar difference between the present value of a project’s cash inflows and the present value of its cash outflows over a specified period of time. A project with an NPV greater than zero is considered economically viable. NPV can also be used to compare two investment projects that both have an NPV greater than zero in order to determine which is more worthwhile. The metric is commonly used in finance because its result, a dollar amount, is intuitive and easy to interpret.
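As a minimal sketch of that definition, assuming a single upfront cost and cash flows that arrive at the end of each period, NPV could be computed as follows. The function name `npv` and its parameters are illustrative, and the numbers in the usage line are made up for demonstration rather than taken from this post.

```python
def npv(rate, initial_investment, cash_flows):
    """Net present value: discounted future cash flows minus the upfront cost.

    Assumes cash_flows[0] arrives at the end of period 1, cash_flows[1] at
    the end of period 2, and so on.
    """
    present_value = sum(
        cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1)
    )
    return present_value - initial_investment


# Illustrative numbers only: pay 100 today, receive 60 in each of two years
example = npv(0.10, 100.0, [60.0, 60.0])
print(f"NPV: {example:.2f}")  # positive, so the toy project clears a 10% hurdle
```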

Internal Rate of Return (IRR) measures the profitability of a project in percentage terms: it is the discount rate at which the project’s NPV equals zero. IRR is useful when it is compared to the discount rate for the business, to check that the return of the project exceeds the cost of capital. IRR is also commonly used in finance, sometimes in tandem with NPV and sometimes as a stand-alone evaluation metric. However, IRR has limitations: it is hard to judge when the discount rate is unknown, and it can be ambiguous when the net cash flows are negative in more than one time period.
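IRR has no closed-form solution in general, so it is usually found numerically. The sketch below uses plain bisection and assumes the simple case of one upfront cost followed by non-negative cash flows, where NPV falls steadily as the rate rises and there is a single answer. The names `irr`, `low`, `high`, and `tol` are illustrative choices, and the search bracket of 0% to 1000% is an assumption.

```python
def npv(rate, initial_investment, cash_flows):
    """Discounted end-of-period cash flows minus the upfront cost."""
    return sum(
        cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1)
    ) - initial_investment


def irr(initial_investment, cash_flows, low=0.0, high=10.0, tol=1e-9):
    """Find the rate where NPV is zero by bisection.

    Assumes NPV is positive at `low` and negative at `high`, which holds for
    one upfront cost followed by non-negative inflows that exceed the cost.
    """
    if npv(low, initial_investment, cash_flows) < 0:
        raise ValueError("NPV is already negative at the lower bound")
    if npv(high, initial_investment, cash_flows) > 0:
        raise ValueError("NPV is still positive at the upper bound")
    while high - low > tol:
        mid = (low + high) / 2
        if npv(mid, initial_investment, cash_flows) > 0:
            low = mid   # rate too low: project still creates value
        else:
            high = mid  # rate too high: project destroys value
    return (low + high) / 2


# Illustrative numbers only: pay 100 today, receive 60 in each of two years
print(f"IRR: {irr(100.0, [60.0, 60.0]):.2%}")
```

When cash flows change sign more than once, NPV can cross zero at several rates, and a bracketed search like this one would return only one of them.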

Profitability Index (PI) is the ratio of the present value of future cash flows to the initial cost of the project. If a project has a PI greater than 1, the project is considered economically viable. PI and NPV are built from the same inputs, so they always agree on viability: PI is greater than 1 exactly when NPV is greater than zero. The main difference between the two is whether you want a ratio (PI) or an absolute dollar amount (NPV) to measure a project’s profitability.
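The link between PI and NPV can be written out directly. The notation here is our own shorthand, not something defined elsewhere in this post: C_0 is the initial cost, CF_t is the cash flow in period t, r is the discount rate, and T is the number of periods.

```latex
\mathrm{NPV} = \sum_{t=1}^{T} \frac{CF_t}{(1+r)^t} - C_0
\qquad
\mathrm{PI} = \frac{1}{C_0} \sum_{t=1}^{T} \frac{CF_t}{(1+r)^t}
            = \frac{\mathrm{NPV} + C_0}{C_0}
            = 1 + \frac{\mathrm{NPV}}{C_0}
```

Because C_0 is positive, PI exceeds 1 exactly when NPV exceeds zero. The two metrics can still rank projects differently, since PI rewards a small project with a high return per dollar while NPV rewards total dollars created.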

Now let’s apply these metrics to a hypothetical data science project. Assume you are in charge of strategy for a fast-growing retail clothing business. As the business grows, the potential for fraud grows as well. Fraud is estimated to be 5% of revenue annually, where revenue is currently $1M and growing by 20% annually. For simplicity, let’s assume the tax rate is zero. You are faced with two options to streamline fraud detection:
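To make the setup concrete, here is a rough projection of annual fraud losses under the stated figures. It assumes that year 1 revenue already includes one year of growth (that is, $1.2M), which is the reading that reproduces the savings totals reported in the scenarios that follow. The function name and parameters are illustrative.

```python
def projected_fraud_losses(current_revenue, growth_rate, fraud_rate, years):
    """Expected fraud loss for each of the next `years` years.

    Assumption: revenue in year t is current_revenue * (1 + growth_rate) ** t,
    so the first projected year has already grown once.
    """
    return [
        current_revenue * (1 + growth_rate) ** t * fraud_rate
        for t in range(1, years + 1)
    ]


losses = projected_fraud_losses(1_000_000, 0.20, 0.05, 5)
for year, loss in enumerate(losses, start=1):
    print(f"Year {year}: ${loss:,.0f} lost to fraud")
```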

Scenario 1

The first option is a software package you have researched that claims to detect 95% of fraud. With this option, the initial investment is the cost of the software, since it is a one-time purchase rather than a subscription-based service. The software company believes the software will be effective for 3 years, starting in year 1.

Investment amount: $95,000
Number of periods (years) the solution can be used for: 3
Discount rate: 8%

Cash flow from fraud savings: $207,480
Discounted free cash flow: $176,578

Evaluation Metrics:
NPV: $81,578
IRR: 37% (solved on the discounted cash flows; roughly 48% on the undiscounted flows)
PI: 1.859
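The Scenario 1 figures can be checked with a short script. This is a sketch under our reading of the setup: savings in year t are 95% of 5% of $1M × 1.2^t. One observation from running the numbers: the reported 37% IRR is reproduced when the rate is solved on the already-discounted cash flows, whereas the conventional definition, solved on the undiscounted flows, gives roughly 48%. The helper `rate_of_return` and all variable names are our own.

```python
# Inputs stated in the post
INVESTMENT = 95_000
DISCOUNT_RATE = 0.08
YEARS = 3

# Assumption: year-t revenue is $1M * 1.2**t; 5% is fraud; 95% of that is caught
savings = [1_000_000 * 1.20 ** t * 0.05 * 0.95 for t in range(1, YEARS + 1)]
discounted = [
    cf / (1 + DISCOUNT_RATE) ** t for t, cf in enumerate(savings, start=1)
]


def rate_of_return(investment, flows, low=0.0, high=10.0, tol=1e-9):
    """Bisection for the rate at which the discounted flows equal the investment."""
    def gap(rate):
        pv = sum(cf / (1 + rate) ** t for t, cf in enumerate(flows, start=1))
        return pv - investment

    while high - low > tol:
        mid = (low + high) / 2
        if gap(mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


total_savings = sum(savings)
present_value = sum(discounted)
npv = present_value - INVESTMENT
pi = present_value / INVESTMENT
irr_raw = rate_of_return(INVESTMENT, savings)               # conventional IRR
irr_on_discounted = rate_of_return(INVESTMENT, discounted)  # matches the post

print(f"Cash flow from fraud savings: ${total_savings:,.0f}")
print(f"Discounted free cash flow:    ${present_value:,.0f}")
print(f"NPV: ${npv:,.0f}")
print(f"PI:  {pi:.3f}")
print(f"IRR on undiscounted flows: {irr_raw:.1%}")
print(f"IRR on discounted flows:   {irr_on_discounted:.1%}")
```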

Scenario 2

The second option is to hire a team of data science consultants who will develop a customized fraud detection model from scratch, which they believe will also address 95% of fraud, exactly the same as the software. Unlike the software, the data scientists feel their solution should be effective for 5 years. In this case, the initial investment is the cost of a team of 3 people over 3 months: $135,000 ($15,000 per month per data scientist).

Investment amount: $135,000
Number of periods (years) the solution can be used for: 5
Discount rate: 8%

Cash flow from fraud savings: $424,171
Discounted free cash flow: $329,417

Evaluation Metrics:
NPV: $194,417
IRR: 36% (solved on the discounted cash flows; roughly 47% on the undiscounted flows)
PI: 2.440

Given these results, what should you do? NPV and PI tell you the same thing: the data science consultants are the way to go. However, the IRR suggests that the software provides a slightly better return. In this example, the main decision points are the business growth rate and the length of time each solution will remain viable. Assuming that the $40K difference in upfront cost between the two options is negligible for the organization, it is a matter of trusting the growth forecast and the expected lifetime of each solution before it needs to be replaced. Those, too, could be further analyzed by data science, but we’ll leave that to you!
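One way to probe the "length of time" decision point is to ask how long the custom model must stay effective before it beats the software on NPV. The sketch below reuses the figures from both scenarios and the same cash flow assumption (savings in year t are 95% of 5% of $1M × 1.2^t). The function names are illustrative.

```python
def fraud_savings(years, revenue=1_000_000, growth=0.20,
                  fraud_rate=0.05, detection=0.95):
    """Yearly savings from caught fraud, assuming revenue grows before year 1."""
    return [
        revenue * (1 + growth) ** t * fraud_rate * detection
        for t in range(1, years + 1)
    ]


def npv(rate, investment, flows):
    """Discounted end-of-year flows minus the upfront investment."""
    return sum(
        cf / (1 + rate) ** t for t, cf in enumerate(flows, start=1)
    ) - investment


DISCOUNT_RATE = 0.08
software_npv = npv(DISCOUNT_RATE, 95_000, fraud_savings(3))
print(f"Software (3 years): NPV ${software_npv:,.0f}")

# Vary the consultants' model lifetime and compare against the software
for lifetime in range(1, 6):
    consultants_npv = npv(DISCOUNT_RATE, 135_000, fraud_savings(lifetime))
    verdict = "beats" if consultants_npv > software_npv else "trails"
    print(f"Consultants ({lifetime} years): NPV ${consultants_npv:,.0f}, "
          f"{verdict} the software")
```

Under these assumptions, the custom model needs to remain effective for at least 4 years to come out ahead of the software on NPV.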

As data science expands and evolves within organizations, so do the potential business cases. This evolution will not only impact the assessment of projects like the one above but also how data science is infused into the assessment process for businesses. For example, we can also consider improving upon the assumption that inputs for calculations like NPV are “static.” Realistically, this assumption is often invalid. It would behoove the strategy head in this particular example to also focus on analyzing the data regarding fraud and corresponding assumptions. A better understanding of this data and its distribution may result in more reliable forecasts — thus impacting the NPV, IRR, and PI calculations and the subsequent decision made when interpreting each.
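As a rough illustration of moving beyond static inputs, the sketch below treats the growth rate and the fraud rate as uncertain and simulates the resulting spread of NPVs for both options. The distributions are hypothetical placeholders chosen only for demonstration (growth normally distributed around 20%, fraud rate uniform between 3% and 7%); in practice they would be estimated from the business's own data. All names are our own.

```python
import random


def scenario_npv(investment, years, growth, fraud_rate,
                 revenue=1_000_000, detection=0.95, discount=0.08):
    """NPV of fraud savings for given growth and fraud-rate values."""
    present_value = sum(
        revenue * (1 + growth) ** t * fraud_rate * detection
        / (1 + discount) ** t
        for t in range(1, years + 1)
    )
    return present_value - investment


def simulate(n_draws=10_000, seed=42):
    """Draw uncertain inputs and return (software NPV, consultants NPV) pairs.

    The distributions below are hypothetical, not estimates from real data.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        growth = rng.gauss(0.20, 0.05)        # assumed: mean 20%, sd 5 points
        fraud_rate = rng.uniform(0.03, 0.07)  # assumed: between 3% and 7%
        software = scenario_npv(95_000, 3, growth, fraud_rate)
        consultants = scenario_npv(135_000, 5, growth, fraud_rate)
        draws.append((software, consultants))
    return draws


draws = simulate()
n = len(draws)
consultant_npvs = sorted(c for _, c in draws)
share_consultants_win = sum(c > s for s, c in draws) / n

print(f"Consultants NPV,  5th percentile: ${consultant_npvs[n * 5 // 100]:,.0f}")
print(f"Consultants NPV, median:          ${consultant_npvs[n // 2]:,.0f}")
print(f"Consultants NPV, 95th percentile: ${consultant_npvs[n * 95 // 100]:,.0f}")
print(f"Share of draws where consultants beat software: {share_consultants_win:.1%}")
```

Reporting a range and a win probability, rather than a single NPV, shows how much the recommendation depends on the input assumptions.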

Now that we’ve covered methods used in the capital budgeting process, we will next delve into analyzing how long it takes to see a return on our capital investments. Join us in our next blog where we discuss methods of payback analysis using the Break-Even Point, Payback Period, and Discounted Payback Period!

