A Framework for Interpreting Realized Economic Value From Data Science Projects

Written with Daniel Fleck and David Lubert

This entry in our blog series investigates breakeven points and payback periods for data science projects. We focus on understanding and applying three metrics: the Break-Even Point, the Payback Period, and the Discounted Payback Period. To drive efficiency within an organization or a specific team, these metrics should be evaluated when judging whether a project is worth pursuing. Data science projects can require a large investment in time, software, infrastructure, and employee salaries, and all of these costs need to be accounted for when analyzing breakeven points and payback periods.

A common question when assessing the economics of a potential endeavor is: “When will we break even?” Although this can be a pertinent question, it isn’t always the best question to ask, since the overall economic benefit should be the primary focus. Nonetheless, timing does matter; identifying breakeven points in terms of both volume and cash can and should influence business strategy. For example, a company whose free cash flow is low relative to its operational needs may be sensitive to projects that take longer to bear fruit. Using breakeven metrics to analyze projects is therefore a logical step during the assessment phase.

Let’s formalize these metrics and provide some examples:

Break-Even Point (BEP) is the point at which total revenue from a product or service equals the expenses associated with providing that product or service. This metric is typically expressed on a per-unit basis, but it can also be expressed in other ways, such as a dollar amount or a rate.
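To make the per-unit form concrete, here is a minimal sketch of the textbook break-even calculation: fixed costs divided by the contribution margin per unit. The function name and all figures are our own hypothetical choices for illustration and do not come from the fraud example in this series.

```python
def break_even_units(fixed_costs: float,
                     price_per_unit: float,
                     variable_cost_per_unit: float) -> float:
    """Units that must be sold for total revenue to equal total cost."""
    contribution_margin = price_per_unit - variable_cost_per_unit
    if contribution_margin <= 0:
        # Each unit loses money (or earns nothing), so costs are never recovered.
        raise ValueError("price must exceed variable cost per unit")
    return fixed_costs / contribution_margin


# Hypothetical figures, chosen only for illustration.
units = break_even_units(fixed_costs=50_000,
                         price_per_unit=25.0,
                         variable_cost_per_unit=15.0)
print(units)  # 5000.0
```

Selling fewer than this many units leaves the project at a loss; every unit beyond it contributes to profit.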

Relating this back to the example in our last blog, we can use this approach to determine how much fraud would need to be discovered in order to break even on the project under either option: purchasing software or hiring data scientists. You may recall that our hypothetical example assumed 5% fraud on total revenue and that both options provided 95% savings on fraud, with the only difference being that the software provided 3 years of relief while the data scientists provided 5 years of relief. Given this information, we can calculate the BEP for both options.

The breakeven for the software option over 3 years is a fraud rate of 2.29%. This result suggests that if you expect fraud to be less than 2.29% of revenue over this time frame, you should not purchase the software. By comparison, the breakeven for the data scientist option is a fraud rate of 1.59%. Because the data scientist option breaks even at a lower fraud rate, these two results suggest that it is preferred to the software, provided the expected fraud occurs over the full 5 years.
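One way to express a break-even fraud rate is sketched below: the fraud rate at which the fraud losses avoided over the relief period equal the total cost of the option. This is an undiscounted simplification under our own assumptions, not necessarily the calculation behind the 2.29% and 1.59% figures above; the cost and revenue inputs are hypothetical placeholders, and only the 95% savings rate and 3-year horizon come from the example.

```python
def break_even_fraud_rate(total_cost: float,
                          annual_revenue: float,
                          savings_rate: float,
                          years: int) -> float:
    """Fraud rate (as a fraction of revenue) at which avoided fraud equals cost.

    Assumes constant annual revenue and no discounting.
    """
    # Revenue dollars over the horizon that become savings per unit of fraud rate.
    recoverable_base = annual_revenue * savings_rate * years
    if recoverable_base <= 0:
        raise ValueError("revenue, savings rate, and years must all be positive")
    return total_cost / recoverable_base


# Hypothetical cost and revenue; 95% savings over 3 years as in the example.
rate = break_even_fraud_rate(total_cost=285_000,
                             annual_revenue=5_000_000,
                             savings_rate=0.95,
                             years=3)
print(f"{rate:.2%}")  # 2.00%
```

If the expected fraud rate is below the returned value, the option costs more than the fraud it prevents over that horizon.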

Payback Period (PP) calculates the amount of time it will take to reach the BEP. This is useful when determining how long an investment may take to recoup the initial expenditure.
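A minimal sketch of the payback period calculation follows. It assumes one up-front outlay followed by annual inflows that arrive evenly within each year, so the fractional year can be found by linear interpolation. The function name and the figures are hypothetical and are not the inputs from our fraud example.

```python
from typing import Optional, Sequence


def payback_period(initial_outlay: float,
                   cash_inflows: Sequence[float]) -> Optional[float]:
    """Years until cumulative inflows cover the outlay, or None if they never do."""
    if initial_outlay <= 0:
        return 0.0  # nothing to recoup
    remaining = initial_outlay
    for year, inflow in enumerate(cash_inflows, start=1):
        if inflow >= remaining:
            # Payback happens during this year; interpolate the fraction needed.
            return (year - 1) + remaining / inflow
        remaining -= inflow
    return None  # outlay is not recovered within the given horizon


# Hypothetical figures: 1,000 spent up front, 400 received per year.
print(payback_period(1_000, [400, 400, 400]))  # 2.5
```

Returning None rather than a number makes it explicit when a project never pays back within the horizon being considered.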

After applying this metric to our example, we observe that the PPs for the software and data scientists are 2.56 years and 3.12 years, respectively. This is an expected outcome, since the estimated fraud improvement rate is identical for both options, yet the data scientists cost the organization more in exchange for added benefit in years 4 and 5. Nonetheless, recouping the investment roughly 6 to 7 months later is not a long wait considering the tradeoff.

Discounted Payback Period (DPP) is a variation of the payback period calculation that addresses one of the biggest drawbacks of the PP metric: it ignores the time value of money. From a formulaic perspective, PP and DPP are essentially the same, except that DPP discounts future cash flows. Because discounting reduces the value of future cash flows (money now is preferred over money later), the DPP will always be longer than the PP, provided that the discount rate is positive and the project is conventional, meaning there is one initial outflow and all future cash flows are inflows.
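The sketch below shows how discounting changes the calculation. It assumes a conventional project, cash flows received at the end of each year, and a single constant discount rate; the function name, the 10% rate, and the cash flows are hypothetical illustrations rather than the inputs from our fraud example.

```python
from typing import Optional, Sequence


def discounted_payback_period(initial_outlay: float,
                              cash_inflows: Sequence[float],
                              discount_rate: float) -> Optional[float]:
    """Years until discounted cumulative inflows cover the outlay, else None."""
    if initial_outlay <= 0:
        return 0.0  # nothing to recoup
    remaining = initial_outlay
    for year, inflow in enumerate(cash_inflows, start=1):
        # Present value of this year's inflow.
        discounted = inflow / (1 + discount_rate) ** year
        if discounted >= remaining:
            # Interpolate within the year, as in the undiscounted version.
            return (year - 1) + remaining / discounted
        remaining -= discounted
    return None  # not recovered within the given horizon


# Hypothetical figures: 1,000 up front, 400 per year, 10% discount rate.
dpp = discounted_payback_period(1_000, [400, 400, 400, 400], 0.10)
print(round(dpp, 2))  # 3.02
```

With a discount rate of zero the function reduces to the plain payback period (2.5 years for these figures), which illustrates why a positive rate always pushes the payback date later for a conventional project.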

Unsurprisingly, the relative DPP results are very similar to the PP results, though they are pushed back slightly due to the discounting of future cash flow benefits: 2.72 years for the software and 3.25 years for the data scientists, a difference of approximately 6 months. Because cash flows should be discounted, these results are likely more reflective of reality, and the DPP figures should be preferred when quoting payback periods for the project.

These three metrics are important barometers of both potential project success and profitability, and they can be evaluated together during the decision-making process. That said, data science projects should be evaluated on more than just breakeven points or payback periods. As previously stated, it is always important for an organization to be aware of the cash flow benefits from its projects in order to better understand their impact and its future financial state. Based on expected future cash flows, organizations can make decisions on other capital investment projects without waiting until those cash flow benefits are received. Understanding a project’s breakeven point and payback period is instrumental during an organization’s capital planning process.

We have now covered both the capital budgeting process and profitability metrics. Next, we will discuss overarching performance metrics. Check out our next blog about Economic Value Added and Return on Investment.

Links to the Rest of the Series:
Part I: NPV, IRR, PI
