
Category: Other topics

  • The Risk of Spreadsheet Errors


    This entry is part 1 of 2 in the series Spreadsheet Errors

     

    “Spreadsheets create an illusion of orderliness, accuracy, and integrity. The tidy rows and columns of data, instant calculations, eerily invisible updating, and other features of these ubiquitous instruments contribute to this soothing impression.” The quote is taken from Ivars Peterson’s MathTrek column written back in 2005, but it still applies today. ((Peterson, Ivars. "The Risky Business of Spreadsheet Errors." MAA Online, December 19, 2005. Accessed 26 Feb 2009.))

    Over the years we have learned a good deal about spreadsheet errors; there is even a spreadsheet risk interest group, EuSpRIG ((EuSpRIG: http://www.eusprig.org/index.htm)).

    Audits show that nearly 90% of spreadsheets contain serious errors. Code inspection experiments also show that even experienced users have a hard time finding errors, succeeding in finding only 54% of them on average.

    Panko (2009) summarized the results of seven field audits in which operational spreadsheets were examined, typically by an outsider to the organization. His results show that 94% of spreadsheets have errors and that the average cell error rate (the ratio of cells with errors to all cells with formulas) is 5.2%. ((Panko, Raymond R. "What We Know About Spreadsheet Errors." Spreadsheet Research (SSR), 16 Feb 2009, University of Hawai’i. Accessed 27 Feb 2009.))

    Some of the problems stem from the fact that a cell can contain any of the following: operational values, document properties, file names, sheet names, file paths, external links, formulas, hidden cells, nested IFs, macros etc., and that the workbook can contain hidden sheets and even “very hidden” sheets.

    Add to this the reuse and recirculation of workbooks and code: after cutting and pasting information, the spreadsheet might not work the way it did before — formulas can be damaged, links can be broken, or cells can be overwritten. How many use version control and change logs? In addition, the mixture of formulae and data makes the spreadsheet a perfect environment for perpetrating fraud.

    End-users and organizations that rely on spreadsheets generally do not fully recognize the risks of spreadsheet errors: “It is completely within the realms of possibility that a single, large, complex but erroneous spreadsheet could directly cause the accidental loss of a corporation or institution” (Croll 2005). ((Croll, Grenville J. "The Importance and Criticality of Spreadsheets in the City of London." Notes from the EuSpRIG 2005 Conference, 2005, EuSpRIG. Accessed 2 Mar 2009.))

    A very comprehensive literature review of the empirical evidence on spreadsheet errors is given in the article Spreadsheet Accuracy Theory. ((Kruck, S. E., and Steven D. Sheetz. "Spreadsheet Accuracy Theory." Journal of Information Systems Education 12 (2007): 93-106.))

    EuSpRIG also publicises verified public reports of spreadsheet errors with a quantified error or documented impact. (("Spreadsheet mistakes – news stories." EuSpRIG. Accessed 2 Mar 2009.))

    In the following we will use published data from a well documented study of spreadsheet errors. The data is the result of an audit of 50 completed and operational spreadsheets from a wide variety of sources. ((Powell, Stephen G., Kenneth R. Baker, and Barry Lawson. "Errors in Operational Spreadsheets." Tuck School of Business, Dartmouth College, November 15, 2007. Accessed 2 Mar 2009.))

    Powell et al. settled on six error types:

    1. Hard-coding in a formula – one or more numbers appear in formulas
    2. Reference error – a formula contains one or more incorrect references to other cells
    3. Logic error – a formula is used incorrectly, leading to an incorrect result
    4. Copy/Paste error – a formula is wrong due to inaccurate use of copy/paste
    5. Omission error – a formula is wrong because one or more of its input cells is blank
    6. Data input error – an incorrect data input is used

    These were then grouped as Wrong Result or Poor Practice, depending on the error’s effect on the calculation.

    Only three workbooks were without errors, giving a spreadsheet error rate of 94%. In the remaining 47 workbooks they found 483 instances ((An error instance is a single occurrence of one of the six errors in their taxonomy.)) of errors; 281 giving a wrong result and 202 involving poor practice.

    [Table: distribution of error instances by type]

    The distribution over the different types of error is given in the instances table. It is worth noting that among the poor practice errors hard-coding was the most common, while incorrect references and incorrectly used formulas were the most numerous wrong result errors.

    [Table: distribution of error cells by type]

    The 483 instances involved 4,855 error cells, which with 270,722 cells audited gives a cell error rate of 1.79%. The corresponding distribution of errors is given in the cells table. The cell error rate (CER) for wrong result is 0.87%, while the CER for poor practice is 1.79%.

    In the following graph we have plotted the cell error rates against the proportion of spreadsheets having that error rate (zero CER excluded). We can see that most spreadsheets have a low CER and only a few have a high CER. This is more evident for wrong result than for poor practice.

    [Figure: cell error rate frequencies]

    If we accumulate the above frequencies and include the spreadsheets with zero errors we get the “probability distributions” below. We find that 60% of the spreadsheets have a wrong-result CER of 1% or more, and that only 10% have a CER of 5% or more.

    [Figure: cumulative distribution of cell error rates]

    The high percentage of spreadsheets with errors is due to the fact that bottom-line values are computed through long cascades of formula cells. In tasks that contain many sequential operations, error rates multiply along the cascade of subtasks, so the fundamental equation for the bottom-line error rate is based on a memoryless geometric distribution over cell errors ((Lorge, Irving, and Herbert Solomon. "Two Models of Group Behavior in the Solution of Eureka-Type Problems." Psychometrika 20 (1955): 139-148.)):

    E=1-(1-e)^n

    Here, E is the bottom-line error rate, e is the cell error rate and n is the number of cells in the cascade. E is the probability of an incorrect result in the last cell of the cascade, given that the probability of an error in each cascade cell equals the cell error rate. ((Bregar, Andrej. "Complexity Metrics for Spreadsheet Models." Proceedings of EuSpRIG 2004. http://www.eusprig.org/. Accessed 1 Mar 2009.))

    In the figure below we have used the CER for wrong result (0.87%) and for poor practice (1.79%) to calculate the probability of a corresponding worksheet error, given the cascade length. For poor practice, a calculation cascade of 100 cells gives an 84% probability of an error, and at about 166 cells it reaches 95%. For wrong result, 100 cells give a 58% probability of an error, and at 343 cells it reaches 95%.
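    To make these figures easy to reproduce, here is a minimal Python sketch of the cascade formula above. The CER values are the ones quoted from the audit; the function names and printout are ours and purely illustrative.

    ```python
    # Bottom-line error probability E = 1 - (1 - e)^n for a cascade of n formula cells.
    from math import log

    def cascade_error_prob(cer: float, n: int) -> float:
        """Probability that a cascade of n cells contains at least one error."""
        return 1.0 - (1.0 - cer) ** n

    def cells_for_prob(cer: float, target: float) -> float:
        """Cascade length at which the error probability reaches `target`."""
        return log(1.0 - target) / log(1.0 - cer)

    for label, cer in [("wrong result", 0.0087), ("poor practice", 0.0179)]:
        print(f"{label}: E(100 cells) = {cascade_error_prob(cer, 100):.0%}, "
              f"95% reached at ~{cells_for_prob(cer, 0.95):.0f} cells")
    # wrong result:  E(100 cells) = 58%, 95% reached at ~343 cells
    # poor practice: E(100 cells) = 84%, 95% reached at ~166 cells
    ```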

    [Figure: probability of a worksheet error by cascade length]

    Now, if we consider a net present value calculation over a 10-year forecast period in a valuation problem, it will easily involve more than 343 formula cells and will thus, with high probability, contain an error.

    This is why S@R uses programming languages for simulation models. Of course, models like that will also have errors, but they do not mix data and code, quality control is easier, they have columnar consistency, are protected by being compiled, and have numerous intrinsic error checks, data entry controls and validation checks (see: Who we are).

    “Efficient computing tools are essential for statistical research, consulting, and teaching. Generic packages such as Excel are not sufficient even for the teaching of statistics, let alone for research and consulting.” (American Statistical Association)


  • Budgeting


    This entry is part 1 of 2 in the series Budgeting

     

    Budgeting is one area that is well suited for Monte Carlo simulation. Budgeting involves personal judgments about the future values of a large number of variables like sales, prices, wages, downtime, error rates, exchange rates etc. – variables that describe the nature of the business.

    Everyone who has been involved in a budgeting process knows that it is an exercise in uncertainty; however, it is seldom described in this way, and even more seldom is the uncertainty actually calculated as an integrated part of the budget.

    Admittedly, a number of large public building projects are calculated this way, but more often than not the aim is only to calculate some percentile (usually the 85th) as the expected budget cost.

    Most managers and their staff have, based on experience, a good grasp of the range in which the values of their variables will fall. A manager’s subjective probability describes his personal judgement about how likely a particular event is to occur. It is not based on any precise computation but is a reasonable assessment by a knowledgeable person. Selecting the budget value, however, is more difficult. Should it be the mean or the most likely value, or should the manager just delegate the fixing of the values to the responsible departments?

    Now we know that the budget values might be biased for a number of reasons – most simply by bonus schemes etc. – and that budgets based on average assumptions are wrong on average. ((Savage, Sam L. "The Flaw of Averages." Harvard Business Review, November (2002): 20-21.))

    When judging probability, people can locate the source of the uncertainty either in their environment or in their own imperfect knowledge ((Kahneman, D., and A. Tversky. "On the psychology of prediction." Psychological Review 80 (1973): 237-251.)). When assessing uncertainty, people tend to underestimate it – often called overconfidence and hindsight bias.

    Overconfidence bias concerns the fact that people overestimate how much they actually know: when they are p percent sure that they have predicted correctly, they are in fact right on average less than p percent of the time ((Keren G.  “Calibration and probability judgments: Conceptual and methodological issues”. Acta Psychologica 77(1991): 217-273.)).

    Hindsight bias concerns the fact that people overestimate how much they would have known had they not possessed the correct answer: events which are given an average probability of p percent before they have occurred are given, in hindsight, probabilities higher than p percent ((Fischhoff, B. "Hindsight ≠ foresight: The effect of outcome knowledge on judgment under uncertainty." Journal of Experimental Psychology: Human Perception and Performance 1 (1975): 288-299.)).

    We will, however, not ask for the managers’ full subjective probabilities; we will only ask for the range of possible values (5–95%) and their best guess of the most likely value. We will then use this to generate an appropriate log-normal distribution for sales, prices etc. For investments we will use triangular distributions to avoid long tails. Where most likely values are hard to guesstimate, we will use rectangular distributions.
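    As an illustration only, the sketch below (in Python) shows one way such distributions can be generated; the helper function, variable names and all numbers are hypothetical, and the log-normal is fitted to the 5% and 95% estimates alone.

    ```python
    import numpy as np

    Z95 = 1.6449  # 95th percentile of the standard normal; the 5%-95% range spans +/-Z95 sigma

    def lognormal_from_range(p5, p95, size, rng):
        """Log-normal matching a manager's 5% and 95% estimates (hypothetical helper)."""
        sigma = (np.log(p95) - np.log(p5)) / (2 * Z95)
        mu = (np.log(p5) + np.log(p95)) / 2
        return rng.lognormal(mean=mu, sigma=sigma, size=size)

    rng = np.random.default_rng(1)
    n = 1000                                                     # simulation runs
    sales  = lognormal_from_range(90.0, 130.0, n, rng)           # manager's 5%-95% range for net revenue
    capex  = rng.triangular(left=10, mode=12, right=18, size=n)  # investment: triangular, no long tail
    fxrate = rng.uniform(low=8.0, high=10.0, size=n)             # no most-likely value: rectangular
    ```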

    We will then proceed as if the distributions were known (Keynes):

    [Under uncertainty] there is no scientific basis on which to form any calculable probability whatever. We simply do not know. Nevertheless, the necessity for action and for decision compels us as practical men to do our best to overlook this awkward fact and to behave exactly as we should if we had behind us a good Benthamite calculation of a series of prospective advantages and disadvantages, each multiplied by its appropriate probability waiting to be summed. ((Keynes, John Maynard. "The General Theory of Employment." Quarterly Journal of Economics (1937).))

    [Figure: budget, actual and expected EBIT]

    The data collection can easily be embedded in the ordinary budget process by asking the managers to set the lower and upper 5% values for all variables determining the budget, and assuming that the budget figures are the most likely values.

    This gives us the opportunity to simulate (Monte Carlo) a number of possible outcomes – usually 1,000 – of net revenue, operating expenses and finally EBIT(DA).
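    A minimal sketch of such a simulation, with purely hypothetical revenue, cost and budget figures, could look like this:

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    n = 1000                                             # number of Monte Carlo runs

    net_revenue = rng.lognormal(mean=np.log(100), sigma=0.10, size=n)  # hypothetical distribution
    opex        = rng.lognormal(mean=np.log(80),  sigma=0.05, size=n)  # hypothetical distribution
    ebitda      = net_revenue - opex

    budget_ebitda = 22.0                                 # hypothetical budget figure
    p_below = np.mean(ebitda < budget_ebitda)            # probability of falling short of budget
    print(f"Expected EBITDA: {ebitda.mean():.1f}")
    print(f"P(EBITDA below budget) = {p_below:.0%}")

    # The empirical cumulative distribution behind the curves discussed below:
    x = np.sort(ebitda)
    cdf = np.arange(1, n + 1) / n
    ```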

    In this case the budget was optimistic, with a ca 84% probability of an outcome below it and only a 16% probability of an outcome above. The accounts (the actual figures) also proved it to be too high, with the final EBIT falling closer to the expected value. In our experience the expected value is a better estimator of the final result than the budget EBIT.

    However, the most important part of this exercise is the shape of the cumulative distribution curve for EBIT. The shape gives a good picture of the uncertainty the company faces in the year to come: a flat curve indicates more uncertainty, both in the budget forecast and in the final result, than a steeper curve.

    Wisely used, the curve (distribution) can both inform stakeholders about the risk being faced and support contingency planning for adverse events.

    [Figure: perceived uncertainty in net revenue and operating expenses]

    Having the probability distributions for net revenue and operating expenses, we can calculate and plot the managers’ perceived uncertainty by using coefficients of variation.

    In our material we find, on average, twice as much uncertainty in the forecasts for net revenue as for operating expenses.

    As many managers set budget values above the expected value, they are exposed to a downward risk. We can measure this risk by the Upside Potential Ratio, which is the expected return above the budget value per unit of downside risk below it. It can be found using the upper and lower partial moments calculated at the budget value.
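    A small sketch of both measures follows: the coefficient of variation as a measure of perceived uncertainty, and the Upside Potential Ratio computed from the first upper and second lower partial moments at the budget value. The simulated data and the budget figure are again hypothetical.

    ```python
    import numpy as np

    def coeff_of_variation(x):
        """Perceived uncertainty: standard deviation relative to the mean."""
        return x.std() / x.mean()

    def upside_potential_ratio(x, budget):
        """Expected value above budget per unit of downside risk below budget."""
        upside   = np.maximum(x - budget, 0.0).mean()                  # 1st upper partial moment
        downside = np.sqrt(np.mean(np.maximum(budget - x, 0.0) ** 2))  # sqrt of 2nd lower partial moment
        return upside / downside

    # Hypothetical simulation results, as in the budgeting sketch above:
    rng = np.random.default_rng(7)
    net_revenue = rng.lognormal(mean=np.log(100), sigma=0.10, size=1000)
    opex        = rng.lognormal(mean=np.log(80),  sigma=0.05, size=1000)
    ebitda      = net_revenue - opex

    print(f"CV(net revenue): {coeff_of_variation(net_revenue):.2f}")
    print(f"CV(opex):        {coeff_of_variation(opex):.2f}")
    print(f"UPR at budget:   {upside_potential_ratio(ebitda, budget=22.0):.2f}")
    ```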


  • What we do; Predictive and Prescriptive Analytics


    This entry is part 1 of 3 in the series What We Do

     

    Analytics is the discovery and communication of meaningful patterns in data. It is especially valuable in areas rich with recorded information – as in all economic activities. Analytics relies on the simultaneous application of statistical methods, simulation modeling and operations research to quantify performance.

    Prescriptive analytics goes beyond descriptive, diagnostic and predictive analytics by being able to recommend specific courses of action and to show the likely outcome of each decision.

    Predictive analytics will tell what will probably happen, but will leave it up to the client to figure out what to do with it.

    Prescriptive analytics will also tell what will probably happen, but in addition when it will probably happen and why, and thus how to take advantage of this predicted future. Since there is always more than one course of action, prescriptive analytics has to include: the predicted consequences of actions, an assessment of the value of those consequences, and suggestions for the actions giving the highest equity value for the company.

    By employing simulation modeling (Monte Carlo methods) we can give answers – as probability statements – to the critical questions at the top of the value staircase.

     

    [Figure: prescriptive analytics – the value staircase]

     

    This feature is a basic element of the S@R balance simulation model, where the Monte Carlo simulation can be stopped at any point on the probability distribution for company value (i.e. at a very high or very low company value), giving a full set of reports – P&L, balance sheet etc. – and enabling a full post-mortem analysis: what happened and why it happened.

    Different courses of action that would, with high probability, repeat or avoid that result can then be researched and assessed. The client-specific EBITDA model will capture relationships among many factors to allow simultaneous assessment of the risk or potential associated with a particular set of conditions, guiding decision making for candidate transactions. Even the language we use to write the models is specially developed for making decision support systems.

    Our methods also include data and information visualization to communicate both information and acquired knowledge clearly and effectively – reinforcing comprehension and cognition.

    Firms may thus fruitfully apply analytics to business data to describe, predict, and improve their business performance.

     

  • How we work


    An initial meeting allows us to begin to understand each other and for us to gain an insight into a client’s business, ideas, ambitions and direction through an open but totally confidential exchange. Following one or possibly two further meetings, and subject to client approval, we prepare and submit a fully costed proposal of work, with timelines and key deliverables made clear (see: S@R Services).

    Formal acceptance of our proposal of work – its scope, scale, fees and costs, and timing – initiates the program, which will almost certainly require input and co-operation from key executives and managers at regular intervals during its life cycle. All information, data and analysis will be handled according to relevant security standards.

    Depending upon the nature of the program, we usually take a phased approach enabling joint assessment at the end of each stage of work.

    On completion of an assignment we will deliver a comprehensive presentation and understandable report making clear our key recommendations and next steps to be pursued. We will not leave you at risk.

    If you are interested in S@R services, please do not hesitate to contact us to discuss how we can provide a solution satisfying your demands.