Two popular group decision-making methodologies are investigated this week: the Delphi Method (DM), as formally developed in Dalkey & Helmer (1963) and extended into domain-specific variants and hybrids such as the graduate-research methodology of Skulmoski, Hartman, & Krahn (2007); and the Nominal Group Technique (NGT) and its variants, most notably the modified version (MNGT) devised in Bartunek & Murnighan (1984) and the simplified form applied to educational processes in Dobbie, Rhodes, Tysinger, & Freeman (2004). Both methodologies pursue group decision-making by quasi-consensus through an iterative process of refinement and weighing. The final product in both cases is a consensus on how to solve a problem with a target solution (i.e., an optimization, prognostication, strategy, constructive suggestion, etc.). The differences lie in their starting inputs (interpreters and knowledge repository), their group interaction dynamics, and their iterative reassessment techniques.
The DM starts with a panel of experts in fields relevant to the problem to be solved; these panels are usually small compared to the larger pool of participant-like interpreters in the NGT. In both cases, convergence to a consensus is assumed to follow from the respective nature of the interaction. In the Delphi Method, for example, expert participants contribute over several rounds of discussion, essentially giving their estimates of solutions or answers to the project's problems or questions. There is an implicit motive to hybridize and combine solution types or estimates during these rounds until convergence appears in the later rounds.
In the NGT, input comes in the shape of general estimates from the process participants. These are not subject to scrutiny, which theoretically invites more openness and less mob influence on individual choices. Intermediate "moments of silence" are also introduced to give participants the opportunity to collect their individual thoughts and exercise creativity. The NGT also forms successively smaller groups in order to drill down into sub-estimates or answers that can be compared against the original target problem. Ill-structured problems may be broken down by this iteration of forming smaller clusters of problem-solving subgroups until the original problem can be clarified and hence attacked more effectively. In the MNGT variant, short blocks of time are allotted at the conclusion to focus the groups on terminating the project and pinpointing consensus answers. Any differences in direction between the subgroup estimates and the original problem target are reconciled and, where possible, brought back to relevance. In the Delphi Method, the experts' individual estimates are weighted and rank-ordered by the group in each round or iteration.
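The rank-ordering step that both methods rely on can be made concrete with a Borda-style aggregation, in which each participant ranks the candidate ideas and the group ordering comes from the summed rank scores. This is only an illustrative sketch, not the formal procedure of either method; the idea labels and the scoring rule are my own assumptions.

```python
from collections import defaultdict

def borda_rank(rankings):
    """Aggregate individual rankings (best-first lists) into a group
    ordering via Borda counts: an item ranked k-th out of n earns n-1-k points."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] += n - 1 - position
    # Highest total score first = the group's consensus ordering.
    return sorted(scores, key=scores.get, reverse=True)

# Three participants rank four candidate ideas (hypothetical labels).
rankings = [
    ["A", "B", "C", "D"],
    ["B", "A", "D", "C"],
    ["A", "C", "B", "D"],
]
print(borda_rank(rankings))  # "A" leads despite no unanimous first choice
```

Note that such an aggregation is mechanical: it reflects whatever pressures shaped the individual rankings, which is precisely where the coercion concerns discussed later enter.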
In the NGT, rank ordering also occurs, but at the smaller-group level before being projected back to the larger original problem. Convergence rates differ according to (1) the dynamics of the groups involved, (2) the magnitude of the original target problem, and (3) the quality of the input information given to the interpreters (i.e., the veracity of the knowledge input). Knowledge is usually incomplete, vague, directionally unknown, convoluted, or biased when group decision processes are used. Another very large assumption made by these methodologies is the reliance on "wisdom of the crowds" effects (Surowiecki, 2005), network computational growth, and the robustness of statistical smearing estimates (i.e., variants of averaging). If anything, these assumptions introduce more ambiguity and pseudo-randomness into the process by the very nature of the structures required for such conditions to hold.
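The fragility of "statistical smearing" is easy to demonstrate: the plain mean is dragged by a single wildly biased estimate, while the median or a trimmed mean resists it. A minimal sketch, with invented numbers for illustration:

```python
import statistics

def trimmed_mean(estimates, trim=0.2):
    """Mean after discarding the top and bottom `trim` fraction of estimates."""
    xs = sorted(estimates)
    k = int(len(xs) * trim)
    return statistics.fmean(xs[k:len(xs) - k])

# Nine experts estimate a quantity; one anchors on a wildly high figure.
estimates = [98, 102, 100, 97, 103, 99, 101, 100, 500]

print(statistics.fmean(estimates))   # plain mean, dragged upward by the outlier
print(statistics.median(estimates))  # robust to the single outlier
print(trimmed_mean(estimates))       # likewise robust
```

Robustness to outliers, of course, does nothing against the correlated biases of an entire panel, which is the deeper problem raised below.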
Theoretically, group decision-making involves the dynamics of coalitions, exogenous effects, and effectively unbounded interactions, interchanges, and conditional probabilities of success among the participant groups. In Dalkey (2002) the DM is given a more succinct mathematical-statistical treatment. Dalkey points out the enormous uncertainty, from many angles, in using DM dynamics as the basis of a mathematical-statistical structure for group estimation. In Lee & Shi (2011), weighted group estimation improves on the straight averaging of opinions, but the causal effects are unknown: wisdom of the crowds also suffers from ignorance of the crowds when left unchecked, or when weighting schemes are biased by very human frailties such as pattern-opia (compulsive pattern-seeking), anchoring, and other well-studied judgment limitations (Tversky & Kahneman, 1974). Parente & Anderson-Parente (2011) point to the need to choose a more diverse, representative, and competent expert panel for a DM when possible, thereby cutting down on possible technological bias. Additionally, they posit that mere majority rule is not adequate for accuracy; there should be well-structured statistical criteria for robust estimators of accuracy in the panels' final outcomes. Time-based solutions should also be given (i.e., not just the scenarios that are likely to happen, but the time frame in which they should occur) in order to present more useful prognostication. This is what I have called in the past the Nostradamus scam (effect), in which scenarios are predicted in vague generalizations without time frames, so that they can conveniently be fitted retroactively to events for self-approval.
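The weighted-estimation idea can be sketched by weighting each participant inversely to their historical error, so that demonstrably better estimators move the group estimate more. This is my own toy formulation, not Lee & Shi's model; the track-record numbers are invented for illustration:

```python
def weighted_estimate(estimates, past_errors):
    """Combine estimates with weights proportional to 1 / (mean absolute past error)."""
    weights = [1.0 / (e + 1e-9) for e in past_errors]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, estimates)) / total

estimates = [120.0, 95.0, 100.0]   # current-round estimates from three participants
past_errors = [40.0, 5.0, 10.0]    # mean absolute error on earlier questions

straight = sum(estimates) / len(estimates)
weighted = weighted_estimate(estimates, past_errors)
print(straight, weighted)  # the weighted estimate leans toward the historically accurate pair
```

The catch, as argued below, is that the weights themselves must be measured by someone, which reintroduces bias one level up.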
I propose that group estimation and decision-making is much more dynamic than management science or operations research purports it to be. In fact, group estimation and decision-making is a game-theoretic program involving continual iterative Bayesian conditioning and coalition sub-strategies, with an optimized "profit" that can be interpreted as a measurement of the accuracy or realism of a participant's prowess in giving plausible estimates. Individual and group utility functions (in the game-theoretic sense) represent individual and group participatory realism in group decision-making. It is akin to fitness-based multi-criteria decision-making with multiple players (i.e., a multi-criteria evolutionary game structure). How does one measure the effectiveness of a priori probabilities of the success of experts or participants over a long period of time, and account for that in the group decision process? Do you weight each participant differently based on this measure of "realism"? Doing so would introduce certain other biases, because who would measure the measurer? As they stand now, neither methodology explicitly accounts for this realism of the participants. Experts and non-experts alike have agendas, and they are often unconsciously suboptimal for the group. Hence, sub-coalitions form in order to reach pseudo-consensus results under these group methods. There are always elements of coercion when trying to reach consensus in non-agreeing groups. Futuring is no exception and, in fact, could lead to more subversive and insidious coercion because of self-fulfilling prophecies and blind wisdom-of-the-crowds memetic manipulation in future prospecting. Wisdom-of-the-crowds effects may lead to more optimal decisions, but only if the size, diversity, and independence of the deciding panels are sufficient (Lee & Shi, 2011).
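The iterative Bayesian conditioning I have in mind can be sketched as maintaining a posterior weight over each participant's reliability and updating it each time a prediction is scored against a realized outcome. Everything here (the Gaussian likelihood, the noise scale, the numbers) is an illustrative assumption of mine, not a feature of either methodology:

```python
import math

def bayes_update(weights, predictions, outcome, sigma=10.0):
    """One round of Bayesian conditioning: multiply each participant's weight
    by the Gaussian likelihood of the realized outcome given their prediction,
    then renormalize. Repeated rounds concentrate weight on reliable estimators."""
    likes = [math.exp(-((p - outcome) ** 2) / (2 * sigma ** 2)) for p in predictions]
    posterior = [w * l for w, l in zip(weights, likes)]
    total = sum(posterior)
    return [p / total for p in posterior]

weights = [1 / 3] * 3  # uniform prior over three participants
rounds = [([110.0, 101.0, 95.0], 100.0),   # (predictions, realized outcome)
          ([120.0, 99.0, 104.0], 102.0)]
for preds, outcome in rounds:
    weights = bayes_update(weights, preds, outcome)
print(weights)  # weight shifts toward the consistently accurate participant
```

Even this tidy scheme begs the question in the text: the choice of likelihood and noise scale is itself a judgment, so someone must still measure the measurer.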
Even then, the group dynamic may be, at best, suboptimal, stopping at local bests while lacking the long-term wisdom to pursue out-of-the-box approaches and technologies. Elements of social choice theory are also involved when these sub-coalitions form in Delphi-type groups or star chambers. Social evolutionary dynamics are probably more relevant than naive statistical smearing of estimates and opinions in group processes. The MNGT methods suffer from similar statistical weaknesses and the same problem of participant realism. Also, since these participants are usually part of a larger experiment, they become "the experiment" and hence are inextricably bias-tied to the measurement process.
With respect to the various ambitious futuring prospects that I have evangelized in this course, in particular the introduction of generalized non-classical computational models to build hybrid human-machines, either methodology would be insufficient on its own. There are various reasons for this shortfall. One is the relative shortage of knowledge-domain experts across all of the multi-disciplinary fields that would be involved. Another is the insufficiency of the literature that would serve as their starting input. If each method could be modified so that group estimation is given more statistical and dynamic robustness, along with realism (accuracy measurement) of the participants, the prospect of garnering a successful, iteratively formed idea for human-machine hybrids as living support for our frailties would be more plausible. Iterative refinement would probably be beneficial in one respect here. Brainstorming aside, group estimation methods are good for refining the starting points of a development process. Plausible, non-coerced convergence after that may not be so clear-cut or academic, or may not even exist.
I want to add that the variance among the participants' estimates is presumed to decrease as further iterations proceed, based on convergence toward consensus. That variability may be falsely measured because of coercive influences, even in the NGT methodologies, because eventually each individual is faced with subgroup consensus building and hence with the harmonization of ideas. Participants may not "feel" coerced; nevertheless, they are pressured to come to consensus toward the end of the process. This may be a false ending to the consensus process. In the end, consensus building may also not be optimal; that depends on the nature of the problem and on what counts as an acceptable scenario in the solution space.
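The false-convergence worry is easy to illustrate: if each round simply pulls every estimate toward the current group median, variance shrinks mechanically whether or not the group is approaching the truth. A toy simulation of my own, in which the pull strength and the numbers are assumptions:

```python
import statistics

def delphi_round(estimates, pull=0.5):
    """Move each estimate a fraction `pull` of the way toward the group median,
    mimicking conformity pressure rather than genuine information exchange."""
    m = statistics.median(estimates)
    return [x + pull * (m - x) for x in estimates]

estimates = [60.0, 80.0, 100.0, 120.0, 140.0]  # the true value might be, say, 200
for r in range(4):
    print(r, round(statistics.pvariance(estimates), 2))
    estimates = delphi_round(estimates)
# The variance collapses toward zero around the median (100), which says
# nothing about accuracy if the truth lies elsewhere entirely.
```

Shrinking dispersion is therefore a measure of agreement, not of correctness, which is exactly the distinction a "false ending" to consensus obscures.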
In doing further research into group decision-making methodologies, I came upon a very recent article, or more exactly a letter to the editor of Science, in Mercier & Sperber (2012), in which they separate the two most probable scenarios and their experimental dynamics in wisdom-of-crowds effects. The first concerns group decision processes that depend mostly on the perception and memory of the participant panel. If the motive for judging each other is based on perception, from things like self-assessments or the perception of self-worth and how each participant propagates that to everyone else (boosting, reputation, etc.), without regard to actual effectiveness, then consensus is usually reached by appointing leadership to the perceived leader. Communication between the participants becomes irrelevant in this case. In other words, the one with the highest reputation or perceived excellence usually gets their views weighted much more heavily in the final consensus, regardless of the group methodology. The second scenario is when the exchange of arguments is used primarily to root out the best choices (i.e., reasonable debate and ranking). In this situation, the convergence of ideas is optimized when knowledgeable exchanges are made; even a minority consensus can win out through effective debating and reasoning. In the end, by using effective communication and the exchange of ideas in optimally sized panels, the wisdom of crowds does better on average than oligarchies or aristocracies of idea makers. It may do worse if communication does not exist, is minimized by perceived domination within the group, or if majority mob mentality overrides the group's optimal core of reasoning.
In a futurology panel using a DM- or NGT-type methodology, interesting things may occur based on reputation alone. Since futurologists have extremely poor records of prediction, perceptions are all they have. However, if everyone is approximately equally likely to be wrong, then the perception should be that no one deserves to lead, and consensus should rest entirely on the effective and reasonable exchange of ideas, grounded in scientific first principles of physical matter and broad sociological phenomena. One futurologist mentioned by Cynthia in this week's class (in the SL session) was Herman Kahn, he of Cold War thermonuclear and Dr. Strangelove fame. Kahn dropped out of graduate school, was hired by the RAND Corporation (mostly because he shared the extreme right-wing philosophy of its board), and gave us the idea of gaming out wars that got us into so much trouble in the Cold War and Vietnam. Robert McNamara (Kennedy's and Johnson's secretary of defense) was a huge fan, and through him, as well as Henry Kissinger, Kahn helped escalate the Vietnam War without end. Kahn was wrong around 90% of the time, according to Sherden (1998). Kahn had a very dominant character and personality in meetings with the Pentagon, and it is very clear that in any panel he would have ruled by perception and not by the effective exchange of ideas and debate. Kahn would later change his mind about how to approach wars, especially nuclear ones. I think it was a bit too late; it cost a tremendous loss of life in the world and laid the foundation for the military-industrial complex that has put us in so much economic trouble now. McNamara would later admit his and Kahn's mistaken philosophies of war and of predicting the future.
References
Bartunek, J. M., & Murnighan, J. K. (1984). The nominal group technique: Expanding the basic procedure and underlying assumptions. Group and Organization Studies, 9, 417-432.
Dalkey, N. C., & Helmer, O. (1963). An experimental application of the Delphi Method to the use of experts. Management Science, 9(3), 458-468.
Dalkey, N. C. (2002). Toward a theory of group estimation. In H. Linstone & M. Turoff (Eds.), The Delphi method: Techniques and applications (pp. 236-261). London, England: Addison-Wesley.
Dobbie, A., Rhodes, M., Tysinger, J. W., & Freeman, J. (2004). Using a modified nominal group technique as a curriculum evaluation tool. Family Medicine, 36(6), 402-406.
Lee, M. D., & Shi, J. (2011). The accuracy of small-group estimation and the wisdom of crowds.
Mercier, H., & Sperber, D. (2012). "Two heads are better" stands to reason. Science, 336, 979.
Parente, R., & Anderson-Parente, J. (2011). A case study of long-term Delphi accuracy. Technological Forecasting & Social Change.
Sherden, W. A. (1998). The fortune sellers. New York, NY: Wiley.
Skulmoski, G. J., Hartman, F. T., & Krahn, J. (2007). The Delphi method for graduate research.
Surowiecki, J. (2005). The wisdom of crowds. New York, NY: Anchor Press.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124-1131.