Abstract
Implementing machine learning in an enterprise involves tackling a wide range of complexities in the requirements elicitation, design, development, and deployment of such solutions. Despite the relevance of requirements engineering to this process, little research has been done in this area. This paper uses a case study to evaluate the expressiveness and usefulness of GR4ML, a conceptual modeling framework for requirements elicitation, design, and development of machine learning solutions. Our results confirm that the framework includes an adequate set of concepts for expressing machine learning requirements and solution design. The case study also demonstrates that the framework can be useful in machine learning projects, both by revealing new requirements that would have been missed without it and by facilitating communication among project team members of different roles and backgrounds. Feedback from study participants and areas for improvement of the framework are also discussed.
Notes
See the section on threats to validity for further details.
Interestingly, recent research in the healthcare domain also supports the idea that enrolling the wrong doctors into government programs can be a contributing factor toward failure of such programs. These research publications were unknown to the modelers and project team during the Business View modeling.
CRoss-Industry Standard Process for DM.
The International Workshop on Requirements Engineering for Artificial Intelligence (RE4AI).
Software Engineering for Machine Learning Applications International Symposium.
Acknowledgements
We wish to thank anonymous reviewer #1 for her/his valuable comments, especially the suggestion to highlight the centrality of the Insight modeling elements as a link between the three modeling views.
Appendices
Appendix A: List of prompting questions for constructing models in the framework
Constructing Business View models
• What are the key business strategies in your domain of interest?
• Who is responsible for, or aims to achieve, those goals?
• How are they achieving this? How else can we achieve this?
• Why are they doing this?
• What are the key performance indicators in this context?
• How would you measure how well you are achieving those goals?
• What are the business decision(s) that need analytics (or data-driven) support? Who are those decision makers?
• Why would they need to make such decisions? Which business goal is each decision part of? Which business (routine) process is this decision part of?
• What is the frequency of each decision (how often)?
• What would the decision maker(s) need to know during the decision processes?
• What questions come to their mind (and need an answer) during their decision-making activities?
• For each question, if it is too broad, can you break it into sub-questions?
• Specify the tense (past, present, or future) and frequency (how often) of the questions
• From the given list, specify what kind of answer each business question needs: predictive model, groupings of the data (segments), probability model, diagram (visualization), or logical rules
• For each of the above, specify the Input, Output, Usage Frequency, Update Frequency, and Learning Period of the machine learning model
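The answers to the last two prompts can be kept in a small, checkable record per insight. The sketch below is only an illustration under stated assumptions: the names `InsightKind`, `InsightSpec`, and `missing_fields`, as well as the example question and values, are hypothetical and are not part of the framework's notation. Only the five kinds of answers and the five attributes come from the prompts above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class InsightKind(Enum):
    # The five kinds of answers named in the prompting questions.
    PREDICTIVE_MODEL = "predictive model"
    SEGMENTS = "groupings of the data (segments)"
    PROBABILITY_MODEL = "probability model"
    VISUALIZATION = "diagram (visualization)"
    LOGICAL_RULES = "logical rules"


@dataclass(frozen=True)
class InsightSpec:
    """Hypothetical record of one insight and its five attributes."""
    question: str            # business question the insight answers
    kind: InsightKind
    inputs: Tuple[str, ...]  # Input
    output: str              # Output
    usage_frequency: str     # Usage Frequency
    update_frequency: str    # Update Frequency
    learning_period: str     # Learning Period

    def missing_fields(self) -> List[str]:
        """Return the attribute names that are still blank."""
        text_fields = ("question", "output", "usage_frequency",
                       "update_frequency", "learning_period")
        missing = [name for name in text_fields
                   if not getattr(self, name).strip()]
        if not self.inputs:
            missing.append("inputs")
        return missing


# Invented example values; the learning period has not been elicited yet.
spec = InsightSpec(
    question="Which patients are likely to miss their next appointment?",
    kind=InsightKind.PREDICTIVE_MODEL,
    inputs=("age", "number of past visits"),
    output="missed appointment (yes/no)",
    usage_frequency="daily",
    update_frequency="monthly",
    learning_period="",
)
print(spec.missing_fields())
```

A blank attribute shows up in `missing_fields()`, which gives the modeler a concrete item to raise in the next elicitation session.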
Constructing Analytics View models
• What kind of analytics (descriptive, predictive, or prescriptive) would be appropriate to generate the required insights?
• What algorithm(s) exist for fulfilling the analytics goal at hand?
• What quality attributes or non-functional requirements (NFRs) are critical for users?
• What numeric metrics would be used to compare/evaluate the algorithms?
• Define the threshold (upper or lower) values for indicators (e.g., minimum required accuracy for predictive models)
• How are the critical NFRs influenced by alternative algorithms?
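The prompts on metrics and thresholds can be illustrated with a simple check of candidate algorithms against indicator thresholds. This is a minimal sketch, not the framework's own evaluation procedure: `Indicator`, `acceptable_algorithms`, the algorithm names, and all measured values are hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Indicator:
    """A numeric metric with a threshold (hypothetical structure)."""
    name: str
    threshold: float
    higher_is_better: bool = True  # True: lower bound; False: upper bound

    def satisfied_by(self, value: float) -> bool:
        if self.higher_is_better:
            return value >= self.threshold
        return value <= self.threshold


def acceptable_algorithms(indicators: List[Indicator],
                          measurements: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the algorithms whose measurements meet every threshold."""
    return [
        algorithm
        for algorithm, values in measurements.items()
        if all(ind.satisfied_by(values[ind.name]) for ind in indicators)
    ]


# Invented thresholds: minimum accuracy and maximum training time.
indicators = [
    Indicator("accuracy", threshold=0.80),
    Indicator("training_seconds", threshold=60.0, higher_is_better=False),
]

# Invented measurements for three candidate algorithms.
measurements = {
    "decision_tree": {"accuracy": 0.82, "training_seconds": 5.0},
    "neural_network": {"accuracy": 0.91, "training_seconds": 240.0},
    "naive_bayes": {"accuracy": 0.74, "training_seconds": 2.0},
}

print(acceptable_algorithms(indicators, measurements))  # ['decision_tree']
```

In this invented example the most accurate candidate is rejected because it exceeds the training-time bound, which is the kind of trade-off between alternative algorithms and NFRs that the last prompt asks about.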
Constructing Data Preparation View models
• What kind of data would be relevant for generating the insights and answering the business question at hand?
• What data attributes (i.e., features), in what format and at what aggregation level, are needed for the question goals under consideration?
• Where is the data stored, and what is the data schema (i.e., entities and relationships)?
• Explain, to the best of your understanding, the attributes, format, and size of the dataset at hand
• For each attribute, what are the data type, aggregation level, and selection of records (filtering)?
• What (sequence of) integration, cleaning, aggregation, filtering, and other data preparations are needed to transform the raw data tables into the prepared data tables?
• Are there any data quality concerns?
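The prompt about the sequence of preparation steps can be made concrete with a toy pipeline. The sketch below assumes two invented raw tables (`patients` and `visits`) with hypothetical attributes; the function names and the order of steps (integration, cleaning, filtering, then aggregation) are illustrative choices, not steps prescribed by the framework or taken from the case study.

```python
from collections import defaultdict
from typing import Any, Dict, List

Row = Dict[str, Any]


def integrate(visits: List[Row], patients: List[Row]) -> List[Row]:
    """Integration: join each visit with its patient on patient_id."""
    by_id = {p["patient_id"]: p for p in patients}
    return [{**v, **by_id[v["patient_id"]]}
            for v in visits if v["patient_id"] in by_id]


def clean(rows: List[Row], required: List[str]) -> List[Row]:
    """Cleaning: drop rows where a required attribute is missing."""
    return [r for r in rows if all(r.get(k) is not None for k in required)]


def filter_rows(rows: List[Row], min_year: int) -> List[Row]:
    """Filtering: keep only records from min_year onward."""
    return [r for r in rows if r["year"] >= min_year]


def aggregate(rows: List[Row]) -> List[Row]:
    """Aggregation: one row per patient with visit count and mean glucose."""
    groups: Dict[Any, List[float]] = defaultdict(list)
    for r in rows:
        groups[r["patient_id"]].append(r["glucose"])
    return [{"patient_id": pid, "visits": len(vals),
             "mean_glucose": sum(vals) / len(vals)}
            for pid, vals in groups.items()]


# Invented raw data tables.
patients = [{"patient_id": 1, "age": 54}, {"patient_id": 2, "age": 61}]
visits = [
    {"patient_id": 1, "year": 2018, "glucose": 6.0},
    {"patient_id": 1, "year": 2019, "glucose": 7.0},
    {"patient_id": 2, "year": 2019, "glucose": None},  # missing value
    {"patient_id": 2, "year": 2016, "glucose": 5.5},   # too old
    {"patient_id": 3, "year": 2019, "glucose": 6.1},   # no matching patient
]

prepared = aggregate(
    filter_rows(clean(integrate(visits, patients), ["glucose"]), 2018)
)
print(prepared)
```

Writing each preparation step as a separate function mirrors the way the prompts ask for an explicit sequence of operations, and it makes data quality concerns (missing values, unmatched records) visible as rows dropped at a specific step.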
Appendix B: Questionnaire used for collecting feedback in post-modeling interviews
[Q1] At the end of modeling sessions, were the modelers able to arrive at a characterization of your existing analytics solution/product?
- If your answer is NO, please explain what aspects/parts/components of your product/solution were not identified at the end of modeling sessions.
- If your answer is YES, please provide 2–3 sentences on which area of the graphical models corresponds to which part of your product.
[Q2] Through the course of this collaboration, were there any understandings or findings that you and your team had not been able to arrive at prior to the modeling activities? Please provide 2–3 examples.
[Q3] What did you find useful about the framework? (Write 3–4 sentences or bullet points). This can include specific modeling language features or methodological steps, as well as the general approach.
[Q4] What do you think is most lacking in the framework? Are there additions to or variations on the framework that you would like to see?
[Q5] Provide 2–3 examples of features that are not part of your current product/solution but that, after the modeling sessions, you think could be fruitful additions.
[Q6] What are the aspects or features of the framework that you consider least useful? (This can include modeling language features as well as methodological steps.)
[Q7] In arriving at your current analytics solution/product, you evolved the product conception and design through one or more iterations in the past. In retrospect, do you think using the modeling framework would have enabled you to arrive at a viable product more easily or sooner (e.g., by uncovering pain points, analyzing failure stories and scenarios, and providing guidance and focus in the search for solutions)?
Cite this article
Nalchigar, S., Yu, E. & Keshavjee, K. Modeling machine learning requirements from three perspectives: a case report from the healthcare domain. Requirements Eng 26, 237–254 (2021). https://doi.org/10.1007/s00766-020-00343-z
DOI: https://doi.org/10.1007/s00766-020-00343-z