The Initial Vision of Intelligent Tutoring Systems

One of the initial visions for intelligent tutoring systems was of systems as perceptive as a human teacher (see discussion in Self 1990) and as thoughtful as an expert tutor (see discussion in Shute 1990), using some of the same pedagogical and tutorial strategies used by expert human tutors (Merrill et al. 1992; Lepper et al. 1993; McArthur et al. 1990). These systems would explicitly incorporate knowledge about the domain and about pedagogy (see discussion in Wenger 1987), as part of engaging students in complex and effective mixed-initiative learning dialogues (Carbonell 1970). Student models would infer what a student knew (Goldstein 1979) and how motivated the student was (Del Soldato and du Boulay 1995), and would use this knowledge to make decisions that improved student outcomes along multiple dimensions.

An intelligent tutoring system would not just be capable of supporting learning. An intelligent tutoring system would behave as if it genuinely cared about the student’s success (Self 1998). This is not to say that such a system would actually simulate the process of caring and being satisfied by a student’s success, but the system would behave identically to a system that did, meeting the student’s needs in a comprehensive fashion.

And these systems would not just be effective at promoting students’ learning; the systems themselves would also learn how to teach (O’Shea 1982; Beck 1997). The systems would study what worked, and when it worked, and they would improve themselves over time (Beck 1997; Beck et al. 2000).

In 2015, after decades of hard work by many world-class scientists, we have wonderful demonstrations of the potential of this type of technology. We have systems that can provide support on every step in a student’s thinking process (VanLehn et al. 2005; Anderson et al. 1995); systems that can talk with students in natural language (Nye et al. 2014; earlier on, see Stevens and Collins 1977); systems that model complex teacher and tutor pedagogical strategies (Heffernan and Koedinger 2002; Khachatryan et al. 2014); systems that recognize and respond to differences in student emotion (D’Mello et al. 2010; Arroyo et al. 2014); and simulated students that enable human students to learn by teaching (Leelawong and Biswas 2008; Matsuda et al. 2010).

And at the same time, we have intelligent tutoring systems being used by tens or hundreds of thousands of students a year and achieving outcomes that would make the earliest proponents of intelligent tutoring systems proud, with statistically significant positive impacts on student learning; these systems include SQL-Tutor (Mitrovic and Ohlsson 1999), ALEKS (Craig et al. 2013), Cognitive Tutor (Pane et al. 2014), and ASSISTments (Koedinger et al. 2010).

A Disconnect

But there is a disconnect between the vision of what intelligent tutoring systems could be, and what they are; a disconnect between the most impressive examples of what intelligent tutors can do, and what current systems used at scale do. In fact, the most widely used intelligent tutoring systems are in some ways the furthest from the initial vision of researchers like Carbonell and Self.

We can start with the ways that tutors used at scale resemble this initial vision. Domain modeling is present in many of the systems used at scale. For example, Cognitive Tutors use production-rule models to represent skills (Anderson et al. 1995), ALEKS uses prerequisite hierarchies to represent the connections between items (Falmagne et al. 2013), and SQL-Tutor uses extensive constraint-based models to represent appropriate performance (Mitrovic and Ohlsson 1999).

So, too, student knowledge modeling is seen in some of the systems used at scale, with Cognitive Tutors using Bayesian Knowledge Tracing (BKT; Corbett and Anderson 1995), and ALEKS using formulas based on knowledge space theory (Falmagne et al. 2013). But here, despite decades of work on knowledge modeling, and the intense competition between approaches seen in published papers (see, for instance, Pavlik et al. 2009; Pardos et al. 2011; Papousek et al. 2014), the approaches used in practice remain fairly simple. For example, many systems in wide use depend on simple heuristics to assess student mastery, such as whether the student gets three right in a row (Heffernan and Heffernan 2014).
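
To make the contrast concrete, here is a minimal sketch in Python of the two approaches to inferring mastery: the three-right-in-a-row heuristic, and a single-skill Bayesian Knowledge Tracing update. The parameter values and the simulated responses are illustrative choices of mine, not drawn from any deployed system.

```python
# Minimal sketch contrasting two mastery-inference approaches.
# Parameter values below are illustrative only, not from any deployed system.

def three_in_a_row_mastered(responses):
    """Heuristic mastery: the student's last three responses are all correct."""
    return len(responses) >= 3 and all(responses[-3:])

def bkt_update(p_known, correct, p_learn=0.1, p_guess=0.2, p_slip=0.1):
    """One step of standard Bayesian Knowledge Tracing (Corbett and Anderson 1995).

    p_known: current probability the student knows the skill.
    Returns the updated probability after observing this response.
    """
    if correct:
        evidence = p_known * (1 - p_slip)
        total = evidence + (1 - p_known) * p_guess
    else:
        evidence = p_known * p_slip
        total = evidence + (1 - p_known) * (1 - p_guess)
    p_known_given_obs = evidence / total
    # Account for the chance the student learned the skill at this opportunity.
    return p_known_given_obs + (1 - p_known_given_obs) * p_learn

responses = [False, True, True, True]   # one simulated student's attempts
p = 0.3                                 # illustrative initial knowledge estimate
for r in responses:
    p = bkt_update(p, r)
print(three_in_a_row_mastered(responses), round(p, 3))
```

Even this simple contrast shows the trade-off: the heuristic is trivial to implement and to explain to teachers, while BKT yields a graded probability estimate whose parameters must be fit and maintained.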

And work to adapt to affect, engagement, meta-cognition, and self-regulated learning has yielded projects that improved student outcomes (Baker et al. 2006; Arroyo et al. 2007; D'Mello and Graesser 2012), but this work has largely been conducted in small-scale studies rather than deployed at large scale. What is most intriguing is that some of these projects were conducted in the context of systems deployed at scale; but the research innovations, even apparently successful ones, are not then integrated into the systems deployed at scale.

So, too, despite the initial enthusiasm about systems that can use reinforcement learning to improve themselves (e.g. Beck 1997; Beck et al. 2000), few systems incorporate this capacity. There have been some simulation studies around reinforcement learning and related machine learning approaches (e.g. POMDPs) for intelligent tutors (Chi et al. 2010; Rafferty et al. 2011), but little work to deploy these solutions in systems used at scale.

As such, we are left with a bit of a puzzle. ITS research has been successful at producing impressive technologies (and there are many beyond the small sample discussed here), and intelligent tutoring systems are now being used by tens or hundreds of thousands of learners, but the systems used at scale are generally not representative of the full richness that research systems demonstrate.

New Excitement with MOOCs

This mystery is particularly relevant at this historical moment. Massive Open Online Courses, or MOOCs (McAuley et al. 2010), have emerged into the consciousness of a large proportion of educated people worldwide. MOOCs provide a combination of video lectures, online assignments, and discussion forums (and connectivist MOOCs, or c-MOOCs, provide additional pedagogies as well – Rodriguez 2012). These systems can incorporate intelligent tutor-style assignments (Aleven et al. 2015) but typically provide an experience focused more on didactic lectures and discussion forums than on the types of activities typical of intelligent tutoring systems.

Many of the leading proponents of MOOCs have advertised them as crucibles for innovation, with huge potential to revolutionize education by making high-quality learning materials, once available only to very limited numbers of learners, available to the masses. Much of the rhetoric and many of the ideas around MOOCs match the earlier enthusiasm around intelligent tutoring systems – MOOCs will leverage the power of big data and reinforcement learning to improve themselves (Raghuveer et al. 2014); MOOCs will adapt to individual learners and provide a truly personalized learning experience (Norvig 2012).

Thus far, most MOOCs have fallen far short of the hype. Intelligent tutoring-style or simulation-based assignments have only begun to be embedded into MOOCs (Ostashewski 2013; Diaz et al. 2013; Aleven et al. 2015); collaborative chat activities and activities leveraging social media have only been lightly deployed (Joksimović et al. 2015; also see dance.cs.cmu.edu); and at the time of this writing, most MOOCs are still limited to providing very basic multiple-choice assignments surrounding sometimes low-quality video lectures, and overwhelmingly large discussion forums in which many instructors participate only lightly or not at all. Perhaps MOOCs can be forgiven for not achieving in three years what intelligent tutoring systems still struggle to provide after decades, but the fact remains that the same pattern of development seems to be repeating itself: large-scale deployment of solutions that fall far short of the visions and the rhetoric.

As such, this appears to be a valuable time to take stock of how online learning systems used at scale differ from the original vision of intelligent tutoring systems, and what this might mean.

A Different Vision

One potential response to the still relatively simple technology seen in online learning is that these developments take time, and that solutions can simply be slow to make their way from the research laboratory, to the research classroom, to the full diversity of classrooms (Corbett et al. 2001). This perspective may very well be right. There are a number of economic factors that come into play and may slow the progress of innovations into use. Certainly, it is the perspective that a great deal of my own work has adopted. It has been my dream, and continues to be my dream, that intelligent tutoring systems that incorporate detectors of – say – gaming the system, and adapt in real time when students game the system, will one day be commonplace.

But I find myself wondering, is it possible that this is not what the world of educational technology will look like? Is it possible that there’s another path towards developing excellent online learning technologies? And is it possible that this world has been developing around us, that this alternate path is already happening, while we continue to work on building more and more sophisticated intelligent tutors?

So let me pose the possibility of a different way that the excellent online learning systems of tomorrow could be developed. Perhaps we do not in fact need intelligent tutoring systems. Perhaps instead what we need, what we are already developing, is stupid tutoring systems. Tutors that do not, themselves, behave very intelligently. But tutors that are designed intelligently, and that leverage human intelligence.

In other words, perhaps what we need is stupid tutoring systems, and intelligent humans.

What would this look like?

Envision that we design a system with relatively simple interactions with students. A student is posed a mathematics problem. They can answer it, or request a hint. If they ask for a hint, they get a pre-defined sequence of hints; if they give a wrong answer, they get a message telling them why they are wrong, or perhaps a scaffolding problem that helps them with a key step towards the answer. They keep working on math problems for the current skill until they get three in a row right. And the next morning, their teacher can look up which problems they and their classmates got right and wrong.

I am referring, of course, to the ASSISTments system (Heffernan and Heffernan 2014), one of the most widely used (and simplest) intelligent tutoring systems in use today.
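
The interaction described above can be sketched in a few dozen lines. The version below is purely an illustration of that design, with hypothetical function names and toy content; it is not actual ASSISTments code.

```python
# Sketch of the simple interaction flow described above (hypothetical names and
# toy content; an illustration of the design, not actual ASSISTments code).

PROBLEMS = [
    # (prompt, correct answer, hint sequence, wrong-answer message)
    ("3x = 12, x = ?", "4", ["Divide both sides by 3."], "Check your division."),
    ("x + 5 = 9, x = ?", "4", ["Subtract 5 from both sides."], "Check your subtraction."),
    ("2x = 14, x = ?", "7", ["Divide both sides by 2."], "Check your division."),
    ("x - 2 = 5, x = ?", "7", ["Add 2 to both sides."], "Check your addition."),
]

def run_skill_session(problems, get_student_input, mastery_streak=3):
    """Run a student through one skill's problems until three right in a row."""
    log, streak = [], 0                              # log becomes the teacher report
    for prompt, answer, hints, wrong_msg in problems:
        hints_shown = 0
        while True:
            response = get_student_input(prompt)
            if response == "hint" and hints_shown < len(hints):
                print("HINT:", hints[hints_shown])   # pre-defined hint sequence
                hints_shown += 1
                continue
            correct = (response == answer)
            log.append((prompt, response, correct))
            if correct:
                streak += 1
            else:
                streak = 0
                print("FEEDBACK:", wrong_msg)        # or a scaffolding problem
            break
        if streak >= mastery_streak:                 # simple mastery heuristic
            return log
    return log

# A scripted "student" standing in for real input.
scripted = iter(["hint", "4", "4", "7"])
report = run_skill_session(PROBLEMS, lambda prompt: next(scripted))
for row in report:
    print(row)   # what the teacher sees the next morning
```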

ASSISTments is not just an example of simple design. It’s an example of good design. It’s an example of data-driven design. Data-driven design is not a new idea in AIED; it dates back at least to Self’s (1990) model of the iterative design of intelligent tutoring systems.

But systems like ASSISTments bring iterative design based on experimentation to a new level. There have been innumerable studies testing different aspects of the ASSISTments system (Ostrow and Heffernan 2014), answering questions such as: Should we use hints or scaffolds (Razzaq and Heffernan 2006)? Should hints be delivered using text or video (Ostrow and Heffernan 2014)? How much do messages on growth mindsets benefit students (Ostrow et al. 2014)? Should we display gaming behavior indicators to teachers (Walonoski and Heffernan 2006)? The design of ASSISTments is based, from the ground up, on data. Data collected through hundreds of A/B tests, quick randomized controlled trials. Data analyzed by humans.
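
As a sketch of what the analysis behind one such A/B test might look like, the following compares mastery rates across two hint conditions with a standard chi-square test. The numbers and condition names are hypothetical; this is not the actual ASSISTments analysis pipeline.

```python
# Sketch of analyzing a simple A/B test, e.g. text hints vs. video hints.
# Data values are hypothetical; this is not the actual ASSISTments analysis.
from scipy.stats import chi2_contingency

# Rows: condition (text hint, video hint); columns: (mastered skill, did not)
observed = [[132, 68],    # text-hint condition
            [151, 49]]    # video-hint condition

chi2, p_value, dof, expected = chi2_contingency(observed)
text_rate = observed[0][0] / sum(observed[0])
video_rate = observed[1][0] / sum(observed[1])

print(f"Text hints:  {text_rate:.2%} mastered")
print(f"Video hints: {video_rate:.2%} mastered")
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```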

ASSISTments looks a whole lot like this vision I am articulating. And it has scaled. And it has helped students learn (Koedinger et al. 2010).

This idea of intelligence residing in humans rather than in the tools they use is not a novel idea, of course. The research communities on human computation (Quinn and Bederson 2011), on crowdsourcing (Brabham 2008), and on intelligence amplification (Freund 2013) have had the same idea. The distinction between learning analytics and educational data mining can in some ways be traced back to this difference (Baker and Siemens 2014). The idea that people develop tools, and that tools are refined and used by intelligent individuals based on practice, is well-known and long-standing within human-computer interaction (Winograd and Flores 1986). This often takes the form of considering human-tool systems or human-human-tool systems or broader socio-technological systems. Sometimes, system-based views on technology and education can descend into fancy rhetoric and thought-provoking essays (Winograd and Flores 1986; Greeno 1997; Baker 2016), rather than solid empirical evidence. I am not arguing for that as a general research approach. I am a hard-edged, passionate believer in data, data, and more data, with as quantitative a lens as possible.

But we seem to be entering an era where data is being used more in the service of design and human decision-making than of automated personalization.

Educational Data Mining: Making Discoveries, Improving Education

A lot of the rhetoric around the emerging field of educational data mining has been that big data will enable us to develop rich student models that can be used in the kinds of automated personalization that we see in intelligent tutoring systems. I am familiar with that rhetoric. I have written a lot of it (Baker and Yacef 2009; Baker 2010; Baker and Siemens 2014).

But that has not been the only vision for big data and education. A second vision, present from the start, is that we can use educational data mining to make basic discoveries in the science of learning and enhance theory (Beck and Mostow 2008; Jeong and Biswas 2008; Baker and Yacef 2009; Baker 2010).

For example, sophisticated models of affect, engagement, meta-cognition, self-regulated learning, and domain structure have often been looked at as tools to enable automated intervention: systems that can tell when a student is bored, for instance, and adapt to re-engage that student. But instead of building these models into an intelligent tutor, we can make them tools for research and analysis by intelligent humans. The findings of these analyses can in turn be used to enhance the design of online learning systems.

For example, Koedinger and colleagues (2012) used learning factors analysis to re-fit the mappings between knowledge components and specific problem steps in Cognitive Tutor Geometry. Though the skills in this learning system were derived through extensive cognitive modeling, they were still imperfect, and educational data mining methods were able to figure out how. Koedinger and his colleagues found through this research that some problem steps that were thought to involve the same skill involved different cognitive skills; for example, some problems involving computing the area of a circle, thought to involve a single skill, involved backwards reasoning instead of forward reasoning, resulting in an additional knowledge component to learn. In subsequent work, Koedinger et al. (2013) used these findings to re-design a tutor lesson, leading in an experimental study to significantly faster learning.
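
A stripped-down sketch of this kind of model comparison appears below: two candidate knowledge-component mappings are compared by how well a simple logistic model fits the performance data, broadly in the spirit of the Additive Factors Model used in learning factors analysis (here without per-student terms or per-skill learning rates). The data, step names, and simplified practice-count feature are hypothetical; this is not Koedinger and colleagues' actual analysis.

```python
# Sketch: compare two candidate knowledge-component (KC) mappings by how well
# each fits performance data, in the spirit of Learning Factors Analysis.
# Data and KC labels are hypothetical; this is not the original analysis.
import pandas as pd
import statsmodels.api as sm

# One row per student attempt at a problem step (synthetic data).
data = pd.DataFrame({
    "step":        ["area-fwd"] * 6 + ["area-bwd"] * 6,
    "correct":     [0, 1, 1, 1, 1, 1,   0, 0, 1, 0, 1, 1],
    "opportunity": [1, 2, 3, 4, 5, 6,   1, 2, 3, 4, 5, 6],  # simplified practice count
})

# Mapping A: one skill covers all circle-area steps.
# Mapping B: forward and backward reasoning are separate skills.
mappings = {
    "single skill":  {"area-fwd": "circle-area",  "area-bwd": "circle-area"},
    "split fwd/bwd": {"area-fwd": "area-forward", "area-bwd": "area-backward"},
}

for name, kc_map in mappings.items():
    kcs = data["step"].map(kc_map)
    X = pd.get_dummies(kcs, dtype=float)        # one intercept per KC
    X["opportunity"] = data["opportunity"]
    result = sm.Logit(data["correct"], X).fit(disp=0)
    print(f"{name}: BIC = {result.bic:.1f}")    # lower BIC = better-fitting mapping
```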

A similar goal can be achieved with predictive analytics models – models that make a prediction of some longer-term outcome, such as course failure or dropout (Arnold and Pistilli 2012; Barber and Sharkey 2012; Ming and Ming 2012), or failure to graduate (Dekker et al. 2009; Kovačić 2010). Some of these models rely upon fine-grained student behavior (Arnold and Pistilli 2012; Ming and Ming 2012); others rely more upon demographics or other relatively stable student attributes (Kovačić 2010; Barber and Sharkey 2012). While the use of demographic data can be controversial and can be argued to be relatively less actionable (and I argue this point in my MOOC – Baker 2014), learner behaviors often provide direct indicators for which it is easy to conceive of interventions.
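
To give a feel for the behavioral flavor of such models, here is a sketch of a course-failure predictor built on logged behaviors. The feature names, the synthetic data, and the choice of logistic regression are mine for illustration; real deployments use far richer features and far more careful validation.

```python
# Sketch of a predictive analytics model for course failure, built on
# behavioral features. Feature names and data are synthetic illustrations.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 400

# Behavioral features logged by the learning system (synthetic).
logins_per_week  = rng.poisson(4, n)
assignments_late = rng.poisson(2, n)
forum_posts      = rng.poisson(3, n)
X = np.column_stack([logins_per_week, assignments_late, forum_posts])

# Synthetic outcome: failure risk rises with lateness, falls with activity.
logit = -0.5 - 0.4 * logins_per_week + 0.8 * assignments_late - 0.2 * forum_posts
failed = rng.random(n) < 1 / (1 + np.exp(-logit))

model = LogisticRegression()
auc = cross_val_score(model, X, failed, cv=5, scoring="roc_auc").mean()
print(f"Cross-validated AUC: {auc:.2f}")

model.fit(X, failed)
for name, coef in zip(["logins/week", "late assignments", "forum posts"],
                      model.coef_[0]):
    print(f"{name}: {coef:+.2f}")   # which behaviors drive predicted risk
```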

As such, EDM has the potential to help us better understand learning and the phenomena that surround it. This can help us in turn to enhance the design of online learning environments. I’ll discuss some examples of the use of predictive analytics models for this in the following section.

Learning Analytics: Better Reporting

Several years after the emergence of the educational data mining community, a second community emerged, seemingly working in the same space – the learning analytics community. As is often the case when two communities emerge in the same general area, at first there was confusion as to what the boundaries were between these communities.

It quickly became clear that, despite common interests, there were important differences between learning analytics and educational data mining. George Siemens and I summarize a few of the core differences elsewhere (Siemens and Baker 2012; Baker and Siemens 2014). But one of the key differences, at least in terms of the questions this article considers, was a shift from using data mining to support automated intervention, to using it to support reporting.

A system can report on a student’s state to several different potential stakeholders. Open learner models and related systems report on a student to the student themselves, and can also provide reports to student peers for comparative purposes (e.g. Bull and Nghiem 2002). Many at-risk prediction systems report on a student to their instructors (e.g. Arnold and Pistilli 2012). Other systems present reports to guidance counselors, parents (Broderick et al. 2010; Hawn 2015; Bergman under review), regional curriculum coordinators, and school or university leaders (e.g. Zapata-Rivera and Katz 2014).

One of the best-known examples of the use of reporting to drive change is the case of Course Signals, originally Purdue Course Signals (Arnold and Pistilli 2012). This system takes models that can predict student success, applies them to make predictions in real time, determines why students are at risk, and provides this information to instructors, along with practice recommendations. For example, the system may suggest that an instructor email a student to discuss their inactivity in the course, and may even recommend specific text for such an email. It is, of course, up to the instructor’s discretion whether he or she will follow those recommendations; this is typically seen as an advantage, as an instructor may be aware of situation-specific information, unavailable to the system, that suggests an alternate course of action is more appropriate in a specific case. Course Signals has been found to lead to significantly higher course and university retention (Arnold and Pistilli 2012).
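
As an illustration of what prescriptive reporting of this kind can look like in code, the sketch below maps a risk estimate and its apparent drivers to a suggested instructor action and a draft message. The thresholds, rules, and message text are invented for illustration; this is not Course Signals' actual logic.

```python
# Sketch of turning a risk prediction into an instructor-facing recommendation.
# Thresholds, rules, and message text are invented; not Course Signals' logic.

def recommend_action(student, risk):
    """Map a risk estimate and its apparent drivers to a suggested action."""
    if risk < 0.5:
        return None                      # no recommendation needed
    if student["days_since_last_login"] > 7:
        return {
            "action": "email",
            "draft": (f"Hi {student['name']}, I noticed you haven't logged in "
                      "recently. Is everything OK? Let me know how I can help."),
        }
    if student["avg_assignment_score"] < 0.6:
        return {"action": "suggest office hours",
                "draft": f"Hi {student['name']}, let's review the recent assignments."}
    return {"action": "monitor", "draft": None}

student = {"name": "Jordan", "days_since_last_login": 10, "avg_assignment_score": 0.8}
print(recommend_action(student, risk=0.7))   # instructor decides whether to act on it
```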

Other early-warning systems, similar to Course Signals, have sprung up, with a veritable ecosystem of companies (and non-profits, and university projects) offering predictive analytics on student success and early warnings for instructors when a student is at risk, including ZogoTech (Wood and Williams 2013), and the Open Academic Analytics Initiative (Jayaprakash et al. 2014).

The emergence of these themes is also starting to be seen in other types of online learning and AIED technologies. For example, the S3 project gives the teacher an ongoing distillation of student activities in a full-room, multi-screen collaborative learning activity, giving the teacher the ability to orchestrate and change the activities students are working on based on this information, or to interact with individual student groups in real time (Slotta et al. 2013). In the Virtual Collaborative Research Institute system, real-time information is given to instructors on student participation in collaborative chat and on whether students are agreeing or disagreeing, towards helping the instructors take real-time action to improve the quality of collaborative discussions (van Leeuwen et al. 2014), specifically targeting groups having problems (van Leeuwen et al. 2015). Learning analytics on student performance in serious games is now starting to be offered to instructors as well (Serrano-Laguna and Fernández-Manjón 2014).

These systems join the long-term efforts of intelligent tutoring systems like Cognitive Tutor, Reasoning Mind, and ASSISTments to provide extensive reports to teachers (Anderson et al. 1995; Feng and Heffernan 2005; Miller et al. 2015). In the case of Reasoning Mind, teachers use these reports in real time, obtaining information that a student is struggling with a specific concept right now and engaging in proactive remediation (Miller et al. 2015). In the case of ASSISTments, teachers often read the reports on the previous night’s homework before class, and re-work their planned lecture based on data about which questions students struggled with (Feng and Heffernan 2005).

Analytics-based reporting for parents is just emerging, from the attempts of charter schools in New York City to provide predictions to parents along with details on the factors creating risk for individual students (Hawn 2015), to text messages sent to parents with details on missing assignments and low grades (Bergman et al. under review), to text messages and online reports to parents on what material their students are studying and how they are performing (Broderick et al. 2010).

These types of analytics have become useful at a broader grain-size as well. Data from automated detectors of student engagement is now being made available to regional coordinators for the Reasoning Mind system, to identify specific classrooms where teachers need additional support (Mulqueeny et al. 2015).

These systems address different goals from each other – from trying to prevent course dropout at the college level, to changing content instruction and classroom pedagogy, to identifying trouble spots in regional implementations of learning technologies. But what they share in common is the goal of getting key information to a human being who can use it. Some solutions are more prescriptive – Course Signals recommends specific actions and email text to instructors. Other systems simply give indications of performance and behavior, and let the instructor or parent decide what to do. As a group, these solutions place the ultimate decisions in the hands of a human being.

Advantages of Humans

In the previous two sections, I have discussed how educational data mining and learning analytics methods – particularly automated detection of complex constructs, and predictive analytics – can be put to powerful use in two fashions beyond automated intervention: re-design based on discovery-with-models analyses, and reporting.

In both these uses of prediction models, the common thread is that AI technology is used to derive important information about an online learning system, but the action taken is not by the system itself; instead action is taken by a human. The learning system is not itself intelligent; the human intelligence that surrounds the system is supported and leveraged. Designers are informed to support re-design and enhancement of a learning system; instructors are informed so that they can support the student right away.

There are several advantages to this approach, relative to a more automated intervention strategy.

First of all, automated interventions can be time-consuming to author. Researchers seldom report how long it took them to develop new interventions, but authoring an entirely new behavior for a pedagogical agent is not cheap. For example, it took this author several months to design and implement the pedagogical agent that responded to gaming the system in Cognitive Tutors (Baker et al. 2006). That agent only worked in a small number of Cognitive Tutor lessons; scaling the intervention up would have been less costly than building it in the first place, but it still would have taken considerable effort.

Second, automated interventions can be brittle. No predictive model is perfect (human predictions are not exactly perfect either). An automated system cannot recognize when a model is clearly wrong, due perhaps to unusual circumstances or a change in context. And if an automated intervention is not working, it’s difficult for a system to recognize this and change tack. More importantly, if an automated intervention goes badly wrong in an unexpected way (the student starts crying), the system has limited scope to recognize this and take action.

Automated interventions are brittle in a different way as well: students can adapt faster than automated systems. An encouraging message may not be so encouraging the 12th time; a student may figure out how to defeat an intervention designed to prevent gaming the system, and find new ways to game (a behavior reported in Murray and VanLehn 2005).

Third, students change over time. Automated interventions therefore need to be re-checked and adapted over time. For example, overall student attitudes towards intelligent tutoring systems appear to have changed a great deal over the last 20 years. Schofield (1995) reported that Pittsburgh high school students were extremely engaged and even came outside of regular class hours to use the Cognitive Tutor, a behavior that does not appear to be common in American classrooms today. However, extremely high engagement along the lines reported by Schofield has been more recently reported among students using Cognitive Tutors in the Philippines (Rodrigo et al. 2013). It is not clear that this engagement will persist if intelligent tutoring systems become a regular part of education there.

None of these limitations are insurmountable. If there are sufficient resources, new interventions can be developed; an intelligent tutoring system could conceivably be designed to recognize when an intervention is failing for a specific student; systems can be re-checked and re-designed over time (and this already happens); and systems can be tested to see if students respond as expected to all interventions.

But on the whole, this presents some possible reasons why human-driven changes are playing a larger role than automated intervention. Humans are flexible and intelligent. Humans cannot sift through large amounts of information quickly, which is why they need data mining and reporting to inform them. But once informed, a human can respond effectively.

Going Forward

In this article, I have discussed how the original vision for intelligent tutoring systems – powerful, flexible systems that adapt in a range of ways to the learner – does not seem to entirely match the intelligent tutoring systems we see at scale. We are not seeing as much artificial intelligence as we expected, at least not in terms of how these systems interact with students. Instead, we seem to see much less rich tutoring systems that nonetheless leverage a great deal of a different type of intelligence – human intelligence. We are developing what one could flippantly call stupid tutoring systems: tutors that are not, in and of themselves, behaving in an intelligent fashion. But tutors that are designed intelligently, and that leverage human intelligence.

Modern online learning systems used at scale are leveraging human intelligence to improve their design, and they are bringing human beings into the decision-making loop and trying to inform them (and the information they provide is in many cases distilled using sophisticated algorithms).

If we were to adopt this as an alternate paradigm for artificial intelligence in education (AIED) – artificial intelligence as intelligence amplification (Freund 2013) – how would the field change? What would be some of the new problems?

First of all, we’d need to face the challenge that human beings vary in quality. We know that different teachers have different impacts on student success (Darling-Hammond 2000); we know that there is a range in the results produced by different human tutors (Kulik and Kulik 1991); we know that some designers and programmers are better than others (Brooks 1975). So not all human responses to reporting are going to be equally effective.

And no one, no matter how sharp, gets it right all the time.

So we need to design processes that help human beings figure out what works, and processes to scale what works, and processes to figure out why it works.

We know a decent amount about how to do this for designing learning systems. There are educational data mining methods, like learning decomposition, explicitly designed to help us figure out which strategies work (Beck and Mostow 2008), and a growing body of literature on A/B testing and automated experimentation in education (Mostow 2008; Ostrow and Heffernan 2014). Beyond this, there’s an active if controversial body of research on how to determine which teachers are effective (cf. McCaffrey et al. 2003). Platforms like the Pittsburgh Science of Learning Center LearnLabs (Koedinger et al. 2012a, b) and, more recently, the efforts to make ASSISTments an open platform for research (Ostrow and Heffernan 2014) are positive trends in this direction. And there’s a long tradition of distilling educational research into guidelines for practice and design (Bransford et al. 1999; Koedinger et al. 2012a; Pashler et al. 2007; Clark and Mayer 2003). These guidelines can support scalability, so that solutions we develop on one platform can influence future platforms.
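
As an example of the first of these, a learning decomposition analysis of the kind Beck and Mostow describe can be sketched as fitting an exponential learning curve in which different types of practice are allowed to count differently; the estimated weight tells us how much one type of practice is worth relative to another. The data below are synthetic and the functional form is simplified for illustration.

```python
# Sketch of learning decomposition (in the spirit of Beck and Mostow 2008):
# fit an exponential learning curve where two kinds of practice are allowed
# to count differently. The data are synthetic; beta is the estimated value
# of one opportunity of practice type 2 relative to practice type 1.
import numpy as np
from scipy.optimize import curve_fit

def learning_curve(practice, A, b, beta):
    """Predicted error rate after t1 practices of type 1 and t2 of type 2."""
    t1, t2 = practice
    return A * np.exp(-b * (t1 + beta * t2))

# Synthetic observations: counts of each practice type and observed error rates.
t1 = np.array([0, 1, 2, 3, 4, 0, 0, 1, 2, 3])
t2 = np.array([0, 0, 0, 0, 0, 1, 2, 2, 1, 2])
errors = np.array([0.52, 0.40, 0.31, 0.24, 0.18, 0.44, 0.38, 0.29, 0.26, 0.17])

(A, b, beta), _ = curve_fit(learning_curve, (t1, t2), errors, p0=[0.5, 0.3, 1.0])
print(f"A={A:.2f}, b={b:.2f}, beta={beta:.2f}")
# beta > 1 would suggest practice type 2 is worth more per opportunity than type 1.
```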

There has been less work on doing this for human action on reports. Systems like Course Signals attempt to scaffold effective practice by suggesting actions to instructors (Arnold and Pistilli 2012). But these systems do not allow bottom-up improvement, just improvement by designers. Other online platforms use teacher professional development to share and disseminate effective practices – for example, ASSISTments, Reasoning Mind, ALEKS, and Cognitive Tutor all provide extensive professional development for teachers. Beyond this, there are now online communities and discussion forums where teachers share strategies (Maull et al. 2011). But these approaches bring the recommended practices outside the system, and as such are not particularly timely. A valuable area of future research may be to use crowd-sourcing to solicit strategies from instructors, and data mining to test their effectiveness. Human-driven intervention strategies found to be effective could then be automatically suggested to instructors, much as Course Signals does. This would effectively create a recommender system for instructors, helping less effective instructors catch up to their more effective peers.
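
A toy sketch of the data-mining half of that proposal appears below: aggregate the outcomes of crowd-sourced instructor strategies and surface the best-supported one. The schema, the strategies, and the outcome measure are entirely hypothetical.

```python
# Toy sketch of the proposed pipeline: aggregate outcomes of crowd-sourced
# instructor strategies and recommend the most effective one. The schema and
# data are entirely hypothetical.
from collections import defaultdict

# (strategy suggested by an instructor, did the student's risk drop afterwards?)
strategy_log = [
    ("email check-in", True), ("email check-in", True), ("email check-in", False),
    ("extra practice set", True), ("extra practice set", False),
    ("phone call home", True), ("phone call home", True), ("phone call home", True),
]

def recommend_strategy(log, min_trials=3):
    """Return the strategy with the best success rate, given enough trials."""
    outcomes = defaultdict(list)
    for strategy, improved in log:
        outcomes[strategy].append(improved)
    scored = {s: sum(o) / len(o) for s, o in outcomes.items() if len(o) >= min_trials}
    return max(scored, key=scored.get) if scored else None

print(recommend_strategy(strategy_log))   # surfaced to less experienced instructors
```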

A second opportunity for research along these lines is how to improve our models to take account of how they are used. We already know that indicators used as the basis for intervention lose their effectiveness as predictors, a well-known phenomenon in economics (Campbell’s Law; Campbell 1976). For example, if Bayesian Knowledge Tracing (BKT) is used in a learning system without mastery learning, it can predict post-test scores (Corbett and Anderson 1995). But if BKT is used to drive mastery learning, it ceases to be able to predict post-test scores (Corbett and Bhatnagar 1997). This is a plausible concern for the uses of these models discussed above. Imagine a teacher getting predictive analytics on student failure after every assignment. If the instructor took action after the third assignment, but the system did not take this into account, the system’s prediction after the fourth assignment might be overly pessimistic. As such, we need to investigate how to create second-order models that provide useful analytics information after intervention has already begun.
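
One simple way to sketch such a second-order model is to give the model an explicit feature recording whether intervention has already occurred, so that its risk estimates remain meaningful after the instructor acts. The features and data below are hypothetical.

```python
# Sketch of a "second-order" predictive model that knows whether an instructor
# has already intervened, so its risk estimates stay meaningful afterwards.
# Features and data are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 300

missed_assignments = rng.poisson(2, n)
intervened = rng.random(n) < 0.3            # instructor already reached out

# Synthetic outcome: intervention reduces the risk that missed work predicts.
logit = -1.0 + 0.9 * missed_assignments - 1.2 * intervened
failed = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([missed_assignments, intervened])
model = LogisticRegression().fit(X, failed)

# Same student, before vs. after intervention: the risk estimate adjusts.
at_risk = np.array([[4, 0], [4, 1]])
print(model.predict_proba(at_risk)[:, 1])
```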

Relatedly, we may also find that we can identify cases where our models are not working, based on instructor behavior. If an instructor chooses not to intervene in some cases, this may suggest that the instructor is recognizing an outlier or special case that our model cannot recognize. It may be possible to re-train and enhance our models based on this information. Even if a model is optimally accurate for its original training sample, it may become less accurate as the system it is embedded in changes, as the academic culture around it changes, or as the populations it is used with shift. Human action will be most effective if the data and models provided to those humans are of the highest possible quality for the context of use. Building models that are robust to instructor behavior and that change as their context of application changes will become an essential challenge.

To sum up, the ultimate goal of the field of Artificial Intelligence in Education is not to promote artificial intelligence, but to promote education. The leading systems in AIED (at least in terms of degree of usage) seem to represent a different paradigm than the classic paradigm of intelligent tutoring systems. Reframing our research questions and perspectives in light of this evidence may help us to better understand what we as a community are doing, and how we can be even more successful in doing it.

In the end, our goal is not to create intelligent tutoring systems or stupid tutoring systems, but to create intelligent and successful students.