Saturday, August 19, 2023

Learning and Teaching-Cognitive Load Theory Optimizing Intrinsic Load Part 4

Part 1 of this series on Cognitive Load Theory (https://polymathtobe.blogspot.com/2023/02/learning-and-teaching-cognitive-load.html) laid out, in principle, the framework of WHAT Cognitive Load Theory is, following Oliver Lovell’s book Cognitive Load Theory in Action (Lovell 2020).

Part 2 is on how teachers can minimize extrinsic load on the learner through honing their presentation. (https://polymathtobe.blogspot.com/2023/04/learning-and-teaching-cognitive-load.html)

Part 3 is on how teachers can minimize the extrinsic load on the learner through structuring their practices and lessons. (https://polymathtobe.blogspot.com/2023/05/learning-and-teaching-cognitive-load.html)

This article roughly follows Oliver Lovell’s book in examining how the teacher, coach, and learner can apply Cognitive Load Theory to minimize the extrinsic load on the working memory and then optimize the intrinsic load in the working memory.

The definitions of intrinsic and extrinsic loads are repeated here for ease of reference.

The extrinsic cognitive loads are:

·       A part of the manner and structure of how the information is conveyed to the learners.

·       Disruptive to the learning task because they distract the learner from learning by occupying valuable working memory space.

Whereas the intrinsic cognitive loads are those that are critical to learning whatever it is that we need to learn. They are:

·       Part of the nature of the information that we are learning.

·       Core learning.

·       Information that we WANT the learner to have in their working memory.

The critical limitation is that the working memory has a finite capacity; that is, the intrinsic and the extrinsic loads are vying for the same finite resource. The emphasis is therefore placed first on minimizing the extrinsic load, offloading unnecessary extrinsic cognitive load to make space in the working memory, before optimizing the intrinsic load.

Note that even though Lovell’s book is relatively short, he presents a good deal of results and information from studies that make definitive arguments, and he gives excellent implementation examples, so it is worthwhile to read through the book.

Since I come from two familiar but different points of view, teaching at the university level and coaching, I will try to illustrate the points by giving simple examples from both milieus.

The defining difference between teaching and coaching is that teaching can be effected over a longer time frame. Learners in the academic milieu can take their time in building up the learning scaffolds because the academic learner does not need to implement the material immediately. There is time to spend on repeatedly going over the material, digging into the granularity as well as examining the broad scope of the topics. In sports coaching, the learner is expected to learn to act and react instantaneously to new situations, which are never identical to the contrived practice situations. This increases the extrinsic load on their working memory as they struggle to learn.

Optimizing Intrinsic Load

The language used to describe the necessary actions for intrinsic loads and extrinsic loads is different. Intrinsic loads need to be optimized, which means that the intrinsic load can be increased, decreased, or left unchanged.

Optimizing the intrinsic load is the key to successful learning. The difference in wording for intrinsic and extrinsic loading is intentional and comes from the effects that intrinsic and extrinsic loads have on the working memory. The extrinsic load must be minimized because it is extraneous and impedes learning. The goal is to devote the maximum of the available working memory to intrinsic loading, since it is the loading that facilitates learning. Even when the extrinsic load is minimized, the amount of intrinsic load placed on the learner could still overwhelm the learner’s working memory.

Intrinsic loads need to be optimized, i.e., adjusted to accommodate the learner’s learning capacity. If the learner can handle more intrinsic load, then the teacher must fill the void in the learner’s working memory. If the learner is overloaded with intrinsic load, then the teacher must simplify and remove some of the intrinsic load to enable the learner to learn effectively. The next question is: how do you know when to add and when to simplify? As with all things human, it depends on the person. Each learner is different, and their learning experience needs to be accommodated. Slightly overloading the learner’s working memory may be beneficial because stressing the learner can sometimes accelerate their adaptation to the intrinsic load and allow them to automatically chunk the topics that are loading them down. At the same time, overstressing the working memory might diminish their ability to learn. There are many factors to this: whether the learner is a novice or an expert; the amount of related knowledge resident in the learner’s long-term memory that can be leveraged to associate with the new knowledge; the amount of time and exposure to the knowledge that the learners are allowed; and the complexity and difficulty of the topic. These factors make teaching a large number of learners challenging, but not impossible.

The overall aim of teaching is not just to teach the fundamentals of the topic, the tools of the craft; it is to teach how to use the tools effectively and efficiently. That includes not just the experiential aspects of learning (the hows, whys, and what-ifs) but also the ability to extrapolate, using knowledge and reasoning abilities to adapt, improvise, and overcome.

As stated before, there are two key reasons for optimizing the intrinsic load: to adjust the level of intrinsic load placed on the working memory, which eases the learning process for the learner without exceeding the finite working memory; and to use the learner’s working memory to its fullest capability without exceeding its capacity.

The former requires that the teacher adapt the material to the different learning abilities of the learner. The latter requires that the teacher structure the learning so that the learner can fully leverage their finite working memory resource, preventing the learner’s attention from wandering. Not too much, not too little, but just enough; optimize rather than just maximize.

Expertise-Reversal Effect

The expertise-reversal effect compounds the complexity of the strategies for optimizing intrinsic load.

The expertise-reversal effect suggests that learners need differing amounts of support depending upon their level of expertise.

This is explained by Guadagnoli and Lee in their paper (Guadagnoli 2004). In short, learners who are learning new skills and concepts without any previous experience are more likely to be lost and confused while learning the basics because their long-term memory does not yet hold tangible memories that can be retrieved to aid learning and leveraged to create new knowledge.

The expertise-reversal effect has a significant impact on how teachers need to approach and implement their optimization strategy. That is, they need to carefully consider the level of the learners they are trying to reach and adjust their strategy accordingly.

Teaching

In the context of teaching engineering, the expertise-reversal effect implies that “…worked examples are better for novices, and problem solving is better for experts…”; i.e., novices need to be led through the example in discrete steps and shown the reasoning for each step, while experts can be given the problem to solve directly.

It can mean that the teacher must be judicious in choosing when to switch between working through examples and assigning problems. It is better to nudge novice learners a bit more aggressively toward solving problems than to bore them with too many of the same examples; the reverse effect is that bored learners will tune out the teacher.

Coaching

In the coaching context, the expertise-reversal effect means that the coach needs to build up the novice learners’ long-term memory with tangible experiences. Initial experiences are critical to the novice learner because they are the scaffolding for future learning. Not only must the initial experiences be practical for the moment, but they must also be useful in the future.

This is asking for quite a bit. The expertise-reversal effect will be invoked later in the article to illustrate the impact it has on the teaching strategies chosen by coaches.

A note on the term “tangible experiences”. There has been discussion about the topic of specificity. Some literalists insist that unless the experience is completely in the same domain and context, that experience is not pertinent. In the neurological explanation from Scott Grafton’s Physical Intelligence (Grafton 2020), a field in which I am an amateur, the neurons don’t know that they are firing for something completely different from the purpose for which they initially acquired the “muscle synergy” and the “basis set”. The neurons fire because they have that experience to reuse in their “efference model”.

In other words, the neurons will fire for the arm motion of throwing a ball, even if the movement required is not throwing the ball. The bottom line is that the accrual of all motor skills could potentially be beneficial sometime, if not specifically for a domain or context.

Pre-teaching

“Pre-teaching is delivering a portion of the content before the main lesson, and reinforcing it through revision over time, can reduce the intrinsic load experienced by the learner when they attempt the final, complete task.” (Lovell 2020)

Pre-teaching focuses the learner’s attention on the pre-taught material and places it in the working memory of the learner prior to the formal instruction. It gives a basis for the learner to use for later learning.

Teaching

·       Teachers can foreshadow by introducing the coming topics before diving into the details.

·       Teachers can also expose the learners to the big picture of the elements of the topics prior to diving into the subtopics, giving them a framework in which to place each topic.

Coaching

·       Coaches can introduce advanced skills in practice, partly as incentives, and partly to expose the learners to the future of their own athletic abilities.

Part-Whole or Whole-Part

Part-whole means building constituent skills and knowledge before putting it all together. Whole-part requires providing a general overview first, followed by more focused practice of individual segments. (Lovell 2020)

On the surface, the difference between part-whole and whole-part may seem like rhetoric, but the decision has a significant impact on how well the learner can use their working memory to optimize the intrinsic load, i.e., how well the learner learns.

Part-whole partitions a topic into constituent parts, teaches the constituent parts in some imposed sequence, and reintegrates the constituent parts into the whole. Whole-part teaches the whole topic first and selectively isolates the parts by partitioning and emphasizing them as the whole topic is taught.

There are two questions to be asked before making the decision: one is about the experience and maturity of the learners we are trying to reach; the other is about the complexity of the topic that is being taught. The expertise-reversal effect tells us that novices do not have the fundamental tools or experiences to effectively integrate the whole topic into their intrinsic load, while experts have already integrated a significant portion of the topic into their long-term memory. It seems obvious that the part-whole approach is best for the novice and the whole-part approach is best for the expert, because the expert can leverage their previous tools and experiences to create new knowledge. I would say that the whole-part approach becomes more beneficial to the learner as they progress from zero experience to expertise. The usual mistake is to maintain the part-whole paradigm for too long.

Yet another qualifying criterion that needs to be considered is whether the proposed partition of the topic is cognitively logical in isolation; that is, whether the parts can be presented logically without taking into consideration the cross-coupling effects that may be inherent in the topic. If so, then taking a part-whole approach is better for the novice because doing so would decrease the intrinsic loading of the learner to a manageable level. If the individual partitioned parts do not make sense in isolation, then the whole-part approach needs to be implemented. One major caveat is that the act of partitioning a topic needs to be done carefully because the couplings may not be obvious. It is the not-knowing-what-one-does-not-know effect.

Part-Whole

The part-whole approach is to start simple and build complexity as the learner adjusts to the increased load. The expense is that the holistic view is ignored until later in the learning process.

The arguments for the part-whole approach are:

·       “The initial presentation of the part tasks helps consolidate procedures or rules, which can be applied to the whole task at a later stage.” (Lovell 2020)

·       If the topic being taught is complex, the complexity can overload the learner’s working memory, so that integrating the total task is detrimental to learning.

The thrust of the part-whole presentation approach is that the lesson must meet the level of the learner instead of the teacher. Presenting the whole skill at the beginning of the learning process may overwhelm the novice’s working memory.

Lovell lists a few techniques that can be deployed to implement the part-whole approach.

·       Chain forward: Forward chaining presents the partitioned parts of the skill in the chronological order in which they appear in the total skill.

·       Chain backward: Backward chaining is just the reverse. It presents the partitioned parts of the skill in backward chronological order, starting from the ending.

·       Snowball: Snowballing is a variation on forward and backward chaining. The idea is to perform each newly introduced part of the skill in conjunction with the previously presented parts. As each new part is added to the repertoire, all the previously taught parts are integrated into the practice. This helps the learner to chunk the partitioned parts of the skill together. Chunking is explained later in this article.
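The difference between the three sequencing techniques can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not something taken from Lovell’s book: the function names and the example skill parts are hypothetical, and the snowball shown here builds in forward order only.

```python
from typing import List


def forward_chain(parts: List[str]) -> List[List[str]]:
    """One practice step per part, in chronological order."""
    return [[part] for part in parts]


def backward_chain(parts: List[str]) -> List[List[str]]:
    """One practice step per part, starting from the ending."""
    return [[part] for part in reversed(parts)]


def snowball(parts: List[str]) -> List[List[str]]:
    """Each step practices the newest part together with all earlier parts."""
    return [parts[:count] for count in range(1, len(parts) + 1)]


if __name__ == "__main__":
    # Hypothetical partition of a skill into four chronological parts.
    skill = ["approach", "jump", "arm swing", "contact"]
    for name, plan in [
        ("forward", forward_chain(skill)),
        ("backward", backward_chain(skill)),
        ("snowball", snowball(skill)),
    ]:
        print(name)
        for step_number, step in enumerate(plan, start=1):
            print(f"  step {step_number}: {' + '.join(step)}")
```

Note how the snowball plan grows with every step: the final step is the whole skill, which is what gives the learner repeated chances to link the parts together.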

One note: practicing the partitioned parts of a skill may not resemble performing the whole skill in the least. The teacher’s feedback needs to reflect that fact, so as not to compare apples to oranges. The ultimate goal is to successfully execute the whole skill, not the partitioned parts of the skill.

Whole-Part

The whole-part approach is to present the whole topic first and then simplify the complex whole skill as the learner’s progress demands, to help the learner absorb the topic.

The arguments for the whole-part approach are:

·       “For complex motor tasks and many professional real-life tasks, it is essential that the learner understand and learn the relevant interactions and coordination between the various subtasks. By learning the subtasks in isolation, these interactions may be missed.” (Lovell 2020)

·       Generally, presenting the whole skill first seems to make more sense because we want the learner to be exposed to the holistic view; this way the linkage and coupling of the parts can be demonstrated and illustrated. This helps the learner to picture the whole topic and can allow them to anticipate.

Lovell also lists a few techniques that can be deployed to implement the whole-part approach.

·       Simplifying conditions: Simplifying the conditions means that while the whole skill is being practiced and learned, the conditions under which the practices are conducted are simplified to ease the intrinsic loading on the learner. Rather than focusing on the entire skill under real conditions, take away the real constraints and open up the degrees of freedom under which the skills are being practiced.

·       Manipulating the emphasis: Minimize the total number of points of emphasis in the skill by having the learner focus only on the parts of the whole skill that are giving them problems. This serves to unload some of the working memory so that the learner is not overwhelmed.

·       Introducing variations: Unlike the previous techniques, this technique does not minimize the intrinsic loading. Indeed, adding variations to a practice increases the intrinsic load on the learners. So why is this an effective technique? Introducing variations stresses the learner’s working memory by exposing them to tasks that are not in their repertory; it forces them to struggle with the complexity of performing the skill with increased variation. The learner is forced to develop new neuronal pathways, which create new working memories based on existing experience. These newly created neuronal pathways will be integrated into new long-term memory. One note of caution: the teacher needs to be perspicacious about how the learner reacts to the variation, because it could easily overload the learner’s working memory. There is a fuzzy limit between overloading the working memory with enough intrinsic load to force the learner to adapt to the challenge and, as an unintended consequence, completely overwhelming the learner. I am of the belief that most learners are more resilient than the teacher believes. The best option is to just experiment.

Teaching

·       Teaching is usually conducted as a part-whole exercise, with the teacher walking through examples step by step.

·       The students will often be ahead of the examples and are able to anticipate the next step.

·       Backward chaining and snowballing would make interesting exercises for helping the learner learn how to anticipate and connect the parts.

·       In teaching qualitative topics, the whole-part approach is taken by introducing the topic from a macro point of view, giving the student the linkages that connect the different topics and fields within a broad topic. This way the learner can use the relationships taught holistically to anticipate and extrapolate into future topics.

·       A particular problem is that teachers tend to keep the learners in the part-whole realm for too long, repeatedly giving the learners the same examples and problems to solve rather than giving them opportunities to extemporize on the partitioned parts of the skills that they have learned. Problem solving should be about the tools and the ability to make connections between each step.

Coaching

·       For absolute novices, a part-whole approach is best, so as not to overwhelm them cognitively or overstress them emotionally. The frustration from being unable to grasp the entirety of a skill is a showstopper for many learners.

·       Coaches also tend to keep the learners in the part-whole realm for too long, having the players drill the same part of the overall skill to perfection before allowing them to extemporize and learn how to solve problems on the court or pitch. Problem solving should be about the tools and the ability to make connections between each step. This problem is especially acute when playing a sport because the time available for problem solving and decision making is minuscule.

·       Start with part-whole for novices and proceed to whole-part as soon as possible. There is no room in sports training for perfection.

General Practices

Some general practices that are often recommended are listed here to demonstrate how the Cognitive Load Theory is applied. These practices are particularly useful for optimizing the intrinsic loading.

Chunking

According to Lemov’s The Coach’s Guide to Teaching (Lemov 2020), experts process more information than novices because they process information in chunks. This is a key goal of teaching: to induce the learner to chunk their information, both knowledge and experiences, together so that the working memory is less laden when the chunks are recalled, because each chunk is multiple pieces of information bound together. According to Lemov, the practice of chunking is domain and context specific; that is, the chunks will most likely be useless when taken to a different domain and placed in a different context. This, by the way, contradicts the adage “the game teaches the game.” The game teaches the game if and only if the learner understands the domain and context of their knowledge and experiences; that understanding gives meaning to the chunks of knowledge.

The techniques that are mentioned in both the part-whole and whole-part sections will all expose the learner to the logical chain that underlies the main skill. Chaining, snowballing, simplification, manipulating the emphasis, and introducing variations all link the partitioned parts of the skills into the whole skill. Snowballing is particularly effective in creating the circumstances under which the learner can link and associate the different parts of the skills into the whole skills.

The limiting factor for chunking is the number of pieces of knowledge the working memory can handle at once. Once again, the expertise-reversal effect affects the total number of tasks a learner can handle without overloading the working memory. The traditional rule of thumb is that the average human working memory can focus on seven salient tasks, although I have read in various literature that experts, especially while under duress, can only focus on three to four salient tasks at once. A novice may only be able to focus on one or two tasks at once. Indeed, it is more pragmatic to decide empirically, for each individual, how many tasks a person can focus on.

Retrieval, Spacing, and Interleaving

Retrieval, spacing, and interleaving practices go hand in hand. I first read about these practices in Brown, Roediger, and McDaniel’s Make It Stick (Brown 2014), and they were reinforced in Lemov (Lemov 2020).

The idea is to plan and schedule the practice to give the learners a chance to actively retrieve the memory of the skill as often as possible from the long-term memory. Constant retrieval of the memories from long-term memory strengthens the memory and migrates the memory closer to the top of the stack in the long-term memory, helping to make the memory permanent.

Spacing works in combination with retrieval practice. Spacing segments of the same practice in time allows the memories of the practice subject to fade from the working memory; when the same practice is re-initiated later, whether it is within the same dedicated time segment or not, the memories are retrieved.

Interleaving accomplishes both retrieval and spacing by practicing in cycles rather than in one continuous sequence. Rather than doing one drill or studying one subject for a given amount of time or until a final goal is accomplished, interleave the drills or study periods by doing them in cycles. Examples are shown below.

Unfortunately, there are no general rules of thumb regarding the number of retrievals needed to make the knowledge permanent in long-term memory. Nor is there a recommended spacing interval that is optimal to guarantee forgetting and retrieval. I asked Prof. Brown about this in an email, and he told me there had not been any studies in that regard; it depends on the specific group of learners, with each learner having a distinct timing, and on the complexity of the topic.

I have tried to make retrieval practices as prevalent and numerous as possible in both teaching and coaching.

Teaching

·       Instituting short-term assessments such as quizzes at regular intervals to motivate the learners to retrieve prior knowledge. Each quiz is comprehensive, not just focusing on the topic of the week.

·       Question and answer periods within the recitation where the teacher cold-calls students on previously learned topics, giving the learners opportunities to actively retrieve previously learned knowledge.

·       All assessments are comprehensive.

·       Alluding to previously covered topics and integrating them into the new topics, creating connections and context for the old and the new topics.

·       Encouraging the learners to modify their study habits by committing to 20–25-minute blocks that are devoted to one topic, then taking a 5-minute break before moving to another topic. But the learner must return to the topic at least once before the studying session is over, to actively retrieve the knowledge.

o   Study subject 1 for 20-25 minutes. Take a 5-minute break.

o   Study subject 2 for 20-25 minutes. Take a 5-minute break.

o   Study subject 3 for 20-25 minutes. Take a 5-minute break.

o   Return to subject 1.

o   Return to subject 2.

o   Return to subject 3.

o   Repeat as convenient.
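For readers who like to see a schedule laid out explicitly, the study routine above can be generated with a short Python sketch. The function name and parameters are hypothetical, and two details are my assumptions rather than part of the routine as listed: the return visits use the same block length as the first pass, and a break is placed between every pair of blocks.

```python
from typing import List, Tuple


def study_schedule(
    subjects: List[str],
    block_minutes: int = 25,
    break_minutes: int = 5,
    cycles: int = 2,
) -> List[Tuple[str, int]]:
    """Build an interleaved study session as (activity, minutes) steps."""
    steps: List[Tuple[str, int]] = []
    for cycle in range(cycles):
        # First pass studies each subject; later passes are retrieval visits.
        label = "Study" if cycle == 0 else "Return to"
        for subject in subjects:
            steps.append((f"{label} {subject}", block_minutes))
            steps.append(("Break", break_minutes))
    if steps:
        steps.pop()  # no break is needed after the final block
    return steps


if __name__ == "__main__":
    session = study_schedule(["subject 1", "subject 2", "subject 3"])
    for activity, minutes in session:
        print(f"{activity}: {minutes} min")
    print("Total:", sum(minutes for _, minutes in session), "min")
```

Raising `cycles` corresponds to the "repeat as convenient" step; shortening `block_minutes` for the return visits would be a reasonable variation if time is limited.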

Coaching

·       Retrieval, spacing, and interleaving can be combined in effective practice planning, rather than planning numerous drills that each last until a performance or timed target is achieved.

o   Traditionally:

§  Drill 1, with a time and/or performance goal.

§  Drill 2, with a time and/or performance goal.

§  Drill 3, with a time and/or performance goal.

§  Drill 4, with a time and/or performance goal.

o   Using interleaving:

§  Drill 1, with a time and/or performance goal that is adjusted to reduce the total time spent from the traditional way.

§  Drill 2, with a time and/or performance goal that is adjusted to reduce the total time spent from the traditional way.

§  Drill 3, with a time and/or performance goal that is adjusted to reduce the total time spent from the traditional way.

§  Drill 4, with a time and/or performance goal that is adjusted to reduce the total time spent from the traditional way.

§  Drill 1 again, with a time and/or performance goal that is adjusted and based on the time and performance goal from the first time through.

§  Drill 2 again, with a time and/or performance goal that is adjusted and based on the time and performance goal from the first time through.

§  Drill 3 again, with a time and/or performance goal that is adjusted and based on the time and performance goal from the first time through.

§  Drill 4 again, with a time and/or performance goal that is adjusted and based on the time and performance goal from the first time through.

§  Drill 1 again, with a time and/or performance goal that is adjusted and based on the time and performance goal from the second time through.

§  And so on.

§  Note that the drill segments do not need to be in sequence; changing the order randomly implements the introducing-variations technique from the whole-part paradigm. The key is to not put the same drills back-to-back, which defeats the purpose.

§  By judiciously changing the goals, whether time goals or performance goals, the intrinsic load is varied with each repetition of the drill, creating an escalating desirable difficulty in the intrinsic loading. Again, attention needs to be paid to the learner’s response to decide whether the variation is overloading their working memory.

§  Interleaving, if properly practiced, can be more time effective than devoting a large block of time to each drill. Frequent changing of emphasis and physical requirements ensures that the learners are constantly being challenged and refreshed rather than being stuck in an interminable rut.

§  Depending on the performance goal selected, the learners may achieve the final desired goal more quickly through the escalating intermediate goals than by trying to achieve the desired goal at once.

§  Another element is to introduce a scrimmage or play segment in between the drills, allowing the learners to quickly incorporate the skills from the drills after the first cycle of the drill or through the sequence. This gives the teacher a chance to give them feedback on what they are missing and remind them of the purpose of the drills, all in time to go through the second cycle of the drill or drills. Repeat the scrimmage or play segment as desired.
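The interleaved practice plan described above, including the rule that the same drill should never appear back-to-back, can be sketched as follows. This is a minimal illustration with hypothetical names; it assumes at least two distinct drill names and leaves out the goal adjustments and scrimmage segments, which depend on the coach’s judgment.

```python
import random
from typing import List, Optional, Tuple


def interleaved_plan(
    drills: List[str],
    cycles: int,
    shuffle: bool = False,
    seed: Optional[int] = None,
) -> List[Tuple[str, int]]:
    """Return (drill, cycle_number) segments, cycling through all drills.

    Assumes the drill names are distinct. With shuffle=True the order is
    randomized within each cycle, but the first drill of a cycle is never
    the drill that ended the previous cycle.
    """
    rng = random.Random(seed)
    plan: List[Tuple[str, int]] = []
    for cycle in range(1, cycles + 1):
        order = list(drills)
        if shuffle:
            rng.shuffle(order)
            # Avoid back-to-back repeats across the cycle boundary.
            if plan and len(order) > 1 and order[0] == plan[-1][0]:
                order[0], order[1] = order[1], order[0]
        plan.extend((drill, cycle) for drill in order)
    return plan


if __name__ == "__main__":
    for drill, cycle in interleaved_plan(
        ["Drill 1", "Drill 2", "Drill 3", "Drill 4"], cycles=3, shuffle=True, seed=1
    ):
        print(f"{drill} (time through: {cycle})")
```

The cycle number attached to each segment is where the coach would hang the adjusted time or performance goal for that repetition.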

Block and Random

Much like the part-whole versus whole-part discussion, the block versus random debate has been long-running and controversial, especially in the sports context. While the vast majority of coaches agree that random practices more closely resemble reality, whether in the classroom or on the sporting fields or courts, sometimes block training is necessary. Those instances are when the expertise-reversal effect comes into play.

Going back to the arguments from before, block practices are necessary for those novices who need to integrate the basics through scaffolded repetitions. Random repetitions, while much more beneficial for anyone who is not a novice, impose too much extrinsic load on the novice and will retard the initial learning. The problem is that most teachers will resort to block training intuitively, partly because they were taught this way and partly because they want the learners to execute the skills immaculately prior to exposing them to reality. Learners are by and large more resilient and adaptive than teachers assume, which means that the optimal point of transition to random training comes much sooner than teachers estimate.

Summary

This article lays out several topics that I found personally convincing and that incorporate the ideas inherent in the Cognitive Load Theory. My extemporizing has digressed somewhat, but it encapsulates much of what I believe to be the foundation of my philosophy of teaching and coaching. It took a long way to get there.

Learning is not a game of perfect. It is a process that requires constant and consistent elevation of challenges to the learner, creating a process of desirable difficulties that stimulates their ability to integrate existing knowledge and experiences and challenges their ability to create new personal solutions through problem solving and decision making.

I am convinced that the Cognitive Load Theory is the best model we have for understanding how the human learning process works. I have tried to examine the topic synoptically through reading several references. Of course, it is impossible to include every study and tome on the topic in my reading or to encapsulate all the ideas in my mind. My working memory would be overloaded. 😊

I welcome any discussion with experts in the areas of learning, skill acquisition, neuroscience, and cognitive science because the topic fascinates this dilettante.

References

Brown, Peter C., Henry L. Roediger III, and Mark A. McDaniel. Make It Stick: The Science of Successful Learning. Cambridge, MA: Belknap Press, 2014.

Grafton, Scott. Physical Intelligence. New York: Pantheon Books, 2020.

Guadagnoli, Mark and Timothy D. Lee. "Challenge Point: A Framework for Conceptualizing the Effects of Various Practice Conditions in Motor Learning." Journal of Motor Behavior, June 2004: 212-224.

Lemov, Doug. The Coach’s Guide to Teaching. Clearwater, FL: John Catt Educational Ltd., 2020.

Lovell, Oliver. Sweller's Cognitive Load Theory in Action. Melton: John Catt Educational Ltd, 2020.

 

 

 

 
