Last week was a significant one for UK academics and those interested in accessing scholarship; the funding councils announced a new policy mandating open access for the post-2014 research evaluation exercises. In the same week, Cetis added its name to the list of members of the Open Policy Network, (strap-line, “ensuring open access to publicly funded resources”). Thinking back only 5 years, the change in policy is not something I could imagine would have happened by now and I think it is a credit to the people who have pushed this through in the face of resistance from vested interests, and to the people in Jisc who have played a part in making this possible.
For me, it is now time to stop providing reviews for non-open-access journals.
This is not the end of the story, however… open access to only papers falls short of what I think we need for many disciplines, and certainly where we operate at the intersection of education and technology. Yes, I want access to data and source code. This is still too radical for many institutions today, for sure, but it will happen and, based on the speed with which Open Access has moved from being a hippy rant to funding council policy, I think we’ll have it sooner than many expect. Now is the time to make the change, and I was very pleased to hear the idea being posited for a pilot by people involved with the Journal of Learning Analytics, which is already OA. (NB: this is not agreed policy of JLA). On the other hand, the proceedings of the Learning Analytics and Knowledge conference are not yet OA. Is the subject matter only of interest to people working in organisations that subscribe to the ACM digital library? No… but it will doubtless take a few years for things to change. Come on SoLAR, anticipate the change!
A little bit of text mining on a fairly large number of blogs with an educational technology (or technology enhanced learning) theme makes a neat set of snapshots on “open …”.
Considering the words following “open” from January 2009 to the end of October 2012 shows the following distribution (words with a relative frequency of <2% are ignored, as are low-value words like “and”). Hence it shows the share of the dominant themes.
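The published analysis was done in R (see the GitHub link below), but the idea is simple enough to sketch. Here is a hypothetical Python illustration, assuming a plain list of post texts: count the words that immediately follow “open”, drop low-value stop words, and keep only those above the 2% relative-frequency threshold. The stop word list is a placeholder, not the one actually used.

```python
from collections import Counter
import re

# Placeholder stop word list; the real analysis used a fuller set.
STOP_WORDS = {"and", "the", "to", "a", "of", "up", "for", "in", "is", "it"}

def open_follower_shares(posts, min_share=0.02):
    """Count the words immediately following 'open' across a set of
    posts, drop stop words, and keep only those with a relative
    frequency of at least `min_share` (2% by default)."""
    counts = Counter()
    for text in posts:
        tokens = re.findall(r"[a-z]+", text.lower())
        for prev, word in zip(tokens, tokens[1:]):
            if prev == "open" and word not in STOP_WORDS:
                counts[word] += 1
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items() if n / total >= min_share}
```

Calling `open_follower_shares` on the harvested posts gives the share of each “open …” theme directly, ready for plotting.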
The share for “online+course” is largely attributable to MOOCs and similar, although some of it is likely to be the use of “open online” referring to something else. This probably confirms the guesswork of followers of Ed Tech fashion but it may be a bit more of a surprise to see that open educational/content has taken such a tumble. I wonder whether some of the “open education” share has been diverted into “open online/course”. I’m also pleased to see “open standards” gaining more of a foothold but am left with a feeling that “open data” got a bit over-hyped in 2011.
About the data: 28116 blog posts were harvested and these contained 13723 uses of “open”. The blog post harvesting was done by the Mediabase and the analysis was done by the author, both as part of the EC funded TELMap project.
During the CETIS Conference today (Feb 22nd), I showed a few graphs, plots and other visualisations that show the results of text mining around 7500 blog posts, mostly from 2011 and into early 2012. These were crawled by the RWTH Aachen University “Mediabase”.
There are far too many to show here and each of three analyses has its separate auto-generated output, which is linked to below. Each of these outlines key aspects of the method and headline statistics. I am quite aware that it is bad practice just to publish a load of visualisations without either an explicit or implicit story. If this bothers you, you might want to stop now, or visit my short piece “East and West: two worlds of technology enhanced learning“, which uses the first method outlined below but is not such a “bag of parts”. If you want to weave your own story… read on!
Stage 1: Dominant Themes
The starting point is simply to look at the dominant themes in blog posts from 2011 and early 2012 through the lens of frequent terms used. Common words with little significance (stop words) are removed and similar words are aggregated (e.g. learn, learner, learning). This set of blog posts is then split into two sets: those from CETIS and those from a broadly representative set of Ed Tech blogs. The frequent terms are then filtered into those that are statistically more significant in the CETIS set and those that are statistically more significant in the Ed Tech set.
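The actual analysis was done in R, but the “statistically more significant in one set” step can be sketched in Python. A common choice for this kind of corpus comparison is the Dunning log-likelihood (G2) statistic; I am assuming that here as an illustration, not asserting it is exactly what the R code does. The function names are my own.

```python
import math
from collections import Counter

def log_likelihood(k1, n1, k2, n2):
    """Dunning log-likelihood (G2) for a term occurring k1 times in a
    corpus of n1 tokens versus k2 times in a corpus of n2 tokens."""
    e1 = n1 * (k1 + k2) / (n1 + n2)  # expected count in corpus 1
    e2 = n2 * (k1 + k2) / (n1 + n2)  # expected count in corpus 2
    g2 = 0.0
    if k1 > 0:
        g2 += k1 * math.log(k1 / e1)
    if k2 > 0:
        g2 += k2 * math.log(k2 / e2)
    return 2 * g2

def distinctive_terms(corpus_a, corpus_b, threshold=3.84):
    """Terms significantly MORE frequent in corpus_a than corpus_b.
    G2 > 3.84 is roughly p < 0.05 with one degree of freedom."""
    ca, cb = Counter(corpus_a), Counter(corpus_b)
    na, nb = sum(ca.values()), sum(cb.values())
    return sorted(
        t for t in ca
        if ca[t] / na > cb.get(t, 0) / nb
        and log_likelihood(ca[t], na, cb.get(t, 0), nb) > threshold
    )
```

Running `distinctive_terms` once with the CETIS posts as `corpus_a` and once with the broader Ed Tech posts as `corpus_a` yields the two filtered term lists described above.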
Stage 2a: Rising and Falling Terms
In this case, I home in on CETIS blogs only, but go back further in time: to January 2009. The blog posts are split into two sets: one contains posts from the last 6 months and the other contains posts since the end of January 2009. The distribution of terms appearing in each set is compared to find those which are statistically significant in the change, taking into account the sample size. This process identifies four classes of term: terms that appear anew in recent months, terms that rose from very low frequencies, those that rose from moderate or higher frequencies and those that fell (or vanished).
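The four-way classification can be sketched as follows. This is a simplified Python illustration using raw counts only; the real (R) analysis also tests statistical significance against sample size, as noted above. The threshold for “very low frequency” is an assumption.

```python
from collections import Counter

def classify_term_changes(recent, baseline, low_freq=3):
    """Classify each term by how its count changed between a baseline
    period and the most recent months. `low_freq` is an assumed cut-off
    for 'rose from very low frequencies'."""
    r, b = Counter(recent), Counter(baseline)
    classes = {"new": [], "rose_from_low": [], "rose": [], "fell": []}
    for term in set(r) | set(b):
        before, after = b.get(term, 0), r.get(term, 0)
        if before == 0 and after > 0:
            classes["new"].append(term)        # appeared anew
        elif after > before and before <= low_freq:
            classes["rose_from_low"].append(term)
        elif after > before:
            classes["rose"].append(term)       # rose from moderate+
        elif after < before:
            classes["fell"].append(term)       # fell or vanished
    return classes
```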
The results of doing this are: “Rising and Falling Terms – CETIS Blogs Jan 31 2012”. This has a VERY LARGE number of plots, many of which can be skipped over but are of use when trying to dig deeper. This auto-generated report also contains links to the relevant blog posts and ratings for “novelty” and “subjectivity”.
Stage 2b: Visualising Changes Over Time
Various terms were chosen from Stage 2a and the changes in time rendered using the (in-) famous “bubble chart”. Although these should not be taken too seriously since the quantity of data per time step is rather small, these allow for quite a lot of experimentation with a range of related factors: term frequency, number of documents containing the term, positive/negative sentiment in posts containing the term. Four separate charts were created for CETIS blogs from 2009-2012: Rising, Established, Falling and Familiar (dominant terms from Stage 1). The dominant non-CETIS terms are also available, but only for 2011.
Due to some problems with the blog crawler, a number of blogs could not be processed or had incompletely extracted postings so this is not truly representative. The results are not expected to change dramatically but there will be some terms appearing and some disappearing when these issues are fixed. This posting will be altered and the various auto-generated reports will be re-generated in due course.
The R code used, and results from using the same methods on conference abstracts/papers are available from my GitHub. This site also includes some notes on the technicalities of the methods used (i.e. separate from the way these were actually coded).
Over on the JISC Observatory website a recent interview with Seb Schmoller has just been published in which he talks about his experiences – from the perspective of an online distance educator – of the recent large scale open online course “Introduction to AI” run in association with Stanford University. As the interview unfolded it occurred to me that the aspects of the course that had struck Seb as being of potentially profound importance fitted the criteria for a “low end disruptive innovation” in the terminology of innovation theorist Clayton M Christensen. Low end disruption refers to the way apparently well-run businesses can be disrupted by newcomers with cheaper but good-enough offerings that focus on core customer needs and often make use of generic off-the-shelf technologies.
The problems that have to be solved in the 21st century to maintain or increase human health, wealth and happiness are highly complex. By “complex”, I mean that they are highly interconnected and impossible to understand accurately by looking at influential factors in isolation. Divide-and-conquer strategies and narrowly-focussed expertise are inadequate unless re-integrated to understand the bigger picture. This state of affairs is currently reflected in much research funding but isn’t just a concern for researchers.
Professionals in almost all walks of life will be faced with making decisions on matters for which there is little precedent and a shortage of established professional practice to fall back on. There is, and will be, a growing need for professionals capable of drawing on information from, and adapting to the paradigms of, multiple disciplines.
The trend in supply of data from both research and public sector communities is clearly in the direction of more information being provided under suitable “open data” licences and employing basic semantic web techniques. This resource, not confined by disciplinary boundaries or utility to specific lines-of-argument, has great potential value in answering the complex and novel questions required to navigate humanity through the complexities of sustainable development. I contend that realisation of the potential of this information is contingent on a new information literacy, specifically a new digital literacy if we are concerned with open data on the web or otherwise.
Whereas the use of historical data is well established in research in many disciplines, realising the potential noted above requires a new information literacy: one that is not limited to academic research, that uses data disembodied from the narrative and argument of the journal article, and that transcends the limits of established disciplines. The challenge for the education system is to prepare professionals of the future (and to help professionals of today adapt through appropriate work-based learning) with this new information literacy. This “new information literacy” requires a deeper and more explicit understanding of models employed within and outwith a professional’s “home” discipline and the embedded epistemology of that discipline.
The philosophy of General Semantics and the practices advocated by Alfred Korzybski and subsequent thinkers are of interest in that their focus is on “consciousness of abstracting” as a means of avoiding the conceptual errors often made in interpreting linguistic acts or experience of events. Rather than asserting that General Semantics is “the answer” (it certainly contains fallacies and unsubstantiated ideas), I suggest that it offers some valuable insight into the mental habits that can improve the ability of professionals to work across disciplines, whether using Open Data from research and public sector sources or not, to answer the questions of tactical and strategic character that sustainable development requires.
Neuro-Linguistic Programming (NLP), founded as a movement in the mid 1970s and with clear links to Korzybski, contains some further useful ideas if one looks beyond the psychotherapeutic dimension. My intention is not to develop a detailed position in this post but to suggest that there are some practices/habits advocated by Korzybski and others that offer a resource for us to consider. Some of the maxims and practices that I think are candidates for education in the new information literacy, hence also a new digital literacy, are:
Korzybski’s “extensional devices” are practical habits that stress the relationships between things (as opposed to things defined in isolation)
Gregory Bateson in “Mind and Nature, A Necessary Unity” presents a number of pre-suppositions that “every schoolboy knows” (sic) that are actually more representative of gaps in thought.
The meta-model of NLP provides a set of heuristic questions to identify distortion, generalization and deletion in language. These kind of questions are potentially useful when working across disciplines to reduce the chance of false-reasoning.
I will now complete the title, where the ellipsis left off: “… General Semantics, Neuro-Linguistic Programming and Gregory Bateson”
JISC is hosting the September IMS Quarterly meeting from September 15th to 18th in Birmingham (UK). Usually these meetings are for IMS member representatives but this time the sessions are all open to the public (registration required).
While there will be some work on interoperability specifications at the meeting, most of the sessions are far less hard-core but no less important, dealing with strategy and requirements:
Learning Object Discovery and Exchange
Question and Test Interoperability
Common Cartridge (including interoperability testing and demos)
I’ve been discussing eportfolios with colleagues this afternoon, particularly the Portfolio InterOperability Prototyping Project (PIOP). PIOP is looking at using Atom as a basis for ePortfolio interoperability. PLEX is an exemplar of a personal learning environment (PLE) created a few years ago. Quite a few people have made connections between the concept of a PLE and the personal character of an ePortfolio but we generally see ePortfolios realised in institutional software applications, i.e. institutional rather than personal environments.
The thought which occurred to me was triggered by talking about a perceived requirement for learners to take their portfolio content away with them when leaving a learning institution. Realistically, today, this would probably end up being a zipped-up collection of web pages, images and documents or a cryptic technical format. Maybe the learner would have a saved version at home. Chances are that structure and content would be lost or be unusable to the poor learner. But educational institutions should probably support meaningful export of a learner’s digital material, tutor comments, etc. as well. One of the PIOP scenarios involves someone making a transition from an NVQ3 into a Foundation Degree at a different institution but what would happen if there is a small break between the two? Is it realistic to expect institutions to exchange portfolio content in such a case?
Assuming there will be institutional software holding ePortfolio information while a course is being undertaken, wouldn’t it be a good idea to use the PIOP specification to build an adapter for a PLE like PLEX? This could both give the learner the rich structure and allow them to select and publish into the ePortfolio system of their next course provider or into their personal blog. As many blog applications support Atom, the backbone of PIOP, it would be natural to include a “download your institutional blog” facility too. Something like Atom Publishing Protocol (APP) would work nicely for the uploading: APP already has support in the blog world and could be extended along the PIOP lines much as the SWORD project did for repository deposit.
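To make the APP idea concrete, here is a minimal Python sketch of pushing a portfolio item to an AtomPub collection: build a bare Atom entry and POST it with the standard `application/atom+xml;type=entry` media type. The collection URL would be supplied by the receiving service; everything here is illustrative, not part of any real PIOP implementation.

```python
import urllib.request

ATOM_NS = "http://www.w3.org/2005/Atom"

def make_entry(title, content):
    """Build a bare Atom entry document as a UTF-8 byte string.
    A PIOP profile would add portfolio-specific elements."""
    return (
        '<?xml version="1.0" encoding="utf-8"?>'
        f'<entry xmlns="{ATOM_NS}">'
        f"<title>{title}</title>"
        f'<content type="text">{content}</content>'
        "</entry>"
    ).encode("utf-8")

def post_entry(collection_url, entry_bytes):
    """POST the entry to an APP collection; per RFC 5023 a 201
    response carries the Location of the newly created member."""
    req = urllib.request.Request(
        collection_url,
        data=entry_bytes,
        headers={"Content-Type": "application/atom+xml;type=entry"},
        method="POST",
    )
    return urllib.request.urlopen(req)
```

This is exactly the pattern blog clients already use to publish posts, which is why the “download your institutional blog” facility would come almost for free.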
Where does this approach get us? It bridges the personal and the institutional. It exploits a shared web technology (Atom) for ePortfolio, treading a path of low resistance. It is an uncontroversial scenario – a learner working with their stuff – that could be a Trojan Horse for getting adoption of ePortfolio interoperability, from which platform further innovation can later jump.
The licence terms of interoperability specifications have been a topic on which a great deal of fear, uncertainty and doubt has been spread over the last year or so. There is a clear tension between retaining control of a specification, so that divergent unofficial versions do not fuel the creation of numerous incompatible dialects, and supporting creative application; the problems caused by slow-moving formal processes to address specification bugs are equally clear.
A recent announcement by IMS on their intention to pilot a form of Creative Commons licence is, therefore, worthy of applause. The Fundamentalists will point to the “a form of…” in the press release but Realists will recognise a step in the right direction.