Open Access Button
Every day people around the world such as doctors, scientists, students and patients are denied access to the research they need. With the power of the internet, the results of academic research should be available to all. It’s time to showcase the impact of paywalls and help people get the research they need. That’s where Open Access Button comes in.

The Open Access Button is a browser plugin that allows people to report when they hit a paywall and cannot access a research article. Head to openaccessbutton.org to sign up for your very own Button and start using it.

I just want to flag up this cool project that’s trying to improve access to scholarly literature for everyone. I’ve been involved with the project from the start, helping to figure out how to tie it in with open access repositories, but it’s medical students David Carroll and Joe McArthur who deserve the credit for coming up with the idea and driving it forward.

More than 5,000 blocked articles have already been logged. It even got mentioned in the Guardian! Take a look, give it a try or even get involved:


How did I get involved?

Last year, I spent some free time creating an experimental web tool to look up Open Access[1] versions of scholarly articles from their DOIs[2]. There is already a system for getting the official version of record for any DOI, but it struck me that where that version is hidden behind a paywall and a free version is available elsewhere, it should be just as easy to find that.

This work got noticed by a group of people at a hack day[3], which resulted in my contributing to their project, the Open Access Button. The primary purpose of the OA Button is to allow people to report whenever they hit a paywall while trying to read an article (so that the scale of the problem can be visualised), and as an added bonus, we’re adding functionality to help gain access through other channels, including finding green open access versions and contacting the corresponding author.

  1. “Open Access” refers to content which is freely available for all to read and use (usually referring to scholarly articles that publish the results of academic research), as distinct from that which is only accessible by paying a fee (either per-article or as a subscription to a journal).

  2. A Digital Object Identifier (DOI) is a unique string of characters that identifies a published object, such as an article, book or dataset. They look something like this: 10.1000/182.

  3. A hack day is an opportunity for developers and non-developers to get together and prototype projects without any risk of loss or ridicule if things don’t work out — great for getting the creative juices flowing!


Gnu

As I’ve mentioned previously, I periodically try out new task management software. The latest in that story is Emacs and Org-mode.

What is Org?

In its creator’s own words, Org is:

“for keeping notes, maintaining TODO lists, planning projects, and authoring documents with a fast and effective plain-text system”

It started as an Emacs extension for authoring documents with some neat outlining features, then went mad with power and became a complete personal information organiser.

But wait, what the **** is Emacs?

Emacs is the mother of all text editors. It’s one of the oldest pieces of free software, having been around since the dawn of time (well, the 1970s), and is still under active development. Being so venerable, it still cleaves to the conventions of the 70s and is entirely keyboard-controllable (though it now has excellent support for your favourite rodent as well).

“Text editor” is actually a pretty loose term in this instance: it’s completely programmable, in a slightly odd language called Elisp (which appeals to my computer scientist side). Because many of the people who use it are programmers, it’s been extended to do almost anything that you might want, from transparently editing encrypted or remote (or both) files to browsing the web and checking your email.

My needs for an organisational system

In my last productivity-related post I mentioned that the key properties of a task management system were:

  • One system for everything
  • Multiple ways of structuring and viewing tasks

I would now probably add a third property: the ability to “shrink-wrap”, or be as simple as possible for the current situation while keeping extra features hidden until needed.

And Org very much fits the bill.

One system for everything

Emacs has been ported to pretty much every operating system under the sun, so I know I can use it on my Linux desktop at work, my iMac at home plus whatever I end up with in the future. Because the files are all plain text, they’re trivial to keep synchronised between multiple machines.

There are also apps for iOS and Android, and while they’re not perfect, they’re good enough for when I want to take my todo list on the road.

Multiple ways of structuring and viewing tasks

Whatever I’m doing in Emacs, an instant agenda with all my current tasks is only two keystrokes away. That’s programmable too, so I have it customised to view my tasks in the way that makes most sense to me.

Shrink wrapping

Org has a lot of very clever features added by its user community over its 10+ years, but you don’t have to use them, or even know they exist, until you need them. As an illustration, a simple task list in Org looks like this:

* TODO Project 1
** TODO Task one
** TODO Task two

* TODO Project 2
** DONE Another task
** TODO A further task

And changing TODO to DONE is a single keystroke. Simplicity itself.

Here’s Carsten Dominik on the subject:

“[Org-mode] is a zero-setup, totally simple TODO manager that works with plain files, files that can be edited on pretty much any system out there, either as plain text in any editor …

Of course, Org-mode allows you to do more, but I would hope in a non-imposing way! It has lots of features under the hood that you can pull in when you are ready, when you find out that there is something more you’d like to do.”

Wow, what else can it do?

“I didn’t know I could do that!”

If that’s not enough, here are a few more reasons:

  • Keyboard shortcuts for quick outline editing
  • Lots of detailed organisational tools (but only when you need them; see the example after this list):
    • Schedule and deadline dates for tasks
    • Flexible system for repeating tasks/projects
    • Complete tasks in series or parallel
    • Arbitrary properties, notes and tags for tasks and projects
  • Use the same tools for authoring HTML/LaTeX documents or even literate programming
  • It’s programmable! If it doesn’t have the functionality you want, just write it, from adding keyboard shortcuts to whole new use cases (such as a contact manager or habit tracker)
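
To give a flavour of those scheduling tools, here’s what a repeating task and a deadline look like in Org’s plain-text syntax (an invented example for illustration, not from my real files):

* TODO Water the plants
  SCHEDULED: <2014-01-06 Mon +1w>
* TODO Submit conference abstract
  DEADLINE: <2014-02-14 Fri>

The “+1w” repeater shifts the scheduled date forward a week each time the task is marked DONE, so it pops back onto the agenda automatically.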

Give it a try

Emacs is worth trying on its own, especially if you do a lot of programming, web design or anything else that involves a lot of time editing text files. A recent version of Org is bundled with the latest GNU Emacs, and can easily be updated to the current version.


So, as you’ll have seen from my last post, I’ve been putting together an alternative DOI resolver that points to open access copies in institutional repositories. I’m enjoying learning some new tools and the challenge of cleaning up some not-quite-ideal data, but if it’s to grow into a useful service, it needs several things:

A better name

Seriously. “Open Access DOI Resolver” is descriptive but not very distinctive. Sadly, the only name I’ve come up with so far is “Duh-DOI!” (see the YouTube video below), which doesn’t quite convey the right impression.

A new home

I’ve grabbed a list of DOI endpoints for British institutional repositories — well over 100. Having tested the code on my iMac, I can confirm it happily harvests DOIs from most EPrints-based repositories. But I’ve hit 10,000 database rows (the free limit on Heroku, the current host) with just the DOIs from a single repository, which means the public version won’t be able to resolve anything from outside Bath until the situation changes.

Better standards compliance

It’s a fact of life that everyone implements a standard differently. OAI-PMH and Dublin Core are no exception. Some repositories report both the DOI and the open access URL in <dc:identifier> elements; others use <dc:relation> for both while using <dc:identifier> for something totally different, like the title. Some don’t report a URL for the item’s repository entry at all, only the publisher’s (usually paywalled) official URL.

There are efforts under way to improve the situation (like RIOXX), but until then, the best I can do is gradually implement better heuristics to standardise the diverse data available. To do that, I’m collecting examples of repositories that break my harvesting algorithm and fixing each case as it comes up, but that’s a fairly slow process since I’m only working on this in my free time.
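
To give a flavour of what those heuristics involve, here’s a much-simplified Ruby sketch; the method name, fields and patterns are illustrative only, not the actual harvester code:

# Pick a DOI and a repository URL out of Dublin Core fields,
# wherever a given repository has chosen to put them.
DOI_PATTERN = %r{\b10\.\d{4,9}/\S+}

def extract_doi_and_url(identifiers, relations)
  candidates = identifiers + relations
  # The DOI can hide in either field, sometimes wrapped in a resolver URL.
  doi = candidates.map { |v| v[DOI_PATTERN] }.compact.first
  # Prefer a link that isn't just a DOI resolver pointing at the publisher.
  url = candidates.find { |v| v.start_with?("http") && v !~ /doi\.org/ }
  [doi, url]
end

The real thing needs a rule like this per repository quirk, which is why it grows so slowly.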

xkcd: Standards

Better data

Even with better standards compliance, the tool can only be as good as the available data. I can only resolve a DOI if it’s actually been associated with its article in an institutional repository, but not every record that should have a DOI has one. It’s possible that a side benefit of this tool is that it will flag up the proportion of IR records that have DOIs assigned.

Then there’s the fact that most repository front ends seem not to do any validation on DOIs. As they’re entered by humans, there’s always going to be scope for error, so there should be some validation in place to at least try to detect it. Here are just a few of the “DOIs” from an anonymous sample of British repositories:

  • +10.1063/1.3247966
  • /10.1016/S0921-4526(98)01208-3
  • 0.1111/j.1467-8322.2006.00410.x
  • 07510210.1088/0953-8984/21/7/075102
  • 10.2436 / 20.2500.01.93
  • 235109 10.1103/PhysRevB.71.235109
  • DOI: 10.1109/TSP.2012.2212434
  • ShowEdit 10.1074/jbc.274.22.15678
  • http://doi.acm.org/10.1145/989863.989893
  • http://hdl.handle.net/10.1007/s00191-008-0096-6
  • <U+200B>10.<U+200B>1104/<U+200B>pp.<U+200B>111.<U+200B>186957

In some cases it’s clear what the error is and how to correct it programmatically. In other cases any attempt to correct it is guesswork at best and could introduce as many problems as it solves.

That last one is particularly interesting: the <U+200B> codes are “zero width spaces”. They don’t show on screen but are still there to trip up computers trying to read the DOI. I’m not sure how they would get there other than by a deliberate attempt on the part of the publisher to obfuscate the identifier.
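
Here’s the sort of clean-up I have in mind, as a rough Ruby sketch built around a hypothetical normalise_doi helper (the rules below are examples of the obvious fixes, not a definitive list):

# Strip the obvious cruft from a hand-entered DOI, and refuse to
# guess when no plausible DOI remains afterwards.
def normalise_doi(raw)
  doi = raw.dup
  doi.gsub!(/\u200B/, "")         # zero-width spaces
  doi.gsub!(/\s+/, "")            # stray whitespace, as in "10.2436 / 20.2500.01.93"
  doi.sub!(/\A.*?(?=10\.\d)/, "") # leading junk: "DOI:", "+", resolver URLs
  doi =~ %r{\A10\.\d{4,9}/\S+\z} ? doi : nil
end

normalise_doi("DOI: 10.1109/TSP.2012.2212434")   # => "10.1109/TSP.2012.2212434"
normalise_doi("0.1111/j.1467-8322.2006.00410.x") # => nil

In a sketch like this, anything that still isn’t DOI-shaped (such as that truncated “0.1111” example) is better returned as nil for a human to look at than silently “fixed”.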

It’s also only really useful where the repository record we’re pointing to actually has the open access full text, rather than just linking to the publisher version, which many do.

A license

OK, this one’s pretty easy to solve. I’m releasing the code under the GNU General Public License. It’s on GitHub, so go fork it.

And here’s the video I promised:


The other week I was at a gathering of developers, librarians and researchers with an interest in institutional data repositories. Amongst other things, we spent some time brainstorming the requirements for such a repository, but there was one minor-sounding one that caught my imagination.

It boiled down to this question: given only the DOI for a published article (or other artefact), how do you find an open access copy archived in an institutional repository? Some (rather cursory) Googling didn’t come up with an obvious solution, so I thought “How hard can it be to implement?”.

All that’s required is a database mapping DOIs onto URLs, and a spot of glue to make it accessible over the web. The data that you need is freely available in machine-readable format from most repositories via OAI-PMH, so you can fill up the database using that as a data source.
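
To make that “spot of glue” concrete, here’s roughly the shape it takes as a minimal Sinatra sketch. This is an illustration rather than the actual source (which is linked below); in particular, I’ve assumed the Sequel gem for database access:

require 'sinatra'
require 'sequel'

# One table mapping DOIs to open access URLs in repositories.
DB = Sequel.connect(ENV['DATABASE_URL'] || 'sqlite://dois.db')

# A splat route, because DOIs contain slashes (e.g. 10.1000/182).
get '/*' do |doi|
  record = DB[:mappings].first(doi: doi)
  halt 404, "No open access copy known for #{doi}" unless record
  redirect record[:url]
end

Resolving then looks just like following a normal DOI link: visit /10.1000/182 and get bounced to the repository copy, if one is known.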

So, without further ado, here it is:

A few caveats:

  1. I don’t get much chance to write code at work at the moment, so this was an opportunity to exercise under-used brain muscles and learn some new stuff. It could probably be done better (and the source code is on GitHub, so feel free to fork it and add improvements). It’s written in Ruby using the awesome Sinatra web framework.
  2. It’s currently hosted on Heroku’s free starter-level service, so there’s very little capacity. It therefore only includes DOIs from the University of Bath’s Opus repository, and the database is full.

Go try it out and let me know what you think. If it’s useful, I’ll look into how I can make it more robust and resolve more DOIs.


It’s taken a while for me to realise it, but I’m a bit of a list-maker. Some years ago I read David Allen’s Getting Things Done (often abbreviated as GTD) and found some useful tips that have had a big impact on how I manage my tasks and my time.

There are heaps of apps to help you Get your Things Done, but I generally seem to oscillate between two: Omni Group’s OmniFocus and Cultured Code’s Things. The choice between the two is closely balanced in my head, and I seem to end up switching every 12-18 months. Until recently, Things’ lightning-fast cloud sync had me, but now OmniFocus has tempted me back with its general feature-richness.

Some key factors for me:

  • One system for everything:
    • One system that syncs across computers and mobile devices, so I always have it with me;
    • One system for work stuff and personal stuff, because sometimes I need to phone my bank while at work and sometimes the solution to a work problem comes to me while watching TV;
  • Multiple ways of structuring and viewing tasks:
    • When I need to check that I’ve captured all my tasks, I need to view them by project to see what’s missing;
    • When I need to actually do things, I need to see my tasks by context, i.e. what equipment/location is required to do them.

Aside: switching is not inefficient

You might think that it’s a waste of time laboriously transferring all my projects and tasks from one system to another, but it’s really not. This only happens once every 12-18 months, and it’s a great way to do a full audit of everything I want to achieve, spot what’s missing and cull the dead wood.

Even if you have one task management system that works for you, I suggest you try occasionally printing the whole lot off (on real dead trees) and re-entering the important stuff. Because it takes more effort, it makes you more ruthless in what stuff you allow onto your todo list and sharpens your focus on what’s important.

OmniFocus vs. Things

OmniFocus’ strength is its flexibility. Each task has not only a title and a checkbox, but a project, a context, a start date, a due date, an expected amount of effort and, if that’s not enough, a freeform note field. It has a rich, hierarchical structure for projects and tasks, and the ability to create customised views of the system, or “perspectives”.

Things, on the other hand, strives for simplicity. It lacks much of the complexity of OmniFocus and replaces it with tags. Tags can be hierarchical, which is handy, and because you can assign more than one to a task, you can actually use them to replicate a number of OmniFocus’ detail fields.

Things is pretty good…

That simplicity means that there’s very little effort involved in using Things — just throw in your tasks and get started. You can assign one or more tags to each task and then filter on those, and that allows you to replicate quite a lot of what OmniFocus allows.

The other area where Things beats OmniFocus is in synchronisation. Every time you make a change in Things it’s synced up to the cloud, and updating another app takes moments. There’s no need to manually initiate a sync, so everything is always available everywhere.

…but OmniFocus is winning

Sooner or later, though, the lack of expressiveness in Things gets to me. OmniFocus panders to my desire for structure: I can have tasks from any project (or any part of a project) appear one at a time or all at once. That all takes a little more time to set up (though it soon becomes second nature), but it means when I actually want to get on with work I see only the tasks I need to see and no more.

OmniFocus’ perspectives are another example of where the extra power is useful. It’s trivial to set up one-click views that only show a certain set of projects (such as work stuff) or a particular set of tasks (such as things I can do offline), or even just group tasks differently (such as by due date or age).

Finally, the iPad app for OmniFocus has a killer feature: Review mode. This makes it trivial for me to sit down at the end of each week with a cup of tea and go through the entire system, finishing off loose ends and capturing next actions. This is central to the GTD way, and is the part of my routine that guarantees everything is in order and nothing gets missed.

Of course there are plenty of situations where you don’t need all of this complexity, and that’s fine too. It doesn’t force you into using all of the features to have a functioning system: you only have to use what you need for the current situation.

What about you?

So there you have it. I’d be interested in finding out how you use OmniFocus or Things, or if you have your own preferred system. There are even people who implement GTD using a biro, a binder clip and a stack of 6x4” index cards.


I quite often favourite tweets that I want to come back and refer to. Unfortunately, I rarely actually get round to going back over my favourite tweets, so I wanted a way to get them into an inbox that I check regularly (à la Getting Things Done).

I finally got round to figuring this out the other day, so here’s my recipe:

  1. You can get an RSS feed of your favourites using a URL of the form https://api.twitter.com/1/favorites.rss?screen_name=jezcope, though obviously you should replace “jezcope” with your own Twitter handle.
  2. Once you’ve checked that’s working (see the snippet below for one way to do that), copy it and feed it to a daily email digest generator. I’m currently trying blogtrottr which seems to be working well and gives you the option of checking at a range of frequencies from 1 to 24 hours.
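
As an aside, if you’d rather check the feed from a script than a browser, a little Ruby along these lines would do it (a sketch using the standard library’s RSS parser; swap in your own screen_name, obviously):

require 'rss'
require 'open-uri'

url = 'https://api.twitter.com/1/favorites.rss?screen_name=jezcope'
# Fetch the feed and parse it leniently (the XML isn't always strict).
feed = RSS::Parser.parse(open(url).read, false)

# Print the text of each favourited tweet.
feed.items.each { |item| puts item.title }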

That’s it — pretty simple huh? You’ll probably get an email containing all of your favourites to start, and then future emails will contain just the latest favourites.


Here are a few links to things that I mentioned (and maybe a few that I didn’t) in today’s briefing session for University of Bath researchers. Please feel free to leave your own suggestions, or links to your own blog, in the comments at the bottom.

Reading blogs

Once you start following more than two or three blogs, you might find it easier to use a piece of software called a “feed reader” or “news aggregator” (or other similar terms) to do the hard work and collect all the new posts in one place. Here are a few options:

  • Google Reader — web based
  • FeedDemon — Windows (optional sync with Google Reader)
  • Reeder — Mac, iOS (Google Reader account required)
  • Feedly — Browser plugin, iOS, Android (Google Reader account required)
  • All major web browsers now have some sort of feed reader built in too

Technorati and Google Blog Search are good ways to find something to read.

Ways to blog

Hosted

The simplest way to start a blog is to use a service (free or paid-for) which handles everything for you. Here are some examples:

Self-hosted

If you’re a bit more technical and/or demanding, you may prefer to host your own blog on a server. Here are some examples of software that will help:

Other tips


Finally, the moment you’ve all been waiting for: day 3 of ALT-C 2012!

First up, Professor Mark Stubbs (Head of Learning and Research Technologies at Manchester Metropolitan University) gave an interesting talk on the MMU curriculum redesign. This isn’t my primary interest, but there were some useful nuggets in there about change management. The key lessons they learned from a complete redesign of the undergraduate curriculum in a very short time were:

  1. Engage people; and
  2. Keep it simple.

I particularly liked how they revamped the forms for approving new modules to keep them short, focused and aligned with the desired outcomes of the project (rather than gathering huge amounts of spurious info and getting loads of irrelevant people to sign off). This approach has important lessons for us at Bath as we introduce Data Management Planning to our researchers.

Next up was JISC Head of Innovation Sarah Porter, talking about the ongoing reshaping of JISC in the wake of the HEFCE review.

My second session of the day was James Clay’s “Pilot mentality” symposium. This was based on James’s observation that although “pilot” usually implies something that will be tried out then reported on and scaled up, there seem to be a lot of so-called “pilots” which end up being one-offs. More worryingly, we see the same “pilots” being run across the sector.

I actually ended up writing a whole lot about this session here originally, without feeling like I’d done the topic justice, so I’ve scooped all of that out into its own post, to appear in the near future.

So, onto the final session of the conference, entitled “TEL[1] Research: Who needs it?” from the London Knowledge Lab’s Richard Noss. My reaction to this was mixed, I have to say, but overall there were some good points.

Eighty years after the invention of the printing press, it was still only being used to print the Bible, and we’ve been using computers in education for fewer than 50 years, so I agree that we probably don’t have a clue what ed. tech. will eventually end up looking like. We’re very good at using new technology to reproduce existing practices and processes, but it takes a while to realise its true potential.

He also wheeled out the old argument that you have to understand how a technology works to use it effectively. Agreed, his examples of senior managers in investment banks failing to understand basic statistics are compelling, but I don’t think it’s fully generalisable. After all, people have been making pretty good bread and cheese for centuries without understanding microbiology.

Understanding a technology means we can be more effective (and more subtle) about its use, but I don’t think complete understanding is a requirement for some level of effectiveness: part of being human is being very good at getting by.

I did like his comments about studying extremes of human behaviour to learn about the norm: I find in my work, sometimes, that I’m drawn to techies and luddites!

Anyway, it was quite a thought-provoking conference again, the more so because I’m more focused on research technology at the moment and attending helped me cross-fertilise a bit. I’m not sure if I’ll be going again next year: Digital Research is looking very interesting and tends to clash, so we’ll see.

  1. For those not involved in this area, TEL is the acronym for technology-enhanced learning.


It’s October, which means the autumn TV season has started, which means that Strictly Come Dancing is back on for another year, which means it’s time for a flurry of blog posts as I leave my wonderful other half to shout at the TV on weekend evenings.

I’ve decided to have another go at joining in with another MOOC to give me some blog fuel, and this time round it’s Current & Future State of Higher Education 2012.

My last MOOC attempt, IOE12, sort of fizzled out (my participation, not the course itself) as I didn’t really have the time to keep it going. Hopefully I’ll do better this time, but if not I’m sure I’ll learn something anyway.

So, hello fellow MOOCers and watch this space!


It’s been a little while since ALT-C 2012 now, so I thought I’d better write up the rest of my notes. Here’s day 2 in all its glory.

My day started off with James Clay’s workshop entitled “A few of my favourite things” — just an opportunity for gadget lovers to share some of their favourite apps (mostly iPad/iPhone, but a few Androids in there too).

There were a lot of popular apps in there, like the ever-present Evernote and Instagram, but there were a few interesting ones I hadn’t come across, or was able to see in a new light:

JotNot
Lets you take a photo of a page and semi-automatically straightens it and enhances it so you get a flat, high-contrast version — a scanner in your pocket. Looks like this is abandonware, but instead I discovered Genius Scan, which has many more features.
TunePal
One for lovers of traditional music: search for info on, and the dots (sheet music) for, a traditional tune by playing a bit of it into your phone.

Next followed an interesting session introducing some tools from projects on the JISC Digital Literacies programme. I particularly liked the digital literacies lens on the SCONUL Seven Pillars of Information Literacy. There’s a lot of (perhaps true but not very helpful) talk going round at the moment about “everyone having a different definition of digital literacy”, so it’s good to see a fairly concise high-level view of what we’re actually talking about on that subject.

As a recovering mathematician, I found Natasa Milic-Frayling’s keynote on network analysis fascinating. Her team at Microsoft Research have developed an Excel plugin, NodeXL, for analysing networks (and obtaining data from social networks to analyse).

She described some interesting work analysing voting patterns of US senators, and correlating connections in social networks with geographic distribution.

Students introduced to NodeXL were able to get straight into playing with network data, and quickly took on board the basic concepts (various ideas of the importance of a network node) without needing to grasp the underlying maths (such as the various equations for centrality).

My last session of the day was from Clive Young of University College London, talking about “blended” roles in e-learning. These are typically those people who provide general admin support to lecturers, and are increasingly being expected to manage VLE modules and other online elements of courses on behalf of the lecturers.

At UCL, these teaching administrators with blended roles had self-organised into a support network, as they were getting no targeted support on how to use Moodle from the e-learning team. This was, of course, rectified, and in the end 10% of the staff identified in blended roles went on to achieve CMALT status.

All interesting stuff, and I’ll be back to post my thoughts on day 3 soon.
