a blog about research communication & higher education & open culture & technology & making & librarianship & stuff

IDCC 2017 reflection

For most of the last few years I've been lucky enough to attend the International Digital Curation Conference (IDCC). One of the main audiences is people who, like me, work on research data management (RDM) at universities around the world, and it's begun to feel like a sort of "home" conference to me. This year, IDCC was held at the Royal College of Surgeons in the beautiful city of Edinburgh.

For the last couple of years, my overall impression has been that, as a community, we're moving away from the "first-order" problem of trying to convince people (from PhD students to senior academics) to take RDM seriously and into a rich set of "second-order" problems around how to do things better and widen support to more people. This year has been no exception. Here are a few of my observations and takeaway points.

Read more...

Chat rooms vs Twitter: how I communicate now

Telephones

CC0, Pixabay

This time last year, Brad Colbow published a comic in his “The Brads” series entitled “The long slow death of Twitter”. It really encapsulates the way I’ve been feeling about Twitter for a while now.

Go ahead and take a look. I’ll still be here when you come back.

According to my Twitter profile, I joined in February 2009 as user #20,049,102. Twitter was nearing its third birthday and, though there were clearly a lot of people already signed up at that point, it was still relatively quiet, especially in the UK.

Read more...

Introducing PyRefine: OpenRefine meets Python

I’m knocking the rust off my programming skills by attempting to write a pure-Python interpreter for OpenRefine “scripts”.

OpenRefine logo

OpenRefine is a great tool for exploring and cleaning datasets prior to analysing them. It also records an undo history of all actions that you can export as a sort of script in JSON format. One thing that bugs me though is that, having spent some time interactively cleaning up your dataset, you then need to fire up OpenRefine again and do some interactive mouse-clicky stuff to apply that cleaning routine to another dataset. You can at least re-import the JSON undo history to make that as quick as possible, but there’s no getting around the fact that there’s no quick way to do it from a cold start.
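To make the idea concrete, here is a minimal sketch of replaying an exported history without opening OpenRefine. It is not PyRefine's actual code: the function name `apply_history` is made up for illustration, the rows are plain dictionaries, and it assumes the exported JSON is a list of operations identified by an `op` field, handling only column renames and removals.

```python
import json


def apply_history(rows, history_json):
    """Apply a small subset of OpenRefine operations to a list of row dicts.

    Hypothetical helper: assumes each operation is a JSON object with an
    "op" field, as in an exported OpenRefine operation history.
    """
    for op in json.loads(history_json):
        kind = op["op"]
        if kind == "core/column-rename":
            old, new = op["oldColumnName"], op["newColumnName"]
            # Rebuild each row, swapping the old column name for the new one
            rows = [
                {(new if key == old else key): value for key, value in row.items()}
                for row in rows
            ]
        elif kind == "core/column-removal":
            name = op["columnName"]
            # Rebuild each row without the removed column
            rows = [
                {key: value for key, value in row.items() if key != name}
                for row in rows
            ]
        else:
            # Anything else is outside the scope of this sketch
            raise NotImplementedError(f"unsupported operation: {kind}")
    return rows


example_history = json.dumps([
    {"op": "core/column-rename", "oldColumnName": "nm", "newColumnName": "name"},
    {"op": "core/column-removal", "columnName": "tmp"},
])
cleaned = apply_history([{"nm": "Ada", "tmp": 1}], example_history)
```

A real interpreter would also need to cover the operations that carry expressions, such as text transforms, which is where most of the work lies.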

Read more...

Implementing Yesterbox in emacs with mu4e

I’ve been meaning to give Yesterbox a try for a while. The general idea is that each day you only deal with email that arrived yesterday or earlier. This forms your inbox for the day, hence “yesterbox”.

Once you’ve emptied your yesterbox, or at least got through some minimum number (10 is recommended), then you can look at emails from today. Even then you only really want to be dealing with things that are absolutely urgent. Anything else can wait until tomorrow.
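The rule itself is easy to state in code. The following is only an illustration of the date cut-off in plain Python, not the mu4e configuration: `split_yesterbox` is a hypothetical name, and messages are assumed to be `(received, subject)` pairs with naive `datetime` timestamps.

```python
from datetime import date, datetime


def split_yesterbox(messages, today):
    """Split (received, subject) pairs into the yesterbox and today's mail."""
    # Everything received before midnight at the start of today is fair game
    start_of_today = datetime.combine(today, datetime.min.time())
    yesterbox = [m for m in messages if m[0] < start_of_today]
    todays_mail = [m for m in messages if m[0] >= start_of_today]
    return yesterbox, todays_mail


example_messages = [
    (datetime(2017, 3, 1, 16, 30), "Meeting notes"),
    (datetime(2017, 3, 2, 9, 5), "Lunch?"),
]
yesterbox, todays_mail = split_yesterbox(example_messages, date(2017, 3, 2))
```

Here the meeting notes land in the yesterbox and the lunch invitation waits until the yesterbox is dealt with.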

Read more...

Rewarding good practice in research

Carrot + Stick < Love from opensource.com

From opensource.com on Flickr

Whenever I’m involved in a discussion about how to encourage researchers to adopt new practices, eventually someone will come out with some variant of the following phrase:

“That’s all very well, but researchers will never do XYZ until it’s made a criterion in hiring and promotion decisions.”

With all the discussion of carrots and sticks I can see where this attitude comes from, and strongly empathise with it, but it raises two main problems:

Read more...

Software Carpentry: SC Test; does your software do what you meant?

“The single most important rule of testing is to do it.”
Brian Kernighan and Rob Pike, The Practice of Programming (quote taken from SC Test page)

One of the trickiest aspects of developing software is making sure that it actually does what it’s supposed to. Sometimes failures are obvious: you get completely unreasonable output or even (shock!) a comprehensible error message.

But failures are often more subtle. Would you notice if your result was out by a few percent, or consistently ignored the first row of your input data?
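As a small illustration of the second kind of failure, here is a sketch of a test that would catch a function silently skipping the first row. The function `column_mean` and the data are invented for the example; the test is written as a plain function with an `assert`, the style a runner such as pytest would pick up.

```python
def column_mean(rows, column):
    """Return the mean of one numeric column across all rows."""
    values = [float(row[column]) for row in rows]
    return sum(values) / len(values)


def test_column_mean_uses_every_row():
    # The first value is deliberately different from the rest, so an
    # implementation that skipped the first row would give 2.5, not 5.0
    rows = [{"x": "10"}, {"x": "2"}, {"x": "3"}]
    assert column_mean(rows, "x") == 5.0


test_column_mean_uses_every_row()
```

The trick is choosing input where the bug would change the answer: three identical rows would pass whether or not the first one was read.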

Read more...

Series: Software Carpentry Archaeology

Tools for collaborative markdown editing

Discount signs in a shop window

Photo by Alan Cleaver

I really love Markdown. I love its simplicity; its readability; its plain-text nature. I love that it can be written and read with nothing more complicated than a text-editor. I love how nicely it plays with version control systems. I love how easy it is to convert to different formats with Pandoc and how it’s become effectively the native text format for a wide range of blogging platforms.

Read more...

Software Carpentry: SC Track; hunt those bugs!

This competition will be an opportunity for the next wave of developers to show their skills to the world — and to companies like ours. — Dick Hardt, ActiveState (quote taken from SC Track page)

All code contains bugs, and all projects have features that users would like but which aren’t yet implemented. Open source projects tend to get more of these as their user communities grow and start requesting improvements to the product. As your open source project grows, it becomes harder and harder to keep track of and prioritise all of these potential chunks of work. What do you do?

Read more...

Series: Software Carpentry Archaeology

Software Carpentry: SC Config; write once, compile anywhere

Nine years ago, when I first released Python to the world, I distributed it with a Makefile for BSD Unix. The most frequent questions and suggestions I received in response to these early distributions were about building it on different Unix platforms. Someone pointed me to autoconf, which allowed me to create a configure script that figured out platform idiosyncracies. Unfortunately, autoconf is painful to use – its grouping, quoting and commenting conventions don’t match those of the target language, which makes scripts hard to write and even harder to debug. I hope that this competition comes up with a better solution — it would make porting Python to new platforms a lot easier!
Guido van Rossum, Technical Director, Python Consortium (quote taken from SC Config page)

Read more...

Semantic linefeeds: one clause per line

I’ve started using “semantic linefeeds” when writing content, a concept I discovered on Brandon Rhodes’ blog and one that his article describes far better than I could. It turns out this is a very old idea, promoted way back in the day by Brian W Kernighan, contributor to the original Unix system, co-creator of the AWK and AMPL programming languages and co-author of a lot of seminal programming textbooks including “The C Programming Language”.

Read more...