a blog about research communication & higher education & open culture & technology & making & librarianship & stuff

GLAM Data Science Network fellow travellers

Updates

  • 2021-02-04 Thanks to Gene @dzshuniper@ausglam.space for suggesting ADHO and a better attribution for the opening quote (see comments below for details)


“If you want to go fast, go alone. If you want to go far, go together.” – African proverb, probably popularised in English by Kenyan church leader Rev. Samuel Kobia (original)

This quote is a popular one in the Carpentries community, and I interpret it in this context to mean that a group of people working together is more sustainable than individuals pursuing the same goal independently. That’s something that speaks to me, and that I want to make sure is reflected in nurturing this new community for data science in galleries, archives, libraries & museums (GLAM). To succeed, this work needs to be complementary and collaborative, rather than competitive, so I want to acknowledge a range of other networks & organisations whose activities complement this.

Read more...

Series: GLAM Data Science Network

I’ve updated my blog theme to use the quasi-proportional fonts Iosevka Aile and Iosevka Etoile. I really like the aesthetic, as they look like fixed-width console fonts (I use the true fixed-width version of Iosevka in my terminal and text editor) but they’re actually proportional which makes them easier to read.
https://typeof.net/Iosevka/

Training a model to recognise my own handwriting

If I’m going to train an algorithm to read my weird & awful writing, I’m going to need a decent-sized training set to work with. And since one of the main things I want to do with it is to blog “by hand” it makes sense to focus on that type of material for training. In other words, I need to write out a bunch of blog posts on paper, scan them and transcribe them as ground truth. The added bonus of this plan is that after transcribing, I also end up with some digital text I can use as an actual post: multitasking!

Read more...

Blogging by hand

I wrote the following text on my tablet with a stylus, which was an interesting experience:

So, thinking about ways to make writing fun again, what if I were to write some of them by hand? I mean I have a tablet with a pretty nice stylus, so maybe handwriting recognition could work. One major problem, of course, is that my handwriting is AWFUL! I guess I’ll just have to see whether the OCR is good enough to cope…

Read more...

What I want from a GLAM/Cultural Heritage Data Science Network

Introduction

As I mentioned last year, I was awarded a Software Sustainability Institute Fellowship to pursue the project of setting up a Cultural Heritage/GLAM data science network. Obviously, the global pandemic has forced a re-think of many plans and this is no exception, so I’m coming back to reflect on it and make sure I’m clear about the core goals so that everything else still moves in the right direction.

Read more...

Series: GLAM Data Science Network

Writing About Not Writing

Discount signs in a shop window

Under Construction Grunge Sign by Nicolas Raymond, CC BY 2.0

Every year, around this time of year, I start doing two things. First, I start thinking I could really start to understand monads and write more than toy programs in Haskell. This is unlikely to ever actually happen unless and until I get a day job where I can justify writing useful programs in Haskell, but Advent of Code always gets me thinking otherwise.

Read more...

IDCC20 reflections

I’m just back from IDCC20, so here are a few reflections on this year’s conference. You can find all the available slides and links to shared notes on the conference programme. There’s also a list of all the posters and an overview of the Unconference.

Skills for curation of diverse datasets

Here in the UK and elsewhere, you’re unlikely to find many institutions claiming to apply a deep level of curation to every dataset, software package, etc. deposited with them. There are so many different kinds of data and so few people in any one institution doing “curation” that it’s impossible to do this for everything. Absent the knowledge and skills required to fully evaluate an object, the best that can usually be done is to sense-check the metadata and flag to the depositor any potential high-level issues, such as accidental disclosure of sensitive personal information.

Read more...

Iosevka is a nice, slender monospace font with a lot of configurable variations. Check it out: https://typeof.net/Iosevka/

I’m honoured and excited to be named one of this year’s Software Sustainability Institute Fellows. There’s not much to write about yet because it’s only just started, but I’m looking forward to sharing more with you. In the meantime, you can take a look at the 2020 fellowship announcement and get an idea of my plans from my application video: