Getting your (data) house in order: Time for a seasonal clean?
As the seasons change in many parts of the world, we are reminded of the importance of periodic maintenance - whether it's sweeping up the leaves in the autumn, or cleaning out our wardrobes in the spring. But what about our data? As we're inundated with more and more online surveys, it's easy for our digital spaces to become just as cluttered as our physical ones. And just like a cluttered closet, a cluttered set of spreadsheets can make it difficult to find what we need and gain valuable insights. That's why it's time for a seasonal clean of our data - a chance to take stock, review what we've collected, and make intentional choices about what to keep, and what to let go.
Over the past few months I’ve worked on several projects that involve combining, cleaning and analysing quantitative data stored across multiple spreadsheets. Lots of spreadsheets. Now don’t get me wrong, I love numbers. I’m originally a numbers girl. Nothing used to make me happier than a clean set of code in Stata, or a beautiful query in Microsoft Access (ask me sometime about the famous six-step append).
But in our work and our lives we are now inundated with online surveys about everything; the fact that it is so easy to create one has its downsides. The projects I’ve been working on have zillions of spreadsheets, sometimes containing data from the same people, but not in a way that can be easily combined and analysed over time. Online surveys often have low response rates, so while all this work goes into creating survey forms, sending them out, and presumably promoting them, very little data comes back. Often, that data just sits there: someone might scan through the comments before moving on with their next task, day, or week, without taking the time to go systematically through what has been collected. That means they often don’t explore what the numbers are telling them, reflect on the findings individually or collectively, or use them to drive learning and improvement.
I guess I then get the “joy” of cleaning up and making sense of their data clutter, which is something I can do, and often like to geek out on, especially when I get to learn some new data analysis formulas and tricks. However, this takes my time and costs the organisation money: time and money that might be better spent on higher-value tasks.
There does seem to be something missing. Why go to all the effort of collecting data that is poorly completed, generally doesn’t tell you the complete story, and hasn’t been collected in a way that makes it easy for you, as the organisation delivering the program or service, to analyse it? You can always pay someone else to clean up your data clutter, but you may want to invest that money in higher-value tasks. For instance, I pay a cleaner to come every fortnight, but to maximise the use of their time (including the quality of the clean) we all tidy up our surfaces and general mess before they come. Then every so often, I pay the cleaners to stay longer and take on the deeper cleaning tasks that I particularly hate, like cleaning the oven. Win!
Let's reduce data clutter. This means becoming more selective and thoughtful about what data we collect, and planning how we will use it. We should also invest in periodic review and cleaning of our data to limit problems later, just as the periodic collection of leaves in autumn prevents clogged drains and flooding in winter, and an occasional deep clean of the fridge or oven is time or money well spent. What are your data cleaning practices? Is it time for a seasonal clean?
Grateful for the editing assistance provided by human (Emma Thomas) and non-human means.
Photo credit: Timothy Eberly on Unsplash