New features just released

22nd July 2014

If you’ve already got your beta invite then you’ll be able to check out the latest features:

  • Filter element - combine facets to filter down the data in your visualisation.
  • Average aggregation - you can now average values as well as sum them.
  • Searchable table charts added - find them under customise -> elements -> chart type.
  • Import from Google Docs, Dropbox or GitHub, or link to a file.
  • Proper support for dates so it’s easy to create time-series line charts.

Coming up: create your own maps, bucket dates and numbers, and export your visualisations as images.

Tell us what you want…

UI Designer / Developer Wanted

20th May 2014

Do you love data visualisation as much as we do? Would you like to spend your days crafting the future of Dataseed?

We’re looking for a UI designer with data visualisation experience, who incorporates UX research into their practice, and can ideally bring their designs to life using JavaScript.

We know we’re asking a lot, so if you come from a design background and you’re in the process of crossing over and becoming the person described above - please get in touch.

You will have a lot of creative freedom and will be expected to not only meet the needs of our users, but to delight them. We’re a small friendly team based in London, UK. Freelance positions considered initially, but with a view to developing a long-term relationship.

Please apply directly to: jobs@getdataseed.com

#dataviz, #d3js, #design, #ui, #ux

Preparing Data 101: Un-pivoting / de-normalizing

6th May 2014

In order for Dataseed to be able to understand your data, it must be in a standard format where each row is a single record.

Consider the following dataset showing life expectancy per year, per country. The data is in a “pivot table” format.

Pivot Table

Country Name   2005   2006   2007   2008   2009
Afghanistan    46.6   46.9   47.2   47.5   47.9
Albania        76.1   76.3   76.5   76.6   76.8
Algeria        71.6   71.9   72.2   72.4   72.6

We need to get the data into this format:

Standard Form

Country       Year   Life Expectancy
Afghanistan   2005   46.6
Afghanistan   2006   46.9
Afghanistan   2007   47.2
Afghanistan   2008   47.5
Afghanistan   2009   47.9
Albania       2005   76.1
Albania       2006   76.3
Albania       2007   76.5
Albania       2008   76.6
Albania       2009   76.8
Algeria       2005   71.6
Algeria       2006   71.9
Algeria       2007   72.2
Algeria       2008   72.4
Algeria       2009   72.6

This great article by Tariq Khokhar shows how to un-pivot your data using OpenRefine.

This YouTube video shows how to un-pivot using Excel.
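
If you would rather script it, here is a minimal sketch of the same un-pivot in plain Python, using only the standard library. The function name unpivot and its parameters are our own choices for illustration, not part of Dataseed, and the sample is just the three rows from the pivot table above. Note that the values stay as strings, exactly as they were read from the CSV.

```python
import csv
import io


def unpivot(rows, id_column, id_name, var_name, value_name):
    """Turn one wide row per id into one long row per (id, column) pair."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the id column is repeated on every output row
            long_rows.append({
                id_name: row[id_column],
                var_name: column,
                value_name: value,
            })
    return long_rows


# The pivot table from this post, written out as CSV text.
PIVOT_CSV = """Country Name,2005,2006,2007,2008,2009
Afghanistan,46.6,46.9,47.2,47.5,47.9
Albania,76.1,76.3,76.5,76.6,76.8
Algeria,71.6,71.9,72.2,72.4,72.6
"""

wide_rows = list(csv.DictReader(io.StringIO(PIVOT_CSV)))
standard_rows = unpivot(
    wide_rows, "Country Name", "Country", "Year", "Life Expectancy"
)

# Write the result back out as CSV, one record per row.
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=["Country", "Year", "Life Expectancy"])
writer.writeheader()
writer.writerows(standard_rows)
print(output.getvalue())
```

Each of the 3 countries has 5 year columns, so the result has 15 records, one per country and year.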

Shortlisted for the Information Is Beautiful Awards

31st October 2013

We’re delighted and very honoured to be shortlisted in the Information Is Beautiful Awards this year. There are some great entries so please go and check them out, and we certainly wouldn’t mind if you chose to vote for us :)

Merging spreadsheets with OpenRefine (Google Refine)

4th June 2013

Let’s face it - we’re not quite in the golden age of linked data yet.

While we’re here, let’s take a pragmatic look at how you can combine data from two spreadsheets, using OpenRefine. We’ll walk through a relatively simple real-life example.

We recently wanted to import some data from the UK 2011 Census into Dataseed, and to use a map to visualise the geographic dimension. Unfortunately the data used the new geographic codes exclusively, while we only had boundary data for the old geographic codes. Luckily we had a lookup table that simply mapped the old codes to the new ones.

So, the challenge then became to take the spreadsheet containing our boundary data, and change the old geographic codes to the new ones.

These are the steps we took:

  1. Create a project in OpenRefine for the boundary data spreadsheet.

  2. Create a project in OpenRefine for the table mapping old codes to new codes.

  3. In the boundary data project, use the “Add column based on this column” feature.

  4. Use the following GREL expression to pull in the data from NEW_COLUMN in PROJECT_NAME, linking on the COMMON_COLUMN:

       cell.cross("PROJECT_NAME", "COMMON_COLUMN").cells["NEW_COLUMN"].value[0]

  5. Et voilà, your new column has been created.

You’ll probably want to use the facet feature to check for rows that didn’t match. You might also want to add some exceptions or error handling to the GREL expression in step 4, but we’ll keep this simple for now.
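
For comparison, the same lookup can be sketched outside OpenRefine in plain Python. This is only an illustration under our own assumptions: the function add_lookup_column, the column names OLD_CODE and NEW_CODE, and the sample codes are all made up, not real census codes. Like the [0] in the GREL expression, it keeps the first match for each key, and it also collects the keys that found no match so you can check them.

```python
def add_lookup_column(rows, lookup_rows, common_column, new_column):
    """Add new_column to each row by looking up common_column in lookup_rows."""
    # Index the lookup table by the shared key; the first match wins.
    lookup = {}
    for lookup_row in lookup_rows:
        lookup.setdefault(lookup_row[common_column], lookup_row[new_column])

    merged = []
    unmatched = []
    for row in rows:
        key = row[common_column]
        new_row = dict(row)  # copy, so the input rows are left untouched
        new_row[new_column] = lookup.get(key)  # None when there is no match
        merged.append(new_row)
        if key not in lookup:
            unmatched.append(key)
    return merged, unmatched


# Hypothetical sample data standing in for the two spreadsheets.
boundaries = [
    {"OLD_CODE": "A1", "boundary": "polygon-1"},
    {"OLD_CODE": "B2", "boundary": "polygon-2"},
    {"OLD_CODE": "C3", "boundary": "polygon-3"},
]
code_lookup = [
    {"OLD_CODE": "A1", "NEW_CODE": "X100"},
    {"OLD_CODE": "B2", "NEW_CODE": "X200"},
]

merged, unmatched = add_lookup_column(
    boundaries, code_lookup, "OLD_CODE", "NEW_CODE"
)
print(merged)
print("No match for:", unmatched)
```

Rows without a match get an empty value instead of raising an error, which plays the same role as faceting on blanks in OpenRefine.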

Happy refining.

Most data isn’t “big,” and businesses are wasting money pretending it is

31st May 2013

Read the full article by Christopher Mims

I couldn’t agree with this more - “big data” has become a synonym for data analysis.

Our philosophy with Dataseed has always been to focus on quality of data rather than quantity. Christopher points to a paper stating that most of Facebook’s data problems are in the megabyte to gigabyte range. I was surprised to read that about Facebook, but I’m sure it’s true for the vast majority of businesses.

With Dataseed, we’re focusing on data in the 1 MB to 1 GB range. We believe most of the complexity comes from high dimensionality, rather than size. Our datacube visualisations allow you to explore datasets with multiple dimensions using clickable interactive charts.

Your responses to our data import survey

2nd April 2013

Thanks so much to everyone who answered the survey on data importing. The consensus was overwhelmingly in favour of CSV / Excel spreadsheet imports - so that’s what we’re building!

It’s not Dataseed, but here’s a screenshot of the aggregated responses. [roll on Survey Builder and Dataseed integration!]

How would you prefer to import your data?

12th March 2013

We’ve been beavering away, getting ready for beta. We want to make sure we get it right, so we’re asking you one important question:

Q: How would you prefer to get your data into Dataseed?

  • Google Spreadsheets
  • CSV / Excel file upload
  • RESTful JSON API
  • Browser extension that allows you to clip tables from webpages
  • Other (e.g. integration with your data source)

Please answer our 1-minute survey

Thank you so much for your support, we appreciate and rely upon your feedback!

New Features - Coming Soon!

23rd December 2012

We’ve been thinking about how we can make Dataseed work well for even more types of statistical data. We’d love to hear your feedback on these features - are they useful to you, how would you use them, can you think of a better way? Please leave a comment below…

Customising chart types

Fig 1. shows the “Vehicle Type” dimension displayed as both a bar chart and a pie chart. Both charts can be displayed together or, more likely, you can show one at a time and use the toggle tool to switch between them. This feature is already available on the current demo visualisation - use the settings on the “Causes of death” dimension to choose between bubble or bar chart.

Fig 1.

Hierarchical dimensions

Fig 2. uses a bubble tree to explore hierarchical dimensions. The first level of the hierarchy is the central bubble and the second level is displayed around it. If you select a bubble in the second level it will replace the central bubble and the third level will be displayed around it.

Fig 2.

Fig 3. shows an alternative way of exploring hierarchical dimensions. It uses a breadcrumb as a way to navigate back up levels in the hierarchy, whereas double-clicking on a bubble will drill down one level further into the data.

The first chart in Fig 3. is the first level (year). The second chart is the second level (month) showing a cut on month 2.

This approach is compatible with multiple chart types.

Fig 3.

Custom layouts

We have been experimenting with custom layouts of charts to form a dashboard. We would like the user to be able to drag and drop charts into the main content area from the set of possible dimensions. The charts would restack according to a predetermined grid.

Fig 4.