New features just released
If you’ve already got your beta invite then you’ll be able to check out the latest features:
- Filter element - combine facets to filter down the data in your visualisation.
- Average aggregation - you can now average values as well as sum them.
- Searchable table charts added - customise -> elements -> chart type.
- Integration with Google Docs, Dropbox and GitHub, or link directly to a file.
- Proper support for dates so it’s easy to create time-series line charts.
Coming up: creating your own maps, bucketing for dates/numbers, and exporting your visualisations as images.
UI Designer / Developer Wanted
Do you love data visualisation as much as we do? Would you like to spend your days crafting the future of Dataseed?
We know we’re asking a lot, so if you come from a design background and you’re in the process of crossing over and becoming the person described above - please get in touch.
You will have a lot of creative freedom and will be expected to not only meet the needs of our users, but to delight them. We’re a small friendly team based in London, UK. Freelance positions considered initially, but with a view to developing a long-term relationship.
Please apply directly to: firstname.lastname@example.org
#dataviz, #d3js, #design, #ui, #ux
Preparing Data 101: Un-pivoting / de-normalizing
In order for Dataseed to be able to understand your data, it must be in a standard format where each row is a single record.
Consider the following dataset showing life expectancy per year, per country. The data is in a “pivot table” format.
We need to get the data into this format:
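The tables from the original post are not reproduced here. As a rough sketch of the un-pivoting step itself, the Python below turns a small pivot-style table (one column per year) into one record per country and year. The country names, the figures and the `unpivot` helper are all made up for illustration, and the code only uses the standard library.

```python
import csv
import io


def unpivot(rows, id_column, var_name, value_name):
    """Turn one wide row per id into one long row per (id, variable) pair."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the id column is repeated on every output row instead
            if value == "":
                continue  # skip empty cells rather than emit blank records
            long_rows.append(
                {id_column: row[id_column], var_name: column, value_name: value}
            )
    return long_rows


# Made-up numbers for illustration only; not real life-expectancy data.
WIDE_CSV = """country,2009,2010,2011
Atlantis,70.1,70.4,70.9
Erewhon,65.0,,65.8
"""

wide_rows = list(csv.DictReader(io.StringIO(WIDE_CSV)))
long_rows = unpivot(wide_rows, "country", "year", "life_expectancy")

for record in long_rows:
    print(record)
```

Each output record now holds a single observation (country, year, value), which is the "one row per record" shape described above. Skipping empty cells is a design choice in this sketch; you may prefer to keep them as explicit missing values.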
Shortlisted for the Information Is Beautiful Awards
We’re delighted and very honoured to be shortlisted in the Information Is Beautiful Awards this year. There are some great entries, so please go and check them out, and we certainly wouldn’t mind if you chose to vote for us :)
Merging spreadsheets with OpenRefine (Google Refine)
Let’s face it - we’re not quite in the golden age of linked data yet.
While we’re here, let’s take a pragmatic look at how you can combine data from two spreadsheets, using OpenRefine. We’ll walk through a relatively simple real-life example.
We recently wanted to import some data from the UK 2011 Census into Dataseed, and to use a map to visualise the geographic dimension. Unfortunately, the census data used only the new geographic codes, while our boundary data used the old ones. Luckily, we had a lookup table that simply mapped the old codes to the new ones.
So, the challenge then became to take the spreadsheet containing our boundary data, and change the old geographic codes to the new ones.
These are the steps we took:
1. Create a project in OpenRefine for the boundary data spreadsheet.
2. Create a project in OpenRefine for the table mapping old codes to new codes.
3. In the boundary data project, use the “Add column based on this column” feature.
4. Use the following GREL expression to pull in the data from NEW_COLUMN in PROJECT_NAME, linking on the COMMON_COLUMN:
Et voilà, your new column has been created.
You’ll probably want to use the facet feature to check for rows that didn’t match. You might also want to add some exceptions or error handling to the GREL expression in step 4, but we’ll keep this simple for now.
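The GREL expression itself is not reproduced in this text. Cross-project lookups in OpenRefine are normally written with GREL's `cross` function, along the lines of `cell.cross("PROJECT_NAME", "COMMON_COLUMN")[0].cells["NEW_COLUMN"].value`, but treat that as our assumption rather than the exact expression used. To make the logic of the merge concrete, here is a minimal Python sketch of the same idea, including the check for rows that didn't match. The codes, place names and the `add_column_from_lookup` helper are hypothetical.

```python
def add_column_from_lookup(rows, lookup_rows, common_column, new_column):
    """Copy new_column from lookup_rows onto rows, matching on common_column."""
    # Index the lookup table by the shared key so each match is a dict lookup.
    index = {r[common_column]: r[new_column] for r in lookup_rows}

    merged, unmatched = [], []
    for row in rows:
        key = row[common_column]
        new_row = dict(row)  # copy, so the input rows are left untouched
        new_row[new_column] = index.get(key)  # None marks "no match found"
        if key not in index:
            unmatched.append(key)
        merged.append(new_row)
    return merged, unmatched


# Hypothetical boundary rows keyed by old codes (not real census codes).
boundaries = [
    {"old_code": "A01", "name": "Northtown"},
    {"old_code": "A02", "name": "Southville"},
    {"old_code": "A99", "name": "Lost Borough"},
]

# Hypothetical lookup table mapping old codes to new codes.
code_lookup = [
    {"old_code": "A01", "new_code": "E0001"},
    {"old_code": "A02", "new_code": "E0002"},
]

merged, unmatched = add_column_from_lookup(
    boundaries, code_lookup, "old_code", "new_code"
)

print(merged)
print("Rows with no match:", unmatched)
```

Returning the unmatched keys separately plays the same role as faceting on blank cells in OpenRefine: it gives you a short list of rows to fix by hand or handle as exceptions.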
Most data isn’t “big,” and businesses are wasting money pretending it is
I couldn’t agree with this more - “big data” has become a synonym for data analysis.
Our philosophy with Dataseed has always been to focus on quality of data rather than quantity. Christopher points to a paper stating that most of Facebook’s data problems are in the megabyte to gigabyte range. I was surprised to read that about Facebook, but I’m sure it’s true for the vast majority of businesses.
With Dataseed, we’re focusing on datasets between 1 MB and 1 GB in size. We believe most of the complexity comes from high dimensionality rather than size. Our datacube visualisations allow you to explore datasets with multiple dimensions using clickable interactive charts.
Your responses to our data import survey
Thanks so much to everyone who answered the survey on data importing. The consensus was overwhelmingly in favour of CSV / Excel spreadsheet imports - so that’s what we’re building!
It’s not Dataseed, but here’s a screenshot of the aggregated responses. [roll on Survey Builder and Dataseed integration!]
How would you prefer to import your data?
We’ve been beavering away, getting ready for beta. We want to make sure we get it right, so we’re asking you one important question:
Q: How would you prefer to get your data into Dataseed?
- Google Spreadsheets
- CSV / Excel file upload
- Restful JSON API
- Browser extension that allows you to clip tables from webpages
- Other (e.g. integration with your data source)
Thank you so much for your support. We appreciate and rely upon your feedback!
New Features - Coming Soon!
We’ve been thinking about how we can make Dataseed work well for even more types of statistical data. We’d love to hear your feedback on these features - are they useful to you, how would you use them, can you think of a better way? Please leave a comment below…
Customising chart types
Fig 1. shows the “Vehicle Type” dimension displayed as both a bar chart and a pie chart. Both charts can be displayed at once or, more likely, you can show one at a time and use the toggle tool to switch between them. This feature is already available on the current demo visualisation - use the settings on the “Causes of death” dimension to choose between a bubble chart and a bar chart.
Fig 2. uses a bubble tree to explore hierarchical dimensions. The first level of the hierarchy is the central bubble and the second level is displayed around it. If you select a bubble in the second level, it replaces the central bubble and the third level is displayed around it.
Fig 3. shows an alternative way of exploring hierarchical dimensions. It uses a breadcrumb to navigate back up the levels of the hierarchy, whereas double-clicking on a bubble drills down one level further into the data.
The first chart in Fig 3. is the first level (year). The second chart is the second level (month) showing a cut on month 2.
This approach is compatible with multiple chart types.
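To make the breadcrumb idea a little more concrete, here is a small Python sketch of the state such a chart might keep: an ordered list of levels, plus a breadcrumb of the values drilled through so far. The `DrillDown` class, its method names and the sample records are our own illustration, not Dataseed's implementation.

```python
from collections import defaultdict


class DrillDown:
    """Tracks a breadcrumb path through a hierarchical dimension."""

    def __init__(self, records, levels, measure):
        self.records = records
        self.levels = levels    # e.g. ["year", "month"], top level first
        self.measure = measure  # name of the numeric field to sum
        self.path = []          # breadcrumb: one selected value per level passed

    def current_level(self):
        return self.levels[len(self.path)]

    def drill_down(self, value):
        """Double-click on a value: move one level further into the data."""
        if len(self.path) >= len(self.levels) - 1:
            raise ValueError("already at the deepest level")
        self.path.append(value)

    def go_up_to(self, depth):
        """Click a breadcrumb: keep only the first `depth` selections."""
        self.path = self.path[:depth]

    def chart_data(self):
        """Sum the measure per value of the current level, within the path."""
        totals = defaultdict(int)
        for rec in self.records:
            if all(rec[self.levels[i]] == v for i, v in enumerate(self.path)):
                totals[rec[self.current_level()]] += rec[self.measure]
        return dict(totals)


# Made-up records for illustration only.
records = [
    {"year": 2011, "month": 1, "count": 5},
    {"year": 2011, "month": 2, "count": 7},
    {"year": 2011, "month": 2, "count": 1},
    {"year": 2012, "month": 1, "count": 4},
]

view = DrillDown(records, ["year", "month"], "count")
by_year = view.chart_data()    # first level: totals per year

view.drill_down(2011)
by_month = view.chart_data()   # second level: totals per month within 2011

view.go_up_to(0)               # breadcrumb click back to the top level
back_at_top = view.chart_data()
```

Because `chart_data` returns plain value-to-total pairs, the same state could feed a bar, pie or bubble chart, which is one way to read the point about compatibility with multiple chart types.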
We have been experimenting with custom layouts of charts to form a dashboard. We would like the user to be able to drag and drop charts into the main content area from the set of possible dimensions. The charts would restack according to a predetermined grid.