From: Maxious
Date: Mon, 15 Apr 2013 04:13:09 +0000
Subject: edits
X-Git-Url: https://maxious.lambdacomplex.org/git/?p=tools.git&a=commitdiff&h=a7e436b84e0d1817dbab10d3c9b53704fe40229c
---
edits
---

--- a/index.md
+++ b/index.md
@@ -19,6 +19,7 @@
 # General References {#general-data-hacking-and-programming-references}
+
 ## The basics of being a data scientist
@@ -65,11 +66,8 @@
 [![](img/Screenshot-at-2012-04-29-172132-300x235.png "Git Screenshot")](http://progit.org/book/)
-[tutorials on git](http://progit.org/book/) and
-[GUIs to help you](http://code.google.com/p/tortoisegit/)
-
-[manual for Subversion](http://svnbook.red-bean.com/)
-and a [similar GUI for Subversion](http://tortoisesvn.net/)
+There are [tutorials on git](http://progit.org/book/) and [GUIs to help you](http://code.google.com/p/tortoisegit/)
+There is also a [manual for Subversion](http://svnbook.red-bean.com/) and a [similar GUI for Subversion](http://tortoisesvn.net/)
 ### Task Tracking
@@ -140,19 +138,15 @@
 You can find some data visualisation tools below:
-[http://www.visualisingdata.com/index.php/2011/07/part-6-the-essential-collection-of-visualisation-resources/](http://www.visualisingdata.com/index.php/2011/07/part-6-the-essential-collection-of-visualisation-resources/)
-
+[Essential Collection](http://www.visualisingdata.com/index.php/2011/07/part-6-the-essential-collection-of-visualisation-resources/)
+ [Drawing By Numbers Tools and Resources](http://drawingbynumbers.org/toolsandresources)
+ - http://selection.datavisualization.ch/ data viz tools catalog
 Also check out [http://thejit.org](http://thejit.org/) & [http://www.senchalabs.org/philogl/](http://www.senchalabs.org/philogl/) (contributed by Matt Adcock)
-Have to use visual art concepts, good color schemes http://www.r-bloggers.com/the-paul-tol-21-color-salute/
-
-
- - https://graphics.stanford.edu/wikis/cs448b-12-fall/ data viz theory
- - http://drawingbynumbers.org/toolsandresources
- - http://selection.datavisualization.ch/ data viz tools catalog
-
-examples - http://sunfoundation.tumblr.com/
-### The Open Budget
+A good infographic should use visual art concepts and [good color schemes](http://www.r-bloggers.com/the-paul-tol-21-color-salute/)
+For more information on the theory of data visualisation check out the [Stanford CS448B notes](https://graphics.stanford.edu/wikis/cs448b-12-fall/)
+
+Some examples of data visualisation can be seen on [the Sunlight Foundation tumblr](http://sunfoundation.tumblr.com/) or at the GovHack alumnus [The Open Budget](http://www.theopenbudget.org)
 ## Web Applications
@@ -221,6 +215,8 @@
 # Geographical Data Tools {#geographical-data-tools}
 Check out the [GeoRabble Boundary Mapper's Cookbook](http://georabble.org/2012/05/31/the-boundary-mappers-cookbook/) to see how you can tie all these things together!
+
+GeoDjango TileMill
 ## Key datasets
 base layers like agri http://agri.openstreetmap.org/, http://irs.gis-lab.info/ wms or http://www.gdal.org/frmt_wms_openstreetmap_tms.xml
@@ -237,10 +233,9 @@
 or locally using GDAL (better for many megabyte datasets)
 ### Geocoding
-cloudmade, google (but you must display on a Google Map).
-
-Easiest way to do is with a Google Spreadsheet/Fusion Table http://williamparry.blogspot.com.au/2011/04/putting-data-into-google-fusion-tables.htm http://support.google.com/fusiontables/answer/1012281?hl=en&ref_topic=2592806
-
+Google Maps APIs allow you to convert an address to map co-ordinates (geocoding) but you must display on a Google Map. The easiest way to do this is with a Google Spreadsheet/Fusion Table http://williamparry.blogspot.com.au/2011/04/putting-data-into-google-fusion-tables.htm http://support.google.com/fusiontables/answer/1012281?hl=en&ref_topic=2592806
+
+If you need geocoding for more than display (working out the distance between points etc) or you don't want to use Google Maps, Cloudmade offers free OpenStreetMap based geocoding http://developers.cloudmade.com/projects/show/geocoding-http-api
 ## Analysis
@@ -380,8 +375,7 @@
 # Graph (relationships and networks) Data Tools {#graph-relationships-and-networks-data-tools}
-
-Why? Find communities, hubs, connections between (the X degrees of separation)
+Graph data can be very valuable for finding communities, hubs and connections between entities (the 6 degrees of separation). This is done using the techniques of Social Network Analysis.
  - http://www.slideshare.net/OReillyStrata/visualizing-networks-beyond-the-hairball
  - http://blog.sciencenet.cn/blog-554179-622011.html SNA tools catalog
  - https://github.com/jacomyal/osdc2012-sigmajs-demo sigmajs filtering/searching
@@ -396,10 +390,8 @@
 ### Graph Databases
-[![](img/webadmin-data-300x127.png "Neo4\. web admin screenshot")](img/webadmin-data.png)Help understand relationships - how is X connected to Y and via what other entities they both are connected to. Imports and exports
-
- - http://www.slideshare.net/maxdemarzi/etl-into-neo4j
- http://blog.neo4j.org/2013/03/importing-data-into-neo4j-spreadsheet.html
+[![](img/webadmin-data-300x127.png "Neo4\. web admin screenshot")](img/webadmin-data.png)Help understand relationships - how is X connected to Y and via what other entities they both are connected to.
+Imports and exports can be done by [writing a Java program](http://www.slideshare.net/maxdemarzi/etl-into-neo4j) or [using a spreadsheet](http://blog.neo4j.org/2013/03/importing-data-into-neo4j-spreadsheet.html)
 There are other graph databases worth considering like [OrientDB](http://www.orientdb.org/) or [Titan](http://thinkaurelius.github.com/titan/)
 Major graph databases like these can be accessed using a common syntax called Gremlin or by writing a simple Java/Python/Ruby application. Queries can be tested in the built in data browser.
@@ -412,7 +404,7 @@
 NetworkX is a social network analysis library for python. Many advanced analyses built in like finding communities within a graph. Also good for converting data into graphs.
-tutorial/intro http://www.cl.cam.ac.uk/~cm542/teaching/2011/stna-pdfs/stna-lecture11.pdf
+See this [introduction to Social Network Analysis with NetworkX](http://www.cl.cam.ac.uk/~cm542/teaching/2011/stna-pdfs/stna-lecture11.pdf)
 ## Visualisation
@@ -434,5 +426,5 @@
 ### [sigma.js](http://sigmajs.org/)
-[![](img/How-to-participate-in-GovHack_html_m6006eaf3-300x130.jpg "Sigma.js Screenshot")](img/How-to-participate-in-GovHack_html_m6006eaf3.jpg)Javascript graph viewer, can use GEXF files exported from tools like neo4j, gephi and NetworkX.
-
+[![](img/How-to-participate-in-GovHack_html_m6006eaf3-300x130.jpg "Sigma.js Screenshot")](img/How-to-participate-in-GovHack_html_m6006eaf3.jpg)JavaScript graph viewer for displaying graphs on webpages without any other plugins/applications required. It can use GEXF files exported from tools like neo4j, gephi or NetworkX.
+
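
Note on the graph hunks: the new sentence in the Graph Data Tools section talks about finding hubs and connections between entities (the "degrees of separation"). As a rough illustration only, the sketch below shows what those two ideas amount to in plain Python. The toy network, `build_adjacency`, `degrees_of_separation` and `hubs` are all hypothetical names made up for this note and are not part of index.md or of any library; a library such as NetworkX offers equivalents (for example `networkx.shortest_path_length`) and would normally be the better choice.

```python
from collections import deque

# Hypothetical toy network (an assumption for illustration): who knows whom.
EDGES = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "dave"),
    ("bob", "erin"),
    ("frank", "grace"),
]


def build_adjacency(edges):
    """Turn a list of undirected edges into a dict of node -> set of neighbours."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def degrees_of_separation(adjacency, start, goal):
    """Breadth-first search: number of hops from start to goal, or None if unconnected."""
    if start not in adjacency or goal not in adjacency:
        return None
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def hubs(adjacency, top=1):
    """Return the `top` nodes with the most direct connections (ties broken by name)."""
    return sorted(adjacency, key=lambda n: (-len(adjacency[n]), n))[:top]


if __name__ == "__main__":
    graph = build_adjacency(EDGES)
    print(degrees_of_separation(graph, "alice", "dave"))
    print(degrees_of_separation(graph, "alice", "grace"))
    print(hubs(graph))
```

Breadth-first search is used because it visits nodes in order of distance, so the first time the goal is reached the hop count is the shortest one. Counting direct neighbours is only the simplest possible notion of a "hub"; Social Network Analysis tools offer more refined centrality measures.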