From: Maxious
Date: Wed, 06 Mar 2013 04:05:46 +0000
Subject: edits
X-Git-Url: https://maxious.lambdacomplex.org/git/?p=tools.git&a=commitdiff&h=720bff269cfdc477e0af33c81333731ab7fa6bcb

---
edits
---

--- a/index.md
+++ b/index.md
@@ -1,52 +1,20 @@
-- tools.disclo.gs - how to use data
- - developer tools inc. linked data
- - gephi -> neo4j
- - neo4j lets you build on, do massive queries of who is friends with who
-
-
- - postgis/quantum gis
- - (google earth is alright but many limitations) NASA World Wind?
- - ABS statistical areas
-
- - scraperwiki with new pytemplate libraries
- - makes an API for your data to get in sqlite/json/csv
-
- - govhack library
- - https://graphics.stanford.edu/wikis/cs448b-12-fall/ data viz theory
- - http://drawingbynumbers.org/toolsandresources
- - http://wmbriggs.com/blog/?p=6465
- - http://ofps.oreilly.com/titles/9781449339739/k_00000002.html list of d3 alternatives
- - http://craigkerstiens.com/2012/10/01/understanding-postgres-performance/
- - https://github.com/clips/pattern for easy NLP/network analysis/data mining
- - https://github.com/theodi/open-data-tech-review/wiki othr cleanup/linked data toola
- - http://selection.datavisualization.ch/ data viz tools catalog
- - manipulating data - grep/find replace/sed/regex
-
- - data viz
- - http://k2company.com/blog/2012/09/06/toolbox-for-learning-machine-learning-and-data-science/
- - http://williamparry.blogspot.com.au/2011/04/putting-data-into-google-fusion-tables.html google fusion tutorial
-
- - http://www.slideshare.net/maxdemarzi/etl-into-neo4j
-
-
- - http://dydra.com/
- - http://selection.datavisualization.ch/ data viz tools list
- - http://nodexl.codeplex.com/ network graphs for excel
- - http://sunfoundation.tumblr.com/
- - analysing - linked data tools
- - http://govcampau.wikispaces.com/useful+tools
- - http://linkeddata.org/home
-
-
 Welcome to the GovHack toolkit. This page provides all the information you need to prepare hackfest entries.
 These tools can be used to make entries like: mobile apps, web apps, data visualisations/infographics
-
-# General Data Hacking and Programming References {#general-data-hacking-and-programming-references}
+# How to register and submit your entry
+- how to use website "Hacker Space" to register and find teams etc.
+
+- screencast tools - preparing your submission
+ video tools, youtube video editor/slideshow, FOSS video editing tools
+
+- how to submit code
+
+# General References {#general-data-hacking-and-programming-references}
 ## The basics of being a data scientist
-* Have a hypothesis � even if you’re making a tool/api that helps people with their questions too, remember what the objective of that is.
+* Have a hypothesis - even if you're making a tool/api that helps people with their questions too, remember what the objective of that is.
 * Find the people and tools you need to prove/show/find. This rest of this page will help with the latter.
-* Analyse and present results � were they what you expected? Do they help explain to others what you have found out? Can present as a interactive data visualisation or a web/mobile application or just a infographic/motion graphics video that tells a story.
+* Analyse and present results - were they what you expected? Do they help explain to others what you have found out? Can present as a interactive data visualisation or a web/mobile application or just a infographic/motion graphics video that tells a story.
 Please note, there are a combination of Analysis and Visualisation tools in each of the major categories below.
@@ -54,7 +22,7 @@
 Illustration from Data Journalism Handbook, CC BY-SA 3.0
-The best high level reference is the �Understanding Data� and �Delivering Data� chapters of the Data Journalism Handbook which is available online for free at
+The best high level reference is the 'Understanding Data' and 'Delivering Data' chapters of the Data Journalism Handbook which is available online for free at
 [datajournalismhandbook.org](http://datajournalismhandbook.org/)
@@ -67,7 +35,6 @@
 [http://flowingdata.com/2012/04/27/data-and-visualization-blogs-worth-following/](http://flowingdata.com/2012/04/27/data-and-visualization-blogs-worth-following/)
-
 **Statistics**
 [http://greenteapress.com/thinkstats/html/index.html](http://greenteapress.com/thinkstats/html/index.html)
@@ -78,9 +45,9 @@
 Basic tutorials for a variety of languages are available for free online or you can learn
-interactively with websites like [http://www.codecademy.com/](http://www.codecademy.com/#!/exercises/0\. for JavaScript or [http://www.learnpython.org/ ](http://www.learnpython.org/)or [http://tryruby.org](http://tryruby.org/)
-
-[https://developer.mozilla.org/en/JavaScript](https://developer.mozilla.org/en/JavaScript) –\. especially for web applications and visualisations, you’ll need a basic understanding of JS. Common libraries like prototype or jQuery can help
+interactively with websites like [http://www.codecademy.com/](http://www.codecademy.com/#!/exercises/0). for JavaScript or [http://www.learnpython.org/ ](http://www.learnpython.org/)or [http://tryruby.org](http://tryruby.org/)
+
+[https://developer.mozilla.org/en/JavaScript](https://developer.mozilla.org/en/JavaScript) - especially for web applications and visualisations, you'll need a basic understanding of JS. Common libraries like prototype or jQuery can help
 **Accessibility/User Experience**
@@ -91,7 +58,6 @@
 ## Definitions
 - definitions, open licence reuse permissive hacker hack data journalism data bis UCX etc.
-
 ## key datasets
 - key datasets, directory.gov.au gazetter/AEC electorates/suburbs/postcodes/LGAs
@@ -142,11 +108,15 @@
 server admin / technical tools
 many projects will require some kind of internet presence, webpage etc.
 - css framework like bootstrap or zurb foundation
- video tools, youtube video editor/slideshow, FOSS video editing tools
+- css gauges http://www.larentis.eu/donuts/
+- bootstrap themes, web fonts, css sprites, icon fonts
+ - http://designmodo.com/flat-free/ http://designmodo.github.com/Flat-UI/
+ - http://ubuntu-tutorials.com/2008/11/11/relaying-postfix-smtp-via-smtpgmailcom/
 - amon
-### Source Control –\. Git / Subversion
+### Source Control
+ Git / Subversion
 [![](http://www.govhack.org/wp-content/uploads/Screenshot-at-2012-04-29-172132-300x235.png "Git Screenshot")](http://progit.org/book/)
@@ -188,7 +158,7 @@
 # API Development {#api-development}
-So an API isn’t just an XML file ![;)](http://www.govhack.org/wp-includes/images/smilies/icon_wink.gif)
+So an API isn't just an XML file ![;)](http://www.govhack.org/wp-includes/images/smilies/icon_wink.gif)
 A good web based data API:
@@ -222,7 +192,7 @@
 Most of the categories to follow have visualisation tools specific to their purpose.
-You can find some data visualisation “essential”\. tools below:
+You can find some data visualisation tools below:
 [http://www.visualisingdata.com/index.php/2011/07/part-6-the-essential-collection-of-visualisation-resources/](http://www.visualisingdata.com/index.php/2011/07/part-6-the-essential-collection-of-visualisation-resources/)
@@ -230,7 +200,17 @@
 Have to use visual art concepts, good color schemes http://www.r-bloggers.com/the-paul-tol-21-color-salute/
+
+ - https://graphics.stanford.edu/wikis/cs448b-12-fall/ data viz theory
+ - http://drawingbynumbers.org/toolsandresources
+
+examples - http://sunfoundation.tumblr.com/
+tools - http://selection.datavisualization.ch/ data viz tools catalog
+
+
+
 # Mobile
+bom water, nz gov budget html5 jquery mobile like directory.gov.au
 - android datviz - http://code.google.com/p/afreechart/ http://code.google.com/p/snowdon/ http://code.google.com/p/chartdroid/ http://androidplot.com/ http://code.google.com/p/achartengine/
@@ -239,7 +219,7 @@
 # Geographical Data Tools {#geographical-data-tools}
-Check out the[ GeoRabble Boundary Mapper’s Cookbook](http://georabble.org/2012/05/31/the-boundary-mappers-cookbook/) to see how you can tie all these things together!
+Check out the[ GeoRabble Boundary Mapper's Cookbook](http://georabble.org/2012/05/31/the-boundary-mappers-cookbook/) to see how you can tie all these things together!
 ## Key datasets
 - base layers like agri http://agri.openstreetmap.org/, http://irs.gis-lab.info/ wms or http://www.gdal.org/frmt_wms_openstreetmap_tms.xml
@@ -256,6 +236,8 @@
 ### geocoding
 cloudmade, google (but you must display on a Google Map).
+Easiest way to do is with a Google Spreadsheet/Fusion Table http://williamparry.blogspot.com.au/2011/04/putting-data-into-google-fusion-tables.htm
+
 ## Analysis
@@ -304,20 +286,22 @@
 [![](http://www.govhack.org/wp-content/uploads/google_refine_interface.png "google_refine_interface")](http://www.govhack.org/wp-content/uploads/google_refine_interface.png)Clean up duplicate or inconsistent data entries.
+Can also use general purpose tools; grep/awk/sed
+regex http://www.regexper.com/ http://www.debuggex.com/?re=&str=
+
 ## Analysis
 ### Excel / Calc
 Great basic analysis and viewing. Older versions can be limited to 6500\. or so rows. Eg [http://www.tcij.org/training-material/car/data-mining/3474](http://www.tcij.org/training-material/car/data-mining/3474)
-
 ### PostgreSQL/MySQL
 [![](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_209ee972.jpg "SQL screenshot")](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_209ee972.jpg)Next step up, large datasets can be manipulated/extracted efficiently for example [http://www.postgresql.org/docs/8.4/static/tutorial-window.html](http://www.postgresql.org/docs/8.4/static/tutorial-window.html) , no built-in data visualisation though.
 ### [Miso Dataset](http://misoproject.com/dataset/)
-[![](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_m53b7ee38-293x300.png "miso screenshot")](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_m53b7ee38.png)Javascript data transformation library � especially good if you want to use the output for javascript interactive visualisations because the transformations can be done on-the-fly by users.
+[![](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_m53b7ee38-293x300.png "miso screenshot")](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_m53b7ee38.png)Javascript data transformation library - especially good if you want to use the output for javascript interactive visualisations because the transformations can be done on-the-fly by users.
 ### R Statistical Language
@@ -356,13 +340,14 @@
 # Unstructured (text documents, webpages, metadata, tweets etc) Data Tools
-Scraperwiki
+## wranglying
+Scraperwiki pytemplate scrapy
+
 Overviewer/ Jigsaw http://www.cc.gatech.edu/gvu/ii/jigsaw/
- - opennlp/nltk, lucene/solr
- - http://www.r-bloggers.com/simple-text-mining-with-r/
-
-R
+ - opennlp/nltk / https://github.com/clips/pattern
+ - lucene/solr
+ - http://www.r-bloggers.com/simple-text-mining-with-r/
 - http://blog.josephwilk.net/ruby/latent-semantic-analysis-in-ruby.html similar terms usually found together
 # Graph (relationships and networks) Data Tools {#graph-relationships-and-networks-data-tools}
@@ -379,11 +364,17 @@
 - http://is-r.tumblr.com/post/38240018815/making-prettier-network-graphs-with-sna-and-igraph
-### Neo4j
-
-[![](http://www.govhack.org/wp-content/uploads/webadmin-data-300x127.png "Neo4\. web admin screenshot")](http://www.govhack.org/wp-content/uploads/webadmin-data.png)Help understand relationships � how is X connected to Y and via what other entities they both are connected to. Imports and exports
-
-can be done using a preexisting tool like Gremlin or by writing a simple Java/Python/Ruby application. Queries can be tested in the built in data browser.
+### Neo4j / OrientDB
+
+[![](http://www.govhack.org/wp-content/uploads/webadmin-data-300x127.png "Neo4\. web admin screenshot")](http://www.govhack.org/wp-content/uploads/webadmin-data.png)Help understand relationships - how is X connected to Y and via what other entities they both are connected to. Imports and exports
+
+ - http://www.slideshare.net/maxdemarzi/etl-into-neo4j
+
+http://www.orientdb.org/
+
+Both can be accessed using a preexisting tool like Gremlin or by writing a simple Java/Python/Ruby application. Queries can be tested in the built in data browser.
+
+
 ### [NetworkX](http://networkx.lanl.gov/index.html)
@@ -393,13 +384,16 @@
 ## Visualisation
-###
+### Tree/Hierarchy visualisation
 - don't use network viz if what you actually have is a tree/hierarchy with no interconnections http://www.randelshofer.ch/treeviz/ http://thejit.org/demos/ http://mbostock.github.com/protovis/ex/treemap.html http://blog.pixelingene.com/2011/07/building-a-tree-diagram-in-d3-js/d3 for Trees and Hierarchies http://mbostock.github.com/d3/ex/pack.html http://mbostock.github.com/d3/ex/tree.html
+### NodeXL for Microsoft Excel
+ - http://nodexl.codeplex.com/ network graphs for excel
+
 ### [Graphviz](http://www.graphviz.org/)
-[![](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_7579906d-300x184.png "Graphviz Screenshot")](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_7579906d.png)Classic directed graph visualisation tool, can even [generate images online without installing](http://ashitani.jp/gv/) or use in webpages with [javascript port of software](http://code.google.com/p/canviz/). File format [�dot� very easy to learn](http://en.wikipedia.org/wiki/DOT_language)
+[![](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_7579906d-300x184.png "Graphviz Screenshot")](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_7579906d.png)Classic directed graph visualisation tool, can even [generate images online without installing](http://ashitani.jp/gv/) or use in webpages with [javascript port of software](http://code.google.com/p/canviz/). File format ["dot" very easy to learn](http://en.wikipedia.org/wiki/DOT_language)
 ### Gephi