From: maxious
Date: Sun, 03 Mar 2013 10:00:40 +0000
Subject: editz
X-Git-Url: https://maxious.lambdacomplex.org/git/?p=tools.git&a=commitdiff&h=c6bdb8ff5d9c8c837323bd404cbaf78ef6843963
---
editz
---

--- a/index.md
+++ b/index.md
@@ -1,59 +1,3 @@
-geo
-
-R
-
-- tools.disclo.gs - how to use data
- - developer tools inc. linked data
- - gephi -> neo4j
- - neo4j lets you build on, do massive queries of who is friends with who
- - don't use network viz if what you actually have is a tree/hierarchy with no interconnections http://www.randelshofer.ch/treeviz/ http://thejit.org/demos/ http://mbostock.github.com/protovis/ex/treemap.html http://blog.pixelingene.com/2011/07/building-a-tree-diagram-in-d3-js/
- - http://mbostock.github.com/d3/ex/pack.html http://mbostock.github.com/d3/ex/tree.html
- - postgis/quantum gis
- - (google earth is alright but many limitations) NASA World Wind?
- - ABS statistical areas
- - base layers like agri, http://irs.gis-lab.info/ wms or http://www.gdal.org/frmt_wms_openstreetmap_tms.xml
- - can do nearest/isin/union queries - personal geocoder
- - cloudmade geocoder, google maps my maps
-
- - scraperwiki with new pytemplate libraries
- - makes an API for your data to get in sqlite/json/csv
-
- - govhack library
- - https://graphics.stanford.edu/wikis/cs448b-12-fall/ data viz theory
- - http://drawingbynumbers.org/toolsandresources
- - http://wmbriggs.com/blog/?p=6465
- - http://ofps.oreilly.com/titles/9781449339739/k_00000002.html list of d3 alternatives
- - http://craigkerstiens.com/2012/10/01/understanding-postgres-performance/
- - https://github.com/clips/pattern for easy NLP/network analysis/data mining
- - https://github.com/theodi/open-data-tech-review/wiki othr cleanup/linked data toola
- - http://selection.datavisualization.ch/ data viz tools catalog
- - manipulating data - grep/find replace/sed/regex
- - d3 tools and tutorial http://enjalot.com/ http://news.ycombinator.com/item?id=4608440
- - Why d3 is the way it is and how to make charts http://bost.ocks.org/mike/chart/
- - how to make an xkcd chart http://bl.ocks.org/3914862
-
- - data viz
- - http://k2company.com/blog/2012/09/06/toolbox-for-learning-machine-learning-and-data-science/
- - http://williamparry.blogspot.com.au/2011/04/putting-data-into-google-fusion-tables.html google fusion tutorial
- - andrewharvey4.wordpress.com postgis/asgs tutorial
- - http://www.slideshare.net/maxdemarzi/etl-into-neo4j
-
- - http://www.twotorials.com/ for R
- - http://www.r-bloggers.com/gradient-word-clouds/ http://www.rstudio.com/shiny/ http://blog.ouseful.info/2012/11/28/quick-shiny-demo-exploring-nhs-winter-sit-rep-data/ https://github.com/timelyportfolio/shiny-d3-plot https://github.com/trestletech/shiny-sandbox/tree/master/grn
- - http://is-r.tumblr.com/post/38240018815/making-prettier-network-graphs-with-sna-and-igraph
- - http://www.r-bloggers.com/video-simpler-tricks-and-tools-help-debugging-git-latex-and-workflow-with-r-by-prof-rob-hyndman/
- - http://yihui.name/knitr/ makes reports including google widgets/charts/maps via http://www.r-bloggers.com/googlevis-0-3-2-is-released-better-integration-with-knitr/
- - http://chartsnthings.tumblr.com/post/36978271916/r-tutorial-simple-charts http://flowingdata.com/2012/12/17/getting-started-with-charts-in-r/
-
- - http://dydra.com/
- - http://selection.datavisualization.ch/ data viz tools list
- - http://nodexl.codeplex.com/ network graphs for excel
- - http://sunfoundation.tumblr.com/
- - analysing - linked data tools
- - http://govcampau.wikispaces.com/useful+tools
- - http://linkeddata.org/home
-
-
 Welcome to the GovHack toolkit. This page provides all the information you need to prepare hackfest entries.

 These tools can be used to make entries like: mobile apps, web apps, data visualisations/infographics
@@ -61,9 +5,9 @@
 # General Data Hacking and Programming References {#general-data-hacking-and-programming-references}

 ## The basics of being a data scientist
-* Have a hypothesis � even if you’re making a tool/api that helps people with their questions too, remember what the objective of that is.
+* Have a hypothesis - even if you're making a tool/api that helps people with their questions too, remember what the objective of that is.
 * Find the people and tools you need to prove/show/find. This rest of this page will help with the latter.
-* Analyse and present results � were they what you expected? Do they help explain to others what you have found out? Can present as a interactive data visualisation or a web/mobile application or just a infographic/motion graphics video that tells a story.
+* Analyse and present results - were they what you expected? Do they help explain to others what you have found out? Can present as a interactive data visualisation or a web/mobile application or just a infographic/motion graphics video that tells a story.

 Please note, there are a combination of Analysis and Visualisation tools in each of the major categories below.
@@ -71,7 +15,7 @@
 Illustration from Data Journalism Handbook, CC BY-SA 3.0

-The best high level reference is the �Understanding Data� and �Delivering Data� chapters of the Data Journalism Handbook which is available online for free at
+The best high level reference is the 'Understanding Data' and 'Delivering Data' chapters of the Data Journalism Handbook which is available online for free at

 [datajournalismhandbook.org](http://datajournalismhandbook.org/)

@@ -84,7 +28,6 @@
 [http://flowingdata.com/2012/04/27/data-and-visualization-blogs-worth-following/](http://flowingdata.com/2012/04/27/data-and-visualization-blogs-worth-following/)

-
 **Statistics**

 [http://greenteapress.com/thinkstats/html/index.html](http://greenteapress.com/thinkstats/html/index.html)
@@ -95,9 +38,9 @@
 Basic tutorials for a variety of languages are available for free online or you can learn
-interactively with websites like [http://www.codecademy.com/](http://www.codecademy.com/#!/exercises/0\. for JavaScript or [http://www.learnpython.org/ ](http://www.learnpython.org/)or [http://tryruby.org](http://tryruby.org/)
-
-[https://developer.mozilla.org/en/JavaScript](https://developer.mozilla.org/en/JavaScript) –\. especially for web applications and visualisations, you’ll need a basic understanding of JS. Common libraries like prototype or jQuery can help
+interactively with websites like [http://www.codecademy.com/](http://www.codecademy.com/#!/exercises/0). for JavaScript or [http://www.learnpython.org/ ](http://www.learnpython.org/)or [http://tryruby.org](http://tryruby.org/)
+
+[https://developer.mozilla.org/en/JavaScript](https://developer.mozilla.org/en/JavaScript) - especially for web applications and visualisations, you'll need a basic understanding of JS. Common libraries like prototype or jQuery can help

 **Accessibility/User Experience**

@@ -108,7 +51,6 @@
 ## Definitions
 - definitions, open licence reuse permissive hacker hack data journalism data bis UCX etc.
-
 ## key datasets
 - key datasets, directory.gov.au gazetter/AEC electorates/suburbs/postcodes/LGAs
@@ -163,7 +105,8 @@
 - http://ubuntu-tutorials.com/2008/11/11/relaying-postfix-smtp-via-smtpgmailcom/
 - amon

-### Source Control –\. Git / Subversion
+### Source Control
+ Git / Subversion

 [![](http://www.govhack.org/wp-content/uploads/Screenshot-at-2012-04-29-172132-300x235.png "Git Screenshot")](http://progit.org/book/)

@@ -181,7 +124,12 @@
 [Trello](https://trello.com/) and [Workflowy](https://workflowy.com/) are free, lightweight project management tools suitable for a rapid project!

-# Hosted Developer Tools {#hosted-developer-tools}
+## Hosted Developer Tools {#hosted-developer-tools}
+
+Can get many tools (source control, issue tracking) combined into one service cloud hosted so no setup required.
+
+### Github
+Git obviously but svn/hg interfaces are possible. Provide their own GUI for Windows/OSX or use the variety of Git capable tools

 ### Sourceforge

@@ -200,7 +148,7 @@
 # API Development {#api-development}

-So an API isn’t just an XML file ![;)](http://www.govhack.org/wp-includes/images/smilies/icon_wink.gif)
+So an API isn't just an XML file ![;)](http://www.govhack.org/wp-includes/images/smilies/icon_wink.gif)

 A good web based data API:

@@ -234,7 +182,7 @@
 Most of the categories to follow have visualisation tools specific to their purpose.

-You can find some data visualisation “essential”\. tools below:
+You can find some data visualisation tools below:

 [http://www.visualisingdata.com/index.php/2011/07/part-6-the-essential-collection-of-visualisation-resources/](http://www.visualisingdata.com/index.php/2011/07/part-6-the-essential-collection-of-visualisation-resources/)

@@ -242,7 +190,17 @@
 Have to use visual art concepts, good color schemes http://www.r-bloggers.com/the-paul-tol-21-color-salute/
+
+ - https://graphics.stanford.edu/wikis/cs448b-12-fall/ data viz theory
+ - http://drawingbynumbers.org/toolsandresources
+
+examples - http://sunfoundation.tumblr.com/
+tools - http://selection.datavisualization.ch/ data viz tools catalog
+
+
+
 # Mobile
+bom water, nz gov budget html5 jquery mobile like directory.gov.au
 - android datviz - http://code.google.com/p/afreechart/ http://code.google.com/p/snowdon/ http://code.google.com/p/chartdroid/ http://androidplot.com/ http://code.google.com/p/achartengine/

@@ -251,18 +209,24 @@
 # Geographical Data Tools {#geographical-data-tools}

-Check out the[ GeoRabble Boundary Mapper’s Cookbook](http://georabble.org/2012/05/31/the-boundary-mappers-cookbook/) to see how you can tie all these things together!
-
+Check out the[ GeoRabble Boundary Mapper's Cookbook](http://georabble.org/2012/05/31/the-boundary-mappers-cookbook/) to see how you can tie all these things together!
+
+## Key datasets
+ - base layers like agri http://agri.openstreetmap.org/, http://irs.gis-lab.info/ wms or http://www.gdal.org/frmt_wms_openstreetmap_tms.xml
+ ASGS including suburbs/postcodes
+ - andrewharvey4.wordpress.com postgis/asgs tutorial

 ## Wrangling

-## Converting
+### Converting

 There are many spatial data formats and often the one your tool requires is not the one the dataset is provided in

 Online - http://converter.mygeodata.eu/vector kml exporter for shp or locally using GDAL

-## geocoding
+### geocoding

 cloudmade, google (but you must display on a Google Map).
+
+Easiest way to do is with a Google Spreadsheet/Fusion Table http://williamparry.blogspot.com.au/2011/04/putting-data-into-google-fusion-tables.htm

 ## Analysis

@@ -312,20 +276,22 @@
 [![](http://www.govhack.org/wp-content/uploads/google_refine_interface.png "google_refine_interface")](http://www.govhack.org/wp-content/uploads/google_refine_interface.png)Clean up duplicate or inconsistent data entries.

+Can also use general purpose tools; grep/awk/sed
+regex http://www.regexper.com/ http://www.debuggex.com/?re=&str=
+
 ## Analysis

 ### Excel / Calc

 Great basic analysis and viewing. Older versions can be limited to 6500\. or so rows. Eg [http://www.tcij.org/training-material/car/data-mining/3474](http://www.tcij.org/training-material/car/data-mining/3474)

-
 ### PostgreSQL/MySQL

 [![](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_209ee972.jpg "SQL screenshot")](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_209ee972.jpg)Next step up, large datasets can be manipulated/extracted efficiently for example [http://www.postgresql.org/docs/8.4/static/tutorial-window.html](http://www.postgresql.org/docs/8.4/static/tutorial-window.html) , no built-in data visualisation though.

 ### [Miso Dataset](http://misoproject.com/dataset/)

-[![](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_m53b7ee38-293x300.png "miso screenshot")](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_m53b7ee38.png)Javascript data transformation library � especially good if you want to use the output for javascript interactive visualisations because the transformations can be done on-the-fly by users.
+[![](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_m53b7ee38-293x300.png "miso screenshot")](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_m53b7ee38.png)Javascript data transformation library - especially good if you want to use the output for javascript interactive visualisations because the transformations can be done on-the-fly by users.

 ### R Statistical Language

@@ -333,6 +299,11 @@
  - http://blog.yhathq.com/posts/10-R-packages-I-wish-I-knew-about-earlier.html
  - excel -> R/rattle/ deducer? http://www.r-bloggers.com/updates-to-the-deducer-family-of-packages/
+ - http://www.twotorials.com/ for R
+ - http://www.r-bloggers.com/gradient-word-clouds/ http://www.rstudio.com/shiny/ http://blog.ouseful.info/2012/11/28/quick-shiny-demo-exploring-nhs-winter-sit-rep-data/ https://github.com/timelyportfolio/shiny-d3-plot https://github.com/trestletech/shiny-sandbox/tree/master/grn
+ - http://www.r-bloggers.com/video-simpler-tricks-and-tools-help-debugging-git-latex-and-workflow-with-r-by-prof-rob-hyndman/
+ - http://yihui.name/knitr/ makes reports including google widgets/charts/maps via http://www.r-bloggers.com/googlevis-0-3-2-is-released-better-integration-with-knitr/
+ - http://chartsnthings.tumblr.com/post/36978271916/r-tutorial-simple-charts http://flowingdata.com/2012/12/17/getting-started-with-charts-in-r/

 ## Visualisation

@@ -351,18 +322,22 @@
 d3
  - http://datadrivenjournalism.net/resources/data_driven_documents_defined
  - http://www.benmcmahen.com/blog/posts/50eb57d55a94d35262000001 d3 svg
+ - d3 tools and tutorial http://enjalot.com/ http://news.ycombinator.com/item?id=4608440
+ - Why d3 is the way it is and how to make charts http://bost.ocks.org/mike/chart/
+ - how to make an xkcd chart http://bl.ocks.org/3914862

 ### Processing.js

 # Unstructured (text documents, webpages, metadata, tweets etc) Data Tools

-Scraperwiki
+## wranglying
+Scraperwiki pytemplate scrapy
+
 Overviewer/ Jigsaw http://www.cc.gatech.edu/gvu/ii/jigsaw/
- - opennlp/nltk, lucene/solr
- - http://www.r-bloggers.com/simple-text-mining-with-r/
-
-R
+ - opennlp/nltk / https://github.com/clips/pattern
+ - lucene/solr
+ - http://www.r-bloggers.com/simple-text-mining-with-r/
  - http://blog.josephwilk.net/ruby/latent-semantic-analysis-in-ruby.html similar terms usually found together

 # Graph (relationships and networks) Data Tools {#graph-relationships-and-networks-data-tools}
@@ -374,11 +349,22 @@
 ## Analysis

-### Neo4j
-
-[![](http://www.govhack.org/wp-content/uploads/webadmin-data-300x127.png "Neo4\. web admin screenshot")](http://www.govhack.org/wp-content/uploads/webadmin-data.png)Help understand relationships � how is X connected to Y and via what other entities they both are connected to. Imports and exports
-
-can be done using a preexisting tool like Gremlin or by writing a simple Java/Python/Ruby application. Queries can be tested in the built in data browser.
+### R
+
+- http://is-r.tumblr.com/post/38240018815/making-prettier-network-graphs-with-sna-and-igraph
+
+
+### Neo4j / OrientDB
+
+[![](http://www.govhack.org/wp-content/uploads/webadmin-data-300x127.png "Neo4\. web admin screenshot")](http://www.govhack.org/wp-content/uploads/webadmin-data.png)Help understand relationships - how is X connected to Y and via what other entities they both are connected to. Imports and exports
+
+ - http://www.slideshare.net/maxdemarzi/etl-into-neo4j
+
+http://www.orientdb.org/
+
+Both can be accessed using a preexisting tool like Gremlin or by writing a simple Java/Python/Ruby application. Queries can be tested in the built in data browser.
+
+
 ### [NetworkX](http://networkx.lanl.gov/index.html)

@@ -386,17 +372,18 @@
 NetworkX is a social network analysis library for python. Many advanced analyses built in like finding communities within a graph. Also good for converting data into graphs.
-### Palantir
-
-Palantir make a good computer forensics tool, which they will showcase and give GovHack attendees access to for GovHack data analysis purposes. For more information check out:
-
-[http://palantir.com.au/](http://palantir.com.au/)

 ## Visualisation
+### Tree/Hierarchy visualisation
+ - don't use network viz if what you actually have is a tree/hierarchy with no interconnections http://www.randelshofer.ch/treeviz/ http://thejit.org/demos/ http://mbostock.github.com/protovis/ex/treemap.html http://blog.pixelingene.com/2011/07/building-a-tree-diagram-in-d3-js/d3 for Trees and Hierarchies
+ http://mbostock.github.com/d3/ex/pack.html http://mbostock.github.com/d3/ex/tree.html
+
+### NodeXL for Microsoft Excel
+ - http://nodexl.codeplex.com/ network graphs for excel

 ### [Graphviz](http://www.graphviz.org/)

-[![](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_7579906d-300x184.png "Graphviz Screenshot")](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_7579906d.png)Classic directed graph visualisation tool, can even [generate images online without installing](http://ashitani.jp/gv/) or use in webpages with [javascript port of software](http://code.google.com/p/canviz/). File format [�dot� very easy to learn](http://en.wikipedia.org/wiki/DOT_language)
+[![](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_7579906d-300x184.png "Graphviz Screenshot")](http://www.govhack.org/wp-content/uploads/How-to-participate-in-GovHack_html_7579906d.png)Classic directed graph visualisation tool, can even [generate images online without installing](http://ashitani.jp/gv/) or use in webpages with [javascript port of software](http://code.google.com/p/canviz/). File format ["dot" very easy to learn](http://en.wikipedia.org/wiki/DOT_language)

 ### Gephi