Difference between revisions of "Datasets"

Line 8: Line 8:
 
* [http://mlr.cs.umass.edu/ml/ UC Irvine Machine Learning Repository]  
 
 
* [http://yann.lecun.com/exdb/mnist/ MNIST database]
 
* [http://registry.opendata.aws/ Registry of Open Data | on AWS]
+
* [http://public.enigma.com/ Enigma Public]
 +
* [http://registry.opendata.aws/ Registry of Open Data on AWS | Amazon]
 +
* [http://www.google.com/publicdata/directory Public Data | Google]
 
* [http://storage.googleapis.com/openimages/web/index.html Open Images | Google]
 
 +
* [http://www.microsoft.com/en-us/research/academic-program/data-science-microsoft-research/ Data Science for Research | Microsoft]
 +
* [http://www.kdnuggets.com/datasets/index.html Datasets for Data Mining and Data Science | KDnuggets]
 
* [http://www.openml.org/search?type=data The Open Machine Learning project | OpenML.org]
 
 
* [http://en.wikipedia.org/wiki/List_of_datasets_for_machine_learning_research Datasets | Wikipedia]
 
Line 16: Line 20:
 
* [http://www.usgs.gov/news/us-geological-survey-and-us-department-energy-release-online-public-dataset-and-viewer-us-wind Wind Turbine Map and Database | USGS & DOE]
 
 
* [http://isogg.org/wiki/Autosomal_DNA_testing_comparison_chart Autosomal DNA]
 
 +
* [http://github.com/awesomedata/awesome-public-datasets#publicdomains PublicDomains - GitHub]
 
* [http://github.com/endgameinc/ember EMBER; benign and malicious Windows-portable executable files | Endgame]
 
 
* [http://host.robots.ox.ac.uk/pascal/VOC Pascal Visual Object Classes Challenge (VOC)]
 
 
* [http://open.nasa.gov/ OpenNASA]
 
 +
* [http://lib.stat.cmu.edu/jasadata/  JASA Data Archive | Journal of the American Statistical Association]
 +
* [http://lib.stat.cmu.edu/datasets/ Datasets Archive | Journal of the American Statistical Association]
 +
* [http://data.world/ Data.World]
 +
* [http://archive.org/details/datasets The Dataset Collection | Archive.org]
 +
* [http://www.archive-it.org/explore?show=Collections Collections | Archive-it.org]
 +
  
 
== Articles ==
 
* [http://gengo.ai/datasets/the-50-best-free-datasets-for-machine-learning/  The 50 Best Free Datasets for Machine Learning | Meiryum Ali]
+
* [http://gengo.ai/datasets/the-50-best-free-datasets-for-machine-learning/  The 50 Best Free Datasets for Machine Learning | Meiryum Ali - Gengo AI]
 
* [http://medium.com/datadriveninvestor/the-50-best-public-datasets-for-machine-learning-d80e9f030279 The 50 Best Public Datasets for Machine Learning | Stacy Stanford - Medium] 
 
  

Revision as of 10:40, 9 January 2019


Datasets, often in combination with algorithms, are becoming increasingly important in their own right and can sometimes be seen as the primary intellectual output of research. The revelations about Cambridge Analytica highlight the importance of datasets and data collection. See also: Privacy in Data Science

Sources


Articles