Difference between revisions of "Datasets"
<youtube>av8zkywXHSM</youtube>
<b>VRmeta: Generating AI datasets one precise meta-tag at a time
</b><br>VRmeta is the world's most precise means of adding time-based descriptive metadata to both digital and immersive video. Whether the goal is to create unmatched discoverability for your entire video library, leverage metrics from all that content, or license clips to increase inbound revenue, VRmeta makes it happen. Today’s consumers have more choices than ever before for video entertainment and viewing platforms. With this explosion of choice has come complexity. Finding engaging entertainment has become time-consuming and frustrating, resulting in declining engagement and viewer satisfaction. The key to overcoming this discovery challenge lies in rich, time-based descriptive metadata. VRmeta is your gateway to making this happen: VRmeta's patent-pending cross-hair and tactile navigation technology gives users the most precise means of applying metadata ever created. VRmeta gives every clip it touches time and in-frame location data registered with in and out points, all saved into .csv and .xmp sidecar files. VRmeta delivers AI precision now. VRmeta even learns your tagging vocabulary, offering users auto-completion for frequently used words and names. By applying time-based descriptive metadata at the production level, stakeholders create additional value at every stage of the video content lifecycle. VRmeta stands firmly at the nexus of artificial intelligence and healthcare, and is a recognized state-of-the-art solution central to the [[development]] of emotional AI datasets. The science surrounding [[Sentiment Analysis]] involves natural language processing or linguistic algorithms that assign values to positive, negative or neutral text (converting supposition into monetizable data silos). VRmeta is the ideal method for inputting this data. VRmeta is the tool of choice for broadcasters looking to develop information-rich, statistical data silos for any variety of sports. Think aggregate team and player performance, post-game data and deep-dive statistic [[development]]. "Great content without accurate metadata is, after all, a missed opportunity"
|}
|}<!-- B -->
Revision as of 20:52, 9 July 2023
- Data Science ... Governance ... Preprocessing ... Exploration ... Interoperability ... Master Data Management (MDM) ... Bias and Variances ... Benchmarks ... Datasets
- Excel ... Documents ... Database ... Graph ... LlamaIndex
- Data Quality ... validity, accuracy, cleaning, completeness, consistency, encoding, padding, augmentation, labeling, auto-tagging, normalization, standardization, and imbalanced data
- AI Governance / Algorithm Administration
- Managed Vocabularies
- Analytics ... Visualization ... Graphical Tools ... Diagrams & Business Analysis ... Requirements ... Loop ... Bayes ... Network Pattern
- Development ... Notebooks ... AI Pair Programming ... Codeless, Generators, Drag n' Drop ... AIOps/MLOps ... AIaaS/MLaaS
- Facets | Google ... contains two robust visualizations to aid in understanding and analyzing machine learning datasets.
- Hyperparameters
- Evaluation ... Prompts for assessing AI projects
- Train, Validate, and Test
- OpenML datasets
- Datasets and Machine Learning | Chris Nicholson - A.I. Wiki pathmind
- Datasets used in deep learning applications within X-ray security imaging | Towards Automatic Threat Detection: A Survey of Advances of Deep Learning within X-ray Security Imaging | Samet Akcay and Toby P. Breckon - Durham University, UK
Datasets (often in combination with algorithms) are becoming more important in their own right and can sometimes be seen as the primary intellectual output of research. The revelations about Cambridge Analytica highlight the importance of datasets and data collection. See also: Privacy
Sources
- MLCommons ... MLCommons debuts with public 86,000-hour speech data set for AI researchers | Devin Coldewey - TechCrunch
- Question Answering in Context (QuAC) ... Question Answering in Context for modeling, understanding, and participating in information-seeking dialog.
- Tatoeba a collection of sentences and translations - Tab-delimited Bilingual Sentence Pairs
- Kaggle Datasets
- COVID-19 Open Research Dataset (CORD-19) ...COVID-19
- UC Irvine Machine Learning Repository
- MNIST database
- Collections | DataHub
- Registry of Open Data on AWS | Amazon
- Public Data | Google
- BigQuery public datasets | Google
- Open Images | Google
- Data Science for Research | Microsoft
- Datasets for Data Mining and Data Science | KDnuggets
- Enigma Public
- A Comprehensive List of Open Data Portals from Around the World | DataPortals.org
- OpenDataSoft
- World Data Atlas | Knoema
- The Open Machine Learning project | OpenML.org
- World's Free Online Data | Research Pipeline
- List of datasets for machine learning research | Wikipedia
- Neural Net Repository | Wolfram
- Open Data for Deep Learning & Machine Learning | 4j
- Data Catalog | Data.gov
- 3D-Machine-Learning | GitHub
- Wind Turbine Map and Database | USGS & DOE
- Autosomal DNA
- Pascal Visual Object Classes Challenge (VOC)
- OpenNASA
- Data: Close encounters between two objects | European Space Agency (ESA)
- JASA Data Archive | Journal of the American Statistical Association
- Datasets Archive | Journal of the American Statistical Association
- Data.World
- The Dataset Collection | Archive.org
- Collections | Archive-it.org
- Eurostat | EU statistical office
- Re3data
- Resource on data and metadata standards - open research data | FAIRsharing
- List of Public Data Sources Fit for Machine Learning | bigml
- Open Datasets | Skymind
- Global Health Observatory resources | World Health Organization (WHO)
- CDC WONDER | Centers for Disease Control and Prevention (CDC)
- US health insurance program | Medicare
- International economy | International Monetary Fund (IMF)
- Data Catalog | The World Bank
- Financial and economic | Quandl
- PublicDomains | GitHub
- datasets and related content | BuzzFeed - GitHub
- Sports, politics, economics, and other spheres of life | FiveThirtyEight
- EMBER; benign and malicious Windows-portable executable files | Endgame - GitHub
- r/datasets | reddit
- Microsoft Information-Seeking Conversation (MISC) - audio and video signals; transcripts of conversation
- Language-Independent Named Entity Recognition (II)
- VGG | Oxford
- Perfect-500K beauty and personal care
- Mozilla’s Common Voice project collects human voices
- CIFAR-10 and CIFAR-100 are labeled subsets of the 80 million tiny images dataset. | A. Krizhevsky, V. Nair, and G. Hinton - Canadian Institute For Advanced Research
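Several of the sources above distribute plain delimited text; the Tatoeba entry, for example, describes tab-delimited bilingual sentence pairs. The following is a minimal Python sketch of reading such a file, not the project's own tooling. It assumes each line holds a source sentence, a tab, and its translation, with any further columns ignored. The sample text and the function name `read_sentence_pairs` are hypothetical and only serve the illustration.

```python
import csv
import io

# Hypothetical sample in the assumed layout: source sentence, a tab, then its translation.
SAMPLE = "Hello.\tBonjour.\nThank you.\tMerci.\nmalformed line without a tab\n"


def read_sentence_pairs(handle):
    """Yield (source, target) tuples from a tab-delimited text stream."""
    # QUOTE_NONE keeps quotation marks inside sentences as ordinary characters.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank or malformed rows instead of failing on them.
        if len(row) < 2 or not row[0].strip() or not row[1].strip():
            continue
        yield row[0].strip(), row[1].strip()


pairs = list(read_sentence_pairs(io.StringIO(SAMPLE)))
print(pairs)
```

For a downloaded file, the same function can be given an open file handle (for instance `open(path, encoding="utf-8", newline="")`) in place of the in-memory sample.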
Networks
- Bidirectional Encoder Representations from Transformers (BERT)
- ResNet-50
- ImageNet | Wikipedia
- AlexNet | Wikipedia
- WordNet
Articles
- Microsoft Scraps 10 Million Facial Recognition Photos On The Low | Kori Hale - Forbes
- The 50 Best Free Datasets for Machine Learning | Meiryum Ali - Gengo AI
- The 50 Best Public Datasets for Machine Learning | Stacy Stanford - Medium
- Best Public Datasets for Machine Learning and Data Science: Sources and Advice on the Choice | Altexsoft
- 25 Open Datasets for Deep Learning Every Data Scientist Must Work With | Pranav Dar - Analytics Vidhya
- Human in the Loop...
- Amazon Mechanical Turk (MTurk) - Using MTurk with Amazon SageMaker for Supervised Learning (ML)
- Gengo.ai - high-quality multilingual data with a human touch for machine learning
- Figure Eight CrowdFlower AI - build a state-of-the-art machine learning model trained with human labeled data