dc.contributor SICSA - Scottish Informatics and Computer Science Alliance
dc.contributor Pareti, Paolo
dc.creator Pareti, Paolo
dc.creator Klein, Ewan H.
dc.date 2016-04-29T14:33:47Z
dc.date.accessioned 2023-02-17T20:51:55Z
dc.date.available 2023-02-17T20:51:55Z
dc.identifier Pareti, Paolo; Klein, Ewan H.. (2016). The Human Know-How Dataset, 2014 [dataset]. https://doi.org/10.7488/ds/1394.
dc.identifier https://hdl.handle.net/10283/1985
dc.identifier https://doi.org/10.7488/ds/1394
dc.identifier.uri http://localhost:8080/xmlui/handle/CUHPOERS/243942
dc.description The Human Know-How Dataset describes 211,696 human activities from many different domains. These activities are decomposed into 2,609,236 entities (each with an English textual label). These entities represent over two million actions and half a million pre-requisites. Actions are interconnected both according to their dependencies (temporal/logical orders between actions) and decompositions (decomposition of complex actions into simpler ones). This dataset has been integrated with DBpedia (259,568 links).

For more information see:
- The project website: http://homepages.inf.ed.ac.uk/s1054760/prohow/index.htm
- The data is also available on datahub: https://datahub.io/dataset/human-activities-and-instructions

----------------------------------------------------------------

* Quickstart: if you want to experiment with the highest-quality data before downloading all the datasets, download the file "9of11_knowhow_wikihow", and optionally the files "Process - Inputs", "Process - Outputs", "Process - Step Links" and "wikiHow categories hierarchy".
* Data representation: based on the PROHOW vocabulary (http://w3id.org/prohow#). Data extracted from existing web resources is linked to the original resources using the Open Annotation specification.
* Data Model: an example of how the data is represented within the datasets is available in the attached Data Model PDF file. The attached example represents a simple set of instructions, but instructions in the dataset can have more complex structures. For example, instructions could have multiple methods, steps could have further sub-steps, and complex requirements could be decomposed into sub-requirements.

----------------------------------------------------------------

Statistics:
* 211,696: number of instructions.
  From wikiHow: 167,232 (datasets 1of11_knowhow_wikihow to 9of11_knowhow_wikihow).
  From Snapguide: 44,464 (datasets 10of11_knowhow_snapguide to 11of11_knowhow_snapguide).
* 2,609,236: number of RDF nodes within the instructions.
  From wikiHow: 1,871,468 (datasets 1of11_knowhow_wikihow to 9of11_knowhow_wikihow).
  From Snapguide: 737,768 (datasets 10of11_knowhow_snapguide to 11of11_knowhow_snapguide).
* 255,101: number of process inputs linked to 8,453 distinct DBpedia concepts (dataset Process - Inputs).
* 4,467: number of process outputs linked to 3,439 distinct DBpedia concepts (dataset Process - Outputs).
* 376,795: number of step links between 114,166 different sets of instructions (dataset Process - Step Links).
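
As a quick sanity check, the per-source figures in the statistics can be summed and compared with the stated totals. The sketch below uses only numbers quoted in this record; the variable names are illustrative and not part of the dataset. Treating the 259,568 DBpedia links as the sum of linked process inputs and linked process outputs is an observation from the numbers, not something the record states explicitly.

```python
# Figures copied from the Statistics section of this record.
instructions = {"wikiHow": 167_232, "Snapguide": 44_464}
rdf_nodes = {"wikiHow": 1_871_468, "Snapguide": 737_768}
# Assumption: the DBpedia link total is inputs plus outputs.
dbpedia_links = {"process inputs": 255_101, "process outputs": 4_467}

# Totals computed from the per-source figures.
totals = {
    "instructions": sum(instructions.values()),
    "rdf_nodes": sum(rdf_nodes.values()),
    "dbpedia_links": sum(dbpedia_links.values()),
}

# Totals as stated in the description.
stated = {
    "instructions": 211_696,
    "rdf_nodes": 2_609_236,
    "dbpedia_links": 259_568,
}

for name, value in totals.items():
    status = "matches" if value == stated[name] else "differs from"
    print(f"{name}: {value:,} {status} the stated {stated[name]:,}")
```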
dc.description Instruction datasets:
* Datasets 1of11_knowhow_wikihow to 9of11_knowhow_wikihow contain instructions from wikiHow. Instructions are allocated to the datasets in order of popularity. This means that the most popular and high-quality instructions are found in 9of11_knowhow_wikihow, while the least popular ones are in dataset 1of11_knowhow_wikihow. These instructions are also classified according to the hierarchy found in the wikiHow categories hierarchy dataset.
* Datasets 10of11_knowhow_snapguide to 11of11_knowhow_snapguide contain instructions from Snapguide. Instructions coming from Snapguide are not sorted by their popularity.

Links datasets:
* The Process - Inputs dataset contains detailed information about the inputs of the sets of instructions, including links to DBpedia resources.
* The Process - Outputs dataset contains detailed information about the outputs of the sets of instructions, including links to DBpedia resources.
* The Process - Step Links dataset contains links between different sets of instructions.

Other datasets:
* The wikiHow categories hierarchy dataset contains information on how the various wikiHow categories are hierarchically structured.
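
The naming and ordering of the instruction archives described above can be sketched as a small helper. This is a minimal illustration that assumes the archive names follow the pattern shown in the file listing of this record; the function and constant names are hypothetical and not part of the dataset.

```python
def wikihow_archives_by_popularity(wikihow_parts: int = 9, total_parts: int = 11) -> list[str]:
    """Return the wikiHow archive names, most popular first.

    Per the description, 9of11 holds the most popular instructions
    and 1of11 the least popular ones.
    """
    return [
        f"{part}of{total_parts}_knowhow_wikihow.zip"
        for part in range(wikihow_parts, 0, -1)
    ]


# Snapguide archives are not sorted by popularity, so their order carries no meaning.
SNAPGUIDE_ARCHIVES = [f"{part}of11_knowhow_snapguide.zip" for part in (10, 11)]

archives = wikihow_archives_by_popularity()
quickstart_archive = archives[0]  # the file recommended in the Quickstart note
print(quickstart_archive)
```

Downloading the archives in this order retrieves the highest-quality wikiHow instructions first, which mirrors the Quickstart recommendation.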
dc.format application/zip
dc.format application/zip
dc.format application/zip
dc.format application/zip
dc.format application/zip
dc.format application/zip
dc.format application/zip
dc.format application/zip
dc.format application/zip
dc.format application/zip
dc.format application/zip
dc.format application/zip
dc.format application/zip
dc.format application/zip
dc.format application/zip
dc.format application/pdf
dc.language eng
dc.relation https://datahub.io/dataset/human-activities-and-instructions
dc.relation https://doi.org/10.1007/978-3-319-13704-9_30
dc.relation Pareti P, Testu B, Ichise R, Klein E, Barker A. "Integrating Know-How into the Linked Data Cloud", chapter in "Knowledge Engineering and Knowledge Management", Volume 8876 of the series Lecture Notes in Computer Science, pp 385-396. http://link.springer.com/chapter/10.1007%2F978-3-319-13704-9_30
dc.rights Dataset released under the Creative Commons Attribution-NonCommercial 4.0 International licence: http://creativecommons.org/licenses/by-nc/4.0/ Attribution to this dataset should be given by citing the following publication (https://doi.org/10.1007/978-3-319-13704-9_30): Paolo Pareti, Benoit Testu, Ryutaro Ichise, Ewan Klein and Adam Barker. Integrating Know-How into the Linked Data Cloud. Knowledge Engineering and Knowledge Management, volume 8876 of Lecture Notes in Computer Science, pages 385-396. Springer International Publishing (2014) N.B. the reason for the 'non-commercial use only' restriction is that part of the data comes from wikiHow and Snapguide, which do not allow the reuse of their data for commercial purposes.
dc.source http://www.wikihow.com/
dc.source https://snapguide.com/
dc.subject Linked Data
dc.subject Common Sense Reasoning
dc.subject Know-How
dc.subject Human Activities
dc.subject Instructions
dc.subject Procedures
dc.subject Processes
dc.subject Workflows
dc.subject Semantic Web
dc.subject Mathematical and Computer Sciences
dc.title The Human Know-How Dataset
dc.title The Web of Know-How: Human Activities and Instructions
dc.type dataset
dc.coverage start=2014-06-16; end=2014-07-16; scheme=W3C-DTF


Files in this item

Files                             Size      Format
10of11_knowhow_snapguide.zip      65.98Mb   application/zip
11of11_knowhow_snapguide.zip      45.38Mb   application/zip
1of11_knowhow_wikihow.zip         94.46Mb   application/zip
2of11_knowhow_wikihow.zip         21.43Mb   application/zip
3of11_knowhow_wikihow.zip         73.19Mb   application/zip
4of11_knowhow_wikihow.zip         97.39Mb   application/zip
5of11_knowhow_wikihow.zip         15.58Mb   application/zip
6of11_knowhow_wikihow.zip         20.75Mb   application/zip
7of11_knowhow_wikihow.zip         13.70Mb   application/zip
8of11_knowhow_wikihow.zip         9.891Mb   application/zip
9of11_knowhow_wikihow.zip         20.63Mb   application/zip
process_inputs.zip                5.633Mb   application/zip
process_outputs.zip               297.4Kb   application/zip
process_step_links.zip            6.049Mb   application/zip
PROHOW_DataModel_Example.pdf      61.01Kb   application/pdf
wikiHow_categories_hierarchy.zip  31.25Kb   application/zip
